I am a non-programmer working on Prochlorococcus (a marine
bacterium), for which UCSC and Ensembl do not yet have a
genome or transcriptome available. However, the genome
and transcriptome of this organism have been sequenced and annotated
and are available on MicrobesOnline (http://www.microbesonline.org/cgi-bin/genomeInfo.cgi?tId=511145).
I have been trying to run transcriptome analyses using cufflinks, for which I need GTF files of the transcriptome. MicrobesOnline provides tab-delimited files, and I have been trying to convert them to GTF files using Excel. Basically, I reorganized the data so that the first 8 columns seem fine when uploaded to Galaxy. I save the file from Excel as tab-delimited text and then upload it to Galaxy, "telling" Galaxy in the file upload option that it is a GTF file (instead of letting Galaxy identify the file type itself with the auto-detect function). However, when I do this, I can't get the 9th column (attributes) to work.
I have tried separating the attributes in the 9th column of my
Excel spreadsheet with either a space or a tab (using
concatenation with the CHAR(9) function, which I understand encodes
a tab in Excel). In both cases I upload to Galaxy by
identifying the .txt file as a .gtf file. When I use the CHAR(9)
function in Excel, the 9th column splits into columns 9, 10, 11, etc.
When I use spaces to separate the attributes, I get an error
message from cufflinks:

An error occurred running this job: cufflinks v1.0.3 cufflinks -q
--no-update-check -I 300000 -F 0.050000 -j 0.050000 -p 8 -G
running cufflinks. [Errno 2] No such file or directory:
'transcripts.gtf'
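To make the target format concrete, here is a minimal sketch in Python of building the GTF lines outside Excel. A GTF line has exactly nine tab-separated fields. The ninth field holds all the attributes, separated from each other by a single space (never a tab), each ending in a semicolon, with the values in double quotes. Cufflinks expects at least gene_id and transcript_id there. The input column names (scaffoldId, start, stop, strand, sysName) are assumptions about the MicrobesOnline export and should be adjusted to match the actual header. Using "exon" as the feature type and the same identifier for both gene_id and transcript_id are also assumptions made for illustration. One possible reason to avoid Excel for this step is that it may add extra quoting around cells that contain double quotes when saving as tab-delimited text.

```python
import csv
import io


def gtf_attributes(gene_id, transcript_id):
    # Attributes are separated by a single space and each ends with a semicolon.
    # No tab may appear here, or the 9th column will split into extra columns.
    return f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'


def row_to_gtf(row, source="MicrobesOnline", feature="exon"):
    # Column names below are hypothetical; rename them to match the real header.
    start, stop = int(row["start"]), int(row["stop"])
    # GTF requires start <= end regardless of strand.
    low, high = min(start, stop), max(start, stop)
    fields = [
        row["scaffoldId"],  # 1: sequence name
        source,             # 2: source
        feature,            # 3: feature type
        str(low),           # 4: start (1-based)
        str(high),          # 5: end
        ".",                # 6: score (not used)
        row["strand"],      # 7: strand, "+" or "-"
        ".",                # 8: frame (not used)
        gtf_attributes(row["sysName"], row["sysName"]),  # 9: attributes
    ]
    return "\t".join(fields)


def convert(table_text):
    # Read a tab-delimited table with a header row; return a list of GTF lines.
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    return [row_to_gtf(row) for row in reader]


if __name__ == "__main__":
    # Made-up example row, not real annotation data.
    example = (
        "scaffoldId\tstart\tstop\tstrand\tsysName\n"
        "chr1\t500\t100\t-\tGENE_0001\n"
    )
    for line in convert(example):
        print(line)
```

The resulting lines can be written to a plain text file ending in .gtf and uploaded to Galaxy with the file type set to gtf. Checking that every line splits into exactly nine fields on tabs is a quick way to confirm the attributes column is intact.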
Additionally, when I upload my data files I am able to choose the Prochlorococcus genome on Galaxy (genome 213 in the 'upload file' option), but I am unable to choose it from the reference genome list when running TopHat on Galaxy. Fixing this may solve the problem (or it may be part of the same issue).