[galaxy-user] there was a wrong link in my previous mail - gtf file issues
The correct link: http://www.microbesonline.org/cgi-bin/genomeInfo.cgi?tId=59919

Previous mail:

I am a non-programmer working on Prochlorococcus (a marine bacterium), for which UCSC and Ensembl do not yet have a genome or transcriptome available. However, the genome and transcriptome of this organism have been sequenced and annotated, and are available on Microbes Online (http://www.microbesonline.org/cgi-bin/genomeInfo.cgi?tId=59919).

I have been trying to run transcriptome analyses using Cufflinks, for which I need a GTF file of the transcriptome. Microbes Online provides tab-delimited files, and I have been trying to convert them to GTF using Excel. I reorganized the data so that the first 8 columns look fine when uploaded to Galaxy. To do this, I save the file from Excel as tab-delimited text and then upload it to Galaxy, explicitly "telling" Galaxy in the file upload option that it is a gtf file instead of letting the auto-detect function identify the file type.

However, when I do this, I can't get the 9th column (attributes) to work. I have tried separating the attributes within the 9th column of my spreadsheet with either a space or a tab (by concatenating with the CHAR(9) function, which I understand encodes a tab in Excel). In all cases I upload the .txt file to Galaxy identified as a .gtf file. When I use CHAR(9), the 9th column splits into columns 9, 10, 11, etc. When I use spaces to separate the attributes, I get an error message from Cufflinks:

An error occurred running this job: cufflinks v1.0.3 cufflinks -q --no-update-check -I 30 -F 0.05 -j 0.05 -p 8 -G /galaxy/main_database/files/003/377/dataset_3377315.dat Error running cufflinks. [Errno 2] No such file or directory: 'transcripts.gtf'
I would be happy to know whether there is a way to convert my tab-delimited transcriptome file from Microbes Online to a GTF file (either with Excel or with another program), which would enable me to use Galaxy's NGS functions on Prochlorococcus.

Additionally, when I upload my data files I am able to choose the Prochlorococcus genome on Galaxy (genome 213 in the 'upload file' option), but I am unable to choose it from the reference genome list when running TopHat on Galaxy. Fixing this might solve the problem (or it may be part of the same issue).

Many thanks,
Noa Sher

___
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-user] there was a wrong link in my previous mail - gtf file issues
There are at least six of them there. Which one?

- Original Message -
From: Noa Sher noa.s...@gmail.com
To: Hiram Clawson hi...@soe.ucsc.edu
Cc: galaxy-user@lists.bx.psu.edu
Sent: Sunday, December 4, 2011 11:10:06 AM
Subject: Re: [galaxy-user] there was a wrong link in my previous mail - gtf file issues

Hi Hiram,
I was trying to work with the tab-delimited file (using the link under "export genomic data").
Thanks
noa

On 04/12/2011 20:29, Hiram Clawson wrote:
Good Morning Noa: Which one of the files at microbesonline are you trying to work with? --Hiram
Re: [galaxy-user] there was a wrong link in my previous mail - gtf file issues
Is this the genome you are working with: http://archaea.ucsc.edu/cgi-bin/hgGateway?db=procMari_CCMP1375
Re: [galaxy-user] there was a wrong link in my previous mail - gtf file issues
Pardon me, I see there is only one that says "tab-delimited" file. That is a tough one to decode. It almost looks like GTF already, but not quite. If we take it as a simple file of annotations on the genome, without structure such as exons and introns, we can merely rework the columns to turn it into a bed file. Extract columns in this order: 4,5,6,2,7 to get a bed file with the accession identities:

    awk -F'\t' '{printf "%s\t%d\t%d\t%s\t%s\n", $4,$5,$6,$2,$7}' 59919.tab > 59919.bed

It would take some time to figure out how to convert this file to something more useful, since I am not familiar with the format. I can't see immediately how to use it properly.

--Hiram
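For readers who would rather not use awk, the same column extraction can be sketched in Python. This is only an illustration under stated assumptions: the column positions (4, 5, 6, 2, 7) come from the message above, but treating column 7 as the strand, treating the coordinates as 1-based and inclusive, and swapping start and end when they arrive reversed are all guesses about the export format. The function name `tab_to_bed` is hypothetical. Unlike the awk command, this version emits a six-column BED line with a placeholder score of 0, so that the assumed strand lands in the BED strand column.

```python
def tab_to_bed(lines):
    """Yield 6-column BED lines from tab-delimited annotation rows.

    Assumed input columns (1-based): 2 = accession, 4 = sequence name,
    5 = start, 6 = end, 7 = strand. These positions are assumptions.
    """
    for line in lines:
        row = line.rstrip("\r\n").split("\t")
        # Skip the header row and any row without integer coordinates.
        if len(row) < 7 or not (row[4].isdigit() and row[5].isdigit()):
            continue
        a, b = int(row[4]), int(row[5])
        start, end = min(a, b), max(a, b)
        # BED starts are 0-based and ends are exclusive, so only the start
        # shifts by one (assuming 1-based inclusive input coordinates).
        yield "\t".join([row[3], str(start - 1), str(end), row[1], "0", row[6]])
```

A file would typically be converted by opening it in text mode and writing each yielded line, followed by a newline, to the output file.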
Re: [galaxy-user] there was a wrong link in my previous mail - gtf file issues
Hi Hiram,
I managed to extract the columns in a different order (although I did it in Excel rather than on the command line), but the 9th column (attributes) of the GTF is what I had problems with.
Thanks
noa

On 04/12/2011 21:43, Hiram Clawson wrote:
Pardon me, I see there is only one that says "tab-delimited" file. That is a tough one to decode. It almost looks like GTF already, but not quite. If we take it as a simple file of annotations on the genome, without structure such as exons and introns, we can merely rework the columns to turn it into a bed file. Extract columns in this order: 4,5,6,2,7 to get a bed file with the accession identities:

    awk -F'\t' '{printf "%s\t%d\t%d\t%s\t%s\n", $4,$5,$6,$2,$7}' 59919.tab > 59919.bed

It would take some time to figure out how to convert this file to something more useful, since I am not familiar with the format. I can't see immediately how to use it properly.

--Hiram
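On the 9th-column problem discussed in this thread: a GTF line has exactly nine tab-separated fields, and the attributes inside the ninth field are written as `key "value";` pairs separated by single spaces, never by tabs. That would explain why a tab inserted with CHAR(9) splits the attributes into extra columns. The sketch below builds such lines in Python. It is not a tested recipe for this particular file: the input column positions are taken from the awk suggestion earlier in the thread, column 7 is assumed to hold the strand, the accession is reused as both `gene_id` and `transcript_id` (one transcript per gene), and the `exon` feature type and `MicrobesOnline` source label are illustrative choices. The names `gtf_attributes` and `tab_to_gtf` are hypothetical.

```python
def gtf_attributes(gene_id, transcript_id):
    """Build the 9th GTF field: key "value"; pairs joined by one space."""
    return f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'


def tab_to_gtf(lines, source="MicrobesOnline", feature="exon"):
    """Yield 9-column GTF lines from tab-delimited annotation rows.

    Assumed input columns (1-based): 2 = accession, 4 = sequence name,
    5 = start, 6 = end, 7 = strand. These positions are assumptions.
    """
    for line in lines:
        row = line.rstrip("\r\n").split("\t")
        # Skip the header row and any row without integer coordinates.
        if len(row) < 7 or not (row[4].isdigit() and row[5].isdigit()):
            continue
        a, b = int(row[4]), int(row[5])
        # GTF coordinates are 1-based and inclusive, with start <= end.
        start, end = min(a, b), max(a, b)
        fields = [
            row[3], source, feature, str(start), str(end),
            ".",     # score: not available
            row[6],  # strand (assumed to be "+" or "-")
            ".",     # frame: not computed in this sketch
            gtf_attributes(row[1], row[1]),
        ]
        yield "\t".join(fields)
```

Writing the output from a script or a plain-text editor, rather than re-saving it from a spreadsheet, avoids having quotes or delimiters altered on export.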