[galaxy-user] Extract genomic DNA
Hi all,

Does anyone know if you can use the Extract Genomic DNA command with a genome that is not in the database? I am working with an algal genome (C. merolae) that isn't currently in the pull-down Database/Build menu. I keep getting the "Unspecified genome build" error and am assuming that's the problem, as my other files appear to be formatted correctly (tab-delimited without spaces for the intervals, same chromosome names in the interval and FASTA files, etc.).

Thanks!
Rebecca

___
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
Re: [galaxy-user] Extract genomic DNA
Rebecca,

You should be able to use a custom genome with this tool by selecting History for the "Source for Genomic Data" parameter. To the best of my knowledge, the bug you're describing has been fixed in Galaxy and should no longer be present. On which Galaxy instance are you seeing this issue? If it is not the main Galaxy server ( http://main.g2.bx.psu.edu/ ), you'll want to contact the maintainers of the instance you're using and ask them to update it.

Best,
J.

On Dec 14, 2011, at 1:31 PM, Rebecca C Mueller wrote:
> Hi all, Does anyone know if you can use the Extract Genomic DNA command with a genome not in the database? I am working with an algal genome (C. merolae) that isn't currently in the pulldown Database/Build menu. I keep getting the Unspecified genome build error, and am assuming that's the problem, as my other files appear to be formatted correctly (tab delimited without spaces for the intervals, same names for chromosomes in interval and fasta file, etc). Thanks! Rebecca
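For anyone preparing a custom genome from the history, the "same names for chromosomes in interval and fasta file" check mentioned in the question can be done before upload. The following is only a minimal sketch, not part of Galaxy: the function names are made up for illustration, and it assumes the sequence name is the first whitespace-delimited word on each FASTA header line.

```python
def fasta_names(fasta_lines):
    """Return the set of sequence names (first word after '>') in FASTA text."""
    names = set()
    for line in fasta_lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def unmatched_intervals(interval_lines, names):
    """Return (line_number, chrom) pairs whose first column is not in names."""
    missing = []
    for number, line in enumerate(interval_lines, start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom = line.rstrip("\n").split("\t")[0]
        if chrom not in names:
            missing.append((number, chrom))
    return missing


if __name__ == "__main__":
    # Hypothetical data: the second interval has a stray space after the name.
    fasta = [">chr1 example sequence", "ACGTACGT", ">chr2", "GGCC"]
    intervals = ["chr1\t0\t4", "chr2 \t1\t3", "chrX\t0\t2"]
    print(unmatched_intervals(intervals, fasta_names(fasta)))
```

Because the comparison is exact, a name with a trailing space is reported as unmatched, which is the kind of invisible difference that is easy to miss by eye.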
Re: [galaxy-user] Extract genomic DNA - strand information is not recognized
Hi Sarah,

One of the specifications of BED format is that coordinates are given with respect to the forward strand. BED format originated at UCSC, and this is their full specification: http://genome.ucsc.edu/FAQ/FAQformat.html#format1

Galaxy's summary (also shown on tool forms that accept BED format): http://galaxyproject.org/wiki/Learn/Datatypes#Bed

The rules for transforming data in other coordinate formats to BED are explained in detail in this UCSC wiki document: http://genomewiki.ucsc.edu/index.php/Coordinate_Transforms

There are no Galaxy-wrapped automated tools to do this transformation, but perhaps someone on the mailing list has a workflow to offer. If not, the tools in Galaxy under Text Manipulation and Filter and Sort, together with a file containing the length of each chromosome, can very likely be used in combination to perform the calculations (in several steps). If you create a process to do this, be sure to consider publishing the workflow for others to use.

Hopefully this helps,
Best,
Jen
Galaxy team

On 10/24/11 7:20 PM, Sarah wrote:
> Hello, I am trying to extract sequences from a FASTA file containing genomic information. The coordinates are in a tab-delimited format, which is recognized as BED format by Galaxy (meaning that the 6th column is correctly interpreted as 6. Strand). However, upon running Fetch Sequences -> Extract Genomic DNA, only the + strand information is included in the output FASTA file and I receive the following ERROR message:
>
> 1,431 sequences
> format: fasta, database: ?
> http://main.g2.bx.psu.edu/datasets/2cccb18df8c9d753/edit
> Info: 1476 warnings, 1st is: Invalid interval, start '1616' end '1177'. Skipped 1476 invalid lines, 1st is #2, scaffold1 1616 1177 Fom - 1
>
> Is this a bug? How can I adjust my input data files to get the - strand sequences as well? I have seen a similar problem in an earlier posting and there it was suggested to manually adjust the strand information column 5, but this did not work for me either. Many thanks for all your help! Sarah

--
Jennifer Jackson
http://usegalaxy.org
http://galaxyproject.org/wiki/Support
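The forward-strand rule in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a Galaxy tool: the function names are hypothetical, the row is assumed to be standard BED with start in column 2 and end in column 3, and no shift between 1-based and 0-based coordinates is applied (see the UCSC coordinate transforms page for that). The first function handles rows like the one in the warning, where start is greater than end; the second handles coordinates counted from the start of the reverse strand, which is where the chromosome length file comes in.

```python
def to_forward_bed(fields):
    """Return a copy of a BED row with start <= end.

    Only columns 2 and 3 are swapped when needed; the strand column is
    left as it is, since it already records the orientation.
    """
    row = list(fields)
    start, end = int(row[1]), int(row[2])
    if start > end:
        start, end = end, start
    row[1], row[2] = str(start), str(end)
    return row


def reverse_to_forward(start, end, chrom_length):
    """Map a 0-based, half-open interval counted along the reverse strand
    onto forward-strand coordinates of a sequence of chrom_length bases."""
    return chrom_length - end, chrom_length - start


if __name__ == "__main__":
    # Hypothetical BED6 row modeled on the interval from the warning message.
    print("\t".join(to_forward_bed(["scaffold1", "1616", "1177", "Fom", "0", "-"])))
    # First three bases of the reverse strand of a 10-base sequence.
    print(reverse_to_forward(0, 3, 10))
```

Which of the two cases applies depends on how the original coordinates were produced, so it is worth checking a few extracted sequences by hand after converting.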
[galaxy-user] Extract genomic DNA - strand information is not recognized
Hello,

I am trying to extract sequences from a FASTA file containing genomic information. The coordinates are in a tab-delimited format, which is recognized as BED format by Galaxy (meaning that the 6th column is correctly interpreted as 6. Strand). However, upon running Fetch Sequences -> Extract Genomic DNA, only the + strand information is included in the output FASTA file and I receive the following ERROR message:

1,431 sequences
format: fasta, database: ?
Info: 1476 warnings, 1st is: Invalid interval, start '1616' end '1177'. Skipped 1476 invalid lines, 1st is #2, scaffold1 1616 1177 Fom - 1

Is this a bug? How can I adjust my input data files to get the - strand sequences as well? I have seen a similar problem in an earlier posting, where it was suggested to manually adjust the strand information in column 5, but this did not work for me either.

Many thanks for all your help!
Sarah
Re: [galaxy-user] Extract Genomic DNA Problem
Hi Jen,

> In your original dataset, there are extra spaces around the tabs. [...] This could have been introduced in many ways, which is why the tools in Text Manipulation can be so handy. Hopefully this helps!

Ah, I see. Thanks for the tip.

Steve
Re: [galaxy-user] Extract Genomic DNA Problem
Hi all,

I am not sure if this is the right forum, but I need as much start-up information on the system for developers as I can get. We are part of a group in Kenya trying to build some apps for a genome sequencing project we are starting here.

Regards,

--
Fabian .J. Owuor
www.adelphitrading.co.ke | www.iHub.co.ke (red member) www.four99.co.ke | www.fabian.me.ke | www.epiphany.co.ke
Yu: +254 753.333.824 | Safaricom: +254 721.948.852 | Skype: kootie
Orange: +254 772.189.962 | Airtel: +254 735.270.852 | tweet: @theusfab
[galaxy-user] Extract Genomic DNA Problem
Hi,

I was trying to extract FASTA sequences using the following tab separated data for Chicken on the Galaxy Main server:

chr5 47258168 47259240
chr18 1938527 1939965
chr2 101973625 101974007
chr4 75653898 75674045
chr19 4258837 4263299
chr4 39330049 39372715
chr4 9606881 9610083
chr15 7264937 7265599
chr21 6659189 6667015
chr2 351239 352821

I got the following galaxy output:

7: Extract Genomic DNA on data 6
empty
format: fasta, database: galGal3
Info: 10 warnings, 1st is: Unable to fetch the sequence from '47258168' to '1072' for build 'galGal3'. Skipped 10 invalid lines, 1st is #1, chr5 47258168 47259240

Any ideas what I am doing wrong?

Thanks,
Steve
Re: [galaxy-user] Extract Genomic DNA Problem
Stephen,

This is a formatting issue with your input file; it needs to be tab-delimited, but currently it is not. You'll need to:

(a) convert spaces to tabs using the Convert delimiters to Tabs tool;
(b) click on the pencil icon and set the data type to BED.

Best,
J.

On Jun 21, 2011, at 8:45 AM, Stephen Taylor wrote:
> Hi, I was trying to extract FASTA sequences using the following tab separated data for Chicken on the Galaxy Main server: [...] Any ideas what I am doing wrong? Thanks, Steve
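Outside Galaxy, step (a) can be approximated before upload. The sketch below is an assumption-laden illustration, not the implementation of the Galaxy tool: the function name is hypothetical, and it treats every run of spaces and/or tabs as one delimiter, so it is only suitable for files whose fields contain no spaces themselves.

```python
import re


def to_tab_delimited(line):
    """Collapse each run of spaces and/or tabs into a single tab."""
    return re.sub(r"[ \t]+", "\t", line.strip())


if __name__ == "__main__":
    # A row with spaces around the tabs, as in the problem file.
    raw = "chr5 \t 47258168 \t 47259240\n"
    print(repr(to_tab_delimited(raw)))
```

Setting the data type to BED, step (b), still has to be done in Galaxy after the cleaned file is uploaded.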
Re: [galaxy-user] Extract Genomic DNA Problem
Hi Jeremy,

> This is a formatting issue with your input file; it needs to be tab-delimited but it's not currently. You'll need to: (a) convert spaces to tabs using the Convert delimiters to Tabs tool; (b) click on the pencil icon and set the data type to BED.

Thanks, this works, but as a user I cannot see (though obviously you can :-)) any difference between my original and the one I ran steps (a) and (b) on. I thought I had uploaded a BED file and converted tabs to spaces. The data is shared here: http://main.g2.bx.psu.edu/u/stephentaylor/h/test

"Not working" is my original. "Working" is the new data that I ran (a) and (b) on.

What did I miss?

Thanks,
Steve
Re: [galaxy-user] Extract Genomic DNA Problem
Hi Stephen,

In your original dataset, there are extra spaces around the tabs. Where ^I indicates a tab and $ indicates an end-of-line character, the entire datafile looks like this:

chr5 ^I 47258168 ^I 47259240$
chr18 ^I 1938527 ^I 1939965$
chr2 ^I 101973625 ^I 101974007$
chr4 ^I 75653898 ^I 75674045$
chr19 ^I 4258837 ^I 4263299$
chr4 ^I 39330049 ^I 39372715$
chr4 ^I 9606881 ^I 9610083$
chr15 ^I 7264937 ^I 7265599$
chr21 ^I 6659189 ^I 6667015$
chr2 ^I 351239 ^I 352821$

This could have been introduced in many ways, which is why the tools in Text Manipulation can be so handy.

Hopefully this helps!

Best,
Jen
Galaxy team

On 6/21/11 7:58 AM, Stephen Taylor wrote:
> Thanks, this works, but as a user I cannot see (but obviously you can :-)) that there is a difference between my original and the one I did step (a) and (b) on. [...] What did I miss? Thanks, Steve

--
Jennifer Jackson
http://usegalaxy.org/
http://galaxyproject.org/
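A listing like the one above can be produced locally to make the hidden whitespace visible. This is a small sketch with hypothetical function names: it mimics the ^I and $ notation for tabs and line ends only, and reports any tab-separated field that carries leading or trailing spaces.

```python
def show_invisibles(line):
    """Render tabs as ^I and the end of the line as $."""
    return line.rstrip("\n").replace("\t", "^I") + "$"


def padded_fields(line):
    """Return the tab-separated fields that have leading or trailing spaces."""
    fields = line.rstrip("\n").split("\t")
    return [field for field in fields if field != field.strip()]


if __name__ == "__main__":
    raw = "chr5 \t 47258168 \t 47259240\n"
    print(show_invisibles(raw))
    print(padded_fields(raw))
```

A clean row yields an empty list from `padded_fields`, so the check can be run over a whole file to find the offending lines.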
Re: [galaxy-user] Extract Genomic DNA Problem
Hi Jen,

Where do you get your AXT or NIB files in order to do the extract genome operation? I understand that Extract Genomic DNA is dependent on those files and correct paths/files in AlignSec.loc? This is for our local instance of Galaxy.

-John

From: galaxy-user-boun...@lists.bx.psu.edu [galaxy-user-boun...@lists.bx.psu.edu] On Behalf Of Jennifer Jackson [j...@bx.psu.edu]
Sent: Tuesday, June 21, 2011 10:23 AM
To: Stephen Taylor
Cc: galaxy-u...@bx.psu.edu
Subject: Re: [galaxy-user] Extract Genomic DNA Problem

> Hi Stephen, In your original dataset, there are extra spaces around the tabs. [...] This could have been introduced in many ways, which is why the tools in Text Manipulation can be so handy. Hopefully this helps! Best, Jen Galaxy team
Re: [galaxy-user] Extract Genomic DNA Problem
Hi John,

Currently, these come from the UCSC Genome Browser's download area: http://hgdownload.cse.ucsc.edu/downloads.html

AXT files usually come from the primary data source. UCSC provides tools to create NIB, 2bit, etc. files from FASTA (found under the Source section, same link as above). If you want to use genomes that are not in UCSC's primarily vertebrate dataset with tools that require these formats, those tools allow you to create the files yourself.

The Galaxy team has also been discussing creating files like this for a wider range of genomes, but no specific plans are in place yet.

Please let us know if we can help more,

Best,
Jen
Galaxy team

On 6/21/11 10:24 AM, John David Osborne wrote:
> Hi Jen, Where do you get your AXT or NIB files in order to do the extract genome operation? I understand that extract genomic DNA is dependent on those files and correct paths/files in AlignSec.loc? This is for our local instance of Galaxy. -John

--
Jennifer Jackson
http://usegalaxy.org/
http://galaxyproject.org/
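As one example of the UCSC tools mentioned above, `faToTwoBit` takes an input FASTA file and an output .2bit path. The wrapper below is a sketch only: the Python function names and the file names are hypothetical, it assumes the UCSC binary has been downloaded and is on the PATH, and it does nothing about registering the result with a local Galaxy instance.

```python
import shutil
import subprocess


def two_bit_command(fasta_path, out_path):
    """Build the argument list for UCSC's faToTwoBit: input FASTA, output .2bit."""
    return ["faToTwoBit", fasta_path, out_path]


def build_two_bit(fasta_path, out_path):
    """Run faToTwoBit if it is on the PATH; return True when the tool was run."""
    if shutil.which("faToTwoBit") is None:
        return False  # tool not installed; nothing was created
    subprocess.run(two_bit_command(fasta_path, out_path), check=True)
    return True


if __name__ == "__main__":
    # Hypothetical file names for a custom genome.
    print(" ".join(two_bit_command("my_genome.fa", "my_genome.2bit")))
```

Checking for the binary first keeps the failure mode obvious on machines where the UCSC utilities have not been installed yet.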