Re: [galaxy-user] Fastq-groomer help
On Oct 3, 2012, at 2:02 PM, Kshama Aswath wrote:

Hello: I have 20 GB of data that I have uploaded to my history and am trying to run through the groomer. Only the first dataset was uploaded yesterday; I ran the groomer on it and it was not done this morning. The message indicated that it is still waiting to be run! I have 57 datasets to run and would appreciate it if you could tell me how long it may take to even get started. Any other suggestion for getting my job done would also help.

Thanks so much,
user name: genenart
Kshama.

-- Kshama Aswath Graduate Student-(PhD) Bioinformatics and computational Biology Prince Williams Campus George Mason University Manasses,VA-20110

Hi Kshama,

There were some problems dispatching jobs that have been resolved. Sorry for the inconvenience, and thanks for using Galaxy.

--nate

___
The Galaxy User list should be used for the discussion of Galaxy analysis and other features on the public server at usegalaxy.org. Please keep all replies on the list by using reply all in your mail client. For discussion of local Galaxy instances and the Galaxy source code, please use the Galaxy Development list: http://lists.bx.psu.edu/listinfo/galaxy-dev To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-user] Fastq-groomer help
Hello: I have 20 GB of data that I have uploaded to my history and am trying to run through the groomer. Only the first dataset was uploaded yesterday; I ran the groomer on it and it was not done this morning. The message indicated that it is still waiting to be run! I have 57 datasets to run and would appreciate it if you could tell me how long it may take to even get started. Any other suggestion for getting my job done would also help.

Thanks so much,
user name: genenart
Kshama.

-- Kshama Aswath Graduate Student-(PhD) Bioinformatics and computational Biology Prince Williams Campus George Mason University Manasses,VA-20110
[galaxy-user] FASTQ groomer processing time
I used FASTQ Groomer on a 29 Gb Illumina 1.5+ FASTQ file to go from Illumina 1.3-1.7+ to Sanger, and it is still processing after more than 30 hours. Is this a normal time frame for a FASTQ file of this size?

Matthew
Re: [galaxy-user] FASTQ groomer processing time
Matthew,

Yes, we have seen long runs like this before (depending on server load). Happily, most of our reads are now in 1.8+ format. You can parallelise the process by splitting the file into 4 or 6 parts, submitting each part for grooming, and merging the groomed parts again afterwards.

Alex

From: galaxy-user-boun...@lists.bx.psu.edu [galaxy-user-boun...@lists.bx.psu.edu] on behalf of Matthew McCormack [mccorm...@molbio.mgh.harvard.edu]
Sent: Monday, 27 February 2012 21:13
To: galaxy-user@lists.bx.psu.edu
Subject: [galaxy-user] FASTQ groomer processing time

I used FASTQ Groomer on a 29 Gb Illumina 1.5+ FASTQ file to go from Illumina 1.3-1.7+ to Sanger, and it is still processing after more than 30 hours. Is this a normal time frame for a FASTQ file of this size?

Matthew
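The split-and-merge approach Alex describes can be sketched outside Galaxy roughly as follows. This is a minimal illustration, not a Galaxy tool: it assumes unwrapped FASTQ with exactly four lines per record, and the function names and chunk file names are hypothetical.

```python
from itertools import islice
from pathlib import Path

LINES_PER_RECORD = 4  # assumption: unwrapped FASTQ, four lines per record


def split_fastq(src, out_dir, records_per_chunk):
    """Write consecutive chunks of `src` and return the chunk paths in order."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunks = []
    with open(src) as handle:
        while True:
            first = handle.readline()
            if not first:  # end of input
                break
            path = out_dir / f"chunk_{len(chunks):03d}.fastq"
            with open(path, "w") as out:
                out.write(first)
                # Stream the remaining lines of this chunk without loading
                # them all into memory.
                out.writelines(islice(handle, records_per_chunk * LINES_PER_RECORD - 1))
            chunks.append(path)
    return chunks


def merge_fastq(chunks, dest):
    """Concatenate chunk files, in the given order, into `dest`."""
    with open(dest, "w") as out:
        for path in chunks:
            with open(path) as handle:
                for line in handle:
                    out.write(line)
```

Because the chunks are cut on record boundaries and merged in their original order, the merged output keeps the read order of the input. Each chunk would be groomed separately before merging.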
[galaxy-user] FASTQ Groomer before BWA mapping ?
All,

I'm wondering why we need to convert Illumina FASTQ to Sanger using FASTQ Groomer before mapping with BWA in Galaxy. The latest version of BWA itself added the -I option to use Illumina data directly. What's your opinion on this?

Secondly, I found that Map with BWA for Illumina uses the -I option on the command line during execution, even for Sanger-formatted reads. How does this impact the results?

Thanks,
Raj
Re: [galaxy-user] fastq groomer
Hello Slon,

In case you are still having issues: the best approach for Illumina 1.8+ data is to run the FASTQ Groomer tool with the Sanger option. As Peter noted, this assigns the expected datatype and verifies the content before you invest time in downstream analysis.

Please let us know if more help is needed.

Best,
Jen
Galaxy team

-- Jennifer Jackson http://usegalaxy.org http://galaxyproject.org/wiki/Support

On 10/18/11 1:02 AM, arabidopsis wrote:

Hi all,

FASTQ Groomer has Solexa or Illumina 1.3+ as an input quality format. I asked the sequencing facility about their machine and output, and they said their format was Illumina 1.8+ (the newest). I tried to convert my FASTQ file to Sanger with FASTQ Groomer, using Illumina 1.3+ as the input option, and got all reads with a quality of around 10... Does this mean that Galaxy cannot be used on a dataset with 1.8+ encoding, or was something else wrong?

Thanks,
Slon
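To illustrate the kind of content verification mentioned above, here is a minimal sketch of a sanity check on Sanger-encoded FASTQ records. It is not FASTQ Groomer's actual implementation; the function name is hypothetical, and it assumes four-line records and the Sanger quality range of ASCII 33 to 126.

```python
def check_fastq_records(lines):
    """Return a list of problems found; an empty list means none were found."""
    lines = [line.rstrip("\n") for line in lines]
    problems = []
    if len(lines) % 4 != 0:
        problems.append("line count is not a multiple of 4")
    # Only complete four-line records are inspected.
    for i in range(0, len(lines) - len(lines) % 4, 4):
        header, seq, plus, qual = lines[i:i + 4]
        rec = i // 4 + 1
        if not header.startswith("@"):
            problems.append(f"record {rec}: header does not start with '@'")
        if not plus.startswith("+"):
            problems.append(f"record {rec}: third line does not start with '+'")
        if len(seq) != len(qual):
            problems.append(f"record {rec}: sequence and quality lengths differ")
        if any(not 33 <= ord(c) <= 126 for c in qual):
            problems.append(f"record {rec}: quality character outside the Sanger range")
    return problems


print(check_fastq_records(["@r1\n", "ACGT\n", "+\n", "IIII\n"]))  # []
```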
Re: [galaxy-user] fastq groomer
Actually, Illumina 1.8+ has one more quality value higher than fastqsanger (see http://en.wikipedia.org/wiki/FASTQ_format ). My question now, I guess, is: if I use fastqsanger, would it break anything when it encounters the 'J' in the qual values?

On Tue, Oct 18, 2011 at 5:10 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

On Tue, Oct 18, 2011 at 9:21 AM, arabidopsis svine...@gmail.com wrote:

If Illumina 1.8+ is already using the Sanger FASTQ encoding, the file should be recognized by downstream applications such as the quality statistics computer, the quality filter, etc. However, my file is not visible to those programs, and when I click on it, only "uploaded fastq file" is displayed, without encoding details.

S.
Re: [galaxy-user] fastq groomer
On Tue, Nov 1, 2011 at 4:58 PM, Kevin Lam abou...@gmail.com wrote:

Actually, Illumina 1.8+ has one more quality value higher than fastqsanger (see http://en.wikipedia.org/wiki/FASTQ_format ). My question now, I guess, is: if I use fastqsanger, would it break anything when it encounters the 'J' in the qual values?

The Sanger FASTQ format has always allowed J (PHRED 41). The issue is that some tools might treat it as an error because it is unusually high for a raw read. For instance, you need at least FASTX v0.0.13 to cope with this; older versions didn't like it.

http://seqanswers.com/forums/showthread.php?p=49667

Peter
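The arithmetic behind the 'J' question is short enough to show directly. In the Sanger encoding a quality character maps to a PHRED score by subtracting an ASCII offset of 33; the helper name below is only for illustration.

```python
SANGER_OFFSET = 33  # Sanger FASTQ stores PHRED + 33 as an ASCII character


def phred_from_sanger(char):
    """Decode one Sanger-encoded quality character to its PHRED score."""
    return ord(char) - SANGER_OFFSET


print(phred_from_sanger("I"))  # 40
print(phred_from_sanger("J"))  # 41
print(phred_from_sanger("~"))  # 93, the top of the Sanger range
```

So 'J' is a legal Sanger character; whether a given tool accepts scores above 40 is a separate matter, as noted above.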
[galaxy-user] fastq groomer
Hi all,

FASTQ Groomer has Solexa or Illumina 1.3+ as an input quality format. I asked the sequencing facility about their machine and output, and they said their format was Illumina 1.8+ (the newest). I tried to convert my FASTQ file to Sanger with FASTQ Groomer, using Illumina 1.3+ as the input option, and got all reads with a quality of around 10... Does this mean that Galaxy cannot be used on a dataset with 1.8+ encoding, or was something else wrong?

Thanks,
Slon
Re: [galaxy-user] fastq groomer
If Illumina 1.8+ is already using the Sanger FASTQ encoding, the file should be recognized by downstream applications such as the quality statistics computer, the quality filter, etc. However, my file is not visible to those programs, and when I click on it, only "uploaded fastq file" is displayed, without encoding details.

S.

On Tue, Oct 18, 2011 at 10:12 AM, Peter Cock p.j.a.c...@googlemail.com wrote:

Illumina 1.8+ is already using the Sanger FASTQ encoding, so you don't need to convert it with the groomer. I think the Galaxy team might still recommend it, as it doubles as a sanity test for corrupt FASTQ files.

Peter
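A small worked example may help show why choosing the wrong input encoding matters. Illumina 1.3+ stores PHRED + 64, while Sanger and Illumina 1.8+ store PHRED + 33. The sketch below shows only the offset arithmetic; the function names are hypothetical, and how FASTQ Groomer itself handles out-of-range values is not covered here.

```python
def decode(qual, offset):
    """Decode a quality string to PHRED scores using the given ASCII offset."""
    return [ord(c) - offset for c in qual]


def illumina13_to_sanger(qual):
    """Re-encode a genuine Illumina 1.3+ (offset 64) string as Sanger (offset 33)."""
    return "".join(chr(ord(c) - 64 + 33) for c in qual)


# High-quality Illumina 1.8+ (Sanger-encoded) qualities:
qual_18 = "IIHGF"
print(decode(qual_18, 33))  # [40, 40, 39, 38, 37]  read correctly
print(decode(qual_18, 64))  # [9, 9, 8, 7, 6]       misread as Illumina 1.3+

# A genuine Illumina 1.3+ string carrying the same scores:
print(illumina13_to_sanger("hhgfe"))  # IIHGF
```

Good 1.8+ reads misread with the 1.3+ offset come out with scores just below 10, which is consistent with the "quality of around 10" reported earlier in the thread, although the thread itself does not confirm that this is what happened.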
Re: [galaxy-user] FastQ Groomer and Compute Quality Statistics
Thanks Ross,

I don't see it under my local install. Are there any pre-written scripts to integrate it with a local Galaxy instance? I assume you are talking about this tool: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

-John

From: Ross [ross.laza...@gmail.com]
Sent: Wednesday, June 01, 2011 11:41 AM
To: John David Osborne
Cc: galaxy-u...@bx.psu.edu
Subject: Re: [galaxy-user] FastQ Groomer and Compute Quality Statistics

You can avoid the space/time overhead of grooming and get comprehensive QC reports using the new wrapper for FastQC (under NGS: QC). It takes FASTQ of any flavour (and BAM), groomed or not, and produces a superset of the Compute Quality Statistics output without the need for an intermediate step. Highly recommended.
Re: [galaxy-user] FastQ Groomer and Compute Quality Statistics
On Thu, Jun 9, 2011 at 10:12 AM, John David Osborne ozb...@uab.edu wrote:

Thanks Ross, I don't see it under my local install. Are there any pre-written scripts to integrate it with a local Galaxy instance? I assume you are talking about this tool: http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

-John

Hi John,

It's on main and test, i.e. the FastQC wrapper is distributed with the current stable and central branches, so your local tool_conf.xml may be out of date, since it is not automagically refreshed from the distributed .sample file. If you diff your local tool_conf.xml against the current distributed sample, you should see the lines you need to add, which point to rgenetics/fastqc.xml:

    $ grep -i fastqc tool_conf.xml
    <label text="FastQC: fastq/sam/bam" id="fastqcsambam" />
    <tool file="rgenetics/rgFastQC.xml" />

Like everything else, you'll want to install the jar locally so it can be found by the cluster. The default location is tool-data/shared/jars/FastQC, so that the tool can find the fastqc perl script (yes, I know... but it's worth it!):

    <command interpreter="python">rgFastQC.py -i $input_file -d $html_file.files_path -o $html_file -n $out_prefix -f $input_file.ext -e ${GALAXY_DATA_INDEX_DIR}/shared/jars/FastQC/fastqc</command>

I hope this helps?

-- Ross Lazarus MBBS MPH; Associate Professor, Harvard Medical School; Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850; Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;
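The comparison of a local tool_conf.xml with the distributed sample can also be scripted. The following is a minimal sketch under stated assumptions: the two XML snippets are hypothetical, heavily shortened stand-ins for real config files, the section name and the groomer path in them are made up for illustration, and the function names are not part of Galaxy.

```python
import xml.etree.ElementTree as ET


def tool_files(conf_xml):
    """Return the `file` attribute of every <tool> element, in document order."""
    root = ET.fromstring(conf_xml)
    return [tool.get("file") for tool in root.iter("tool") if tool.get("file")]


def missing_from_local(local_xml, sample_xml):
    """List tool files present in the sample config but absent from the local one."""
    local = set(tool_files(local_xml))
    return [path for path in tool_files(sample_xml) if path not in local]


# Hypothetical, shortened configs for illustration only.
LOCAL = """<toolbox>
  <section name="NGS: QC" id="ngs_qc">
    <tool file="fastq/fastq_groomer.xml" />
  </section>
</toolbox>"""

SAMPLE = """<toolbox>
  <section name="NGS: QC" id="ngs_qc">
    <label text="FastQC: fastq/sam/bam" id="fastqcsambam" />
    <tool file="rgenetics/rgFastQC.xml" />
    <tool file="fastq/fastq_groomer.xml" />
  </section>
</toolbox>"""

print(missing_from_local(LOCAL, SAMPLE))  # ['rgenetics/rgFastQC.xml']
```

In practice the two strings would be read from the local tool_conf.xml and the distributed sample file; a plain text diff, as suggested in the message, shows the same information along with the surrounding label lines.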
Re: [galaxy-user] FastQ Groomer and Compute Quality Statistics
You can avoid the space/time overhead of grooming and get comprehensive QC reports using the new wrapper for FastQC (under NGS: QC). It takes FASTQ of any flavour (and BAM), groomed or not, and produces a superset of the Compute Quality Statistics output without the need for an intermediate step. Highly recommended.

On Wed, Jun 1, 2011 at 12:02 PM, John David Osborne ozb...@uab.edu wrote:

I noticed that for our new Illumina data (which is generated in Sanger format) the FASTQ Groomer output is identical to the Illumina FASTQ input file. I was hoping to just use the raw FASTQ files as input (saving disk space) for computing quality statistics to look at box plots, but the Compute Quality Statistics tool appears to require that the data have been run through FASTQ Groomer first. Is there a way to get around this, and is this a bug? I am assuming this is some sort of safety measure built into the tool?

-John

-- Ross Lazarus MBBS MPH; Associate Professor, Harvard Medical School; Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850; Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;