Re: [galaxy-user] BWA and FASTQ Joiner issues

Jennifer Jackson Thu, 10 Jan 2013 11:21:10 -0800

Hi Hilde,

Glad you wrote - we can try to help -


On 1/10/13 8:33 AM, Hilde Stawski wrote:

Hello,
I hope someone might be able to help me with these issues, as I'm relatively new at Bioinformatics.
When analyzing data on the main website of Galaxy (all samples from the same Illumina MiSeq run) some sets fail in the BWA alignment. I have tried rerunning my workflow again, reuploading the FastQ files (in case they were corrupted) but BWA fails every single time. I've pasted the error log below.
Q1: Why are most of my datasets aligned by BWA without a problem, but some consistently fail?

To confirm, you were first using the public Main Galaxy instance at https://main.g2.bx.psu.edu (usegalaxy.org)? Normally I would suggest sending in a bug report from an error dataset so that we can provide some feedback. But the bit of the error you sent and your own analysis suggest that data content is the root issue. Next time, though, this is how to report an error:

http://wiki.galaxyproject.org/Support#Reporting_tool_errors

So, some people at SEQanswers suggested I install a local Galaxy client on my computer, and I'm now trying to rework my workflow so I can use BFAST as an aligner. BWA was giving us some trouble due to indels in our mtDNA sequence, so we are trying to find another aligning tool that is more capable of working with these indels, anyway.
However, now I'm running into a second problem. I have PE data, and after grooming both FASTQ files, the FASTQ Joiner generates an empty file. I did try the workaround mentioned here (http://lists.bx.psu.edu/pipermail/galaxy-user/2012-April/004519.html) but I'm not sure I understand the following instructions:
"NGS: QC and manipulation -> Tabular to FASTQ" run twice
Recreate both FASTQ files from the same tabular file."
I ran FASTQ to Tabular with two columns, and joined the files on the first column. Why/how should I recreate both FASTQ files if I wanted to create just one single file to use as input for alignment? FASTQ Groomer doesn't seem to be able to groom the data either: "Based upon quality and sequence, the input data is valid for: None Input ASCII range: '0'(48) - 'N'(78) Input decimal range: 15 - 45"
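
The two ranges in the Groomer message differ by a fixed offset: 48 - 33 = 15 and 78 - 33 = 45, which is the arithmetic a Phred+33 (Sanger-style) encoding would give. A small sketch of that conversion follows; the helper name `decode_range` is hypothetical and is not part of Galaxy or the Groomer, and the sketch does not explain why the Groomer reported "None".

```python
def decode_range(lowest_char, highest_char, offset=33):
    """Return (min, max) quality scores for an ASCII range at a given offset."""
    return ord(lowest_char) - offset, ord(highest_char) - offset


# Characters taken from the Groomer message: '0' (48) through 'N' (78).
print(decode_range("0", "N"))       # (15, 45) with a Phred+33 offset
print(decode_range("0", "N", 64))   # (-16, 14) with a Phred+64 offset
```

Negative scores under the +64 offset suggest that the reported decimal range was computed with the +33 offset.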

These instructions are for creating two inputs - one fastq dataset of forward reads and one fastq dataset of reverse reads - as required by a tool such as BWA. The overall method was for ensuring that the same data was QC'd and matched between these two inputs. This is not our recommended method anymore for the target analysis - it works, but is not needed (for others reading this post - just map the data and filter for pairs afterwards if desired, but this too can often be skipped).

For BFAST, you want the reads interleaved and in the same fastq dataset. It sounds like you are having trouble with the 'FASTQ Joiner' tool. There have been known issues with certain sequence ID formats in the past, so verifying the format of the inputs would be the first step. If you continue to have no output, you can also send this in as a bug report (if there is an error), or if not in error and just empty, share a link to your history with me and I can provide feedback. I know that there are a few command line tools to join data, and that may be the recommendation - it just depends on your data, but let's check first to see if there isn't another solution.
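
As one illustration of the command-line route, here is a minimal sketch that writes forward and reverse reads alternately into a single stream and checks that the sequence IDs agree. It assumes 4-line FASTQ records, mates in the same order in both inputs, and IDs that differ only by a trailing /1 or /2 or by a space-separated comment. The function names are hypothetical; this is not how any Galaxy tool or BFAST is implemented.

```python
import io
from itertools import zip_longest


def read_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (a blank line also stops the reader)
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def base_id(header):
    """Drop the leading '@', any comment after whitespace, and a /1 or /2 suffix."""
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def interleave(forward, reverse, out):
    """Write mates alternately to out; raise if the inputs fall out of sync."""
    pairs = 0
    for fwd, rev in zip_longest(read_fastq(forward), read_fastq(reverse)):
        if fwd is None or rev is None:
            raise ValueError("inputs contain different numbers of reads")
        if base_id(fwd[0]) != base_id(rev[0]):
            raise ValueError(f"ID mismatch: {fwd[0]} vs {rev[0]}")
        for header, seq, qual in (fwd, rev):
            out.write(f"{header}\n{seq}\n+\n{qual}\n")
        pairs += 1
    return pairs


# Tiny made-up example; real inputs would be opened with open(path).
fwd = io.StringIO("@r1/1\nACGT\n+\nIIII\n@r2/1\nTTGA\n+\nIIII\n")
rev = io.StringIO("@r1/2\nCCAT\n+\nIIII\n@r2/2\nGGCA\n+\nIIII\n")
out = io.StringIO()
print(interleave(fwd, rev, out), "pairs written")
```

An ID mismatch raised here would point to the kind of sequence ID format problem mentioned above, which is worth ruling out before trying other tools.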

Use Option (gear icon) -> Share or Publish -> generate "share" link -> copy and paste into a return email and note the dataset #s that are a concern. Please leave the datasets undeleted so that I can check the run parameters. You should *not* cc the mailing list when sharing a history link, to keep your data private.


Hopefully this helps or will lead to a solution!

Jen
Galaxy team

Q2: How do I create a single file as input for the aligner?

Thanks in advance for any help,

Hilde

---

The alignment failed.
Error generating alignments. [bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (2402, 5234, 8910)
[infer_isize] low and high boundaries: 151 and 21926 for estimating avg and std
[infer_isize] inferred external isize from 251269 pairs: 5940.068 +/- 4124.607
[infer_isize] skewness: 0.511; kurtosis: -0.744; ap_prior: 1.00e-05
[infer_isize] inferred maximum insert size: 23222 (4.19 sigma)
[bwa_sai2sam_pe_core] time elapses: 1.37 sec
[bwa_sai2sam_pe_core] changing coordinates of 0 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_paired_sw] 3297 out of 10352 Q17 singletons are mated.
[bwa_paired_sw] 0 out of 188377 Q17 discordant pairs are fixed.
[bwa_sai2sam_pe_core] time elapses: 2241.09 sec
[bwa_sai2sam_pe_core] refine gapped alignments... 1.37 sec
[bwa_sai2sam_pe_core] print alignments... 1.76 sec
[bwa_sai2sam_pe_core] 262144 sequences have been processed.
[bwa_sai2sam_pe_core] convert to sequence coordinate...
[infer_isize] (25, 50, 75) percentile: (2595, 7186, 11297)
[infer_isize] low and high boundaries: 151 and 28701 for estimating avg and std
[infer_isize] inferred external isize from 84557 pairs: 7236.733 +/- 4785.357
[infer_isize] skewness: 0.066; kurtosis: -1.262; ap_prior: 1.00e-05
[infer_isize] inferred maximum insert size: 27144 (4.16 sigma)
[bwa_sai2sam_pe_core] time elapses: 0.38 sec
[bwa_sai2sam_pe_core] changing coordinates of 0 alignments.
[bwa_sai2sam_pe_core] align unmapped mate...
[bwa_paired_sw] 482 out of 3212 Q17 singletons are mated.
[bwa_paired_sw] 0 out of 40389 Q17 discordant pairs are fixed.
[bwa_sai2sam_pe_core] time elapses: 532.50 sec
[bwa_sai2sam_pe_core] refine gapped alignments... /bin/sh: line 1: 28068 Segmentation fault bwa sampe /tmp/3030216.cyberstar.psu.edu/tmprNNXwj/tmpXybcjA /tmp/3030216.cyberstar.psu.edu/tmpq3YCcl/tmpKvAb8e /tmp/3030216.cyberstar.psu.edu/tmpq3YCcl/tmpvUAE3E /galaxy/main_pool/pool3/files/005/540/dataset_5540834.dat /galaxy/main_pool/pool3/files/005/540/dataset_5540837.dat >> /galaxy/main_pool/pool2/tmp/job_working_directory/004/860/4860532/galaxy_dataset_5540842.dat
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/


--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org
