Hi Hiroki,

This question has come up before, and the best advice our team has to offer is that in most cases, filtering the data this way is unnecessary. Still, there are a few ways to do this, though they are tedious. One is to convert everything to tabular format, extract the IDs, compare and join the datasets against the ID lists, then convert back to fastq. Another is the one Carlos brings up (joining, filtering, splitting), but that has not worked for all sequence formats in the past. Neither of these is recommended, but you are of course welcome to test out and try whatever tools/methods you wish.
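
If you do want to try the ID-matching idea outside of Galaxy, here is a minimal Python sketch of the same logic. It is only an illustration, not a Galaxy tool: the function names are made up for this example, and it assumes plain four-line fastq records whose read names either match exactly or differ only by a trailing /1 or /2. It also holds both files in memory, so it is only suitable for small datasets.

```python
def normalize_id(header):
    """Return the read name without '@', description, or a /1 or /2 suffix.

    Assumption: mates share a name apart from an optional /1 or /2 suffix.
    """
    name = header[1:].split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def read_fastq(lines):
    """Yield (read_id, record) pairs from four-line fastq records."""
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip blank lines between records
        seq = next(it).rstrip("\n")
        plus = next(it).rstrip("\n")
        qual = next(it).rstrip("\n")
        yield normalize_id(header), (header, seq, plus, qual)


def keep_pairs(forward_lines, reverse_lines):
    """Keep only reads present in both inputs, in forward-file order."""
    fwd = dict(read_fastq(forward_lines))
    rev = dict(read_fastq(reverse_lines))
    shared = [rid for rid in fwd if rid in rev]
    return [fwd[rid] for rid in shared], [rev[rid] for rid in shared]


if __name__ == "__main__":
    forward = ["@r1/1", "ACGT", "+", "IIII", "@r2/1", "GGGG", "+", "IIII"]
    reverse = ["@r2/2", "CCCC", "+", "IIII", "@r3/2", "TTTT", "+", "IIII"]
    kept_fwd, kept_rev = keep_pairs(forward, reverse)
    for record in kept_fwd + kept_rev:
        print("\n".join(record))
```

Both returned lists are in the same order, so the two output files would stay in sync read for read.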


With most analysis pipelines, it is fine to leave in the extra reads and proceed with the mapping step. After mapping comes the next opportunity to do some filtering, if you want to retain only properly paired reads, etc. However, even this is not always necessary: it depends on the analysis you are doing (e.g. it is not required for RNA-seq analysis).

These tool groups manipulate/provide stats on SAM/BAM datasets:

NGS: SAM Tools (be sure to see -> Filter SAM)
NGS: Picard (beta)
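
To make the "properly paired" idea concrete, here is a small Python sketch of that kind of filter applied to SAM text. It is not the code behind the Galaxy tools above, and the function name is made up for this example. It relies only on the SAM format itself: header lines start with "@", the FLAG is the second tab-separated column, bit 0x1 marks a paired read, and bit 0x2 marks a pair the aligner considers properly aligned.

```python
PAIRED = 0x1        # read is part of a pair
PROPER_PAIR = 0x2   # aligner marked the pair as properly aligned


def properly_paired(sam_lines):
    """Yield header lines plus alignments flagged as properly paired."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # keep the SAM header untouched
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & PAIRED and flag & PROPER_PAIR:
            yield line


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.0",
        "r1\t99\tchr1\t100\t60\t4M\t=\t200\t104\tACGT\tIIII",
        "r2\t73\tchr1\t300\t60\t4M\t=\t300\t0\tACGT\tIIII",
    ]
    for kept in properly_paired(example):
        print(kept)
```

In the example, r1 (flag 99) is kept and r2 (flag 73, mate unmapped) is dropped.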


Hopefully this helps!

Jen
Galaxy team

On 1/7/13 9:54 PM, 柴田 弘紀 wrote:
Hi there,

I obtained two fastq files from a GA paired-end run. I filtered each file by
quality using the fastq tool kit. As a result, some forward reads may have been
removed for low quality while their reverse counterparts remain in the other
file, or vice versa.

I want to remove those "unpaired" reads from the filtered fastq files so that
the two new fastq files contain identical sets of reads.

Is it possible to do this on Galaxy?

Thank you very much.

Hiroki
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/


--
Jennifer Hillman-Jackson
Galaxy Support and Training
http://galaxyproject.org