Re: [galaxy-user] Combining the paired reads from Illumina run

2011-03-30 Thread Surya Saha
Hi Florent,

This looks great. Hope it gets committed into the repository soon.

Best,
Surya

On Tue, Mar 29, 2011 at 5:59 PM, Florent Angly wrote:

>  Hi Surya,
>
> I made Galaxy scripts, FASTQ interlacer and de-interlacer,  to do exactly
> what you are describing:
> https://bitbucket.org/fangly/galaxy-central/changeset/3fa11cf2730d
> The tools extend the Galaxy Python API and therefore need Galaxy to work.
> Unfortunately, FASTQ interlacer and de-interlacer are still waiting to be
> committed to the Galaxy development repository by a Galaxy maintainer.
>
> Florent
>
>
>
> On 30/03/11 01:29, Surya Saha wrote:
>
> Hi,
>
> I have two fastq files with the forward (/1) and reverse (/2) paired reads.
> The reads are not in the same order in the two files, some mates are
> missing, and the files are 8 GB each with about 30 million reads each.
>
> I am trying to pull out all the paired reads for which both fwd and rev
> exist. Can I use a combination of fastq tools in Galaxy to do this?
>
> Thanks!
>
> -Surya
>
>
> ___
> The Galaxy User list should be used for the discussion of
> Galaxy analysis and other features on the public server
> at usegalaxy.org.  Please keep all replies on the list by
> using "reply all" in your mail client.  For discussion of
> local Galaxy instances and the Galaxy source code, please
> use the Galaxy Development list:
>
>   http://lists.bx.psu.edu/listinfo/galaxy-dev
>
>
> To manage your subscriptions to this and other Galaxy lists,
> please use the interface at:
>
>   http://lists.bx.psu.edu/
>
>
>

Re: [galaxy-user] Combining the paired reads from Illumina run

2011-03-29 Thread Florent Angly

Hi Surya,

I made Galaxy scripts, FASTQ interlacer and de-interlacer,  to do 
exactly what you are describing: 
https://bitbucket.org/fangly/galaxy-central/changeset/3fa11cf2730d
The tools extend the Galaxy Python API and therefore need Galaxy to 
work. Unfortunately, FASTQ interlacer and de-interlacer are still 
waiting to be committed to the Galaxy development repository by a Galaxy 
maintainer.
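
As a rough illustration of what interlacing and de-interlacing mean here (this is not the code from the changeset above, and it does not use the Galaxy API), a minimal in-memory sketch in Python. The function names and the (id, sequence, quality) tuple layout are assumptions made for the example, and reads are assumed to be named with a trailing /1 or /2.

```python
def interlace(fwd_records, rev_records):
    """Pair up reads by name. Records are (read_id, sequence, quality) tuples.

    Returns (interlaced, singles): interlaced alternates forward and reverse
    mates; singles holds reads whose mate is missing.
    """
    def key(read_id):
        # Assumption: mates differ only by a trailing /1 or /2.
        return read_id[:-2] if read_id.endswith(("/1", "/2")) else read_id

    rev_by_key = {key(rec[0]): rec for rec in rev_records}
    interlaced, singles = [], []
    for rec in fwd_records:
        mate = rev_by_key.pop(key(rec[0]), None)
        if mate is None:
            singles.append(rec)
        else:
            interlaced += [rec, mate]
    singles.extend(rev_by_key.values())  # reverse reads that were never claimed
    return interlaced, singles


def deinterlace(interlaced):
    """Split an interlaced list back into forward and reverse lists."""
    return interlaced[0::2], interlaced[1::2]


fwd = [("r1/1", "ACGT", "IIII"), ("r2/1", "GGCC", "HHHH")]
rev = [("r3/2", "TTAA", "GGGG"), ("r1/2", "TGCA", "FFFF")]
interlaced, singles = interlace(fwd, rev)
fwd_paired, rev_paired = deinterlace(interlaced)
print(interlaced)
print(singles)
```

Holding everything in a dictionary is fine for a toy example but would need a lot of memory for files with tens of millions of reads.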


Florent


On 30/03/11 01:29, Surya Saha wrote:

> Hi,
>
> I have two fastq files with the forward (/1) and reverse (/2) paired
> reads. The reads are not in the same order in the two files, some mates
> are missing, and the files are 8 GB each with about 30 million reads each.
>
> I am trying to pull out all the paired reads for which both fwd and
> rev exist. Can I use a combination of fastq tools in Galaxy to do this?
>
> Thanks!
>
> -Surya



Re: [galaxy-user] Combining the paired reads from Illumina run

2011-03-29 Thread Surya Saha
Hi Tony,

Yes, that should work too. I have written a BioPerl hack that indexes the
reads and pulls out the pairs; it is chugging away right now. If that does
not work out, I will give your idea a shot. Thanks!

Best,
Surya

On Tue, Mar 29, 2011 at 4:20 PM, Barbet, Anthony F wrote:

> Can you not do fastq join on the 2 files, fastq filter for the single
> full-length combined size (same max and min bases, and quality if you
> want), then fastq splitter?
>
> Tony
> 
> From: galaxy-user-boun...@lists.bx.psu.edu On Behalf Of Surya Saha
> [ss2...@cornell.edu]
> Sent: Tuesday, March 29, 2011 4:00 PM
> To: Anton Nekrutenko
> Cc: galaxy-user@lists.bx.psu.edu
> Subject: Re: [galaxy-user] Combining the paired reads from Illumina run
>
> Hi Anton,
>
> Thank you for the tip. The sequence names do end in /1 and /2 but that can
> be fixed using Manipulate FASTQ tool, right?
>
> -Surya
>
> On Tue, Mar 29, 2011 at 3:46 PM, Anton Nekrutenko wrote:
> >
> > You can try converting fastq to tabular (NGS: QC and Manipulation),
> > joining (Join, Subtract and Group) the two files on ids (provided they do
> > not have /1 and /2), splitting into two files with cut (Text manipulation),
> > and going back into fastq with tabular-to-fastq (NGS: QC and Manipulation).
> > With 30 mil reads this will likely take some time though.
> > Thanks,
> > anton
> >
> > On Mar 29, 2011, at 11:38 AM, Surya Saha wrote:
> >
> > These are Illumina reads
> >
> > -S.
> >
> > On Tue, Mar 29, 2011 at 11:37 AM, Anton Nekrutenko wrote:
> >>
> >> Are these Illumina or SOLiD reads?
> >>
> >> Tx,
> >>
> >> anton
> >>
> >>
> >> On Mar 29, 2011, at 11:29 AM, Surya Saha wrote:
> >>
> >> > Hi,
> >> >
> >> > I have two fastq files with the forward (/1) and reverse (/2) paired
> >> > reads. The reads are not in the same order in the two files, some mates
> >> > are missing, and the files are 8 GB each with about 30 million reads each.
> >> >
> >> > I am trying to pull out all the paired reads for which both fwd and
> >> > rev exist. Can I use a combination of fastq tools in Galaxy to do this?
> >> >
> >> > Thanks!
> >> >
> >> > -Surya
> >>
> >> Anton Nekrutenko
> >> http://nekrut.bx.psu.edu
> >> http://usegalaxy.org
> >>
> >>
> >>
> >
> >
> > Anton Nekrutenko
> > http://nekrut.bx.psu.edu
> > http://usegalaxy.org
> >
> >
>
>

Re: [galaxy-user] Combining the paired reads from Illumina run

2011-03-29 Thread Anton Nekrutenko
Yes, in a hacky way: translate "/1" into something else, such as two spaces
"  ", or your favorite chemical element, such as "He" ;)
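
Outside Galaxy, the same renaming step can be sketched in a few lines of Python. This is only an illustration, with a made-up function name and made-up read names. It assumes the mate marker is exactly a trailing "/1" or "/2", and it removes the marker rather than translating it to another string, so that both mates end up with the same id to join on.

```python
import re

# Assumption: the mate marker is a trailing "/1" or "/2" on the read name.
_MATE_SUFFIX = re.compile(r"/([12])$")


def split_mate(read_id):
    """Return (base_id, mate), where mate is '1', '2', or None if unmarked."""
    match = _MATE_SUFFIX.search(read_id)
    if match is None:
        return read_id, None
    return read_id[:match.start()], match.group(1)


print(split_mate("lane6:73:941:1973/1"))
print(split_mate("lane6:73:941:1973/2"))
```

Keeping the mate number alongside the base id makes it easy to put the suffix back after joining.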

a.


On Mar 29, 2011, at 4:00 PM, Surya Saha wrote:

> The sequence names do end in /1 and /2 but that can be fixed using Manipulate 
> FASTQ tool, right?

Anton Nekrutenko
http://nekrut.bx.psu.edu
http://usegalaxy.org






Re: [galaxy-user] Combining the paired reads from Illumina run

2011-03-29 Thread Surya Saha
Hi Anton,

Thank you for the tip. The sequence names do end in /1 and /2 but that can
be fixed using Manipulate FASTQ tool, right?

-Surya

On Tue, Mar 29, 2011 at 3:46 PM, Anton Nekrutenko wrote:
>
> You can try converting fastq to tabular (NGS: QC and Manipulation),
> joining (Join, Subtract and Group) the two files on ids (provided they do
> not have /1 and /2), splitting into two files with cut (Text manipulation),
> and going back into fastq with tabular-to-fastq (NGS: QC and Manipulation).
> With 30 mil reads this will likely take some time though.
> Thanks,
> anton
>
> On Mar 29, 2011, at 11:38 AM, Surya Saha wrote:
>
> These are Illumina reads
>
> -S.
>
> On Tue, Mar 29, 2011 at 11:37 AM, Anton Nekrutenko wrote:
>>
>> Are these Illumina or SOLiD reads?
>>
>> Tx,
>>
>> anton
>>
>>
>> On Mar 29, 2011, at 11:29 AM, Surya Saha wrote:
>>
>> > Hi,
>> >
>> > I have two fastq files with the forward (/1) and reverse (/2) paired
>> > reads. The reads are not in the same order in the two files, some mates
>> > are missing, and the files are 8 GB each with about 30 million reads each.
>> >
>> > I am trying to pull out all the paired reads for which both fwd and rev
>> > exist. Can I use a combination of fastq tools in Galaxy to do this?
>> >
>> > Thanks!
>> >
>> > -Surya
>>
>> Anton Nekrutenko
>> http://nekrut.bx.psu.edu
>> http://usegalaxy.org
>>
>>
>>
>
>
> Anton Nekrutenko
> http://nekrut.bx.psu.edu
> http://usegalaxy.org
>
>

Re: [galaxy-user] Combining the paired reads from Illumina run

2011-03-29 Thread Anton Nekrutenko
You can try converting fastq to tabular (NGS: QC and Manipulation), joining
(Join, Subtract and Group) the two files on ids (provided they do not have /1
and /2), splitting into two files with cut (Text manipulation), and going back
into fastq with tabular-to-fastq (NGS: QC and Manipulation). With 30 mil reads
this will likely take some time though.
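
The four steps above (fastq to tabular, join on ids, cut, tabular to fastq) can be sketched outside Galaxy in plain Python. This is an illustration of the workflow only, not the Galaxy tools themselves; the function names and the toy reads are assumptions made for the example, and the /1 and /2 suffixes are stripped only to build the join key.

```python
def fastq_to_tabular(lines):
    """Turn four-line FASTQ records into (id, sequence, quality) rows."""
    rows = []
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        rows.append((header[1:], seq, qual))  # drop the leading "@"
    return rows


def base_id(read_id):
    """Strip a trailing /1 or /2 so both mates share one join key."""
    return read_id[:-2] if read_id.endswith(("/1", "/2")) else read_id


def join_on_id(fwd_rows, rev_rows):
    """Inner join: keep only reads whose id is present in both tables."""
    rev_by_id = {base_id(row[0]): row for row in rev_rows}
    joined = []
    for row in fwd_rows:
        mate = rev_by_id.get(base_id(row[0]))
        if mate is not None:
            joined.append(row + mate)  # six columns: fwd then rev
    return joined


def tabular_to_fastq(rows):
    """Turn (id, sequence, quality) rows back into FASTQ lines."""
    lines = []
    for read_id, seq, qual in rows:
        lines += ["@" + read_id, seq, "+", qual]
    return lines


fwd = ["@r1/1", "ACGT", "+", "IIII", "@r2/1", "GGCC", "+", "HHHH"]
rev = ["@r3/2", "TTAA", "+", "GGGG", "@r1/2", "TGCA", "+", "FFFF"]
joined = join_on_id(fastq_to_tabular(fwd), fastq_to_tabular(rev))
fwd_out = tabular_to_fastq([row[:3] for row in joined])  # "cut" columns 1-3
rev_out = tabular_to_fastq([row[3:] for row in joined])  # "cut" columns 4-6
print(fwd_out)
print(rev_out)
```

Because the join is an inner join, reads without a mate drop out, and both output files come back in the same order.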

Thanks,

anton


On Mar 29, 2011, at 11:38 AM, Surya Saha wrote:

> These are Illumina reads
> 
> -S.
> 
> On Tue, Mar 29, 2011 at 11:37 AM, Anton Nekrutenko wrote:
> Are these Illumina or SOLiD reads?
> 
> Tx,
> 
> anton
> 
> 
> On Mar 29, 2011, at 11:29 AM, Surya Saha wrote:
> 
> > Hi,
> >
> > I have two fastq files with the forward (/1) and reverse (/2) paired reads.
> > The reads are not in the same order in the two files, some mates are
> > missing, and the files are 8 GB each with about 30 million reads each.
> >
> > I am trying to pull out all the paired reads for which both fwd and rev 
> > exist. Can I use a combination of fastq tools in Galaxy to do this?
> >
> > Thanks!
> >
> > -Surya
> 
> Anton Nekrutenko
> http://nekrut.bx.psu.edu
> http://usegalaxy.org
> 
> 
> 
> 

Anton Nekrutenko
http://nekrut.bx.psu.edu
http://usegalaxy.org




Re: [galaxy-user] Combining the paired reads from Illumina run

2011-03-29 Thread Surya Saha
These are Illumina reads

-S.

On Tue, Mar 29, 2011 at 11:37 AM, Anton Nekrutenko wrote:

> Are these Illumina or SOLiD reads?
>
> Tx,
>
> anton
>
>
> On Mar 29, 2011, at 11:29 AM, Surya Saha wrote:
>
> > Hi,
> >
> > I have two fastq files with the forward (/1) and reverse (/2) paired reads.
> > The reads are not in the same order in the two files, some mates are
> > missing, and the files are 8 GB each with about 30 million reads each.
> >
> > I am trying to pull out all the paired reads for which both fwd and rev
> > exist. Can I use a combination of fastq tools in Galaxy to do this?
> >
> > Thanks!
> >
> > -Surya
>
> Anton Nekrutenko
> http://nekrut.bx.psu.edu
> http://usegalaxy.org
>
>
>
>

Re: [galaxy-user] Combining the paired reads from Illumina run

2011-03-29 Thread Anton Nekrutenko
Are these Illumina or SOLiD reads?

Tx,

anton


On Mar 29, 2011, at 11:29 AM, Surya Saha wrote:

> Hi,
> 
> I have two fastq files with the forward (/1) and reverse (/2) paired reads.
> The reads are not in the same order in the two files, some mates are
> missing, and the files are 8 GB each with about 30 million reads each.
> 
> I am trying to pull out all the paired reads for which both fwd and rev 
> exist. Can I use a combination of fastq tools in Galaxy to do this?
> 
> Thanks!
> 
> -Surya

Anton Nekrutenko
http://nekrut.bx.psu.edu
http://usegalaxy.org






[galaxy-user] Combining the paired reads from Illumina run

2011-03-29 Thread Surya Saha
Hi,

I have two fastq files with the forward (/1) and reverse (/2) paired reads.
The reads are not in the same order in the two files, some mates are
missing, and the files are 8 GB each with about 30 million reads each.

I am trying to pull out all the paired reads for which both fwd and rev
exist. Can I use a combination of fastq tools in Galaxy to do this?

Thanks!

-Surya