Re: [Bioc-sig-seq] About ShortRead Package

Pratap, Abhishek Fri, 10 Jul 2009 09:38:13 -0700

Hi Martin

Thanks for the quick reply. I understand the problem now. The export file
I am using is 2 GB, and that would definitely crash my system if the
memory requirements are 3-5 times the file size.
Right now I am trying this on my dev machine. Maybe I should ask IT to
install it centrally so that I can access it on a bigger machine.

For now I will pick a small chunk to do some initial tests.

Thanks,
_Abhi

-----Original Message-----
From: Martin Morgan [mailto:[email protected]] 
Sent: Friday, July 10, 2009 11:25 AM
To: Pratap, Abhishek
Cc: [email protected]
Subject: Re: [Bioc-sig-seq] About ShortRead Package

Hi Abhi --

Pratap, Abhishek wrote:
> Hi All
> 
>  
> 
> I have recently started to acquaint myself with the new R packages for
> NGS data processing/analysis.  I think the community has done great work
> in developing these packages. I must say I am amazed to see some of the
> capabilities.
> 
>  
> 
> I have a quick comment.  While playing with the ShortRead package I am
> not able to successfully load an export file, not even a single one. I
> am using a single PC to do this.  I believe it has sufficient memory
> (4 GB) to handle one lane of data.  I waited for 15-18 minutes before
> my PC started to show signs of sickness. I eventually had to kill the
> process.
> 
>  
> 
> Here is what I did.
> 
>  
> 
> library(ShortRead)
> 
> sp=SolexaPath("/local/seq_archive/solexa/090309_HWI-EAS397_0006")
> 
> path=analysisPath(sp)[4]  ### I just wanted to look at one GERALD folder.
> 
> aln=readAligned(path,type="SolexaExport","s_6_export.txt")

One tricky point is the number of files specified by your pattern. Does

  list.files(path, "s_6_export.txt")

return just a single file? If not (e.g., because there are both .txt and
.txt.gz versions of s_6_export) then specify the pattern more precisely,
e.g., "^s_6_export.txt$"
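
A minimal sketch of that check, using only base R. The temporary directory
and the two empty files are made up for illustration and stand in for a real
GERALD folder; the stricter pattern also escapes the dot, which is an
optional refinement of the pattern suggested above.

```r
# Hypothetical stand-in for the GERALD folder: a temp directory holding
# both a .txt and a .txt.gz version of the lane 6 export file.
path <- tempfile("gerald_")
dir.create(path)
file.create(file.path(path, c("s_6_export.txt", "s_6_export.txt.gz")))

# The pattern is a regular expression matched anywhere in the file name,
# so the loose pattern picks up both files.
loose <- list.files(path, "s_6_export.txt")

# Anchoring with ^ and $ (and escaping the dot) matches only the .txt file.
strict <- list.files(path, "^s_6_export\\.txt$")

print(loose)
print(strict)
```

If the loose pattern returns more than one file name, readAligned would
try to read all of them, which multiplies the memory needed.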

It might be that your computer does not have enough memory. Very
roughly, for this initial stage, you might expect R to require 3-5 times
as much memory as the file occupies on disk. If your reads are
relatively short, your files might be 500MB or so and you might be fine,
but if your reads are longer the files could be > 1GB and you'd be in
trouble.
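
As a rough illustration of that rule of thumb, the sketch below multiplies
the on-disk size of a file by 3 and by 5. The function name
estimate_load_bytes and the small temporary file are invented for this
example; the 3-5 multiplier is only the very rough figure given above, not a
measured value.

```r
# Hypothetical helper: apply the rough 3-5x rule of thumb to a file's
# size on disk. Returns the low and high estimates in bytes.
estimate_load_bytes <- function(file, multiplier = c(low = 3, high = 5)) {
  file.info(file)$size * multiplier
}

# Made-up 1024-byte file standing in for an export file.
f <- tempfile(fileext = ".txt")
writeBin(raw(1024), f)

est <- estimate_load_bytes(f)
print(est)

# The same arithmetic for a 2 GB export file, in GB.
print(2 * c(low = 3, high = 5))
```

By this estimate a 2 GB export file would need somewhere around 6-10 GB
during the initial read, well beyond a 4 GB machine.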

There are several options, the best being to use a computer with more
memory. You could also split the export file (using unix 'split'
command, for instance). We have also been working on making input more
space- and time- efficient, so the version of ShortRead available with
the development version of R will do a better job (but still require
considerable memory).
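
For systems without the unix 'split' command, the same idea can be sketched
in base R. This assumes the export file holds one record per line, so that
cutting at line boundaries keeps records intact. The function name
split_by_lines, the ".partNNN" naming scheme, and the chunk size are
illustrative choices, not part of ShortRead.

```r
# Hypothetical helper: split a text file into pieces of at most
# lines_per_chunk lines each, reading through a connection so that only
# one chunk is held in memory at a time. Returns the paths written.
split_by_lines <- function(infile, outdir, lines_per_chunk = 1e6) {
  con <- file(infile, open = "r")
  on.exit(close(con))
  i <- 0L
  out <- character()
  repeat {
    chunk <- readLines(con, n = lines_per_chunk)
    if (length(chunk) == 0L) break          # end of file reached
    i <- i + 1L
    dest <- file.path(outdir,
                      sprintf("%s.part%03d", basename(infile), i))
    writeLines(chunk, dest)
    out <- c(out, dest)
  }
  out
}

# Small made-up input: 5 lines split into chunks of 2 lines.
infile <- tempfile(fileext = ".txt")
writeLines(paste("record", 1:5), infile)
outdir <- tempfile("chunks_")
dir.create(outdir)

parts <- split_by_lines(infile, outdir, lines_per_chunk = 2)
print(basename(parts))

# A single chunk could then be read on its own, e.g. (not run here):
# aln <- readAligned(outdir, pattern = "part001$", type = "SolexaExport")
```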

Martin

> GOT STUCK HERE
> 
>  
> 
> Is there anything I am doing the wrong way? Please let me know.
> 
>  
> 
> Cheers,
> 
> -Abhi
> 
> ----------------------------- 
> Abhishek Pratap 
> Bioinformatics Software Engineer 
> Institute for Genome Sciences <http://www.igs.umaryland.edu/>  
> School of Medicine, Univ of Maryland 
> 801, W. Baltimore Street, Baltimore, MD 21209 
> Ph: (+1)-410-706-2296 
> 
> 
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

