Thanks, I didn't know about the fold command. That's a great suggestion, and using fold solves the problem.
Thanks,
Andrew

On Wed, Sep 22, 2010 at 11:05 AM, Steve Lianoglou <[email protected]> wrote:
> Hi,
>
> On Wed, Sep 22, 2010 at 10:51 AM, Andrew Yee <[email protected]> wrote:
>> Is there a limit to the number of characters in a line for
>> read.DNAStringSet()?
> <snip>
>>> bar <- read.DNAStringSet(filepath='~/sandbox/foo.fasta', format='fasta')
>> Error in .read.fasta.in.XStringSet(filepath, set.names, elementType, lkup) :
>> reading FASTA file : cannot read line 2, line is too long
>
> Apparently so :-)
>
> Assuming you're on a *nix-type machine, you can use the `fold` command
> (from the terminal) to pretty easily fix your problem ... you would
> have to assume a maxlength for your header(?) lines (the ones in your
> fasta file that start with ">"). Since you've already shown that the
> read.DNAStringSet function can handle line lengths of 2000, maybe you
> can use that (or some smaller number), if you like.
>
> From terminal:
>
> $ fold -w 2000 foo.fasta > foo.folded.fasta
>
> Then fire up R and do your reading as usual.
>
> -steve
>
> --
> Steve Lianoglou
> Graduate Student: Computational Systems Biology
> | Memorial Sloan-Kettering Cancer Center
> | Weill Medical College of Cornell University
> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
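
For anyone finding this thread later, here is a minimal sketch of what the fold step does. The 5000-character sequence and the header text are made up for illustration, and the scratch directory is an assumption; only the `fold -w 2000` call and the file names come from the thread.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# Build a toy FASTA file: one header line and one 5000-character sequence line.
# (Illustrative data only, not the file from the thread.)
printf '>seq1 example header\n' > foo.fasta
head -c 5000 /dev/zero | tr '\0' 'A' >> foo.fasta
printf '\n' >> foo.fasta

# Wrap every line at 2000 characters, the width suggested in the thread.
fold -w 2000 foo.fasta > foo.folded.fasta

# Print the longest line length before and after folding.
awk '{ if (length($0) > max) max = length($0) } END { print max }' foo.fasta
awk '{ if (length($0) > max) max = length($0) } END { print max }' foo.folded.fasta
```

Note that fold wraps every line it is given, headers included, so this only leaves the records intact if the ">" lines are shorter than the chosen width.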
