On 02/24/2010 06:19 AM, Martin Morgan wrote:
> Hi Nicolas --
>
> These sounds like very useful additions, and I'll try to incorporate
> over the next day or so.
>
> Thank you very much for the contribution!

In ShortRead v. 1.5.7 (available in bioc-devel in the next day or so, or
through svn now), readAligned with type="SolexaExport" accepts the
additional options withId=FALSE, withMultiplexIndex, withPairedReadNumber
and withAll. withMultiplexIndex and withPairedReadNumber read the
corresponding fields from the _export file into columns of alignData(aln);
withId constructs an identifier from the machine, run, tile...
information (accessible with id(aln)); withAll is a convenience that
turns all of these flags on. See ?readAligned and
news(Version>=1.5, "ShortRead").

Martin

>
> Martin
>
> On 02/24/2010 02:55 AM, Nicolas Delhomme wrote:
>> Hi Martin, everyone,
>>
>> I've been looking forward to doing it for a long time now, and,
>> finally, I got the time. So, I dove into the ShortRead C code to add
>> some functionalities when loading Illumina export files. I've added an
>> option to the readAligned method, specifically for the type
>> "SolexaExport" that will in addition to the default information,
>> retrieve the multiplex barcode and the paired read number (the 6 and 7th
>> column of the export file, that were ignored so far). Additionally,
>> using this option will create the sequence identifier (i.e. the one you
>> get in a fastq file extracted from an export file) and populate the id
>> slot of the alignedRead object.
>>
>> I've attached the diff of my local working copy with the revision 44842
>> of ShortRead (the current one, as of this morning), two example export
>> files (one from a single-end (SE) and one from a paired-end (PE)
>> sequencing experiment) and a small R script showing the modified usage.
>>
>> I think that these functionalities are very interesting for people, like
>> me, who have to analyze PE, multiplexed data, and I'd be glad if they
>> got integrated.
>>
>> Finally, I'm, by far, not a C expert, so you might wish/(need?) to
>> optimize what I've written.
>>
>> Best,
>>
>> ---------------------------------------------------------------
>> Nicolas Delhomme
>>
>> High Throughput Functional Genomics Center
>>
>> European Molecular Biology Laboratory
>>
>> Tel: +49 6221 387 8426
>> Email: [email protected]
>> Meyerhofstrasse 1 - Postfach 10.2209
>> 69102 Heidelberg, Germany
>> ---------------------------------------------------------------
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> [email protected]
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793
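
For readers who want to try the options described in the reply above, here is a minimal sketch of how they might be used. It assumes ShortRead >= 1.5.7 is installed; the wrapper name readExportWithExtras, the directory path and the file pattern are hypothetical and only serve the illustration.

```r
library(ShortRead)

## Hypothetical convenience wrapper (not part of ShortRead): read an
## Illumina _export file with all the extra fields switched on.
readExportWithExtras <- function(dirPath, pattern) {
    ## withAll=TRUE is described above as a shortcut for turning on
    ## withId, withMultiplexIndex and withPairedReadNumber together.
    readAligned(dirPath, pattern = pattern,
                type = "SolexaExport", withAll = TRUE)
}

## Usage (the path and pattern are placeholders; adjust to your run folder):
## aln <- readExportWithExtras("path/to/export/dir", "s_1_1_export.txt")
## id(aln)         # identifiers built from machine, run, tile... fields
## alignData(aln)  # should now also carry the multiplex index and
##                 # paired read number columns
```

To switch on only some of the fields, pass the individual flags (for example withMultiplexIndex=TRUE) to readAligned instead of withAll=TRUE.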
