On Oct 7, 2009, at 10:08 PM, Martin Morgan wrote:

Hi Michael --

Michael Muratet wrote:
Greetings

I am working on adapting readIntensities from ShortRead to handle the
new Illumina intensity file format, *.cif. Illumina has dropped the
leading zeros from the file name so that if you use list.files to get
file names from the old style you get:




you could extract the lane and tile information along the lines of

 files = c("s_1_1.cif", "s_1_10.cif")
 lanes = as.integer(sub("s_([[:digit:]]+).*", "\\1", files))
 tiles = as.integer(sub(".*_([[:digit:]]+)\\.cif$", "\\1", files))

and then order the files with

 files[order(lanes, tiles)]
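
As a minimal sketch, the two steps above could be wrapped in one function that also rejects names it does not recognise. The function name orderCifFiles and the strict s_<lane>_<tile>.cif pattern are assumptions made for illustration; neither is part of ShortRead, and real run folders may use other names.

```r
# Hypothetical helper (not part of ShortRead): order intensity files named
# s_<lane>_<tile>.cif numerically by lane, then by tile.
orderCifFiles <- function(files) {
    # Assumed naming scheme; the dot is escaped and the pattern is anchored.
    pattern <- "^s_([[:digit:]]+)_([[:digit:]]+)\\.cif$"
    base <- basename(files)
    ok <- grepl(pattern, base)
    if (!all(ok))
        stop("unexpected file name(s): ", paste(files[!ok], collapse = ", "))
    lanes <- as.integer(sub(pattern, "\\1", base))  # first captured group
    tiles <- as.integer(sub(pattern, "\\2", base))  # second captured group
    files[order(lanes, tiles)]
}

files <- c("s_1_10.cif", "s_2_1.cif", "s_1_2.cif", "s_1_1.cif")
sort(files)           # character sort puts "s_1_10.cif" before "s_1_2.cif"
orderCifFiles(files)  # numeric order: tile 2 comes before tile 10
```

Converting the captured digits with as.integer is what makes tile 2 sort ahead of tile 10; stopping on unmatched names is a design choice here, so that a stray file is reported and not silently given an NA sort key.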

In earlier versions, I think the file name was configurable in the
pipeline software and recorded in the XML configuration files, though
few people seemed to actually change it.


Martin

Thanks for the suggestions. Everything appears to be working satisfactorily now. Give me a few days to test and verify, and to update the docs and tests, and I'll send you diffs to incorporate.

Regards

Mike



--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M1 B861
Phone: (206) 667-2793

Michael Muratet, Ph.D.
Senior Scientist
HudsonAlpha Institute for Biotechnology
[email protected]
(256) 327-0473 (p)
(256) 327-0966 (f)

Room 4005
601 Genome Way
Huntsville, Alabama 35806
