On Oct 7, 2009, at 10:08 PM, Martin Morgan wrote:
Hi Michael --
Michael Muratet wrote:
Greetings
I am working on adapting readIntensities from ShortRead to handle the
new Illumina intensity file format, *.cif. Illumina has dropped the
leading zeros from the file name so that if you use list.files to get
file names from the old style you get:
You could extract the lane and tile information along the lines of
files = c("s_1_1.cif", "s_1_10.cif")
lanes = as.integer(sub("s_([[:digit:]]+).*", "\\1", files))
tiles = as.integer(sub(".*_([[:digit:]]+)\\.cif$", "\\1", files))
and then order the files with
files[order(lanes, tiles)]
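The steps above could be folded into a single helper. This is only a minimal sketch, assuming every file name has the form s_<lane>_<tile>.cif; orderCifFiles is a hypothetical name used for illustration, not a ShortRead function, and the stricter pattern is an assumption about the naming scheme.

```r
# Hypothetical helper (not part of ShortRead): order *.cif file names
# numerically by lane, then by tile, instead of alphabetically.
orderCifFiles <- function(files) {
    # Assumed naming scheme: s_<lane>_<tile>.cif
    pattern <- "^s_([[:digit:]]+)_([[:digit:]]+)\\.cif$"
    base <- basename(files)
    ok <- grepl(pattern, base)
    if (!all(ok))
        stop("unexpected file name(s): ",
             paste(base[!ok], collapse = ", "))
    lanes <- as.integer(sub(pattern, "\\1", base))
    tiles <- as.integer(sub(pattern, "\\2", base))
    # Sort on the integer values so that tile 2 comes before tile 10
    files[order(lanes, tiles)]
}

files <- c("s_1_10.cif", "s_1_2.cif", "s_2_1.cif", "s_1_1.cif")
sorted <- orderCifFiles(files)
print(sorted)
```

Failing on names that do not match, rather than silently returning NA lanes or tiles, is a design choice in this sketch; a pipeline that renames its output files would need a different pattern.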
In earlier versions, I think the file name is actually configurable by
the pipeline software, and recorded in the xml configuration files; few
people seemed to actually do this though.
Martin
Thanks for the suggestions. Everything appears to be working in a
satisfactory manner now. Give me a few days to test and verify, update
the docs and tests, and I'll send you diffs to incorporate.
Regards
Mike
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793
Michael Muratet, Ph.D.
Senior Scientist
HudsonAlpha Institute for Biotechnology
(256) 327-0473 (p)
(256) 327-0966 (f)
Room 4005
601 Genome Way
Huntsville, Alabama 35806