Hi Martin
"/../data" says "start at the root directory of the file system and go
one level up, then down to the 'data' directory". Is this what you
mean, or perhaps "./../data"?
The actual whole path is:
"/Users/jdhahbi/JosephDhahbi/SOLEXA/ShortRead/mono/data"
"/../data" is my way of shortening it for the e-mail
the 4 files are in the subdirectory "data"
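To illustrate how ".." is resolved in a path, here is a minimal sketch. The directory names under tempdir() are made up for the example and are not the real layout from this thread.

```r
# Hypothetical directory layout created under a temporary directory
root <- file.path(tempdir(), "path_demo")
dir.create(file.path(root, "mono", "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "other"), showWarnings = FALSE)

# "other/../mono/data": go into 'other', back up one level, then down to 'mono/data'
indirect <- file.path(root, "other", "..", "mono", "data")
direct <- file.path(root, "mono", "data")

# normalizePath() resolves the ".." component, so both spellings
# name the same directory
same_dir <- normalizePath(indirect) == normalizePath(direct)
same_dir
```

Passing the full path, or a path relative to getwd(), avoids any ambiguity about where ".." starts from.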
readAligned as invoked above has no 'pattern' argument, and so will
match all files in the directory '/../data'. Likely what you want is
to add an argument pattern=".*_export.*" or similar; using
list.files("/../data") is a good way to see what you're trying to read
in, for instance list.files("/../data", ".*_export.*")
the subdirectory 'data' contains only the 4 ".*_export.*" files; I meant to
read in all 4.
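The list.files() check suggested above can be tried on a throwaway directory. This is only a sketch: the directory and the file names below are invented for illustration.

```r
# Hypothetical directory with made-up file names, for illustration only
demo_dir <- file.path(tempdir(), "export_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(
    demo_dir, c("s_2_1_export.txt", "s_3_1_export.txt", "notes.txt"))))

# Everything in the directory
all_files <- list.files(demo_dir)

# Only the names matching the regular expression
export_files <- list.files(demo_dir, pattern = ".*_export.*")
export_files
```

When the directory holds nothing but the _export files, as here in 'data', the two calls return the same names, so the pattern is a safeguard rather than a requirement.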
The error itself likely comes from trying to allocate a very large
object. In general at least for the initial stages of an analysis a
good work flow might read a 'large' file, perform some processing to
result in a 'small' object, read the next, etc., and finally combine
the small objects. This is what qa() does (visit an _export file,
summarize, visit the next, combine results into the object that you
refer to as qaSummary); a simple (and too naive) start to a ChIP-seq
analysis might generate a list of files containing aligned reads and
then summarize where in the genome the reads align to using
ShortRead::coverage (I did not test the following code),
    files <- list.files(dirPath, ".*_export.*", full.names=TRUE)
    cvg <- lapply(files, function(file) {
        aln <- readAligned(dirname(file), basename(file), type="SolexaExport")
        coverage(aln)
    })
the results of coverage() are quite small and manageable, and the cvg
object contains data from all 'files'; using srapply rather than
lapply would allow this to be distributed across processors.
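The read, summarize, combine pattern described above can be sketched in base R without ShortRead. This is an illustration under assumptions, not the ShortRead workflow itself: the input files are tiny made-up text files with one chromosome name per line, standing in for the large _export files, and a count per chromosome stands in for coverage().

```r
# Hypothetical inputs: one chromosome name per aligned read
demo_dir <- file.path(tempdir(), "summarize_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("chr1", "chr1", "chr2"), file.path(demo_dir, "a_export.txt"))
writeLines(c("chr2", "chr3"), file.path(demo_dir, "b_export.txt"))

files <- list.files(demo_dir, pattern = "_export", full.names = TRUE)

# Visit one file at a time; only the small per-file summary is kept
per_file <- lapply(files, function(file) {
    reads <- readLines(file)   # the 'large' object, local to this call
    as.data.frame(table(chrom = reads), stringsAsFactors = FALSE)
})

# Combine the small summaries into a single set of counts
combined <- do.call(rbind, per_file)
total <- tapply(combined$Freq, combined$chrom, sum)
total
```

Because 'reads' exists only inside the function, each large object can be garbage-collected before the next file is read, so peak memory use is roughly one file rather than all of them.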
I got a different error when I tried the code you suggested:
> files <- list.files("/Users/jdhahbi/JosephDhahbi/SOLEXA/ShortRead/mono/data",
> ".*_export.*", full=TRUE)
> files
[1] "/Users/jdhahbi/JosephDhahbi/SOLEXA/ShortRead/mono/data/s_2_1_export.txt"
[2] "/Users/jdhahbi/JosephDhahbi/SOLEXA/ShortRead/mono/data/s_2_2_export.txt"
[3] "/Users/jdhahbi/JosephDhahbi/SOLEXA/ShortRead/mono/data/s_3_1_export.txt"
[4] "/Users/jdhahbi/JosephDhahbi/SOLEXA/ShortRead/mono/data/s_3_2_export.txt"
>
> cvg <- lapply(files, function(file) {
+ aln <- readAligned(dirname(file), basename(file), type="SolexaExport")
+ coverage(aln)
+ })
Error in stop("'", argname, , "' cannot contain NAs") :
argument is missing, with no default
Error in coverage(IRanges(rstart, rend), start, end, ...) :
error in evaluating the argument 'x' in selecting a method for function
'coverage'
>
Thank you for your help
>
>> sessionInfo()
> R version 2.8.1 Patched (2009-03-03 r48046)
> i386-apple-darwin9.6.0
> locale:
> en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] tools stats graphics grDevices utils datasets methods
> [8] base
> other attached packages:
> [1] ShortRead_1.0.6 lattice_0.17-20 Biobase_2.2.2
> Biostrings_2.10.16
> [5] IRanges_1.0.13
>
>
>
> [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M2 B169
Phone: (206) 667-2793