dear all,
I already reported that I have a problem using the estimate.mean.fraglen
function from the chipseq package using the method="correlation", finally I
think I found the reason for the problem:
short example: I filter my data set to unique reads on chromosomes 1-22, X
and Y and filter also for reads on + and - strand.
> chromFilter <- chromosomeFilter( "(^[1-9]|^X|^Y)" )
> strFilter <- strandFilter( c("+", "-") )
> theFilters <- compose( chromFilter, strFilter )
> AlignedReads <- AlignedReads[[1]]
> AlignedReads <- AlignedReads[ theFilters( AlignedReads ) ]
> uFilter <- uniqueFilter( withSread=FALSE )
> AlignedReads <- AlignedReads[ uFilter( AlignedReads ) ]
> AlignedReads
class: AlignedRead
length: 17860091 reads; width: 32 36 cycles
chromosome: 7 8 ... 1 15
position: 131934390 3275112 ... 64185954 45913614
strand: + + ... - -
alignQuality: NumericQuality
alignData varLabels: similar mismatch
> unique( strand( AlignedReads ))
[1] + -
Levels: - + *
this was actually the first surprise with the levels of strand being - + *
although the object does only contain reads on + and - strand. this was then
also the cause for the error in the estimate.mean.fraglen function:
> Test <- estimate.mean.fraglen( AlignedReads, method="correlation" )
Error in if (from > from0 || to < to0) stop("[from, to] smaller than support
not implemented yet (but easy to add)") :
missing value where TRUE/FALSE needed
Error in unlist(lapply(splitData, function(y) { :
error in evaluating the argument 'x' in selecting a method for function
'unlist'
by looking at the source code of the chipseq package I finally realised that
the 3 levels of the strand are the problem, since the estimate.mean.fraglen
function defined for AlignedRead classes splits the reads based on the
strand (and "split" with "drop=FALSE" (the default) will split the data set
into 3 columns, one for each level):
setMethod("estimate.mean.fraglen", "AlignedRead",
function(x, method = c("SISSR", "coverage", "correlation"), ...) {
splitData <-
split(data.frame(strand = strand(x),
start =
ifelse(strand(x) == "-",
position(x) + width(x) - 1L,
position(x))),
chromosome(x))
unlist(lapply(splitData,
function(y) {
estimate.mean.fraglen(split(y[["start"]],
## splits reads
y[["strand"]]),
method = method, ...)
}))
})
I suggest the following solutions:
1) either reduce the levels of strand( AlignedReads ) to only + and - after
applying a strandFilter, or
2) use split with the option drop=TRUE in the estinmate.mean.fraglen
function:
...
function(y) {
estimate.mean.fraglen(split(y[["start"]],
y[["strand"]],
drop=TRUE ),
method = method, ...)
}))
...
best regards, jo
--
Johannes Rainer, PhD
Bioinformatics Group,
Division Molecular Pathophysiology,
Biocenter, Medical University Innsbruck,
Fritz-Pregl-Str 3/IV, 6020 Innsbruck, Austria
and
Tyrolean Cancer Research Institute
Innrain 66, 6020 Innsbruck, Austria
Tel.: +43 512 570485 13
Email: [email protected]
[email protected]
URL: http://bioinfo.i-med.ac.at
[[alternative HTML version deleted]]
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing