Re: [Bioc-sig-seq] 'coverage' error message

Patrick Aboyoun Mon, 04 Jan 2010 13:48:01 -0800

P.,

As the error message suggests, there is a mismatch between names(arab.chromlens) and levels(chromosome(alns)), meaning the chromosome lengths vector and the AlignedRead object are not in sync. The aligned reads for this experiment were from a mouse model, not Arabidopsis thaliana, so you would need to reference BSgenome.Mmusculus.UCSC.mm9 when performing these operations:
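
One way to see this kind of mismatch before calling coverage is to compare the two sets of names directly. The following is a minimal base R sketch, not part of the original session: read.chroms and chromlens are made-up stand-ins for levels(chromosome(alns)) and the chromosome lengths vector, using the ".fas" labels and the first two lengths shown in the quoted message.

```r
# Toy stand-ins (illustrative names): chromosome labels as read from the
# export files, and a named vector of chromosome lengths.
read.chroms <- factor(c("chr1.fas", "chr2.fas", "chr1.fas"))
chromlens <- c(chr1 = 30432563L, chr2 = 19705359L)

# Levels present in the reads but absent from the lengths vector.
missing.in.lens <- setdiff(levels(read.chroms), names(chromlens))
missing.in.lens

# TRUE only when every read chromosome has a matching length entry.
in.sync <- all(levels(read.chroms) %in% names(chromlens))
in.sync
```

Any label listed in missing.in.lens would need to be renamed or filtered out before the two objects can be used together.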



> filt1 <- alignDataFilter(expression(filtering=="Y"))
> filt2 <- chromosomeFilter("chr[0-9XYM]+.fa")
> filt <- compose(filt1, filt2)

> alns <- readAligned(extdataDir, pattern, type="SolexaExport",filter=filt)
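
The chromosome filter pattern can be tried out on plain character vectors before it is used to read a large file. This is a base R sketch only: it assumes the filter keeps labels that match the regular expression in the grep sense, and the labels below are hypothetical. It also shows that the unescaped, unanchored pattern is looser than it may look.

```r
# Hypothetical chromosome labels, as they might appear in an export file.
labels <- c("chr11.fa", "chrX.fa", "chr1.fas", "chr1_random.fa", "NM")

# The pattern used in the filter: '.' matches any character and the
# pattern is unanchored, so "chr1.fas" also matches.
loose <- "chr[0-9XYM]+.fa"
# A stricter variant (an assumption, not from the session): literal dot,
# anchored at both ends.
strict <- "^chr[0-9XYM]+\\.fa$"

keep.loose <- grepl(loose, labels)
keep.strict <- grepl(strict, labels)
data.frame(labels, keep.loose, keep.strict)
```

For the tutorial data either pattern should select the same chromosomes; the stricter form only matters when labels with other suffixes are present.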

> alns
class: AlignedRead
length: 195719 reads; width: 35 cycles
chromosome: chr11.fa chr9.fa ... chr8.fa chr4.fa
position: 104853312 3036336 ... 44295163 47191474
strand: - - ... - -
alignQuality: NumericQuality
alignData varLabels: run lane ... filtering contig

> levels(a...@chromosome) <- sub(".fa$", "", levels(chromosome(alns)))
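
The relabelling step can be checked on an ordinary factor first. This is a base R sketch with made-up labels, not the AlignedRead object itself; the pattern escapes the dot and, as an assumption on my part, also accepts the ".fas" suffix seen in the quoted message.

```r
# Toy factor standing in for the chromosome labels (illustrative values).
chroms <- factor(c("chr11.fa", "chr9.fa", "chr8.fa", "chr11.fa"))

# Strip a trailing ".fa" or ".fas" from each level; the reads keep their
# assignment because only the level labels change.
levels(chroms) <- sub("\\.fas?$", "", levels(chroms))
levels(chroms)
as.character(chroms)

# The same pattern applied to ".fas" labels.
stripped.fas <- sub("\\.fas?$", "", c("chr1.fas", "chrC.fas"))
stripped.fas
```

After this step the levels should line up with names such as chr1, chr2, ... in the chromosome lengths vector.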

> library(BSgenome.Mmusculus.UCSC.mm9)
> mm9.chromlens <- seqlengths(Mmusculus)
> head(mm9.chromlens)
    chr1      chr2      chr3      chr4      chr5      chr6
197195432 181748087 159599783 155630120 152537259 149517037

> cov.mm9 <- coverage(alns, width = mm9.chromlens, extend = 126L)
> cov.mm9
SimpleRleList of length 22
$chr1
'integer' Rle of length 197195432 with 27263 runs
 Lengths:  3018534 161 16703 161 68815 161 33063 161 58217 161 ...
 Values :  0 1 0 1 0 1 0 1 0 1 ...

$chr10
'integer' Rle of length 129993255 with 21699 runs
 Lengths:  3019736 161 11311 161 4238 161 10661 161 793 161 ...
 Values :  0 1 0 1 0 1 0 1 0 1 ...

$chr11
'integer' Rle of length 121843856 with 22105 runs
 Lengths:  3000315 6 40 79 9 4 23 6 2 38 ...
 Values :  0 1 2 3 4 5 6 5 4 5 ...

$chr12
'integer' Rle of length 121257530 with 18183 runs
 Lengths:  3002552 161 6903 161 4375 161 5041 161 2491 161 ...
 Values :  0 1 0 1 0 1 0 1 0 1 ...

$chr13
'integer' Rle of length 120284312 with 15907 runs
 Lengths:  3001262 161 5650 161 29080 161 111 40 121 40 ...
 Values :  0 1 0 1 0 1 0 1 2 1 ...

...
<17 more elements>
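
The runs of length 161 in the output above are consistent with 35-cycle reads extended by 126 bases (35 + 126 = 161). A toy base R sketch of the same idea follows; the chromosome length and start positions are made up, strand is ignored for simplicity, and base rle() stands in for the Rle class.

```r
read.width <- 35L
extend <- 126L
chrom.len <- 1000L             # hypothetical toy chromosome length
starts <- c(101L, 400L, 450L)  # hypothetical read start positions

# Add one unit of coverage over each extended read.
cov <- integer(chrom.len)
for (s in starts) {
  idx <- s:min(s + read.width + extend - 1L, chrom.len)
  cov[idx] <- cov[idx] + 1L
}

# Run-length encode the coverage vector: lengths and values, as printed
# for each chromosome above.
runs <- rle(cov)
runs$lengths
runs$values
```

An isolated read gives a single run of 161 ones, while overlapping reads give shorter runs with values above 1, as in the chr11 element.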
> sessionInfo()
R version 2.11.0 Under development (unstable) (2010-01-02 r50884)
i386-apple-darwin9.8.0

locale:
[1] C/C/C/C/C/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] BSgenome.Mmusculus.UCSC.mm9_1.3.16
[2] BSgenome.Athaliana.TAIR.04232008_1.3.16
[3] ShortReadTutorial_0.0.1
[4] ShortRead_1.5.10
[5] lattice_0.17-26
[6] BSgenome_1.15.3
[7] Biostrings_2.15.11
[8] IRanges_1.5.23

loaded via a namespace (and not attached):
[1] Biobase_2.7.3 grid_2.11.0   hwriter_1.1   tools_2.11.0


Cheers,
Patrick



[email protected] wrote:

Dear bioc-sig-sequencing,

I am trying to analyze Eland aligned files for differential expression, using 
the 'A ChIP-Seq Data Analysis' handout from an 11/19/09 session at the 'High
throughput sequence analysis tools and approaches with Bioconductor' workshop 
in Seattle.

I generated an error message in the following output.  Can you comment?

...

alns_8 <- readAligned(cdataDir, pattern, "SolexaExport")
alns_8

class: AlignedRead
length: 1380439 reads; width: 35 cycles
chromosome: chr1.fas chr1.fas ... chr1.fas chr1.fas
position: 7568294 167488 ... 4687256 5376960
strand: + + ... + +
alignQuality: NumericQuality
alignData varLabels: run lane ... filtering contig

head(sread(alns_8))

  A DNAStringSet instance of length 6
    width seq
[1]    35 AGCTATGATCAAGAGAACCTTTCACGATCANNNCN
[2]    35 CGGACGACGGGTAGTTTCGGGCTGTACCAANNNAN
[3]    35 AGCTCAGCGATCTGAGCCACTTGCTCTTTGNNNTN
[4]    35 GGGCCATAGGCCCGTTAAAATATTTTTCTCTNNCT
[5]    35 ATTGTCCATTGACAAATGAAGATATTGGGATNNTT
[6]    35 ACCCCTCCACCAGTATGTTGGCGAAAATCTCNNCC

table(strand(alns_8), useNA="ifany")


     -      +      *
689912 690527      0

...

library(BSgenome.Athaliana.TAIR.04232008)
arab.chromlens <- seqlengths(Athaliana)
head(arab.chromlens)

    chr1     chr2     chr3     chr4     chr5     chrC
30432563 19705359 23470805 18585042 26992728   154478

cov.arab8 <- coverage(alns_8, width = arab.chromlens, extend = 126L)

Error: UserArgumentMismatch
  'names(width)' (or 'names(end)') mismatch with 'levels(chromosome(x))'
  see ?"AlignedRead-class"

sessionInfo()

R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] BSgenome.Athaliana.TAIR.04232008_1.3.16
[2] chipseq_0.2.0
[3] ShortRead_1.4.0
[4] lattice_0.17-26
[5] BSgenome_1.14.0
[6] Biostrings_2.14.1
[7] IRanges_1.4.2

loaded via a namespace (and not attached):
[1] Biobase_2.6.0 grid_2.10.1   hwriter_1.1


Thanks,
P. Terry
[email protected]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

