Dear bioc-sig-sequencing,

I am trying to analyze Eland-aligned files (Arabidopsis) for differential 
expression, using as a guide the 'A ChIP-Seq Data Analysis' handout from an 
11/19/09 session at the 'High throughput sequence analysis tools and approaches 
with Bioconductor' workshop in Seattle.

The session below ends with an error message from coverage().  Can you comment?

Notes:

i. This is my second email on this problem.  My first email omitted some 
information, which I think may have affected the responses from people 
monitoring the mailing list.

ii. One apparent misunderstanding was that I used the mouse 'reads' data 
supplied during the lab.  Instead, I used what I'm told is Eland-aligned 
Arabidopsis data (an s_8_export_chr1.txt file derived from s_8_export.txt using 
grep).

iii. The error message suggests a mismatch between:
> names(arab.chromlens)
[1] "chr1" "chr2" "chr3" "chr4" "chr5" "chrC" "chrM"
and
> levels(chromosome(alns_8))
[1] "chr1.fas"

I don't know what to do about this 'mismatch'.  Perhaps I need to arrange 
things so that:
> names(arab.chromlens)
gives only:
"chr1"?

iv. Using 'available.genomes()', I see there are two BSgenome data packages 
for Arabidopsis.  Could my arbitrary choice be a problem?  Would the choice 
have to match the Arabidopsis genome that Eland used during alignment?
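
To make the mismatch in note iii concrete, here is a minimal base-R sketch 
(no ShortRead objects involved) using the strings reported in the session 
below.  It suggests that the ".fa$" pattern never matches a name ending in 
".fas", which would explain why the level stays "chr1.fas".  The alternative 
pattern "\\.fas?$" and the variable names are my own assumptions for 
illustration, and I have not verified whether coverage() needs names(width) 
to match the levels exactly or merely to contain them.

```r
# Chromosome level as reported by levels(chromosome(alns_8)) in the session
lv <- "chr1.fas"

# Pattern used in the session: any one character followed by "fa" at the
# very end of the string. "chr1.fas" ends in "s", so nothing is replaced.
unchanged <- sub(".fa$", "", lv)

# Hypothetical alternative: a literal dot, then "fa" with an optional "s"
stripped <- sub("\\.fas?$", "", lv)

# Chromosome lengths copied from the head(arab.chromlens) output in the
# session (chrM is left out because its length is not shown there)
chromlens <- c(chr1 = 30432563, chr2 = 19705359, chr3 = 23470805,
               chr4 = 18585042, chr5 = 26992728, chrC = 154478)

# Keep only the lengths whose names occur among the cleaned-up levels
chromlens_used <- chromlens[stripped]

print(unchanged)
print(stripped)
print(chromlens_used)
```

If the levels were cleaned this way, a 'width' vector restricted as in 
chromlens_used would carry the single name "chr1", which is the arrangement 
I ask about in note iii.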


...

> cerudataDir <- "/home/mterry/data09/wang_892spr09/rob_hw6/ceru_data"
> cerudataDir
[1] "/home/mterry/data09/wang_892spr09/rob_hw6/ceru_data"

> pattern <- "s_8_export_chr1.txt"
> list.files(cerudataDir, pattern)
[1] "s_8_export_chr1.txt"
> filt1 <- alignDataFilter(expression(filtering=="Y"))
> filt2 <- chromosomeFilter("chr[0-9XYM]+.fa")
> filt <- compose(filt1, filt2)
> alns_8 <- readAligned(cerudataDir, pattern, type="SolexaExport",
+ filter=filt)
> alns_8
class: AlignedRead
length: 1022848 reads; width: 35 cycles
chromosome: chr1.fas chr1.fas ... chr1.fas chr1.fas
position: 7568294 167488 ... 4687256 5376960
strand: + + ... + +
alignQuality: NumericQuality
alignData varLabels: run lane ... filtering contig

> levels(aln...@chromosome) <- sub(".fa$", "", levels(chromosome(alns_8)))

> head(levels(aln...@chromosome))
[1] "chr1.fas"
> levels(chromosome(alns_8))
[1] "chr1.fas"
> library(BSgenome.Athaliana.TAIR.04232008)
> arab.chromlens <- seqlengths(Athaliana)
> head(arab.chromlens)
    chr1     chr2     chr3     chr4     chr5     chrC
30432563 19705359 23470805 18585042 26992728   154478
> names(arab.chromlens)
[1] "chr1" "chr2" "chr3" "chr4" "chr5" "chrC" "chrM"
> cov.arab8 <- coverage(alns_8, width = arab.chromlens, extend = 126L)
Error: UserArgumentMismatch
  'names(width)' (or 'names(end)') mismatch with 'levels(chromosome(x))'
  see ?"AlignedRead-class"

> sessionInfo()
R version 2.10.1 (2009-12-14)
x86_64-pc-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] BSgenome.Athaliana.TAIR.04232008_1.3.16
[2] chipseq_0.2.1
[3] ShortRead_1.4.0
[4] lattice_0.17-26
[5] BSgenome_1.14.2
[6] Biostrings_2.14.10
[7] IRanges_1.4.9

loaded via a namespace (and not attached):
[1] Biobase_2.6.1 grid_2.10.1   hwriter_1.1
>


Thanks,
P. Terry
