Dear bioc-sig-sequencing,
I would like to annotate ChIP-seq peaks for the Arabidopsis genome. While
working through the GenomicFeatures vignette dated 03/27/10, I tried to use
'findOverlaps' to learn which ChIP-seq peaks overlap with Arabidopsis
transcripts. However, I get the error shown at the end of the session below.
The peaks use 'chr1' in the 'seqnames' column, whereas the transcripts use '1'.
Perhaps I need to use the 'sub' function to change the values in the
'seqnames' column of either 'GR_txdb' or 'r_gr_ChSeqPks'? Could someone
recommend what I should try?
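To make the 'sub' idea concrete, here is a minimal sketch of what I have in mind. It uses plain character vectors as stand-ins for the two 'seqnames' columns; the vector names are hypothetical. I have not verified how best to assign the result back to a GRanges object in GenomicRanges 0.1.0, so that step is left out.

```r
# Hypothetical stand-ins for the seqnames of the two objects in the session:
# the peaks use a "chr" prefix, the BioMart transcripts do not.
peak_seqnames <- c("chr1", "chr1", "chr1")
tx_seqnames   <- c("1", "2", "3", "4", "5", "Pt", "Mt")

# Strip a leading "chr" so the peak names follow the BioMart convention.
# The "^" anchor restricts the match to the start of each name.
peak_fixed <- sub("^chr", "", peak_seqnames)

# Sanity check: every renamed peak sequence should now exist among the
# transcript sequence names.
all(peak_fixed %in% tx_seqnames)
```

The opposite direction (adding "chr" to the transcript names) would also make the two conventions agree, but it would rename 'Pt' and 'Mt' as well, so stripping the prefix from the peaks looks like the smaller change.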
> mart4_at_eg_gene
TranscriptDb object:
| Db type: TranscriptDb
| Data source: BioMart
| BioMart database: plant_mart_4
| BioMart dataset: athaliana_eg_gene
| BioMart dataset description: Arabidopsis thaliana genes (TAIR9)
| BioMart dataset version: TAIR9
| Full dataset: yes
| transcript_nrow: 39640
| exon_nrow: 176581
| cds_nrow: 0
| Db created by: GenomicFeatures package from Bioconductor
| Creation time: 2010-04-01 11:11:41 -0500 (Thu, 01 Apr 2010)
| GenomicFeatures version at creation time: 0.5.0
| RSQLite version at creation time: 0.8-4
> rd0_chr1_s_8_trt_vs_INPctl[["strand"]] = "*"
> gr_ChSeqPks <- as(rd0_chr1_s_8_trt_vs_INPctl, "GRanges")
> gr_ChSeqPks
GRanges with 57 ranges and 2 elementMetadata values
seqnames ranges strand | ARAB8 ARAB7INPCTL
<Rle> <IRanges> <Rle> | <integer> <integer>
[1] chr1 [ 617092, 617094] * | 24 0
[2] chr1 [1808262, 1808262] * | 8 0
[3] chr1 [3889445, 3889452] * | 64 0
[4] chr1 [4404410, 4404410] * | 8 0
[5] chr1 [7081127, 7081127] * | 8 0
[6] chr1 [7128574, 7128581] * | 64 0
...
> GR_txdb <- transcripts(mart4_at_eg_gene)
> GR_txdb
GRanges with 39640 ranges and 2 elementMetadata values
seqnames ranges strand | tx_id tx_name
<Rle> <IRanges> <Rle> | <integer> <character>
[1] 1 [ 3631, 5899] + | 5480 AT1G01010.1-TAIR
[2] 1 [23146, 31227] + | 3216 AT1G01040.1-TAIR
[3] 1 [28500, 28706] + | 8461 AT1G01046.1-TAIR
[4] 1 [44677, 44787] + | 3566 AT1G01073.1-TAIR
[5] 1 [52239, 54692] + | 7451 AT1G01110.2-TAIR
[6] 1 [52869, 54685] + | 7450 AT1G01110.1-TAIR
...
seqlengths
1 2 3 4 5 Pt Mt
NA NA NA NA NA NA NA
> r_gr_ChSeqPks <- reduce(gr_ChSeqPks)
> OL <- findOverlaps(GR_txdb, r_gr_ChSeqPks)
Error in .local(query, subject, maxgap, minoverlap, type, select, ...) :
'query' and 'subject' do not use a similiar naming convention for seqnames
> sessionInfo()
R version 2.12.0 Under development (unstable) (2010-03-30 r51506)
x86_64-unknown-linux-gnu
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] biomaRt_2.3.5 GenomicFeatures_0.5.0 GenomicRanges_0.1.0
[4] IRanges_1.5.73
loaded via a namespace (and not attached):
[1] Biobase_2.7.5 Biostrings_2.15.26 BSgenome_1.15.20 DBI_0.2-5
[5] RCurl_1.3-1 RSQLite_0.8-4 rtracklayer_1.7.11 tools_2.12.0
[9] XML_2.8-1
>
Thanks,
P. Terry
[email protected]