Re: [Bioc-sig-seq] ChIPpeakAnno get sequences from ranged data

Zhu, Julie Wed, 24 Mar 2010 09:55:42 -0700

Hi Eric,

Please try the following code. You had the gene names in space which needs to 
contain chromosome names.


peaks = RangedData(IRanges(start=SeqTest$start, end=SeqTest$end, 
names=SeqTest$peakID), space=as.character(SeqTest$Chrom))
library(BSgenome.Dmelanogaster.UCSC.dm2)
peaksWithSequences = getAllPeakSequence(peaks, upstream=60, downstream=40, 
genome=Dmelanogaster)

Best regards,

Julie


On 3/21/10 8:20 PM, "Julie Zhu" <[email protected]> wrote:

Hi Eric,

Is CAGE the cap analysis of gene expression? Thanks!

I think the error has to do with the chromosome naming since there are only 
chromosome X, 2L, 2R, 3L, 3R and 4 are available in drosophila genome.

I am revising the ChIPpeakAnno paper which is due on Tuesday. Could I get back 
to you after that? Thank so much!

Meanwhile, if others could help out, that would be very much appreciated!

Best regards,

Julie



On 3/21/10 12:52 PM, "Eric Bremer" <[email protected]> wrote:

HI Julie,
I am trying to get sequence data from the CAGE peaks identified to be within 50 
bases of a TSS.  With your previous help I was able to nicely idenitify these 
using ChIPeakAnno.  I seem to be having trouble getting the gene identifier 
into ranged data following the example in the Vignette.  I have pasted my 
approach below.  Thanks in advance for any suggestions.

> peaks = RangedData(IRanges(start = c(100, 500), end = c(300,
+ 600), names = c("peak1", "peak2")), space = c("NC_008253",
+ "NC_010468"))
> peaks
RangedData with 2 rows and 0 value columns across 2 spaces
            space     ranges |
      <character>  <IRanges> |
peak1   NC_008253 [100, 300] |
peak2   NC_010468 [500, 600] |
> SeqTest = read.table(file="2Rp_50base.txt", header=TRUE)
> head(SeqTest)
  Chrom    start      end length peakID strand FlyBaseGene
1    2R 16831502 16831539     38   6010      1 FBgn0000044
2    2R 20702347 20702374     28   7922      1 FBgn0000157
3    2R  8146910  8146941     32   2598      1 FBgn0000253
4    2R 18952292 18952318     27   7048      1 FBgn0000487
5    2R 14120268 14120294     27   5159      1 FBgn0000658
6    2R 19830055 19830090     36   7367      1 FBgn0001123
> peaks = RangedData(IRanges(start=c(SeqTest$start), end=c(SeqTest$end), 
> names=c(SeqTest$peakID)), space=c(SeqTest$FlyBaseGene))
> head(peaks)
RangedData with 6 rows and 0 value columns across 247 spaces
           space               ranges |
     <character>            <IRanges> |
6010           1 [16831502, 16831539] |
7922           2 [20702347, 20702374] |
2598           3 [ 8146910,  8146941] |
7048           4 [18952292, 18952318] |
5159           5 [14120268, 14120294] |
7367           6 [19830055, 19830090] |
> peaks$space[1:6]
[1] "1" "2" "3" "4" "5" "6"

> library(BSgenome.Dmelanogaster.UCSC.dm2)
> peaksWithSequences = getAllPeakSequence(peaks, upstream=60, downstream=40, 
> genome=Dmelanogaster)
Error in .getOneSeq(x, name) : sequence chr1 not found
Error in subseq(.getOneSeq(x, name), start = start, end = end, width = width) :
  error in evaluating the argument 'x' in selecting a method for function 
'subseq'


        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing



        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Re: [Bioc-sig-seq] ChIPpeakAnno get sequences from ranged data

Reply via email to