Re: [Bioc-sig-seq] ChIPpeakAnno get sequences from ranged data

Eric Bremer Thu, 25 Mar 2010 10:02:23 -0700

Hi Julie,
Thanks so much for the help!!  Below is the result I get when using
chromosome names.  I still get a numeric value that does not correspond to
the chromosome from the original table (2R).  Another question is how do I
take into account "strand" when retreiving the sequences?
Thanks again,
Eric


>  peaks = RangedData(IRanges(start=c(SeqTest$start), end=c(SeqTest$end),
names=c(SeqTest$peakID)), space=c(SeqTest$Chrom))
> head(peaks)
RangedData with 6 rows and 0 value columns across 1 space
           space               ranges |
     <character>            <IRanges> |
6010           1 [16831502, 16831539] |
7922           1 [20702347, 20702374] |
2598           1 [ 8146910,  8146941] |
7048           1 [18952292, 18952318] |
5159           1 [14120268, 14120294] |
7367           1 [19830055, 19830090] |


On Wed, Mar 24, 2010 at 11:56 AM, Zhu, Julie <[email protected]> wrote:

>  Hi Eric,
>
> Please try the following code. You had the gene names in space which needs
> to contain chromosome names.
>
> peaks = RangedData(IRanges(start=SeqTest$start, end=SeqTest$end,
> names=SeqTest$peakID), space=as.character(SeqTest$Chrom))
>
> library(BSgenome.Dmelanogaster.UCSC.dm2)
> peaksWithSequences = getAllPeakSequence(peaks, upstream=60, downstream=40,
> genome=Dmelanogaster)
>
> Best regards,
>
> Julie
>
>
>
> On 3/21/10 8:20 PM, "Julie Zhu" <[email protected]> wrote:
>
> Hi Eric,
>
> Is CAGE the cap analysis of gene expression? Thanks!
>
> I think the error has to do with the chromosome naming since there are only
> chromosome X, 2L, 2R, 3L, 3R and 4 are available in drosophila genome.
>
> I am revising the ChIPpeakAnno paper which is due on Tuesday. Could I get
> back to you after that? Thank so much!
>
> Meanwhile, if others could help out, that would be very much appreciated!
>
> Best regards,
>
> Julie
>
>
>
> On 3/21/10 12:52 PM, "Eric Bremer" <[email protected]> wrote:
>
> HI Julie,
> I am trying to get sequence data from the CAGE peaks identified to be
> within 50 bases of a TSS.  With your previous help I was able to nicely
> idenitify these using ChIPeakAnno.  I seem to be having trouble getting the
> gene identifier into ranged data following the example in the Vignette.  I
> have pasted my approach below.  Thanks in advance for any suggestions.
>
> > peaks = RangedData(IRanges(start = c(100, 500), end = c(300,
> + 600), names = c("peak1", "peak2")), space = c("NC_008253",
> + "NC_010468"))
> > peaks
> RangedData with 2 rows and 0 value columns across 2 spaces
>             space     ranges |
>       <character>  <IRanges> |
> peak1   NC_008253 [100, 300] |
> peak2   NC_010468 [500, 600] |
> > SeqTest = read.table(file="2Rp_50base.txt", header=TRUE)
> > head(SeqTest)
>   Chrom    start      end length peakID strand FlyBaseGene
> 1    2R 16831502 16831539     38   6010      1 FBgn0000044
> 2    2R 20702347 20702374     28   7922      1 FBgn0000157
> 3    2R  8146910  8146941     32   2598      1 FBgn0000253
> 4    2R 18952292 18952318     27   7048      1 FBgn0000487
> 5    2R 14120268 14120294     27   5159      1 FBgn0000658
> 6    2R 19830055 19830090     36   7367      1 FBgn0001123
> > peaks = RangedData(IRanges(start=c(SeqTest$start), end=c(SeqTest$end),
> names=c(SeqTest$peakID)), space=c(SeqTest$FlyBaseGene))
> > head(peaks)
> RangedData with 6 rows and 0 value columns across 247 spaces
>            space               ranges |
>      <character>            <IRanges> |
> 6010           1 [16831502, 16831539] |
> 7922           2 [20702347, 20702374] |
> 2598           3 [ 8146910,  8146941] |
> 7048           4 [18952292, 18952318] |
> 5159           5 [14120268, 14120294] |
> 7367           6 [19830055, 19830090] |
> > peaks$space[1:6]
> [1] "1" "2" "3" "4" "5" "6"
>
> > library(BSgenome.Dmelanogaster.UCSC.dm2)
> > peaksWithSequences = getAllPeakSequence(peaks, upstream=60,
> downstream=40, genome=Dmelanogaster)
> Error in .getOneSeq(x, name) : sequence chr1 not found
> Error in subseq(.getOneSeq(x, name), start = start, end = end, width =
> width) :
>   error in evaluating the argument 'x' in selecting a method for function
> 'subseq'
>
>
>         [[alternative HTML version deleted]]
>
> _______________________________________________
> Bioc-sig-sequencing mailing list
> [email protected]
> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>
>
>


-- 
Eric Bremer, PhD
Chief Scientific Officer
Precision Biomarker Resources, Inc.
820 Davis St.,  Suite 216
Evanston, Illinois  60201

Office: 847.866.0406
Mobile: 708.267.2052
Skype: egbremer

www.precisionbiomarker.com

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing

Re: [Bioc-sig-seq] ChIPpeakAnno get sequences from ranged data

Reply via email to