Hi Martin, As per your explanations I should not get out of bound error. TSS are from ensembl from chromosome 10 and histone modification data is also from human and I am searching only +/- 50 even +/-10000 should not give me out of bound error.
I don't know whether there is some concept of range for data types in R like we have for data types in C. I was also wondering if converting the 'integer' Rle of length* to 'numeric' Rle of length would be useful because may be integer vector gets out of range. But again I don't know how to do the conversion from integer to numeric. Thank you, Best regards, Kirti Prakash On Wed, Aug 18, 2010 at 5:05 PM, kirti prakash <[email protected]> wrote: > Hi Martin, > > Thanks again. I really like your way of explaining... even new comers > like me are able to learn how data compression works. Its really a > nice approach to break the problem in parts in order to understand it. > > I have some human histone modification data and I am trying to plot > the tag densities. > > This is the complete code(from the above link) that I am trying to > implement... > > aln.H3K4me1 <- readAligned("dirpath", type="Bowtie") > cov.H3K4me1 = coverage(aln.H3K4me1) > > deltaConvolve <- function( coverage, winCentres, revStrand, left, right) > { > result <- rep( 0, right - left + 1 ) > for( i in seq(along=winCentres) ) { > v <- as.vector(seqselect( coverage, winCentres[i] - left, > winCentres[i] + right )) > if( revStrand[i] ) > v <- rev( v ) > result <- result + v > } > result > } > > library("biomaRt") > mart <- useMart( "ensembl", dataset = "hsapiens_gene_ensembl" ) > martReply <- getBM( attributes=c( "transcript_start", > "transcript_end", "strand" ), filters="chromosome_name", values="10", > mart=mart ) > > tss <- ifelse( martReply$strand == 1, martReply$transcript_start, > martReply$transcript_end ) > > win = c(-50, 50) > H3K4me3 <- deltaConvolve( cov.H3K4me1$chr10, tss, martReply$strand == > -1, left=win[1], right=win[2] ) > > The code runs for a while but after couple of minutes it gives the > above errors. so possibly some but not all values are out of bound. > As clear from above I get TSS from ensembl. There are some 6000 TSS > for chr10 and window size I have reduced to 101 so that it doesn't > gets out of bound but still I get this error. > > Do I need to have some try-catch block? > > Thank you, > > Best regards, > > Kirti Prakash > > > On Wed, Aug 18, 2010 at 4:01 PM, Martin Morgan <[email protected]> wrote: >> On 08/17/2010 09:41 PM, kirti prakash wrote: >>> Hi Martin, >>> >>> Thanks a lot for your help. Bioconductor is a great place as one gets >>> the help from the best and its completely free. Its seems all one >>> needs to do good research in bioinformatics is a computer, some data >>> and Bioconductor. >>> >>> Its the same tutorial... the link you mentioned above . I cannot >>> attach the tutorials due to the limit of mail size. >> >> I meant that if the document is available in a public place, then >> provide a URL to it so that those trying to help can better understand >> what your problem is. >> >>> >>> I did exactly as you told and most the things work fine except... >>> >>> Later in the tutorial they use deltaConvolve function... >>> >>> H3K4me1 <- deltaConvolve( cov.H3K4me1.chr10, tss, martReply$strand == >>> -1, left=win[1], right=win[2] ) >>> Error in .bracket.Index(i, lx, asRanges = TRUE) : >>> range index out of bounds >>> Error in as.vector(seqselect(coverage, winCentres[i] + left, winCentres[i] >>> + : >>> error in evaluating the argument 'x' in selecting a method for >>> function 'as.vector' >>> >>> Also you mention endoapply above... it also gave me same error... >>> >>> endoapply(cov.H3K4me1, seqselect, 123456-50, 123456+50) >>> Error in .bracket.Index(i, lx, asRanges = TRUE) : >>> range index out of bounds >> >> One approach to solving problems is to try to simplify them. So let's >> try to reproduce the error but with a simpler example. You're selecting >> a subset of an Rle object >> >>> seqselect(Rle(1:10, 10:1), 10, 20) >> 'integer' Rle of length 11 with 3 runs >> Lengths: 1 9 1 >> Values : 1 2 3 >> >> The error says something about 'index out of bounds', which suggests >> >>> seqselect(Rle(1:10, 10:1), 10, 2000) >> Error in .bracket.Index(i, lx, asRanges = TRUE) : >> range index out of bounds >>> seqselect(Rle(1:10, 10:1), -10, 20) >> Error in .bracket.Index(i, lx, asRanges = TRUE) : >> range index out of bounds >> >> Any hints to what your original problem is, and how to solve it? >> >> Martin >> >>> >>> >>> I found that both of these are exception handling errors but I again I >>> don't know to fix them. It would be great if you can again help on >>> this. >> >>> >>> Thanks again, >>> >>> Best Regards, >>> >>> Kirti Prakash >>> >>> On Wed, Aug 18, 2010 at 7:33 AM, kirti prakash >>> <[email protected]> wrote: >>>> Hi Martin, >>>> >>>> Thanks a lot for your help. Bioconductor is a great place as one gets >>>> the help from the best and its completely free. Its seems all one >>>> needs to do good research in bioinformatics is a computer, some data >>>> and Bioconductor. >>>> >>>> Its the same tutorial the link you mentioned above . In future I will >>>> attach the pdfs of the tutorials also. >>>> >>>> I did exactly as you told and most the things work fine except... >>>> >>>> Later in the tutorial they use deltaConvolve function... >>>> >>>> H3K4me1 <- deltaConvolve( cov.H3K4me1.chr10, tss, martReply$strand == >>>> -1, left=win[1], right=win[2] ) >>>> Error in .bracket.Index(i, lx, asRanges = TRUE) : >>>> range index out of bounds >>>> Error in as.vector(seqselect(coverage, winCentres[i] + left, winCentres[i] >>>> + : >>>> error in evaluating the argument 'x' in selecting a method for >>>> function 'as.vector' >>>> >>>> Also you mention endoapply above... it also gave me same error... >>>> >>>> endoapply(cov.H3K4me1, seqselect, 123456-50, 123456+50) >>>> Error in .bracket.Index(i, lx, asRanges = TRUE) : >>>> range index out of bounds >>>> >>>> >>>> I found that both of these are exception handling errors but I again I >>>> don't know to fix them. It would be great if you can again help on >>>> this. >>>> >>>> Thanks again, >>>> >>>> Best Regards, >>>> >>>> Kirti Prakash >>>> >>>> >>>> >>>> >>>> On Tue, Aug 17, 2010 at 7:47 PM, Martin Morgan <[email protected]> wrote: >>>>> kirti prakash <[email protected]> writes: >>>>> >>>>>> Hi, >>>>>> >>>>>> I was reading the material "EMBL course on Short Read analysis with >>>>>> Bioconductor: An exercise with coverage vectors" by Simon Anders. >>>>> >>>>> It took me a second to find >>>>> >>>>> http://bioconductor.org/help/course-materials/2009/EMBLJune09/Practicals/TSS/ >>>>> >>>>> is that the tutorial you mean? >>>>> >>>>>> >>>>>> I tried this... >>>>>> aln <- readAligned("dirPath", pattern="sequence.map", type="Bowtie") >>>>>> cov = coverage(aln) >>>>>> cov >>>>>> [[1]] >>>>>> SimpleRleList of length 25 >>>>>> $chr1 >>>>>> 'integer' Rle of length 247188620 with 840106 runs >>>>>> Lengths: 463 36 6823 36 550058 ... 713 36 2034 >>>>>> 36 >>>>>> Values : 0 1 0 1 0 ... 0 1 0 >>>>>> 2 >>>>>> >>>>>> $chr10 >>>>>> 'integer' Rle of length 135373320 with 446681 runs >>>>>> Lengths: 88078 36 3880 36 12451 ... 20 50 36 22054 36 >>>>>> Values : 0 1 0 1 0 ... 1 0 1 0 1 >>>>>> . >>>>>> . >>>>>> . >>>>>> >>>>>>> cvg <- cov$chr10 >>>>>>> as.vector(subseq(cvg, 123456+50, 123456-50)) >>>>> >>>>> I think this should be >>>>> >>>>> seqselect(cvg$chr10, 123456+50, 123456-50) >>>>> >>>>> or, to subset all chromosomes, >>>>> >>>>> endoapply(cvg, seqselect, 123456+50, 123456-50) >>>>> >>>>> (though it doesn't seem like you'd usually want to select consistent >>>>> coordinates across chromosomes). >>>>> >>>>> To get here, I looked up the help page for Rle-class >>>>> >>>>> > class?Rle >>>>> >>>>> This is with >>>>> >>>>>> sessionInfo() >>>>> R version 2.11.1 Patched (2010-06-03 r52215) >>>>> x86_64-unknown-linux-gnu >>>>> >>>>> locale: >>>>> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C >>>>> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8 >>>>> [5] LC_MONETARY=C LC_MESSAGES=en_US.UTF-8 >>>>> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C >>>>> [9] LC_ADDRESS=C LC_TELEPHONE=C >>>>> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C >>>>> >>>>> attached base packages: >>>>> [1] stats graphics grDevices utils datasets methods >>>>> base >>>>> >>>>> other attached packages: >>>>> [1] ShortRead_1.6.2 Rsamtools_1.0.5 lattice_0.18-8 >>>>> [4] Biostrings_2.16.9 GenomicRanges_1.0.5 IRanges_1.6.11 >>>>> >>>>> loaded via a namespace (and not attached): >>>>> [1] Biobase_2.8.0 grid_2.11.1 hwriter_1.2 tools_2.11.1 >>>>> >>>>> Martin >>>>> >>>>> >>>>>> Error in function (classes, fdef, mtable) : >>>>>> unable to find an inherited method for function "subseq", for signature >>>>>> "list" >>>>>> Error in as.vector(subseq(cvg, 123456 + 50, 123456 - 50)) : >>>>>> error in evaluating the argument 'x' in selecting a method for >>>>>> function 'as.vector' >>>>>> >>>>>> I guess * 'integer' Rle of length* should be 'numeric' Rle of length >>>>>> as per the booklet ... but I don't know how to fix it. >>>>>> >>>>>> I know I am making some stupid mistake. It would be great if anyone >>>>>> can provide some help on this. >>>>>> >>>>>> Thank you, >>>>>> >>>>>> Best regards, >>>>>> >>>>>> Kirti Prakash >>>>>> >>>>>> _______________________________________________ >>>>>> Bioc-sig-sequencing mailing list >>>>>> [email protected] >>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing >>>>> >>>>> -- >>>>> Martin Morgan >>>>> Computational Biology / Fred Hutchinson Cancer Research Center >>>>> 1100 Fairview Ave. N. >>>>> PO Box 19024 Seattle, WA 98109 >>>>> >>>>> Location: Arnold Building M1 B861 >>>>> Phone: (206) 667-2793 >>>>> >>>> >> >> >> -- >> Martin Morgan >> Computational Biology / Fred Hutchinson Cancer Research Center >> 1100 Fairview Ave. N. >> PO Box 19024 Seattle, WA 98109 >> >> Location: Arnold Building M1 B861 >> Phone: (206) 667-2793 >> > _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
