Hi Martin Thanks a lot indeed for answering my questions. I was able to infact find ways to make it work and Patrick helped me with visualization part.
Sometimes no replies from the mailing list forces me to find a solution and eventually learn more which is good :)... _Abhi -----Original Message----- From: Martin Morgan [mailto:[email protected]] Sent: Sun 12/6/2009 10:14 PM To: Pratap, Abhishek Cc: Patrick Aboyoun; [email protected] Subject: Re: [Bioc-sig-seq] Plotting Coverage for inhouse genome Hi Abhi -- Pratap, Abhishek wrote: > Hi Patrick > > Trying few more things I was atleast able to deal with selecting only > one Chr with filtered reads. > > ned_chr <- chromosome(aln) == "K-10_OM_revised.fasta" > filtered <- alignData(aln)[["filtering"]] == "Y" > aln_chr_1_filt <- aln[ned_chr & filtered] > > levels(chromosome(aln_chr_1_filt) ) > [1] "K-10_OM_revised.fasta" > > > But the problem with coverage functions still remains the same. > > > cov_rLe <- coverage(x=aln_chr_1_filt, start=1, end=4832581, shift=0L, > width=NULL,weight=1L,coords="leftmost",extend=0L); > > Error: UserArgumentMismatch > 'names(shift)' (or 'names(start)') mismatch with > 'levels(chromosome(x))' > see ?"AlignedRead-class" > In addition: Warning message: > UserArgumentMismatch > use 'shift'/'width' instead of 'start'/'end' I wasn't sure whether this was ever resolved for you. The first thing is to choose between EITHER shift/width OR start/end, not both. Better to use shift/width, as it is the way this will be done in the future. Second, shift/width needs to be a named vector, with names matching those in 'aln'. This is illustrated in the example on ?"coverage,AlignedRead-method", where we see cvg <- coverage(aln, shift=c(ChrA=10L)) so for your example above you might try coverage(aln_chr_1_filt, width=c("K-10_OM_revised.fasta"=4832581L)) The 'L' is a way of making sure 4832581 is an integer, instead of a real number. No need to specify shift, weight, coords or extend, as these are all the default values. The quotation marks around K-10_OM_revised.fasta are required because "K-10_OM_revised.fasta" without quotes would not be a valid name (the - would be parsed as a minus sign, and R would complain about a syntax error). Hope that helps. Martin > > > -Abhi > > > > > > -----Original Message----- > From: [email protected] > [mailto:[email protected]] On Behalf Of Pratap, > Abhishek > Sent: Wednesday, December 02, 2009 7:53 PM > To: Patrick Aboyoun > Cc: [email protected] > Subject: Re: [Bioc-sig-seq] Plotting Coverage for inhouse genome > > Hi Patrick > > Thanks for letting me know how to search doc efficiently. Picking up on > the coverage method, I tried the following. Also I would like to use > only one chr, in this case (K-10_OM_revised.fasta *). How can I change > the aln object to just one one specific chromosome. I tired few of my > beginner tricks (aln$ K-10_OM_revised.fasta, > aln[[K-10_OM_revised.fasta]] but dint work. I guess I am not using right > method on aln object. > > cov_rLe <- coverage(x=aln, start=1, end=4832581, shift=0L, > width=NULL,weight=1L,coords="leftmost",extend=0L); > > Error: UserArgumentMismatch > 'names(shift)' (or 'names(start)') mismatch with > 'levels(chromosome(x))' > see ?"AlignedRead-class" > In addition: Warning message: > UserArgumentMismatch > use 'shift'/'width' instead of 'start'/'end' > > > Here 4832581 is the length of the genome. > > > * > Levels(chromosome(aln)) ======== Just to clear the confusion about the > chromosome name. > 175] "7:0:1" "7:1:0" "7:7:1" > > [178] "8:0:0" "K-10_OM_revised.fasta" "NM" > > [181] "QC" > > Cheers! > -Abhi > > > > -----Original Message----- > From: Patrick Aboyoun [mailto:[email protected]] > Sent: Wednesday, December 02, 2009 6:19 PM > To: Pratap, Abhishek > Cc: Michael Lawrence; [email protected] > Subject: Re: [Bioc-sig-seq] Plotting Coverage for inhouse genome > > Abhi, > To answer your first question, there is a coverage method for > AlignedRead objects in the ShortRead package. S4 methods can be hard to > find in R's help pages, particularly if the methods for a generic are > stored across multiple packages. When I'm unsure what is around, I use > the showMethods function to get a list of available methods for an S4 > generic: > > > suppressMessages(library(ShortRead)) > > showMethods("coverage") > Function: coverage (package IRanges) > x="AlignedRead" > x="AlignedXStringSet0" > x="IRanges" > x="MaskCollection" > x="MaskedXString" > x="MIndex" > x="numeric" > x="PairwiseAlignedFixedSubject" > x="PairwiseAlignedFixedSubjectSummary" > x="RangedData" > x="RangesList" > x="Views" > > From there I use a help command of the form > help("<<generic>>,<<signature>>-method") > > > help("coverage,AlignedRead-method") > > In that man page you will find information on how to use coverage for an > > AlignedRead object. > > Once you have the coverage vectors (in the form of an RleList), you can > use the approach Michael outlined for plotting. > > > Patrick > > > > Michael Lawrence wrote: >> On Wed, Dec 2, 2009 at 2:31 PM, Pratap, Abhishek >> <[email protected]>wrote: >> >> >>> Hi All >>> >>> After taking an exciting BioC course at FredHutch recently, I am > trying >>> to dive into BioC world to make more sense of NGS data. I would also >>> like to thank the entire BioC team present there. You guys were > simply >>> great. I appreciate you all patiently answering tons of questions and >>> taught some of the cool things we can do with BioC packages. >>> >>> For my current problem I would like to generate coverage plots for a >>> small bacterial genome. I have the reads imported as a AlignedRead >>> object and I am not sure how to proceed next. The help for the >>> "coverage" method didn't get me very far. I am looking for some > pointers >>> to help me resolve the following questions. >>> >>> 1. How to create a RLE list from the AlignRead object using coverage >>> method ? >>> 2. Do I need to pack my reference genome as BSgenome data package in >>> order to do this ? If yes I just have the sequence and no group II > mask >>> file. >>> 3. Finally how to plot a RLE list. >>> >>> >>> >> This is a pretty difficult question to answer. Do you want to see all >> chromosomes at once (like a karyotype plot)? An overview of entire >> chromosomes? Or do you want to explore the data in an interactive > genome >> browser? >> >> There are many packages in Bioconductor for plotting this sort of > data, >> including GenomeGraphs and HilbertVis. For genome browsers, there's >> rtracklayer and others. >> >> For example, with rtracklayer: >> session <- browserSession() >> session$coverage <- as(coverage, "RangedData") >> >> >> >> >>> Thanks! >>> -Abhi >>> >>> _______________________________________________ >>> Bioc-sig-sequencing mailing list >>> [email protected] >>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing >>> >>> >> [[alternative HTML version deleted]] >> >> _______________________________________________ >> Bioc-sig-sequencing mailing list >> [email protected] >> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing >> > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing > > _______________________________________________ > Bioc-sig-sequencing mailing list > [email protected] > https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing -- Martin Morgan Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 [[alternative HTML version deleted]] _______________________________________________ Bioc-sig-sequencing mailing list [email protected] https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
