On Mon, Sep 19, 2011 at 11:31 AM, Martin Morgan <mtmor...@fhcrc.org> wrote:
> On 09/19/2011 11:26 AM, Rene Paradis wrote:
>
>> Thanks Martin and Michael for your constructive advice,
>>
>> I used the ScanBamParam object to successfully load a part of chr1
>> from a BAM file via scanBam. Honestly, I do not know what the
>> differences are between readGappedAlignments, readBamGappedAlignments
>> and scanBam. The last two of them can take a ScanBamParam object.
>>
>
> scanBam returns a list-of-lists, it's the most flexible but least
> 'user-friendly'.
>
> readGappedAlignments is meant to be a 'front end' to read GappedAlignments
> from several different sources, and readBamGappedAlignments is meant to be
> one of those sources; usually the 'user' would call readGappedAlignments.
>
>> But I wished I could select the seqname in GRanges to retrieve all the
>> chr1 (as an example) data from the BAM file. It seems I must select a
>> range. So I put a value that goes beyond the range of chr1 because I
>> do not know that range, and I got an <<INTEGER () can only be applied to
>> a 'integer', not a special>>.
>

Couldn't Rsamtools give something more informative?

>> There must be something I missed that could help me do that.
>>
>
> see ?scanBamHeader, e.g.,
>
> > fl <- system.file("extdata", "ex1.bam", package="Rsamtools")
> > scanBamHeader(fl)[[1]]$targets
> seq1 seq2
> 1575 1584
>

It would be nice to have a method for getting a Seqinfo out of a BAM header.
Then one could just coerce that to a GRanges. rtracklayer does the equivalent
for BigWig.

Michael

> Martin
>
>> Ultimately, I want to launch a PICS analysis that requires a
>> segReadsList object.
>>
>> Overall I definitely progressed with your help, thank you.
>>
>> Rene
>>
>> On Fri, 2011-09-16 at 14:29 -0700, Martin Morgan wrote:
>>
>>> On 09/16/2011 02:11 PM, Michael Lawrence wrote:
>>>
>>>> It sounds like you're trying to use BED as an alternative to BAM?
>>>> Probably not a good idea, especially at this scale. Why are you
>>>> aiming for a GenomeData?
>>>> A GappedAlignments might be more appropriate. See
>>>> GenomicRanges::readGappedAlignments() for bringing a BAM into a
>>>> GappedAlignments.
>>>>
>>>
>>> Hi Rene
>>>
>>> the 'which' argument to readGappedAlignments (it'll become 'param' with
>>> the next release, and be a ScanBamParam object) allows you to select
>>> regions to process, e.g., chromosome-at-a-time, to help with file size.
>>>
>>> Martin
>>>
>>>>
>>>> This page might help:
>>>> http://bioconductor.org/help/workflows/high-throughput-sequencing/#sequencing-resources
>>>>
>>>> But it could really be improved.
>>>>
>>>> Michael
>>>>
>>>> On Fri, Sep 16, 2011 at 1:44 PM, Rene Paradis
>>>> <rene.para...@genome.ulaval.ca> wrote:
>>>>
>>>>> Hello,
>>>>>
>>>>> I am having a problem loading 30 GB BED files into memory. My call
>>>>> to read.table raises the error: Error in unique(x) :
>>>>> length xxxxxx is too large for hashing.
>>>>>
>>>>> This is generated by the function MKsetup in the unique.c file. Even
>>>>> after increasing the value by 10 000x, the error persists. I believe
>>>>> the function pushes more data into RAM, but I am not sure this is the
>>>>> right thing to focus on.
>>>>>
>>>>> Ultimately, I would like to produce a GenomeData object from either a
>>>>> BAM file or a BED file.
>>>>>
>>>>> Has anyone ever worked with very large BAM files (about 30 GB)?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Rene Paradis
>>>>>
>>>>> _______________________________________________
>>>>> Bioc-sig-sequencing mailing list
>>>>> Bioc-sig-sequencing@r-project.org
>>>>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>>>>
>
> --
> Computational Biology
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109
>
> Location: M1-B861
> Telephone: 206 667-2793
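
For reference, here is a minimal sketch of the approach suggested in the thread: look up the sequence length with scanBamHeader, then build a range that spans the whole sequence instead of guessing an end coordinate. It assumes the ex1.bam example file (and its index) that ships with Rsamtools. The sequence name "seq1" stands in for chr1, and the variable names and the fields listed in 'what' are chosen only for illustration; none of these come from the original posts.

```r
library(Rsamtools)      # scanBamHeader, ScanBamParam, scanBam
library(GenomicRanges)  # GRanges (IRanges is attached with it)

# Example BAM shipped with Rsamtools; its .bai index sits next to it
fl <- system.file("extdata", "ex1.bam", package = "Rsamtools")

# Named integer vector of sequence lengths, read from the BAM header
targets <- scanBamHeader(fl)[[1]]$targets

# One range covering a whole sequence, with the end taken from the header.
# "seq1" is an assumption standing in for chr1.
seqname <- "seq1"
whole_seq <- GRanges(seqname, IRanges(start = 1L, end = targets[[seqname]]))

# Restrict the scan to that range; 'what' lists the fields to return
param <- ScanBamParam(which = whole_seq, what = c("rname", "pos", "qwidth"))
res <- scanBam(fl, param = param)

# scanBam returns a list-of-lists: one element per requested range
str(res[[1]])
```

Going by Martin's note, the same ScanBamParam object should also be usable with readGappedAlignments once its 'which' argument becomes 'param'.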