great, thanks!!

robert.

On 02/25/2015 10:06 PM, Michael Lawrence wrote:
I checked in a fix for the splitting to CompressedVRangesList. The
slowness of creating a SimpleVRangesList is due to the cost of
extracting a VRanges for each sample. Depending on your exact use case, it
might be better to pay that cost up-front, instead of deferring it to
when the user wants to extract an element, which happens with the
compressed list. As long as the number of samples is small, the memory
overhead should be minimal.
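
A toy base-R analogy of that trade-off, as a minimal sketch only: it is not the VariantAnnotation implementation, and `values`, `sample_id`, `compressed` and `get_compressed` are hypothetical names made up for the illustration. The "simple" layout splits once, up front; the "compressed" layout keeps one flat vector and slices it every time an element is requested.

```r
# Toy analogy in base R; NOT how VRangesList classes are implemented.
set.seed(1)
values <- runif(1e5)
sample_id <- factor(sample(paste0("S", 1:10), length(values), replace = TRUE))

# "Simple" layout: pay the splitting cost once, up front.
simple <- split(values, sample_id)

# "Compressed" layout: one flat vector grouped by sample,
# plus the end position of each group.
ord <- order(sample_id)
compressed <- list(
  unlisted = values[ord],
  ends = cumsum(tabulate(sample_id, nbins = nlevels(sample_id))),
  names = levels(sample_id)
)

# Extracting element i from the compressed layout slices the flat
# vector on every call, so the cost is deferred to access time.
get_compressed <- function(x, i) {
  start <- if (i == 1L) 1L else x$ends[i - 1L] + 1L
  x$unlisted[seq.int(start, length.out = x$ends[i] - start + 1L)]
}

# Both layouts hold the same data for every sample.
stopifnot(identical(get_compressed(compressed, 3L), simple[[3]]))
```

With only a handful of samples, the pre-split copy costs little extra memory, which is the case described above.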

Michael

On Wed, Feb 25, 2015 at 9:59 AM, Michael Lawrence <micha...@gene.com> wrote:

    Yea, I know, just need to get around to that one. Technically, it
    works, but it's obviously not ideal.

    On Wed, Feb 25, 2015 at 8:52 AM, Gabe Becker <becker.g...@gene.com> wrote:

        Why does splitting a VRanges give a GRangesList with VRanges
        objects as elements? Seems like it should return a VRangesList.

             > spl = split(vr, sampleNames(vr))
             > class(spl)
            [1] "GRangesList"
            attr(,"package")
            [1] "GenomicRanges"
             > class(spl[[1]])
            [1] "VRanges"
            attr(,"package")
            [1] "VariantAnnotation"


        ~G

        On Wed, Feb 25, 2015 at 8:39 AM, Michael Lawrence
        <lawrence.mich...@gene.com> wrote:

            Construction will take longer; the savings are in the accessing
            of the elements. But this seems like too much longer, so I will
            look into it.

            On Wed, Feb 25, 2015 at 8:12 AM, Robert Castelo
            <robert.cast...@upf.edu> wrote:

             > my current reason to prefer a CompressedVRangesList object over a
             > SimpleVRangesList object is that i find an order of magnitude
             > difference in creation time between these two classes of objects:
             >
             > library(VariantAnnotation)
             >
             > fl <- system.file("extdata", "CEUtrio.vcf.bgz",
             >                   package="VariantFiltering")
             >
             > vcf <- readVcf(fl, genome="hg19")
             > vr <- as(vcf, "VRanges")
             > length(vr)
             > [1] 15000
             >
             > ## create a VRangesList object
             > system.time(vrl <- do.call("VRangesList",
             >                            split(vr, sampleNames(vr))))
             >    user  system elapsed
             >   0.247   0.004   0.252
             >
             > ## create a CompressedVRangesList object
             > system.time(cvrl <- new("CompressedVRangesList", split(vr,
             > sampleNames(vr))))
             >    user  system elapsed
             >   0.019   0.000   0.019
             >
             > 0.252/0.019
             > [1] 13.26316
             >
             > with a larger vcf the difference increases:
             >
             > [... load vcf, coerce to VRanges ...]
             > length(vr)
             > [1] 25916
             >
             > system.time(vrl <- do.call("VRangesList",
             >                            split(vr, sampleNames(vr))))
             >    user  system elapsed
             >   2.672   0.000   2.676
             >
             > system.time(cvrl <- new("CompressedVRangesList", split(vr,
             > sampleNames(vr))))
             >    user  system elapsed
             >   0.014   0.000   0.014
             >
             > 2.676 / 0.014
             > [1] 191.1429
             >
             >
             > so maybe i'm using the wrong way to construct a VRangesList object,
             > but according to our last conversation about this, there was no
             > obvious default fast way to do it, starting from a VRanges object:
             >
             > https://stat.ethz.ch/pipermail/bioc-devel/2015-January/006905.html
             >
             > it would be great if there were a fast way to do this kind of
             > construction.
             >
             > thanks,
             >
             > robert.
             >
             > On 02/25/2015 04:42 PM, Michael Lawrence wrote:
             >
             >> If you're storing data on a relatively small number of individuals
             >> (say, hundreds), you should use SimpleVRangesList, not
             >> CompressedVRangesList.
             >>
             >> On Wed, Feb 25, 2015 at 7:10 AM, Robert Castelo
             >> <robert.cast...@upf.edu> wrote:
             >>
             >>     i see your point. the logic i was thinking about is to use a list of
             >>     VRanges objects to hold the variants of multiple individuals
             >>     separately, with one VRanges object per individual.
             >>
             >>     if i type the name of such a list object in the R shell, with the
             >>     GRangesList show method, i feel i do not see much information because
             >>     the screen just scrolls up tens or hundreds of lines specifying
             >>     variants per individual. however, the concise appearance of something
             >>     like a VRangesList:
             >>
             >> > vrl
             >>     VRangesList of length 10
             >>     names(32): S1 S2 S3 S4 ... S7 S8 S9 S10
             >>
             >>     at least suggests to the user that the object holding the variants
             >>     has information for 10 samples and belongs to the class 'VRangesList'.
             >>
             >>     i thought this made general sense but i'm fine if you feel this
             >>     interpretation does not warrant such a change.
             >>
             >>     cheers,
             >>
             >>     robert.
             >>
             >>     On 02/25/2015 01:25 AM, Michael Lawrence wrote:
             >>
             >>         Why not have the SimpleVRangesList be shown like
             >>         CompressedVRangesList, for consistency with GRangesList? In other
             >>         words, the opposite of what you propose. A strong argument could
             >>         also be made that a SimpleGenomicRangesList should be shown like a
             >>         GRangesList. Unless there is some aversion to the more verbose
             >>         output....
             >>
             >>         On Tue, Feb 24, 2015 at 2:36 PM, Robert Castelo
             >>         <robert.cast...@upf.edu> wrote:
             >>
             >>              so, yes, but IMO rather than inheriting the show method from
             >>              a GRangesList, i think that the show method for
             >>              CompressedVRangesList objects should be inherited from a
             >>              VRangesList object. right now this is the situation:
             >>
             >>              library(VariantAnnotation)
             >>
             >>              example(VRangesList)
             >>              vrl
             >>              VRangesList of length 2
             >>              names(2): sampleA sampleB
             >>
             >>              cvrl <- new("CompressedVRangesList",
             >>                          split(vr, sampleNames(vr)))
             >>              cvrl
             >>              CompressedVRangesList object of length 2:
             >>              $a
             >>              VRanges object with 1 range and 1 metadata column:
             >>                    seqnames    ranges strand         ref              alt
             >>                       <Rle> <IRanges>  <Rle> <character> <characterOrRle>
             >>                [1]     chr1    [1, 5]      +           T                C
             >>                        totalDepth       refDepth       altDepth
             >>                    <integerOrRle> <integerOrRle> <integerOrRle>
             >>                [1]             12              5              7
             >>                      sampleNames softFilterMatrix | tumorSpecific
             >>                    <factorOrRle>         <matrix> |     <logical>
             >>                [1]             a             TRUE |         FALSE
             >>
             >>              $b
             >>              VRanges object with 1 range and 1 metadata column:
             >>                    seqnames   ranges strand ref alt totalDepth refDepth altDepth
             >>                [1]     chr2 [10, 20]      +   A   T         17       10        6
             >>                    sampleNames softFilterMatrix | tumorSpecific
             >>                [1]           b            FALSE |          TRUE
             >>
             >>              -------
             >>              seqinfo: 2 sequences from an unspecified genome; no seqlengths
             >>
             >>              would it be possible to have the VRangesList show method for
             >>              CompressedVRangesList objects?
             >>
             >>              robert.
             >>
             >>
             >>
             >>              On 2/24/15 7:24 PM, Michael Lawrence wrote:
             >>
             >>                  I think you might be missing an import. It should inherit
             >>                  the method for GRangesList.
             >>
             >>                  On Tue, Feb 24, 2015 at 9:53 AM, Robert Castelo
             >>                  <robert.cast...@upf.edu> wrote:
             >>
             >>                      hi,
             >>
             >>                      i'm using the CompressedVRangesList class in
             >>                      VariantFiltering to hold variants and their
             >>                      annotations across multiple samples and found that
             >>                      there was no show method for this class (unless i'm
             >>                      missing the right import here) so i made one within
             >>                      VariantFiltering by copying and pasting from other
             >>                      similar classes:
             >>
             >>                      setMethod("show", signature(object="CompressedVRangesList"),
             >>                                function(object) {
             >>                                  lo <- length(object)
             >>                                  ## one-line header: class name and number of elements
             >>                                  cat(classNameForDisplay(object), " of length ", lo, "\n",
             >>                                      sep = "")
             >>                                  ## list the element (sample) names, if any
             >>                                  if (!is.null(names(object)))
             >>                                    cat(BiocGenerics:::labeledLine("names", names(object)))
             >>                                })
             >>
             >>                      i guess, however, that the right home for this would
             >>                      be VariantAnnotation. let me know if you consider
             >>                      adding it there (or somewhere else) and i'll remove
             >>                      it from VariantFiltering.
             >>
             >>                      thanks,
             >>
             >>                      robert.
             >>
             >>
             >>                      _______________________________________________
             >>                      Bioc-devel@r-project.org mailing list
             >>                      https://stat.ethz.ch/mailman/listinfo/bioc-devel
             >>
             >>
             >>
             >>
             >>
             >>     --
             >>     Robert Castelo, PhD
             >>     Associate Professor
             >>     Dept. of Experimental and Health Sciences
             >>     Universitat Pompeu Fabra (UPF)
             >>     Barcelona Biomedical Research Park (PRBB)
             >>     Dr Aiguader 88
             >>     E-08003 Barcelona, Spain
             >>     telf: +34.933.160.514
             >>     fax: +34.933.160.550
             >>
             >>
             >>
             > --
             > Robert Castelo, PhD
             > Associate Professor
             > Dept. of Experimental and Health Sciences
             > Universitat Pompeu Fabra (UPF)
             > Barcelona Biomedical Research Park (PRBB)
             > Dr Aiguader 88
             > E-08003 Barcelona, Spain
             > telf: +34.933.160.514
             > fax: +34.933.160.550
             >





        --
        Gabriel Becker, Ph.D
        Computational Biologist
        Genentech Research




--
Robert Castelo, PhD
Associate Professor
Dept. of Experimental and Health Sciences
Universitat Pompeu Fabra (UPF)
Barcelona Biomedical Research Park (PRBB)
Dr Aiguader 88
E-08003 Barcelona, Spain
telf: +34.933.160.514
fax: +34.933.160.550

