On Wed, May 22, 2013 at 11:15 AM, Hervé Pagès <hpa...@fhcrc.org> wrote:

> On 05/21/2013 06:37 PM, Michael Lawrence wrote:
>
>> It's clear we're not going to agree. I will volunteer to maintain these,
>> OK? I will maintain them anyway for my own use.
>>
>
> That's fine with me. I didn't mean we shouldn't have those wrappers (and
> I think Martin or Val are going to come back here with a proposal).
> I just wanted to make it clear that the seqlevels setter is not that
> hideous thing
>
>
>   seqlevels(x, new2old = match("chr1", seqlevels(x)), force = TRUE) <-
> "chr1"
>
> Sorry if I kept beating again and again the same dead horse...
>
>

I think your email got truncated again but anyway no hard feelings here.
I'm glad things are being resolved.

Looking forward to seeing you guys in July.


> Cheers,
> H.
>
>
>> Michael
>>
>>
>> On Tue, May 21, 2013 at 2:34 PM, Hervé Pagès <hpa...@fhcrc.org
>> <mailto:hpa...@fhcrc.org>> wrote:
>>
>>
>>
>>     On 05/21/2013 08:00 AM, Michael Lawrence wrote:
>>
>>
>>
>>
>>         On Mon, May 20, 2013 at 10:36 PM, Hervé Pagès <hpa...@fhcrc.org
>>         <mailto:hpa...@fhcrc.org>
>>         <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>>> wrote:
>>
>>              Michael,
>>
>>
>>              On 05/20/2013 08:44 PM, Michael Lawrence wrote:
>>
>>
>>
>>
>>                  On Mon, May 20, 2013 at 4:13 PM, Hervé Pagès
>>         <hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
>>                  <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>>
>>                  <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
>>         <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>>>> wrote:
>>
>>                       Hi,
>>
>>
>>                       On 05/20/2013 03:15 PM, Michael Lawrence wrote:
>>
>>                           On Mon, May 20, 2013 at 3:11 PM, Martin Morgan
>>                           <mtmor...@fhcrc.org
>>         <mailto:mtmor...@fhcrc.org> <mailto:mtmor...@fhcrc.org
>>         <mailto:mtmor...@fhcrc.org>>
>>                  <mailto:mtmor...@fhcrc.org <mailto:mtmor...@fhcrc.org>
>>         <mailto:mtmor...@fhcrc.org <mailto:mtmor...@fhcrc.org>>>> wrote:
>>
>>                               Hi Michael --
>>
>>
>>                               On 5/20/2013 5:56 AM, Michael Lawrence
>> wrote:
>>
>>                                   While it's cool that seqlevels<- has
>>         become so
>>                  flexible,
>>                                   I still claim
>>                                   that
>>                                   rename/keep/drop would be a lot easier
>>         to read,
>>                  because
>>                                   they describe the
>>                                   high-level operation, and there's no
>>         reason to
>>                  deprecate
>>                                   them. They're
>>                                   also
>>                                   easier to remember. For example, for
>>         dropping with
>>                                   seqlevels<-, the user
>>                                   needs
>>                                   to remember that "force" is necessary
>>         to drop. Much
>>                                   easier to just say
>>                                   "dropSeqlevels(), please". And
>>         reimplementing
>>                                   keepSeqlevels is still too
>>                                   complicated. Is it such a maintenance
>>         burden to
>>                  have
>>                                   these simple
>>                                   wrappers sit
>>                                   on top of the low-level manipulators?
>>
>>                                   Another suggestion: renameSeqlevels
>>         should not
>>                  require a
>>                                   named vector (in
>>                                   cases
>>                                   where it is unnamed, require that it is
>>                  parallel to the
>>                                   existing
>>                                   seqlevels, as
>>                                   with seqlevels<-).
>>
>>
>>                               I didn't really indicate what drove my
>>         desire to see
>>                               keepSeqlevels
>>                               deprecated. keepSeqlevels, seqlevels<-, and
>>                  isActiveSeq were
>>                               developed more
>>                               or less independently, and have different
>>                  contracts. The
>>                               different
>>                               contracts are confusing to the user, as is
>>         creating
>>                  a usable
>>                               help system
>>                               when there are multiple end points for
>>         what is a common
>>                               operation. The help
>>                               pages of each were inadequate in different
>>         ways.
>>                  And because
>>                               the code was
>>                               developed independently, support for
>> different
>>                  objects was
>>                               inconsistent. So
>>                               actually, yes, the maintenance (and use)
>>         burden for the
>>                               previous state of
>>                               affairs was substantial.
>>
>>                               On the other hand, I agree that
>>         keepSeqlevels is
>>                  convenient
>>                               as a simple
>>                               wrapper around seqlevels<-, in the same
>>         way that
>>                  setNames
>>                               and names<- are
>>                               both useful.
>>
>>                               So we could iterate to
>>
>>                                   keepSeqlevels <-
>>                                       function(x, value, ...)
>>                                   {
>>                                       seqlevels(x, force=TRUE) <- value
>>                                       x
>>                                   }
>>
>>                               but I'd be less enthusiastic about
>>         maintaining the
>>                  original
>>                               contract of
>>                               keepSeqlevels, where keepSeqlevels(gr1,
>>         gr2), would
>>                  have
>>                               worked for two
>>                               GRanges objects.
>>
>>
>>                       Why would this be called keepSeqlevels() if,
>>         depending on
>>                  how it's used,
>>                       it will either add, drop, rename, or permute the
>>         seqlevels?
>>                       Couldn't this be called setSeqlevels?
>>
>>
>>                  I thought that new2old was necessary for permuting.
>>
>>
>>              The seqlevels setter has no 'new2old' arg.
>>
>>
>>                  You're right that
>>                  adding should be disallowed for keepSeqlevels(). Adding
>>                  seqlevels is not
>>                  a common operation. The two common operations are:
>>                  * Permuting, usually because the data were imported in
>>         non-canonical
>>                  order (seqnameStyle was designed to address this, no?).
>>
>>
>>              Permuting is straightforward with seqlevels<-:
>>
>>                 > seqlevels(gr)
>>                 [1] "chr1" "chr2" "chr3"
>>
>>                 > seqlevels(gr) <- rev(seqlevels(gr))
>>
>>                 > seqlevels(gr)
>>                 [1] "chr3" "chr2" "chr1"
>>
>>
>>                  * Subsetting, either by keeping or dropping.
>>
>>
>>              Also straightforward:
>>
>>                 > seqlevels(gr, force=TRUE) <- "chr2"
>>
>>                 > seqlevels(gr)
>>                 [1] "chr2"
>>
>>
>>                  Two main reasons: taking a
>>                  small slice of the data for prototyping, or removing
>>         problematic
>>                  chromosomes (sex, circular). This goes back to bsapply
>>         and the
>>                  exclude
>>                  argument.
>>
>>
>>              There are other important use cases:
>>
>>                 - Dropping seqlevels that are *not* in use. This happens
>>         for example
>>                   when subsetting a BAM file with Rsamtools::filterBam
>>         to keep only
>>                   the alignments located on a given chromosome. The
>>         entire sequence
>>                   dictionary (in the header) is propagated to the
>>         resulting BAM file
>>                   so when you read the file with readGAlignments() you
>>         end up with a
>>                   bunch of useless seqlevels that you generally start by
>>         dropping.
>>                   You don't want to drop any alignment so force=FALSE is
>>         your friend.
>>
>>
>>         This is sort of tangential to the discussion, but do you really
>>         want to
>>         do this? I would preserve the universe as given by the BAM.
>>
>>                 - An even more common operation is renaming: 90% of the
>>         times that's
>>                   what I use seqlevels<- for, and that's what I tell
>>         people to use
>>                   when they need to rename their chromosomes. Also
>>         straightforward:
>>
>>                     > seqlevels(gr)
>>                     [1] "chr1" "chr2" "chr3" "chrM"
>>
>>                     > seqlevels(gr) <- sub("^chr", "", seqlevels(gr))
>>                     > seqlevels(gr)
>>                     [1] "1" "2" "3" "M"
>>
>>                     > seqlevels(gr)[seqlevels(gr) == "M"] <- "MT"
>>                     > seqlevels(gr)
>>                     [1] "1"  "2"  "3"  "MT"
>>
>>                   Note that this is simpler to use than
>>         renameSeqlevels() which
>>                   always required you to build the entire right value as
>>         a named
>>                   vector.
>>
>>
>>         Sure, renaming is a common use case. Not sure how I forgot that.
>>
>>         As I wrote earlier in the thread, renameSeqlevels should be
>>         changed so
>>         as not to require naming of the vector.
>>
>>              So the added-value of keepSeqlevels() seems to boil down to:
>>
>>                 (a) it always uses force=TRUE
>>
>>                 (b) it preserves the original object and returns the
>>         modified object
>>
>>              If you want to restrict the use of keepSeqlevels() to
>>         permuting and
>>              dropping, you'll also have to disallow renaming (in
>>         addition to adding).
>>              Then its name will more or less reflect what it does (if
>>         "keep" means
>>              "permute" or "drop"). The final result will be something
>>         that does less
>>              than setSeqlevels() and that is not necessarily easier to
>>         read: both
>>              will set on the object whatever vector of seqlevels they
>>         are supplied,
>>              except that one will fail when doing so would actually
>>         result in adding
>>              or renaming seqlevels.
>>
>>
>>         So it looks like seqlevels<- is pretty powerful. Now I see that
>> if I
>>         scroll down to the examples in the man page, I find the actual
>>         documentation. It's cool to learn that we can replace seqlevels
>>         on TxDb
>>         objects.
>>
>>
>>     Has been working on TranscriptDb objects for more than
>>
>>
>>         My argument has always been that since these low-level
>>         operations are so powerful, it's nice to have high-level
>>         operations to
>>         clarify the code. We have to think hard to know what the RHS
>>         will do in
>>         that replacement.
>>
>>
>>     When you do something like seqlevels(gr) <- new_seqlevels, you're
>> saying
>>     "can you please replace the current seqlevels by those new ones".
>>     That's all. The request/contract is simple and all the complexity is
>>     taken care of internally by seqlevels<-. It might be that, depending
>> on
>>     what's currently on 'gr', this will actually result in dropping,
>> adding,
>>     permuting. Or *a combination of the 3*. Like here:
>>
>>
>>        > seqlevels(gr)
>>        [1] "chr1" "chr2" "chr3"
>>
>>        > seqlevels(gr) <- c("chr4", "chr3", "chr1")
>>        > seqlevels(gr)
>>        [1] "chr4" "chr3" "chr1"
>>
>>        # one seqlevel was drop, one was added, and the 2 seqlevels that
>> were
>>        # preserved were reordered.
>>
>>     When you want to subset a vector to keep all elements with an even
>>     index you do x[c(FALSE, TRUE)]. Is that keeping or dropping elements?
>>     Do we need to have a subsetting operator for keeping elements, another
>>     one for dropping elements, and another one for permuting them (like
>>     in x[sample(length(x))])?
>>
>>     H.
>>
>>         It's probably one of the most complex replacement
>>         operations; far more complex than the typical one, including
>>         levels<- on
>>         factor. There's nothing wrong with that; it's just complexity
>>         that we
>>         should be able to hide.
>>
>>         Michael
>>
>>              H.
>>
>>
>>                  Michael
>>
>>
>>                       H.
>>
>>
>>
>>                           Well, I wasn't even aware of that feature, so
>>         you've
>>                  made your
>>                           point about
>>                           the documentation ;) Sounds like a good
>>         compromise to me.
>>
>>                           Thanks for understanding,
>>                           Michael
>>
>>
>>
>>                               Martin
>>
>>
>>                                   Michael
>>
>>
>>
>>
>>
>>
>>                                   On Sat, May 18, 2013 at 6:00 PM,
>>         Martin Morgan
>>                                   <mtmor...@fhcrc.org
>>         <mailto:mtmor...@fhcrc.org> <mailto:mtmor...@fhcrc.org
>>         <mailto:mtmor...@fhcrc.org>>
>>                  <mailto:mtmor...@fhcrc.org <mailto:mtmor...@fhcrc.org>
>>         <mailto:mtmor...@fhcrc.org <mailto:mtmor...@fhcrc.org>>>
>>                                   <mailto:mtmor...@fhcrc.org
>>         <mailto:mtmor...@fhcrc.org>
>>                  <mailto:mtmor...@fhcrc.org <mailto:mtmor...@fhcrc.org>>
>>         <mailto:mtmor...@fhcrc.org <mailto:mtmor...@fhcrc.org>
>>                  <mailto:mtmor...@fhcrc.org <mailto:mtmor...@fhcrc.org
>> >>>>**>
>>
>>
>>                                   wrote:
>>
>>                                         On 05/18/2013 05:42 PM, Martin
>>         Morgan wrote:
>>
>>                                             Some of the most common
>>         operations are
>>                                   straight-forward, e.g.,
>>
>>                                                   > gr =
>>         GRanges(paste0("chr",
>>                  c(1:22,
>>                                   "X", "Y")), IRanges(1,
>>                                   100))
>>                                                   > seqlevels(gr) =
>>         sub("chr", "ch",
>>                                   seqlevels(gr))
>>
>>
>>                                         To get a more comprehensive
>>         example I
>>                  should have
>>                                   followed my own
>>                                   advice and
>>                                         grabbed from the help page!
>>
>>                                               ## Rename:
>>                                               seqlevels(txdb) <-
>>         sub("chr", "",
>>                                   seqlevels(txdb))
>>                                               seqlevels(txdb)
>>
>>                                               seqlevels(txdb) <-
>>         paste0("CH",
>>                  seqlevels(txdb))
>>                                               seqlevels(txdb)
>>
>>
>>                  seqlevels(txdb)[seqlevels(__****______txdb) ==
>>
>>
>>
>>                                   "CHM"] <- "M"
>>
>>
>>                                               seqlevels(txdb)
>>
>>
>>                                         --
>>                                         Computational Biology / Fred
>>         Hutchinson
>>                  Cancer
>>                                   Research Center
>>                                         1100 Fairview Ave. N.
>>                                         PO Box 19024 Seattle, WA 98109
>>
>>                                         Location: Arnold Building M1 B861
>>                                         Phone: (206) 667-2793
>>         <tel:%28206%29%20667-2793>
>>                  <tel:%28206%29%20667-2793> <tel:%28206%29%20667-2793>
>>                                   <tel:%28206%29%20667-2793>
>>
>>
>>
>>
>>                               --
>>                               Dr. Martin Morgan, PhD
>>
>>                               Fred Hutchinson Cancer Research Center
>>                               1100 Fairview Ave. N.
>>                               PO Box 19024 Seattle, WA 98109
>>
>>
>>                                    [[alternative HTML version deleted]]
>>
>>
>>           ______________________________**_______________________
>>
>>         Bioc-devel@r-project.org 
>> <mailto:Bioc-devel@r-project.**org<Bioc-devel@r-project.org>
>> >
>>         <mailto:Bioc-devel@r-project._**_org
>>         <mailto:Bioc-devel@r-project.**org <Bioc-devel@r-project.org>>>
>>                  <mailto:Bioc-devel@r-project.
>>         <mailto:Bioc-devel@r-project.>**____org
>>                  <mailto:Bioc-devel@r-project._**_org
>>         <mailto:Bioc-devel@r-project.**org <Bioc-devel@r-project.org>>>>
>>                           mailing list
>>         
>> https://stat.ethz.ch/mailman/_**_____listinfo/bioc-devel<https://stat.ethz.ch/mailman/______listinfo/bioc-devel>
>>         
>> <https://stat.ethz.ch/mailman/**____listinfo/bioc-devel<https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>> >
>>
>>                  
>> <https://stat.ethz.ch/mailman/**____listinfo/bioc-devel<https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>         
>> <https://stat.ethz.ch/mailman/**__listinfo/bioc-devel<https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>> >>
>>
>>
>>
>>           
>> <https://stat.ethz.ch/mailman/**____listinfo/bioc-devel<https://stat.ethz.ch/mailman/____listinfo/bioc-devel>
>>         
>> <https://stat.ethz.ch/mailman/**__listinfo/bioc-devel<https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>> >
>>                  
>> <https://stat.ethz.ch/mailman/**__listinfo/bioc-devel<https://stat.ethz.ch/mailman/__listinfo/bioc-devel>
>>         
>> <https://stat.ethz.ch/mailman/**listinfo/bioc-devel<https://stat.ethz.ch/mailman/listinfo/bioc-devel>
>> >>>
>>
>>
>>                       --
>>                       Hervé Pagès
>>
>>                       Program in Computational Biology
>>                       Division of Public Health Sciences
>>
>>                       Fred Hutchinson Cancer Research Center
>>                       1100 Fairview Ave. N, M1-B514
>>                       P.O. Box 19024
>>                       Seattle, WA 98109-1024
>>
>>                       E-mail: hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
>>         <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>>
>>                  <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
>>
>>         <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>>>
>>
>>                       Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
>>         <tel:%28206%29%20667-5791>
>>                  <tel:%28206%29%20667-5791>
>>                       Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
>>         <tel:%28206%29%20667-1319>
>>                  <tel:%28206%29%20667-1319>
>>
>>
>>
>>              --
>>              Hervé Pagès
>>
>>              Program in Computational Biology
>>              Division of Public Health Sciences
>>              Fred Hutchinson Cancer Research Center
>>              1100 Fairview Ave. N, M1-B514
>>              P.O. Box 19024
>>              Seattle, WA 98109-1024
>>
>>              E-mail: hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
>>         <mailto:hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>>
>>              Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
>>         <tel:%28206%29%20667-5791>
>>              Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
>>         <tel:%28206%29%20667-1319>
>>
>>
>>
>>     --
>>     Hervé Pagès
>>
>>     Program in Computational Biology
>>     Division of Public Health Sciences
>>     Fred Hutchinson Cancer Research Center
>>     1100 Fairview Ave. N, M1-B514
>>     P.O. Box 19024
>>     Seattle, WA 98109-1024
>>
>>     E-mail: hpa...@fhcrc.org <mailto:hpa...@fhcrc.org>
>>     Phone: (206) 667-5791 <tel:%28206%29%20667-5791>
>>     Fax: (206) 667-1319 <tel:%28206%29%20667-1319>
>>
>>
>>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpa...@fhcrc.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>

        [[alternative HTML version deleted]]

_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Reply via email to