Thanks, Sean. We probably also need an isSubstitution() for any substitution,
either SNV or complex.
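
A rough sketch of what I have in mind, not a definitive implementation. It assumes that "substitution" means REF and ALT have the same width and differ, which is only one possible reading of "SNV or complex". The name isSubstitutionSketch is hypothetical, and it takes plain character vectors rather than going through ref(x)/alt(x) so that it runs without any VCF object:

```r
# Hypothetical helper for illustration; not an existing VariantAnnotation function.
# 'ref' and 'alt' are plain character vectors holding one allele pair per element.
isSubstitutionSketch <- function(ref, alt) {
  stopifnot(is.character(ref), is.character(alt),
            length(ref) == length(alt))
  # Assumption: a substitution keeps the allele width and changes the sequence.
  nchar(ref) == nchar(alt) & ref != alt
}

ref <- c("G", "AA", "T",  "G")
alt <- c("A", "TT", "TG", "G")
isSubstitutionSketch(ref, alt)
```

Under that assumption the first two pairs count as substitutions (an SNV and a two-base change), the third is an insertion, and the fourth is not a variant at all.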


On Wed, Mar 19, 2014 at 3:20 PM, Sean Davis <sdav...@mail.nih.gov> wrote:

>
>
> On Wed, Mar 19, 2014 at 4:26 PM, Valerie Obenchain <voben...@fhcrc.org> wrote:
>
>> Thanks for the feedback.
>>
>> I'll look into nchar for XStringSetList.
>>
>> I'm in favor of supporting isDeletion(), isInsertion(), isIndel() and
>> isSNV() for the VCF classes and removing restrictToSNV(). I could add an
>> argument 'all_alt' or 'all_alt_agreement' to be used with CollapsedVCF in
>> the case where not all alternate alleles meet the criteria.
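
To make the 'all_alt' idea concrete, here is a minimal sketch under stated assumptions: a plain list of character vectors stands in for the DNAStringSetList returned by alt() on a CollapsedVCF, the function name isSNVCollapsedSketch is hypothetical, and 'all_alt' only mirrors the argument proposed above (it is not an existing API). The data are the four REF/ALT rows from the original post quoted further down.

```r
# Hypothetical sketch; 'alt' is a list with one character vector of ALT
# alleles per variant, 'ref' a character vector of the same length.
isSNVCollapsedSketch <- function(ref, alt, all_alt = TRUE) {
  stopifnot(is.character(ref), is.list(alt), length(ref) == length(alt))
  # all_alt = TRUE: every ALT must be width 1; FALSE: at least one ALT must be.
  collapse <- if (all_alt) all else any
  alt_ok <- vapply(alt, function(a) collapse(nchar(a) == 1L), logical(1))
  nchar(ref) == 1L & alt_ok
}

ref <- c("G", "AA", "T", "G")
alt <- list("A", "TT", c("G", "A"), c("TT", "C"))
isSNVCollapsedSketch(ref, alt, all_alt = TRUE)   # row 4 is not treated as a SNV
isSNVCollapsedSketch(ref, alt, all_alt = FALSE)  # row 4 is treated as a SNV
```

Either way the choice is explicit at the call site, which seems better than silently picking one interpretation for mixed rows.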
>>
>> Here are the current definitions:
>>
>> isDeletion <- function(x) {
>>   nchar(alt(x)) == 1L & nchar(ref(x)) > 1L &
>>     substring(ref(x), 1, 1) == alt(x)
>> }
>>
>> isInsertion <- function(x) {
>>   nchar(ref(x)) == 1L & nchar(alt(x)) > 1L &
>>     substring(alt(x), 1, 1) == ref(x)
>> }
>>
>> isIndel <- function(x) {
>>   isDeletion(x) | isInsertion(x)
>> }
>>
>> isSNV <- function(x) {
>>   nchar(alt(x)) == 1L & nchar(ref(x)) == 1L
>> }
>>
>>
> To be thorough:
>
> isTransition()
>
> isSV()
>
> isSVPrecise()
>
> We haven't been using VCF for SVs much yet, but there are probably some
> fun things to be done on that front.
>
> Sean
>
>
>
>
>>
>> Valerie
>>
>>
>>
>> On 03/19/2014 01:07 PM, Vincent Carey wrote:
>>
>>>
>>>
>>>
>>> On Wed, Mar 19, 2014 at 4:00 PM, Michael Lawrence
>>> <lawrence.mich...@gene.com> wrote:
>>>
>>>     It would be nice to have functions like isSNV, isIndel, isDeletion,
>>>     etc that at least provide precise definitions of the terminology.
>>>     I've added these, but they're designed only for VRanges. Should work
>>>     for ExpandedVCF.
>>>
>>>     Also, it would be nice if restrictToSNV just assumed that alt(x)
>>>     must be something with nchar() support (with special handling for
>>>     any List), so that the 'character' vector of alt,VRanges would work
>>>     immediately. Basically restrictToSNV should just be x[isSNV(x)]. Is
>>>     there even a use-case for the restrictToSNV abstraction if we did
>>>     that?
>>>
>>>
>>> For a VCF instance it would be x[isSNV(x), ], and indeed I think that would
>>> be sufficient. I like the idea of having this family of predicates for
>>> variant classes to allow such selections.
>>>
>>>     Michael
>>>
>>>
>>>
>>>     On Tue, Mar 18, 2014 at 10:36 AM, Valerie Obenchain
>>>     <voben...@fhcrc.org> wrote:
>>>
>>>         Hi,
>>>
>>>         I've added a restrictToSNV() function to VariantAnnotation
>>>         (1.9.46). The return value is a subset VCF object containing
>>>         SNVs only. The function operates on CollapsedVCF or ExpandedVCF
>>>         and the alt(VCF) value must be nucleotides (i.e., no structural
>>>         variants).
>>>
>>>         A variant is considered a SNV if the nucleotide sequences in
>>>         both ref(vcf) and alt(vcf) are of length 1. I have a question
>>>         about how variants with multiple 'ALT' values should be handled.
>>>
>>>         Should we consider row 4 a SNV? One 'ALT' is length 1, the other
>>>         is not.
>>>
>>>         ALT <- DNAStringSetList("A", c("TT"), c("G", "A"), c("TT", "C"))
>>>         REF <- DNAStringSet(c("G", c("AA"), "T", "G"))
>>>
>>>                 DataFrame(REF, ALT)
>>>
>>>             DataFrame with 4 rows and 2 columns
>>>                           REF                ALT
>>>                <DNAStringSet> <DNAStringSetList>
>>>             1              G                  A
>>>             2             AA                 TT
>>>             3              T                G,A
>>>             4              G               TT,C
>>>
>>>
>>>
>>>         Thanks.
>>>         Valerie
>>>
>>>         _______________________________________________
>>>         Bioc-devel@r-project.org mailing list
>>>         https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>>
>>>
>>>
>>>
>>
>> --
>> Valerie Obenchain
>> Program in Computational Biology
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B155
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: voben...@fhcrc.org
>> Phone:  (206) 667-3158
>> Fax:    (206) 667-1319
>>
>>
>>
>
>


