Thanks, Sean. We probably also need an isSubstitution() for any substitution, either SNV or complex.
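A rough sketch of how that predicate could sit alongside the definitions quoted below. To keep it self-contained it works on plain character vectors of REF and ALT rather than on the VCF or VRanges classes, so the two-argument signatures are an assumption. The reading of "substitution" here (REF and ALT differ and the change is not a simple indel) is also only an assumption for illustration, as is the hypothetical perRow() helper, which shows one way an 'all_alt'-style argument might resolve rows with several ALT alleles.

```r
# Predicates on plain character vectors (one ALT per element), mirroring the
# definitions quoted in this thread. The (ref, alt) signature is an assumption.
isSNV <- function(ref, alt) nchar(ref) == 1L & nchar(alt) == 1L

isDeletion <- function(ref, alt) {
    nchar(alt) == 1L & nchar(ref) > 1L & substring(ref, 1, 1) == alt
}

isInsertion <- function(ref, alt) {
    nchar(ref) == 1L & nchar(alt) > 1L & substring(alt, 1, 1) == ref
}

isIndel <- function(ref, alt) isDeletion(ref, alt) | isInsertion(ref, alt)

# Assumed definition: any real change that is not a simple indel,
# i.e. an SNV or a "complex" replacement.
isSubstitution <- function(ref, alt) ref != alt & !isIndel(ref, alt)

# Hypothetical helper for the multi-ALT case: altList holds one character
# vector of ALT alleles per row. all_alt = TRUE requires every ALT to satisfy
# the predicate; all_alt = FALSE requires at least one.
perRow <- function(pred, ref, altList, all_alt = TRUE) {
    agree <- if (all_alt) all else any
    mapply(function(r, a) agree(pred(r, a)), ref, altList, USE.NAMES = FALSE)
}

ref <- c("G", "AA", "AT", "A", "G")
alt <- c("A", "TT", "A", "AT", "TT")
sub_flags <- isSubstitution(ref, alt)

# The REF/ALT example from the original post, as base R objects.
REF <- c("G", "AA", "T", "G")
ALT <- list("A", "TT", c("G", "A"), c("TT", "C"))
snv_all <- perRow(isSNV, REF, ALT, all_alt = TRUE)
snv_any <- perRow(isSNV, REF, ALT, all_alt = FALSE)
```

Under these assumptions row 4 of the example (G to TT,C) counts as a SNV only when all_alt = FALSE, which is exactly the choice the argument would expose.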

On Wed, Mar 19, 2014 at 3:20 PM, Sean Davis <sdav...@mail.nih.gov> wrote:

> On Wed, Mar 19, 2014 at 4:26 PM, Valerie Obenchain <voben...@fhcrc.org> wrote:
>
>> Thanks for the feedback.
>>
>> I'll look into nchar for XStringSetList.
>>
>> I'm in favor of supporting isDeletion(), isInsertion(), isIndel() and
>> isSNV() for the VCF classes and removing restrictToSNV(). I could add an
>> argument 'all_alt' or 'all_alt_agreement' to be used with CollapsedVCF in
>> the case where not all alternate alleles meet the criteria.
>>
>> Here are the current definitions:
>>
>> isDeletion <- function(x) {
>>     nchar(alt(x)) == 1L & nchar(ref(x)) > 1L &
>>         substring(ref(x), 1, 1) == alt(x)
>> }
>>
>> isInsertion <- function(x) {
>>     nchar(ref(x)) == 1L & nchar(alt(x)) > 1L &
>>         substring(alt(x), 1, 1) == ref(x)
>> }
>>
>> isIndel <- function(x) {
>>     isDeletion(x) | isInsertion(x)
>> }
>>
>> isSNV <- function(x) {
>>     nchar(alt(x)) == 1L & nchar(ref(x)) == 1L
>> }
>
> To be thorough:
>
> isTransition()
> isSV()
> isSVPrecise()
>
> We haven't been using VCF for SVs much yet, but there are probably some
> fun things to be done on that front.
>
> Sean
>
>> Valerie
>>
>> On 03/19/2014 01:07 PM, Vincent Carey wrote:
>>
>>> On Wed, Mar 19, 2014 at 4:00 PM, Michael Lawrence
>>> <lawrence.mich...@gene.com> wrote:
>>>
>>>     It would be nice to have functions like isSNV, isIndel, isDeletion,
>>>     etc that at least provide precise definitions of the terminology.
>>>     I've added these, but they're designed only for VRanges. Should work
>>>     for ExpandedVCF.
>>>
>>>     Also, it would be nice if restrictToSNV just assumed that alt(x)
>>>     must be something with nchar() support (with special handling for
>>>     any List), so that the 'character' vector of alt,VRanges would work
>>>     immediately. Basically restrictToSNV should just be x[isSNV(x)]. Is
>>>     there even a use-case for the restrictToSNV abstraction if we did
>>>     that?
>>>
>>> For a VCF instance it would be x[isSNV(x),] and indeed I think that would
>>> be sufficient. I like the idea of having this family of predicates for
>>> variant classes to allow such selections.
>>>
>>>     Michael
>>>
>>>     On Tue, Mar 18, 2014 at 10:36 AM, Valerie Obenchain
>>>     <voben...@fhcrc.org> wrote:
>>>
>>>         Hi,
>>>
>>>         I've added a restrictToSNV() function to VariantAnnotation
>>>         (1.9.46). The return value is a subset VCF object containing
>>>         SNVs only. The function operates on CollapsedVCF or ExpandedVCF
>>>         and the alt(VCF) value must be nucleotides (i.e., no structural
>>>         variants).
>>>
>>>         A variant is considered a SNV if the nucleotide sequences in
>>>         both ref(vcf) and alt(vcf) are of length 1. I have a question
>>>         about how variants with multiple 'ALT' values should be handled.
>>>
>>>         Should we consider row 4 a SNV? One 'ALT' is length 1, the other
>>>         is not.
>>>
>>>         ALT <- DNAStringSetList("A", c("TT"), c("G", "A"), c("TT", "C"))
>>>         REF <- DNAStringSet(c("G", c("AA"), "T", "G"))
>>>
>>>         DataFrame(REF, ALT)
>>>
>>>         DataFrame with 4 rows and 2 columns
>>>                      REF                ALT
>>>           <DNAStringSet> <DNAStringSetList>
>>>         1              G                  A
>>>         2             AA                 TT
>>>         3              T                G,A
>>>         4              G               TT,C
>>>
>>>         Thanks.
>>>         Valerie
>>>
>>>         _______________________________________________
>>>         Bioc-devel@r-project.org mailing list
>>>         https://stat.ethz.ch/mailman/listinfo/bioc-devel
>>
>> --
>> Valerie Obenchain
>> Program in Computational Biology
>> Fred Hutchinson Cancer Research Center
>> 1100 Fairview Ave. N, M1-B155
>> P.O. Box 19024
>> Seattle, WA 98109-1024
>>
>> E-mail: voben...@fhcrc.org
>> Phone: (206) 667-3158
>> Fax: (206) 667-1319