Re: [Bioc-devel] asVCF error coming from normalizeSingleBracketSubscript

2013-11-25 Thread Valerie Obenchain

Hi Stephanie,

The error is thrown from SeqArray:::.info at line 216 in the file and is 
related to the handling of NA values.


  x[x == ""] <- NA

Output from the == comparison can contain NAs and therefore can't be 
used (consistently) in subsetting operations.


'x' is a NumericList.

Browse[2]> x
NumericList of length 5
[[1]] 0.5
[[2]] 0.01700923872
[[3]] 0.33304291534 0.66695708466
[[4]] NA
[[5]] NA NA

Here we see NAs returned for the NA values,

Browse[2]> x == ""
LogicalList of length 5
[[1]] FALSE
[[2]] FALSE
[[3]] FALSE FALSE
[[4]] NA
[[5]] NA NA

which fail on subsetting.

Browse[2]> x[x == ""]
Error in normalizeSingleBracketSubscript(i, x) : subscript contains NAs
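
The NA propagation in the comparison can also be seen with a plain base R vector, outside of IRanges. This is only a minimal sketch: 'v' is a made-up vector, not data from the VCF files, and base R is more forgiving about NA subscripts than the List classes, so it illustrates the NA in the comparison result rather than the subsetting error itself.

```r
# Hypothetical stand-in for the values held in the NumericList
v <- c(0.5, NA, 0.25)

# Comparing against the empty string coerces v to character;
# the NA element stays NA instead of becoming FALSE
cmp <- v == ""
print(cmp)

# The logical subscript built this way therefore contains an NA
print(anyNA(cmp))
```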

One solution is to use %in%, which does not return NAs.

Browse[2]> x %in% ""
LogicalList of length 5
[[1]] FALSE
[[2]] FALSE
[[3]] FALSE FALSE
[[4]] FALSE
[[5]] FALSE FALSE
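
As a rough illustration of why %in% is safe as a subscript, here is a base R sketch on an ordinary vector. The vector 'v' and the names 'mask' and 'safe' are made up for the example, and this is not the actual SeqArray code. It also shows one alternative, masking out the NAs from the == result by hand, in case %in% is not a good fit.

```r
# Hypothetical stand-in for the values held in the NumericList
v <- c(0.5, NA, 0.25)

# %in% is built on match(), so a missing value is reported as
# "not found" (FALSE) rather than NA
mask <- v %in% ""
print(mask)

# Alternative: keep ==, but turn the NA positions into FALSE explicitly
cmp  <- v == ""
safe <- !is.na(cmp) & cmp
print(safe)

# Either mask is NA-free and can be used for replacement
v[mask] <- NA
print(v)
```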


Valerie


On 11/22/2013 03:11 PM, Stephanie M. Gogarten wrote:

Hi Valerie,

The asVCF method in SeqArray is failing as of today with a (to me)
mysterious error.  I get it for the test files chr22.vcf.gz, ex2.vcf,
and gl_chr1.vcf in extdata of VariantAnnotation, but not for
SeqArray/extdata/CEU_Exon.vcf.  Do you have any suggestions of where I
might look to figure out where this error is coming from?

thanks,
Stephanie

> vcffile <- system.file("extdata", "ex2.vcf",
+                        package="VariantAnnotation")
> gdsfile <- tempfile()
> seqVCF2GDS(vcffile, gdsfile)
> gdsobj <- seqOpen(gdsfile)
> options(error=recover)
> vcfg <- asVCF(gdsobj)
Error in normalizeSingleBracketSubscript(i, x) : subscript contains NAs

Enter a frame number, or 0 to exit

  1: asVCF(gdsobj)
  2: asVCF(gdsobj)
  3: .local(x, ...)
  4: VCF(rowData = .rowData(x), colData = .colData(x), exptData = SimpleList(hea
  5: .info(x, info)
  6: `[<-`(`*tmp*`, x == "", value = NA)
  7: `[<-`(`*tmp*`, x == "", value = NA)
  8: lsubset_List_by_List(x, i, value)
  9: .fast_lsubset_List_by_List(x, i, value)
10: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
11: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
12: extractROWS(setNames(seq_along(x), names(x)), i)
13: extractROWS(setNames(seq_along(x), names(x)), i)
14: normalizeSingleBracketSubscript(i, x)

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] VariantAnnotation_1.8.6 Rsamtools_1.14.1        Biostrings_2.30.1
[4] GenomicRanges_1.14.3    XVector_0.2.0           IRanges_1.20.6
[7] BiocGenerics_0.8.0      SeqArray_1.2.0          gdsfmt_1.0.0

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.24.0   Biobase_2.22.0         biomaRt_2.18.0
 [4] bitops_1.0-6           BSgenome_1.30.0        DBI_0.2-7
 [7] GenomicFeatures_1.14.2 RCurl_1.95-4.1         RSQLite_0.11.4
[10] rtracklayer_1.22.0     stats4_3.0.2           tools_3.0.2
[13] XML_3.95-0.2           zlibbioc_1.8.0


___
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel


[Bioc-devel] asVCF error coming from normalizeSingleBracketSubscript

2013-11-22 Thread Stephanie M. Gogarten

Hi Valerie,

The asVCF method in SeqArray is failing as of today with a (to me) 
mysterious error.  I get it for the test files chr22.vcf.gz, ex2.vcf, 
and gl_chr1.vcf in extdata of VariantAnnotation, but not for 
SeqArray/extdata/CEU_Exon.vcf.  Do you have any suggestions of where I 
might look to figure out where this error is coming from?


thanks,
Stephanie

> vcffile <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
> gdsfile <- tempfile()
> seqVCF2GDS(vcffile, gdsfile)
> gdsobj <- seqOpen(gdsfile)
> options(error=recover)
> vcfg <- asVCF(gdsobj)
Error in normalizeSingleBracketSubscript(i, x) : subscript contains NAs

Enter a frame number, or 0 to exit

 1: asVCF(gdsobj)
 2: asVCF(gdsobj)
 3: .local(x, ...)
 4: VCF(rowData = .rowData(x), colData = .colData(x), exptData = SimpleList(hea
 5: .info(x, info)
 6: `[<-`(`*tmp*`, x == "", value = NA)
 7: `[<-`(`*tmp*`, x == "", value = NA)
 8: lsubset_List_by_List(x, i, value)
 9: .fast_lsubset_List_by_List(x, i, value)
10: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
11: replaceROWS(unlisted_x, unlisted_i, unlisted_value)
12: extractROWS(setNames(seq_along(x), names(x)), i)
13: extractROWS(setNames(seq_along(x), names(x)), i)
14: normalizeSingleBracketSubscript(i, x)

> sessionInfo()
R version 3.0.2 (2013-09-25)
Platform: x86_64-apple-darwin10.8.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] VariantAnnotation_1.8.6 Rsamtools_1.14.1        Biostrings_2.30.1
[4] GenomicRanges_1.14.3    XVector_0.2.0           IRanges_1.20.6
[7] BiocGenerics_0.8.0      SeqArray_1.2.0          gdsfmt_1.0.0

loaded via a namespace (and not attached):
 [1] AnnotationDbi_1.24.0   Biobase_2.22.0         biomaRt_2.18.0
 [4] bitops_1.0-6           BSgenome_1.30.0        DBI_0.2-7
 [7] GenomicFeatures_1.14.2 RCurl_1.95-4.1         RSQLite_0.11.4
[10] rtracklayer_1.22.0     stats4_3.0.2           tools_3.0.2
[13] XML_3.95-0.2           zlibbioc_1.8.0