Hi,

I can't reproduce this error. Here is a read/write example using a file from VariantAnnotation where the results are as expected.

fl <- system.file("extdata", "ex2.vcf", package="VariantAnnotation")
dest <- tempfile()
vcf1 <- readVcf(fl, "hg19")
> rownames(vcf1)
[1] "rs6054257"      "20:17330_T/A"   "rs6040355"      "20:1230237_T/."
[5] "microsat1"

writeVcf(vcf1, dest)
vcf2 <- readVcf(dest, "hg19")
> rownames(vcf2)
[1] "rs6054257"      "20:17330_T/A"   "rs6040355"      "20:1230237_T/."
[5] "microsat1"

I need a reproducible example in order to help. Is the vcf you're working with publicly available?

Valerie

On 11/27/2013 03:37 AM, Becq, Jennifer wrote:
Hi Valerie,

Thank you for cc'ing my message.

The "ID" values are removed when reading a VCF through readVcf() and re-writing 
it with writeVcf():

V = readVcf("test.vcf", "hg19")
rownames(V)
[1] "DEL:561590:0:1:0:0:0"        "BND:81424:0:1:1:1"
[3] "BND:54200:0:1:0:1"           "DEL:572151:1:1:6:4:0"
[5] "DEL:572151:1:1:6:4:1"        "DEL:572151:1:1:11:0:0"
[7] "DEL:572433:0:0:5:2:0"        "DUP:TANDEM:572544:0:0:8:0:0"
writeVcf(V, "writeTest.vcf")
V2 = readVcf("writeTest.vcf", "hg19")
rownames(V2)
[1] "chr20:14855644_C/<DEL>"
[2] "chr20:29627290_G/[chr2:114173319[G"
[3] "chr20:35365307_T/]chr1:230941520]T"
[4] 
"chr20:60520225_AACGATGAGGAGCATCGCGGCTGTCTGCACCATGGGAGCCCCTTCTCACTGACAATGAGGAGCATTCAGAGTGTCTACACCGTGGCCACGCCTTCTCACCGATGCTGAGGAGCACCGAGACTGTCTGCACTGTGGCCGCCCCTTCTCACCG/A"
[5] 
"chr20:60520443_GACTGTCTGCACCGTGGCCGCCCCTTCTCACTGACGATGAGGAGCACTGCGACTGTCTGCACCGTGGCCGCCCTTTCTGACTGATGATAAGGAACATTGCGACTGTCTGCACCGTGGCTGCCCCTTCTCACCAACGCTGAGGAGCACTGCAACCATCTGCACCGTGGCCGCCCCTTCTCACCGATGATGAGGAACATTGAGACTGTCTGCCCCGTGGCTGCCCCTTCTCACCGATGCTGAGGAGCACTGTGACTGTCTGCACCATGGGAGCCCCTTCTCACTGACAATGAGGAGCATTCAGAGTGTCTACACCGTGGCCGCGCCTTCTCACCGATGCTGAGGAGCACCGAGACTGTCTGCACCGTGGCCGCCCCTTCTCACCGATGACGAGGAGCACTGCGA/GC"
[6] "chr20:60520937_C/<DEL>"
[7] 
"chr20:61766068_CAGAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCAGGGCCACAGGGGAGGCAGGGCCCAGAGAGGAGGCGGGGCCACAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCAGGGCCAC/CGG"
[8] "chr20:62063686_T/<DUP:TANDEM>"

sessionInfo()
R version 3.0.2 Patched (2013-10-27 r64116)
Platform: x86_64-unknown-linux-gnu (64-bit)

locale:
  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
  [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
  [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] VariantAnnotation_1.8.6 Rsamtools_1.14.2        Biostrings_2.30.1
[4] GenomicRanges_1.14.3    XVector_0.2.0           IRanges_1.20.5
[7] BiocGenerics_0.8.0

loaded via a namespace (and not attached):
  [1] AnnotationDbi_1.24.0   Biobase_2.22.0         biomaRt_2.18.0
  [4] bitops_1.0-6           BSgenome_1.30.0        DBI_0.2-7
  [7] GenomicFeatures_1.14.2 RCurl_1.95-4.1         RSQLite_0.11.4
[10] rtracklayer_1.22.0     stats4_3.0.2           tools_3.0.2
[13] XML_3.98-1.1           zlibbioc_1.8.0


*****  With the following VCF test.vcf:

##fileformat=VCFv4.1
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO
chr20   14855644        DEL:561590:0:1:0:0:0    C       <DEL>     .       PASS  
  .
chr20   29627290        BND:81424:0:1:1:1       G       [chr2:114173319[G       
.       MaxDepth        .
chr20   35365307        BND:54200:0:1:0:1       T       ]chr1:230941520]T       
.       PASS    .
chr20   60520225        DEL:572151:1:1:6:4:0    
AACGATGAGGAGCATCGCGGCTGTCTGCACCATGGGAGCCCCTTCTCACTGACAATGAGGAGCATTCAGAGTGTCTACACCGTGGCCACGCCTTCTCACCGATGCTGAGGAGCACCGAGACTGTCTGCACTGTGGCCGCCCCTTCTCACCG
 A       
.       PASS    .
chr20   60520443        DEL:572151:1:1:6:4:1    
GACTGTCTGCACCGTGGCCGCCCCTTCTCACTGACGATGAGGAGCACTGCGACTGTCTGCACCGTGGCCGCCCTTTCTGACTGATGATAAGGAACATTGCGACTGTCTGCACCGTGGCTGCCCCTTCTCACCAACGCTGAGGAGCACTGCAACCATCTGC
ACCGTGGCCGCCCCTTCTCACCGATGATGAGGAACATTGAGACTGTCTGCCCCGTGGCTGCCCCTTCTCACCGATGCTGAGGAGCACTGTGACTGTCTGCACCATGGGAGCCCCTTCTCACTGACAATGAGGAGCATTCAGAGTGTCTACACCGTGGCCGCGCCTTCTCACCGATGCTGAGGAGCACCGAGACTGTCTGCACCGTGGC
CGCCCCTTCTCACCGATGACGAGGAGCACTGCGA      GC      .       PASS    .
chr20   60520937        DEL:572151:1:1:11:0:0   C       <DEL>     .       PASS  
  .
chr20   61766068        DEL:572433:0:0:5:2:0    
CAGAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCAGGGCCACAGGGGAGGCAGGGCCCAGAGAGGAGGCGGGGCCACAGGGGAGGCGGGTCCCGGAGGGGAGGCGGGTCCCGGAGGGGAGGCAGGGCC
AC      CGG     .       PASS    .
chr20   62063686        DUP:TANDEM:572544:0:0:8:0:0     T       <DUP:TANDEM>    
  .       PASS    .

Thanks
Jennifer


Jennifer Becq
Bioinformatics Scientist
Illumina Cambridge Ltd
Tel: +44 (0) 1799 532300
email: jb...@illumina.com



-----Original Message-----
From: Valerie Obenchain [mailto:voben...@fhcrc.org]
Sent: 20 November 2013 17:28
To: Becq, Jennifer; bioc-devel@r-project.org
Subject: Re: VariantAnnotation writeVcf problem

Hi Jennifer,

I've cc'd your message to the Bioconductor mailing list. We have two lists, one 
for general questions and the other for bug reports/feature requests. Please 
post future questions to one of these lists instead of sending them to a single 
person. The lists reach a wider audience and others can chime in with their 
responses/experience. You can find info about the mailing lists here,

http://www.bioconductor.org/help/mailing-list/

writeVcf() should only write out '.' for ID if the ID is missing. There is no 
restriction on the format of the ID. Can you provide a small sample of the vcf 
file you're having trouble with (just a few lines is enough)? Also include the 
output of your sessionInfo().

Valerie


On 11/15/2013 08:56 AM, Becq, Jennifer wrote:
Hi Valerie,

I've been using VariantAnnotation for quite a while now and it's been great!

However I've just encountered a problem:

If I read in a VCF and re-write it directly, the ID column has
disappeared and becomes "." instead of the original
"DEL:9586:0:1:0:0:0", even though the rownames of my VCF object are
correctly populated with the original ID column.

  > library(VariantAnnotation)

  > in1 = readVcf("my.vcf.gz", "hg19")

  > writeVcf(in1, "test.vcf")

I was wondering if that was because ID only accepts a specific format
(rsID or chr:pos)?

Thank you for your help

Jennifer

*Jennifer Becq*

*Bioinformatics Scientist*

*Illumina Cambridge Ltd*

Tel: +44 (0) 1799 532300

email: jb...@illumina.com <mailto:jb...@illumina.com>



_______________________________________________
Bioc-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel

Reply via email to