GitHub user mhsaul opened a pull request:
https://github.com/apache/beam/pull/3979
[BEAM-2774] Add I/O source to read VCF files
Adds an I/O transform, `ReadFromVcf`, that reads VCF files into a `PCollection`
of `Variant` objects. Also modifies `_TextSource` so that it can optionally
process file headers, as needed for VCF files.
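
To make the header handling concrete, here is a minimal, standalone sketch in plain Python of what separating VCF header lines from data records and turning the records into variant objects can look like. It is not the code in this pull request and does not use Beam. The `Variant` tuple, its field names, and the helpers `split_header` and `parse_record` are all hypothetical names chosen for illustration, and the 0-based `start` is an assumption.

```python
from collections import namedtuple

# Hypothetical record type for illustration only; the Variant class in the
# pull request may carry different fields.
Variant = namedtuple("Variant", ["chrom", "start", "ref", "alts"])


def split_header(lines):
    """Separate the leading '#' header lines from the data records."""
    header, records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith("#") and not records:
            header.append(line)
        else:
            records.append(line)
    return header, records


def parse_record(line):
    """Parse one tab-separated VCF data line into a Variant."""
    chrom, pos, _id, ref, alt = line.split("\t")[:5]
    # "." in the ALT column means no alternate allele was called.
    alts = [] if alt == "." else alt.split(",")
    # VCF positions are 1-based; a 0-based start is assumed here.
    return Variant(chrom, int(pos) - 1, ref, alts)


sample = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "20\t14370\trs6054257\tG\tA\t29\tPASS\tDP=14",
    "20\t17330\t.\tT\tA,C\t3\tq10\tDP=11",
]
header, records = split_header(sample)
variants = [parse_record(r) for r in records]
```

In a real source the header would be read once per file and made available to the record parser, which is roughly the role the optional header support in `_TextSource` plays here.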
Design Doc:
https://docs.google.com/document/d/1jsdxOPALYYlhnww2NLURS8NKXaFyRSJrcGbEDpY9Lkw/edit
CC: @arostamianfar @chamikaramj @aaltay
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mhsaul/beam miles_saul--vsf-io-source
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/beam/pull/3979.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3979
----
commit 3f699fcd286c8509cfc404d7c2bec35fd6342347
Author: Miles Saul <[email protected]>
Date: 2017-10-11T19:00:03Z
Added vcf file io source and modified _TextSource to optionally handle
headers
----