[GitHub] beam pull request #4147: [BEAM-3209] Clarify documentation on support for re...

2017-11-17 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/4147 [BEAM-3209] Clarify documentation on support for reading from/writing to time partitioned BQ tables. Follow this checklist to help us incorporate your contribution quickly

[GitHub] beam pull request #4067: Updates Python datastore wordcount example to take ...

2017-10-31 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/4067 Updates Python datastore wordcount example to take a dataset parameter.

[GitHub] beam pull request #4064: [BEAM-1630] Adds support for processing Splittable ...

2017-10-31 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/4064 [BEAM-1630] Adds support for processing Splittable DoFns using DirectRunner. Updates DoFn invocation logic to allow invoking SDF methods. Adds SDF machinery that will be common

[GitHub] beam pull request #4025: [BEAM-3088] Improves size estimation of BigQueryTab...

2017-10-21 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/4025 [BEAM-3088] Improves size estimation of BigQueryTableSource. Updates BigQueryTableSource to consider data in streaming buffer when determining estimated size.

[GitHub] beam pull request #3998: [BEAM-3029] Sets user agent in BigTableIO.Read.getB...

2017-10-18 Thread chamikaramj
GitHub user chamikaramj closed the pull request at: https://github.com/apache/beam/pull/3998

[GitHub] beam pull request #4007: [BEAM-3065] Avoids generating proto files for Windo...

2017-10-17 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/4007 [BEAM-3065] Avoids generating proto files for Windows if grpcio-tools is not installed.

[GitHub] beam pull request #3998: [BEAM-3029] Sets user agent in BigTableIO.Read.getB...

2017-10-16 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3998 [BEAM-3029] Sets user agent in BigTableIO.Read.getBigTableService(). Cherry-picking this commit to 2.2.0 release branch.

[GitHub] beam pull request #3996: [BEAM-3029] Sets userAgent option in BigTableReadIT

2017-10-16 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3996 [BEAM-3029] Sets userAgent option in BigTableReadIT.

[GitHub] beam pull request #3962: [Beam-3028] Fixes a bug in DatastoreIO query splitt...

2017-10-08 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3962 [Beam-3028] Fixes a bug in DatastoreIO query splitting. We were returning the original query instead of the sub-queries, resulting in data duplication when reading.
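
The failure mode described in this PR can be illustrated with a toy sketch. It is not DatastoreIO code: the function names (`split_range`, `split_range_buggy`, `read_all`) and the use of integer ranges in place of Datastore queries are assumptions made purely for illustration. It only shows why handing back the original query once per split makes every record get read several times.

```python
def split_range(start, end, num_splits):
    """Split [start, end) into contiguous, non-overlapping sub-ranges."""
    # Ceiling division so the sub-ranges together cover the whole range.
    step = max(1, -(-(end - start) // num_splits))
    return [(s, min(s + step, end)) for s in range(start, end, step)]


def split_range_buggy(start, end, num_splits):
    """Hypothetical bug: returns the original range once per sub-range."""
    return [(start, end) for _ in split_range(start, end, num_splits)]


def read_all(ranges):
    """Simulate reading every record covered by each range, in order."""
    return [record for lo, hi in ranges for record in range(lo, hi)]


correct = read_all(split_range(0, 10, 3))
duplicated = read_all(split_range_buggy(0, 10, 3))
```

With the correct split each record is read once; with the buggy split each record is read once per sub-range.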

[GitHub] beam pull request #3892: [BEAM-2985] Updates WriteToBigQuery PTransform to g...

2017-09-22 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3892 [BEAM-2985] Updates WriteToBigQuery PTransform to get project id from GoogleCloudOptions when using DirectRunner. WriteToBigQuery PTransform behaves differently for DirectRunner

[GitHub] beam pull request #3882: [BEAM-1630] Adds API for defining Splittable DoFns ...

2017-09-21 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3882 [BEAM-1630] Adds API for defining Splittable DoFns using Python SDK. See https://s.apache.org/splittable-do-fn-python-sdk for the design. This PR and the above doc were updated

[GitHub] beam pull request #3820: [BEAM-2545] Updates bigtable.version to 1.0.0-pre3.

2017-09-08 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3820 [BEAM-2545] Updates bigtable.version to 1.0.0-pre3. Performs a slight update to BigtableServiceImpl to comply with the new version.

[GitHub] beam pull request #3731: Fixes a pydocs validation failure due to a recent c...

2017-08-17 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3731 Fixes a pydocs validation failure due to a recent commit.

[GitHub] beam pull request #3715: [BEAM-2711] Updates ByteKeyRangeTracker so that get...

2017-08-10 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3715 [BEAM-2711] Updates ByteKeyRangeTracker so that getFractionConsumed() does not fail for completed trackers. After this update: * getFractionConsumed() returns 1.0 after markDone() is called
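
The behavior described above can be sketched with a toy tracker. This is not the Beam ByteKeyRangeTracker (which is a Java class operating on byte keys); `ToyRangeTracker` and its method names are hypothetical, and integer positions are assumed for simplicity. The point is only that a completed tracker reports a fraction of 1.0 instead of failing.

```python
class ToyRangeTracker:
    """Hypothetical tracker over integer positions in [start, stop)."""

    def __init__(self, start, stop):
        if stop <= start:
            raise ValueError("stop must be greater than start")
        self._start = start
        self._stop = stop
        self._last_claimed = None
        self._done = False

    def try_claim(self, position):
        """Record a position if it lies in range and the tracker is not done."""
        if self._done or not (self._start <= position < self._stop):
            return False
        self._last_claimed = position
        return True

    def mark_done(self):
        """Mark the whole range as consumed."""
        self._done = True

    def fraction_consumed(self):
        """Return progress in [0.0, 1.0]; a completed tracker reports 1.0."""
        if self._done:
            return 1.0
        if self._last_claimed is None:
            return 0.0
        return (self._last_claimed - self._start) / (self._stop - self._start)


tracker = ToyRangeTracker(0, 10)
tracker.try_claim(5)
halfway = tracker.fraction_consumed()
tracker.mark_done()
finished = tracker.fraction_consumed()
```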

[GitHub] beam pull request #3701: Updates BEAM_CONTAINER_VERSION to 2.2.0.

2017-08-08 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3701 Updates BEAM_CONTAINER_VERSION to 2.2.0.

[GitHub] beam pull request #3681: [BEAM-2708] Adds support for reading concatenated b...

2017-08-03 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3681 [BEAM-2708] Adds support for reading concatenated bzip2 files. Cherry-picking into 2.1.0 release branch. Corresponding fix for Java SDK was already cherry-picked into 2.1.0 branch. I

[GitHub] beam pull request #3678: [BEAM-2708] Adds support for reading concatenated b...

2017-08-03 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3678 [BEAM-2708] Adds support for reading concatenated bzip2 files. Adds tests for concatenated gzip and bzip2 files. Removes test 'test_model_textio_gzip_concatenated' in 'snippets_test.py

[GitHub] beam pull request #3668: [BEAM-2141] Updates jenkins job for JDBCIOIT

2017-07-31 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3668 [BEAM-2141] Updates jenkins job for JDBCIOIT. This is a slightly updated version of Stephen Sisk's https://github.com/apache/beam/pull/3604.

[GitHub] beam pull request #3661: [BEAM-2643] Adds two new Read PTransforms that can ...

2017-07-28 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3661 [BEAM-2643] Adds two new Read PTransforms that can be used to read a massive number of files. textio.ReadAllFromText is for reading a PCollection of text files/file patterns

[GitHub] beam pull request #3414: [BEAM-2494] Remove GroupedShuffleRangeTracker which...

2017-06-21 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3414 [BEAM-2494] Remove GroupedShuffleRangeTracker which is unused in the SDK Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make

[GitHub] beam pull request #3333: [BEAM-1630] Adds ability to dynamically replace PTr...

2017-06-08 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/ [BEAM-1630] Adds ability to dynamically replace PTransforms during runtime. Adds two new interfaces, PTransformMatcher and PTransformOverride. Currently only supports replacements where

[GitHub] beam-site pull request #253: [BEAM-3240] Improves development and testing in...

2017-05-25 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam-site/pull/253 [BEAM-3240] Improves development and testing instructions related to Python SDK. Updates contribution guide to include development and testing instructions for Python SDK. You can merge

[GitHub] beam pull request #3089: [BEAM-1340] Adds __all__ tags to classes in package...

2017-05-11 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3089 [BEAM-1340] Adds __all__ tags to classes in package apache_beam/io.

[GitHub] beam pull request #3074: [BEAM-1340] Adds __all__ tags to modules in package...

2017-05-11 Thread chamikaramj
GitHub user chamikaramj closed the pull request at: https://github.com/apache/beam/pull/3074 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature

[GitHub] beam pull request #3074: [BEAM-1340] Adds __all__ tags to classes in package...

2017-05-10 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3074 [BEAM-1340] Adds __all__ tags to classes in package apache_beam/io.

[GitHub] beam pull request #3041: [BEAM-2241] Renames some python classes and functio...

2017-05-10 Thread chamikaramj
GitHub user chamikaramj closed the pull request at: https://github.com/apache/beam/pull/3041

[GitHub] beam pull request #3041: [BEAM-2241] Renames some python classes and functio...

2017-05-10 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3041 [BEAM-2241] Renames some python classes and functions that were unnecessarily public. Adds a note to documentation of classes that are public but should only be used internally by the SDK (non

[GitHub] beam pull request #3036: [BEAM-2241] Renames some python classes and functio...

2017-05-09 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/3036 [BEAM-2241] Renames some python classes and functions that were unnecessarily public. Adds a note to documentation of classes that are public but should only be used internally by the SDK (non

[GitHub] beam pull request #2770: [BEAM-539] Fixes several issues of FileSink

2017-04-28 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/2770 [BEAM-539] Fixes several issues of FileSink. (1) Updates FileSink to fail for file name prefixes that only contain a single component (for example GCS buckets). For example, currently

[GitHub] beam pull request #2536: [BEAM-1179] Renames assertions of source_test_utils

2017-04-13 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/2536 [BEAM-1179] Renames assertions of source_test_utils. Renames assertions of source_test_utils from camelcase to underscore-separated.

[GitHub] beam pull request #2519: [BEAM-1925] Updates DoFn invocation logic to be mor...

2017-04-12 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/2519 [BEAM-1925] Updates DoFn invocation logic to be more extensible. Adds the following abstractions. DoFnSignature: describes the signature of a given DoFn object. DoFnInvoker: defines

[GitHub] beam pull request #2289: [BEAM-1782] Updates BigQuery read transform to corr...

2017-04-03 Thread chamikaramj
GitHub user chamikaramj closed the pull request at: https://github.com/apache/beam/pull/2289

[GitHub] beam pull request #2289: [BEAM-1782] Updates BigQuery read transform to corr...

2017-03-22 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/2289 [BEAM-1782] Updates BigQuery read transform to correctly process empty repeated fields. This fixes DirectRunner. DataflowRunner is already processing these fields correctly.

[GitHub] beam-site pull request #186: Add chamikara as a committer

2017-03-17 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam-site/pull/186 Add chamikara as a committer You can merge this pull request into a Git repository by running: $ git pull https://github.com/chamikaramj/beam-site website_add_to_team Alternatively

[GitHub] beam pull request #1978: [BEAM-1463] Updates BigQuery read transform to hand...

2017-02-10 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/1978 [BEAM-1463] Updates BigQuery read transform to handle 'null' fields properly for DirectRunner. Updates BigQuery read transform so that DirectRunner handles 'null' fields properly

[GitHub] beam pull request #1932: [BEAM-1406] Removes deprecated fileio.TextFileSink

2017-02-06 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/1932 [BEAM-1406] Removes deprecated fileio.TextFileSink. Users should use the textio.WriteToText() transform instead of fileio.TextFileSink.

[GitHub] beam pull request #1916: [BEAM-1388] Updates default values used by retry de...

2017-02-03 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/1916 [BEAM-1388] Updates default values used by retry decorator. Updates the following defaults so that the default total wait time is more practical. num_retries from 16 to 7. max_delay_secs
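
To see why the retry count dominates the total wait time, here is a minimal sketch of capped exponential backoff. Only the change of num_retries from 16 to 7 comes from the PR description; the function `total_wait_secs`, the initial delay, the growth factor, and the delay cap are hypothetical values chosen for illustration and are not Beam's actual defaults. Jitter is ignored.

```python
def total_wait_secs(num_retries, initial_delay_secs, factor, max_delay_secs):
    """Sum the sleep time across all retries, with each delay capped."""
    total = 0.0
    delay = initial_delay_secs
    for _ in range(num_retries):
        total += min(delay, max_delay_secs)
        delay *= factor  # exponential growth between attempts
    return total


# Hypothetical parameters: 5 s initial delay, doubling, capped at 1 hour.
before = total_wait_secs(16, 5.0, 2.0, 3600.0)
after = total_wait_secs(7, 5.0, 2.0, 3600.0)
```

Under these assumed parameters, 7 retries wait for minutes in total while 16 retries wait for hours, which is the kind of gap the PR describes as impractical.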

[GitHub] beam pull request #1866: [BEAM-1338] Moves ThreadPool creation to a util fun...

2017-01-30 Thread chamikaramj
GitHub user chamikaramj closed the pull request at: https://github.com/apache/beam/pull/1866

[GitHub] beam pull request #1820: [BEAM-1299] Removes Dataflow native text source and...

2017-01-24 Thread chamikaramj
GitHub user chamikaramj closed the pull request at: https://github.com/apache/beam/pull/1820

[GitHub] beam pull request #1820: [BEAM-1299] Removes Dataflow native text source and...

2017-01-23 Thread chamikaramj
GitHub user chamikaramj reopened a pull request: https://github.com/apache/beam/pull/1820 [BEAM-1299] Removes Dataflow native text source and sink from Beam Python SDK. Users should use the Beam text source and sink available in module 'textio.py' instead. Also

[GitHub] beam pull request #1820: [BEAM-1299] Removes Dataflow native text source and...

2017-01-23 Thread chamikaramj
GitHub user chamikaramj closed the pull request at: https://github.com/apache/beam/pull/1820

[GitHub] beam pull request #1820: [BEAM-1299] Removes Dataflow native text source and...

2017-01-23 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/1820 [BEAM-1299] Removes Dataflow native text source and sink from Beam Python SDK. Users should use the Beam text source and sink available in module 'textio.py' instead. Also

[GitHub] beam pull request #1818: [BEAM-1298] Increments major used by Dataflow runne...

2017-01-23 Thread chamikaramj
GitHub user chamikaramj opened a pull request: https://github.com/apache/beam/pull/1818 [BEAM-1298] Increments major used by Dataflow runner to 5

[GitHub] beam pull request #1728: [BEAM-1239] Updates Python SDK examples to use Beam...

2017-01-03 Thread chamikaramj
GitHub user chamikaramj closed the pull request at: https://github.com/apache/beam/pull/1728