ASF GitHub Bot commented on BEAM-744:

GitHub user amitsela opened a pull request:


    [BEAM-744] A runner should be able to override KafkaIO max wait prope…

    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
     - [ ] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [ ] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [ ] If this contribution is large, please file an Apache
           [Individual Contributor License 
    Add KafkaOptions for the UnboundedKafkaReader.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/amitsela/incubator-beam BEAM-744

Alternatively you can review and apply these changes as the patch at:


To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1125
commit 627f50cc510783117b0642d4f699d4b4d9b342c7
Author: Sela <ans...@paypal.com>
Date:   2016-10-18T11:36:04Z

    [BEAM-744] A runner should be able to override KafkaIO max wait properties.
    Add KafkaOptions for the UnboundedKafkaReader.


> A runner should be able to override KafkaIO max wait properties.
> ----------------------------------------------------------------
>                 Key: BEAM-744
>                 URL: https://issues.apache.org/jira/browse/BEAM-744
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-extensions
>            Reporter: Amit Sela
> KafkaIO has two "wait" properties:
> {{START_NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, 
> default: 5 seconds.
> {{NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, 
> default: 10 msec.
> [~rangadi] mentioned some of these were set to due to limitations of the 
> DirectRunner, and I can add that they are now limiting the Spark runner 
> (which reads in defined time frames, which may be smaller then the wait time 
> and so never actually read).
> This feels like defaults should be set for optimal read from Kafka, while a 
> runner may override those if it needs to.
> [~rangadi] also mentioned that this could be set in {{PipelineOptions}} which 
> may be passed when creating the reader. 

This message was sent by Atlassian JIRA

Reply via email to