[jira] [Commented] (BEAM-744) A runner should be able to override KafkaIO max wait properties.
[ https://issues.apache.org/jira/browse/BEAM-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585260#comment-15585260 ] ASF GitHub Bot commented on BEAM-744: - GitHub user amitsela opened a pull request: https://github.com/apache/incubator-beam/pull/1125 [BEAM-744] A runner should be able to override KafkaIO max wait prope… Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[BEAM-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). --- …rties. Add KafkaOptions for the UnboundedKafkaReader. You can merge this pull request into a Git repository by running: $ git pull https://github.com/amitsela/incubator-beam BEAM-744 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-beam/pull/1125.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1125 commit 627f50cc510783117b0642d4f699d4b4d9b342c7 Author: SelaDate: 2016-10-18T11:36:04Z [BEAM-744] A runner should be able to override KafkaIO max wait properties. Add KafkaOptions for the UnboundedKafkaReader. > A runner should be able to override KafkaIO max wait properties. > > > Key: BEAM-744 > URL: https://issues.apache.org/jira/browse/BEAM-744 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Reporter: Amit Sela > > KafkaIO has two "wait" properties: > {{START_NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, > default: 5 seconds. > {{NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, > default: 10 msec. > [~rangadi] mentioned some of these were set to due to limitations of the > DirectRunner, and I can add that they are now limiting the Spark runner > (which reads in defined time frames, which may be smaller then the wait time > and so never actually read). > This feels like defaults should be set for optimal read from Kafka, while a > runner may override those if it needs to. > [~rangadi] also mentioned that this could be set in {{PipelineOptions}} which > may be passed when creating the reader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-744) A runner should be able to override KafkaIO max wait properties.
[ https://issues.apache.org/jira/browse/BEAM-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15585107#comment-15585107 ] Amit Sela commented on BEAM-744: That's Correct, I updated the JIRA, thanks! > A runner should be able to override KafkaIO max wait properties. > > > Key: BEAM-744 > URL: https://issues.apache.org/jira/browse/BEAM-744 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Reporter: Amit Sela > > KafkaIO has two "wait" properties: > {{START_NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, > default: 5 seconds. > {{NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, > default: 10 msec. > [~rangadi] mentioned some of these were set to due to limitations of the > DirectRunner, and I can add that they are now limiting the Spark runner > (which reads in defined time frames, which may be smaller then the wait time > and so never actually read). > This feels like defaults should be set for optimal read from Kafka, while a > runner may override those if it needs to. > [~rangadi] also mentioned that this could be set in {{PipelineOptions}} which > may be passed when creating the reader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (BEAM-744) A runner should be able to override KafkaIO max wait properties.
[ https://issues.apache.org/jira/browse/BEAM-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15569856#comment-15569856 ] Raghu Angadi commented on BEAM-744: --- > KAFKA_POLL_TIMEOUT - consumer poll timeout, default: 1 second. This timeout is KafkaIO internal implementation detail and should be ignored here. It does not impose any limitations on the reader (i.e. the reader can be closed before this timeout and everything is cleaned up properly). > A runner should be able to override KafkaIO max wait properties. > > > Key: BEAM-744 > URL: https://issues.apache.org/jira/browse/BEAM-744 > Project: Beam > Issue Type: Bug > Components: sdk-java-extensions >Reporter: Amit Sela > > KafkaIO has three "wait" properties: > {{KAFKA_POLL_TIMEOUT}} - consumer poll timeout, default: 1 second. > {{START_NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, > default: 5 seconds. > {{NEW_RECORDS_POLL_TIMEOUT}} - wait for new records inside {{start()}}, > default: 10 msec. > [~rangadi] mentioned some of these were set to due to limitations of the > DirectRunner, and I can add that they are now limiting the Spark runner > (which reads in defined time frames, which may be smaller then the wait time > and so never actually read). > This feels like defaults should be set for optimal read from Kafka, while a > runner may override those if it needs to. > [~rangadi] also mentioned that this could be set in {{PipelineOptions}} which > may be passed when creating the reader. -- This message was sent by Atlassian JIRA (v6.3.4#6332)