[ https://issues.apache.org/jira/browse/BEAM-8382?focusedWorklogId=327158&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-327158 ]
ASF GitHub Bot logged work on BEAM-8382:
----------------------------------------
Author: ASF GitHub Bot
Created on: 12/Oct/19 00:23
Start Date: 12/Oct/19 00:23
Worklog Time Spent: 10m
Work Description: jfarr commented on issue #9765: [BEAM-8382] Add polling
interval to KinesisIO.Read
URL: https://github.com/apache/beam/pull/9765#issuecomment-541262879
> Thank you for the contribution. Beam tends to minimize the number of tuning
> knobs. So, I'm wondering if it's possible to detect such throttling behaviour
> and increase the timeout automatically at runtime?
Hi @aromanenko-dev, thanks for the feedback. That makes sense, and I think it
would be possible, but I'm concerned that it would overly complicate both the
implementation and the runtime behavior. Also, an algorithm that performs well
would likely introduce more knobs of its own, and if those aren't exposed then
they simply aren't tunable. Personally, I would prefer the simplicity and
predictability of a fixed rate limit. If preserving the current default
behavior isn't a concern, what do you think about setting a reasonable default
(1 second is the AWS recommendation) but leaving the knob in for the rare case
where users need to tune it? I doubt the vast majority would ever need to
touch it.
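
To make the proposal concrete, here is a minimal sketch of a fixed polling interval with a 1 second default. It is not the code in the PR and not Beam's ShardReadersPool; the class FixedIntervalPoller, its poll() method, and the Supplier standing in for getRecords() are all hypothetical names used only for illustration.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical sketch of a fixed-interval poll loop; not Beam's actual implementation. */
final class FixedIntervalPoller {
    // Default taken from the AWS recommendation quoted in the issue (1 second).
    static final Duration DEFAULT_POLL_INTERVAL = Duration.ofSeconds(1);

    private final Duration pollInterval;

    FixedIntervalPoller() {
        this(DEFAULT_POLL_INTERVAL);
    }

    FixedIntervalPoller(Duration pollInterval) {
        if (pollInterval.isNegative()) {
            throw new IllegalArgumentException("pollInterval must not be negative");
        }
        this.pollInterval = pollInterval;
    }

    /**
     * Calls fetch up to maxPolls times and sleeps for pollInterval between
     * consecutive calls. The fetch supplier stands in for a getRecords() call.
     */
    <T> List<T> poll(Supplier<List<T>> fetch, int maxPolls) {
        List<T> records = new ArrayList<>();
        for (int i = 0; i < maxPolls; i++) {
            if (i > 0) {
                try {
                    Thread.sleep(pollInterval.toMillis());
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling early.
                    Thread.currentThread().interrupt();
                    break;
                }
            }
            records.addAll(fetch.get());
        }
        return records;
    }
}
```

With this shape the only knob is the interval itself: new FixedIntervalPoller() uses the 1 second default, and new FixedIntervalPoller(Duration.ofMillis(250)) covers the rare case where a user needs a different rate.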
Issue Time Tracking
-------------------
Worklog Id: (was: 327158)
Time Spent: 0.5h (was: 20m)
> Add polling interval to KinesisIO.Read
> --------------------------------------
>
> Key: BEAM-8382
> URL: https://issues.apache.org/jira/browse/BEAM-8382
> Project: Beam
> Issue Type: Improvement
> Components: io-java-kinesis
> Affects Versions: 2.13.0, 2.14.0, 2.15.0
> Reporter: Jonothan Farr
> Assignee: Jonothan Farr
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> With the current implementation we are observing Kinesis throttling due to
> ReadProvisionedThroughputExceeded on the order of hundreds of times per
> second, regardless of the actual Kinesis throughput. This is because the
> ShardReadersPool readLoop() method is polling getRecords() as fast as
> possible.
> From the KDS documentation:
> {quote}Each shard can support up to five read transactions per second.
> {quote}
> and
> {quote}For best results, sleep for at least 1 second (1,000 milliseconds)
> between calls to getRecords to avoid exceeding the limit on getRecords
> frequency.
> {quote}
> [https://docs.aws.amazon.com/streams/latest/dev/service-sizes-and-limits.html]
> [https://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-sdk.html]
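
As a rough worked example of the limits quoted above: five read transactions per second per shard implies a floor of 200 ms between getRecords calls for a single reader, and proportionally more when several readers poll the same shard. The sketch below assumes the budget is split evenly across readers, which is an illustrative assumption; ShardReadBudget and minPollIntervalMillis are hypothetical names, not part of Beam or the AWS SDK.

```java
/** Hypothetical helper that turns the per-shard read limit into a minimum poll interval. */
final class ShardReadBudget {
    // Per-shard limit quoted from the KDS documentation.
    static final int MAX_READS_PER_SECOND_PER_SHARD = 5;

    /**
     * Smallest interval in milliseconds between getRecords calls for each reader,
     * assuming readersPerShard readers share one shard's budget evenly.
     */
    static long minPollIntervalMillis(int readersPerShard) {
        if (readersPerShard < 1) {
            throw new IllegalArgumentException("readersPerShard must be at least 1");
        }
        long numerator = 1000L * readersPerShard;
        // Ceiling division so the result never undershoots the limit.
        return (numerator + MAX_READS_PER_SECOND_PER_SHARD - 1) / MAX_READS_PER_SECOND_PER_SHARD;
    }

    public static void main(String[] args) {
        System.out.println(minPollIntervalMillis(1)); // one reader on the shard
        System.out.println(minPollIntervalMillis(2)); // two readers sharing the shard
    }
}
```

The 1 second recommendation from the documentation is more conservative than this floor, so a 1 second interval leaves headroom for other consumers of the same shard.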