Rui Abreu created STORM-4016:
--------------------------------
Summary: Kafka spout: start using poll(Duration)
Key: STORM-4016
URL: https://issues.apache.org/jira/browse/STORM-4016
Project: Apache Storm
Issue Type: Improvement
Affects Versions: 2.6.0
Reporter: Rui Abreu
Assignee: Rui Abreu
Kafka has deprecated poll(long) in favour of poll(Duration): [KIP-266: Fix
consumer indefinite blocking
behavior|https://cwiki.apache.org/confluence/display/KAFKA/KIP-266%3A+Fix+consumer+indefinite+blocking+behavior]
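
For the spout, the change should be mostly mechanical: the millisecond timeout currently handed to {{poll(long)}} gets wrapped in a {{java.time.Duration}}. A minimal sketch, assuming the timeout is kept as a {{long}} of milliseconds. {{DurationPoller}} is a hypothetical stand-in for the {{poll(Duration)}} method of {{KafkaConsumer}} so that the snippet compiles without kafka-clients; {{PollMigrationSketch}}, {{toPollTimeout}} and {{pollOnce}} are illustrative names, not existing Storm code.

```java
import java.time.Duration;

public class PollMigrationSketch {

    /** Hypothetical stand-in for the poll(Duration) method of a Kafka consumer. */
    public interface DurationPoller<R> {
        R poll(Duration timeout);
    }

    /** Wraps a millisecond timeout in a Duration, rejecting negative values early. */
    public static Duration toPollTimeout(long pollTimeoutMs) {
        if (pollTimeoutMs < 0) {
            throw new IllegalArgumentException("poll timeout must not be negative: " + pollTimeoutMs);
        }
        return Duration.ofMillis(pollTimeoutMs);
    }

    /** Before: consumer.poll(pollTimeoutMs). After: consumer.poll(Duration). */
    public static <R> R pollOnce(DurationPoller<R> consumer, long pollTimeoutMs) {
        return consumer.poll(toPollTimeout(pollTimeoutMs));
    }

    public static void main(String[] args) {
        // Fake consumer that only reports the timeout it was handed.
        DurationPoller<String> fake = timeout -> "polled with " + timeout.toMillis() + " ms";
        System.out.println(pollOnce(fake, 200));
    }
}
```

With the real client, the call site would simply become {{consumer.poll(Duration.ofMillis(pollTimeoutMs))}}.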
The KIP also has an interesting description of the behaviour of {{poll}}:
_The pre-existing variant {{poll(long timeout)}} would block indefinitely for
metadata updates if they were needed, then it would issue a fetch and poll for
{{timeout}} ms for new records. The initial indefinite metadata block caused
applications to become stuck when the brokers became unavailable. The existence
of the timeout parameter made the indefinite block especially unintuitive._
_We will add a new method {{poll(Duration timeout)}} with the semantics:_
# _iff a metadata update is needed:_
## _send (asynchronous) metadata requests_
## _poll for metadata responses (counts against timeout)_
*** _if no response within timeout, return an empty collection immediately_
# _if there is fetch data available, return it immediately_
# _if there is no fetch request in flight, send fetch requests_
# _poll for fetch responses (counts against timeout)_
** _if no response within timeout, return an empty collection (leaving async
fetch request for the next poll)_
** _if we get a response, return the response_
_We will deprecate the original method, {{poll(long timeout)}}, and we will
not change its semantics, so it remains:_
# _iff a metadata update is needed:_
## _send (asynchronous) metadata requests_
## _poll for metadata responses indefinitely until we get it_
# _if there is fetch data available, return it immediately_
# _if there is no fetch request in flight, send fetch requests_
# _poll for fetch responses (counts against timeout)_
** _if no response within timeout, return an empty collection (leaving async
fetch request for the next poll)_
** _if we get a response, return the response_
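
The difference between the two step lists comes down to whether the metadata wait is charged against the timeout. The following is a toy model of that accounting only, not Kafka code: all names are hypothetical, latencies are plain numbers rather than real network waits, a metadata update is assumed to be needed, and an empty {{OptionalLong}} stands for "the brokers never answer".

```java
import java.util.OptionalLong;

public class PollSemanticsModel {

    /**
     * Models poll(Duration): the metadata wait counts against the timeout.
     * Returns the simulated number of milliseconds the call blocks.
     */
    public static long blockedMsNewPoll(OptionalLong metadataLatencyMs, long fetchLatencyMs, long timeoutMs) {
        if (!metadataLatencyMs.isPresent() || metadataLatencyMs.getAsLong() >= timeoutMs) {
            return timeoutMs; // no metadata within the timeout: empty result
        }
        long remainingMs = timeoutMs - metadataLatencyMs.getAsLong();
        // The fetch wait may only use what is left of the timeout.
        return metadataLatencyMs.getAsLong() + Math.min(fetchLatencyMs, remainingMs);
    }

    /**
     * Models poll(long): the metadata wait is unbounded, only the fetch wait is capped.
     * An empty result means the call never returns.
     */
    public static OptionalLong blockedMsOldPoll(OptionalLong metadataLatencyMs, long fetchLatencyMs, long timeoutMs) {
        if (!metadataLatencyMs.isPresent()) {
            return OptionalLong.empty(); // blocks indefinitely on metadata
        }
        return OptionalLong.of(metadataLatencyMs.getAsLong() + Math.min(fetchLatencyMs, timeoutMs));
    }

    public static void main(String[] args) {
        OptionalLong brokersDown = OptionalLong.empty();
        System.out.println("new poll, brokers down: " + blockedMsNewPoll(brokersDown, 50, 200));
        System.out.println("old poll, brokers down: " + blockedMsOldPoll(brokersDown, 50, 200));
        System.out.println("new poll, slow metadata: " + blockedMsNewPoll(OptionalLong.of(500), 50, 200));
        System.out.println("old poll, slow metadata: " + blockedMsOldPoll(OptionalLong.of(500), 50, 200));
    }
}
```

In this model the new variant never blocks longer than the timeout, while the old one can exceed it or never return, which is the stuck-spout case this issue is meant to avoid.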
_One notable usage is prohibited by the new {{poll}}: previously, you could
call {{poll(0)}} to block for metadata updates, for example to initialize the
client, supposedly without fetching records. Note, though, that this behavior
is not according to any contract, and there is no guarantee that {{poll(0)}}
won't return records the first time it's called. Therefore, it has always been
unsafe to ignore the response._
[https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=75974886]
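
The last quoted paragraph matters for any code that issues a "warm-up" poll: whatever the poll returns has to be kept, because it may already contain records. A rough sketch of that rule, assuming records are plain strings and using a {{Function<Duration, List<String>>}} as a hypothetical stand-in for {{consumer::poll}}; {{InitialPollSketch}} and its methods are illustrative names only.

```java
import java.time.Duration;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

public class InitialPollSketch {

    /** Records returned by any poll that have not been processed yet. */
    private final Queue<String> pending = new ArrayDeque<>();

    /** Hypothetical stand-in for consumer::poll. */
    private final Function<Duration, List<String>> consumer;

    public InitialPollSketch(Function<Duration, List<String>> consumer) {
        this.consumer = consumer;
    }

    /** Warm-up poll: the result is buffered, never thrown away. */
    public void warmUp() {
        pending.addAll(consumer.apply(Duration.ZERO));
    }

    public int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        // Fake consumer that already has two records on its first poll.
        InitialPollSketch sketch = new InitialPollSketch(timeout -> Arrays.asList("r1", "r2"));
        sketch.warmUp();
        System.out.println("buffered records: " + sketch.pendingCount());
    }
}
```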
--
This message was sent by Atlassian Jira
(v8.20.10#820010)