[
https://issues.apache.org/jira/browse/NIFI-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949745#comment-15949745
]
ASF subversion and git services commented on NIFI-3189:
-------------------------------------------------------
Commit fd92999dafc040940011c87bb2ee2c8edf5f96a2 in nifi's branch
refs/heads/master from [~ijokarumawak]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=fd92999 ]
NIFI-3189: ConsumeKafka 0.9 and 0.10 with downstream backpressure
Currently, NiFi Kafka consumer processors have following issue.
While downstream connections are full, ConsumeKafka is not scheduled to run
onTrigger.
It stopps executing poll to tell Kafka server that this client is alive.
Thus, after a while in that situation, Kafka server rebalances the client.
When downstream connections back to normal, although ConsumeKafka is scheduled
again,
the client is no longer a part of a consumer group.
If this happens, Kafka client succeeds polling messages when ConsumeKafka
processor resumes, but fails to commit offset.
Received messages are already committed into NiFi flow, but since consumer
offset is not updated, those will be consumed again, duplicated.
In order to address above issue:
- For ConsumeKafka_0_10, use latest client library
Above issue has been addressed by KIP-62.
The latest Kafka consumer poll checks if the client instance is still
valid, and rejoin the group if not, before consuming messages.
- For ConsumeKafka (0.9), added manual retention logic using pause/resume
Kafka client 0.9 doesn't have background thread heartbeat, so similar
machanism is added manually.
Use Kafka pause/resume consumer API to tell Kafka server that the client
stops consuming messages but is still alive.
Another internal thread is used to perform paused poll periodically based
on the time passed since the last onTrigger(poll) is executed.
This closes #1527.
Signed-off-by: Bryan Bende <[email protected]>
> ConsumeKafka 0.9 and 0.10 can cause consumer rebalance when backpressure is
> engaged
> -----------------------------------------------------------------------------------
>
> Key: NIFI-3189
> URL: https://issues.apache.org/jira/browse/NIFI-3189
> Project: Apache NiFi
> Issue Type: Bug
> Affects Versions: 1.1.0
> Reporter: Joseph Witt
> Assignee: Koji Kawamura
> Fix For: 1.2.0
>
>
> ConsumeKafka processors can alert to rebalance issues when backpressure is
> engaged on the output connection and is then freed up. This is because we're
> not doing anything with those consumers for a period of time and the kafka
> client detects this and initiates a rebalance. We should ensure that even
> when we cannot send more data due to back pressure that we at least have some
> sort of keep alive behavior with the kafka client. Or, if that isn't an
> option we should at least document the situation.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)