[ 
https://issues.apache.org/jira/browse/NIFI-3189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15949745#comment-15949745
 ] 

ASF subversion and git services commented on NIFI-3189:
-------------------------------------------------------

Commit fd92999dafc040940011c87bb2ee2c8edf5f96a2 in nifi's branch 
refs/heads/master from [~ijokarumawak]
[ https://git-wip-us.apache.org/repos/asf?p=nifi.git;h=fd92999 ]

NIFI-3189: ConsumeKafka 0.9 and 0.10 with downstream backpressure

Currently, NiFi Kafka consumer processors have following issue.

While downstream connections are full, ConsumeKafka is not scheduled to run 
onTrigger.
It stopps executing poll to tell Kafka server that this client is alive.
Thus, after a while in that situation, Kafka server rebalances the client.
When downstream connections back to normal, although ConsumeKafka is scheduled 
again,
the client is no longer a part of a consumer group.

If this happens, Kafka client succeeds polling messages when ConsumeKafka 
processor resumes, but fails to commit offset.
Received messages are already committed into NiFi flow, but since consumer 
offset is not updated, those will be consumed again, duplicated.

In order to address above issue:

- For ConsumeKafka_0_10, use latest client library

    Above issue has been addressed by KIP-62.
    The latest Kafka consumer poll checks if the client instance is still 
valid, and rejoin the group if not, before consuming messages.

- For ConsumeKafka (0.9), added manual retention logic using pause/resume

    Kafka client 0.9 doesn't have background thread heartbeat, so similar 
machanism is added manually.
    Use Kafka pause/resume consumer API to tell Kafka server that the client 
stops consuming messages but is still alive.
    Another internal thread is used to perform paused poll periodically based 
on the time passed since the last onTrigger(poll) is executed.

This closes #1527.

Signed-off-by: Bryan Bende <[email protected]>


> ConsumeKafka 0.9 and 0.10 can cause consumer rebalance when backpressure is 
> engaged
> -----------------------------------------------------------------------------------
>
>                 Key: NIFI-3189
>                 URL: https://issues.apache.org/jira/browse/NIFI-3189
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Joseph Witt
>            Assignee: Koji Kawamura
>             Fix For: 1.2.0
>
>
> ConsumeKafka processors can alert to rebalance issues when backpressure is 
> engaged on the output connection and is then freed up.  This is because we're 
> not doing anything with those consumers for a period of time and the kafka 
> client detects this and initiates a rebalance.  We should ensure that even 
> when we cannot send more data due to back pressure that we at least have some 
> sort of keep alive behavior with the kafka client.  Or, if that isn't an 
> option we should at least document the situation.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

Reply via email to