[ 
https://issues.apache.org/jira/browse/APEXMALHAR-2518?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081461#comment-16081461
 ] 

ASF GitHub Bot commented on APEXMALHAR-2518:
--------------------------------------------

GitHub user PramodSSImmaneni opened a pull request:

    https://github.com/apache/apex-malhar/pull/644

    APEXMALHAR-2518 Terminating operator when there is a server error in 
processing commit offsets

    

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/PramodSSImmaneni/apex-malhar APEXMALHAR-2518

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/apex-malhar/pull/644.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #644
    
----
commit df4798e3a7e048e25cc7790951d7462dab257cd3
Author: Pramod Immaneni <pra...@datatorrent.com>
Date:   2017-07-11T01:09:38Z

    APEXMALHAR-2518 Terminating operator execution when there is an error in 
commit offset processing

----


> Kafka input operator stops reading tuples when there is a UNKNOWN_MEMBER_ID 
> error during committed offset processing
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: APEXMALHAR-2518
>                 URL: https://issues.apache.org/jira/browse/APEXMALHAR-2518
>             Project: Apache Apex Malhar
>          Issue Type: Bug
>            Reporter: Pramod Immaneni
>            Assignee: Pramod Immaneni
>
> The Kafka 0.9 operator stores offsets that are completely processed and no
> longer needed (committed offsets) back in Kafka by making a Kafka API call.
> If the Kafka server responds to this call with an UNKNOWN_MEMBER_ID error,
> the Kafka consumer state changes to needing partition re-assignment, and the
> consumer returns no further messages. A couple of other errors result in the
> same state, including the one returned when a rebalance is in progress.
>
> The exact cause of this error is not known, but given the conditions
> surrounding the application, the following is the likely reason. When the
> operator temporarily stalls due to back-pressure exerted by a slow
> downstream, it eventually stalls the operator's Kafka consumer thread that
> reads messages from Kafka. The thread then makes no Kafka consumer API calls,
> so no heartbeats are sent to the Kafka server. The server can then evict the
> consumer after a timeout period, which could have caused the
> UNKNOWN_MEMBER_ID error.
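
The pull request title says the operator is terminated when a commit offset
call fails. The following is only a minimal sketch of that general pattern,
not the actual patch: it uses no Kafka classes, and the names
CommitFailureSketch, onCommitComplete and checkCommitFailure are hypothetical.
It assumes the commit result arrives on a consumer thread and is checked later
on the operator thread.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch: record a failed offset commit on one thread and
// fail fast on another, instead of silently reading no more messages.
public class CommitFailureSketch {

    // Holds the first commit failure reported by the consumer thread.
    private final AtomicReference<Exception> commitFailure = new AtomicReference<>();

    // Called when an offset commit completes (assumed to run on the consumer
    // thread). A null exception means the commit succeeded.
    public void onCommitComplete(Exception exception) {
        if (exception != null) {
            // Keep only the first failure; later ones are ignored.
            commitFailure.compareAndSet(null, exception);
        }
    }

    // Called on the operator thread in each processing cycle. Throwing here
    // stands in for terminating the operator so that it can be restarted.
    public void checkCommitFailure() {
        Exception e = commitFailure.get();
        if (e != null) {
            throw new IllegalStateException("Offset commit failed; terminating operator", e);
        }
    }

    public static void main(String[] args) {
        CommitFailureSketch operator = new CommitFailureSketch();

        operator.onCommitComplete(null);   // successful commit
        operator.checkCommitFailure();     // nothing recorded, no exception

        // Simulated server error, standing in for UNKNOWN_MEMBER_ID.
        operator.onCommitComplete(new RuntimeException("UNKNOWN_MEMBER_ID"));
        try {
            operator.checkCommitFailure();
        } catch (IllegalStateException ex) {
            System.out.println("terminating: " + ex.getCause().getMessage());
        }
    }
}
```

In a real operator the failure would presumably come from the Kafka client's
commit callback rather than a hand-made exception, and the restart would be
left to the platform; both are assumptions here.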



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
