[ 
https://issues.apache.org/jira/browse/KAFKA-4362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15644945#comment-15644945
 ] 

Mayuresh Gharat commented on KAFKA-4362:
----------------------------------------

[~jasong35] from the details that Joel has listed, I think there are 2 issues:
1) Offset commits fail when the offsets topic partition is moved. This happens 
because the old coordinator incorrectly throws an IllegalArgumentException 
when checking the message format version, when it is in fact first checking 
whether the replica is local. So the correct way here would be to return 
"NotCoordinatorForGroupException" from the server side.
2) On the client side, right now the IllegalArgumentException thrown by the 
server is bubbled up as an UnknownException, so the consumer is not able to 
handle it correctly. 
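
To illustrate point 1, here is a minimal Java sketch of the idea. It is not 
the actual broker code (which is Scala, in GroupMetadataManager). All class, 
enum, and method names below are hypothetical stand-ins. It only shows a 
missing local replica being mapped to a "not coordinator" error instead of 
an exception that surfaces as an unknown error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the broker-side offset commit path.
class OffsetCommitErrorSketch {
    // Hypothetical error codes; the real broker has its own error type.
    enum CommitError { NONE, NOT_COORDINATOR_FOR_GROUP }

    // Offsets topic partitions hosted on this broker -> message format version.
    private final Map<Integer, Byte> localFormatVersions = new HashMap<>();

    void addLocalReplica(int partition, byte formatVersion) {
        localFormatVersions.put(partition, formatVersion);
    }

    void removeLocalReplica(int partition) {
        localFormatVersions.remove(partition);
    }

    // If the replica is no longer local (e.g. after a reassignment), report
    // that this broker is not the coordinator rather than throwing
    // IllegalArgumentException.
    CommitError handleCommit(int partition) {
        Byte formatVersion = localFormatVersions.get(partition);
        if (formatVersion == null) {
            return CommitError.NOT_COORDINATOR_FOR_GROUP;
        }
        // A real broker would append the offset using formatVersion here.
        return CommitError.NONE;
    }

    public static void main(String[] args) {
        OffsetCommitErrorSketch broker = new OffsetCommitErrorSketch();
        broker.addLocalReplica(7, (byte) 1);
        System.out.println(broker.handleCommit(7));
        broker.removeLocalReplica(7); // simulate the partition being moved away
        System.out.println(broker.handleCommit(7));
    }
}
```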

I think once we return the correct (NotCoordinatorForGroupException) exception, 
the consumer should be able to handle it and proceed. 
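
As a rough sketch of what "handle it and proceed" could look like on the 
client, assuming the consumer reacts to a "not coordinator" error by looking 
up the coordinator again and retrying. This is not the real consumer code. 
The interfaces, enum, and method names are hypothetical and exist only for 
illustration.

```java
// Hypothetical stand-in for the consumer-side offset commit retry logic.
class CommitRetrySketch {
    // Hypothetical result codes for a single commit attempt.
    enum CommitResult { NONE, NOT_COORDINATOR_FOR_GROUP, UNKNOWN }

    // Hypothetical handle to a broker acting as group coordinator.
    interface Coordinator {
        CommitResult commit(long offset);
    }

    // Hypothetical coordinator discovery step.
    interface CoordinatorLookup {
        Coordinator find();
    }

    // Returns true if the commit succeeded within maxAttempts.
    static boolean commitWithRediscovery(CoordinatorLookup lookup, long offset, int maxAttempts) {
        Coordinator coordinator = lookup.find();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            CommitResult result = coordinator.commit(offset);
            if (result == CommitResult.NONE) {
                return true;
            }
            if (result == CommitResult.NOT_COORDINATOR_FOR_GROUP) {
                // Recoverable: the coordinator moved, so rediscover and retry.
                coordinator = lookup.find();
                continue;
            }
            // UNKNOWN: the client cannot tell what went wrong, so it gives up.
            return false;
        }
        return false;
    }

    public static void main(String[] args) {
        int[] lookups = {0};
        CoordinatorLookup lookup = () -> {
            lookups[0]++;
            if (lookups[0] == 1) {
                // First lookup returns the old coordinator, which no longer hosts the partition.
                return offset -> CommitResult.NOT_COORDINATOR_FOR_GROUP;
            }
            return offset -> CommitResult.NONE;
        };
        System.out.println(commitWithRediscovery(lookup, 42L, 3));
    }
}
```

With an unknown error the loop above stops immediately, which mirrors the 
failure described in the issue.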

> Consumer can fail after reassignment of the offsets topic partition
> -------------------------------------------------------------------
>
>                 Key: KAFKA-4362
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4362
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.10.1.0
>            Reporter: Joel Koshy
>            Assignee: Mayuresh Gharat
>
> When a consumer offsets topic partition reassignment completes, an offset 
> commit shows this:
> {code}
> java.lang.IllegalArgumentException: Message format version for partition 100 
> not found
>     at 
> kafka.coordinator.GroupMetadataManager$$anonfun$14.apply(GroupMetadataManager.scala:633)
>  ~[kafka_2.10.jar:?]
>     at 
> kafka.coordinator.GroupMetadataManager$$anonfun$14.apply(GroupMetadataManager.scala:633)
>  ~[kafka_2.10.jar:?]
>     at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.4.jar:?]
>     at 
> kafka.coordinator.GroupMetadataManager.kafka$coordinator$GroupMetadataManager$$getMessageFormatVersionAndTimestamp(GroupMetadataManager.scala:632)
>  ~[kafka_2.10.jar:?]
>     at 
> ...
> {code}
> The issue is that the replica has been deleted so the 
> {{GroupMetadataManager.getMessageFormatVersionAndTimestamp}} throws this 
> exception instead which propagates as an unknown error.
> Unfortunately consumers don't respond to this and will fail their offset 
> commits.
> One workaround in the above situation is to bounce the cluster - the consumer 
> will be forced to rediscover the group coordinator.
> (Incidentally, the message incorrectly prints the number of partitions 
> instead of the actual partition.)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)