[
https://issues.apache.org/jira/browse/SPARK-23017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319462#comment-16319462
]
Bhaskar E commented on SPARK-23017:
-----------------------------------
[~srowen]: How do I start a mailing list?
> Why would spark-kafka stream fail stating `Got wrong record for <groupid>
> <topic> <partition> even after seeking to offset #` when using kafka API to
> commit offset
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-23017
> URL: https://issues.apache.org/jira/browse/SPARK-23017
> Project: Spark
> Issue Type: Question
> Components: Structured Streaming
> Affects Versions: 2.2.1
> Reporter: Bhaskar E
>
> My Spark-Kafka streaming job started failing after multiple messages, stating
> - `Got wrong record for <groupid> <topic> <partition> even after seeking to
> offset #`.
> I disabled `enable.auto.commit` and am saving the commits (to Kafka itself)
> manually using the Kafka API:
> {code}((CanCommitOffsets)
> messages.inputDStream()).commitAsync(offsetRanges.get());{code}
> When I manually commit offsets to Kafka and my job resumes requesting data
> from Kafka (say, after 1 hr spent recovering from some failure), Kafka
> should send the next available offsets (from the last committed offset).
> So, when I'm using Kafka itself to store my committed offsets, my Spark
> job clearly doesn't know what the next offset to request is. But here the
> error message states that it `Got wrong record .... even after seeking to
> a particular offset #`. *So, how is this possible?*
> If I assume that the Spark driver gets some offsets ahead of time from Kafka
> (before initially reading the actual records) and then starts requesting
> those offsets, it is still confusing: how could Spark receive a wrong offset
> when it is requesting offsets that it got from Kafka itself in the first
> place?
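
The wording of the error suggests a check that the record returned after a seek sits at exactly the requested offset. The following is a minimal sketch of such a check, not Spark's or Kafka's actual code. The class `OffsetCheckSketch`, the method `fetchAt`, and the in-memory `LOG` map are hypothetical stand-ins. It assumes a partition in which some offsets are absent (for example, after log compaction), so a seek to a missing offset lands on a later record.

```java
import java.util.Map;
import java.util.TreeMap;

public class OffsetCheckSketch {
    // Hypothetical stand-in for one partition: offset -> record value.
    // Offsets 3 and 4 are deliberately absent to model a gap in the log.
    public static final TreeMap<Long, String> LOG = new TreeMap<>();
    static {
        LOG.put(0L, "a");
        LOG.put(1L, "b");
        LOG.put(2L, "c");
        LOG.put(5L, "f");
    }

    // "Seek" to the requested offset and return the first record at or after it.
    // Fails if that record's offset is not exactly the one requested.
    public static String fetchAt(long offset) {
        Map.Entry<Long, String> next = LOG.ceilingEntry(offset);
        if (next == null) {
            throw new IllegalStateException("Failed to get records for offset " + offset);
        }
        if (next.getKey() != offset) {
            throw new IllegalStateException(
                "Got wrong record even after seeking to offset " + offset
                    + " (first available offset is " + next.getKey() + ")");
        }
        return next.getValue();
    }

    public static void main(String[] args) {
        // Offset 2 exists, so the check passes.
        System.out.println(fetchAt(2L));
        // Offset 3 is missing, so the seek lands on offset 5 and the check fails.
        try {
            fetchAt(3L);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Under this assumption, the requested offset can be valid, because it lies between the committed offset and the end of the partition, and the check still fails because no record exists at that exact position.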