[ 
https://issues.apache.org/jira/browse/SPARK-23017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319462#comment-16319462
 ] 

Bhaskar E commented on SPARK-23017:
-----------------------------------

[~srowen]: How do I start a mailing list?

> Why would spark-kafka stream fail stating `Got wrong record for <groupid> 
> <topic> <partition> even after seeking to offset #` when using kafka API to 
> commit offset
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23017
>                 URL: https://issues.apache.org/jira/browse/SPARK-23017
>             Project: Spark
>          Issue Type: Question
>          Components: Structured Streaming
>    Affects Versions: 2.2.1
>            Reporter: Bhaskar E
>
> My spark-kafka streaming job started failing after processing multiple
> messages, with the error `Got wrong record for <groupid> <topic> <partition>
> even after seeking to offset #`.
> I disabled `enable.auto.commit` and am saving the commits (to Kafka itself)
> manually using the Kafka API:
> {code}((CanCommitOffsets) 
> messages.inputDStream()).commitAsync(offsetRanges.get());{code}
> When I manually commit offsets to Kafka and my job resumes requesting data
> from Kafka (say, after 1 hr, recovering from some failure), Kafka should
> send the next available offsets (from the last committed offset).
> So, when I'm using Kafka itself to store my committed offsets, my Spark job
> clearly doesn't know which offset to request next. But the error message
> states that it `Got wrong record ....  even after seeking to a particular
> offset #`. *So, how is this possible?*
> Even if I assume that the Spark driver first gets some offsets from Kafka
> (before reading the actual records) and then starts requesting those
> offsets, it's confusing: how could Spark receive a wrong offset when it is
> requesting offsets that it got from Kafka itself in the first place?
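
One way the situation in the question could arise is sketched below as a toy model. It is not Spark's or Kafka's actual code, and every name in it (OffsetRangeSketch, ToyLog, readRange) is hypothetical. It assumes that a reader is handed a planned offset range [fromOffset, untilOffset) and expects every offset in that range to return a record. If some offset inside the range has no record (for example on a compacted topic, which is an assumption and not something the issue states), a seek to that offset can only land on a later record, and a strict check like the one below fails.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

public class OffsetRangeSketch {

    /** Toy partition log: offset -> value. A missing key models an offset with no record. */
    static final class ToyLog {
        private final TreeMap<Long, String> records = new TreeMap<>();

        void put(long offset, String value) {
            records.put(offset, value);
        }

        /** Models "seek then poll": first stored offset at or after the requested one, or -1 if none. */
        long firstOffsetAtOrAfter(long offset) {
            Long found = records.ceilingKey(offset);
            return found == null ? -1L : found;
        }

        String valueAt(long offset) {
            return records.get(offset);
        }
    }

    /** Reads [fromOffset, untilOffset) and insists that every offset in the range yields a record. */
    static List<String> readRange(ToyLog log, long fromOffset, long untilOffset) {
        List<String> out = new ArrayList<>();
        for (long expected = fromOffset; expected < untilOffset; expected++) {
            long got = log.firstOffsetAtOrAfter(expected);
            if (got != expected) {
                // The seek succeeded, but the record that came back is not the one planned for.
                throw new IllegalStateException(
                    "Got wrong record even after seeking to offset " + expected
                        + " (received " + got + ")");
            }
            out.add(log.valueAt(got));
        }
        return out;
    }

    public static void main(String[] args) {
        ToyLog log = new ToyLog();
        log.put(0, "a");
        log.put(1, "b");
        log.put(3, "d"); // offset 2 is deliberately absent

        // A range with no gaps reads cleanly.
        System.out.println(readRange(log, 0, 2));

        // A range that spans the gap trips the strict check at offset 2.
        try {
            readRange(log, 0, 4);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this model the range endpoints can be perfectly valid offsets obtained from the broker, yet the read still fails, because the check is applied to every offset between them and not only to the endpoints.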



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
