[ 
https://issues.apache.org/jira/browse/KAFKA-10173?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17143075#comment-17143075
 ] 

John Roesler commented on KAFKA-10173:
--------------------------------------

Hi [~karsten.schnitter] ,

Thanks for your efforts. I'm also adding some system tests specifically for 
suppression in combination with an upgrade from 2.3.1 to 2.5.0.

Thanks for the confirmation about the context. I think logging both the 
serialization and deserialization path should provide a lot of clarity for now. 
The stacktrace above implies the record was written by the 2.5.0 application, 
so it will be interesting to see if it really prints out your serialization log 
messages.

I'll also add some "trace" level log messages to suppression in AK to help in 
future debugging efforts.

If you're able, I guess this should be sufficient for now:
 * Log the whole record during restore
 * Log the data in InMemoryTimeOrderedKeyValueBuffer#logValue, both the Key and 
BufferValue before serialization and the byte array that we return from 
BufferValue#serialize
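
A minimal sketch of the kind of logging helper this could use. RecordHexLogger and its method names are hypothetical and not part of Kafka; the sketch only assumes you have the raw key and value bytes in hand at the point where you log. Hex-encoding the bytes makes it possible to compare what was serialized with what shows up during restore, byte for byte.

```java
// Hypothetical debugging helper; not part of Apache Kafka.
final class RecordHexLogger {
    private RecordHexLogger() {}

    // Renders a byte array as lowercase hex, or "null" when there is no payload.
    static String toHex(final byte[] bytes) {
        if (bytes == null) {
            return "null";
        }
        final StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (final byte b : bytes) {
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Builds a single log line for one record; a length of -1 marks a null payload.
    static String describe(final String topic, final int partition, final long offset,
                           final byte[] key, final byte[] value) {
        return String.format(
            "topic=%s partition=%d offset=%d keyLen=%d key=%s valueLen=%d value=%s",
            topic, partition, offset,
            key == null ? -1 : key.length, toHex(key),
            value == null ? -1 : value.length, toHex(value));
    }
}
```

The string returned by describe(...) can be passed to whatever logger the application already uses, once on the serialization side and once on the restore side.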

In response to your last question, certainly! You'd know best what data this 
application is producing. If the value for two different records could be 
exactly the same, then the identical priorValue may be expected.

What doesn't seem expected to me is that there would be a priorValue at all for 
the record that was at offset zero of the input. It makes me wonder if the 
application state is corrupted somehow, but I can't wrap my head around _how_.

I'll let you know how my testing efforts progress today.

-John

> BufferUnderflowException during Kafka Streams Upgrade
> -----------------------------------------------------
>
>                 Key: KAFKA-10173
>                 URL: https://issues.apache.org/jira/browse/KAFKA-10173
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 2.5.0
>            Reporter: Karsten Schnitter
>            Assignee: John Roesler
>            Priority: Major
>              Labels: suppress
>             Fix For: 2.5.1
>
>
> I migrated a Kafka Streams application from version 2.3.1 to 2.5.0. I 
> followed the steps described in the upgrade guide and set the property 
> {{upgrade.from=2.3}}. On my dev system with just one running instance I got 
> the following exception:
> {noformat}
> stream-thread [0-StreamThread-2] Encountered the following error during processing:
> java.nio.BufferUnderflowException: null
>       at java.base/java.nio.HeapByteBuffer.get(Unknown Source)
>       at java.base/java.nio.ByteBuffer.get(Unknown Source)
>       at org.apache.kafka.streams.state.internals.BufferValue.extractValue(BufferValue.java:94)
>       at org.apache.kafka.streams.state.internals.BufferValue.deserialize(BufferValue.java:83)
>       at org.apache.kafka.streams.state.internals.InMemoryTimeOrderedKeyValueBuffer.restoreBatch(InMemoryTimeOrderedKeyValueBuffer.java:368)
>       at org.apache.kafka.streams.processor.internals.CompositeRestoreListener.restoreBatch(CompositeRestoreListener.java:89)
>       at org.apache.kafka.streams.processor.internals.StateRestorer.restore(StateRestorer.java:92)
>       at org.apache.kafka.streams.processor.internals.StoreChangelogReader.processNext(StoreChangelogReader.java:350)
>       at org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:94)
>       at org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:401)
>       at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:779)
>       at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:697)
>       at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:670)
> {noformat}
> I figured out that this problem only occurs for stores where I use the 
> suppress feature. If I rename the changelog topics during the migration, the 
> problem does not occur. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
