shunping commented on issue #36387:
URL: https://github.com/apache/beam/issues/36387#issuecomment-3373054961

   After digging in the code, it seems that when we convert the TestStream's 
element events from/to pipeline proto, we call `CoderUtils.encodeToByteArray` 
and `CoderUtils.decodeFromByteArray`.
   
   
https://github.com/apache/beam/blob/09aa10c52f1d24846ab30e791c3e5bd544e9321f/sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/TestStreamTranslation.java#L124-L126
   
   
https://github.com/apache/beam/blob/09aa10c52f1d24846ab30e791c3e5bd544e9321f/sdks/java/core/src/main/java/org/apache/beam/sdk/util/construction/TestStreamTranslation.java#L150-L153.
   
   If called without a context, 
`CoderUtils.encodeToByteArray`/`CoderUtils.decodeFromByteArray` would use 
context outer.
   
   
https://github.com/apache/beam/blob/09aa10c52f1d24846ab30e791c3e5bd544e9321f/sdks/java/core/src/main/java/org/apache/beam/sdk/util/CoderUtils.java#L96
   
   
https://github.com/apache/beam/blob/09aa10c52f1d24846ab30e791c3e5bd544e9321f/sdks/java/core/src/main/java/org/apache/beam/sdk/util/CoderUtils.java#L116
   
    In such context, `StringUtf8Coder` will not have length prefix when 
encoding and decoding the data.
   
https://github.com/apache/beam/blob/09aa10c52f1d24846ab30e791c3e5bd544e9321f/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StringUtf8Coder.java#L76-L82
   
   
https://github.com/apache/beam/blob/09aa10c52f1d24846ab30e791c3e5bd544e9321f/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/StringUtf8Coder.java#L95-L98
   
   This works if there is not fusion break between TestStream and its following 
stages. However, for portable runner like Prism, there could be a fusion break 
after TestStream. If so, the TestStream values encoded in the pipeline proto is 
generated with outer context, while the downstream step use nested context to 
decode the buffer.
   
   
https://github.com/apache/beam/blob/09aa10c52f1d24846ab30e791c3e5bd544e9321f/sdks/java/core/src/main/java/org/apache/beam/sdk/values/WindowedValues.java#L838
   
   This leads to the stacktrace we see in the description.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to