[
https://issues.apache.org/jira/browse/DAFFODIL-2608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456476#comment-17456476
]
Steve Lawrence commented on DAFFODIL-2608:
------------------------------------------
I've tracked down the issue, but I'm not sure yet of the best fix. Here's the
gist of the issue:
It is possible that a layer operation may suspend. When it does this, we
correctly clone the UState and create a UStateForSuspension and do all the
normal suspension stuff. When the suspension un-suspends, we call the
continuation function for the layer defined here:
[https://github.com/apache/daffodil/blob/178867d806d57bace29a3966d9fd6d6fda3fa235/daffodil-runtime1/src/main/scala/org/apache/daffodil/layers/LayerTransformer.scala#L417-L426]
We can see that a UState is passed in, which is the UStateForSuspension that
was cloned as part of the suspension stuff. However, we do not actually use
this UState in this function. Instead we call layerRuntimeInfo.setVariable(...)
and that function calls state.setVariable(...). The state variable inside the
layerRuntimeInfo is the UState from when the layer was created, which is the
main UState, not the cloned UStateForSuspension. As a result, when the
suspended layer un-suspends, it modifies the wrong state: the main UState
instead of the UStateForSuspension. Every layer therefore modifies the same
(incorrect) state, which can lead to the same variable being set multiple times.
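As a rough illustration of the mismatch described above, here is a minimal Scala sketch. ToyState, ToyRuntimeInfo, and BuggyLayer are hypothetical stand-ins, not the real UState, layerRuntimeInfo, or layer classes, and the "already set" check is an assumption added only to make the symptom visible.

```scala
import scala.collection.mutable

// Hypothetical stand-in for UState / UStateForSuspension (not the Daffodil API).
class ToyState(val name: String) {
  private val vars = mutable.Map.empty[String, Int]

  // Assumption for this sketch: setting a variable twice is an error.
  def setVariable(key: String, value: Int): Unit = {
    if (vars.contains(key))
      throw new IllegalStateException(s"$key already set in $name")
    vars(key) = value
  }

  def isSet(key: String): Boolean = vars.contains(key)
}

// Hypothetical stand-in for layerRuntimeInfo: it holds the state that
// existed when the layer was created.
class ToyRuntimeInfo(state: ToyState) {
  def setVariable(key: String, value: Int): Unit =
    state.setVariable(key, value)
}

class BuggyLayer(info: ToyRuntimeInfo) {
  // The continuation receives the cloned state but never uses it;
  // the write goes through the state captured at creation time.
  def continuation(suspended: ToyState, checksum: Int): Unit =
    info.setVariable("checksum", checksum)
}

object StaleStateDemo {
  // Returns (main state was written, clone A was written, second layer failed).
  def run(): (Boolean, Boolean, Boolean) = {
    val main = new ToyState("main")
    val layerA = new BuggyLayer(new ToyRuntimeInfo(main))
    val layerB = new BuggyLayer(new ToyRuntimeInfo(main))
    val cloneA = new ToyState("cloneA")
    val cloneB = new ToyState("cloneB")

    layerA.continuation(cloneA, 1)
    val secondFailed =
      try { layerB.continuation(cloneB, 2); false }
      catch { case _: IllegalStateException => true }

    (main.isSet("checksum"), cloneA.isSet("checksum"), secondFailed)
  }
}
```

In this toy model, both layers write to the shared main state, the clones stay untouched, and the second layer trips over the variable the first one already set.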
Some potential ways to fix this:
# When we suspend, change the state in layerRuntimeInfo to be the cloned
UStateForSuspension, rather than the main UState. I'm not sure there's a good
place to do this, since I don't think the layer has access to the cloned
UStateForSuspension until the continuation function is called.
# Modify the continuation function to not use layerRuntimeInfo at all, and
instead directly use the passed-in UState, which is the correct
UStateForSuspension. I'm not sure whether other places would need to be
modified too, or whether leaving layerRuntimeInfo with the incorrect UState is
even safe. But maybe once it suspends, it doesn't matter?
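The second option can be sketched roughly as follows. This is a toy model under stated assumptions, not a patch: SketchState and FixedLayer are hypothetical names standing in for UStateForSuspension and the layer, and the sketch does not address whether leaving a stale state reference elsewhere is safe.

```scala
import scala.collection.mutable

// Hypothetical stand-in for UState / UStateForSuspension (not the Daffodil API).
class SketchState(val name: String) {
  private val vars = mutable.Map.empty[String, Int]

  // Assumption for this sketch: setting a variable twice is an error.
  def setVariable(key: String, value: Int): Unit = {
    if (vars.contains(key))
      throw new IllegalStateException(s"$key already set in $name")
    vars(key) = value
  }

  def get(key: String): Option[Int] = vars.get(key)
}

class FixedLayer {
  // The continuation writes only through the state it is handed,
  // never through a state captured when the layer was created.
  def continuation(suspended: SketchState, checksum: Int): Unit =
    suspended.setVariable("checksum", checksum)
}

object FixedContinuationDemo {
  // Returns the checksum seen in (main state, clone A, clone B).
  def run(): (Option[Int], Option[Int], Option[Int]) = {
    val main = new SketchState("main")
    val cloneA = new SketchState("cloneA")
    val cloneB = new SketchState("cloneB")

    new FixedLayer().continuation(cloneA, 1)
    new FixedLayer().continuation(cloneB, 2)

    (main.get("checksum"), cloneA.get("checksum"), cloneB.get("checksum"))
  }
}
```

Here each suspended layer sets the variable in its own clone, so the two writes no longer collide and the main state is left alone.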
> PCAP fails with Daf 3.2.0 and IPv4 layers with checksum
> -------------------------------------------------------
>
> Key: DAFFODIL-2608
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2608
> Project: Daffodil
> Issue Type: Bug
> Components: Back End
> Affects Versions: 3.2.0
> Reporter: Mike Beckerle
> Priority: Critical
>
> Updating PCAP to use the new ethernetIP with IPv4 (and Daffodil 3.2.0) is
> failing.
> Please look at https://github.com/DFDLSchemas/PCAP/pull/15
> Or you can clone my repo: [email protected]:mbeckerle/PCAP.git
> Checkout master branch
> $ sbt
> > testOnly *.TestPCAP -- --tests=test_pcap_test_dns
> Will reproduce the error.
> That test used to be a parserTestCase with roundTrip onePass. However, I have
> temporarily converted it into an unparserTestCase with roundTrip 'none' to
> confirm that this bug occurs only when unparsing.
> I think this is a Daffodil 3.2.0 bug.
> However, I recall bugs/errors like this being discussed somewhere not too
> long ago.
>