[ https://issues.apache.org/jira/browse/DAFFODIL-2755?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Lawrence updated DAFFODIL-2755:
-------------------------------------
Description:
Functions in SequenceChildParseResultHelper.scala and
SeparatedSequenceChildParseResultHelper.scala call the
maybeMostRecentlyAddedChild() function, whose purpose is to get the child that
was most recently added to the infoset so that the element can be inspected to
determine things like its nil/normal/missing representation and the parse
status. This seems to be used mostly when dealing with postfix separators, but
there may be other cases, possibly requiring more specific circumstances.
However, in some cases the InfosetWalker has already walked and removed
elements from the infoset because it thinks they aren't needed anymore. These
functions do still need them, which can lead to null pointer exceptions.
Fortunately, the InfosetWalker only removes infoset elements periodically
depending on a number of factors, so it's fairly rare that this case actually
happens. But when it does it appears very random. To reliably cause the issue,
we can force the infoset walker to remove elements as soon as it thinks it's
possible by setting the InfosetWalker minSkip value to zero. After doing that,
the below schema and data reliably reproduce the issue.
Data:
{code}
|classificiation;body1;body1;body3;|
{code}
Schema:
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<schema
xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
<include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd"
/>
<annotation>
<appinfo source="http://www.ogf.org/dfdl/">
<dfdl:format xmlns="" ref="GeneralFormat" lengthKind="delimited" />
</appinfo>
</annotation>
<element name="file">
<complexType>
<sequence dfdl:initiator="|" dfdl:terminator="|" dfdl:separator=";">
<element name="first" type="xs:string" />
<sequence dfdl:separator=";" dfdl:separatorPosition="postfix">
<element name="field" type="xs:string" minOccurs="0"
maxOccurs="unbounded" />
</sequence>
</sequence>
</complexType>
</element>
</schema>
{code}
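The failure mode described above can be sketched with a small toy model. This is not Daffodil code: Node, Infoset, Walker, parse and the min_skip parameter are hypothetical names chosen only to mirror the description, and the real InfosetWalker decides when to remove elements based on more factors than a single counter.

```python
# Toy model only: these classes are hypothetical and are not Daffodil's API.

class Node:
    def __init__(self, name, value):
        self.name = name
        self.value = value


class Infoset:
    """Holds parsed children; released slots are set to None."""

    def __init__(self):
        self.children = []

    def add(self, node):
        self.children.append(node)

    def most_recently_added_child(self):
        # Similar in spirit to maybeMostRecentlyAddedChild(): return the last slot.
        return self.children[-1] if self.children else None


class Walker:
    """Releases walked nodes, but only after at least min_skip skipped steps."""

    def __init__(self, infoset, min_skip):
        self.infoset = infoset
        self.min_skip = min_skip
        self.skipped = 0
        self.next_index = 0

    def step(self):
        # Called after each parse step; does nothing until enough steps were skipped.
        if self.skipped < self.min_skip:
            self.skipped += 1
            return
        self.skipped = 0
        while self.next_index < len(self.infoset.children):
            # The walker assumes a walked node is no longer needed and frees it.
            self.infoset.children[self.next_index] = None
            self.next_index += 1


def parse(fields, min_skip):
    """Adds each field, lets the walker run, then inspects the latest child."""
    infoset = Infoset()
    walker = Walker(infoset, min_skip)
    seen = []
    for text in fields:
        infoset.add(Node("field", text))
        walker.step()
        child = infoset.most_recently_added_child()
        # Raises AttributeError if the walker already released the child.
        seen.append(child.value)
    return seen
```

In this model a large min_skip lets a short parse finish before the walker ever releases anything, while min_skip=0 makes the parser's lookup fail on the first field. That is the same reason setting the minimum skip to zero turns a rare, seemingly random failure into a reproducible one.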
was:
Functions in SequenceChildParseResultHelper.scala and
SeparatedSequenceChildParseResultHelper.scala call the
maybeMostRecentlyAddedChild() function, whose purpose is to get the child that
was most recently added to the infoset so that the element can be inspected to
determine things like its nil/normal/missing representation and the parse
status. This seems to be used mostly when dealing with postfix separators, but
there may be other cases, possibly requiring more specific circumstances.
However, in some cases the InfosetWalker has already walked and removed
elements from the infoset because it thinks they aren't needed anymore. These
functions do still need them, which can lead to null pointer exceptions.
Fortunately, the InfosetWalker only removes infoset elements periodically
depending on a number of factors, so it's fairly rare that this case actually
happens. But when it does it appears very random. To reliably cause the issue,
we can force the infoset walker to remove elements as soon as it thinks it's
possible by setting the InfosetWalker minSkip value to zero. After doing that,
the below schema and data reliably reproduce the issue.
Data:
{code}
|classificiation;body1;body1;body3;|
{code}
{code:xml}
<?xml version="1.0" encoding="UTF-8"?>
<schema
xmlns="http://www.w3.org/2001/XMLSchema"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
<include schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd"
/>
<annotation>
<appinfo source="http://www.ogf.org/dfdl/">
<dfdl:format xmlns="" ref="GeneralFormat" lengthKind="delimited" />
</appinfo>
</annotation>
<element name="file">
<complexType>
<sequence dfdl:initiator="|" dfdl:terminator="|" dfdl:separator=";">
<element name="first" type="xs:string" />
<sequence dfdl:separator=";" dfdl:separatorPosition="postfix">
<element name="field" type="xs:string" minOccurs="0"
maxOccurs="unbounded" />
</sequence>
</sequence>
</complexType>
</element>
</schema>
{code}
> infoset nodes removed by InfosetWalker that are still needed, null pointer
> exception
> -----------------------------------------------------------------------------------
>
> Key: DAFFODIL-2755
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2755
> Project: Daffodil
> Issue Type: Bug
> Components: Back End
> Affects Versions: 3.4.0
> Reporter: Steve Lawrence
> Priority: Major
>
> Functions in SequenceChildParseResultHelper.scala and
> SeparatedSequenceChildParseResultHelper.scala call the
> maybeMostRecentlyAddedChild() function, whose purpose is to get the child
> that was most recently added to the infoset so that the element can be
> inspected to determine things like its nil/normal/missing representation and
> the parse status. This seems to be used mostly when dealing with postfix
> separators, but there may be other cases, possibly requiring more specific
> circumstances.
> However, in some cases the InfosetWalker has already walked and removed
> elements from the infoset because it thinks they aren't needed anymore. These
> functions do still need them, which can lead to null pointer exceptions.
> Fortunately, the InfosetWalker only removes infoset elements periodically
> depending on a number of factors, so it's fairly rare that this case actually
> happens. But when it does it appears very random. To reliably cause the
> issue, we can force the infoset walker to remove elements as soon as it
> thinks it's possible by setting the InfosetWalker minSkip value to zero.
> After doing that, the below schema and data reliably reproduce the issue.
> Data:
> {code}
> |classificiation;body1;body1;body3;|
> {code}
> Schema:
> {code:xml}
> <?xml version="1.0" encoding="UTF-8"?>
> <schema
> xmlns="http://www.w3.org/2001/XMLSchema"
> xmlns:xs="http://www.w3.org/2001/XMLSchema"
> xmlns:dfdl="http://www.ogf.org/dfdl/dfdl-1.0/">
> <include
> schemaLocation="org/apache/daffodil/xsd/DFDLGeneralFormat.dfdl.xsd" />
> <annotation>
> <appinfo source="http://www.ogf.org/dfdl/">
> <dfdl:format xmlns="" ref="GeneralFormat" lengthKind="delimited" />
> </appinfo>
> </annotation>
> <element name="file">
> <complexType>
> <sequence dfdl:initiator="|" dfdl:terminator="|" dfdl:separator=";">
> <element name="first" type="xs:string" />
> <sequence dfdl:separator=";" dfdl:separatorPosition="postfix">
> <element name="field" type="xs:string" minOccurs="0"
> maxOccurs="unbounded" />
> </sequence>
> </sequence>
> </complexType>
> </element>
> </schema>
> {code}