The MarkPool currently leads to some issues that make tracking down bugs
difficult, and can also make API usage difficult. I think something
needs to change, but I'm not sure exactly what. The issue and a
potential solution is described below.

A few places in our code create marks of the PState to support
backtracking. This includes various RepParsers, the AltCompParser, and
the ElementCombinatorParser. The contract we use for this mark pool is
that any marks that a parser takes it must also return with a call to
either discard() or reset(). At the end of a parse, we check that all
marks have been returned, and if they have not we throw an
invariantFailed(). This is considered a bug so an invariant does not
seem unreasonable.

I believe our logic for when to return marks is correct, but it cases
where unexpected exceptions could be thrown (e.g. OutOfMemoryError),
marks are not returned. These are generally fatal exception, and so I
think that's the reason we chose to not care about them.

However, a problem arises where an API user might want to catch what we
see as a fatal exception and try recover. For example, NiFi will catch
many exceptions, yield for some amount of time, and then retry
processing. So in the Daffodil NiFi processor, when we throw our
invariantFailed exception, NiFi catches it, yields, and then retries.
But since the mark pool is a thread local, it still contains marks that
will never be returned, and so the same invariantFailed is thrown at the
end of the next parse. NiFi will again yeild and repeat, and every
subsequent parse will fail.

Another issue with the way this all works is that when this invariant
failed is thrown we have no idea what caused the mark to not be
returned. It may have been a fatal exception, it may have been something
else like a bug (like a ProcessingError or something that we just failed
to catch). It's very difficult to know the cause.

So I think two things we need to resolve is:

1) Have some method to reset the state of the mark pool so that
processing can continue. Perhaps validateFinalState should just log an
error and then discard all existing marks. This would allow one to
continue using Daffodil if they wanted.

2) Have some method to determine the cause of a mark failing to be
returned, since it is possibly a bug. Perhaps we want to wrap all uses
of marks in a try/catch. The catch block could catch all exceptions and
log  if there were any non-returned marks. We could also potentially
have a finally block that also returns all non-returned marks. One
downside of this is that it is possible that the returning of the mark
could throw an exception and mask the exception rethrown in the catch
block. But that does not seem that likely to me. And it does ensure the
contract of any marks taken are returned. This would also mean the above
suggestion (changing validateFinalState) is less necessary. This handles
both the logging of exceptions that would cause a mark to not be
returned, and forces returning of the mark.

Thoughts?

Reply via email to