stevedlawrence opened a new pull request, #879:
URL: https://github.com/apache/daffodil/pull/879

   In SAX unparsing, remove coroutines. It's not totally clear why, but they
cause noticeable overhead. Instead, events are stored in a thread-safe FIFO
ArrayBlockingQueue which is shared between the
DaffodilUnparseContentHandler and the SAXInfosetInputter. As the 
DaffodilUnparseContentHandler receives SAX events it put()'s them on the queue. As the
SAXInfosetInputter is ready for events it takes from the queue. A bit more care 
is needed to ensure there are no deadlocks, but this allows the two threads to 
work in parallel, which may help speed things up a bit, especially if SAX 
events come in slowly.
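
   The queue hand-off described above can be sketched roughly as follows.
This is a minimal Java illustration, not the code in this PR: the class name
EventQueueSketch, the relay() helper, the use of plain strings as events, and
the END sentinel are all assumptions made for the example. The calling thread
plays the role of the content handler and the spawned thread plays the role of
the inputter.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class EventQueueSketch {
    // Hypothetical end-of-stream marker so the consumer knows when to stop.
    static final String END = "<end-of-events>";

    // Pushes events through a bounded queue to a consumer thread and returns
    // what the consumer received, in order.
    static List<String> relay(List<String> events, int capacity) {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        List<String> received = new ArrayList<>();

        // Consumer: stands in for the inputter, taking events as it is ready.
        Thread consumer = new Thread(() -> {
            try {
                for (String event = queue.take(); !event.equals(END); event = queue.take()) {
                    received.add(event);
                }
            } catch (InterruptedException interrupted) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        try {
            // Producer: stands in for the content handler. put() blocks while
            // the queue is full, so a slow consumer applies back-pressure.
            for (String e : events) {
                queue.put(e);
            }
            // Always send the sentinel; otherwise the consumer blocks forever.
            queue.put(END);
            // join() makes the consumer's writes to `received` visible here.
            consumer.join();
        } catch (InterruptedException ex) {
            consumer.interrupt();
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while relaying events", ex);
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(relay(
            List.of("startDocument", "startElement", "endElement", "endDocument"), 2));
    }
}
```

   The sentinel is one simple way to avoid the consumer waiting forever on
take(). A real implementation also has to handle the case where either side
fails partway through.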
   
   In addition to not using coroutines, the unparse() call is now run in a 
Future with an ExecutionContext using a cached thread pool. This should allow
subsequent SAX unparse() calls to reuse threads, avoiding some overhead with 
thread creation.
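
   As a rough Java analogue of running work in a Future on a cached thread
pool (the PR itself uses a Scala Future with an ExecutionContext), consider
the sketch below. The names UnparseFutureSketch and runUnparse(), and the
string "work" being done, are hypothetical placeholders.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class UnparseFutureSketch {
    // A cached pool creates threads on demand and keeps idle ones around for
    // a while, so later submissions can often reuse an existing thread.
    // Daemon threads are used here so the sketch does not keep the JVM alive.
    static final ExecutorService POOL = Executors.newCachedThreadPool(runnable -> {
        Thread t = new Thread(runnable);
        t.setDaemon(true);
        return t;
    });

    // Submits placeholder "unparse" work to the pool and waits for its result.
    static String runUnparse(String input) {
        Future<String> result = POOL.submit(() -> "unparsed:" + input);
        try {
            return result.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }

    public static void main(String[] args) {
        System.out.println(runUnparse("first"));
        System.out.println(runUnparse("second"));
    }
}
```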
   
   In SAX unparsing, keep a reference to the current event being mutated in a 
pre-allocated array of events. Not only does this avoid many array index 
lookups, it makes the code much cleaner.
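
   The idea of holding a reference to the current slot of a pre-allocated
array, rather than indexing into the array for every field update, might look
something like this in Java. The Event fields and the EventBatchSketch class
are assumptions for illustration and do not mirror the actual Daffodil types.

```java
final class EventBatchSketch {
    // Hypothetical mutable event that is reused rather than reallocated.
    static final class Event {
        String type;
        String localName;

        void clear() {
            type = null;
            localName = null;
        }
    }

    private final Event[] batch;
    private int index = 0;
    // Reference to batch[index], updated only when the index changes.
    private Event current;

    EventBatchSketch(int size) {
        batch = new Event[size];
        for (int i = 0; i < size; i++) {
            batch[i] = new Event();
        }
        current = batch[0];
    }

    // Mutates the current event without repeating the batch[index] lookup.
    void startElement(String localName) {
        current.type = "startElement";
        current.localName = localName;
    }

    // Moves to the next slot (wrapping around) and resets it for reuse.
    void advance() {
        index = (index + 1) % batch.length;
        current = batch[index];
        current.clear();
    }

    Event current() {
        return current;
    }
}
```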
   
   In SAX unparsing, initialize prefixMapping to TopScope instead of null. With 
a null value it was possible for getURI() to throw a NullPointerException if 
there is no prefix mapping, which is common when an element does not have a 
prefix and there is no default mapping. Previously we caught and handled the
exception, but we've seen exceptions have noticeable overhead. By using
TopScope, there is no exception and instead getURI returns null if no mapping 
is found.
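
   The benefit of starting from a non-null "top" scope can be illustrated with
a small Java sketch of a linked prefix scope. PrefixScope here is a
hypothetical stand-in written for this example, not the Scala XML class used
by Daffodil. Every chain is assumed to end at TOP, so a lookup simply returns
null when nothing matches instead of dereferencing a null scope.

```java
import java.util.Objects;

final class PrefixScope {
    // Sentinel that terminates every chain; it holds no mapping itself.
    static final PrefixScope TOP = new PrefixScope(null, null, null);

    private final String prefix; // null stands for the default namespace
    private final String uri;
    private final PrefixScope parent;

    // Assumption: callers pass TOP (never null) as the outermost parent.
    PrefixScope(String prefix, String uri, PrefixScope parent) {
        this.prefix = prefix;
        this.uri = uri;
        this.parent = parent;
    }

    // Walks outward until TOP; returns null if no mapping is found.
    String getURI(String wanted) {
        for (PrefixScope s = this; s != TOP; s = s.parent) {
            if (Objects.equals(s.prefix, wanted)) {
                return s.uri;
            }
        }
        return null;
    }
}
```

   With this shape, PrefixScope.TOP.getURI(null) returns null without any
exception being thrown or caught on the hot path.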
   
   In SAX unparsing, avoid .split() which compiles and evaluates a slow
regular expression behind the scenes. We only need to split on a single char, 
which we can do much faster using indexOf() and substring().
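
   A minimal sketch of splitting on a single character with indexOf() and
substring() is shown below, assuming the string being split is a qualified
name of the form prefix:local. The class and method names are hypothetical,
and returning an empty prefix when there is no colon is a choice made for the
example.

```java
final class QNameSplitSketch {
    // Returns {prefix, localName}; the prefix is "" when there is no colon.
    static String[] splitQName(String qName) {
        int colon = qName.indexOf(':');
        if (colon < 0) {
            return new String[] { "", qName };
        }
        return new String[] { qName.substring(0, colon), qName.substring(colon + 1) };
    }

    public static void main(String[] args) {
        String[] parts = splitQName("ex:record");
        System.out.println(parts[0] + " | " + parts[1]);
    }
}
```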
   
   In SAX parsing, we now store the computed prefix/prefixedName in a val 
instead of a def. The def was a noticeable hotspot during profiling that this 
removes.
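
   In Java terms, the difference between a Scala def and a val is roughly the
difference between recomputing a value in a method on every call and computing
it once into a final field. The sketch below shows the second form.
ElementNameSketch and its fields are hypothetical names chosen for
illustration.

```java
final class ElementNameSketch {
    // Computed once at construction, like a Scala val.
    private final String prefixedName;

    ElementNameSketch(String prefix, String localName) {
        this.prefixedName = (prefix == null || prefix.isEmpty())
            ? localName
            : prefix + ":" + localName;
    }

    // Returns the stored value; no string concatenation on each call.
    String prefixedName() {
        return prefixedName;
    }
}
```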
   
   Also included are a handful of refactorings, such as moving
SAXInfosetInputterEvents out of the DFDLParserUnparse API file (since they
aren't really API related), moving functions around to improve logical
groupings, and renaming variables/functions to be consistent between the
DaffodilUnparseContentHandler and SAXInfosetInputter.
   
   DAFFODIL-2400

