stevedlawrence opened a new pull request, #1159:
URL: https://github.com/apache/daffodil/pull/1159

   The XMLTextInfosetOutputter and JSONInfosetOutputter do not use any 
buffering when writing data. Modifying these to wrap a BufferedWriter around 
the existing OutputStreamWriter gives significant performance improvements. 
Using a BufferedWriter also allows us to use its built-in newLine() function 
for pretty printing.
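
   As a rough illustration of the buffering change (a minimal sketch, not the actual Daffodil code), the wrapping could look like the following. The class name `BufferedOutputSketch` and the helpers `wrap` and `render` are hypothetical; only the `java.io` classes are real.

```java
import java.io.BufferedWriter;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class BufferedOutputSketch {
    // Hypothetical helper: wrap the existing OutputStreamWriter in a
    // BufferedWriter so that many small write() calls are batched.
    static BufferedWriter wrap(OutputStream os) {
        return new BufferedWriter(new OutputStreamWriter(os, StandardCharsets.UTF_8));
    }

    // Hypothetical helper: write one element followed by a line break,
    // using BufferedWriter.newLine() for the pretty-printing newline.
    static String render(String name, String value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (BufferedWriter w = wrap(bytes)) {
            w.write("<" + name + ">");
            w.write(value);
            w.write("</" + name + ">");
            w.newLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Closing the writer flushed the buffer, so all bytes are present.
        return new String(bytes.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

   Note that `newLine()` writes the platform line separator, and that buffered output only reaches the underlying stream once the writer is flushed or closed.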
   
   This also modifies the "Standard" XML escape style in the XML infoset 
outputter so that it first checks whether any characters need to be 
escaped, similar to what we do for the CDATA escape style. In most cases, no 
characters need escaping, so we can avoid Scala XML's escape utility, which 
has noticeable overhead even when nothing needs escaping.
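
   The check-before-escape idea can be sketched as follows. This is only an illustration under stated assumptions: the class `EscapeFastPathSketch`, its methods, and the exact set of characters checked are hypothetical, and `slowEscape` is a hand-written stand-in for Scala XML's escape utility rather than a call to it.

```java
public class EscapeFastPathSketch {
    // Cheap scan: does the string contain any character that would be escaped?
    // The character set here is an assumption made for illustration.
    static boolean needsEscaping(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '<' || c == '>' || c == '&' || c == '"') {
                return true;
            }
        }
        return false;
    }

    // Stand-in for the more expensive general-purpose escape routine.
    static String slowEscape(String s) {
        StringBuilder sb = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '&': sb.append("&amp;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Fast path: return the original string untouched when nothing needs
    // escaping, and only fall back to the slow routine otherwise.
    static String escape(String s) {
        return needsEscaping(s) ? slowEscape(s) : s;
    }
}
```

   In the common case the scan finds nothing and the original string is returned without any allocation or copying.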
   
   Tested on a large file with lots of strings, these changes reduced the 
total parse + infoset output time from about 125 seconds to 93 seconds, 
about a 25% decrease. Note that parsing with the null infoset outputter takes 
about 78 seconds, so the XML infoset outputter overhead went from about 37% of 
the total parse time down to about 20%.
   
   DAFFODIL-2872


