[ 
https://issues.apache.org/jira/browse/DAFFODIL-2400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627733#comment-17627733
 ] 

Steve Lawrence commented on DAFFODIL-2400:
------------------------------------------

I did some quick testing parsing 150,000 disv6 files.

Using -I null, it averages about 15 seconds.

Using -I sax with the DefaultHandler (which is essentially null), it averages 
about 21 seconds.

Using -I sax with DaffodilParseOutputStreamContentHandler (which outputs sax 
events to XML text), it averages about 35 seconds.

So using our SAX InfosetOutputter that does nothing is about 40% slower 
compared to null, and using a SAX InfosetOutputter that outputs to text is 
more than 130% slower compared to null.

For reference, using -I xml it averages about 26 seconds, which is about 75% 
slower than null.
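
For anyone checking the arithmetic, the percentages above are just the 
relative increase over the -I null baseline. A minimal sketch in Python (the 
helper name percent_slower is made up for illustration; the timings are the 
rough averages quoted above, not new measurements):

```python
def percent_slower(baseline_s, measured_s):
    """Relative slowdown of measured_s versus baseline_s, in percent."""
    return (measured_s - baseline_s) / baseline_s * 100.0


# Approximate averages (seconds) from the runs described above.
baseline = 15.0  # -I null
timings = {
    "-I sax, DefaultHandler": 21.0,
    "-I sax, DaffodilParseOutputStreamContentHandler": 35.0,
    "-I xml": 26.0,
}

for name, seconds in timings.items():
    print(f"{name}: {percent_slower(baseline, seconds):.0f}% slower than null")
```

This works out to roughly 40%, 133%, and 73% respectively, in line with the 
rounded figures given above.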

I'm not sure how accurate these results are and how representative disv6 is, 
but it is one sample that shows there's definitely overhead with using SAX, and 
even more overhead with converting it to XML.
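
To get a rough feel for the difference between a handler that does nothing 
and one that serializes events to XML text, independent of Daffodil, one can 
drive the same synthetic SAX events into both. The sketch below uses Python's 
xml.sax module rather than Daffodil's classes, and the element names and 
event count are made up, so it only illustrates the general shape of the 
comparison and says nothing about where Daffodil's overhead comes from:

```python
import io
import time
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLGenerator


def drive_events(handler, n_elements):
    """Send a fixed stream of synthetic SAX events to the given handler."""
    handler.startDocument()
    handler.startElement("root", {})
    for i in range(n_elements):
        handler.startElement("field", {})
        handler.characters(str(i))
        handler.endElement("field")
    handler.endElement("root")
    handler.endDocument()


def time_handler(handler, n_elements):
    """Return the seconds taken to push n_elements through the handler."""
    start = time.perf_counter()
    drive_events(handler, n_elements)
    return time.perf_counter() - start


n = 50_000  # arbitrary event count, chosen only for illustration

# The base ContentHandler ignores every event (analogous to a no-op handler).
noop_seconds = time_handler(ContentHandler(), n)

# XMLGenerator turns each event into XML text written to the buffer.
text_seconds = time_handler(XMLGenerator(io.StringIO()), n)

print(f"no-op handler:      {noop_seconds:.3f}s")
print(f"serializing to XML: {text_seconds:.3f}s")
```

Timings will vary by machine and run, so several repetitions are needed 
before reading anything into the numbers.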

So whatever we're doing in the SAXInfosetOutputter is definitely a bit more 
than just a straight wrapper. And skimming the code, there's a decent amount of 
logic to deal with namespaces depending on which SAX features are enabled, so 
it's not too surprising there's some overhead.

> New SAX API causes performance degradations
> -------------------------------------------
>
>                 Key: DAFFODIL-2400
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2400
>             Project: Daffodil
>          Issue Type: Improvement
>          Components: Performance, SAX
>            Reporter: Steve Lawrence
>            Priority: Major
>
> The new SAX API caused performance degradations across the board of file 
> types. The SAX API is basically just a wrapper around the current API, so 
> this is a bit surprising. Need to investigate what is causing these slowdowns 
> and see if it can be resolved.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
