[ 
https://issues.apache.org/jira/browse/DAFFODIL-2246?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Lawrence updated DAFFODIL-2246:
-------------------------------------
    Priority: Minor  (was: Major)

> Basic performance test built into daffodil
> ------------------------------------------
>
>                 Key: DAFFODIL-2246
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2246
>             Project: Daffodil
>          Issue Type: Improvement
>          Components: Infrastructure, QA
>    Affects Versions: 2.5.0
>            Reporter: Mike Beckerle
>            Priority: Minor
>
> We need a performance test - a very simple one that uses built-in, 
> non-restricted DFDL schemas that are small and easily maintained. 
> It should be easy for developers to run as part of regression testing, 
> fitting into the normal edit-compile-test workflow.
> The goal is to catch significant performance regressions earlier. 
> This is not a substitute for more serious performance testing on a controlled 
> platform using realistic customer-centric DFDL schemas. That is still needed, 
> and should cover things like multi-threaded "throughput" tests.
> This is a quicker/simpler thing. Single thread. 
> Thoughts:
> * Measure performance relative to calls of the Tak function, i.e., in 
> "Takeon" units. This makes the timings self-relative to the speed of the 
> JVM, so that different people with systems of different speeds have a 
> chance of getting somewhat consistent timings. 
> * Isolate parsing and unparsing timings.
> * Avoid I/O - we should read from in-memory buffers, write to in-memory 
> buffers, which should be small enough (maybe 1 Mbyte) to not introduce 
> memory-allocation/memory-footprint artifacts. 
> * Single threaded only.
> * Use message-streaming API calls to parse repeatedly to create modest-sized 
> infoset objects. 
> * Isolate the timing of basic parsing (creating a DFDL Infoset) from 
> InfosetOutputter overhead. 
> * Isolate the timing of basic unparsing (from a DFDL Infoset) from 
> InfosetInputter overhead. 
> * Test performance of schema compilation as well (e.g., time saving the 
> compiled parser to a stream that just discards the data). 
> * Maintain per-developer history - each developer will have a file (or 
> similar store) on their development system that is updated with timings and 
> baselines, so that when running these perf tests, results are compared to 
> the prior results for that same developer on that same machine. 
> ** This also allows computing a standard deviation and Z-score, which make 
> performance results far easier to analyze - one can flag performance 
> variations that are out of the norm not in an absolute-timing sense, but 
> relative to the standard deviation of timings for that same test (e.g., a 
> Z-score above some threshold means a run is unusually slow relative to that 
> test's own typical performance, not merely slower than a fixed cutoff).
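
The Takeon-unit and Z-score ideas above could be sketched roughly as follows. This is only an illustration; names like `takeonNanos` and the choice of Tak arguments are assumptions, not existing Daffodil API.

```java
public class PerfSketch {
    // Classic Takeuchi (Tak) benchmark function: call-heavy recursion whose
    // runtime tracks the speed of the JVM it runs on.
    static int tak(int x, int y, int z) {
        return (y < x)
            ? tak(tak(x - 1, y, z), tak(y - 1, z, x), tak(z - 1, x, y))
            : z;
    }

    // One hypothetical "Takeon" = wall time of a fixed reference Tak call;
    // dividing measured timings by it makes them self-relative to JVM speed.
    static long takeonNanos() {
        long t0 = System.nanoTime();
        tak(18, 12, 6);
        return Math.max(1L, System.nanoTime() - t0);
    }

    static double toTakeons(long elapsedNanos) {
        return (double) elapsedNanos / takeonNanos();
    }

    // Z-score of a new timing against that same test's timing history, so a
    // regression is flagged relative to the test's own variability rather
    // than against an absolute cutoff.
    static double zScore(double[] history, double sample) {
        double mean = 0;
        for (double t : history) mean += t;
        mean /= history.length;
        double var = 0;
        for (double t : history) var += (t - mean) * (t - mean);
        double sd = Math.sqrt(var / history.length);
        return (sample - mean) / sd;
    }
}
```

A per-developer baseline file would then just be the stored `history` array for each test, updated after each run.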
> Once we have the framework, we will want to add perf tests that isolate the 
> performance of specific features, so as to focus attention when regressions 
> are seen. E.g., one perf test may use lengthKind 'prefixed' exclusively, 
> another may focus on delimited text data, and another on 
> non-byte-sized/aligned data. 
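
The in-memory-buffer and discard-the-output ideas could be sketched with a generic harness like the one below. `timeOverBuffer` and the discard stream are hypothetical placeholders; real Daffodil parse/unparse calls would go inside the timed body.

```java
import java.io.ByteArrayInputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.function.Consumer;

public class HarnessSketch {
    // An OutputStream that drops everything, so e.g. saving a compiled
    // parser (or unparsing) can be timed without any real I/O.
    static final OutputStream DISCARD = new OutputStream() {
        @Override public void write(int b) { }
        @Override public void write(byte[] b, int off, int len) { }
    };

    // Time `body` over `reps` runs against a small in-memory input buffer
    // (avoiding file I/O entirely), returning the median elapsed
    // nanoseconds to damp outlier runs.
    static long timeOverBuffer(byte[] data, int reps,
                               Consumer<ByteArrayInputStream> body) {
        long[] times = new long[reps];
        for (int i = 0; i < reps; i++) {
            ByteArrayInputStream in = new ByteArrayInputStream(data);
            long t0 = System.nanoTime();
            body.accept(in);
            times[i] = System.nanoTime() - t0;
        }
        Arrays.sort(times);
        return times[reps / 2];
    }
}
```

Keeping `data` modest (say, under 1 MByte, per the note above) keeps memory-allocation and footprint artifacts out of the measurement.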



--
This message was sent by Atlassian Jira
(v8.3.4#803005)