[
https://issues.apache.org/jira/browse/DAFFODIL-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483816#comment-17483816
]
Steve Lawrence commented on DAFFODIL-2627:
------------------------------------------
Yeah, sounds like the garbage collector is just too aggressive, or maybe
IntelliJ triggers the garbage collector between test suites? How much memory
are you giving the JVM? It's possible that you're low on memory, so it's
triggering the garbage collector more often?
Regardless, relying on the garbage collector is probably not a great idea in
hindsight; we just don't have enough control over the cache. We probably do
need to replace the WeakHashMap cache with an actual cache with a timer and
more control over when cache items are evicted. Though, I wonder if we also
need to detect changes in the schema to trigger a rebuild even if something is
in the cache? We didn't have this problem when the cache was in a Runner
because runners were recreated every time a test ran. Maybe with a small enough
expiration time this isn't an issue?
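
To make the timer idea concrete, here is a rough sketch of what such a cache
could look like. This is not Daffodil code: the class name ExpiringCache, the
"fingerprint" parameter (which could be a schema file's last-modified time or
a content hash), and the injected clock are all assumptions made up for
illustration. An entry is rebuilt if it is missing, if its fingerprint
changed, or if it has been idle longer than the expiration time.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical sketch: entries expire after an idle period or when their source changes. */
final class ExpiringCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long fingerprint;      // assumed: e.g. last-modified time of the schema file
        volatile long lastUsed;      // clock reading at the most recent access

        Entry(V value, long fingerprint, long lastUsed) {
            this.value = value;
            this.fingerprint = fingerprint;
            this.lastUsed = lastUsed;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final long ttl;          // maximum idle time, in the clock's units
    private final LongSupplier clock;

    ExpiringCache(long ttl, LongSupplier clock) {
        this.ttl = ttl;
        this.clock = clock;
    }

    /** Returns the cached value, recompiling if it is missing, changed, or expired. */
    V get(K key, long fingerprint, Function<K, V> compile) {
        long now = clock.getAsLong();
        Entry<V> entry = entries.compute(key, (k, old) -> {
            boolean stale = old == null
                || old.fingerprint != fingerprint
                || now - old.lastUsed > ttl;
            if (stale) {
                return new Entry<>(compile.apply(k), fingerprint, now);
            }
            old.lastUsed = now;      // refresh the idle timer on a hit
            return old;
        });
        return entry.value;
    }

    /** Drops every entry that has been idle longer than the expiration time. */
    void evictExpired() {
        long now = clock.getAsLong();
        entries.values().removeIf(e -> now - e.lastUsed > ttl);
    }

    int size() {
        return entries.size();
    }
}
```

In real use the clock would be something like System::nanoTime and
evictExpired() would be called from a scheduled task. Passing the clock in
just makes the expiration behavior easy to test.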
That said, I feel like the problem here is really JUnit (and probably most
other unit test tools)--they just aren't designed to support sharing objects
between test suites, so we have to hack together this global cache. And our
Runner implementation is also maybe part of the problem since each Runner only
supports a single tdml file, which makes it difficult to share schemas in
different TDML files.
It almost feels like the ideal approach would be to scrap JUnit altogether and
have a custom TDML test interface. This could then scan all TDML files and
figure out which schemas need to be compiled. It could detect which schemas are
shared among different TDML files and run them together. And it could throw
away schemas at exactly the right time because it would know when all tests are
done with a schema. This interface would have all the necessary information to
run tests efficiently because it actually knows what a TDML file is.
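
As a very rough sketch of the scheduling part of that idea: assuming
something has already scanned the TDML files and produced a map from each
TDML file to the schemas it references, the runner could invert that map and
walk it schema by schema. The SchemaPlan class, its method names, and the
step strings below are all hypothetical; nothing like this exists in Daffodil
today.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: order test runs so each schema is compiled once and released promptly. */
final class SchemaPlan {

    /** Inverts "TDML file -> schemas it uses" into "schema -> TDML files that use it". */
    static Map<String, List<String>> groupBySchema(Map<String, List<String>> schemasByTdml) {
        Map<String, List<String>> tdmlBySchema = new TreeMap<>();
        // Sorted iteration keeps the resulting plan deterministic.
        for (Map.Entry<String, List<String>> e : new TreeMap<>(schemasByTdml).entrySet()) {
            for (String schema : e.getValue()) {
                tdmlBySchema.computeIfAbsent(schema, s -> new ArrayList<>()).add(e.getKey());
            }
        }
        return tdmlBySchema;
    }

    /** Lists the steps: compile a schema, run every TDML file that needs it, then release it. */
    static List<String> schedule(Map<String, List<String>> schemasByTdml) {
        List<String> steps = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : groupBySchema(schemasByTdml).entrySet()) {
            String schema = e.getKey();
            steps.add("compile " + schema);
            for (String tdml : e.getValue()) {
                steps.add("run " + tdml + " with " + schema);
            }
            // Every user of this schema has run, so it can be thrown away now.
            steps.add("release " + schema);
        }
        return steps;
    }
}
```

Because the plan knows every user of a schema up front, there is no need for
a timer or the garbage collector to guess when a compiled schema is no longer
needed.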
Unfortunately, that's a pretty sizable effort, especially since we'd have to
create separate plugins for supported IDEs. A cache is certainly the best short
term solution, and is probably sufficient long term, even though it feels a bit
hacky to me.
> Performance regression in TDML processor
> ----------------------------------------
>
> Key: DAFFODIL-2627
> URL: https://issues.apache.org/jira/browse/DAFFODIL-2627
> Project: Daffodil
> Issue Type: Bug
> Components: TDML Runner
> Affects Versions: 3.2.0
> Reporter: Josh Adams
> Assignee: Steve Lawrence
> Priority: Major
> Fix For: 3.3.0
>
>
> While working on a customer project we noticed a significant increase in the
> amount of time it took to run our test suite (over 600 tests) after upgrading
> from Daffodil 2.7.0 to 3.2.1. We were seeing roughly a 4x increase in time
> to complete the same set of tests.
> I've narrowed the performance regression to commit
> 0700ee8dc9531497f3e8b0fdf9266f8e3b105c27 which involved a removal of the
> schema compilation cache, which is likely causing the schema to need to be
> recompiled much more frequently.
> We use a relatively large schema (over 10,000 lines), but it is the same
> schema used for all tests.