[ 
https://issues.apache.org/jira/browse/DAFFODIL-2627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17483894#comment-17483894
 ] 

Mike Beckerle commented on DAFFODIL-2627:
-----------------------------------------

Ah, yes, I guess that would end up being persistent and long-lived.

So we have to think of the "key" for this cache in two parts. There's the 
primary key, which is what we use now. Then there is a secondary key which 
combines the primary key and a digraph of the import/include structure with 
URLs and modification timestamps. 

If you construct the primary key and miss the cache, then you have to compile 
the schema. At that point you get the compiled schema plus the digraph of the 
import/include structure with its modification timestamps. Both are stored.

Subsequently, a hit has to match the primary key, and then the secondary key 
has to be traversed, examining each URL to get its current modification 
timestamp. If they are all unchanged, it's a hit. Otherwise, upon the first 
detected change, we invalidate the cache and recompile, gathering a new 
secondary key as well.
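
To make the lookup concrete, here is a rough sketch in Python (Daffodil itself 
is Scala, so this is illustrative only). Every name in it (SchemaCache, 
CacheEntry, compile_fn) is hypothetical and not part of Daffodil's API. It also 
assumes the import/include digraph has been flattened into a list of local file 
paths, and that the compile step reports which files it read.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Tuple


@dataclass
class CacheEntry:
    compiled: object                  # the compiled schema
    deps: List[Tuple[str, float]]     # secondary key: (path, mtime) per dependency


class SchemaCache:
    """Hypothetical two-part-key cache; not Daffodil's actual implementation."""

    def __init__(self, compile_fn: Callable[[Hashable], Tuple[object, List[str]]]):
        # Assumption: compile_fn(primary_key) returns
        # (compiled_schema, paths of every imported/included file).
        self._compile_fn = compile_fn
        self._entries: Dict[Hashable, CacheEntry] = {}

    def _is_fresh(self, entry: CacheEntry) -> bool:
        # Walk the recorded dependencies; stop at the first detected change.
        for path, recorded_mtime in entry.deps:
            try:
                if os.path.getmtime(path) != recorded_mtime:
                    return False
            except OSError:
                return False          # a missing file counts as a change
        return True

    def get(self, primary_key: Hashable) -> object:
        entry = self._entries.get(primary_key)
        if entry is not None and self._is_fresh(entry):
            return entry.compiled     # both key parts match: a hit
        # Miss or stale entry: drop it, recompile, record a new secondary key.
        self._entries.pop(primary_key, None)
        compiled, dep_paths = self._compile_fn(primary_key)
        deps = [(p, os.path.getmtime(p)) for p in dep_paths]
        self._entries[primary_key] = CacheEntry(compiled, deps)
        return compiled
```

The cost of a hit is then one timestamp check per dependency, which should be 
far cheaper than recompiling a large schema for every test.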

If the TDML file has embedded schema, then it and its last modification 
timestamp become part of the secondary key. 

If schemas are being accessed from jars on the classpath, then one must be able 
to pull a modification timestamp for a file within the jar as well. 
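
As a rough illustration of the jar case: a jar is a zip archive, and each entry 
carries its own last-modified field. The sketch below uses Python's standard 
zipfile module; the function name jar_entry_mtime is hypothetical. Note that 
zip timestamps have no time zone and only two-second resolution, and some build 
tools normalize them, so the jar file's own modification time may also be 
worth checking.

```python
import zipfile
from datetime import datetime


def jar_entry_mtime(jar_path: str, entry_name: str) -> datetime:
    """Return the last-modified time recorded for one entry inside a jar.

    Hypothetical helper for illustration; raises KeyError if the entry
    does not exist in the archive.
    """
    with zipfile.ZipFile(jar_path) as jar:
        info = jar.getinfo(entry_name)
        # ZipInfo.date_time is (year, month, day, hour, minute, second).
        return datetime(*info.date_time)
```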

 

 

> Performance regression in TDML processor
> ----------------------------------------
>
>                 Key: DAFFODIL-2627
>                 URL: https://issues.apache.org/jira/browse/DAFFODIL-2627
>             Project: Daffodil
>          Issue Type: Bug
>          Components: TDML Runner
>    Affects Versions: 3.2.0
>            Reporter: Josh Adams
>            Assignee: Steve Lawrence
>            Priority: Major
>             Fix For: 3.3.0
>
>
> While working on a customer project we noticed a significant increase in the 
> amount of time it took to run our test suite (over 600 tests) after upgrading 
> from Daffodil 2.7.0 to 3.2.1.  We were seeing roughly a 4x increase in time 
> to complete the same set of tests.
> I've narrowed the performance regression to commit 
> 0700ee8dc9531497f3e8b0fdf9266f8e3b105c27 which involved a removal of the 
> schema compilation cache, which is likely causing the schema to need to be 
> recompiled much more frequently.
> We use a relatively large schema (over 10,000 lines), but it is the same 
> schema used for all tests.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
