Yes, I wouldn't expect the grammars to chew up gigabytes. I'll provide a test data set for you.
Cheers,

E.

--
Eliot Kimber
http://contrext.com

On 5/14/18, 12:45 PM, "Christian Grün" <christian.gr...@gmail.com> wrote:

I would have expected some MBs to be sufficient for parsing even complex DTDs if nothing is cached (but caching could definitely speed up processing), so maybe there’s still something that we could have a look at. If you are interested, feel free to provide me with your files via a private message.

On Mon, May 14, 2018 at 7:40 PM, Eliot Kimber <ekim...@contrext.com> wrote:
> Yes, I would want caching on by default with the option to turn it off. I'm assuming it's currently not turned on (but to be honest I haven't taken the time to check the source code).
>
> Certainly for DITA content grammar caching is the only practical way to parse a large number of topics in the same JVM without both using lots of memory and eating an avoidable processing cost of re-processing the grammar files again for each document.
>
> DITA is probably somewhat unique in this regard because it takes such a different approach to grammar organization and use than pretty much any other XML application.
>
> Cheers,
>
> E.