Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-14 Thread Christian Grün
Good to know; I’ll record this as positive news ;) Feel free to give me an update once you encounter a similar behavior.
On Mon, May 14, 2018 at 8:40 PM, Eliot Kimber wrote:
> Hmm.
>
> In the process of testing my test data set I can't reproduce the earlier
> behavior.
>

Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-14 Thread Eliot Kimber
Hmm. In the process of testing my test data set I can't reproduce the earlier behavior. In my current tests, using the same data and the same BaseX version, I get a maximum of maybe 1GB for the largest file but just a few hundred MBs once everything is loaded. For 3800 topics of roughly 50K

Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-14 Thread Eliot Kimber
Yes, I wouldn't expect the grammars to chew up gigabytes. I'll provide a test data set for you.
Cheers, E.
--
Eliot Kimber
http://contrext.com
On 5/14/18, 12:45 PM, "Christian Grün" wrote:
> I would have expected some MBs to be sufficient for parsing even

Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-14 Thread Christian Grün
I would have expected some MBs to be sufficient for parsing even complex DTDs if nothing is cached (but caching could definitely speed up processing), so maybe there’s still something that we could have a look at. If you are interested, feel free to provide me with your files via a private

Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-14 Thread Eliot Kimber
Yes, I would want caching on by default with the option to turn it off. I'm assuming it's currently not turned on (but to be honest I haven't taken the time to check the source code). Certainly for DITA content grammar caching is the only practical way to parse a large number of topics in the

Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-14 Thread Christian Grün
Hi Eliot,
Thanks for your observations.
> I think the solution is to turn on Xerces' grammar caching.
I’m wondering what is happening here. Did you want to say that caching is enabled by default, and that it should be possible to turn it off?
Cheers, Christian
> The only danger there is that

Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-11 Thread Eliot Kimber
More experimentation indicates that the issue is the DTDs--if I load the same content without DTD parsing then it loads fine and takes the expected relatively small amount of memory. I think the solution is to turn on Xerces' grammar caching. The only danger there is that different DTDs within
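For readers who want to reproduce the "without DTD parsing" comparison outside the GUI, here is a minimal sketch in plain JAXP. It is not BaseX's import code, and it does not demonstrate grammar caching. It only shows a parse that skips the external DTD, using the Xerces feature `http://apache.org/xml/features/nonvalidating/load-external-dtd`, which the JDK's built-in Xerces-derived parser also recognizes. The class name `DtdFreeParse`, the sample topic, and the DTD path are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration; not part of BaseX or Xerces.
final class DtdFreeParse {

    // Xerces feature that controls whether the external DTD subset is loaded
    // when the parser is not validating.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses an XML string without fetching or processing the external DTD.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(false);                 // no DTD validation
            factory.setFeature(LOAD_EXTERNAL_DTD, false); // skip the external subset
            DocumentBuilder builder = factory.newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrap checked parser exceptions to keep the sketch short.
            throw new IllegalStateException("Parse failed", e);
        }
    }

    public static void main(String[] args) {
        // The DOCTYPE points at a DTD that does not exist; the parse still
        // succeeds because the external subset is never loaded.
        String topic =
            "<!DOCTYPE topic SYSTEM \"no-such-dir/topic.dtd\">"
            + "<topic id=\"t1\"><title>Sample</title></topic>";
        Document doc = parse(topic);
        System.out.println(doc.getDocumentElement().getNodeName());
    }
}
```

One caveat with this approach: anything the DTD would have contributed, such as defaulted attribute values, is absent from the parsed result. That matters for DITA content, which normally receives its `class` attributes as DTD defaults.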

Re: [basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-04 Thread Eliot Kimber
Follow up--I tried giving BaseX the full 16GB of RAM and it still ultimately locked up with the memory meter showing 13GB. I'm thinking this must be some kind of memory leak. I tried importing the DITA Open Toolkit's documentation source and that worked fine with the max memory being about

[basex-talk] 9.0.1: High Memory Usage Loading Docs Via GUI

2018-05-04 Thread Eliot Kimber
In the context of trying to do fun things with DITA docs in BaseX I downloaded the latest BaseX (9.0.1) and tried creating a new database and loading docs into it using the BaseX GUI. This is on macOS 10.13.4 with 16GB of hardware RAM available. My corpus is about 4000 DITA topics totaling