Good to know; I’ll record this as positive news ;) Feel free to send
me an update if you encounter similar behavior again.
On Mon, May 14, 2018 at 8:40 PM, Eliot Kimber wrote:
> Hmm.
>
> In the process of testing my test data set I can't reproduce the earlier
> behavior.
>
Hmm.
In the process of testing my test data set I can't reproduce the earlier
behavior.
In my current tests, using the same data and the same BaseX version, I get a
maximum of maybe 1 GB for the largest file but just a few hundred MB once
everything is loaded.
For 3800 topics of roughly 50K
Yes, I wouldn't expect the grammars to chew up gigabytes. I'll provide a test
data set for you.
Cheers,
E.
--
Eliot Kimber
http://contrext.com
On 5/14/18, 12:45 PM, "Christian Grün" wrote:
I would have expected some MBs to be sufficient for parsing even
complex DTDs if nothing is cached (but caching could definitely speed
up processing), so maybe there’s still something that we could have a
look at. If you are interested, feel free to provide me with your
files via a private
Yes, I would want caching on by default with the option to turn it off. I'm
assuming it's currently not turned on (but to be honest I haven't taken the
time to check the source code).
Certainly for DITA content grammar caching is the only practical way to parse a
large number of topics in the
Hi Eliot,
Thanks for your observations.
> I think the solution is to turn on Xerces' grammar caching.
I’m wondering what is happening here. Did you want to say that caching
is enabled by default, and that it should be possible to turn it off?
Cheers,
Christian
More experimentation indicates that the issue is the DTDs--if I load the same
content without DTD parsing then it loads fine and takes the expected
relatively small amount of memory.
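For readers who want to reproduce the "without DTD parsing" comparison outside BaseX, here is a minimal sketch using the SAX parser bundled with the JDK. It is not how BaseX loads documents; the class name NoDtdParse and the method countElements are hypothetical, and the sample topic is made up. It relies on the Xerces feature ID `http://apache.org/xml/features/nonvalidating/load-external-dtd`, which the JDK's Xerces-derived parser also accepts.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class NoDtdParse {
    // Xerces feature ID that controls whether the external DTD subset is loaded.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    /** Parses the XML without fetching the external DTD; returns the element count. */
    static int countElements(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(false);
        factory.setFeature(LOAD_EXTERNAL_DTD, false);
        SAXParser parser = factory.newSAXParser();

        final int[] count = {0};
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        });
        return count[0];
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample; "topic.dtd" does not need to exist for this parse.
        String xml =
            "<!DOCTYPE topic PUBLIC \"-//OASIS//DTD DITA Topic//EN\" \"topic.dtd\">"
            + "<topic id=\"t1\"><title>Example</title><body><p>Text</p></body></topic>";
        System.out.println("elements: " + countElements(xml));
    }
}
```

One caveat with this approach: when the DTD is skipped, attribute defaults declared in it (such as the DITA @class values) are not supplied to the parsed document.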
I think the solution is to turn on Xerces' grammar caching. The only danger
there is that different DTDs within
Follow up--I tried giving BaseX the full 16GB of RAM and it still ultimately
locked up with the memory meter showing 13GB.
I'm thinking this must be some kind of memory leak.
I tried importing the DITA Open Toolkit's documentation source and that worked
fine with the max memory being about
In the context of trying to do fun things with DITA docs in BaseX I downloaded
the latest BaseX (9.0.1) and tried creating a new database and loading docs
into it using the BaseX GUI. This is on macOS 10.13.4 with 16GB of hardware RAM
available.
My corpus is about 4000 DITA topics totaling