The inability to handle whitespace (and punctuation) appropriately
results in false positives in our StratML-enabled query service at
https://search.aboutthem.info/
I look forward to learning whether anything can be done about it.
Owen Ambur
https://www.linkedin.com/in/owenambur/
Whitespace is probably only a minor factor here. It can’t explain the loading
times that grow non-linearly with document count.
Dietmar, have you looked at the memory consumption? My experience is that if
memory gets scarce, garbage collection will kick in frequently, slowing down
the import
Thanks for the addition, Liam; I should have mentioned that.
If your input has mixed content, and if the relevant sections have
xml:space='preserve' attributes…
…whitespace stripping will be safe.
Similarly, it may be helpful to know that the whitespace gets lost if XML
strings…
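
To illustrate the xml:space point above, here is a minimal Python sketch of
whitespace stripping that leaves sections marked xml:space='preserve'
untouched. It is not how BaseX implements its import; the function names
strip_ws and strip_document are made up for this example, and it uses only
the Python standard library. Note that it removes every whitespace-only text
node outside a preserved section, which is exactly why mixed content needs
the attribute.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:space under the fixed XML namespace URI.
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"


def strip_ws(elem, preserve=False):
    """Drop whitespace-only text nodes unless xml:space='preserve' applies."""
    space = elem.get(XML_SPACE)
    if space == "preserve":
        preserve = True
    elif space == "default":
        preserve = False

    if not preserve:
        # Text directly after the start tag of this element.
        if elem.text is not None and not elem.text.strip():
            elem.text = None
        # A child's tail is text inside this element, so this element's
        # setting (not the child's) decides whether it is stripped.
        for child in elem:
            if child.tail is not None and not child.tail.strip():
                child.tail = None

    for child in elem:
        strip_ws(child, preserve)


def strip_document(xml_text):
    """Parse an XML string, strip ignorable whitespace, and serialize it."""
    root = ET.fromstring(xml_text)
    strip_ws(root)
    return ET.tostring(root, encoding="unicode")


sample = (
    "<doc>\n"
    "  <a>x</a>\n"
    '  <p xml:space="preserve"> <b>y</b> </p>\n'
    "</doc>"
)
stripped = strip_document(sample)
print(stripped)
```

In the sample, the indentation between the elements is removed, while the
single spaces inside the preserved p element survive.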
On Tue, 2024-02-13 at 20:29 +0100, Christian Grün wrote:
>
> If your XML input has been properly indented to improve readability,
> you can reduce the size of your database by dropping superfluous
> whitespace during the import:
>
> SET STRIPWS ON; CREATE DB ...
> db:create('db',