Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-06 Thread first name last name
Regarding selective full-text indexing, I just tried XQUERY db:optimize("linuxquestions.org-selective", true(), map { 'ftindex': true(), 'ftinclude': 'div table td a' }) and I got OOM on that as well; the exact stack trace is attached to this message. I will open a separate thread regarding migrating the

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-06 Thread Christian Grün
The current full-text index builder provides an outsourcing mechanism similar to that of the builder for the default index structures, but its metadata structures are kept in main memory, and they are bulkier. There are definitely ways to tackle this technically; it hasn't been of high

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-05 Thread first name last name
Attached is a more complete output of ./bin/basexhttp. Judging from this output, it would seem that everything was OK, except for the full-text index. I now realize that I have another question about full-text indexes. It seems like the full-text index here is dependent on the amount of memory

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-05 Thread Christian Grün
The stack trace indicates that you enabled the full-text index as well. For this index, you definitely need more memory than is available on your system. So I assume you didn't encounter trouble with the default index structures? first name last name wrote on Sat., 5 Oct 2019, 20:52: > Yes,

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-05 Thread first name last name
Yes, I did, with -Xmx3100m (that's the maximum amount of memory I can allocate on that system for BaseX) and I got OOM. On Sat, Oct 5, 2019 at 2:19 AM Christian Grün wrote: > About option 1: How much memory have you been able to assign to the Java > VM? > > > > > > first name last name schrieb

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-04 Thread Christian Grün
About option 1: How much memory have you been able to assign to the Java VM? first name last name wrote on Sat., 5 Oct 2019, 01:11: > I had another look at the script I wrote and realized that it's not > working as it's supposed to. > Apparently the order of operations should be this: >

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-04 Thread first name last name
I had another look at the script I wrote and realized that it's not working as it's supposed to. Apparently the order of operations should be this:
- turn on all the required index types
- create the db
- set the parser settings and the filter settings
- add all the files to the db
- run
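The order of operations above can be sketched as a BaseX command script (a minimal sketch; the database name and import path are placeholders, not from the thread):

```
SET TEXTINDEX true
SET ATTRINDEX true
SET FTINDEX true
SET PARSER xml
SET CREATEFILTER *.xml
CREATE DB mydb /path/to/import/dir
OPTIMIZE ALL
```

Setting the index and parser options before CREATE DB means the indexes are built as part of the import, rather than in a separate post-import step.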

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-04 Thread first name last name
Hi Christian, About option 4: I agree with the options you laid out. I am currently diving deeper into option 4 in the list you wrote. Regarding the partitioning strategy, I agree. However, I did manage to partition the files to be imported into separate sets, with a constraint on the max partition
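The partitioning described above can be sketched as a simple greedy pass (an illustration only; `partition_files`, the byte limit, and the file list are hypothetical, not taken from the thread):

```python
import os

def partition_files(paths, max_bytes, size_of=os.path.getsize):
    """Greedily group `paths` into partitions whose total size stays
    at or below `max_bytes`; a single file larger than the limit gets
    its own partition."""
    partitions, current, current_size = [], [], 0
    for path in paths:
        size = size_of(path)
        # Close the current partition if adding this file would overflow it.
        if current and current_size + size > max_bytes:
            partitions.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        partitions.append(current)
    return partitions
```

Each resulting partition can then be imported and optimized as its own database, keeping the per-database index build within the memory budget.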

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-03 Thread Christian Grün
Exactly, it seems to be the final MERGE step during index creation that blows up your system. If you are restricted to 2 GB of main memory, this is what you could try next: 1. Did you already try to tweak the JVM memory limit via -Xmx? What’s the largest value that you can assign on your
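Point 1 amounts to raising the JVM heap limit when launching BaseX. A sketch, assuming a start from the JAR (the jar name and heap value are placeholders for your installation; the launcher scripts under bin/ accept the same flag via their JVM options):

```
# Start the BaseX HTTP server with a 3 GB heap (adjust -Xmx to what the machine allows).
java -Xmx3g -cp basex.jar org.basex.BaseXHTTP
```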

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-03 Thread Imsieke, Gerrit, le-tex
Hi, just saying that 16 GB of DDR3 RAM costs about 40 € now. Gerrit On 03.10.2019 08:53, first name last name wrote: I tried again, using SPLITSIZE = 12 in the .basex config file The batch(console) script I used is attached mass-import.xq This time I didn't do the optimize or index creation

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-03 Thread first name last name
I tried again, using SPLITSIZE = 12 in the .basex config file. The batch (console) script I used is attached as mass-import.xq. This time I didn't run the optimize or index creation post-import; instead, I did it as part of the import, similar to what is described in [4]. This time I got a different

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-02 Thread first name last name
Hey Christian, Thank you for your answer :) I tried setting SPLITSIZE = 24000 in .basex, but I've seen the same OOM behavior. It looks like the memory consumption is moderate until it reaches about 30 GB (the size of the db before optimize), and then memory consumption spikes, and OOM

Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-01 Thread Christian Grün
Hi first name, If you optimize your database, the indexes will be rebuilt. In this step, the builder tries to guess how much free memory is still available. If memory is exhausted, parts of the index will be split (i.e., partially written to disk) and merged in a final step. However, you can
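The splitting behavior described here is governed by the SPLITSIZE option, which can be pinned in the .basex configuration file instead of relying on the free-memory guess (the value below is only an example, not a recommendation from the thread):

```
# .basex configuration fragment: force partial index writes after a
# fixed number of index build operations rather than guessing from
# available heap (0 = guess automatically).
SPLITSIZE = 1000
```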

[basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-09-29 Thread first name last name
Hi, Let's say there's a 30 GB dataset [3] containing most threads/posts from [1]. After importing all of it, when I try to run /dba/db-optimize/ on it (which must have some corresponding command) I get the OOM error in the attached stack trace. I am using -Xmx2g, so BaseX is limited to 2 GB of memory
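The corresponding command-line equivalent of the DBA db-optimize action is the OPTIMIZE command (db:optimize() in XQuery); a sketch, with the database name as a placeholder:

```
OPEN database-name
OPTIMIZE ALL
```

OPTIMIZE ALL rebuilds the whole database and all enabled index structures, which is why it triggers the same memory pressure as the DBA action.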