If you don't need to add documents to existing databases (or if you use the
ADDCACHE option for that), I recommend simply starting with the JVM
defaults, creating a sample database from, let’s say, 10 GB of XML input, and
seeing if it works out of the box. Only if it doesn’t may it be necessary to
specify -Xmx. It may also be worth analysing whether you run into any
trouble at all when parts of your memory are reserved by the JVM. In
our own use cases (with gigabytes of XML data), we nearly always work with
the JVM defaults, even if our servers are also used for other tasks (but it
may very well be that you do need to free memory in your case).
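
To make the difference between "used" and "reserved" memory concrete, here is a minimal Java sketch. It is not BaseX code: the class name HeapReport and its helper methods are made up for illustration, and it relies only on the standard java.lang.Runtime API. It prints the heap occupied by objects, the heap the JVM currently holds from the operating system, and the ceiling that -Xmx (or the JVM default) imposes.

```java
public class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Heap bytes currently occupied by objects (live or not yet collected). */
    static long usedBytes(Runtime rt) {
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Formats a byte count as whole mebibytes. */
    static String mb(long bytes) {
        return (bytes / MB) + " MB";
    }

    /** Summary: used heap, heap currently reserved by the JVM, and the maximum heap. */
    static String report(Runtime rt) {
        return "used=" + mb(usedBytes(rt))
            + " reserved=" + mb(rt.totalMemory())
            + " max=" + mb(rt.maxMemory());
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("start:            " + report(rt));

        // Allocate 64 blocks of 1 MB each to make the heap grow.
        byte[][] blocks = new byte[64][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[(int) MB];
        }
        System.out.println("after allocation: " + report(rt));

        // Drop the references and ask for a collection. System.gc() is only
        // a hint, and the reserved heap does not necessarily shrink afterwards.
        blocks = null;
        System.gc();
        System.out.println("after gc:         " + report(rt));
    }
}
```

Running it once with the JVM defaults and once with an explicit -Xmx value shows how the "max" figure changes. Whether "reserved" drops again after the collection depends on the garbage collector and JVM version in use, so the numbers are best read as rough indicators.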





On 04.11.2017 at 7:41 PM, "Dinu Marina" <dinumar...@gmail.com> wrote:

> So indeed usedmemory says it's about 3M. If that's the heap size, does
> this mean the memory is hogged in the JVM?
> That's why I tried to use
>
> -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:+UseSerialGC
>
> it says here http://www.stefankrause.net/wp/?p=14 that the serial GC does
> return memory to the system...
> It would sure be a nice thing to have, since the server is idle 23 hours a
> day. I thought about restarting the server, but there can be async requests
> coming in from different clients, so a restart mechanism would probably
> involve an external sync mechanism.
>
>
> On 04.11.2017 19:40, Christian Grün wrote:
>
>> No, actually the memory is not freed even after CREATE DB, I was watching
>>> another java process; it does vary a little at import end (to about
>>> 800M).
>>> So this problem seems to be common.
>>>
>> This behavior is common indeed, and it is not related to BaseX, but to
>> the Java virtual machine in general. Garbage collection is a very
>> complex process, and allocated memory won’t automatically be freed
>> after a memory-consuming thread has finished, but only if it is
>> actually required by another thread.
>>
>> Q{java:java.lang.System}gc()
>>> Stopped at , 1/29:
>>> Unknown command: Q{java:java.lang.System}gc(). Try HELP.
>>>
>> The string needs to be run as XQuery expression (see my initial mail).
>> If you want to run it on command-line, you will need to use the XQUERY
>> command. Find more information on BaseX commands and command-line
>> processing in our Wiki [1,2].
>>
>> The following query will give you some idea of the current memory
>> consumption:
>>
>>    (1 to 3) ! Q{java:java.lang.System}gc(),
>>    string(db:system()//usedmemory)
>>
>> It returns the value computed via [3] (see [4] as well). The result is
>> just a rough guess (it’s generally difficult to compute something like
>> the “real” memory consumption of a JVM), but it might suffice to
>> detect real memory leaks. If it turns out that this query yields a
>> really large value (e.g. > 1gb) after creating a database and adding a
>> zip file, then we might need to do something about it.
>>
>> Hope this helps,
>> Christian
>>
>> [1] http://docs.basex.org/wiki/Commands
>> [2] http://docs.basex.org/wiki/Command-Line_Options
>> [3] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Performance.java#L68
>> [4] https://stackoverflow.com/questions/37916136/how-to-calculate-memory-usage-of-java-program
>>
>>
>>
>> Dinu
>>>
>>>
>>> On 04.11.2017 19:02, Christian Grün wrote:
>>>
>>> Fine. One more question: How do you measure the "memory leak" on
>>> command-line, and are you sure that this value is comparable to the value
>>> that is shown in the bottom bar of the BaseX GUI?
>>>
>>>
>>>
>>> On 04.11.2017 at 5:58 PM, "Dinu Marina" <dinumar...@gmail.com> wrote:
>>>
>>>> Indeed, I use the create function from the GUI, I just assumed it's the
>>>> same 2 separate operations.
>>>>
>>>> Indeed, with CREATE DB it doesn't get out of memory at 1G. And it also
>>>> gets GC'ed and returned to system afterwards with no additional
>>>> intervention, after CREATE DB memory shrinks immediately back to ~30M.
>>>>
>>>> So confirmed, huge memory usage and memory "leak" (or whatever it is) is
>>>> linked to ADD only.
>>>>
>>>> Thanks,
>>>> Dinu
>>>>
>>>>
>>>> On 04.11.2017 18:46, Christian Grün wrote:
>>>>
>>>>> Hi Dinu,
>>>>>
>>>>> yes, I have downloaded the file.
>>>>>
>>>>> Just one more question:
>>>>>
>>>>> 2) using basexclient:
>>>>>>
>>>>>> CHECK somedb
>>>>>> ADD /path/to/1_feed.zip
>>>>>>
>>>>> If you use the GUI, do you really add your zip file to an existing
>>>>> database, or do you specify it as initial input when creating a new
>>>>> database? The latter option is definitely more efficient, and the
>>>>> command-line equivalent would be
>>>>>
>>>>>     CREATE DB somedb /path/to/1_feed.zip
>>>>>
>>>>> For adding resources to existing databases, enabling ADDCACHE can help
>>>>> [1].
>>>>>
>>>>> Cheers,
>>>>> Christian
>>>>>
>>>>> [1] http://docs.basex.org/wiki/Options#ADDCACHE
>>>>>
>>>>>
>>>>>
>>>>> Result:
>>>>>> Out of Main Memory.
>>>>>>
>>>>>> To reproduce 2), start server with -Xmx2048m, repeat operations, then
>>>>>> drop db, close client, check server memory usage.
>>>>>>
>>>>>> Thanks,
>>>>>> Dinu
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 04.11.2017 18:18, Christian Grün wrote:
>>>>>>
>>>>>>>> The fact is, the GUI runs with no problem with -Xmx512M to do the same
>>>>>>>> thing, while basexclient fails without -Xmx2048M.
>>>>>>>>
>>>>>>> That’s surprising indeed – mostly because I would have expected the
>>>>>>> BaseX client to always consume a small and constant amount of memory
>>>>>>> (the BaseX server instance should be the process to consume all the
>>>>>>> memory). I did some quick tests with large zipped input, but I failed
>>>>>>> to reproduce the behavior you described. Feel free to provide me with
>>>>>>> a step-by-step guide.
>>>>>>>
>>>>>>>> I will try that, thanks, but shouldn't this be the case automatically?
>>>>>>>> Since I assume BaseX does free references to data structures, at least
>>>>>>>> to a dropped DB?
>>>>>>>>
>>>>>>> Absolutely. Anything that’s reproducible is welcome.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 04.11.2017 18:00, Christian Grün wrote:
>>>>>>>>
>>>>>>>>> Hi Dinu,
>>>>>>>>>
>>>>>>>>> Question 1:
>>>>>>>>>>
>>>>>>>>> Memory consumption of the BaseX GUI is similar to that on the command
>>>>>>>>> line, but it may be due to garbage collection that some memory will be
>>>>>>>>> freed.
>>>>>>>>> How do you add documents outside the GUI?
>>>>>>>>>
>>>>>>>>> Question 2:
>>>>>>>>>>
>>>>>>>>> If a certain amount of memory is reserved by Java’s virtual machine,
>>>>>>>>> it may still be used by other applications on your system (provided
>>>>>>>>> that the memory can be freed by garbage collection). You can enforce
>>>>>>>>> some GC calls by running the following XQuery expression (this should
>>>>>>>>> only be done for testing purposes):
>>>>>>>>>
>>>>>>>>>       (1 to 5) ! Q{java:java.lang.System}gc()
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Christian
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> After the data is extracted, it's no longer needed and I DROP the
>>>>>>>>>> DB; the connection is also closed. But memory (the huge 2G mentioned
>>>>>>>>>> above) is never returned to the system.
>>>>>>>>>>
>>>>>>>>>> The script I use to run BaseX is:
>>>>>>>>>>
>>>>>>>>>> export BASEX_JVM="-Xmx2048m -XX:MinHeapFreeRatio=10
>>>>>>>>>> -XX:MaxHeapFreeRatio=20
>>>>>>>>>> -XX:+UseSerialGC -Dorg.basex.LOG=false
>>>>>>>>>> -Dorg.basex.DBPATH=/var/basex/data
>>>>>>>>>> -Dorg.basex.REPOPATH=/var/basex/repo"
>>>>>>>>>> BaseX/bin/basexserver -S
>>>>>>>>>>
>>>>>>>>>> So basically I tried specifying MaxHeapFreeRatio and SerialGC for
>>>>>>>>>> Java, but it's no improvement, so I assume the memory isn't hogged
>>>>>>>>>> in Java... Is there a way to free up the memory once operations
>>>>>>>>>> complete? (As mentioned above, "complete" means the created DB is
>>>>>>>>>> dropped and the connection closed, waiting for another batch to
>>>>>>>>>> start over.)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Dinu
>>>>>>>>>>
>>>>>>>>>>
>
