Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Christian Grün
If you don't need to add documents to existing databases (or if you use the
ADDCACHE option for that), I recommend that you simply start with the JVM
defaults, create a sample database from, say, 10 GB of XML input, and
see if it works out of the box. Only if it doesn’t should it be necessary to
specify -Xmx. It may also be worth analysing whether you run into any
trouble at all when parts of your memory are reserved by the JVM. In
our own use cases (with gigabytes of XML data), we nearly always work with
the JVM defaults, even if our servers are also used for other tasks (but it
may very well be that there is some need to free memory in your case).
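
As a quick check before touching -Xmx, a small Java program can print the heap limits that the JVM defaults give you on a particular machine. This is only a minimal sketch: the class name HeapLimits and its helper methods are made up for illustration and are not part of BaseX.

```java
// Hypothetical helper (not part of BaseX): reports the heap limits of the running JVM.
class HeapLimits {
  /** Upper bound the heap may grow to, in bytes (reflects -Xmx or the JVM default). */
  static long maxHeapBytes() {
    return Runtime.getRuntime().maxMemory();
  }

  /** Heap currently reserved by the JVM, in bytes. */
  static long committedHeapBytes() {
    return Runtime.getRuntime().totalMemory();
  }

  /** Converts bytes to whole mebibytes. */
  static long toMiB(long bytes) {
    return bytes / (1024L * 1024L);
  }

  public static void main(String[] args) {
    System.out.println("max heap:       " + toMiB(maxHeapBytes()) + " MiB");
    System.out.println("committed heap: " + toMiB(committedHeapBytes()) + " MiB");
  }
}
```

Started as "java HeapLimits.java" (Java 11 or later) it shows the defaults; adding -Xmx2048m to the call should change the first figure accordingly.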





On 04.11.2017 at 7:41 PM, "Dinu Marina" wrote:

> So indeed usedmemory says it's about 3M. If that's the heap size, does
> this mean the memory is hogged in the JVM?
> That's why I tried to use
>
> -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:+UseSerialGC
>
> it says here http://www.stefankrause.net/wp/?p=14 that the serial GC does
> return memory to the system...
> It would sure be a nice thing to have, since the server is idle 23 hours a
> day. I thought about restarting the server, but there can be async requests
> coming in from different clients, so a restart mechanism would probably
> involve an external sync mechanism.

Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Dinu Marina
So indeed usedmemory says it's about 3M. If that's the heap size, does 
this mean the memory is hogged in the JVM?

That's why I tried to use

-XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20 -XX:+UseSerialGC

it says here http://www.stefankrause.net/wp/?p=14 that the serial GC 
does return memory to the system...
It would sure be a nice thing to have, since the server is idle 23 hours 
a day. I thought about restarting the server, but there can be async 
requests coming in from different clients, so a restart mechanism would 
probably involve an external sync mechanism.
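
One way to check whether flags such as the ones above actually reached the JVM, and how much heap the JVM has reserved compared with what is in use, is the standard java.lang.management API. A minimal sketch follows; JvmFlagCheck is a made-up class name, and as a standalone program it reports on its own JVM rather than on a running BaseX server.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.util.List;

// Hypothetical diagnostic class for illustration (not part of BaseX).
class JvmFlagCheck {
  /** Arguments passed to this JVM, e.g. -Xmx2048m or -XX:+UseSerialGC. */
  static List<String> jvmArguments() {
    return ManagementFactory.getRuntimeMXBean().getInputArguments();
  }

  /** Heap figures as seen by the JVM itself. */
  static MemoryUsage heapUsage() {
    return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
  }

  public static void main(String[] args) {
    System.out.println("JVM arguments: " + jvmArguments());
    MemoryUsage heap = heapUsage();
    System.out.println("used:      " + heap.getUsed());       // live objects plus garbage not yet collected
    System.out.println("committed: " + heap.getCommitted());  // reserved for the heap by the JVM
    System.out.println("max:       " + heap.getMax());        // -1 if undefined
  }
}
```

A large gap between "committed" and "used" would suggest that the memory is held by the JVM as free heap rather than by live data.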




Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Christian Grün
If you have indexing enabled, more temporary files will be written to disk.
Memory usage won't be completely constant, but surely better than linear.



On 04.11.2017 at 7:17 PM, "Dinu Marina" wrote:

Having indexes is a must, so, without disabling them, will the memory
requirements grow linearly with the data size? Or does it work in batches, as
I understand from "will write to disk if memory is too low"? This would
qualify as constant memory for my purpose :) The idea is, if files get 10x
bigger, will I need to put in -Xmx10G? Or will the 1G still be enough?


On Nov 4, 2017 19:50, "Christian Grün"  wrote:

Hi Dinu,

> So, to make an architectural decision, can you tell me if CREATE DB is
> running in quasi-constant space?

If you create new databases, it’s mostly the indexing of texts and
attributes that requires additional memory. Memory usage is not
constant, but the standard value indexes will automatically write
temporary data structures to disk if your memory gets low. If you
disable the text and attribute index, you can build initial databases
from input up to 500 GB [1].

Cheers,
Christian

[1] http://docs.basex.org/wiki/Statistics


Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Dinu Marina
Having indexes is a must, so, without disabling them, will the memory
requirements grow linearly with the data size? Or does it work in batches, as
I understand from "will write to disk if memory is too low"? This would
qualify as constant memory for my purpose :) The idea is, if files get 10x
bigger, will I need to put in -Xmx10G? Or will the 1G still be enough?

On Nov 4, 2017 19:50, "Christian Grün"  wrote:

Hi Dinu,

> So, to make an architectural decision, can you tell me if CREATE DB is
> running in quasi-constant space?

If you create new databases, it’s mostly the indexing of texts and
attributes that requires additional memory. Memory usage is not
constant, but the standard value indexes will automatically write
temporary data structures to disk if your memory gets low. If you
disable the text and attribute index, you can build initial databases
from input up to 500 GB [1].

Cheers,
Christian

[1] http://docs.basex.org/wiki/Statistics


Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Christian Grün
Hi Dinu,

> So, to make an architectural decision, can you tell me if CREATE DB is
> running in quasi-constant space?

If you create new databases, it’s mostly the indexing of texts and
attributes that requires additional memory. Memory usage is not
constant, but the standard value indexes will automatically write
temporary data structures to disk if your memory gets low. If you
disable the text and attribute index, you can build initial databases
from input up to 500 GB [1].
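
The general idea of writing temporary structures to disk can be pictured with a small Java sketch. It is a generic illustration under stated assumptions, not the BaseX implementation: the class SpillingBuffer is made up, and it spills after a fixed number of values instead of watching the actual memory situation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Generic illustration only; this is not how BaseX is implemented.
class SpillingBuffer {
  private final int maxInMemory;                      // number of values kept before a spill
  private final List<String> buffer = new ArrayList<>();
  private final List<Path> runs = new ArrayList<>();  // sorted temporary files

  SpillingBuffer(int maxInMemory) {
    this.maxInMemory = maxInMemory;
  }

  /** Adds a value; writes the buffer to a temporary file once it is full. */
  void add(String value) {
    buffer.add(value);
    if (buffer.size() >= maxInMemory) spill();
  }

  /** Sorts the buffered values, writes them to disk and empties the buffer. */
  private void spill() {
    Collections.sort(buffer);
    try {
      Path run = Files.createTempFile("index-run", ".txt");
      run.toFile().deleteOnExit();
      Files.write(run, buffer, StandardCharsets.UTF_8);
      runs.add(run);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    buffer.clear();
  }

  int bufferedValues() { return buffer.size(); }
  int spilledRuns() { return runs.size(); }

  public static void main(String[] args) {
    SpillingBuffer sb = new SpillingBuffer(1000);
    for (int i = 0; i < 10_500; i++) sb.add("value" + i);
    System.out.println("runs on disk:    " + sb.spilledRuns());
    System.out.println("still in memory: " + sb.bufferedValues());
  }
}
```

With such a scheme the in-memory part stays bounded by the threshold, while the number of temporary files grows with the input; a final step that merges the sorted runs is omitted here.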

Cheers,
Christian

[1] http://docs.basex.org/wiki/Statistics


Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Dinu Marina

Ok, thanks, I'll try to shed some more light on it.

So, to make an architectural decision, can you tell me if CREATE DB is 
running in quasi-constant space? So that I may rely on it if feeds get 
10x larger?


Dinu



Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Christian Grün
> No, actually the memory is not freed even after CREATE DB, I was watching
> another java process; it does vary a little at import end (to about 800M).
> So this problem seems to be common.

This behavior is common indeed, and it is not related to BaseX, but to
the Java virtual machine in general. Garbage collection is a very
complex process, and allocated memory won’t automatically be freed
after a memory-consuming thread has finished, but only if it is
actually required by another thread.

> Q{java:java.lang.System}gc()
> Stopped at , 1/29:
> Unknown command: Q{java:java.lang.System}gc(). Try HELP.

The string needs to be run as XQuery expression (see my initial mail).
If you want to run it on command-line, you will need to use the XQUERY
command. Find more information on BaseX commands and command-line
processing in our Wiki [1,2].

The following query will give you some idea of the current memory consumption:

  (1 to 3) ! Q{java:java.lang.System}gc(),
  string(db:system()//usedmemory)

It returns the value computed via [3] (see [4] as well). The result is
just a rough guess (it’s generally difficult to compute something like
the “real” memory consumption of a JVM), but it might suffice to
detect real memory leaks. If it turns out that this query yields a
really large value (e.g. > 1 GB) after creating a database and adding a
zip file, then we might need to do something about it.
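
For readers who prefer to see the idea in Java: a minimal sketch of such an estimate, assuming the value is derived as total heap minus free heap via java.lang.Runtime (the common approach discussed in [4]). The class UsedMemory and its methods are made up for illustration and are not taken from the BaseX sources.

```java
// Hypothetical class for illustration; not the BaseX implementation.
class UsedMemory {
  /** Rough estimate of used heap: reserved heap minus the free part of it. */
  static long usedBytes() {
    Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
  }

  /** Requests a few collections first, similar to (1 to 3) ! ...gc() in the query. */
  static long usedBytesAfterGc(int gcCalls) {
    for (int i = 0; i < gcCalls; i++) {
      System.gc();  // only a hint; the JVM may ignore it
    }
    return usedBytes();
  }

  public static void main(String[] args) {
    byte[] block = new byte[64 * 1024 * 1024];  // temporary 64 MiB allocation
    System.out.println("with block of " + block.length + " bytes: " + usedBytes());
    block = null;                               // drop the only reference
    System.out.println("after release and GC: " + usedBytesAfterGc(3));
    System.out.println("reserved heap:        " + Runtime.getRuntime().totalMemory());
  }
}
```

The used value will typically drop after the release, while the reserved heap may stay larger; that gap is what tools like ps or top still attribute to the process.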

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Commands
[2] http://docs.basex.org/wiki/Command-Line_Options
[3] https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/Performance.java#L68
[4] https://stackoverflow.com/questions/37916136/how-to-calculate-memory-usage-of-java-program




Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Dinu Marina

I also tried your snippet, but I get:

Q{java:java.lang.System}gc()
Stopped at , 1/29:
Unknown command: Q{java:java.lang.System}gc(). Try HELP.


On 04.11.2017 19:02, Christian Grün wrote:
> Fine. One more question: How do you measure the "memory leak" on
> command-line, and are you sure that this value is comparable to the
> value that is shown in the bottom bar of the BaseX GUI?





Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Dinu Marina

Going back on my previous mail:

No, actually the memory is not freed even after CREATE DB, I was 
watching another java process; it does vary a little at import end (to 
about 800M). So this problem seems to be common.


I track system memory usage, via ps / top. I don't know how to measure 
actual heap usage on the server, just the system memory consumed.


Dinu


On 04.11.2017 19:02, Christian Grün wrote:
> Fine. One more question: How do you measure the "memory leak" on
> command-line, and are you sure that this value is comparable to the
> value that is shown in the bottom bar of the BaseX GUI?





Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Christian Grün
Fine. One more question: How do you measure the "memory leak" on
command-line, and are you sure that this value is comparable to the value
that is shown in the bottom bar of the BaseX GUI?



On 04.11.2017 at 5:58 PM, "Dinu Marina" wrote:

> Indeed, I use the create function from the GUI, I just assumed it's the
> same 2 separate operations.
>
> Indeed, with CREATE DB it doesn't get out of memory at 1G. And it also
> gets GC'ed and returned to system afterwards with no additional
> intervention, after CREATE DB memory shrinks immediately back to ~30M.
>
> So confirmed, huge memory usage and memory "leak" (or whatever it is) is
> linked to ADD only.
>
> Thanks,
> Dinu
>
>
> On 04.11.2017 18:46, Christian Grün wrote:
>
>> Hi Dinu,
>>
>> yes, I have downloaded the file.
>>
>> Just one more question:
>>
>>> 2) using basexclient:
>>>
>>> CHECK somedb
>>> ADD /path/to/1_feed.zip
>>>
>> If you use the GUI, do you really add your zip file to an existing
>> database, or do you specify it as initial input when creating a new
>> database? The latter option is definitely more efficient, and the
>> command-line equivalent would be
>>
>>    CREATE DB somedb /path/to/1_feed.zip
>>
>> For adding resources to existing databases, enabling ADDCACHE can help
>> [1].
>>
>> Cheers,
>> Christian
>>
>> [1] http://docs.basex.org/wiki/Options#ADDCACHE
>>
>>
>>
>>> Result:
>>> Out of Main Memory.
>>>
>>> To reproduce 2), start server with -Xmx2048m, repeat operations, then
>>> drop
>>> db, close client, check server memory usage.
>>>
>>> Thanks,
>>> Dinu
>>>
>>>
>>>
>>> On 04.11.2017 18:18, Christian Grün wrote:
>>>
>>>>> The fact is, the GUI runs with no problem with -Xmx512M to do the same
>>>>> thing, while basexclient fails without -Xmx2048M.
>>>>
>>>> That’s surprising indeed – mostly because I would have expected the
>>>> BaseX client to always consume a small and constant amount of memory
>>>> (the BaseX server instance should be the process to consume all the
>>>> memory). I did some quick tests with large zipped input, but I failed
>>>> to reproduce the behavior you described. Feel free to provide me with
>>>> a step-by-step guide.
>>>>
>>>>> I will try that, thanks, but shouldn't this be the case automatically?
>>>>> Since I assume BaseX does free references to data structures, at least
>>>>> to a dropped DB?
>>>>
>>>> Absolutely. Anything that’s reproducible is welcome.
>>>>
>>>>
>>>>
>>>>> On 04.11.2017 18:00, Christian Grün wrote:
>>>>>>
>>>>>> Hi Dinu,
>>>>>>
>>>>>>> Question 1:
>>>>>>
>>>>>> Memory consumption of the BaseX GUI is similar as on command-line, but
>>>>>> it may be due to garbage collection that some memory will be freed.
>>>>>> How do you add documents outside the GUI?
>>>>>>
>>>>>>> Question 2:
>>>>>>
>>>>>> If a certain amount of memory is reserved by Java’s virtual machine,
>>>>>> it may still be used by other applications on your system (provided
>>>>>> that the memory can be freed by garbage collection). You can enforce
>>>>>> some GC calls by running the following XQuery expression (this should
>>>>>> only be done for testing purposes):
>>>>>>
>>>>>>   (1 to 5) ! Q{java:java.lang.System}gc()
>>>>>>
>>>>>> Best,
>>>>>> Christian
>>>>>>
>>>>>>
>>>>>>> After the data is extracted, it's no longer needed and I DROP the DB;
>>>>>>> also connection is closed. But memory (the huge 2G mentioned above) is
>>>>>>> never returned to the system.
>>>>>>>
>>>>>>> The script I use to run BaseX is:
>>>>>>>
>>>>>>> export BASEX_JVM="-Xmx2048m -XX:MinHeapFreeRatio=10
>>>>>>> -XX:MaxHeapFreeRatio=20
>>>>>>> -XX:+UseSerialGC -Dorg.basex.LOG=false
>>>>>>> -Dorg.basex.DBPATH=/var/basex/data
>>>>>>> -Dorg.basex.REPOPATH=/var/basex/repo"
>>>>>>> BaseX/bin/basexserver -S
>>>>>>>
>>>>>>> So basically I tried specifying MaxHeapFreeRatio and SerialGC for java,
>>>>>>> but it's no improvement and it doesn't help so I assume the memory isn't
>>>>>>> hogged in java... is there a way to free up the memory once operations
>>>>>>> complete (like mentioned above, "complete" means created DB is dropped,
>>>>>>> connection closed, waiting for another batch to start over).
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Dinu
>>>>>>>


Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Dinu Marina
Indeed, I use the create function from the GUI, I just assumed it's the 
same 2 separate operations.


Indeed, with CREATE DB it doesn't get out of memory at 1G. And it also 
gets GC'ed and returned to system afterwards with no additional 
intervention, after CREATE DB memory shrinks immediately back to ~30M.


So confirmed, huge memory usage and memory "leak" (or whatever it is) is 
linked to ADD only.


Thanks,
Dinu







Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Christian Grün
> The fact is, the GUI runs with no problem with -Xmx512M to do the same
> thing, while basexclient fails without -Xmx2048M.

That’s surprising indeed – mostly because I would have expected the
BaseX client to always consume a small and constant amount of memory
(the BaseX server instance should be the process to consume all the
memory). I did some quick tests with large zipped input, but I failed
to reproduce the behavior you described. Feel free to provide me with
a step-by-step guide.

> I will try that, thanks, but shouldn't this be the case automatically? Since
> I assume BaseX does free references to data structures, at least to a
> dropped DB?

Absolutely. Anything that’s reproducible is welcome.



> On 04.11.2017 18:00, Christian Grün wrote:
>>
>> Hi Dinu,
>>
>>> Question 1:
>>
>> Memory consumption of the BaseX GUI is similar as on command-line, but
>> it may be due to garbage collection that some memory will be freed.
>> How do you add documents outside the GUI?
>>
>>> Question 2:
>>
>> If a certain amount of memory is reserved by Java’s virtual machine,
>> it may still be used by other applications on your system (provided
>> that the memory can be freed by garbage collection). You can enforce
>> some GC calls by running the following XQuery expression (this should
>> only be done for testing purposes):
>>
>>(1 to 5) ! Q{java:java.lang.System}gc()
>>
>> Best,
>> Christian
>>
>>
>>> After the data is extracted, it's no longer needed and I DROP the DB;
>>> also
>>> connection is closed. But memory (the huge 2G mentioned above) is never
>>> returned to the system.
>>>
>>> The script I use to run BaseX is:
>>>
>>> export BASEX_JVM="-Xmx2048m -XX:MinHeapFreeRatio=10
>>> -XX:MaxHeapFreeRatio=20
>>> -XX:+UseSerialGC -Dorg.basex.LOG=false -Dorg.basex.DBPATH=/var/basex/data
>>> -Dorg.basex.REPOPATH=/var/basex/repo"
>>> BaseX/bin/basexserver -S
>>>
>>> So basically I tried specifying MaxHeapFreeRatio and SerialGC for java,
>>> but
>>> it's no improvement and it doesn't help so I assume the memory isn't
>>> hogged
>>> in java... is there a way to free up the memory once operations complete
>>> (like mentioned above, "complete" means created DB is dropped, connection
>>> closed, waiting for another batch to start over).
>>>
>>> Thanks,
>>> Dinu
>>>
>


Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Dinu Marina

Hi,

1)
On command-line I run:

basexclient -U user -P pass -c "CHECK ""dbname""; DELETE /; ADD ""file.zip"""


(the zip contains XML files)

The fact is, the GUI runs with no problem with -Xmx512M to do the same
thing, while basexclient fails without -Xmx2048M. The GUI also seems to
reclaim all memory used in the import process immediately; the bottom
bar shows a usage of 40M after import.


Also, is this memory usage normal? Isn't there some kind of serial batch
import process? Memory usage this high looks almost as if the whole XML
DOM were reconstructed in RAM, which will be a problem because we
expect even larger feeds, on the order of 5 times bigger.


2)
I will try that, thanks, but shouldn't this happen automatically?
I assume BaseX does release its references to data structures, at least
those of a dropped DB. If not, no amount of GC is likely to help either :)


Thanks,
Dinu


On 04.11.2017 18:00, Christian Grün wrote:

> Hi Dinu,
>
>> Question 1:
>
> Memory consumption of the BaseX GUI is similar as on command-line, but
> it may be due to garbage collection that some memory will be freed.
> How do you add documents outside the GUI?
>
>> Question 2:
>
> If a certain amount of memory is reserved by Java’s virtual machine,
> it may still be used by other applications on your system (provided
> that the memory can be freed by garbage collection). You can enforce
> some GC calls by running the following XQuery expression (this should
> only be done for testing purposes):
>
>    (1 to 5) ! Q{java:java.lang.System}gc()
>
> Best,
> Christian
>
>> After the data is extracted, it's no longer needed and I DROP the DB; also
>> connection is closed. But memory (the huge 2G mentioned above) is never
>> returned to the system.
>>
>> The script I use to run BaseX is:
>>
>> export BASEX_JVM="-Xmx2048m -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20
>> -XX:+UseSerialGC -Dorg.basex.LOG=false -Dorg.basex.DBPATH=/var/basex/data
>> -Dorg.basex.REPOPATH=/var/basex/repo"
>> BaseX/bin/basexserver -S
>>
>> So basically I tried specifying MaxHeapFreeRatio and SerialGC for java, but
>> it's no improvement and it doesn't help so I assume the memory isn't hogged
>> in java... is there a way to free up the memory once operations complete
>> (like mentioned above, "complete" means created DB is dropped, connection
>> closed, waiting for another batch to start over).
>>
>> Thanks,
>> Dinu





Re: [basex-talk] Memory requirements, release memory, heap shrink

2017-11-04 Thread Christian Grün
Hi Dinu,

> Question 1:

Memory consumption of the BaseX GUI is similar to that on the command
line; any difference may be due to garbage collection freeing some memory.
How do you add documents outside the GUI?

> Question 2:

If a certain amount of memory is reserved by Java’s virtual machine,
it may still be used by other applications on your system (provided
that the memory can be freed by garbage collection). You can request
a few GC runs with the following XQuery expression (this should
only be done for testing purposes):

  (1 to 5) ! Q{java:java.lang.System}gc()
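
For illustration only (not part of the original message): a minimal Java
sketch of the distinction discussed here, between heap memory the JVM has
reserved ("committed") and memory actually occupied by objects ("used").
The class name HeapReport and its methods are hypothetical; the Runtime
calls are standard Java. System.gc() is only a request to the JVM, so the
numbers depend on the JVM version, the collector, and the heap flags.

```java
import java.util.ArrayList;
import java.util.List;

public class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Bytes currently occupied by objects (live or not yet collected). */
    public static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Bytes the JVM has currently reserved for the heap. */
    public static long committedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** One-line summary of used, committed, and maximum heap size in MB. */
    public static String report(String label) {
        return String.format("%-10s used=%d MB, committed=%d MB, max=%d MB",
                label,
                usedBytes() / MB,
                committedBytes() / MB,
                Runtime.getRuntime().maxMemory() / MB);
    }

    public static void main(String[] args) {
        System.out.println(report("start"));

        // Allocate 16 MB in 1 MB blocks to make the heap grow a little.
        List<byte[]> blocks = new ArrayList<>();
        for (int i = 0; i < 16; i++) {
            blocks.add(new byte[(int) MB]);
        }
        System.out.println(report("allocated"));

        // Drop the only reference, then ask (not force) the JVM to collect.
        blocks = null;
        System.gc();
        System.out.println(report("after gc"));
    }
}
```

On many setups, "used" drops after the GC request while "committed" stays
where it was, and the committed size is closer to what the operating
system reports for the process.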

Best,
Christian


> After the data is extracted, it's no longer needed and I DROP the DB; also
> connection is closed. But memory (the huge 2G mentioned above) is never
> returned to the system.
>
> The script I use to run BaseX is:
>
> export BASEX_JVM="-Xmx2048m -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=20
> -XX:+UseSerialGC -Dorg.basex.LOG=false -Dorg.basex.DBPATH=/var/basex/data
> -Dorg.basex.REPOPATH=/var/basex/repo"
> BaseX/bin/basexserver -S
>
> So basically I tried specifying MaxHeapFreeRatio and SerialGC for java, but
> it's no improvement and it doesn't help so I assume the memory isn't hogged
> in java... is there a way to free up the memory once operations complete
> (like mentioned above, "complete" means created DB is dropped, connection
> closed, waiting for another batch to start over).
>
> Thanks,
> Dinu
>