Hi, I'm new to BaseX and to XQuery, though I already know XPath. I'm evaluating
BaseX to store our XML files and run queries on them. We have to store about 1
million XML files per month. The files are small (~1 KB to 5 KB each), so our
case is: a high number of files, each of small size.

I've read that BaseX is scalable and has high performance, so it is probably a
good tool for us. But in the tests I'm doing, I'm getting an "Out of Main
Memory" error when loading a large number of XML files.

For example, if I create a new database ("testdb") and add 3 XML files, no
problem occurs. The files are stored correctly, and I can run queries on them.
Then, if I try to add 18000 XML files to the same database ("testdb") (using
GUI > Database > Properties > Add Resources), I see the coloured memory bar
grow and grow... until an error appears:

    Out of Main Memory.
    You can try to:
    - increase Java's heap size with the flag -Xmx<size>
    - deactivate the text and attribute indexes.

The text and attribute indexes are disabled, so they are not the cause. I also
increased the Java heap size with the -Xmx<size> flag (by editing the
basexgui.bat script), and the same error still happens.

Probably BaseX loads all files into main memory first, and then dumps them to
the database files. I don't think it should work that way. Each XML file
should be loaded into main memory, processed, and then written to the db
files, independently of the rest.
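
To illustrate the per-file behaviour I have in mind, here is a rough Python
sketch. It is not BaseX code and says nothing about how BaseX actually works
internally; the names iter_xml_files, load_one_by_one and store are
hypothetical, and store just stands in for "write this document to the
database". The point is only that one document at a time is held in memory.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def iter_xml_files(root_dir):
    """Yield paths of .xml files under root_dir, one at a time."""
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in sorted(filenames):
            if name.lower().endswith(".xml"):
                yield os.path.join(dirpath, name)


def load_one_by_one(root_dir, store):
    """Parse each file, pass it to `store`, and move on to the next one."""
    count = 0
    for path in iter_xml_files(root_dir):
        tree = ET.parse(path)        # only this one document is parsed here
        store(path, tree.getroot())  # hypothetical "write to database" step
        count += 1                   # `tree` is rebound on the next iteration
    return count


def demo():
    """Create five tiny XML files and load them one by one."""
    with tempfile.TemporaryDirectory() as tmp:
        for i in range(5):
            file_path = os.path.join(tmp, f"doc{i}.xml")
            with open(file_path, "w", encoding="utf-8") as f:
                f.write(f"<record id='{i}'/>")
        seen = []
        n = load_one_by_one(tmp, lambda path, root: seen.append(root.get("id")))
        return n, seen
```

With a loop like this, memory use should depend on the size of the largest
single file rather than on the total number of files, as long as the store
step does not keep references to earlier documents.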

So I have two questions:
1. Do I have to use a special way to add a large number of XML files?
2. Is BaseX sufficiently stable to store and manage our data (about 1 million
files will be added per month)?

Thank you for your help and for your great software, and excuse me if I am
asking questions that come up often.
_______________________________________________
BaseX-Talk mailing list
[email protected]
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
