GitHub user afs added a comment to the discussion: How to load big dataset to 
new database

You'd need to prepare a quads file for tdb2.xloader. (PubChem happens to be 
parse-clean, which is not a given for many large datasets. In general, checking 
that the data parses cleanly before loading is a good idea.)
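As a minimal sketch of that workflow, assuming a Jena installation on the path; the input file names are placeholders:

```shell
# Check the inputs parse cleanly before committing to a long load
# (riot exits non-zero on a parse error).
riot --validate input-part1.ttl input-part2.ttl

# Convert the inputs to a single N-Quads file for tdb2.xloader.
riot --output=nq input-part1.ttl input-part2.ttl > data.nq

# Bulk-load into a fresh TDB2 database directory.
tdb2.xloader --loc DB2 data.nq
```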

For the memory, yes, that makes sense. It's memory-mapped file caching by the 
OS.

TDB databases are portable - you can build on one machine and move it to 
another. For example, get a large-memory machine, maybe with local SSD, build 
the database and then copy it. The databases are big - copying is not instant. 
Caveat: if copying them around, preserve sparse files, e.g. `rsync --sparse`.


GitHub link: 
https://github.com/apache/jena/discussions/3701#discussioncomment-15682995

----
This is an automatically sent email for [email protected].
To unsubscribe, please send an email to: [email protected]

