GitHub user afs added a comment to the discussion: How to load big dataset to new database
You'd need to prepare a quads file for tdb2.xloader. (PubChem happens to be parse-clean, which is not a given for many large datasets. In general, checking the data before loading is a good idea.)

For the memory, yes, that makes sense. It's memory-mapped file caching by the OS.

TDB databases are portable - you can build on one machine and move the database to another. For example, get a large-memory machine, maybe with local SSD, build the database, and then copy it. The databases are big - copying is not instant.

Caveat - if copying them around, preserve sparse files, e.g. `rsync --sparse`.

GitHub link: https://github.com/apache/jena/discussions/3701#discussioncomment-15682995
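The check-then-load-then-copy workflow described above could be sketched as follows, assuming Apache Jena's command-line tools are on the PATH; the file name `pubchem.nq.gz`, the database location `/data/DB`, and the host `target-host` are illustrative placeholders, not values from the discussion.

```shell
# 1. Check the quads file parses cleanly before loading
#    (riot --validate parses only, producing no output data).
riot --validate pubchem.nq.gz

# 2. Bulk-load into a fresh TDB2 database directory with the xloader.
tdb2.xloader --loc /data/DB pubchem.nq.gz

# 3. Copy the finished database to the target machine, preserving
#    sparse files so the on-disk size does not balloon in transit.
rsync --sparse -av /data/DB/ target-host:/data/DB/
```

The validation pass is separate from loading on purpose: a parse error found mid-way through a multi-hour bulk load is far more expensive than one found up front.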
