Hi, I've been running Apache Jena Fuseki 4.5.0 in a Docker container. I've loaded data into it two ways: through the graph store protocol, and using tdb2.tdbloader before starting Fuseki. I've had no issues with either, but I'm interested in how the two methods differ.
With the graph store protocol, I can put larger RDF files 'close' to where the Docker container is running and handle any network issues, so the loads have been fine. Loading data this way is convenient and allows updates while Fuseki is running. Are indexes continually updated as more data is loaded through the graph store protocol? Are there any other disadvantages to this method, or reasons it may not be advised for large datasets?
Conversely, I'm aware that tdb2.tdbloader can load large datasets. Are there any reasons it should be used over the graph store protocol? Are there any other methods I should be considering (other than SPARQL INSERT)?
I'll also be running GeoSPARQL Jena for some instances, and will need to spatially index the data. I think this will necessitate using tdb2.tdbloader and generating the spatial index 'offline' before starting Jena/Fuseki. Or are there other ways?
Thanks,
David Habgood
