Re: TDB2 merged

ajs6f Sat, 07 Oct 2017 06:47:01 -0700

Okay, that makes sense. We might even just swap the "namespaces" at some future point when TDB2 becomes the default, i.e. go to tdbquery being for TDB2 and there being a tdb1.tdbquery, as a stop on the road to deprecation.


ajs6f
Andy Seaborne wrote on 10/7/17 9:42 AM:



On 06/10/17 21:17, [email protected] wrote:

The commands are in the binary distribution "apache-jena" download but there are no script wrappers (easy to copy and fix though).


Just a thought -- maybe better to add flags to the current scripts? Having all-new loader scripts for TDB2 would make for three different bulk loader scripts...


Maybe, though it's not so simple a thing to do, as the scripts are a general wrapper template to call the Java code.

For now, the TDB2 commands are of the form "tdb2.tdb*"

tdb2.tdbquery ...

At some point, detecting the database type would be great, but it is not on the critical path for 3.5.0.

    Andy



ajs6f

Andy Seaborne wrote on 10/6/17 7:36 AM:

That would be very helpful.

"documentation" is a task in the next few days. It's the block on sending any 
messages to users@ etc about it.


The raw material is in git:

https://github.com/apache/jena/blob/master/jena-db/use-fuseki-tdb2.md
https://github.com/apache/jena/blob/master/jena-db/use-tdb2-cmds.md

The commands are in the binary distribution "apache-jena" download but there are no script wrappers (easy to copy and fix though).

Either run from development or

java -cp 'DIR/lib/*' tdb2.tdbloader ... args ...
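Since there are no script wrappers yet, a small launcher can stand in for one. The following is a minimal sketch, not part of the Jena distribution: the function names and the `jena_home` parameter are hypothetical, and it only assumes the "java -cp 'DIR/lib/*' tdb2.<command>" form shown above.

```python
import subprocess


def tdb2_command(jena_home, command, args):
    """Build the argv list for one TDB2 command, e.g. command="tdbloader".

    jena_home is a hypothetical name for the unpacked "apache-jena"
    directory (the DIR in the command line above).
    """
    # The trailing '*' is passed to java unexpanded; java itself expands
    # a classpath wildcard to every jar in that directory.
    classpath = jena_home.rstrip("/") + "/lib/*"
    return ["java", "-cp", classpath, "tdb2." + command] + list(args)


def run_tdb2(jena_home, command, args):
    """Run the command and return the java process exit code."""
    # No shell is involved, so nothing globs or re-quotes the arguments.
    return subprocess.call(tdb2_command(jena_home, command, args))
```

Calling run_tdb2 with the directory, "tdbloader", and the loader arguments would then run the same java command line as above.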

some of my data files are too big to
be loaded via the Graph Store API.


From TDB2 and Fuseki's point of view, that's no longer true.
You can (should be able to) load any amount.

The fuseki-basic server also has TDB2 in it, so if you are doing everything script-driven, you can run that with "--conf config-tdb2.ttl".

There is no progress indicator in the server log so you may wish to set some kind of verbose option in the sender.

    Andy

Uploading large files:

The UI does this all quite well.

What's the magic for a command line/scripted process?

It needs a tool that does not buffer or inspect the file or otherwise try to be helpful.

Anyone know of good tools for this?

I haven't managed to work out which set of "curl" arguments do this without buffering the file (--data* seem to buffer the file; -F is a form upload, not pure POST).

This seems to work:

wget --post-file=/home/afs/Datasets/BSBM/bsbm-200m.nt --header 'Content-type: application/n-triples' http://localhost:3030/data

200M BSBM (49Gbytes) loaded at 42K triples/s.
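For a scripted upload without wget, the same kind of plain, unbuffered POST can be sketched with Python's standard library. This is an illustration under stated assumptions, not a tested Fuseki client: the function name is hypothetical, and it relies on http.client sending a file object in fixed-size blocks rather than reading it into memory first.

```python
import http.client
import os
from urllib.parse import urlsplit


def post_file_streaming(url, path, content_type="application/n-triples"):
    """POST the file at `path` to `url` as the raw request body.

    Returns the HTTP status code. The file is never loaded whole.
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80)
    headers = {
        "Content-Type": content_type,
        # An explicit length means the body is sent as-is, not chunked.
        "Content-Length": str(os.path.getsize(path)),
    }
    try:
        with open(path, "rb") as body:
            # A file object passed as the body is read and sent block by block.
            conn.request("POST", parts.path or "/", body=body, headers=headers)
            response = conn.getresponse()
            response.read()  # drain the reply so the connection closes cleanly
            return response.status
    finally:
        conn.close()
```

With the dataset URL from the wget line, the call would look like post_file_streaming("http://localhost:3030/data", "bsbm-200m.nt"). As with wget, there is no progress output unless the caller adds it.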

The content length in the Fuseki log is reported wrongly (1002691465 ... int/long error) but the triple count is right.
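To illustrate what an int/long error of this kind does to a byte count, here is a small sketch of narrowing a 64-bit length to a signed 32-bit integer. The sizes used are hypothetical round numbers, not the size of the BSBM file, and the sketch assumes plain two's-complement truncation; the actual cause in the server is not shown in this thread.

```python
def as_int32(n):
    """Return n as it would read after truncation to a signed 32-bit int."""
    n &= 0xFFFFFFFF  # keep only the low 32 bits
    # Values with the top bit set are negative in two's complement.
    return n - (1 << 32) if n >= (1 << 31) else n


# Hypothetical lengths: anything under 2 GiB survives, larger values do not.
print(as_int32(123))         # 123
print(as_int32(3 * 2**30))   # -1073741824 (3 GiB reads as negative)
print(as_int32(5 * 2**30))   # 1073741824 (5 GiB reads as 1 GiB)
```

A multi-gigabyte upload can therefore show up in a log as a plausible-looking but much smaller number.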

It does ruin the interactive performance of the machine!

s-post crashes immediately if given a large file - don't know why.

On 06/10/17 07:50, Osma Suominen wrote:

Excellent!

I have a couple of Fuseki installations where I could test drive this. I'd just need to know how to do the configuration, and also a tool like tdbloader for offline loading since some of my data files are too big to be loaded via the Graph Store API.

No hurry though.

-Osma


Andy Seaborne kirjoitti 04.10.2017 klo 00:43:

It's in the build joined in at apache-jena-libs.

It is in Fuseki2 server jar, but not the UI - a user needs to use a configuration file. That also works in fuseki-basic.

Documentation to follow.

    Andy
