Re: wrong content-types in s-get | Re: Export named graph from TDB to several ntriples files

vincent ventresque Thu, 31 Jan 2019 03:58:50 -0800

Thanks for your quick reply!

> $mtAppJSON isn't used.

I think my previous message wasn't clear: I meant raw JSON, not JSON-LD. My code now works for both, and I do use $mtAppJSON, but I had to replace 'application/json' with 'application/rdf+json' in order to get JSON instead of XML. See the file here: https://sourceforge.net/projects/ffl-misc/files/fuseki_scripts_custom-ruby/s-get/download


> The settings are: ...

I ran a small test: comment out these lines and the "names" part, and you'll get XML!



On 31/01/2019 at 12:48, Andy Seaborne wrote:

On 31/01/2019 11:26, vincent ventresque wrote:
Hello,
I found the origin of the problem for JSON: $mtAppJSON had the value
'application/json'
$mtAppJSON isn't used.

"application/rdf+json"
isn't JSON-LD (it's the old Talis format).

There is:

$mtJSONLD           = 'application/ld+json'
it has to be replaced with

'application/rdf+json'

I've updated the file here :
https://sourceforge.net/projects/ffl-misc/files/fuseki_scripts_custom-ruby/s-get/download
Maybe I'm going to submit a pull request as Andy suggested, but I'd like to understand why 'application/json' returns XML. Besides, it's the same thing for N-Quads: I tried to replace
$mtNQuads = 'application/n-quads'

with

$mtNQuads = 'application/x-trig'

but still have xml...
The settings are:

# Default for GET
# At least allow anything (and hope!)
$accept_rdf="#{$mtTurtle} , #{$mtNTriples};q=0.9 , #{$mtRDF};q=0.8 , #{$mtJSONLD};q=0.5"
# Datasets
$accept_ds="#{$mtTrig} , #{$mtNQuads};q=0.9 , #{$mtJSONLD};q=0.5"
# For SPARQL query
$accept_results="#{$mtSparqlResultsJ} , #{$mtSparqlResultsX};q=0.9 , #{$accept_rdf}"
# Accept any in case of trouble.
$accept_rdf="#{$accept_rdf} , */*;q=0.1"
$accept_results="#{$accept_results} , */*;q=0.1"
Is there a kind of default setting somewhere (if the content type isn't recognized by Fuseki, the response is XML)?
Yes.

RDF/XML for graphs, N-Quads for datasets.
Run Fuseki/full with "-v" and it should print the content negotiation details.
    Andy
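The fallback described above (an unrecognized type such as 'application/json' ends up as the default, RDF/XML for graphs) can be illustrated with a small, generic negotiation sketch. This is not Fuseki's code: parse_accept_header, negotiate and the offered list are hypothetical names used only to show how q-values and a default interact.

```ruby
# Generic illustration only; Fuseki's real negotiation logic may differ.

# Parse an Accept header into [media_type, q] pairs, highest q first.
# Entries without an explicit q get 1.0; ties keep their header order.
def parse_accept_header(header)
  pairs = header.split(',').map do |item|
    type, *params = item.strip.split(';').map(&:strip)
    q = params.map { |p| p[/\Aq=(.+)\z/, 1] }.compact.first
    [type, q ? q.to_f : 1.0]
  end
  pairs.each_with_index.sort_by { |(_, q), i| [-q, i] }.map(&:first)
end

# Return the first acceptable type the server offers, else the default.
def negotiate(header, offered, default)
  parse_accept_header(header).each do |type, q|
    next if q.zero?
    return type if offered.include?(type)
    return default if type == '*/*'
  end
  default
end
```

With offered = ['text/turtle', 'application/n-triples', 'application/rdf+xml'] and 'application/rdf+xml' as the default, a request for 'application/json' matches nothing in the list and falls through to the default.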
Thanks in advance

VV



On 29/01/2019 at 17:11, vincent ventresque wrote:
Hi Andy,
Thanks again for your idea to modify the s-get script, it helped me understand ruby utilities and http requests (I often use the ruby scripts but never really looked inside).
Don't know how to submit a pull request, and I'm not a ruby expert! Therefore I've put a small test file here :
https://sourceforge.net/projects/ffl-misc/files/fuseki_scripts_custom-ruby/s-get/download
-- added "--output" in options + created a new function (set_output_format)
-- it works for ntriples, xml, JSON-LD

-- doesn't work for json (returns xml...)
N.B. : in this test file, I've removed large parts of the original code in order to improve readability
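A minimal sketch of what an "--output" option backed by a set_output_format helper could look like. The short format names, the mapping table and parse_accept are assumptions made for illustration; the function of the same name in the linked test file may be written differently.

```ruby
require 'optparse'

# Hypothetical table: short output names mapped to RDF media types.
OUTPUT_MEDIA_TYPES = {
  nt:      'application/n-triples',
  ttl:     'text/turtle',
  xml:     'application/rdf+xml',
  jsonld:  'application/ld+json',
  rdfjson: 'application/rdf+json'
}.freeze

# Hypothetical helper: translate a short name into an Accept header value.
def set_output_format(name)
  OUTPUT_MEDIA_TYPES.fetch(name) do
    raise ArgumentError, "unknown output format: #{name}"
  end
end

# Parse argv; return the chosen Accept value (or nil) and the remaining
# positional arguments (dataset URL, graph name).
def parse_accept(argv)
  accept = nil
  parser = OptionParser.new do |opts|
    opts.on('--output=TYPE', OUTPUT_MEDIA_TYPES.keys,
            'Output format, sent as the Accept header') do |type|
      accept = set_output_format(type)
    end
  end
  rest = parser.parse(argv)
  [accept, rest]
end
```

Restricting the option to a list of symbols follows the same opts.on pattern that s-query uses for its own --output option; a value outside the list is rejected by OptionParser before any request is made.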
On 28/01/2019 at 15:28, Vincent Ventresque wrote:
Hi Andy,
Many thanks for these ideas, I'm going to try the curl & riot solutions.
> Modify the s-get script to handle --output and set the "Accept:" header then please submit a pull request for the changes
I had made an attempt to modify the s-get script in the same way as for s-query but it didn't work : if I have a moment I'll try to understand how the options are handled.
On 28/01/2019 at 14:19, Andy Seaborne wrote:
On 28/01/2019 11:04, Vincent Ventresque wrote:
Hello,
I want to export a named graph which is stored in a TDB dataset, and I want to store the output in several files (the named graph contains +/- 9.5 M triples).
My idea is to use the "split" command to cut the output of the export into pieces. However, this solution with "split" requires ntriples or nquads (one triple per line, so that the files are not cut in the middle of an assertion ; besides, it's also more practical to have one triple per line if I want to transform the data with perl or sed).
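For readers who prefer to stay in Ruby, the same line-based cut can be sketched without "split". This is only an illustration: split_lines, the zero-padded numbering and the default '.nt' suffix are hypothetical choices, and it relies on the input having exactly one triple per line.

```ruby
# Copy a line-oriented stream (N-Triples or N-Quads) into numbered files
# holding at most lines_per_file lines each. Returns the paths written.
def split_lines(io, prefix, lines_per_file, suffix = '.nt')
  paths = []
  # each_slice pulls one batch of lines at a time from the stream.
  io.each_line.each_slice(lines_per_file).with_index do |lines, i|
    path = format('%s%04d%s', prefix, i, suffix)
    File.write(path, lines.join)
    paths << path
  end
  paths
end
```

Called as split_lines($stdin, 'BnfTextTitres-', 500_000) at the end of a pipe, it would produce BnfTextTitres-0000.nt, BnfTextTitres-0001.nt and so on, and no file is cut in the middle of a triple.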
I found a solution with s-query but had to edit the ruby s-query script to get ntriples (see below).
There are other possible solutions for an export via command-line utilities : "s-get" and "tdbdump". If I understand correctly, "tdbdump" gives nquads as output, but one can't export only a part of the data, everything is exported at once. The "s-get" solution lets me select a named graph in the dataset, but I couldn't change the output format.
Are there better solutions to get an export in several files?
Ways I can think of:
1/ Modify the s-get script to handle --output and set the "Accept:" header then please submit a pull request for the changes.
2/ Use curl

curl --header 'Accept: application/n-triples' \
   'http://localhost:3030/ds?graph=http://bnf_titres'

3/ Parse the s-get output:

s-get ... | riot --syntax TTL
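Option 2/ can also be done from Ruby with the standard Net::HTTP library. A hedged sketch: graph_request and fetch_graph are hypothetical names, the endpoint and graph URI are the ones from the curl example, and a plain http endpoint is assumed.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a GET for one named graph, with an explicit
# Accept header so the server picks the wanted serialization.
def graph_request(service, graph, accept)
  uri = URI(service)
  uri.query = URI.encode_www_form(graph: graph)
  req = Net::HTTP::Get.new(uri)
  req['Accept'] = accept
  req
end

# Send the request and stream the response body to an IO object.
def fetch_graph(service, graph, accept, out = $stdout)
  req = graph_request(service, graph, accept)
  Net::HTTP.start(req.uri.hostname, req.uri.port) do |http|
    http.request(req) do |res|
      res.value # raises unless the status is 2xx
      res.read_body { |chunk| out.write(chunk) }
    end
  end
end
```

For example, fetch_graph('http://localhost:3030/ds', 'http://bnf_titres', 'application/n-triples') would write the graph to standard output one triple per line, ready to be piped into split. The body is streamed in chunks, so a graph of several million triples is not held in memory.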

    Andy
Thanks in advance,

VV.



~~~~~~~~~~~ 1) SOLUTION WITH s-query ~~~~~~~~~~~~~~~~~~~~~

1.1) Edit s-query ruby script (add nt)

-- l. 572 : when  "json","xml","text","csv","tsv","nt"
-- l. 574 : when :json,:xml,:text,:csv,:tsv,:nt
-- l. 515 : opts.on('--output=TYPE',[:json,:xml,:text,:csv,:tsv,:nt],
-- l. 519 : opts.on('--accept=TYPE',[:json,:xml,:text,:csv,:tsv,:nt],
1.2) Command
/my/path/to/fuseki/bin/s-query --service=http://localhost:3030/BnF_text_v2/ "construct { ?s ?p ?o } where { graph <http://bnf_titres> { ?s ?p ?o }}" --output=nt | split -l 500000 - --additional-suffix=.nt BnfTextTitres-
~~~~~~~~~~~ 2) SOLUTION WITH tdbdump (nquads but no named graph) ~~~~~~~~~~~~~~~~~~~~~
/my/path/to/jena/bin/tdbdump --loc=/my/path/to/fuseki/run/databases/BnF_text_v2 --graph=http://bnf_titres | split -l 500000 - --additional-suffix=.nt BnfTextTitres-
=> Unknown argument: graph
~~~~~~~~~~~ 3) SOLUTION WITH s-get (named graph ok, but turtle output) ~~~~~~~~~~~~~~~~~~~~~
/my/path/to/fuseki/bin/s-get http://localhost:3030/BnF_text_v2/data http://bnf_titres --output=text | split -l 500000 - --additional-suffix=.nt BnfTextTitres-
=> /my/path/to/fuseki/bin/s-get:364:in `cmd_soh': invalid option: --output=text (OptionParser::InvalidOption)
from /my/path/to/fuseki/bin/fuseki/bin/s-get:715:in `<main>'
