On 29/03/16 13:53, Adrian Gschwend wrote:
Hi group,
I am trying to convert an 800MB JSON-LD file to something more readable (NT or
Turtle) using riot. Unfortunately, I run into memory issues:
Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit
exceeded
I tried -Xmx up to 16GB but no luck so far. According to the
documentation, riot should stream if possible. Is that not
available for JSON-LD, or am I missing something else?
regards
Adrian
Jena uses jsonld-java [1] for JSON-LD. jsonld-java uses Jackson, which
reads the entire file before letting the client operate on it.
The actual JSON to RDF step is streaming.
So JSON-LD does not stream end-to-end.
(If the JSON-LD is arranged carefully, with @context before the data, a
streaming parser is theoretically possible. Jena does this for SPARQL
results in JSON: if the headers are seen before the results, it streams;
otherwise it has to buffer.)
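
The stream-or-buffer behaviour described above can be sketched roughly as
follows. This is only an illustration, not Jena's or jsonld-java's actual
code: the event tuples, `process`, and the toy `expand` helper are all
hypothetical, and a real parser would produce the events from a JSON token
stream rather than from a list.

```python
from typing import Any, Iterable, Iterator, List, Optional, Tuple

# Hypothetical event model: a tokenizer yields ("context", mapping) once
# and ("item", data) for each data entry, in document order.
Event = Tuple[str, Any]


def expand(context: dict, item: dict) -> dict:
    """Toy stand-in for term expansion: map short keys to full IRIs."""
    return {context.get(key, key): value for key, value in item.items()}


def process(events: Iterable[Event]) -> Iterator[dict]:
    """Yield expanded items, buffering only those seen before the context."""
    context: Optional[dict] = None
    pending: List[dict] = []  # items that arrived before the context
    for kind, payload in events:
        if kind == "context":
            context = payload
            # Flush everything that had to wait for the context.
            for item in pending:
                yield expand(context, item)
            pending.clear()
        elif kind == "item":
            if context is None:
                pending.append(payload)  # cannot interpret yet: buffer
            else:
                yield expand(context, payload)  # stream straight through
    # No context ever arrived: emit the buffered items unexpanded.
    for item in pending:
        yield item
```

With the context first, `pending` stays empty and memory use is independent
of input size. With the context last, every item is held until the end,
which is the buffering case.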
At 800MB I would have expected a large enough heap to work for N-Triples
output. Is the file available online anywhere?
Andy
(And is the JSON-LD one big object? Large objects are not really JSON's
sweet spot.)
[1] https://github.com/jsonld-java/jsonld-java