I always get an OutOfMemoryError, no matter how much RAM is available, and
it fails whether the JSON-LD file is gzip-compressed or not.
I am trying to convert this file
https://data.dnb.de/opendata/authorities-gnd_entityfacts.jsonld.gz with
Jena 4.10.0 or 5.0.0.
Here are my tests:
# free -m
               total        used        free      shared  buff/cache   available
Mem:          128821       24038      100091           1        4691      103714
Swap:           1903           0        1903
# JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"; export JAVA_HOME
# cd /opt/apache-jena-4.10.0
# VM_ARGS="-Xmx100G" bin/riot --out=RDF/XML /var/tmp/authorities-gnd_entityfacts.jsonld.gz > authorities-gnd_entityfacts.rdf
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.base/java.util.LinkedHashMap.newNode(LinkedHashMap.java:256)
at java.base/java.util.HashMap.putVal(HashMap.java:627)
at java.base/java.util.HashMap.put(HashMap.java:608)
at org.glassfish.json.JsonObjectBuilderImpl.putValueMap(JsonObjectBuilderImpl.java:209)
at org.glassfish.json.JsonObjectBuilderImpl.add(JsonObjectBuilderImpl.java:81)
at org.glassfish.json.JsonParserImpl.getObject(JsonParserImpl.java:334)
at org.glassfish.json.JsonParserImpl.getValue(JsonParserImpl.java:175)
at org.glassfish.json.JsonParserImpl.getArray(JsonParserImpl.java:321)
at org.glassfish.json.JsonParserImpl.getValue(JsonParserImpl.java:173)
at com.apicatalog.jsonld.document.JsonDocument.doParse(JsonDocument.java:163)
at com.apicatalog.jsonld.document.JsonDocument.of(JsonDocument.java:112)
at com.apicatalog.jsonld.document.JsonDocument.of(JsonDocument.java:90)
at org.apache.jena.riot.lang.LangJSONLD11.read(LangJSONLD11.java:73)
at org.apache.jena.riot.RDFParser.read(RDFParser.java:416)
at org.apache.jena.riot.RDFParser.parseURI(RDFParser.java:385)
at org.apache.jena.riot.RDFParser.parse(RDFParser.java:360)
at riotcmd.CmdLangParse.parseRIOT(CmdLangParse.java:384)
at riotcmd.CmdLangParse.parseFile(CmdLangParse.java:331)
at riotcmd.CmdLangParse.exec$(CmdLangParse.java:229)
at riotcmd.CmdLangParse.exec(CmdLangParse.java:169)
at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:87)
at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:56)
at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:43)
at riotcmd.riot.main(riot.java:29)
# VM_ARGS="-Xmx100G" bin/riot --out=RDF/XML /var/tmp/authorities-gnd_entityfacts.jsonld > authorities-gnd_entityfacts.rdf
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects
# JAVA_HOME="/usr/lib/jvm/java-17-openjdk-amd64"; export JAVA_HOME
# cd /opt/apache-jena-5.0.0
# VM_ARGS="-Xmx100G" bin/riot --out=RDF/XML /var/tmp/authorities-gnd_entityfacts.jsonld.gz > authorities-gnd_entityfacts.rdf
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects
# VM_ARGS="-Xmx100G" bin/riot --out=RDF/XML /var/tmp/authorities-gnd_entityfacts.jsonld > authorities-gnd_entityfacts.rdf
09:42:56 INFO riot :: File: /var/tmp/authorities-gnd_entityfacts.jsonld
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.base/java.lang.StringUTF16.compress(StringUTF16.java:161)
at java.base/java.lang.String.<init>(String.java:4501)
at java.base/java.lang.String.<init>(String.java:300)
at org.glassfish.json.JsonTokenizer.getValue(JsonTokenizer.java:510)
at org.glassfish.json.JsonParserImpl.getString(JsonParserImpl.java:101)
at org.glassfish.json.JsonParserImpl.getObject(JsonParserImpl.java:332)
at org.glassfish.json.JsonParserImpl.getValue(JsonParserImpl.java:175)
at org.glassfish.json.JsonParserImpl.getArray(JsonParserImpl.java:321)
at org.glassfish.json.JsonParserImpl.getValue(JsonParserImpl.java:173)
at com.apicatalog.jsonld.document.JsonDocument.doParse(JsonDocument.java:163)
at com.apicatalog.jsonld.document.JsonDocument.of(JsonDocument.java:112)
at com.apicatalog.jsonld.document.JsonDocument.of(JsonDocument.java:90)
at org.apache.jena.riot.lang.LangJSONLD11.read(LangJSONLD11.java:73)
at org.apache.jena.riot.RDFParser.read(RDFParser.java:444)
at org.apache.jena.riot.RDFParser.parseURI(RDFParser.java:413)
at org.apache.jena.riot.RDFParser.parse(RDFParser.java:375)
at riotcmd.CmdLangParse.parseRIOT(CmdLangParse.java:391)
at riotcmd.CmdLangParse.parseFile(CmdLangParse.java:337)
at riotcmd.CmdLangParse.exec$(CmdLangParse.java:234)
at riotcmd.CmdLangParse.exec(CmdLangParse.java:174)
at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:87)
at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:56)
at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:43)
at riotcmd.riot.main(riot.java:29)
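Both traces above end inside com.apicatalog.jsonld.document.JsonDocument.of, which is building an in-memory JSON object for the input, so the heap needed presumably depends on the uncompressed size of the file rather than on the size of the .gz. A minimal sketch for measuring that size without unpacking to disk; the function name uncompressed_size is made up for illustration, and only gzip and wc are assumed to be installed:

```shell
# Hypothetical helper: print the uncompressed size in bytes of a gzip file.
# The data is streamed through gzip, so nothing is written to disk.
uncompressed_size() {
    gzip -dc < "$1" | wc -c
}

# Example call (path taken from the tests above):
# uncompressed_size /var/tmp/authorities-gnd_entityfacts.jsonld.gz
```

Comparing this number with the -Xmx value gives a rough idea of whether loading the whole document at once can fit at all.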
Regards
Sorin
On 05.07.2024 at 00:35, Andy Seaborne wrote:
On 03/07/2024 10:22, Sorin Gheorghiu wrote:
Greetings,
Here is my attempt to convert a large file from JSON-LD to RDF format.
Does the riot tool support compressed files?
Yes.
$ riot --out=RDF/XML filein.jsonld.gz > fileout.rdf
That should work (Jena 5.0.0)
What happened?
Best regards
Sorin
--
Sorin Gheorghiu Tel: +49 7531 88-3198
Universität Konstanz Raum: B705
78464 Konstanz sorin.gheorg...@uni-konstanz.de
Kommunikations-, Informations-, Medienzentrum (KIM)
- Abteilung IT-Dienste Forschung und Lehre -