I always get an OutOfMemoryError, no matter how much RAM is available, and it fails whether the JSON-LD file is compressed or not.

I am trying to convert this file https://data.dnb.de/opendata/authorities-gnd_entityfacts.jsonld.gz with Jena 4.10.0 or 5.0.0.

Here are my tests:

# free -m
               total        used        free      shared  buff/cache   available
Mem:          128821       24038      100091           1        4691      103714
Swap:           1903           0        1903

# JAVA_HOME="/usr/lib/jvm/java-11-openjdk-amd64"; export JAVA_HOME

# cd /opt/apache-jena-4.10.0

# VM_ARGS="-Xmx100G"  bin/riot --out=RDF/XML /var/tmp/authorities-gnd_entityfacts.jsonld.gz > authorities-gnd_entityfacts.rdf

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
         at java.base/java.util.LinkedHashMap.newNode(LinkedHashMap.java:256)
         at java.base/java.util.HashMap.putVal(HashMap.java:627)
         at java.base/java.util.HashMap.put(HashMap.java:608)
         at org.glassfish.json.JsonObjectBuilderImpl.putValueMap(JsonObjectBuilderImpl.java:209)
         at org.glassfish.json.JsonObjectBuilderImpl.add(JsonObjectBuilderImpl.java:81)
         at org.glassfish.json.JsonParserImpl.getObject(JsonParserImpl.java:334)
         at org.glassfish.json.JsonParserImpl.getValue(JsonParserImpl.java:175)
         at org.glassfish.json.JsonParserImpl.getArray(JsonParserImpl.java:321)
         at org.glassfish.json.JsonParserImpl.getValue(JsonParserImpl.java:173)
         at com.apicatalog.jsonld.document.JsonDocument.doParse(JsonDocument.java:163)
         at com.apicatalog.jsonld.document.JsonDocument.of(JsonDocument.java:112)
         at com.apicatalog.jsonld.document.JsonDocument.of(JsonDocument.java:90)
         at org.apache.jena.riot.lang.LangJSONLD11.read(LangJSONLD11.java:73)
         at org.apache.jena.riot.RDFParser.read(RDFParser.java:416)
         at org.apache.jena.riot.RDFParser.parseURI(RDFParser.java:385)
         at org.apache.jena.riot.RDFParser.parse(RDFParser.java:360)
         at riotcmd.CmdLangParse.parseRIOT(CmdLangParse.java:384)
         at riotcmd.CmdLangParse.parseFile(CmdLangParse.java:331)
         at riotcmd.CmdLangParse.exec$(CmdLangParse.java:229)
         at riotcmd.CmdLangParse.exec(CmdLangParse.java:169)
         at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:87)
         at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:56)
         at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:43)
         at riotcmd.riot.main(riot.java:29)

# VM_ARGS="-Xmx100G"  bin/riot --out=RDF/XML /var/tmp/authorities-gnd_entityfacts.jsonld > authorities-gnd_entityfacts.rdf
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects

# JAVA_HOME="/usr/lib/jvm/java-17-openjdk-amd64"; export JAVA_HOME

# cd /opt/apache-jena-5.0.0

# VM_ARGS="-Xmx100G"  bin/riot --out=RDF/XML /var/tmp/authorities-gnd_entityfacts.jsonld.gz > authorities-gnd_entityfacts.rdf
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space: failed reallocation of scalar replaced objects

# VM_ARGS="-Xmx100G"  bin/riot --out=RDF/XML /var/tmp/authorities-gnd_entityfacts.jsonld > authorities-gnd_entityfacts.rdf
09:42:56 INFO  riot            :: File: /var/tmp/authorities-gnd_entityfacts.jsonld
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
         at java.base/java.lang.StringUTF16.compress(StringUTF16.java:161)
         at java.base/java.lang.String.<init>(String.java:4501)
         at java.base/java.lang.String.<init>(String.java:300)
         at org.glassfish.json.JsonTokenizer.getValue(JsonTokenizer.java:510)
         at org.glassfish.json.JsonParserImpl.getString(JsonParserImpl.java:101)
         at org.glassfish.json.JsonParserImpl.getObject(JsonParserImpl.java:332)
         at org.glassfish.json.JsonParserImpl.getValue(JsonParserImpl.java:175)
         at org.glassfish.json.JsonParserImpl.getArray(JsonParserImpl.java:321)
         at org.glassfish.json.JsonParserImpl.getValue(JsonParserImpl.java:173)
         at com.apicatalog.jsonld.document.JsonDocument.doParse(JsonDocument.java:163)
         at com.apicatalog.jsonld.document.JsonDocument.of(JsonDocument.java:112)
         at com.apicatalog.jsonld.document.JsonDocument.of(JsonDocument.java:90)
         at org.apache.jena.riot.lang.LangJSONLD11.read(LangJSONLD11.java:73)
         at org.apache.jena.riot.RDFParser.read(RDFParser.java:444)
         at org.apache.jena.riot.RDFParser.parseURI(RDFParser.java:413)
         at org.apache.jena.riot.RDFParser.parse(RDFParser.java:375)
         at riotcmd.CmdLangParse.parseRIOT(CmdLangParse.java:391)
         at riotcmd.CmdLangParse.parseFile(CmdLangParse.java:337)
         at riotcmd.CmdLangParse.exec$(CmdLangParse.java:234)
         at riotcmd.CmdLangParse.exec(CmdLangParse.java:174)
         at org.apache.jena.cmd.CmdMain.mainMethod(CmdMain.java:87)
         at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:56)
         at org.apache.jena.cmd.CmdMain.mainRun(CmdMain.java:43)
         at riotcmd.riot.main(riot.java:29)
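
Both traces fail inside JsonDocument.of(...), reached through getArray and then getObject, which suggests that the whole file is read as one top-level JSON array into memory before any RDF is produced. One possible workaround, sketched here in Python, would be to split that array into smaller files and convert them one by one. This is only a sketch under assumptions that have not been verified against the DNB file: that the top level really is an array of objects, and that each element is self-contained (carries its own @context). The function names iter_array_items and split_array, the chunk size, and the output file naming are made up for illustration.

```python
import gzip
import json


def iter_array_items(stream, buf_size=1 << 20):
    """Yield the elements of a top-level JSON array one at a time.

    Only the current read buffer and the element being decoded are held
    in memory. Elements are assumed to be JSON objects.
    """
    decoder = json.JSONDecoder()
    buf = stream.read(buf_size)
    stripped = buf.lstrip()
    if not stripped.startswith("["):
        raise ValueError("expected a top-level JSON array")
    pos = len(buf) - len(stripped) + 1  # position just after the '['
    while True:
        # Skip whitespace and the commas that separate elements.
        while pos < len(buf) and buf[pos] in " \t\r\n,":
            pos += 1
        if pos < len(buf) and buf[pos] == "]":
            return  # end of the array
        try:
            item, pos = decoder.raw_decode(buf, pos)
        except json.JSONDecodeError:
            # The element is incomplete in the buffer: read more and retry.
            more = stream.read(buf_size)
            if not more:
                raise ValueError("truncated JSON array")
            buf = buf[pos:] + more
            pos = 0
            continue
        yield item


def split_array(src_path, dest_prefix, items_per_chunk=100_000):
    """Write the array in src_path as several smaller JSON array files."""
    opener = gzip.open if src_path.endswith(".gz") else open
    paths, chunk = [], []

    def flush():
        path = f"{dest_prefix}-{len(paths):05d}.jsonld"
        with open(path, "w", encoding="utf-8") as out:
            json.dump(chunk, out, ensure_ascii=False)
        paths.append(path)
        chunk.clear()

    with opener(src_path, "rt", encoding="utf-8") as src:
        for item in iter_array_items(src):
            chunk.append(item)
            if len(chunk) >= items_per_chunk:
                flush()
    if chunk:
        flush()
    return paths
```

Each resulting chunk could then be passed to riot in the same way as in the commands above. Note that separate RDF/XML outputs cannot simply be concatenated into one valid file, so a line-based output format such as N-Triples may be easier to merge afterwards.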

Regards
Sorin

On 05.07.2024 at 00:35, Andy Seaborne wrote:


On 03/07/2024 10:22, Sorin Gheorghiu wrote:
Greetings,

Here is my attempt to convert a large file from JSON-LD to RDF format. Does the riot tool support compressed files?

Yes.


$ riot --out=RDF/XML filein.jsonld.gz > fileout.rdf

That should work (Jena 5.0.0)

What happened?


Best regards
Sorin

--
Sorin Gheorghiu             Tel: +49 7531 88-3198
Universität Konstanz        Raum: B705
78464 Konstanz              sorin.gheorg...@uni-konstanz.de

Kommunikations-, Informations-, Medienzentrum (KIM)
- Abteilung IT-Dienste Forschung und Lehre -
