Hi,

I am using Apache Jena 3.12.0 with OpenJDK 1.8.0_212 on a 64-bit Ubuntu 18.04.2 LTS (bionic) server, with all default configurations unchanged.
I have a 3.2 GB Turtle (.ttl) RDF file with ~25M triples that I'd like to convert to a JSON-LD representation. I first looked at jena.rdfcat, which suggested I should be using 'riot' instead. I then tried riotcmd.turtle with two different garbage collectors and a maximum heap size of up to 40G, but after about 12 minutes it ran into a "java.lang.OutOfMemoryError: Java heap space" (stack trace at the end).

    $ cd apache-jena-3.12.0/bin

FAILED-1:

    $ riotcmd.turtle --time --verbose --syntax=TURTLE \
        --output=JSON-LD large_file.ttl -Xmx40G -XX:+OptimizeStringConcat \
        -XX:+UseG1GC -XX:+UseStringDeduplication \
        -XX:+PrintStringDeduplicationStatistics \
        -Dlog4j.configuration=file:~/apache-jena-3.12.0/jena-log4j.properties

FAILED-2:

    $ riotcmd.turtle --time --verbose --syntax=TURTLE \
        --output=JSON-LD large_file.ttl -Xmx40G -XX:+OptimizeStringConcat \
        -XX:+UseConcMarkSweepGC \
        -Dlog4j.configuration=file:~/apache-jena-3.12.0/jena-log4j.properties

Question: I believe I may be missing some parameters or configuration settings that I could fine-tune. Any suggestions on what I could try? If not, are there any alternative mechanisms by which I could convert the large TTL file to JSON-LD?
Stack trace follows below:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.LinkedHashMap.newNode(LinkedHashMap.java:256)
    at java.util.HashMap.putVal(HashMap.java:631)
    at java.util.HashMap.put(HashMap.java:612)
    at com.github.jsonldjava.core.RDFDataset$IRI.<init>(RDFDataset.java:317)
    at com.github.jsonldjava.core.RDFDataset$Quad.<init>(RDFDataset.java:52)
    at com.github.jsonldjava.core.RDFDataset.addQuad(RDFDataset.java:540)
    at org.apache.jena.riot.writer.JenaRDF2JSONLD.parse(JenaRDF2JSONLD.java:85)
    at org.apache.jena.riot.writer.JsonLDWriter.toJsonLDJavaAPI(JsonLDWriter.java:205)
    at org.apache.jena.riot.writer.JsonLDWriter.serialize(JsonLDWriter.java:178)
    at org.apache.jena.riot.writer.JsonLDWriter.write(JsonLDWriter.java:139)
    at org.apache.jena.riot.writer.JsonLDWriter.write(JsonLDWriter.java:145)
    at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:207)
    at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:165)
    at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:112)
    at org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:178)
    at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1277)
    at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1162)
    at riotcmd.CmdLangParse$1.postParse(CmdLangParse.java:334)
    at riotcmd.CmdLangParse.exec$(CmdLangParse.java:170)
    at riotcmd.CmdLangParse.exec(CmdLangParse.java:128)
    at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
    at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
    at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
    at riotcmd.turtle.main(turtle.java:30)

--
Ankit Dangi
