Hi,

I am using Apache Jena 3.12.0 with OpenJDK 1.8.0_212 on a 64-bit
Ubuntu 18.04.2 LTS (bionic) server, with all default configurations
unchanged.

I have a 3.2 GB Turtle (.ttl) RDF file with ~25M triples that I'd
like to convert to JSON-LD. I first looked at jena.rdfcat, which
suggested I use 'riot' instead. I then tried riotcmd.turtle with two
different garbage collectors and a max heap size of up to 40G, but after
about 12 minutes it ran into a "java.lang.OutOfMemoryError: Java heap
space" (stack trace at the end).

$ cd apache-jena-3.12.0/bin

FAILED-1: $ riotcmd.turtle --time --verbose --syntax=TURTLE \
    --output=JSON-LD large_file.ttl -Xmx40G -XX:+OptimizeStringConcat \
    -XX:+UseG1GC -XX:+UseStringDeduplication \
    -XX:+PrintStringDeduplicationStatistics \
    -Dlog4j.configuration=file:~/apache-jena-3.12.0/jena-log4j.properties

FAILED-2: $ riotcmd.turtle --time --verbose --syntax=TURTLE \
    --output=JSON-LD large_file.ttl -Xmx40G -XX:+OptimizeStringConcat \
    -XX:+UseConcMarkSweepGC \
    -Dlog4j.configuration=file:~/apache-jena-3.12.0/jena-log4j.properties


Question: I suspect I am missing some parameters or configuration that
I could tune. Any suggestions on what I could try? If not, is there an
alternative way to convert a large Turtle file to JSON-LD?

Stack trace follows:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.LinkedHashMap.newNode(LinkedHashMap.java:256)
    at java.util.HashMap.putVal(HashMap.java:631)
    at java.util.HashMap.put(HashMap.java:612)
    at com.github.jsonldjava.core.RDFDataset$IRI.<init>(RDFDataset.java:317)
    at com.github.jsonldjava.core.RDFDataset$Quad.<init>(RDFDataset.java:52)
    at com.github.jsonldjava.core.RDFDataset.addQuad(RDFDataset.java:540)
    at org.apache.jena.riot.writer.JenaRDF2JSONLD.parse(JenaRDF2JSONLD.java:85)
    at org.apache.jena.riot.writer.JsonLDWriter.toJsonLDJavaAPI(JsonLDWriter.java:205)
    at org.apache.jena.riot.writer.JsonLDWriter.serialize(JsonLDWriter.java:178)
    at org.apache.jena.riot.writer.JsonLDWriter.write(JsonLDWriter.java:139)
    at org.apache.jena.riot.writer.JsonLDWriter.write(JsonLDWriter.java:145)
    at org.apache.jena.riot.RDFWriter.write$(RDFWriter.java:207)
    at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:165)
    at org.apache.jena.riot.RDFWriter.output(RDFWriter.java:112)
    at org.apache.jena.riot.RDFWriterBuilder.output(RDFWriterBuilder.java:178)
    at org.apache.jena.riot.RDFDataMgr.write$(RDFDataMgr.java:1277)
    at org.apache.jena.riot.RDFDataMgr.write(RDFDataMgr.java:1162)
    at riotcmd.CmdLangParse$1.postParse(CmdLangParse.java:334)
    at riotcmd.CmdLangParse.exec$(CmdLangParse.java:170)
    at riotcmd.CmdLangParse.exec(CmdLangParse.java:128)
    at jena.cmd.CmdMain.mainMethod(CmdMain.java:93)
    at jena.cmd.CmdMain.mainRun(CmdMain.java:58)
    at jena.cmd.CmdMain.mainRun(CmdMain.java:45)
    at riotcmd.turtle.main(turtle.java:30)


-- 
Ankit Dangi
