Andy,
I improved my test (still running on the same file): it now does one
warm-up pass, then averages 20 runs per format, and covers more RDFFormat
variations. Results and code are below (SSD disk, otherwise a 5-year-old
Mac powerbook).
The main fact remains: JSON-LD serialization is slow (~10 times slower
than Turtle). But people want JSON-LD. I think I'll have a look at
JavaScript to convert Turtle to JSON-LD in the browser.
Best Regards,
fps
model.size() 7559
*** WARM-UP ***
JSON-LD/pretty TIME: 766 ms
JSON-LD/flat TIME: 563 ms
N-Triples/utf-8 TIME: 80 ms
RDF/XML/pretty TIME: 560 ms
RDF/XML/plain TIME: 227 ms
RDF/XML/pretty TIME: 518 ms
Turtle/blocks TIME: 120 ms
Turtle/flat TIME: 142 ms
Turtle/pretty TIME: 110 ms
N-Triples/utf-8 TIME: 50 ms
*** RESULTS ***
JSON-LD/pretty TIME: 497 ms
JSON-LD/flat TIME: 475 ms
N-Triples/utf-8 TIME: 31 ms
RDF/XML/pretty TIME: 253 ms
RDF/XML/plain TIME: 140 ms
RDF/XML/pretty TIME: 215 ms
Turtle/blocks TIME: 50 ms
Turtle/flat TIME: 46 ms
Turtle/pretty TIME: 52 ms
N-Triples/utf-8 TIME: 34 ms
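For reference, the "~10 times" figure can be checked from the RESULTS
numbers above. This is only a small sketch: the class name SlowdownRatios
and the choice of Turtle/pretty as the baseline are my own assumptions, and
only a subset of the rows is copied in.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

final class SlowdownRatios {

    /** Ratio of a format's mean write time to the baseline's mean write time. */
    static double ratio(long millis, long baselineMillis) {
        if (baselineMillis <= 0) {
            throw new IllegalArgumentException("baseline must be positive");
        }
        return (double) millis / baselineMillis;
    }

    public static void main(String[] args) {
        // Mean times in ms, copied from the *** RESULTS *** block of this mail.
        Map<String, Long> meanMillis = new LinkedHashMap<>();
        meanMillis.put("JSON-LD/pretty", 497L);
        meanMillis.put("JSON-LD/flat", 475L);
        meanMillis.put("RDF/XML/plain", 140L);
        meanMillis.put("Turtle/pretty", 52L);
        meanMillis.put("N-Triples/utf-8", 31L); // first of the two N-Triples rows

        // Assumption: Turtle/pretty is used as the baseline.
        long baseline = meanMillis.get("Turtle/pretty");
        for (Map.Entry<String, Long> e : meanMillis.entrySet()) {
            System.out.println(String.format(Locale.ROOT,
                    "%-16s %4d ms  %.1fx Turtle/pretty",
                    e.getKey(), e.getValue(), ratio(e.getValue(), baseline)));
        }
    }
}
```

With these numbers JSON-LD/pretty comes out at 497/52, roughly 9.6 times
Turtle/pretty.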
package testperfs;

import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.RDFFormat;
import org.junit.After;
import org.junit.AfterClass;
import org.junit.Before;
import org.junit.BeforeClass;
import org.junit.Test;

public class serialization {

    @BeforeClass
    public static void setUpBeforeClass() throws Exception {
    }

    @AfterClass
    public static void tearDownAfterClass() throws Exception {
    }

    @Before
    public void setUp() throws Exception {
    }

    @After
    public void tearDown() throws Exception {
    }

    @Test
    public final void test() throws IOException {
        Model m = loadModel();
        System.out.println("model.size() " + m.size());
        RDFFormat[] formats = {
            RDFFormat.JSONLD_PRETTY,
            RDFFormat.JSONLD_FLAT,
            RDFFormat.NTRIPLES_UTF8,
            RDFFormat.RDFXML_ABBREV,
            RDFFormat.RDFXML_PLAIN,
            RDFFormat.RDFXML_PRETTY,
            RDFFormat.TURTLE_BLOCKS,
            RDFFormat.TURTLE_FLAT,
            RDFFormat.TURTLE_PRETTY,
            RDFFormat.NTRIPLES_UTF8};
        // warm it up
        System.out.println("*** WARM-UP ***");
        for (RDFFormat format : formats) {
            doIt(m, format, 1);
        }
        // now for real
        System.out.println("*** RESULTS ***");
        for (RDFFormat format : formats) {
            doIt(m, format, 20);
        }
    }

    // Writes the model n times in the given format and prints the mean
    // time per write (the timing includes flushing and closing the file).
    private void doIt(Model m, RDFFormat format, int n) throws IOException {
        File f = new File(getFile("/testperfs"), "output.txt");
        long time = 0;
        for (int i = 0; i < n; i++) {
            if (f.exists()) f.delete();
            OutputStream out = new BufferedOutputStream(new FileOutputStream(f));
            long start = System.currentTimeMillis();
            RDFDataMgr.write(out, m, format);
            out.flush();
            out.close();
            long end = System.currentTimeMillis();
            time += (end - start);
        }
        String x = format + " TIME: " + time / n + " ms";
        System.out.println(x);
    }

    // Previous version of the test: a single write through Model.write.
    private void doItOld(Model m, String lang) throws IOException {
        File f = new File(getFile("/testperfs"), "output.txt");
        if (f.exists()) f.delete();
        OutputStream out = new BufferedOutputStream(new FileOutputStream(f));
        long start = System.currentTimeMillis();
        m.write(out, lang);
        out.flush();
        out.close();
        long end = System.currentTimeMillis();
        String x = lang + " TIME: " + (end - start) + " ms";
        System.out.println(x);
        f.delete();
    }

    private Model loadModel() throws IOException {
        Model m = ModelFactory.createDefaultModel();
        // Loading the model
        File f = getTestFile();
        InputStream in = new BufferedInputStream(new FileInputStream(f));
        m.read(in, null, "JSON-LD");
        in.close();
        return m;
    }

    private File getTestFile() {
        return getFile("/testperfs/docs.jsonld");
    }

    // Resolves a classpath resource name to a File.
    private File getFile(String name) {
        URL resourceUrl = getClass().getResource(name);
        return new File(resourceUrl.getFile());
    }
}
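
A possible refinement of doIt, given as a sketch only: to time the
serialization alone, one could write to a stream that discards its bytes
(so the disk no longer matters) and measure with System.nanoTime().
WriteBenchmark, StreamWriter and CountingNullOutputStream are hypothetical
names of mine, not Jena classes, and the workload in main is a stand-in; in
the real test the callback would wrap RDFDataMgr.write(out, m, format).

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

final class WriteBenchmark {

    /** Hypothetical callback: anything that serializes itself to a stream. */
    interface StreamWriter {
        void writeTo(OutputStream out) throws IOException;
    }

    /** Discards all bytes but counts them, so disk speed cannot affect timings. */
    static final class CountingNullOutputStream extends OutputStream {
        long count;

        @Override
        public void write(int b) {
            count++;
        }

        @Override
        public void write(byte[] b, int off, int len) {
            count += len;
        }
    }

    /** Runs untimed warm-ups, then returns the mean nanoseconds over the timed runs. */
    static long meanNanos(StreamWriter writer, int warmups, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        try {
            for (int i = 0; i < warmups; i++) {
                writer.writeTo(new CountingNullOutputStream());
            }
            long total = 0;
            for (int i = 0; i < runs; i++) {
                OutputStream out = new CountingNullOutputStream();
                long start = System.nanoTime();
                writer.writeTo(out);
                total += System.nanoTime() - start;
            }
            return total / runs;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Stand-in workload (assumption): 7559 N-Triples-like lines.
        // With Jena it would be: out -> RDFDataMgr.write(out, m, format)
        byte[] line = "<s> <p> <o> .\n".getBytes(StandardCharsets.UTF_8);
        StreamWriter writer = out -> {
            for (int i = 0; i < 7559; i++) {
                out.write(line);
            }
        };
        long ns = meanNanos(writer, 5, 20);
        System.out.println("mean: " + ns / 1_000 + " us per write");
    }
}
```

Comparing these figures with the file-based ones would show how much of
each time is disk I/O rather than serialization.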
> On 21 Nov 2015, at 18:06, Andy Seaborne <[email protected]> wrote:
>
> On 21/11/15 01:28, François-Paul Servant wrote:
>> Hi,
>>
>> it seems to me that JSON-LD serialization is slow. Do you have the same
>> feeling?
>> Here are the results of a comparative test that I ran on my machine
>> (outputting one model to a file, using Jena 3.0.1-SNAPSHOT)
>>
>
> Are these single run costs? i.e. from cold?
>
>> model.size() 7559
>> JSON-LD TIME: 649 ms
>
> Jena uses a separate self-contained engine, jsonld-java, which in turn
> uses Jackson.
>
> It means taking a copy of much of the material to be printed, building an
> in-memory structure, then traversing it for output, including formatting
> the JSON so that it is indented rather than all on one line.
>
>> TURTLE TIME: 136 ms
>
> That is the normal Turtle writer? It's pretty printing and so
> non-streaming - there are variations (RDFFormat) that print in a streaming
> style with less prettiness but still using prefixed names. Pretty is not
> free.
>
>> RDF/XML TIME: 548 ms
>
> Ditto - RDF/XML or RDF/XML-ABBREV. The default using RDFDataMgr for
> Lang.RDFXML is "pretty" (RDF/XML-ABBREV)
>
>> N-TRIPLE TIME: 61 ms
>
> Is that to a spinning disk or SSD? (Given that a disk write including
> sync is of the order of 10 ms)
>
> Andy
>
>>
>> annoying...
>>
>> Best,
>>
>> fps
>>
>