optimizing serialization of results from fuseki

Paul Tyson Wed, 06 Jan 2016 08:18:07 -0800

I have a modest (17M triple) dataset with a fairly flat graph. I run some
queries selecting nodes with anywhere from 12 to 20 different property
values.


Result set counts are anywhere from 10,000 to 30,000 nodes. Total
execution times measured at the client are in the 30-40 second range.

The web request begins streaming results immediately, but seems to take
longer than it should (based on the number of results and size of data
transfer). I also notice that the time is roughly linear with the size
of the dataset--halving the dataset size halves the result set and the
execution time. I wouldn't have expected this behavior if all the time
were due to an indexed search.

My question is: is total query time limited by search execution speed,
or by marshaling and serialization of search results? 
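
A rough way to separate the two at the client is to time the first
byte of the response against the total time to drain the stream. The
sketch below is only an illustration under assumptions: the endpoint
URL (including the dataset name "ds") is hypothetical, the helper names
time_stream and time_query are made up for this example, and it uses
only the Python standard library to send a SPARQL Protocol POST.

```python
import time
import urllib.parse
import urllib.request


def time_stream(stream, chunk_size=64 * 1024):
    """Drain a file-like object.

    Returns (seconds until the first chunk arrived, total seconds,
    bytes read). The first value is None if the stream was empty.
    """
    start = time.perf_counter()
    first = None
    total_bytes = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        if first is None:
            first = time.perf_counter() - start
        total_bytes += len(chunk)
    total = time.perf_counter() - start
    return first, total, total_bytes


def time_query(endpoint, query, accept="application/sparql-results+json"):
    """POST a SPARQL query and time the response.

    Returns (seconds until response headers, seconds from headers to
    first chunk, seconds from headers to end of stream, bytes read).
    """
    data = urllib.parse.urlencode({"query": query}).encode("ascii")
    request = urllib.request.Request(
        endpoint, data=data, headers={"Accept": accept}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(request) as response:
        headers_at = time.perf_counter() - start
        first, total, size = time_stream(response)
    return headers_at, first, total, size


if __name__ == "__main__":
    # Hypothetical endpoint and query; substitute the real ones.
    endpoint = "http://localhost:3030/ds/query"
    query = "SELECT * WHERE { ?s ?p ?o } LIMIT 10000"
    for accept in ("application/sparql-results+json", "text/csv"):
        print(accept, time_query(endpoint, query, accept))
```

If the headers and first chunk arrive quickly but draining the stream
takes most of the 30-40 seconds, that would point toward row-by-row
evaluation plus serialization rather than the initial index lookup.
Comparing result formats through the Accept header, or timing the same
pattern wrapped in SELECT (COUNT(*) AS ?n), might help narrow it
further, although the engine may evaluate a count differently.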

I have tried different query patterns, and believe I have the best
queries possible for the use case.

I'm looking for other suggestions to reduce overall execution time. The
performance does not improve drastically going from 4 GB to 8 or 16 GB
of RAM. My test platforms are 64-bit Windows, ranging from a small server
(16 GB RAM, 4 CPUs) to laptops with 4 GB RAM.

Thanks,
--Paul
