On 26/08/14 23:13, Stephen Owens wrote:
Very cool Andy. How's the write performance?
Good question - not completely sure yet - I'm not expecting a huge gain.
It's easier to test read performance and read performance was what I was
more interested in initially.
For read, you can parse and send the result to the moral equivalent of
/dev/null, so producing a big file and timing the parse gets the
numbers. "riot" does this already.
For writing, you need a generator capable of going faster than the
output writer, and the generator must not itself slow the computer
down.
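A minimal sketch of that kind of write test, assuming the data is
generated up front and held in memory so the generator cost stays out of
the timed section. The names (WriteTimingSketch, CountingNullStream,
writeAll) are made up for illustration and are not part of riot:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.List;

class WriteTimingSketch {
    // Hypothetical stand-in for /dev/null: discards bytes, keeps a count.
    static final class CountingNullStream extends OutputStream {
        long count = 0;
        @Override public void write(int b) { count++; }
        @Override public void write(byte[] b, int off, int len) { count += len; }
    }

    // Writes pre-generated lines to the sink and returns elapsed nanoseconds.
    // Only the writing is timed; building 'lines' happens before the call.
    static long writeAll(List<String> lines, OutputStream sink) throws IOException {
        Writer w = new BufferedWriter(new OutputStreamWriter(sink, StandardCharsets.UTF_8));
        long start = System.nanoTime();
        for (String line : lines) {
            w.write(line);
            w.write('\n');
        }
        w.flush();
        return System.nanoTime() - start;
    }
}
```

Holding everything in memory limits the test size, so this only suits
data that fits comfortably in the heap.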
Writing N-Triples is currently faster than parsing - parsing is fiddling
about with character-by-character checking for markers like '<' and
'>'. Length-encoded structures and someone else's tuned code beat
that easily. Just need to get the bytes->Java characters conversion
going efficiently, which is also the trick for parsing N-Triples. Writing
does not need such a copy-heavy, single-character manipulation code path
and you can output the strings more directly to a write buffer,
still checking for escape sequences on literals (single-character
operations - yuk).
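To illustrate the escape check on literals, here is a rough sketch, not
the actual riot writer. It scans each character but copies unescaped
runs to the writer in bulk. The class and method names are hypothetical,
and only the escapes N-Triples requires in a literal (quote, backslash,
LF, CR) are handled:

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;

class LiteralEscapeSketch {
    // Writes the lexical form of a literal, escaping ", \, LF and CR.
    // Runs of ordinary characters are written with one call each.
    static void writeEscaped(Writer out, String lex) throws IOException {
        int start = 0; // start of the current run of unescaped characters
        for (int i = 0; i < lex.length(); i++) {
            String esc;
            switch (lex.charAt(i)) {
                case '"':  esc = "\\\""; break;
                case '\\': esc = "\\\\"; break;
                case '\n': esc = "\\n";  break;
                case '\r': esc = "\\r";  break;
                default:   continue;     // ordinary character: extend the run
            }
            out.write(lex, start, i - start); // flush the run before the escape
            out.write(esc);
            start = i + 1;
        }
        out.write(lex, start, lex.length() - start); // trailing run
    }

    // Convenience wrapper for trying it out on a String.
    static String escape(String lex) {
        StringWriter sw = new StringWriter();
        try {
            writeEscaped(sw, lex);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sw.toString();
    }
}
```

The per-character switch is still there, which is the "yuk" part; the
bulk copy only helps when escapes are rare.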
Writing Thrift is a matter of building data structures and outputting
them, but no escape sequence checking is needed.
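The general length-prefix idea can be sketched like this. It is not
Thrift's encoder or its exact wire format, just an illustration, with
hypothetical names, of why no escaping is needed: the reader knows how
many bytes to take, so quotes and newlines in the string are plain data.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class LengthPrefixedSketch {
    // Writes a 4-byte length followed by the UTF-8 bytes; no character scan.
    static void writeString(DataOutputStream out, String s) throws IOException {
        byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
        out.writeInt(bytes.length);
        out.write(bytes);
    }

    // Reads the length, then exactly that many bytes.
    static String readString(DataInputStream in) throws IOException {
        int n = in.readInt();
        byte[] bytes = new byte[n];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encodes and decodes in memory to show the string survives unchanged.
    static String roundTrip(String s) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        writeString(new DataOutputStream(buf), s);
        return readString(new DataInputStream(new ByteArrayInputStream(buf.toByteArray())));
    }
}
```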
Andy