Hi, in a similar vein, I have a modified N-Triples reader which uses a prefix file to reduce the file size. Whilst the line-based serialisation allows parallel processing in Spark, the files were large, and prefixing has reduced them to about 1/10 of the original size on average.
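To make the idea concrete, here is a minimal sketch in Python of prefix-based shortening for line-based triples. It is an illustration only, not the modified reader itself and not a Jena API. The prefix file layout (one `prefix<TAB>namespace` pair per line), the function names, and the rule that only local names made of letters, digits, `_` and `-` are shortened are all assumptions made for this example.

```python
import re

# Token patterns. Literals and full IRIs are matched first so that text
# inside them is never mistaken for a prefixed name.
_LITERAL = r'"(?:[^"\\]|\\.)*"'
_IRI = r'<([^<>\s]*)>'
_PNAME = r'([A-Za-z][A-Za-z0-9_-]*):([A-Za-z0-9_-]*)'
_COMPRESS_RE = re.compile(_LITERAL + '|' + _IRI)
_EXPAND_RE = re.compile(_LITERAL + r'|<[^<>\s]*>|' + _PNAME)


def load_prefixes(lines):
    """Parse hypothetical 'prefix<TAB>namespace' lines into a dict."""
    prefixes = {}
    for line in lines:
        line = line.strip()
        if line:
            prefix, namespace = line.split('\t', 1)
            prefixes[prefix] = namespace
    return prefixes


def compress_line(line, prefixes):
    """Replace <namespace + local> with prefix:local where it is safe."""
    # Try the longest namespace first so the most specific prefix wins.
    by_length = sorted(prefixes.items(), key=lambda kv: len(kv[1]), reverse=True)

    def shorten(match):
        iri = match.group(1)
        if iri is None:  # matched a literal: leave it untouched
            return match.group(0)
        for prefix, namespace in by_length:
            if iri.startswith(namespace):
                local = iri[len(namespace):]
                # Only shorten local names that expand back unambiguously.
                if re.fullmatch(r'[A-Za-z0-9_-]*', local):
                    return prefix + ':' + local
        return match.group(0)

    return _COMPRESS_RE.sub(shorten, line)


def expand_line(line, prefixes):
    """Inverse of compress_line: restore full IRIs from prefixed names."""
    def lengthen(match):
        prefix, local = match.group(1), match.group(2)
        if prefix is None or prefix not in prefixes:
            return match.group(0)  # literal, full IRI, or unknown prefix
        return '<' + prefixes[prefix] + local + '>'

    return _EXPAND_RE.sub(lengthen, line)
```

Because every line is still self-contained once the (small) prefix map has been shared, the shortened file can be split by line and expanded independently on each worker, which is what keeps it usable for parallel processing.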
There isn't an existing line-based serialisation with some form of prefixing, is there?

On 17 Dec 2016 20:03, "Andy Seaborne" <a...@apache.org> wrote:

> Related:
>
> Jena now provides "Serializable" for Triple/Quad/Node
>
> It did not make 3.1.1, it's in development snapshots and in the next
> release.
>
> Use with spark was the original motivation.
>
> Andy
>
> https://issues.apache.org/jira/browse/JENA-1233
>
> On 17/12/16 09:14, Joint wrote:
>
>> Hi.
>> I was about to use the above to wrap some quads and spoof the RDDs as
>> graphs from within a dataset but before I do has this been done before? I
>> have some code which calls the RDD methods from the graph base find. Not
>> wanting to invent the wheel and such...
>>
>> Dick