On 02/08/17 13:31, Élie Roux wrote:
On 02/08/2017 at 14:13, Jean-Marc Vanel wrote:
Élie,

I would use N-Triples format, sorted in alphanumerical order.

Thank you very much for your answer! I thought about this approach but I
see two problems:

- N-Triples is hardly readable and I would prefer having my data stored as
Turtle for readability

- more importantly, this will still output a lot of diff noise because
blank node IDs will change randomly (and will not keep the same order)

Only if you reload the file ... in which case it is a different blank node.

The NT writer uses the internal label for the blank node, so if the blank node label is changing, that suggests the file is being reloaded.

This is most serious for subjects: in sorted output, the triples of a relabelled blank node subject can end up wildly far from where they were, whereas the block writer keeps triples locally grouped. Sorting by subject would need the comparison to be defined on something stable - maybe a primary key value?
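
As a rough illustration of the primary key idea, here is a minimal sketch in plain Python (standard library only, not Jena). It assumes every blank node subject carries exactly one triple with a key predicate; the predicate `http://example.org/id`, the function name `stable_sorted_ntriples` and the sample data are all hypothetical. Each blank node is renamed after a hash of its key value, then the lines are sorted, so two dumps of the same data with different blank node labels should give identical output.

```python
import hashlib
import re

# Matches simple blank node labels such as _:b0 or _:genid7.
# Assumption: no literal or IRI in the data contains text that looks like "_:label".
BNODE = re.compile(r"_:[A-Za-z0-9]+")

def stable_sorted_ntriples(lines, key_predicate):
    """Relabel blank nodes from their key value, then sort the lines."""
    mapping = {}
    for line in lines:
        parts = line.strip().split(None, 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        subject, predicate, rest = parts
        if subject.startswith("_:") and predicate == key_predicate:
            # Drop the trailing " ." to get the object term, e.g. '"42"'.
            obj = rest.rstrip().rstrip(".").strip()
            digest = hashlib.sha1(obj.encode("utf-8")).hexdigest()[:8]
            mapping[subject] = "_:k" + digest

    def rename(match):
        label = match.group(0)
        # Blank nodes without a key triple keep their original label.
        return mapping.get(label, label)

    relabeled = [BNODE.sub(rename, l.strip()) for l in lines if l.strip()]
    return sorted(relabeled)

KEY = "<http://example.org/id>"  # hypothetical key predicate

# Two dumps of the same data: different blank node labels, different order.
run_a = [
    '_:b0 <http://example.org/id> "42" .',
    '_:b0 <http://example.org/name> "Alice" .',
    '<http://example.org/s> <http://example.org/knows> _:b0 .',
]
run_b = [
    '<http://example.org/s> <http://example.org/knows> _:genid7 .',
    '_:genid7 <http://example.org/name> "Alice" .',
    '_:genid7 <http://example.org/id> "42" .',
]

same = stable_sorted_ntriples(run_a, KEY) == stable_sorted_ntriples(run_b, KEY)
print(same)
```

Blank nodes that have no key triple keep whatever label the writer gave them, so they would still produce diff noise; this only helps where such a key exists.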

Dumping a TDB database (which is N-Quads) shows the label is stable if the source is stable.

    Andy


Thank you,
