So, this is what I was asking about earlier. With small graphs, e.g. the
result of DESCRIBE <....>, the graph isomorphism algorithms that support
blank nodes should work well. rdflib includes an implementation, and I wish
I knew whether there is an implementation of that digest algorithm for Jena.

On Fri, Dec 8, 2017 at 2:27 AM, Claude Warren <[email protected]> wrote:

> On Fri, Nov 24, 2017 at 12:19 PM, Laura Morales <[email protected]> wrote:
>
> > > What about simply deleting the old graph and loading the triples of the
> > > .nt file into the graph afterwards? I don't see any benefit of such a
> > > "tool" - you could just write your own bash script for this if you need
> > > this quite often.
> >
> > The advantage is with large graphs, such as wikidata. If I download their
> > dumps once a week, it's much more efficient to only change a few triples
> > instead of deleting the entire graph and recreating the whole TDB store.
> >
>
>
> Performing a diff between two graphs with blank nodes might be sped up
> using bloom filters.
>
> I have code that represents triples as bloom filters and I know that 9-byte
> filters will work for very large graphs, so you could probably get away
> with 8 bytes to make them fit in a standard integer size.
>
> This is a multiple pass operation.
>
> Create a bloom filter for each node in graph A.  Call this list A.
>
> Step through graph B creating bloom filters for each triple. If the triple
> in question has blank nodes, only encode the non-blank nodes.
>
> If the bloom filter is not in list A, the triple is new.
>
> If the bloom filter is in list A, then the triple may still be new, so do a
> direct lookup in graph A. If it is not found, add it.
>
> If your filter list has a pointer to the triples that it represents
> (remember there can be bloom filter collisions) then you can rapidly
> determine if there is a match and you also have a good starting place to do
> blank node comparisons to determine if the triples are equivalent.
>
> If anyone is interested in trying this I have some triple/bloom filter code
> in my github repository.
>
> Claude
>
> --
> I like: Like Like - The likeliest place on the web
> <http://like-like.xenei.com>
> LinkedIn: http://www.linkedin.com/in/claudewarren
>
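
For anyone who wants to try the approach quoted above, here is a minimal
Python sketch of it. It is not the code from Claude's repository. It assumes
triples are plain tuples of strings, that blank nodes are labelled "_:...",
that the filter is 64 bits wide (the 8-byte size suggested above), and that
each term sets three bits derived from a SHA-256 digest. All function names
are made up for illustration. The blank node check only compares the
non-blank terms, so it is a starting point for a real blank node comparison
and not an isomorphism test.

```python
import hashlib

FILTER_BITS = 64      # the 8-byte size suggested in the thread
HASHES_PER_TERM = 3   # assumption: the thread does not give a value


def is_blank(term):
    """Assumption: blank nodes are written as '_:label'."""
    return term.startswith("_:")


def triple_filter(triple):
    """Return a 64-bit Bloom filter (as an int) over the non-blank terms."""
    bits = 0
    for position, term in enumerate(triple):
        if is_blank(term):
            continue  # blank node labels are not stable across graphs
        digest = hashlib.sha256(f"{position}:{term}".encode("utf-8")).digest()
        for i in range(HASHES_PER_TERM):
            chunk = int.from_bytes(digest[2 * i:2 * i + 2], "big")
            bits |= 1 << (chunk % FILTER_BITS)
    return bits


def ground_shape(triple):
    """Replace blank nodes with None so only non-blank terms are compared."""
    return tuple(None if is_blank(term) else term for term in triple)


def build_index(graph):
    """Map each filter to the triples it represents (filters can collide)."""
    index = {}
    for triple in graph:
        index.setdefault(triple_filter(triple), []).append(triple)
    return index


def candidate_additions(graph_a, graph_b):
    """Return triples of graph B that have no plausible match in graph A."""
    index = build_index(graph_a)
    added = []
    for triple in graph_b:
        # Filter not in the index: the triple is certainly new.
        candidates = index.get(triple_filter(triple), [])
        # Filter present: confirm with a direct comparison of the candidates.
        shape = ground_shape(triple)
        if not any(ground_shape(c) == shape for c in candidates):
            added.append(triple)
    return added


graph_a = [("ex:s", "ex:p", "ex:o"), ("_:b0", "ex:name", '"Ann"')]
graph_b = [
    ("ex:s", "ex:p", "ex:o"),
    ("_:x", "ex:name", '"Ann"'),
    ("ex:s", "ex:p", "ex:new"),
    ("_:y", "ex:age", '"3"'),
]
new_triples = candidate_additions(graph_a, graph_b)
print(new_triples)
```

Triples with blank nodes that survive this pass as "matched" would still need
a proper blank node comparison against the candidates in the index, which is
where an isomorphism or digest algorithm would come in.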
