Henrik

I'm not familiar with SDB so can't give you a precise answer, but I'll do
my best to address your question. See my answers inline:

On 25/03/2014 09:56, "Henrik Alstad" <[email protected]> wrote:

>Hi.
>I'm looking at the SDB indexes for the Nodes table,
>and I got a very naive question.
>
>There is a hash based Nodes table,
>where there is no ID column, and the primary key of the table is the hash.
>
>What happens if two of the lex-values by some insane coincidence get the
>same hash?

I assume that you do get an error as you suggest.

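Regardless of the hash function used there is always a collision
probability.  SDB uses MD5.  Note that the 2^20.96 figure given at
http://en.wikipedia.org/wiki/Comparison_of_cryptographic_hash_functions#Cryptanalysis
is the complexity of the best known deliberate collision attack on MD5, not
the chance of two values colliding by accident.  For a full 128-bit digest
the probability of an accidental collision among n distinct values is
roughly n^2 / 2^129, which is negligible for any realistic dataset.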
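
As a rough sanity check on accidental collisions, here is a minimal sketch
of the standard birthday approximation.  The function name and the 128-bit
default are my own assumptions for illustration (128 bits is the size of a
full MD5 digest); I have not checked how many bits SDB actually stores.

```python
import math


def collision_probability(n_values: int, hash_bits: int = 128) -> float:
    """Approximate chance that at least two of n_values distinct inputs
    share a hash, assuming the hash behaves like a uniform random function.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / 2^(bits+1)).
    """
    exponent = -n_values * (n_values - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Hypothetical store sizes, chosen only for illustration.
    for n in (10**6, 10**9, 10**12):
        print(f"{n:>15,d} values: p ~= {collision_probability(n):.3e}")
```

Passing a smaller hash_bits value shows how quickly the risk grows if only
part of the digest is kept.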

>Is there a built-in handling for this case, or is the hash-function
>somehow
>specified to be unique? or is there actually a slight possibility of just
>getting sdb-errors ("non-unique primary key" or something like that) when
>using that version?

AFAIK there is no handling for this situation.
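
To illustrate what an unhandled collision looks like in a table keyed on
the hash, here is a small sketch using SQLite.  This is not SDB's schema or
code: the table layout, the toy_hash function (deliberately cut down to 8
bits so that a collision is easy to hit) and the generated values are all
hypothetical.

```python
import hashlib
import sqlite3


def toy_hash(lex: str) -> int:
    # Deliberately tiny 8-bit hash (first byte of the MD5 digest) so that
    # collisions show up after a few hundred inserts at most.
    return hashlib.md5(lex.encode("utf-8")).digest()[0]


def first_collision(limit: int = 1000):
    """Insert generated values until the primary key rejects one.

    Returns (rejected_value, existing_value), or None if no collision
    occurred within `limit` inserts.
    """
    conn = sqlite3.connect(":memory:")
    # Hypothetical layout: no ID column, the hash is the primary key.
    conn.execute(
        "CREATE TABLE nodes (hash INTEGER PRIMARY KEY, lex TEXT NOT NULL)"
    )
    try:
        for i in range(limit):
            lex = f"value-{i}"
            h = toy_hash(lex)
            try:
                conn.execute(
                    "INSERT INTO nodes (hash, lex) VALUES (?, ?)", (h, lex)
                )
            except sqlite3.IntegrityError:
                # The database refuses the second row with the same key.
                row = conn.execute(
                    "SELECT lex FROM nodes WHERE hash = ?", (h,)
                ).fetchone()
                return lex, row[0]
        return None
    finally:
        conn.close()


if __name__ == "__main__":
    print(first_collision())
```

Without any extra handling, the second distinct value is simply rejected by
the database with a primary key violation.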

SDB is not actively supported by the project and is in maintenance mode;
none of the active developers are familiar with the code or with
maintaining it.  It has also been shown to scale poorly compared to native
triple stores.  The project recommends using the native TDB triple store
instead, which is actively maintained and developed.

Of course we recognise that there are various reasons why you might want an
SQL-backed solution - IT infrastructure limitations, project requirements,
better failover support etc - but just be aware that you will be somewhat
on your own wrt support if you choose to pursue SDB.  While the project
will try to fix bugs, we don't necessarily have the expertise to fix
anything non-trivial.

Rob

>-- 
>Cheers,
>Henrik Kjus Alstad



