Re: 24 Million entries and I need to what?

Mister Ed Fri, 27 Dec 2013 20:45:15 -0800

On 12/27/2013 9:28 PM, Levi Pearson wrote:

On Fri, Dec 27, 2013 at 8:19 PM, S. Dale Morrey <[email protected]> wrote:

Well Levi you would be quite correct if the intent were to actually seek a
collision across 256 bits of space.  That is not what I'm going for here.
  In my mind detecting a collision would be evidence of a flawed
implementation of the hashing algorithm which is what my experiment is
actually seeking to uncover.  So I'm checking the specific implementation
to rule out a flaw, not nessecarily the algorithm.


You might try thinking about it this way:

Let's say we have an algorithm, X, that performs some mathematical
function. This means for every value in the input domain, there will
be a single value from the output co-domain that the function pairs it
with. Because this function is defined by an algorithm, we have an
unambiguous recipe for determining which value from the co-domain is
paired with some arbitrary value from the input domain. We need only
follow the algorithm!

Now, we have a computer program, Y, which purports to be a correct
implementation of algorithm X.  How might we go about verifying that
it is indeed a correct implementation?

What you have proposed is that we take some bit of knowledge about the
function defined by X, namely that it has the property P that it is
extremely unlikely that two different values from the domain will map
to the same value in the co-domain.  So if we run Y a very large
number of times, and we have no collisions (i.e., it has not produced
a counterexample to P), does this give us confidence that Y is a
correct manifestation of X?  In fact, it might make us *slightly* more
confident that it is correct, but hardly enough to matter.

Consider the program Y_fake that implements an entirely different
algorithm than X, but which also has a property P, at least to the
extent that is measurable on your equipment in a reasonable amount of
time. Your test cannot differentiate between Y_fake and Y in any sense
other than a very vague and probabilistic one, and will probably only
find differences after the expense of a great deal of time and
computing power.

I suggest you skip backwards a couple of paragraphs and revisit the
real question: Is Y a correct implementation of algorithm X?  How can
we verify it?

There are a couple of approaches one could take, but I trust you can
think of a better one than your current approach.

        --Levi

Levi, if we could add the heart of a teacher to that mathematicalfundamentalist thinking, we'd have either a coding super soldier tocontend with and/or the engineering equiv of yoda: seeking the wronggoal, you are. Rephrase the question, you should. Strong with thenumbers Levi is.





/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/

Re: 24 Million entries and I need to what?

Reply via email to