Folks, for the maths curious I found a seemingly ridiculous coincidence in
the MD5 and SHA1 hash algorithms. Here's what I do in my test:

 

Get the ASCII bytes of a string.

Hash the bytes using MD5 or SHA1.

"Fold" the 16 or 20 byte hash down to 4 bytes with XOR to make an Int32.

 

By "fold" I mean that I make a short buffer out of a long one by doing this
loop over the long buffer: fold[i%4] ^= buff[i]. Doing this is questionable,
but some gut feeling told me that this would make more randomness and better
distribute the 4 byte hash. It turns out that this "folding" actually
produces a weird effect.

 

I found two strings that produce the same hash this way. Here is the output
of my test code:

 

"123" --> Hash 202CB962AC59075B964B07152D234B70 --> Fold 371DF25C

"3Px" --> Hash 02BAB61AA407FD643066D193A1C668B1 --> Fold 371DF25C

 

The strings produce quite different MD5 hashes as you would expect, but when
I XOR "fold" the buffers down I get the same 4 byte result. This seems
statistically infeasible. Using SHA1 you get collisions with "7Fw" and
"X9z". I have a dozen other examples.

 

Greg

 

using System;

using System.Security.Cryptography;

using System.Text;

 

namespace collider

{

    class Program

    {

        static void Main(string[] args)

        {

            HashAlgorithm md5 = MD5.Create();

            HashAlgorithm sha1 = SHA1.Create();

            ShowCollision(md5, "123", "3Px");

            ShowCollision(sha1, "7Fw", "X9z");

        }

 

        private static void ShowCollision(HashAlgorithm hasher, string s1,
string s2)

        {

            byte[] hash = hasher.ComputeHash(StringBytes(s1));

            byte[] fold = FoldBuffer(hash, 4);

            Trace("\n\"{0}\" --> Hash {1} --> Fold {2}", s1,
PrintBuff(hash), PrintBuff(fold));

            hash = hasher.ComputeHash(StringBytes(s2));

            fold = FoldBuffer(hash, 4);

            Trace("\"{0}\" --> Hash {1} --> Fold {2}", s2, PrintBuff(hash),
PrintBuff(fold));

        }

 

        private static string PrintBuff(byte[] buffer)

        {

            return BitConverter.ToString(buffer).Replace("-", "");

        }

 

        private static byte[] FoldBuffer(byte[] buffer, int length)

        {

            byte[] fold = new byte[length];

            for (int i = 0; i < buffer.Length; i++)

            {

                fold[i % length] ^= buffer[i];

            }

            return fold;

        }

 

        private static byte[] StringBytes(string s)

        {

            return Encoding.ASCII.GetBytes(s);

        }

 

        private static void Trace(string format, params object[] args)

        {

            Console.WriteLine(format, args);

        }

    }

}

 

 

Reply via email to