This is actually what I mentioned earlier - with rainbow tables kept in
memory it could be really fast!
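
For example, once the (endpoint -> start) pairs from the rainbow-table
sketch below are loaded into an RDD, one call keeps them resident in
executor memory between queries:

  // the first action materialises the table in RAM; lookups after
  // that are served from memory instead of HDFS
  table.cache()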

Marek


2014-06-12 9:25 GMT+02:00 Michael Cutler <mich...@tumra.com>:

> Hi Nick,
>
> The great thing about any *unsalted* hashes is you can precompute them
> ahead of time, then it is just a lookup to find the password which matches
> the hash in seconds -- always makes for a more exciting demo than "come
> back in a few hours".
>
> It is a no-brainer to write a generator function that creates every
> possible password from a charset like
> "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789", hashes
> each one, and stores the results for lookup later (a quick sketch follows
> the numbers below).  It is, however, incredibly wasteful on storage space:
>
> - all passwords from 1 to 9 characters long
> - using the 62-character set above, that is 62^1 + 62^2 + ... + 62^9
>   = 13,759,005,997,841,642 passwords
> - at 20 bytes to store the SHA-1 and up to 9 for the password (~29 bytes
>   per entry), that comes to roughly 399 petabytes
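>
> As a rough sketch of that generator (Scala -- the helper names are mine,
> nothing standard): map each index below the total above to a unique
> password, then hash it.
>
>   import java.security.MessageDigest
>
>   val charset =
>     "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
>
>   // bijective base-62: every Long below the 13.7 quadrillion total
>   // yields a distinct password of 1 to 9 characters
>   def password(ix: Long): String = {
>     var i = ix
>     val sb = new StringBuilder
>     do {
>       sb += charset((i % 62).toInt)
>       i = i / 62 - 1          // the -1 makes shorter lengths reachable
>     } while (i >= 0)
>     sb.toString
>   }
>
>   def sha1Hex(s: String): String =
>     MessageDigest.getInstance("SHA-1")
>       .digest(s.getBytes("UTF-8"))
>       .map("%02x".format(_)).mkString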
>
> Thankfully there is a far more compact mechanism to achieve this: Rainbow
> Tables <http://en.wikipedia.org/wiki/Rainbow_table>.  Better still, an
> active community of people has already precomputed many of these datasets.
> The one above is readily available to download at just 864GB -- much more
> feasible.
>
> All you need to do then is write a rainbow-table lookup function in Spark
> and leverage the precomputed files stored in HDFS.  Done right, you should
> be able to achieve interactive (few-second) lookups.
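>
> To make that concrete, here is a minimal sketch (Scala, reusing charset
> and the MessageDigest import from the sketch above).  Everything here is
> an assumption: the chain length, the table's on-disk format ("start end"
> per line), and especially the reduction function -- a real table defines
> its own, and a lookup only works with exactly that one.
>
>   val chainLen = 10000                  // assumed chain length
>
>   def sha1(s: String): Array[Byte] =
>     MessageDigest.getInstance("SHA-1").digest(s.getBytes("UTF-8"))
>
>   // fold a hash (plus its chain position) back into the password
>   // space; fixed 8-char output for brevity
>   def reduce(h: Array[Byte], pos: Int): String = {
>     var n = BigInt(1, h) + pos
>     val sb = new StringBuilder
>     for (_ <- 1 to 8) { sb += charset((n % 62).toInt); n /= 62 }
>     sb.toString
>   }
>
>   // precomputed chains from HDFS, keyed by their endpoint
>   val table = sc.textFile("hdfs:///rainbow/table.txt")
>     .map(_.split(" "))
>     .map(a => (a(1), a(0)))             // (endpoint, start)
>
>   val target = sha1("s3cretPw")         // the hash we want to invert
>
>   // if the target sits at position i of some chain, walking forward
>   // from it must reproduce that chain's endpoint
>   val candidates = sc.parallelize(0 until chainLen).map { i =>
>     var p = reduce(target, i)
>     for (j <- i + 1 until chainLen) p = reduce(sha1(p), j)
>     (p, i)
>   }
>
>   val cracked = candidates.join(table)  // endpoints found in the table
>     .map { case (_, (i, start)) =>      // rebuild the chain up to i
>       var p = start
>       for (j <- 0 until i) p = reduce(sha1(p), j)
>       p
>     }
>     .filter(p => sha1(p).sameElements(target))
>
>   cracked.take(1)                       // the password, if it was found
>
> Caching the table (as Marek notes above) avoids re-reading HDFS on
> every query.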
>
> Have fun!
>
> MC
>
>
>
>
> On 12 June 2014 01:24, Nick Chammas <nicholas.cham...@gmail.com> wrote:
>
>> Spark is obviously well-suited to crunching massive amounts of data. How
>> about crunching massive amounts of numbers?
>>
>> A few years ago I put together a little demo for some co-workers to
>> demonstrate the dangers of using SHA1
>> <http://codahale.com/how-to-safely-store-a-password/> to hash and store
>> passwords. Part of the demo included a live brute-forcing of hashes to show
>> how SHA1's speed made it unsuitable for hashing passwords.
>>
>> I think it would be cool to redo the demo, but utilize the power of a
>> cluster managed by Spark to crunch through hashes even faster.
>>
>> But how would you do that with Spark (if at all)?
>>
>> I'm guessing you would create an RDD that somehow defines the search
>> space you're going to explore, and then partition it to divide the work
>> up evenly amongst the cluster's cores. Does that sound right?
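>>
>> Something like this, perhaps (Scala; password and sha1Hex are the
>> hypothetical helpers sketched in the reply above, mapping an index to
>> a candidate password and hex-encoding its SHA-1):
>>
>>   // the search space is just the integers 0 .. n-1; sc.range splits
>>   // it into slices so each core scans a disjoint block of indices
>>   val n = 13759005997841642L          // all 1..9-char candidates
>>   val targetHex =                     // sha1("password"), as a test
>>     "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8"
>>   val hit = sc.range(0L, n, numSlices = 10000)
>>     .map(password)                    // index -> candidate password
>>     .filter(p => sha1Hex(p) == targetHex)
>>     .take(1)                          // scans partitions incrementally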
>>
>> I wonder if others have already used Spark for computationally-intensive
>> workloads like this, as opposed to just data-intensive ones.
>>
>> Nick
>>
>>
>>
>
>
