On 2013-11-13, at 8:13 PM, Jeffrey Walton <[email protected]> wrote:
> Is anyone aware of a blacklist that includes those 150 million records
> from Adobe's latest breach?

You are aware that these haven’t all been decrypted? (Or is there some news I’ve missed?)

The passwords were encrypted, unsalted, using 3DES in ECB mode. But the actual encryption key is unknown. So the way that passwords have been “decrypted” is on a case-by-case basis. For example, if we have, say, 100,000 users using the same password, and one of them credibly ‘fesses up to what their password was, then we know what that password was for all of those users. This is reinforced by the fact that many of the passwords included password hints, often simply saying what the password was.

We can also work out what some of the more popular passwords are by comparing with other breaches. For example, if [email protected] is known to use the password snoopy1 in both the Sony and LinkedIn breaches, and gives the same hint in the Adobe data, that is a big clue. If we find a few dozen other reusers that way, we can say with high confidence what that particular password is.

The ECB mode and small block size of 3DES have also been helpful. So suppose we have about 6700 people corresponding to this password

    6682 /NpNslkFN4nioxG6CatHBw==

and 3402 corresponding to this one

    3402 /FkacZU/hWrioxG6CatHBw==

Even with the base64 encoding, you can see that the second block of each of those passwords is the same, as it encrypts to

    ioxG6CatHBw

(I really should convert the base64 to hex.)

So if we find (and I haven’t correlated what I’m working on with actual passwords, so now this is hypothetical) that ioxG6CatHBw appears for the last block of the encryption of “password1”, then we know that it is the encryption of “1” plus padding.

Even a cursory glance at the data and you see penguins.

My project is on the relative frequency of passwords, so I’m not actually trying to figure out the plaintext.
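The block comparison above can be sketched in a few lines of Python. This is only a minimal sketch, assuming the quoted strings are base64 encodings of ciphertext made up of 8-byte 3DES blocks; the helper name `blocks` and the variable names are hypothetical and not taken from any tool used on the dump.

```python
import base64

BLOCK_SIZE = 8  # 3DES works on 64-bit (8-byte) blocks


def blocks(b64_ciphertext):
    """Decode a base64 ciphertext and return its 8-byte blocks as hex strings."""
    raw = base64.b64decode(b64_ciphertext)
    return [raw[i:i + BLOCK_SIZE].hex() for i in range(0, len(raw), BLOCK_SIZE)]


# The two ciphertexts quoted in the message (variable names are illustrative).
first = blocks("/NpNslkFN4nioxG6CatHBw==")
second = blocks("/FkacZU/hWrioxG6CatHBw==")

print(first)
print(second)

# In ECB mode each block is encrypted independently, so equal ciphertext
# blocks under the same key imply equal plaintext blocks.
print("final blocks match:", first[-1] == second[-1])
```

Working in hex rather than base64 makes the block boundary explicit: base64 characters carry 6 bits each, so an 8-byte boundary falls in the middle of a character and the shared block is harder to spot by eye.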

Several people have noticed that the popularity of passwords resembles a power law distribution. David Malone and colleagues have specifically looked at this.

@article{MaloneMaher11:CoRR,
  Author = {Malone, David and Maher, Kevin},
  Journal = {CoRR},
  Title = {Investigating the Distribution of Password Choices},
  Volume = {abs/1104.3722},
  Year = {2011}}

And I’ve seen similar in my own work.

The “problem” is that if the power law distribution holds up with a “big” exponent (near or above 1), then that would indicate a situation where popularity contributes to popularity. So I want the resemblance to a power law distribution to be superficial. There are other distributions that can look similar. Either that, or I want an explanation for why the popularity of a password choice would make it more attractive to others. Are people really being influenced in their password choices by others?

I think that high password reuse might be able to account for some of the power-lawish distribution, but I haven’t quite worked that out.

At any rate, this data dump is perfect for me. I’ve only just begun working on it, but unsalted ECB-encrypted passwords allow me to count frequencies.

> I tried finding a list and was not successful.

There isn’t a list of these decrypted. Jeremi Gosney has, in collaboration with others, worked out what the passwords are for the 100 most frequent. Troy Hunt is doing some excellent work on correlating with other breaches.

Cheers,

-j

--
Jeffrey Goldberg
Chief Defender Against the Dark Arts @ AgileBits
http://agilebits.com

_______________________________________________
cryptography mailing list
[email protected]
http://lists.randombit.net/mailman/listinfo/cryptography
