Re: [cryptography] Oddity in common bcrypt implementation
On 28/06/11 1:01 PM, Paul Hoffman wrote: "And this discussion of ASCII and internationalization has what to do with cryptography, asks the person on the list who is probably most capable of arguing about it but won't? [1] --Paul Hoffman [1] RFC 3536, and others" I personally think this list is about users of crypto, rather than cryptographer-creators in particular. The former are mostly computer scientists who think in block-algorithm form; the latter are more the mathematicians. As a crypto-plumber (a computer-science user of crypto) I think it is impossible to divorce crypto from all the other security techniques, all the way up the stack. Or, put another way: talking about non-crypto security techniques like passwords is punishment for mucking up the general deployment of better crypto techniques. iang ___ cryptography mailing list cryptography@randombit.net http://lists.randombit.net/mailman/listinfo/cryptography
Re: [cryptography] Oddity in common bcrypt implementation
James A. Donald jam...@echeque.com writes: "I rather think it is the right forum, for this forum is applied cryptography, and application usually requires password handling. If we are going to go beyond seven-bit ASCII, Unicode is the only thing that is going to avoid compatibility hell." I realise it's fun to jump in and bikeshed the living daylights out of this, but before we do that we'd really need to figure out whether the cure is worse than the disease. For example, my own code treats a password as an opaque string of bytes in whatever the local system's character set is. It's used all over the place, including CJK countries and other places whose character sets are about as far removed from ASCII as you can get. So far I've had exactly zero complaints about i18n or c18n-based password issues. [Pause] Yup, just counted them again, definitely zero. It turns out that most of the time when people are entering their passwords to, for example, unlock a private key, they don't have it spread across multiple totally dissimilar systems. Now I'm sure I could dream up all manner of pathological corner cases where the straight stream-of-bytes approach won't work, but in practice they just don't seem to crop up. So compatibility hell isn't. Given the high level of difficulty in dealing with this (I suspect that reliably handling all types of character encodings, not to mention assorted bugs and erratic behaviour in different implementations, across all possible systems, is a more or less unsolvable problem) and the fact that so far it's never cropped up as an issue, I'm sticking with the string-of-bytes interpretation. Peter.
Re: [cryptography] Oddity in common bcrypt implementation
On 06/29/2011 06:49 AM, Peter Gutmann wrote: So far I've had exactly zero complaints about i18n or c18n-based password issues. [Pause] Yup, just counted them again, definitely zero. Turns out that most of the time when people are entering their passwords to, for example, unlock a private key, they don't have it spread across multiple totally dissimilar systems. Well I work on an implementation of the RADIUS thing as previously described. It's got a ton of users, some even in Asian countries, using it to interoperate with other vendors' products. I don't recall many users having password issues with character sets either. But I also know I could probably sit down and construct a broken case rather quickly. Nevertheless, if someone does report an unexplained issue we might ask if there are any weird, special characters in their password. (Actually, it's more complex than that. We reiterate that we would never ask them for their password but hint that special characters might be a source of problems.) So this suggests probably some combination of: 1. We picked the right encoding transformation logic. We receive the credentials via RADIUS and usually validate them against the Windows API which accepts UTF-16LE. IIRC we interpret the RADIUS credentials as what Windows calls ANSI for this. 2. Admins who configure these systems in other markets have learned how to adjust their various systems for their local encodings in ways that never required our support. Perhaps from past experience they are reluctant to ask us simple ASCII Americans for help troubleshooting this type of issue. 3. Users everywhere choose very simple ASCII passwords and are reluctant to report issues with special characters all the way up to us vendors. Right now we're giving Solar Designer several bits of entropy for free. If we could solve the 'high bit' problem, it could be a significant increase in effective security for a lot of people. 
- Marsh
Re: [cryptography] Oddity in common bcrypt implementation
On Jun 29, 2011, at 4:49 AM, Peter Gutmann wrote: "I realise it's fun to jump in and bikeshed the living daylights out of this," +1. i18n geeks do this all the time. "but before we do that we'd really need to figure out whether the cure is worse than the disease." +1. When we first applied i18n to passwords, we argued long and hard about it because there were strong arguments on both sides. We ended up with "yes, we should do it" and can now evaluate what came of that decision. "For example my own code treats a password as an opaque string of bytes in whatever the local system's character set is. It's used all over the place, including CJK countries and other places whose character sets are about as far removed from ASCII as you can get." That would be the billion-plus people who use the Arabic script for many different written languages. "So far I've had exactly zero complaints about i18n or c18n-based password issues." And that's a very valuable data point. It weighs heavily against the developer problems we have seen with using stringprep in password preparation. In retrospect, I believe we made the wrong choice in processing passwords with this complex algorithm, but I don't think we could have known that at the time. --Paul Hoffman
Re: [cryptography] Oddity in common bcrypt implementation
On 2011-06-29 7:01 PM, Ian G wrote: "On 28/06/11 1:01 PM, Paul Hoffman wrote: And this discussion of ASCII and internationalization has what to do with cryptography, [...] I personally think this list is about users of crypto, rather than cryptographer-creators in particular. The former are mostly computer scientists who think in block-algorithm form, the latter are more the mathematicians. As a crypto-plumber (computer science user of crypto) I think it is impossible to divorce crypto from all the other security techniques. All the way up the stack." Crypto plumbing is on topic. Thus password normalization is on topic. One problem with Unicode is that identical characters often have multiple codes, one for each character meaning. Also, characters that are in some sense composite may be represented both as two characters or as a single character. Thus the exact same password string, in visible symbols, may have multiple codes. The user types what he reasonably believes to be the password, but it does not work! Thus the password has to be normalized before being hashed. Further, variants of a single character with a single meaning often have multiple codes as well - there is no sharp boundary between the string and formatting information, though this is more a problem for Unicode searching than for Unicode passwords.
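The multiple-codes problem described above is easy to demonstrate. The following is an editorial sketch (not code from the thread), using Python's standard unicodedata module:

```python
import unicodedata

# The same visible password "café", typed two ways:
composed = "caf\u00e9"     # precomposed U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "cafe\u0301"  # "e" followed by U+0301 COMBINING ACUTE ACCENT

# They render identically but are different code point sequences,
# so a hash over the raw bytes will differ:
print(composed == decomposed)              # False
print(composed.encode("utf-8").hex())      # 636166c3a9
print(decomposed.encode("utf-8").hex())    # 63616665cc81

# Normalizing both to NFC before hashing makes them agree:
nfc_a = unicodedata.normalize("NFC", composed)
nfc_b = unicodedata.normalize("NFC", decomposed)
print(nfc_a == nfc_b)                      # True
```

Whether the input method produces the composed or the decomposed form depends on the OS and keyboard layout, which is exactly why the user can type what is visibly the right password and still fail.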
Re: [cryptography] Oddity in common bcrypt implementation
On Wed, Jun 29, 2011 at 11:06 AM, Marsh Ray ma...@extendedsubset.com wrote: On 06/29/2011 06:49 AM, Peter Gutmann wrote: So far I've had exactly zero complaints about i18n or c18n-based password issues. [Pause] Yup, just counted them again, definitely zero. Turns out that most of the time when people are entering their passwords to, for example, unlock a private key, they don't have it spread across multiple totally dissimilar systems. Well I work on an implementation of the RADIUS thing as previously described. It's got a ton of users, some even in Asian countries, using it to interoperate with other vendors' products. I don't recall many users having password issues with character sets either. But I also know I could probably sit down and construct a broken case rather quickly. Nevertheless, if someone does report an unexplained issue we might ask if there are any weird, special characters in their password. (Actually, it's more complex than that. We reiterate that we would never ask them for their password but hint that special characters might be a source of problems.) So this suggests probably some combination of: 1. We picked the right encoding transformation logic. We receive the credentials via RADIUS and usually validate them against the Windows API which accepts UTF-16LE. IIRC we interpret the RADIUS credentials as what Windows calls ANSI for this. From my interop-ing experience with Windows, Linux, and Apple (plus their mobile devices), I found the best choice for password interoperability was UTF-8, not UTF-16. I've used UTF-8 with classical password file schemes, EAP-PSK, and Thomas Wu's SRP. UTF-8 works great for serialization and with other libraries, such as Crypto++ and OpenSSL (sorry Dr. Gutmann!). Plus, the Windows standard stream stuff was [is?] half broken for the wide character sets. So on Windows, you're going to have to work around wostream problems or use the narrow gear on the command line.
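The UTF-8 versus UTF-16 choice matters because the byte serialization of a password, and therefore anything hashed from it, changes with the encoding. A minimal illustration (editorial sketch, not from any implementation discussed here):

```python
import hashlib

# The same password serializes to different octets under different
# encodings, so two systems that disagree on the encoding will never
# verify each other's password hashes.
pw = "p\u00e4ssword"   # "pässword"

utf8 = pw.encode("utf-8")          # 9 bytes:  70 c3 a4 73 73 77 6f 72 64
utf16le = pw.encode("utf-16-le")   # 16 bytes: 70 00 e4 00 73 00 ...

print(len(utf8), len(utf16le))     # 9 16
print(hashlib.sha256(utf8).hexdigest() == hashlib.sha256(utf16le).hexdigest())
# False
```

Note that for a pure-ASCII password the UTF-8 bytes coincide with the ASCII bytes, which is part of why these mismatches go unnoticed until someone uses a non-ASCII character.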
Jeff
Re: [cryptography] Oddity in common bcrypt implementation
On 06/29/2011 05:41 PM, Jeffrey Walton wrote: "From my interop-ing experience with Windows, Linux, and Apple (plus their mobile devices), I found the best choice for password interoperability was UTF-8, not UTF-16." I use UTF-8 whenever possible, too. Just to be clear here, the native OS Win32 API that must be used in some configurations accepts UTF-16LE passwords for authentication. That's not my choice. Neither is it my choice what encoding the remote endpoint happens to be using. It doesn't even tell me. My code simply has to convert between them in the least-broken manner possible. The realities of crypto authentication protocol implementation mean I can't log the decrypted password for debugging or ask the user about it either. I actually added a heuristic that counts the number of typical characters and logs a message to the effect of "hmm, looks like this thing may not have decoded properly, maybe the shared secret isn't correct." That little diagnostic has proven quite helpful at times. - Marsh
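The "does this look like it decoded properly?" heuristic described above might look something like the following sketch. The threshold and character classes here are guesses for illustration, not the actual product's logic:

```python
# Rough heuristic: if too few bytes of a decrypted password are printable
# ASCII, the RADIUS shared secret is probably wrong or the encoding
# mismatched, and a diagnostic can be logged without logging the password.
def looks_mis_decoded(decrypted: bytes, threshold: float = 0.8) -> bool:
    if not decrypted:
        return True
    printable = sum(1 for b in decrypted if 0x20 <= b <= 0x7e)
    return printable / len(decrypted) < threshold

print(looks_mis_decoded(b"correct horse battery"))         # False
print(looks_mis_decoded(bytes([0x8f, 0x03, 0xe2, 0x91])))  # True
```

The obvious weakness is that a legitimate non-ASCII password also trips the check, which is why this can only ever drive a hint in a log message, never a hard failure.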
Re: [cryptography] Oddity in common bcrypt implementation
On Jun 29, 2011, at 11:36 AM, James A. Donald wrote: "Thus the password has to be normalized before being hashed. Further, variants of a single character with a single meaning often have multiple codes - there is no sharp boundary between the string and formatting information, though this is more a problem for Unicode searching than for Unicode passwords." Meh. I've dealt with this issue in cases where the password is (sadly) by necessity not even Unicode, but a series of raw scan codes from the keyboard. I've dealt with them like Peter does -- they're just a blob of bytes -- and had only slightly more problems than he's had. There are a couple of possible solutions, but the "use the same keyboard" one has practical appeal. My experience says that while strictly speaking you are correct, in practice it's not an issue, because you almost always are hashing the same blob of bytes. Note that I'm not saying it can't go wrong, nor that it won't go wrong. Merely that I don't see it as an issue that must be solved up front. It's an issue that needs to be considered, but it's less of a problem than you'd think. Jon
Re: [cryptography] Oddity in common bcrypt implementation
On 06/27/2011 06:30 PM, Sampo Syreeni wrote: On 2011-06-20, Marsh Ray wrote: "I once looked up the Unicode algorithm for some basic case insensitive string comparison... 40 pages!" "Isn't that precisely why e.g. Peter Gutmann once wrote against the canonicalization (in the Unicode context, normalization) that ISO derived crypto protocols do, in favour of the bytes-are-bytes approach that PGP/GPG takes?" Yes, but in most actual systems the strings are going to get handled. It's more a question of whether or not your protocol specification defines the format it's expecting. Humans tend to not define text very precisely, and computers don't work with it directly anyway; they only work with encoded representations of text as character data. Even a simple accented character in a word or name can be represented in several different ways. Many devs (particularly Unixers :-) in the US, AU, and NZ have gotten away with the 7-bit ASCII assumption for a long time, but most of the rest of the world has to deal with locales, code pages, and multi-byte encodings. This seemed to let older IETF protocol specs get away without a rigorous treatment of the character data encoding issues. (I suspect one factor in the lead of the English-speaking world in the development of 20th century computers and protocols is that we could get by with one of the smallest character sets.) Let's say you're writing a piece of code like: if (username == "root") { // avoid doing something insecure with root privs } The logic of this example is probably broken in important ways, but the point remains: sometimes we need to compare usernames for equality in contexts that have security implications. You can only claim bytes are bytes up until the point that the customer says they have a directory server which compares usernames case insensitively. For most things verbatim binary is the right choice.
However, a password or pass phrase is specifically character data which is the result of a user input method. "If you want to do crypto, just do crypto on the bits/bytes. If you really have to, you can tag the intended format for forensic purposes and sign your intent. But don't meddle with your given bits. Canonicalization/normalization is simply too hard to do right or even to analyse to have much place in protocol design." Consider RADIUS. The first RFC http://tools.ietf.org/html/rfc2058#section-5.2 says nothing about the encoding of the character data of the password field; it just treats it as a series of octets. So what do you do when implementing RADIUS on an OS that gives user input to your application with UTF-16LE encoding? If you don't meddle with your given bits and just pass them on to the protocol layer, they are almost guaranteed to be non-interoperable. Later RFCs http://tools.ietf.org/html/rfc2865 have added, in most places, "It is recommended that the message contain UTF-8 encoded 10646 characters." I think this is a really practical middle ground. Interestingly, it doesn't say this for the password field, likely because the authors figured it would break some existing underspecified behavior. So exactly which characters are allowed in passwords, and how are they to be represented for interoperable RADIUS implementations? I have no idea, and I help maintain one! Consequently, we can hardly blame users for not using special characters in their passwords. - Marsh
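The conversion dilemma described above, an OS that hands the application UTF-16LE while interoperable RADIUS peers (per the RFC 2865 recommendation for text attributes) expect UTF-8, can be sketched as follows. The function name is illustrative, not from any real RADIUS implementation:

```python
# Re-encode a UTF-16LE password from the OS into UTF-8 octets before it
# goes into the protocol's password field. This "meddles with the bits",
# but passing the UTF-16LE octets through verbatim would be
# non-interoperable with peers that assume a byte-oriented encoding.
def password_octets_for_radius(os_password_utf16le: bytes) -> bytes:
    text = os_password_utf16le.decode("utf-16-le")
    return text.encode("utf-8")

# "pä" as the OS delivers it (UTF-16LE) vs. what goes on the wire (UTF-8):
from_os = "p\u00e4".encode("utf-16-le")           # 70 00 e4 00
print(password_octets_for_radius(from_os).hex())  # 70c3a4
```

In the real protocol the result would then be padded and XOR-obfuscated per the User-Password rules; the sketch stops at the encoding step, which is the part the RFCs leave underspecified.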
Re: [cryptography] Oddity in common bcrypt implementation
On 28/06/11 11:25 AM, Nico Williams wrote: On Tue, Jun 28, 2011 at 9:56 AM, Marsh Ray ma...@extendedsubset.com wrote: "Consequently, we can hardly blame users for not using special characters in their passwords." "The most immediate problem for many users w.r.t. non-ASCII in passwords is not the likelihood of interop problems but the heterogeneity of input methods and input method selection in login screens, password input fields in apps and browsers, and so on, as well as the fact that they can't see the password they are typing to confirm that the input method is working correctly." This particular security idea came from terminal laboratories in the 1970s and 1980s, where annoying folk would look over your shoulder to read your password as you typed it. The assumption of people looking over your shoulder is well past its use-by date. These days we work with laptops, etc., which all make for a more private setting. Even Internet cafes have their privacy shields between booths. There are still some lesser circumstances where this is an issue (using your laptop in a crowded place, or typing a PIN onto a reader/ATM). Indeed, in the latter case the threat is a camera that picks up the keys as they are typed. But for the most part, we should be deprecating the practice at its mandated level and exploring optional or open methods. For example: oddly enough, mobiles are ahead of other systems here in that they show the user the *last/current* character of any password being entered. iang
Re: [cryptography] Oddity in common bcrypt implementation
And this discussion of ASCII and internationalization has what to do with cryptography, asks the person on the list who is probably most capable of arguing about it but won't? [1] --Paul Hoffman [1] RFC 3536, and others
Re: [cryptography] Oddity in common bcrypt implementation
On 06/28/2011 10:36 AM, Ian G wrote: On 28/06/11 11:25 AM, Nico Williams wrote: "The most immediate problem for many users w.r.t. non-ASCII in passwords is not the likelihood of interop problems but the heterogeneity of input methods and input method selection in login screens, password input fields in apps and browsers, and so on, as well as the fact that they can't see the password they are typing to confirm that the input method is working correctly." "This particular security idea came from terminal laboratories in the 1970s and 1980s where annoying folk would look over your shoulder to read your password as you typed it." Hardcopy terminals were common even into the 80s. Obviously you don't want the password lying around on printouts. Even worse, some terminals couldn't disable the local echo as characters were typed. The best the host could do for password entry was to overprint a bunch of characters on the printout beforehand to obscure it. "The assumption of people looking over your shoulder is well past its use-by date." +1 Perhaps someday our systems will be secure enough that shoulder-surfing is a problem worth worrying about again. "Oddly enough mobiles are ahead of other systems here in that they show the user the *last/current* character of any passwords they are entering." Don't forget, the person in the room with you may have a 5 megapixel video camera in their shirt pocket with a view of your keyboard. - Marsh
Re: [cryptography] Oddity in common bcrypt implementation
On 06/28/2011 12:01 PM, Paul Hoffman wrote: "And this discussion of ASCII and internationalization has what to do with cryptography, asks the person on the list who is probably most capable of arguing about it but won't? [1]" It's highly relevant to the implementation of cryptographic systems: as Nico mentioned, interoperability depends on it, and the nature of cryptographic authentication systems tends to obscure the problems. Sometimes security vulnerabilities result. The old LM (LanMan) password hashing scheme uppercased everything for no good reason. Perhaps they did it out of a desire to avoid issues with accented lower-case characters. Look at these test vectors for PBKDF2: http://tools.ietf.org/html/rfc6070 None of them has the high bit set on any password character! It seems there was a recent bcrypt implementation issue that escaped notice for a long time because the test vectors had this same property, and some cryptographically weak credentials were issued as a result. One of the 8 bits in each byte of the key material is strongly biased towards 0. This loss of entropy is especially significant when the entirety of the input is limited to 8 or so characters, as is common. Wow, this sounds a lot like the way 64-bit DES was weakened to 56 bits. - Marsh
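The all-ASCII test vector problem is easy to check and easy to fix. An editorial sketch (the non-ASCII vector below is computed by this code, not taken from any RFC):

```python
import hashlib

# Every RFC 6070 PBKDF2 test vector password ("password",
# "passwordPASSWORDpassword", ...) is pure ASCII, so the high bit of every
# input byte is 0 and sign-extension bugs like the crypt_blowfish one are
# never exercised:
for pw in [b"password", b"passwordPASSWORDpassword"]:
    assert all(b < 0x80 for b in pw)   # high bit never set

# A high-bit test vector is trivial to generate with any known-good
# implementation:
dk = hashlib.pbkdf2_hmac("sha1", "p\u00e4ssword".encode("utf-8"),
                         b"salt", 1, 20)
print(dk.hex())
```

Publishing even one such vector alongside the ASCII ones would have flagged implementations that mishandle bytes above 0x7f.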
Re: [cryptography] Oddity in common bcrypt implementation
On 06/28/2011 12:48 PM, Steven Bellovin wrote, replying to my "Wow, this sounds a lot like the way 64-bit DES was weakened to 56 bits": "It wasn't weakened -- parity bits were rather important circa 1974. (One should always think about the technology of the time.)" It's a very reasonable-sounding explanation, particularly for the time. http://en.wikipedia.org/wiki/Robbed_bit_signaling is even still used for things like T-1 lines. But somehow the system managed to handle 64-bit plaintexts and 64-bit ciphertexts. Why would they need to shorten the key? Of the three different data types, the key would be the one LEAST often sent across serial communications lines needing parity. And if error correction were needed on the key for some kind of cryptographic security reason, then 8 bits would hardly seem to be enough. What am I missing here? "The initial and final permutations were rightly denounced as cryptographically irrelevant (though it isn't clear that that would be true in a secret design; the British had a lot of trouble until they figured out the static keyboard map of the Enigma), but they weren't there for cryptographic reasons; rather, they were an artifact of a serial/parallel conversion." Interesting. - Marsh
Re: [cryptography] Oddity in common bcrypt implementation
On 2011-06-28, Marsh Ray wrote: "Yes, but in most actual systems the strings are going to get handled." Is this really necessarily true, or just an artifact of how things are implemented now? Or even of a simple-minded implementation? Take the case of passwords and usernames. It might make some sense to do a case-insensitive username comparison. It wouldn't hurt security much, it might help usability and interoperability, and it constitutes desirable eye-candy. But a case-insensitive password compare?!? For some reason I don't think anybody would want to go there, and that almost everybody would want the system to fail safe rather than do anything but pass around (type-tagged) bits. I mean, would anybody really like a spell checker in their ATM? "It's more a question of whether or not your protocol specification defines the format it's expecting." It's not all about that. There's also the issue of implementability, testability, and the ability to promote (and then withstand) analysis. Any system that dedicates forty pages of text to string comparison doesn't have those attributes. It doesn't promote security proper, but rather bloated software, difficulties with interoperability, insecure workarounds, and even plain security through obscurity. As a case in point, the Unicode normalization tables have changed numerous times in the past, and they aren't even the whole story. True, after some pressure from crypto folks they finally fixed the normalization target at something like v3.2 or what-have-you. But then that too will in time lead to a whole bulk of special cases and other nastiness, which then promotes versioning difficulties, code that is too lengthy to debug properly, and diversion of resources from sound security engineering towards what I'm tempted to call politically correct software engineering. I mean, you've certainly already seen what happened in the IETF IDN WG wrt DNS phishing...
If I ever saw a kluge, attempts at homograph elimination (a form of normalization) are it. "Humans tend to not define text very precisely and computers don't work with it directly anyway, they only work with encoded representations of text as character data." Passwords aren't text in the normal sense, precisely because they should be the only thing human-keyed crypto depends on for security. As for the rest of the text... tag it and bag it as-is. At least the original intent can then be uncovered forensically, if need be - unlike if you go around twiddling your bits on the way. "Many devs (particularly Unixers :-) in the US, AU, and NZ have gotten away with the 7-bit ASCII assumption for a long time, but most of the rest of the world has to deal with locales, code pages, and multi-byte encodings." Finnish people don't, and never have. "Let's say you're writing a piece of code like: if (username == 'root') { ... } The logic of this example is probably broken in important ways but the point remains: sometimes we need to compare usernames for equality in contexts that have security implications." Then you write it out as 'root' and it matches 'root' because you wrote 'root' the first time around. Plus, that is also why those security- and interoperability-sensitive things have been pared down to a minimal, common character set in the first place. "You can only claim bytes are bytes up until the point that the customer says they have a directory server which compares usernames case insensitively." If there's a security implication, you should then probably fail safe and wait for the software vendor to fix the possible interoperability bug. "The first RFC http://tools.ietf.org/html/rfc2058#section-5.2 says nothing about the encoding of the character data of the password field, it just treats it as a series of octets." Yeah. That's sloppy, compared to today's standards and environments.
I've in fact often wondered why language/encoding/etc. considerations aren't a mandatory section in an RFC, like security is - even when dealing with manifestly user-input character data. "Consequently, we can hardly blame users for not using special characters in their passwords." Can you really blame the user for anything? -- Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front +358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
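The homograph problem alluded to above is worth making concrete, because it is one normalization explicitly does not solve. An editorial sketch:

```python
import unicodedata

# Cyrillic "а" (U+0430) is visually identical to Latin "a" (U+0061), but
# no Unicode normalization form unifies them - they are different letters,
# not different encodings of one letter.
latin = "paypal"
mixed = "p\u0430yp\u0430l"   # both "a"s replaced with Cyrillic а

print(latin == mixed)        # False, though the two render identically
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    assert unicodedata.normalize(form, mixed) != latin
```

Eliminating confusables therefore requires separate, hand-maintained tables of "characters that merely look alike", which is exactly the kind of kluge the IDN work ran into.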
Re: [cryptography] Oddity in common bcrypt implementation
On Jun 28, 2011, at 2:46:31 PM, Marsh Ray wrote: "But somehow the system managed to handle 64-bit plaintexts and 64-bit ciphertexts. Why would they need to shorten the key? Of the three different data types it would be the thing that was LEAST often sent across serial communications lines needing parity. If error correction was needed on the key for some kind of cryptographic security reasons, then 8 bits would hardly seem to be enough. What am I missing here?" Errors in plaintext weren't nearly as important. In text, the normal redundancy of natural language suffices; even for otherwise-unprotected data, a random error affects only one data item. For ciphertext, the modes of operation provide a range of different choices on error propagation. In either case, higher-level protocols could provide more detection or correction. A single-bit error in a key, however, could be disastrous: everything is garbled. Even hardware wasn't nearly as reliable then; it was not at all uncommon to have redundant circuitry (at least in mainframes) for registers and ALUs, using the complement output of the flip-flops used for registers (http://en.wikipedia.org/wiki/Flip-flop_%28electronics%29). And there were fill devices: http://en.wikipedia.org/wiki/Fill_device -- the path from one to the crypto device really needed error detection. Beyond that -- we know from Biham and Shamir that the inherent strength of DES is ~54 bits against differential cryptanalysis; having more bits go into the key schedule doesn't help.
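The parity convention being discussed is simple to state in code: each of the 8 DES key bytes carries 7 key bits plus a low-order odd-parity bit, so a single-bit error in a transmitted key is detectable. A short sketch:

```python
# DES key parity: the least significant bit of each key byte is set so
# that every byte has an odd number of 1 bits, leaving 56 effective
# key bits out of 64.
def has_odd_parity(byte: int) -> bool:
    return bin(byte).count("1") % 2 == 1

def set_odd_parity(byte: int) -> int:
    # Flip the low-order (parity) bit if the byte's parity is even.
    return byte ^ (0 if has_odd_parity(byte) else 1)

key = bytes(set_odd_parity(b) for b in b"8bytekey")
assert all(has_odd_parity(b) for b in key)
print(key.hex())   # 38627975646b6479
```

This is why a "64-bit" DES key was only ever 56 bits of secret material: the other 8 bits are pure error detection, consistent with Bellovin's account above.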
--Steve Bellovin, https://www.cs.columbia.edu/~smb
Re: [cryptography] Oddity in common bcrypt implementation
On Tue, Jun 28, 2011 at 2:09 PM, Sampo Syreeni de...@iki.fi wrote: On 2011-06-28, Marsh Ray wrote: Yes, but in most actual systems the strings are going to get handled. Is this really necessarily true, or just an artifact of how things are implemented now? Or even a simple-minded implementation. Take the case of passwords and usernames. It might make some sense to do a case-insensitive username comparison. It wouldn't hurt security much, it might help usability and interoperability, and it constitutes desirable eye-candy. But a case-insensitive password compare?!? For some reason I don't think anybody would want to go there, and that almost everybody would want the system to rather fail safe than to do anything but pass around (type-tagged) bits. I mean, would anybody really like a spell checker in their ATM? It's not *case* that you want to be insensitive to. It's *form*, specifically Unicode form. Whenever you derive keys from passwords or otherwise use them as inputs to one-way functions you need to put the password into canonical form, which for Unicode means one of NFC, NFD, NFKC, or NFKD, plus some additional mappings. This all relates to things like diacritical marks (accents, etcetera) and such things as variants of periods and whitespace (think of non-breaking space). As Paul H. points out, this is not the right forum to discuss the specifics of Unicode password preparation. So we should stop at noting the fact that ASCII passwords are always canonical while Unicode, non-ASCII passwords require additional preparation in order to be put into canonical form. (This might also be true of other character sets (or codesets), but we should really only concern ourselves with Unicode here.) Nico
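The four normal forms named above, and the point that ASCII is already canonical, can be shown in a few lines. An editorial sketch using Python's unicodedata:

```python
import unicodedata

# A string with a precomposed "é" (U+00E9) and NO-BREAK SPACEs (U+00A0):
s = "caf\u00e9\u00a0au\u00a0lait"

for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, unicodedata.normalize(form, s).encode("utf-8").hex())

# The canonical forms NFC/NFD leave the no-break space alone; the
# compatibility forms NFKC/NFKD map it to an ordinary space (U+0020),
# one of the "variants of whitespace" mentioned above.
assert "\u00a0" in unicodedata.normalize("NFC", s)
assert "\u00a0" not in unicodedata.normalize("NFKC", s)

# Pure ASCII input is a fixed point of all four forms:
assert all(unicodedata.normalize(f, "password") == "password"
           for f in ("NFC", "NFD", "NFKC", "NFKD"))
```

Which form to standardize on (and which extra mappings to apply) is exactly the decision stringprep-style profiles try to make once, centrally, rather than per-implementation.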
Re: [cryptography] Oddity in common bcrypt implementation
On 06/28/2011 02:09 PM, Sampo Syreeni wrote: "But a case-insensitive password compare?!? For some reason I don't think anybody would want to go there, and that almost everybody would want the system to rather fail safe than to do anything but pass around (type-tagged) bits. I mean, would anybody really like a spell checker in their ATM?" http://mattjezorek.com/articles/security-researchers-need-to-research "First let's quickly discuss the problem with the AOL passwords: they are case insensitive, truncated to 8 characters, and required to be at least 6. There is more detailed information on The Washington Post by @briankrebs. This is a problem and makes for weak passwords no matter how complex one wants to make them (or thinks they are). AOL is acting as an identity service to iTunes, HuffPo, Meebo, anyone who uses the AOL OpenAuth or OpenID services, and any instant messaging client. This both compounds the issue and spreads the risk around to everyone." On 06/28/2011 02:09 PM, Sampo Syreeni wrote: "Any system that dedicates forty pages of text to string comparison doesn't have those attributes. It doesn't promote security proper, but rather bloated software, difficulties with interoperability, insecure workarounds and even plain security through obscurity. As a case in point, the Unicode normalization tables have changed numerous times in the past, and they aren't even the whole story. True, after some pressure from crypto folks they finally fixed the normalization target at something like v3.2 or what-have-you. But then that too will in time lead to a whole bulk of special cases and other nastiness, which then promotes versioning difficulties, code that is too lengthy to debug properly, and diversion of resources from sound security engineering towards what I'm tempted to call politically correct software engineering." I agree. The thing is borderline unusable unless you can leverage the resources of an Apple, Adobe, IBM, or Microsoft on just the text handling.
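The cost of the AOL policy quoted above (case-insensitive, truncated to 8 characters) is easy to estimate. A back-of-the-envelope sketch; the character-set sizes are illustrative assumptions:

```python
import math

# Assume passwords drawn from the 94 printable ASCII characters.
# Case-folding collapses 26 upper-case letters onto their lower-case
# counterparts, leaving 68 distinct symbols per position.
full = 94
case_folded = 94 - 26   # 68

bits_full = 8 * math.log2(full)         # 8 chars, case-sensitive
bits_aol = 8 * math.log2(case_folded)   # 8 chars, case-folded
print(round(bits_full, 1), round(bits_aol, 1))  # 52.4 48.7
```

A few bits per password may not sound like much, but against an offline brute-force attack each lost bit halves the attacker's work, on top of the hard 8-character truncation the article describes.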
I mean, you've certainly already seen what happened in the IETF IDN WG wrt DNS phishing... If I ever saw a kluge, attempts at homograph elimination (a form of normalization) are it.

Speaking of DNS and crypto protocols, have you seen ICANN's plan to register custom gTLDs? That's right - public internet DNS names without a dot in them. Talk about violating your fundamental assumptions. How is X.509 PKI going to interact with this?

> Passwords aren't text in the normal sense. Precisely because they should be the only thing human-keyed crypto should depend on for security. As for the rest of the text... Tag it and bag it as-is. At least the original intent can then be uncovered forensically, if need be. Unlike if you go around twiddling your bits on the way.

Yes, I used to develop print spooling software and always regretted it when we deviated from this strategy.

You can only claim bytes are bytes up until the point that the customer says they have a directory server which compares usernames case-insensitively.

> If there's a security implication, you should then probably fail safe and wait for the software vendor to fix the possible interoperability bug.

Ideally. In practice, sometimes the string-matching code doesn't know which the 'safer' direction to fail is. You can't simply file a bug report and have a Microsoft, Novell, or Sun change (or maybe even document) the fundamental behavior of their directory server.

- Marsh

___ cryptography mailing list cryptography@randombit.net http://lists.randombit.net/mailman/listinfo/cryptography
Re: [cryptography] Oddity in common bcrypt implementation
On 2011-06-20, Marsh Ray wrote:
> I once looked up the Unicode algorithm for some basic case insensitive string comparison... 40 pages!

Isn't that precisely why e.g. Peter Gutmann once wrote against the canonicalization (in the Unicode context, normalization) that ISO-derived crypto protocols do, in favour of the "bytes are bytes" approach that PGP/GPG takes? If you want to do crypto, just do crypto on the bits/bytes. If you really have to, you can tag the intended format for forensic purposes and sign your intent. But don't meddle with your given bits. Canonicalization/normalization is simply too hard to do right, or even to analyse, to have much place in protocol design.

-- Sampo Syreeni, aka decoy - de...@iki.fi, http://decoy.iki.fi/front +358-50-5756111, 025E D175 ABE5 027C 9494 EEB0 E090 8BA9 0509 85C2
Re: [cryptography] Oddity in common bcrypt implementation
On Tue, Jun 21, 2011 at 03:38:39PM +1200, Peter Gutmann wrote:
> Jeffrey Walton noloa...@gmail.com writes:
>> The 'details' mentioned above is at http://www.schneier.com/blowfish-bug.txt, and here's the crux of Morgan's report: [bfinit] chokes whenever the most significant bit of key[j] is a '1'. For example, if key[j]=0x80, key[j], a signed char, is sign extended to 0xff80 before it is ORed with data

Wow. I was aware that some buggy Blowfish implementations existed, but I was not aware of what the bug was.

> When I saw Solar's post I actually wondered whether it was this bug,

Same kind of bug, yes.

> propagated through the use of that BF implementation.

No, I invented it on my own. ;-(

Anyhow, here's how I am dealing with the issue in code. Bug fix, plus a backwards compatibility feature:

http://cvsweb.openwall.com/cgi/cvsweb.cgi/Owl/packages/glibc/crypt_blowfish/crypt_blowfish.c.diff?r1=1.9;r2=1.10

8-bit test vectors added, for both modes (correct and buggy):

http://cvsweb.openwall.com/cgi/cvsweb.cgi/Owl/packages/glibc/crypt_blowfish/wrapper.c.diff?r1=1.9;r2=1.10

These are only used by "make check", which I felt was not enough - many people take just the main C file and use it in their programs. Obviously, my "make check" would not exist in their source code trees. So if those programs are ever miscompiled or otherwise broken, it might not be detected. To deal with this, I added a quick self-test on every use:

http://cvsweb.openwall.com/cgi/cvsweb.cgi/Owl/packages/glibc/crypt_blowfish/crypt_blowfish.c.diff?r1=1.10;r2=1.11

I am likely to go ahead and release this.

Alexander
Re: [cryptography] Oddity in common bcrypt implementation
On Wed, Jun 15, 2011 at 04:22:55AM +0400, Solar Designer wrote:
> 3. Order of ExpandKey()s in the costly loop: http://www.openwall.com/lists/crypt-dev/2011/04/29/1 BTW, this inconsistency is seen even in bcrypt.c in OpenBSD - source code comment vs. actual code. Then I released my bcrypt code from JtR for reuse, under the name of crypt_blowfish in 2000. Several Linux distros started using it (patched into glibc), as well as PostgreSQL's contrib/pgcrypto, the CommuniGate Pro messaging server, and some other programs. More recently, this same code got into PHP 5.3.0+. Of course, those hashes are fully compatible with OpenBSD's.

I have to retract that statement. Yesterday, I was informed of a bug in JtR, which also made its way into crypt_blowfish, and which made the hashes incompatible with OpenBSD's for passwords with non-ASCII characters (those with the 8th bit set). Yes, it was an unintended/inadvertent sign extension. What's worse, in some cases it results in other characters in those passwords being ignored. Very nasty, and embarrassing. I am surprised this could go unnoticed for 13 years. I am trying to learn some lessons from this.

More detail here:

http://www.openwall.com/lists/oss-security/2011/06/20/2
http://www.openwall.com/lists/oss-security/2011/06/20/6

Although the code fix is a one-liner, fixing this in all affected software is difficult, especially where backwards compatibility matters. I'd appreciate any suggestions.

Alexander
Re: [cryptography] Oddity in common bcrypt implementation
On Mon, Jun 20, 2011 at 12:11:38PM -0500, Marsh Ray wrote:
> On 06/20/2011 09:59 AM, Solar Designer wrote:
>> On Wed, Jun 15, 2011 at 04:22:55AM +0400, Solar Designer wrote:
>>> Yesterday, I was informed of a bug in JtR, which also made its way into crypt_blowfish, and which made the hashes incompatible with OpenBSD's for passwords with non-ASCII characters (those with the 8th bit set). Yes, it was an unintended/inadvertent sign extension. What's worse, in some cases it results in other characters in those passwords being ignored.
> This would result in false positives from JtR?

No. The matching of hashes was precise, so in order for JtR to report a cracked password it'd need to actually try the right expanded key. It resulted in most (but not all) passwords with non-ASCII chars that were hashed on OpenBSD (or on some other systems that implemented bcrypt without the bug) not being cracked, even when the correct passwords were input to JtR (such as in a wordlist).

>> Very nasty, and embarrassing.
> Looks to me like a very ordinary bug. JtR wasn't intended to be a security-critical piece of software. It doesn't need to be and it wasn't tested as such.

That was exactly my thinking when I was writing the code initially, but:

> Taking code which works well for one set of requirements and re-using it in another context with a subtly different set of requirements is a classic Software Engineering mistake. AIUI, others borrowed JtR code and used it in a security-critical context. They didn't recertify it for its new usage, i.e. test it for false positives. This is a mistake on their part.

No, I was the one to make the code available separately from JtR for such reuse. I re-reviewed it at the time, and I added a different set of tests ("make check" in crypt_blowfish), but I missed this bug and my test vectors didn't include 8-bit chars.

>> I am surprised this could go unnoticed for 13 years.
> Perhaps it says something about the frequency of chars > 127 in actual user passwords?
> I don't even know how to type them on my US keyboard and I'd be reluctant to use them in my passwords.

Yes. It also says something about the frequency of password hash migrations between different platforms (such as Linux to/from OpenBSD). If one person in 200 uses a password with 8-bit chars, and one admin in 5000 does such a migration, that's only one inconvenienced user in a million. This is unlikely to get researched and reported.

> Or maybe it also says something about the willingness of JtR users to report issues with false positives?

No, not that.

>> I am trying to learn some lessons from this.
> The best C developers might get the sign extension thing right 98% of the time. This bug is simply going to appear with some degree of frequency in everybody's code. The signedness of chars is even different between compiler platforms. It takes a ton of testing and independent review to weed out the other cases.

Yes, one lesson is that such pieces of code need more testing - maybe fuzzing with random inputs, including binary data, comparing them against other existing implementations.

Another is that unsigned integer types should be used more (by default), despite some drawbacks in some contexts (such as extra compiler warnings to silence; maybe the compilers are being stupid by warning of the wrong things).

Yet another lesson is that in crypto contexts XOR may be preferred over OR in cases where both are meant to yield the same result (such as when combining small integers into a larger one, without overlap). If anything goes wrong with either operand for whatever reason (implementation bug, miscompile, CPU bug, intermittent failure), XOR tends to preserve more entropy from the other operand. In case of this crypt_blowfish sign extension bug, its impact would be a lot less if I had used XOR in place of OR. A drawback is that XOR hides the programmer's expected lack of overlap between set bits (the code is not self-documenting in that respect anymore).
And I am reminded of a near miss with a miscompile of the Blowfish code in JtR (but luckily not in crypt_blowfish) with a certain buggy version of gcc. So I am considering adding runtime testing. (JtR has it already, but in crypt_blowfish it's only done on "make check" for performance reasons. Yet there's a way around those performance reasons while maintaining nearly full code coverage.)

Finally, better test vectors need to be produced and published. If 8-bit chars are meant to be supported, we must include them in test vectors, etc.

>> More detail here: http://www.openwall.com/lists/oss-security/2011/06/20/2 http://www.openwall.com/lists/oss-security/2011/06/20/6 Although the code fix is a one-liner, fixing this in all affected software is difficult, especially where backwards compatibility matters. I'd appreciate any suggestions.
> For places where it's used for password validation, it may need to be bug-compatible, right?

Yes, that's
Re: [cryptography] Oddity in common bcrypt implementation
On 06/20/2011 12:55 PM, Solar Designer wrote:
> Yes, one lesson is that such pieces of code need more testing. Maybe fuzzing with random inputs, including binary data, comparing them against other existing implementations.

There are certainly more bugs lurking where the complex rules of international character data collide with password hashing. How does a password login application work from a UTF-8 terminal (or web page) when the host is using a single-byte code page?

I once looked up the Unicode algorithm for some basic case insensitive string comparison... 40 pages!

> Another is that unsigned integer types should be used more (by default),

I know I use them whenever possible. Even into the 18th century Europeans were deeply suspicious of negative numbers. http://en.wikipedia.org/wiki/Negative_number#History There may be arguments for consistently using signed ints too, but bit manipulation isn't one of them. And why stop with negative numbers - why not include the complex plane, div0, and infinities too? Sorry, I'm getting silly.

> despite some drawbacks in some contexts (such as extra compiler warnings to silence; maybe the compilers are being stupid by warning of the wrong things).

Yeah, IMHO

unsigned char data[] = { 0xFF, 0xFF, 0xFF, 0xFF };

should always compile without warnings.

> Yet another lesson is that in crypto contexts XOR may be preferred over OR in cases where both are meant to yield the same result (such as when combining small integers into a larger one, without overlap). If anything goes wrong with either operand for whatever reason (implementation bug, miscompile, CPU bug, intermittent failure), XOR tends to preserve more entropy from the other operand. In case of this crypt_blowfish sign extension bug, its impact would be a lot less if I used XOR in place of OR.

Well, XOR has the property that setting a bit an even number of times turns it off. This is obviously not what you want when combining flags, for example.
I suspect there are as many mistakes to be made with XOR as there are with OR. It's very hard to predict the ways in which bitwise expressions will be buggy.

> A drawback is that XOR hides the programmer's expected lack of overlap between set bits (the code is not self-documenting in that respect anymore).

It would make sense for C to have more bit manipulation operators. Some processors have instructions for bit replacement, counting bits, finding the lowest '1', etc.

> And I am reminded of a near miss with miscompile of the Blowfish code in JtR, but luckily not in crypt_blowfish, with a certain buggy version of gcc. So I am considering adding runtime testing. (JtR has it already, but in crypt_blowfish it's only done on make check for performance reasons. Yet there's a way around those performance reasons while maintaining nearly full code coverage.)

Seems like "make check" is a good place for it.

> Finally, better test vectors need to be produced and published. If 8-bit chars are meant to be supported, must include them in test vectors, etc.

Yes, I think this is a big criticism of the bcrypt algorithm: it's just not documented precisely enough for standardization.

> It is easy to continue supporting the bug as an option. It is tricky to add code to exploit the bug - there are too many special cases. Might not be worth it considering how uncommon such passwords are and how slow the hashes are to compute.

Years ago I worked at a place that insisted our passwords be all upper case. "Because that's the last thing cracking programs typically search for" was their rationale. I didn't have the heart to tell them about LM. It sounds obvious now that I hear myself typing it, but generalizations about the frequency might not apply in any specific case. Some admin somewhere has a password rule that enforces a near worst-case on their users.
http://extendedsubset.com/?p=18

> It would be curious to estimate the actual real-world impact of the bug, though, given some large bcrypt hash databases and a lot of CPU time.

Haha, more seem to be made available all the time.

> Yes, this is similar to what I proposed on oss-security - using the $2x$ prefix to request the buggy behavior.

Somebody needs to start keeping a master list of these prefixes. This is the kind of thing that IETF/IANA can be good at (but it can take a long time).

>> What would be helpful for the downstream vendors is any expert guidance you could give on the severity to inform the policy decisions. E.g., bug-compatible mode reduces cracking time by X% in cases where
> I included some info on that in my first oss-security posting on this, and referenced my postings to john-dev with more detail. This estimate is very complicated, though. It is complicated even for the case when there's just one 8-bit character, and I don't dare to make it for other cases (lots of if's).

Perhaps you could do an exhaustive search up to a certain length and look at
Re: [cryptography] Oddity in common bcrypt implementation
On Mon, Jun 20, 2011 at 2:09 PM, Marsh Ray ma...@extendedsubset.com wrote:
> There are certainly more bugs lurking where the complex rules of international character data collide with password hashing. How does a password login application work from a UTF-8 terminal (or web page) when the host is using a single-byte code page?

It wouldn't. If you want to use non-ASCII passwords you need a consistent choice of encoding (and to be able to use some input method that allows you to input your non-ASCII password correctly and consistently). Even then you also need a stringprep (see RFC 3454) and a stringprep profile suitable for passwords (see SASLprep, RFC 4013), because you need to have consistent normalization (which would otherwise be a function of the selected input method), for example. (Well, only if you want to derive keys from the password; if you're just sending the password, then the other side can do whatever stringprepping or normalization-insensitive comparisons it likes.)

> I once looked up the Unicode algorithm for some basic case insensitive string comparison... 40 pages!

Case-insensitivity is full of special cases... The good news is that you do want to preserve case in the case of passwords. It's normalization that kills you. (There are also other things to worry about, such as what to do about unassigned codepoints.) Anyways, this is a bit far afield.

> Another is that unsigned integer types should be used more (by default),

Certainly for char! Is there really any reason to ever use signed chars?

Nico
--
Re: [cryptography] Oddity in common bcrypt implementation
On 2011-06-21 3:11 AM, Marsh Ray wrote:
> The best C developers might get the sign extension thing right 98% of the time.

Unless it really is human-readable text, cast it to BYTE. If it really is human-readable text, use a string library, preferably a sixteen-bit Unicode library.
Re: [cryptography] Oddity in common bcrypt implementation
Jeffrey Walton noloa...@gmail.com writes:
> The 'details' mentioned above is at http://www.schneier.com/blowfish-bug.txt, and here's the crux of Morgan's report: [bfinit] chokes whenever the most significant bit of key[j] is a '1'. For example, if key[j]=0x80, key[j], a signed char, is sign extended to 0xff80 before it is ORed with data

When I saw Solar's post I actually wondered whether it was this bug, propagated through the use of that BF implementation.

Peter.
Re: [cryptography] Oddity in common bcrypt implementation
On Tue, Jun 14, 2011 at 04:52:30PM -0500, Marsh Ray wrote:
> The first 7 chars $2a$05$ are a configuration string. The subsequent 53 characters (in theory) contain a 128-bit salt and a 192-bit hash value. But 53 is an odd length (literally!) for a base64 string, as base64 uses four characters to encode three bytes. I don't see an official reference for the format of bcrypt hashes. There's the Usenix 99 paper, which is a great read in many ways, but it's not a rigorous implementation spec.

I discovered this a while back when I wrote a bcrypt implementation. Unfortunately the only real specification seems to be 'what the OpenBSD implementation does'. And the OpenBSD implementation also does this truncation, which you can see in ftp://ftp.fr.openbsd.org/pub/OpenBSD/src/lib/libc/crypt/bcrypt.c with

encode_base64((u_int8_t *) encrypted + strlen(encrypted), ciphertext, 4 * BCRYPT_BLOCKS - 1);

Niels Provos is probably the only reliable source as to why this truncation was done, though I assume it was some attempt to minimize padding bits or reduce the hash size.

-Jack
Re: [cryptography] Oddity in common bcrypt implementation
Also a discussion on this going on at http://news.ycombinator.com/item?id=2654586

On 06/14/2011 05:50 PM, Jack Lloyd wrote:
> I discovered this a while back when I wrote a bcrypt implementation. Unfortunately the only real specification seems to be 'what the OpenBSD implementation does'.

That is something of a drawback to bcrypt.

> And the OpenBSD implementation also does this truncation, which you can see in ftp://ftp.fr.openbsd.org/pub/OpenBSD/src/lib/libc/crypt/bcrypt.c with encode_base64((u_int8_t *) encrypted + strlen(encrypted), ciphertext, 4 * BCRYPT_BLOCKS - 1); Niels Provos is probably the only reliable source as to why this truncation was done though I assume it was some attempt to minimize padding bits or reduce the hash size.

That's a pretty weird design decision: use this massive 128-bit salt, but then chop bits off the actual hash value to adjust the length. The 128-bit salt wastes 4 bits in the base64 encoding (22 chars * 6 bits per char = 132 bits of capacity). The truncation keeps 23 of the 24 output bytes, discarding 8 of the 192 hash bits, and the resulting 31-character encoding still wastes 2 bits (31 * 6 = 186 bits of capacity vs. 23 * 8 = 184 bits stored).

If they'd used only a 126-bit salt, they could base64-encode it in 21 chars with no wasted space. That would allow them to store the full 192 bits in 32 chars, also with no wasted space. So they threw away 8 hash output bits in order to save 2 salt bits.

- Marsh
Re: [cryptography] Oddity in common bcrypt implementation
On Tue, Jun 14, 2011 at 06:50:18PM -0400, Jack Lloyd wrote:
> encode_base64((u_int8_t *) encrypted + strlen(encrypted), ciphertext, 4 * BCRYPT_BLOCKS - 1);

Here's the commit by Niels that fixes the bug in encode_base64() and replaces it with the explicit "- 1" above:

http://www.openbsd.org/cgi-bin/cvsweb/src/lib/libc/crypt/bcrypt.c.diff?r1=1.11;r2=1.12

That was in 1998. The commit message, not surprisingly, says: "fix base64 encoding, this problem was reported by Solar Designer so...@false.com some time ago."

So it was indeed a deliberate decision not to break compatibility, which made sense to me. Who cares if it's 192 or 184 bits, as long as we all agree on a specific number. Using base-64 more optimally could be nice, but not enough of a reason to break compatibility either.

And, by the way, it's not base64, but just a base 64 encoding. It can produce any number of characters, not just multiples of 4. By the way, it is also subtly incompatible with the base 64 encoding used by other common crypt(3) implementations... which is not base64 either. In this posting, I am using "base64" (without the dash) to refer to base64 as in MIME, and "base-64" (with the dash) to refer to other base 64 encodings (which vary).

Alexander