Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Glad we got to the bottom of that. That's quite a nasty compiler/language bug, I must say. Not even a warning.

Still, Python crashes when trying to print the name of a null character. It wouldn't surprise me if there are other weird issues lurking. I would definitely sleep better with a more restricted character set.
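One plausible explanation for the Python failure mentioned above, sketched under the assumption of Python 3 and its standard unicodedata module: unicodedata.name() raises ValueError for code points that have no name, which includes control characters such as U+0000, unless a default is supplied. The helper name safe_name and the placeholder text are illustrative only and do not come from the thread.

```python
import unicodedata


def safe_name(ch, placeholder="<no name>"):
    # unicodedata.name() raises ValueError for unnamed code points
    # (control characters such as U+0000) unless a default is given.
    return unicodedata.name(ch, placeholder)


# The passphrase from BIP 38 test vector 3, including the embedded NULL.
passphrase = "\u03D2\u0301\u0000\U00010400\U0001F4A9"

for ch in passphrase:
    print("U+%04X %s" % (ord(ch), safe_name(ch)))
```

Without the default argument, the loop would stop with an exception at the third character, which matches the symptom of not being able to print names past the embedded null.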
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Here is a good article that helped me understand what's going wrong:

http://www.oracle.com/technetwork/articles/javase/supplementary-142654.html

Basically, Java is stuck at 16 bits per char for legacy reasons. They admit that for a new language, they would probably use 32 (or 24?) bits per char. \u literals express UTF-16 code units, so you have to use 16 bits. I learned that for codepoint 0x010400, I could write \uD801\uDC00, which is the UTF-16 encoding of that codepoint.

Other languages have literals for codepoints. E.g. Python can use u"\U00010400" and HTML has &#x10400;. Unfortunately, Java is missing such a construct (at least in Java 6).
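The \uD801\uDC00 pair above follows from the standard UTF-16 surrogate arithmetic. A minimal sketch in Python 3 (the function name to_surrogate_pair is made up for illustration):

```python
def to_surrogate_pair(code_point):
    # Only code points above the BMP are encoded as surrogate pairs.
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the offset
    return high, low


high, low = to_surrogate_pair(0x10400)
print("\\u%04X\\u%04X" % (high, low))  # prints \uD801\uDC00
```

The same two code units are what a Java string literal has to spell out by hand, since a single \u escape only covers four hex digits.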
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
I'm all for fixing bugs, but I know from bitter experience that outside the BMP dragons lurk. Browsers don't even expose Unicode APIs at all. You end up needing to ship an entire pure-JS implementation, which can be too large for some use cases (I sank too much time into that issue in my last job).

I'm hoping BIP 38 doesn't get widely used anyway, to be frank. People moving private keys around by hand has caused quite a few problems in the past, and sometimes people lost money. It's better to work at the level of a wallet and ideally ask people to move money using regular transactions. Way less potential for errors.

Regardless, I'll file a JVM bug and see what the outcome is.

On Wed, Jul 16, 2014 at 12:23 AM, Aaron Voisine vois...@gmail.com wrote:

If the user creates a password on an iOS device with an astral character and then can't enter that password on a JVM wallet, that sucks. If JVMs really can't support Unicode NFC, then that's a strong case to limit the spec to the subset of Unicode that all popular platforms can support, but it sounds like it might just be a JVM string library bug that could hopefully be reported and fixed.

I get the same result as in the test case using Apple's CFStringNormalize(passphrase, kCFStringNormalizationFormC);

Aaron Voisine
breadwallet.com

On Tue, Jul 15, 2014 at 11:20 AM, Mike Hearn m...@plan99.net wrote:

Yes, we know, Andreas' code is indeed doing normalisation. However, it appears the output bytes end up being different. What I get back is:

cf930001303430300166346139

vs

cf9300f0909080f09f92a9

from the spec. I'm not sure why. It appears this is due to the character from the astral planes. Java is old and uses 16-bit characters internally - it wouldn't surprise me if there's some weirdness that means it doesn't/won't support this kind of thing.

I recommend instead that any implementation that wishes to be compatible with JVM-based wallets (I suspect Android is the same) just refuse any passphrase that includes characters outside the BMP. At least unless someone can find a fix. I somehow doubt this will really hurt anyone.

___
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development
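For illustration, the "refuse anything outside the BMP" idea quoted above could look roughly like this. This is a sketch only, assuming Python 3 (where iterating a str yields whole code points); the function name is hypothetical and nothing like it is part of BIP 38.

```python
def contains_non_bmp(passphrase):
    # The Basic Multilingual Plane ends at U+FFFF; anything above it
    # needs a surrogate pair in UTF-16 based runtimes such as the JVM.
    return any(ord(ch) > 0xFFFF for ch in passphrase)


print(contains_non_bmp("Z\u00fcrich"))          # False
print(contains_non_bmp("\u03D2\u0301\U0001F4A9"))  # True
```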
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Guys, you are always talking about the Unicode astral plane, but in fact it's a plain old (ASCII) control character where this problem starts and likely ends: \u0000.

Let's ban/filter ISO control characters and be done with it. Most control characters will never be enterable by any keyboard into a password field.

Of course, I assume that Character.isISOControl() works consistently across platforms.

http://docs.oracle.com/javase/7/docs/api/java/lang/Character.html#isISOControl%28char%29
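For readers not on the JVM, here is a rough Python 3 equivalent of the filter proposed above. Java documents Character.isISOControl() as true for U+0000 through U+001F and U+007F through U+009F; the two function names below are illustrative and the code is a sketch, not the bitcoinj implementation.

```python
def is_iso_control(ch):
    # Same ranges as Java's Character.isISOControl():
    # U+0000..U+001F and U+007F..U+009F.
    cp = ord(ch)
    return cp <= 0x1F or 0x7F <= cp <= 0x9F


def strip_iso_controls(passphrase):
    # Drop every ISO control character, keep everything else as-is.
    return "".join(ch for ch in passphrase if not is_iso_control(ch))


raw = "\u03D2\u0301\u0000\U00010400\U0001F4A9"
print(len(raw), len(strip_iso_controls(raw)))  # 5 4
```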
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Yes, sorry, you're right, the issue starts with the null code point. Python seems to have problems starting there too. It might work if we took that out.
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
I will change the bitcoinj implementation and propose a new test vector.
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Damn, I just realized that I implement only the decoding side of BIP38. So I cannot propose a complete test vector. Here is what I have:

Passphrase: ϓ␀𐐀💩 (\u03D2\u0301\u0000\U00010400\U0001F4A9; GREEK UPSILON WITH HOOK, COMBINING ACUTE ACCENT, NULL, DESERET CAPITAL LETTER LONG I, PILE OF POO)

Passphrase bytes after removing ISO control characters and NFC normalization: 0xcf933034303066346139

Bitcoin Address: 16ktGzmfrurhbhi6JGqsMWf7TyqK9HNAeF

Unencrypted private key (WIF): 5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4

Can someone calculate the encrypted key from it (using whatever implementation), and I will verify it decodes properly in bitcoinj?
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Forbidding control characters, at least anything < 32, makes a lot of sense to me. Carriage returns, linefeeds, formfeeds, null characters: I see no valid reason to allow them and lots of reasons they could cause havoc.

PILE OF POO or GRINNING CAT FACE WITH SMILING EYES should be allowed in this day and age though.

Wladimir
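A sketch of what "forbid, don't filter" might look like in practice, assuming Python 3. The function name validate_passphrase and the error message are invented for this example; the thread does not settle on any particular API or spec wording.

```python
def validate_passphrase(passphrase):
    # Reject (rather than silently strip) anything below U+0020,
    # i.e. NUL, CR, LF, FF and the other C0 control characters.
    for index, ch in enumerate(passphrase):
        if ord(ch) < 0x20:
            raise ValueError(
                "control character U+%04X at position %d" % (ord(ch), index)
            )
    return passphrase


validate_passphrase("\U0001F4A9 is fine")   # returns the passphrase unchanged
try:
    validate_passphrase("bad\u0000pass")
except ValueError as err:
    print(err)  # control character U+0000 at position 3
```

Rejecting keeps the user from believing that an unenterable character added entropy, which silent filtering would hide.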
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Please excuse me. I had a more thorough look at the original problem and found that the only problem with the original test case was that you cannot specify codepoints from the SMP using \u in Java. I always tried \u010400, but that doesn't work.

Here is a fix for bitcoinj. The test now passes.

https://github.com/bitcoinj/bitcoinj/pull/143

We can (and probably should) still filter control chars; I'll have another look at that now.

On 07/16/2014 11:06 PM, Aaron Voisine wrote:

If I first remove \u0000, so the non-normalized passphrase is \u03D2\u0301\U00010400\U0001F4A9, and then NFC normalize it, it becomes \u03D3\U00010400\U0001F4A9.

UTF-8 encoded this is: 0xcf93f0909080f09f92a9 (not the same as what you got, Andreas!)

Encoding private key: 5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4 with this passphrase, I get a BIP38 key of: 6PRW5o9FMb4hAYRQPmgcvVDTyDtr6R17VMXGLmvKjKVpGkYhBJ4uYuR9wZ

I recommend that, rather than simply removing control characters from the password, the spec instead require that passwords containing control characters are invalid. We don't want people trying to be clever and putting them in, thinking they are adding to the password entropy.

Also, for UI compatibility across many platforms, I'm in favor of disallowing any character below U+0020 (space).

I can submit a PR once we figure out why Andreas's passphrase was different from what I got.

Aaron Voisine
breadwallet.com
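The two byte strings discussed in this thread (with and without the embedded NULL) can be reproduced with Python 3's standard unicodedata module. This is only a cross-check of the normalization and UTF-8 steps; it does not perform any BIP 38 encryption, and the helper name normalized_hex is made up for the example.

```python
import unicodedata

# Passphrase from BIP 38 test vector 3, written with Python escapes.
raw = "\u03D2\u0301\u0000\U00010400\U0001F4A9"


def normalized_hex(passphrase):
    # NFC-normalize, then UTF-8 encode, then show the bytes as hex.
    return unicodedata.normalize("NFC", passphrase).encode("utf-8").hex()


with_null = normalized_hex(raw)
without_null = normalized_hex(raw.replace("\u0000", ""))

print(with_null)     # cf9300f0909080f09f92a9
print(without_null)  # cf93f0909080f09f92a9
```

U+03D2 followed by U+0301 composes to U+03D3 (bytes cf 93); the two supplementary characters are unaffected by NFC and encode to four bytes each.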
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
I think generally control characters (such as \u0000) should be disallowed in passphrases. (Even the use of whitespace is very questionable.) I'm ok with allowing pile-of-poos. On mobile phones there are keyboards containing only emoticons -- why not allow those? Assuming NFC works of course.

On 07/15/2014 03:07 PM, Eric Winer wrote:
I don't know for sure if the test vector is correct NFC form. But for what it's worth, the Pile of Poo character is pretty easily accessible on the iPhone and Android keyboards, and in this string it's already in NFC form (f09f92a9 in the test result). I've certainly seen it in usernames around the internet, and wouldn't be surprised to see it in passphrases entered on smartphones, especially if the author of a BIP38-compatible app includes a (possibly ill-advised) suggestion to have your passphrase include special characters.

I haven't seen the NULL character on any smartphone keyboards, though - I assume the iOS and Android developers had the foresight to know how much havoc that would wreak on systems assuming null-terminated strings. It seems unlikely that NULL would be in a real-world passphrase entered by a sane user.

On Tue, Jul 15, 2014 at 8:03 AM, Mike Hearn m...@plan99.net wrote:
[+cc aaron]

We recently added an implementation of BIP 38 (password protected private keys) to bitcoinj. It came to my attention that the third test vector may be broken. It gives a hex version of what the NFC normalised version of the input string should be, but this does not match the results of the Java unicode normaliser, and in fact I can't even get Python to print the names of the characters past the embedded null. I'm curious where this normalised version came from.

Given that pile of poo is not a character I think any sane user would put into a passphrase, I question the value of this test vector. NFC form is intended to collapse things like umlaut control characters onto their prior code point, but here we're feeding the algorithm what is basically garbage so I'm not totally surprised that different implementations appear to disagree on the outcome.

Proposed action: we remove this test vector as it does not represent any real world usage of the spec, or if we desperately need to verify NFC normalisation I suggest using a different, more realistic test string, like Zürich, or something written in Thai.

Test 3:
* Passphrase ϓ␀𐐀💩 (\u03D2\u0301\u0000\U00010400\U0001F4A9; GREEK UPSILON WITH HOOK http://codepoints.net/U+03D2, COMBINING ACUTE ACCENT http://codepoints.net/U+0301, NULL http://codepoints.net/U+0000, DESERET CAPITAL LETTER LONG I http://codepoints.net/U+10400, PILE OF POO http://codepoints.net/U+1F4A9)
* Encrypted key: 6PRW5o9FLp4gJDDVqJQKJFTpMvdsSGJxMYHtHaQBF3ooa8mwD69bapcDQn
* Bitcoin Address: 16ktGzmfrurhbhi6JGqsMWf7TyqK9HNAeF
* Unencrypted private key (WIF): 5Jajm8eQ22H3pGWLEVCXyvND8dQZhiQhoLJNKjYXk9roUFTMSZ4
* Note: The non-standard UTF-8 characters in this passphrase should be NFC normalized to result in a passphrase of 0xcf9300f0909080f09f92a9 before further processing

--
Bitcoin-development mailing list
Bitcoin-development@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bitcoin-development
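The trouble with printing character names past the embedded null is consistent with how Python's `unicodedata.name()` behaves: it raises `ValueError` for code points that have no name, and control characters such as U+0000 have none. A small sketch, assuming Python 3; `describe` and the `<no name>` placeholder are illustrative choices, not part of any spec:

```python
import unicodedata

passphrase = "\u03D2\u0301\u0000\U00010400\U0001F4A9"

def describe(text: str) -> list:
    """Return 'U+XXXX NAME' per code point, with a placeholder for unnamed ones."""
    return [
        "U+%04X %s" % (ord(ch), unicodedata.name(ch, "<no name>"))
        for ch in text
    ]

for line in describe(passphrase):
    print(line)
```

Passing a default as the second argument avoids the exception, so the listing continues past the NULL instead of stopping there.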
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
I have a python implementation that seems to pass this test vector: https://github.com/wozz/electrum/blob/bip38_import/lib/bip38.py#L299
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Unicode guarantees that null-terminated strings still work. U+0000 terminates a unicode (or C) string. strlen() gets the string byte count. mbstowcs() gets the character count.

Whitespace can be problematic, but should be allowed. Control characters should be filtered. Emoticons probably cannot be filtered without substandard approaches such as character blacklists, a road you do not want to travel.

(all this is simply standard practice)

--
Jeff Garzik
Bitcoin core developer and open source evangelist
BitPay, Inc. https://bitpay.com/
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Unicode guarantees that null-terminated strings still work.

UTF-8 guarantees that. Other encodings do not; you can have null bytes in UTF-16 strings, for example. Indeed most languages that use pascal-style encodings internally allow null characters in strings, it's just not a good idea to exploit that fact ...
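The difference between the two encodings is easy to see with a short Python sketch (Python 3 assumed; the sample word is borrowed from elsewhere in this thread):

```python
text = "Zürich"

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")

# UTF-8 never emits a zero byte for a non-NULL code point.
print(b"\x00" in utf8)    # False
# UTF-16 pads every Latin-1 range character with a zero byte.
print(b"\x00" in utf16)   # True

# U+0000 itself is the one code point that UTF-8 encodes as a zero byte.
print("\u0000".encode("utf-8"))  # b'\x00'
```

So a C-style consumer of UTF-8 only truncates when the passphrase really contains U+0000, whereas UTF-16 data cannot be handled as a byte-wise null-terminated string at all.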
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Can you provide the rationale for standard practice? For example, why should whitespace be allowed? I regularly use trim() on any passphrase (or other input, for that matter).

So what's the action point? Should we amend the spec to filter control characters? That would get rid of the \u0000 problem.
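As a rough illustration of what a control-character rule could look like, here is a Python sketch. It assumes the rule is "reject anything in Unicode general category Cc", which covers U+0000 to U+001F and U+007F to U+009F, the same ranges documented for Java's Character.isISOControl(). Whether the spec should reject or silently filter was still undecided in this thread; rejecting is just the choice made for this sketch, and both function names are hypothetical.

```python
import unicodedata

def has_control_chars(passphrase: str) -> bool:
    """True if any code point is in Unicode category Cc (C0/C1 controls)."""
    return any(unicodedata.category(ch) == "Cc" for ch in passphrase)

def check_passphrase(passphrase: str) -> str:
    """Reject control characters instead of stripping them, then NFC-normalize."""
    if has_control_chars(passphrase):
        raise ValueError("passphrase contains control characters")
    return unicodedata.normalize("NFC", passphrase)
```

With this rule the Test 3 passphrase is refused because of its embedded NULL, while a space (category Zs) is still accepted.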
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
On whitespace: Security UX testing I've seen shows it is mentally easier for some users to memorize and use longer passphrases if they are permitted spaces. I've not seen anything written on the use of tabs/NLs/FFs in passphrases.

I can see the logic of some systems that convert \s+ into ' ' for purposes of password hashing, even though that might frustrate a security nerd or two.

http://security.stackexchange.com/questions/32691/why-not-allow-spaces-in-a-password

I do think control characters should be filtered.
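The \s+ to ' ' conversion mentioned above can be sketched in a couple of lines of Python. This is only an illustration of that idea, not something BIP 38 specifies, and `collapse_whitespace` is a made-up name:

```python
import re

def collapse_whitespace(passphrase: str) -> str:
    """Replace every run of whitespace with a single ASCII space."""
    return re.sub(r"\s+", " ", passphrase)

print(repr(collapse_whitespace("correct  horse\tbattery\n staple")))
# 'correct horse battery staple'
```

Note that this does not trim: leading or trailing whitespace collapses to one space rather than disappearing, so a trim() step would be a separate decision.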
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
I was part of adding in that test vector, and I think it's a good test vector since it is an extreme edge-case of the current definition: If the BIP38 proposal allows any password that can be in UTF-8, NFC normalized form, those characters cover the various edge cases (combining characters, null character, astral range) that if your implementation doesn't handle, then it can't really be said to be BIP38-compatible/compliant, right? The passphrase in the test vector is NOT in NFC form; that's the point. Whatever implementation gets designed has to assume the input is not already NFC-normalized and needs to handle/sanitize that input before further processing. To test your implementation for compliance, you should not be inputting the NFC-normalized bytestring as the password input, you should be entering the original passphrase as the test. My original pull request for this change (https://github.com/bitcoin/bips/pull/29) shows a Python and a NodeJS way to input that test vector password as intended. Some input devices may already handle the input as NFC, which is great, but per the BIP38 proposal, that shouldn't be assumed, so various implementations are cross-compatible. If one implementation assumes the input is already NFC, they may encode/decode the password incorrectly, and lock a user out of their wallet. Android allows different user keyboards to be used, so I'm guessing there's one somewhere that allows manual entry of unicode codepoints that could be used to enter a null character, and with the next version of iOS, Apple devices will also get custom keyboard options, too, so even if the default Apple keyboard does NFC-form properly, other developers' keyboards may not. So while it is an extreme edge case, that is not very likely to be used as a real password by any user, that's what test vectors are for: to test for the edge case that you might not have expected and handled in your implementation. 
Brooks
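To make concrete why an implementation must not assume its input is already NFC, this Python sketch (Python 3, standard `unicodedata` module) compares the UTF-8 bytes of the Test 3 passphrase as entered with the bytes after normalization:

```python
import unicodedata

raw = "\u03D2\u0301\u0000\U00010400\U0001F4A9"
normalized = unicodedata.normalize("NFC", raw)

print(raw.encode("utf-8").hex())         # cf92cc8100f0909080f09f92a9
print(normalized.encode("utf-8").hex())  # cf9300f0909080f09f92a9
print(len(raw), len(normalized))         # 5 4
```

The combining sequence U+03D2 U+0301 collapses into the single code point U+03D3, so the byte strings differ. An implementation that hashes the raw bytes would derive a different key from one that normalizes first.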
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
Yes, we know, Andreas' code is indeed doing normalisation. However it appears the output bytes end up being different. What I get back is: cf9300*01*303430300166346139 vs cf9300*f0*909080f09f92a9 from the spec. I'm not sure why. It appears this is due to the character from the astral planes. Java is old and uses 16 bit characters internally - it wouldn't surprise me if there's some weirdness that means it doesn't/won't support this kind of thing. I recommend instead that any implementation that wishes to be compatible with JVM based wallets (I suspect Android is the same) just refuse any passphrase that includes characters outside the BMP. At least unless someone can find a fix. I somehow doubt this will really hurt anyone.
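One way to arrive at exactly these bytes is sketched below in Python. It rests on an assumption about how the test string was typed: that the two astral code points were written with four-digit \u escapes, so that "\u00010400" is read as U+0001 followed by the literal text "0400". Python's four-digit \u escape behaves the same way as Java's in this respect. The names `mistyped`, `intended` and `nfc_hex` are illustrative only.

```python
import unicodedata

# Assumption: astral code points written with 4-digit \u escapes, so each one
# is parsed as U+0001 followed by four literal ASCII characters.
mistyped = "\u03D2\u0301\u0000\u00010400\u0001f4a9"
# The passphrase as intended, using 8-digit \U escapes for the astral characters.
intended = "\u03D2\u0301\u0000\U00010400\U0001F4A9"

def nfc_hex(text: str) -> str:
    """NFC-normalize and return the UTF-8 bytes as hex."""
    return unicodedata.normalize("NFC", text).encode("utf-8").hex()

print(nfc_hex(mistyped))  # cf930001303430300166346139
print(nfc_hex(intended))  # cf9300f0909080f09f92a9
```

The first value matches the bytes reported above and the second matches the spec, which points at the string literal rather than at the normalizer.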
Re: [Bitcoin-development] BIP 38 NFC normalisation issue
If the user creates a password on an iOS device with an astral character and then can't enter that password on a JVM wallet, that sucks. If JVMs really can't support unicode NFC then that's a strong case to limit the spec to the subset of unicode that all popular platforms can support, but it sounds like it might just be a JVM string library bug that could hopefully be reported and fixed.

I get the same result as in the test case using Apple's CFStringNormalize(passphrase, kCFStringNormalizationFormC);

Aaron Voisine
breadwallet.com
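For context on the 16-bit concern, the Python sketch below shows which characters of the normalized test passphrase lie outside the BMP and how UTF-16 represents them. `outside_bmp` is a hypothetical helper, and the check is only what a "BMP only" restriction might look like, not something the spec requires.

```python
def outside_bmp(passphrase: str) -> list:
    """Return the code points above U+FFFF (the 'astral' planes)."""
    return [ch for ch in passphrase if ord(ch) > 0xFFFF]

passphrase = "\u03D3\u0000\U00010400\U0001F4A9"
print(["U+%04X" % ord(ch) for ch in outside_bmp(passphrase)])
# ['U+10400', 'U+1F4A9']

# In UTF-16 each of these becomes a surrogate pair of two 16-bit units.
print("\U00010400".encode("utf-16-be").hex())  # d801dc00
print("\U0001F4A9".encode("utf-16-be").hex())  # d83ddca9
```

A string type built on 16-bit units can therefore still hold these characters, as two units each, which fits the suspicion that the mismatch is not a fundamental limitation of UTF-16 based platforms.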
Re: [Bitcoin-development] BIP 38
On Fri, Oct 25, 2013 at 11:50 AM, Mike Caldwell mcaldw...@swipeclock.com wrote:
I have noticed that there was a recent change to BIP 0038 (Password-Protected Private Key) on the Wiki, which is a proposal I wrote in late 2012. Gregory, it looks to me as though you have made this change, and I’m hoping for your help here. The change suggests that the number was never assigned, and that there has been no discussion regarding the proposal on this list.

Greetings,

(repeating from our discussion on IRC)

No prior messages about your proposal have made it to the list, and no mention of the assignment had been made in the wiki. The first I ever heard of this scheme was long after you'd written the document, when I attempted to assign the number to something else and then noticed something existed at that name.

Since you had previously created BIP documents without public discussion (e.g. BIP 22 https://en.bitcoin.it/wiki/OP_CHECKSIGEX_DRAFT_BIP [...] Or, I wonder did your emails just get eaten that time too?), I'd just assumed something similar had happened here. I didn't take any action at the time I first noticed it, but after someone complained to me today about bitcoin-qt not conforming to BIP38, it was clear to me that people were confusing this with something that was officially (as much as anything is) supported, so I moved the document out. (I've since moved it back, having heard from you that you thought that it had actually been assigned/announced).

With respect to moving it forward: Having a wallet which can only hold a single address is poor form. Jean-Paul Kogelman has a draft proposal which is based on your BIP38 work though the encoding scheme is different, having been revised in response to public discussion. Perhaps efforts here can be combined?
Re: [Bitcoin-development] BIP 38
Gregory, No problem, thanks for providing the IRC recap, and glad I've finally made radio contact with the list. Perhaps there can be some long overdue discussion on the topic. I see Kogelman's improvements to my proposal as being of merit, and they may very well be sufficient to supersede what I originally proposed. I suppose the main thing I want to ensure is that the identity of my original proposal is maintained. Regardless of whether a paper wallet or physical bitcoin with a single address is poor form, or whether my proposal is rejected or superseded, I hope there can be a consensus that BIP38 can continue to be understood to mean the "Password-protected private key" proposal by Mike Caldwell, and that it can appear in the lists of BIPs alongside others. Regarding BIP 22... I in fact did not originally attempt to post to the list about what I had created and called BIP 22 once upon a time; I literally just created a wiki entry, contrary to advice in BIP 1 that I had not read at the time. I recognize it's totally legitimate to feel, and act upon, the appearance that BIP 38 was created in a similar shortcut fashion. Certainly, the next thing I propose will be in the form of a draft outside the BIP numberspace, and I won't solicit a BIP number without an established consensus in the future. That said, I'm asking for BIP 38 to stand and be recognized as in existence, so as to not confuse those who call it by that name and who have already chosen to do something with it (whether that's to implement it, or to draft improvements to it like Kogelman). If I did BIP 38 over again, there are a couple of shortcomings of my own that I wouldn't mind seeing addressed in another iteration, and the right venue for that may very well be to contribute to Kogelman's work.
My particular improvements might include the ability to outsource the computationally expensive step to another service at minimal risk to the user, potentially the ability to have special-purpose encrypted minikeys (sort of how ARM has Thumb for places where the tradeoff makes sense), and a typo check with better privacy (I currently use sha256(address)[0...3], which may unintentionally reveal the bitcoin address, if it's funded, to someone who has the encrypted key but doesn't know the password).

mike

-----Original Message-----
From: Gregory Maxwell [mailto:gmaxw...@gmail.com]
Sent: Friday, October 25, 2013 2:05 PM
To: Mike Caldwell
Cc: bitcoin-development@lists.sourceforge.net
Subject: Re: [Bitcoin-development] BIP 38
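The privacy concern with the typo check mentioned above can be sketched roughly as follows. This is a minimal illustration, not the reference implementation: the function names are hypothetical, the candidate list is made up for the example, and the check value follows the expression as written in the email (a single SHA-256 over the ASCII address, first four bytes). The published BIP 38 text applies SHA-256 twice, which does not change the leak shown here.

```python
import hashlib


def address_check(address: str) -> bytes:
    """First four bytes of SHA-256 over the ASCII address (form used in the email)."""
    return hashlib.sha256(address.encode("ascii")).digest()[:4]


def matching_addresses(stored_check: bytes, candidates):
    """Return every candidate address whose check value equals the stored one."""
    return [a for a in candidates if address_check(a) == stored_check]


# Hypothetical scenario: an observer holds an encrypted key, and therefore its
# 4-byte check value, but not the password. Funded addresses are public, so the
# observer can test each one against the check value.
funded = [
    "1A1zP1eP5QGefi2DMPTfTL5SLmv7DivfNa",
    "1BvBMSEYstWetqTFn5Au4m4GFg7xJaNVN2",
]
stored = address_check(funded[1])
print(matching_addresses(stored, funded))
```

Because the check value is computed from the address alone, no password is needed to run this comparison. With only 32 bits of check value, a scan over a very large address set would also produce occasional false matches.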