On Wednesday, January 11, 2017 12:49:00 PM CET Tim Ruehsen wrote:
> On Wednesday, January 11, 2017 12:23:34 PM CET Tim Ruehsen wrote:
> > On Tuesday, January 10, 2017 11:40:49 PM CET Daniel Stenberg wrote:
> > > On Tue, 10 Jan 2017, Alessandro Ghedini wrote:
> > > >> TESTFAIL: These test cases failed: 165 1034 1035 2046 2047
> > > >
> > > > Note that this is with curl 7.52.1 and libidn2 0.14 from Debian
> > > > unstable.
> > >
> > > I suspect this has something to do with libidn2's limitations, but we
> > > haven't changed any IDN code in curl since 7.52.1 that I can recall and
> > > I use 0.14 too.
> >
> > Sorry for dropping in late... I made the recent changes to libidn2, which
> > is basically TR46 support.
> >
> > Now the bad news: I introduced a bug (regression) regarding NFC conversion
> > in libidn2 0.14. A fix is already in the upstream repo but not released
> > yet. This might cause the test failures you experience... on some systems
> > UTF-8/Unicode might be decomposed and on some it is composed. Using
> > decomposed codepoints with IDN2_NFC_INPUT fails with libidn2 0.14.
> >
> > But if you enable the new TR46 feature, the input is NFCed (and
> > lowercased) automatically:
> >
> > Example:
> > $ printf "\x62\x6c\x61\xcc\x8a\x62\xc3\xa6\x72\x67\x72\xc3\xb8\x64\x2e\x6e\x6f"|idn2
> > idn2: lookup: string is not in Unicode NFC format
> >
> > $ printf "\x62\x6c\x61\xcc\x8a\x62\xc3\xa6\x72\x67\x72\xc3\xb8\x64\x2e\x6e\x6f"|idn2 -T
> >
> > You can check for TR46 availability:
> >
> > #if IDN2_VERSION_NUMBER >= 0x00140000
> >   if ((rc = idn2_lookup_u8((uint8_t *)utf8, (uint8_t **)ascii,
> >        IDN2_TRANSITIONAL)) == IDN2_OK)
> >   ...
> > #else
> >   ...
> > #endif
> >
> > IDN2_TRANSITIONAL enables TR46 'transitional' conversion (tries to be
> > as compatible with both IDNA2008 and IDNA2003 as possible);
> > IDN2_NONTRANSITIONAL enables TR46 'non-transitional' conversion
> > (IDNA2008, the way that every app should go... some incompatibilities
> > with IDNA2003, which is still under heavy use, may arise).
>
> I just looked at lib/url.c... how do you NFC the input to
> idn2_lookup_ul()? I couldn't find any conversion using a quick grep.
>
> My suggestion would be:
>
> diff --git a/lib/url.c b/lib/url.c
> index 7944d7b0c..81cd490e0 100644
> --- a/lib/url.c
> +++ b/lib/url.c
> @@ -4010,7 +4010,12 @@ static void fix_hostname(struct connectdata *conn, struct hostname *host)
>  #ifdef USE_LIBIDN2
>        if(idn2_check_version(IDN2_VERSION)) {
>          char *ace_hostname = NULL;
> -        int rc = idn2_lookup_ul((const char *)host->name, &ace_hostname, 0);
> +#ifdef IDN2_TRANSITIONAL
> +        int flags = IDN2_TRANSITIONAL;
> +#else
> +        int flags = IDN2_INPUT_NFC;
> +#endif
> +        int rc = idn2_lookup_ul((const char *)host->name, &ace_hostname, flags);
>          if(rc == IDN2_OK) {
>            host->encalloc = (char *)ace_hostname;
>            /* change the name pointer to point to the encoded hostname */
>
> That way there is a chance that the tests behave more consistently across
> different systems.
Sorry, in the patch above I meant IDN2_NFC_INPUT, not IDN2_INPUT_NFC.

Tim
-------------------------------------------------------------------
List admin: https://cool.haxx.se/list/listinfo/curl-library
Etiquette:  https://curl.haxx.se/mail/etiquette.html