Re: [PATCH] Fix UTF-8 ssid's
On Wed, 2014-08-06 at 14:43 +1000, Lorn Potter wrote:
> That code results in an ssid configured on the AP as LT-test-Å
> which from wpa_supplicant is this: LT-test-\xc5
> to have an undecipherable character at the end.

That character at the end is a representation of the UTF replacement
character in whatever charset your terminal is running in and with
whatever fonts your terminal supports.

> g_utf8_validate seems to be failing in this instance, which makes the
> string to have that undecipherable character at the end.

That Å is represented with value 0xc5 and has its highest bit set
(0xc5 is 11000101 in binary), which for UTF-8 means that the character
continues into the next byte. Since there is no next byte, the
UTF-8'ization makes the correct claim that the last byte is
undecipherable and substitutes it with the said UTF-8 substitution
char. The terminal then doesn't know how to show this substitution
char, so out comes something else.

Cheers,

Patrik

___
connman mailing list
connman@connman.net
https://lists.connman.net/mailman/listinfo/connman
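The point about 0xc5 can be checked by hand. Below is a minimal sketch in plain C, not ConnMan or GLib code; the helper names utf8_expected_len and utf8_runs_past_end are made up for illustration. It reads the length a UTF-8 sequence claims from its first byte and checks whether that many bytes are actually left in the buffer. It deliberately does not verify that the following bytes are valid continuation bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper: number of bytes a UTF-8 sequence should occupy,
 * judged from its first byte. Returns 0 for a byte that cannot start a
 * sequence (a continuation byte or an invalid lead byte).
 */
static size_t utf8_expected_len(unsigned char lead)
{
	if (lead < 0x80)
		return 1;	/* 0xxxxxxx: plain ASCII */
	if (lead >= 0xC2 && lead <= 0xDF)
		return 2;	/* 110xxxxx: one continuation byte follows */
	if (lead >= 0xE0 && lead <= 0xEF)
		return 3;	/* 1110xxxx: two continuation bytes follow */
	if (lead >= 0xF0 && lead <= 0xF4)
		return 4;	/* 11110xxx: three continuation bytes follow */
	return 0;
}

/*
 * Hypothetical helper: returns 1 if the multi-byte sequence starting at
 * buf[pos] claims more bytes than the buffer still holds, 0 otherwise.
 */
static int utf8_runs_past_end(const unsigned char *buf, size_t len,
			      size_t pos)
{
	size_t need = utf8_expected_len(buf[pos]);

	return need > 1 && pos + need > len;
}
```

For the nine bytes of LT-test-\xc5, the byte at offset 8 announces a two-byte sequence but is the last byte of the SSID, so utf8_runs_past_end() returns 1. If the AP had sent the UTF-8 encoding of Å (0xc3 0x85) instead, it would return 0.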
Re: [PATCH] Fix UTF-8 ssid's
On 1/08/2014 8:12 pm, Patrik Flykt wrote:
> Hi,
>
> On Fri, 2014-08-01 at 12:53 +0300, Jukka Rissanen wrote:
>> Hi Lorn,
>>
>>> +	result = g_convert_with_fallback((const char *)ssid, -1,
>>> +					"UTF-8", "ISO-8859-1",
>>> +					0, 0,
>>> +					&bytes_written, &error);
>>
>> SSID's are just byte arrays, how can we know which codeset we are
>> converting from (you assume ISO-8859-1 here)?

Hmm, true. They can also contain mixed codeset characters.

> We do not know the charset used for the SSID, so we can't do it like
> this. The code below this patch goes through the SSID character by
> character until it encounters a non-UTF-8 one. The offending
> characters are replaced by the U+FFFD replacement character and thus
> the procedure stitches up the name to be UTF-8 compliant.

That code results in an ssid configured on the AP as LT-test-Å
which from wpa_supplicant is this: LT-test-\xc5
to have an undecipherable character at the end.

g_utf8_validate seems to be failing in this instance, which makes the
string have that undecipherable character at the end.

With mixed codeset characters, this completely fails (but mixed fails
on every desktop platform I checked too).
Re: [PATCH] Fix UTF-8 ssid's
Hi,

On Mon, 2014-08-04 at 12:03 +1000, Lorn Potter wrote:
> My device seems to be using ANSI_X3.4-1968

Which one? The one (Jolla??) you run ConnMan on?

> For an AP with ssid set to LT-Test-ÅÅЀ
>
> connman shows: LT-Test-Ѐ
>
> wpa_cli shows it as: LT-Test-\xc5\xc5Ѐ
>
> (on my laptop it's LT-Test-__Ѐ)
> (consequently, only nm-applet comes closest to showing it:
> LT-Test-ÅÅЀ)

So ConnMan is quite close. Those were supposed to turn up as UTF
replacement character boxes. Apparently nm-applet just guesses, this
time correctly. And everybody gets É wrong, so not all parts of the
SSID are correctly configurable to the AP.

Are you watching ConnMan's SSID from the terminal? What does
'printenv | egrep "LC_|LANG"' show as locale settings? Can you check
with Wireshark/Kismet what the SSID actually looks like on the wire?

Cheers,

Patrik
Re: [PATCH] Fix UTF-8 ssid's
On 01/08/14 19:53, Jukka Rissanen wrote:
> Hi Lorn,
>
> I just wonder what is the issue that this patch is fixing, isn't the
> current implementation working?

I would say it isn't, or something isn't working as it should.

My device seems to be using ANSI_X3.4-1968

For an AP with ssid set to LT-Test-ÅÅЀ

connman shows: LT-Test-Ѐ

wpa_cli shows it as: LT-Test-\xc5\xc5Ѐ

(on my laptop it's LT-Test-__Ѐ)
(consequently, only nm-applet comes closest to showing it:
LT-Test-ÅÅЀ)

I dug a bit deeper and wpa_supplicant seems to be using a mixture of
UTF-16 and XML decimal (according to kcharselect) for unicode
characters.

> On Fri, 2014-08-01 at 19:42 +1000, Lorn Potter wrote:
>> I found an old patch that crashed, and fixed it up.
>> Enjoy!
>>
>> ---
>>  gsupplicant/supplicant.c |   20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/gsupplicant/supplicant.c b/gsupplicant/supplicant.c
>> index 534944b..19dbb1a 100644
>> --- a/gsupplicant/supplicant.c
>> +++ b/gsupplicant/supplicant.c
>> @@ -1256,6 +1256,26 @@ static void interface_network_removed(DBusMessageIter *iter, void *user_data)
>>
>>  static char *create_name(unsigned char *ssid, int ssid_len)
>>  {
>> +	SUPPLICANT_DBG("%s, %i", ssid, ssid_len)
>> +
>> +	gchar *result;
>> +	GError *error = 0;
>> +	gsize bytes_written = 0;
>> +
>> +	if (g_utf8_validate((const char *)ssid, ssid_len, NULL) == TRUE)
>> +		return g_strndup((const char *)ssid, ssid_len);
>> +
>> +	result = g_convert_with_fallback((const char *)ssid, -1,
>> +					"UTF-8", "ISO-8859-1",
>> +					0, 0,
>> +					&bytes_written, &error);
>
> SSID's are just byte arrays, how can we know which codeset we are
> converting from (you assume ISO-8859-1 here)?
>
>> +	if (result) {
>> +		return result;
>> +	} else {
>> +		SUPPLICANT_DBG("Error converting to UTF-8: %s", error->message);
>> +		g_error_free (error);
>> +	}
>> +
>>  	GString *string;
>>  	const gchar *remainder, *invalid;
>>  	int valid_bytes, remaining_bytes;
>
> C++ style code here, the variables should be declared at the
> beginning of func.
>
> Cheers,
> Jukka
Re: [PATCH] Fix UTF-8 ssid's
Hi,

On Fri, 2014-08-01 at 12:53 +0300, Jukka Rissanen wrote:
> Hi Lorn,
>
> > +	result = g_convert_with_fallback((const char *)ssid, -1,
> > +					"UTF-8", "ISO-8859-1",
> > +					0, 0,
> > +					&bytes_written, &error);
>
> SSID's are just byte arrays, how can we know which codeset we are
> converting from (you assume ISO-8859-1 here)?

We do not know the charset used for the SSID, so we can't do it like
this. The code below this patch goes through the SSID character by
character until it encounters a non-UTF-8 one. The offending characters
are replaced by the U+FFFD replacement character and thus the procedure
stitches up the name to be UTF-8 compliant.

Cheers,

Patrik
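The procedure described above can be sketched in plain C. This is an illustration only, not the code in gsupplicant/supplicant.c, which builds on g_utf8_validate() and GString. The function names valid_seq_len and sanitize_ssid are hypothetical, and the sketch assumes the SSID contains no embedded NUL byte, since the result is returned as a C string.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Length of the valid UTF-8 sequence at s, or 0 if the bytes are invalid. */
static size_t valid_seq_len(const unsigned char *s, size_t remaining)
{
	size_t need, i;
	unsigned long cp;

	if (s[0] < 0x80)
		return 1;

	if ((s[0] & 0xE0) == 0xC0) {
		need = 2;
		cp = s[0] & 0x1F;
	} else if ((s[0] & 0xF0) == 0xE0) {
		need = 3;
		cp = s[0] & 0x0F;
	} else if ((s[0] & 0xF8) == 0xF0) {
		need = 4;
		cp = s[0] & 0x07;
	} else {
		return 0;	/* stray continuation byte or invalid lead */
	}

	if (need > remaining)
		return 0;	/* sequence runs past the end of the SSID */

	for (i = 1; i < need; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return 0;	/* not a continuation byte */
		cp = (cp << 6) | (s[i] & 0x3F);
	}

	/* Reject overlong forms, surrogates and values above U+10FFFF. */
	if ((need == 2 && cp < 0x80) || (need == 3 && cp < 0x800) ||
	    (need == 4 && cp < 0x10000) || cp > 0x10FFFF ||
	    (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;

	return need;
}

/*
 * Copy valid UTF-8 sequences unchanged and replace every offending byte
 * with U+FFFD. Returns a malloc'd NUL-terminated string the caller must
 * free, or NULL on allocation failure.
 */
static char *sanitize_ssid(const unsigned char *ssid, size_t ssid_len)
{
	/* Worst case: every input byte becomes the 3-byte U+FFFD. */
	char *out = malloc(ssid_len * 3 + 1);
	size_t in = 0, o = 0;

	if (out == NULL)
		return NULL;

	while (in < ssid_len) {
		size_t n = valid_seq_len(ssid + in, ssid_len - in);

		if (n > 0) {
			memcpy(out + o, ssid + in, n);
			o += n;
			in += n;
		} else {
			memcpy(out + o, "\xEF\xBF\xBD", 3);	/* U+FFFD */
			o += 3;
			in += 1;
		}
	}
	out[o] = '\0';

	return out;
}
```

Fed the nine bytes LT-test-\xc5, sanitize_ssid() returns LT-test- followed by the three bytes EF BF BD, the UTF-8 encoding of U+FFFD. How that character is then drawn depends on the terminal's locale and fonts.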
Re: [PATCH] Fix UTF-8 ssid's
Hi,

> I found an old patch that crashed, and fixed it up.
> Enjoy!

New commit message style? ^^

Tomasz
Re: [PATCH] Fix UTF-8 ssid's
Hi Lorn,

I just wonder what is the issue that this patch is fixing, isn't the
current implementation working?

On Fri, 2014-08-01 at 19:42 +1000, Lorn Potter wrote:
> I found an old patch that crashed, and fixed it up.
> Enjoy!
>
> ---
>  gsupplicant/supplicant.c |   20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/gsupplicant/supplicant.c b/gsupplicant/supplicant.c
> index 534944b..19dbb1a 100644
> --- a/gsupplicant/supplicant.c
> +++ b/gsupplicant/supplicant.c
> @@ -1256,6 +1256,26 @@ static void interface_network_removed(DBusMessageIter *iter, void *user_data)
>
>  static char *create_name(unsigned char *ssid, int ssid_len)
>  {
> +	SUPPLICANT_DBG("%s, %i", ssid, ssid_len)
> +
> +	gchar *result;
> +	GError *error = 0;
> +	gsize bytes_written = 0;
> +
> +	if (g_utf8_validate((const char *)ssid, ssid_len, NULL) == TRUE)
> +		return g_strndup((const char *)ssid, ssid_len);
> +
> +	result = g_convert_with_fallback((const char *)ssid, -1,
> +					"UTF-8", "ISO-8859-1",
> +					0, 0,
> +					&bytes_written, &error);

SSID's are just byte arrays, how can we know which codeset we are
converting from (you assume ISO-8859-1 here)?

> +	if (result) {
> +		return result;
> +	} else {
> +		SUPPLICANT_DBG("Error converting to UTF-8: %s", error->message);
> +		g_error_free (error);
> +	}
> +
>  	GString *string;
>  	const gchar *remainder, *invalid;
>  	int valid_bytes, remaining_bytes;

C++ style code here, the variables should be declared at the beginning
of func.

Cheers,
Jukka
[PATCH] Fix UTF-8 ssid's
I found an old patch that crashed, and fixed it up.
Enjoy!

---
 gsupplicant/supplicant.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/gsupplicant/supplicant.c b/gsupplicant/supplicant.c
index 534944b..19dbb1a 100644
--- a/gsupplicant/supplicant.c
+++ b/gsupplicant/supplicant.c
@@ -1256,6 +1256,26 @@ static void interface_network_removed(DBusMessageIter *iter, void *user_data)
 
 static char *create_name(unsigned char *ssid, int ssid_len)
 {
+	SUPPLICANT_DBG("%s, %i", ssid, ssid_len)
+
+	gchar *result;
+	GError *error = 0;
+	gsize bytes_written = 0;
+
+	if (g_utf8_validate((const char *)ssid, ssid_len, NULL) == TRUE)
+		return g_strndup((const char *)ssid, ssid_len);
+
+	result = g_convert_with_fallback((const char *)ssid, -1,
+					"UTF-8", "ISO-8859-1",
+					0, 0,
+					&bytes_written, &error);
+	if (result) {
+		return result;
+	} else {
+		SUPPLICANT_DBG("Error converting to UTF-8: %s", error->message);
+		g_error_free (error);
+	}
+
 	GString *string;
 	const gchar *remainder, *invalid;
 	int valid_bytes, remaining_bytes;
-- 
1.7.10.4
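To see what the ISO-8859-1 fallback in this patch amounts to, here is a minimal sketch in plain C of a Latin-1 to UTF-8 conversion. It is not GLib's implementation of g_convert_with_fallback(); the function name latin1_to_utf8 is hypothetical. It takes an explicit length rather than relying on NUL termination, because an SSID is handed over as a byte array plus ssid_len.

```c
#include <stddef.h>
#include <stdlib.h>

/*
 * Interpret every input byte as an ISO-8859-1 character and encode it
 * as UTF-8. Returns a malloc'd NUL-terminated string the caller must
 * free, or NULL on allocation failure.
 */
static char *latin1_to_utf8(const unsigned char *in, size_t len)
{
	/* Worst case: every byte is >= 0x80 and becomes two bytes. */
	char *out = malloc(len * 2 + 1);
	size_t i, o = 0;

	if (out == NULL)
		return NULL;

	for (i = 0; i < len; i++) {
		if (in[i] < 0x80) {
			out[o++] = (char)in[i];		/* ASCII stays as is */
		} else {
			/* U+0080..U+00FF: 110000xx 10xxxxxx */
			out[o++] = (char)(0xC0 | (in[i] >> 6));
			out[o++] = (char)(0x80 | (in[i] & 0x3F));
		}
	}
	out[o] = '\0';

	return out;
}
```

With this, the lone byte 0xc5 becomes 0xc3 0x85, which is Å in UTF-8, so the test SSID displays as intended. The same conversion applied to an SSID written in any other single-byte charset would succeed just as silently and produce the wrong characters, because every byte value is a valid ISO-8859-1 character.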