Re: [PATCH v2] charset: fix mm_charset_get_encoded_len

2017-08-15 Thread Aleksander Morgado
On 12/08/17 00:17, Ben Chan wrote:
> The while loop in mm_charset_get_encoded_len() iterates through each
> valid UTF-8 encoded character in the given NULL-terminated UTF-8 string.
> It uses g_utf8_find_next_char() to find the position of the next
> character. In case, g_utf8_find_next_char() returns NULL, it tries to
> find the end (i.e. the NULL character) of the string.
> 
> This patch fixes the following issues in the while loop:
> 
> 1. The loop uses both 'next' and 'end' to track the position of the next
>character in the string.
> 
>When g_utf8_find_next_char() returns a non-NULL value, 'next' is
>essentially the same as 'end'.
> 
>When g_utf8_find_next_char() returns NULL, 'next' remains NULL while
>'end' is adjusted to track the end of the string (but is done
>incorrectly as described in #2). After the 'p = next' assignment, the
>'while (*p)' check results in a NULL dereference. 'p' should thus be
>set to 'end' instead of 'next'.
> 
>'next' is thus redundant and can be removed.
> 
> 2. When g_utf8_find_next_char() returns NULL and 'end' is adjusted to
>track the end of string, the 'while (*end++)' loop stops when finding
>the NULL character, but 'end' is advanced past the NULL character.
>After the 'p = end' assignment, the 'while (*p)' check results in a
>dereference of out-of-bounds pointer.
> 
>'while (*++end)' should be used instead given that 'p' doesn't point
>to a NULL character when 'end = p' happens. 'end' will be updated to
>point to the NULL character. After the 'p = end' assignment (fixed in
>#1), the 'while (*p)' check will properly stop the loop.
> ---
> v2: Updated a comment suggested by Dan
> 

Pushed to git master. I'm sending a follow up commit with a related issue.

>  src/mm-charsets.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/src/mm-charsets.c b/src/mm-charsets.c
> index 191789d7..fb47b0ae 100644
> --- a/src/mm-charsets.c
> +++ b/src/mm-charsets.c
> @@ -595,7 +595,7 @@ mm_charset_get_encoded_len (const char *utf8,
>  MMModemCharset charset,
>  guint *out_unsupported)
>  {
> -const char *p = utf8, *next;
> +const char *p = utf8;
>  guint len = 0, unsupported = 0;
>  SubsetEntry *e;
>  
> @@ -618,17 +618,17 @@ mm_charset_get_encoded_len (const char *utf8,
>  
>  c = g_utf8_get_char_validated (p, -1);
>  g_return_val_if_fail (c != (gunichar) -1, 0);
> -end = next = g_utf8_find_next_char (p, NULL);
> +end = g_utf8_find_next_char (p, NULL);
>  if (end == NULL) {
> -/* Find the end... */
> +/* Find the string terminating NULL */
>  end = p;
> -while (*end++);
> +while (*++end);
>  }
>  
>  if (!e->func (c, p, (end - p), ))
>  unsupported++;
>  len += clen;
> -p = next;
> +p = end;
>  }
>  
>  if (out_unsupported)
> 


-- 
Aleksander
https://aleksander.es
___
ModemManager-devel mailing list
ModemManager-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/modemmanager-devel


[PATCH v2] charset: fix mm_charset_get_encoded_len

2017-08-11 Thread Ben Chan
The while loop in mm_charset_get_encoded_len() iterates through each
valid UTF-8 encoded character in the given NULL-terminated UTF-8 string.
It uses g_utf8_find_next_char() to find the position of the next
character. In case, g_utf8_find_next_char() returns NULL, it tries to
find the end (i.e. the NULL character) of the string.

This patch fixes the following issues in the while loop:

1. The loop uses both 'next' and 'end' to track the position of the next
   character in the string.

   When g_utf8_find_next_char() returns a non-NULL value, 'next' is
   essentially the same as 'end'.

   When g_utf8_find_next_char() returns NULL, 'next' remains NULL while
   'end' is adjusted to track the end of the string (but is done
   incorrectly as described in #2). After the 'p = next' assignment, the
   'while (*p)' check results in a NULL dereference. 'p' should thus be
   set to 'end' instead of 'next'.

   'next' is thus redundant and can be removed.

2. When g_utf8_find_next_char() returns NULL and 'end' is adjusted to
   track the end of string, the 'while (*end++)' loop stops when finding
   the NULL character, but 'end' is advanced past the NULL character.
   After the 'p = end' assignment, the 'while (*p)' check results in a
   dereference of out-of-bounds pointer.

   'while (*++end)' should be used instead given that 'p' doesn't point
   to a NULL character when 'end = p' happens. 'end' will be updated to
   point to the NULL character. After the 'p = end' assignment (fixed in
   #1), the 'while (*p)' check will properly stop the loop.
---
v2: Updated a comment suggested by Dan

 src/mm-charsets.c | 10 +-
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/mm-charsets.c b/src/mm-charsets.c
index 191789d7..fb47b0ae 100644
--- a/src/mm-charsets.c
+++ b/src/mm-charsets.c
@@ -595,7 +595,7 @@ mm_charset_get_encoded_len (const char *utf8,
 MMModemCharset charset,
 guint *out_unsupported)
 {
-const char *p = utf8, *next;
+const char *p = utf8;
 guint len = 0, unsupported = 0;
 SubsetEntry *e;
 
@@ -618,17 +618,17 @@ mm_charset_get_encoded_len (const char *utf8,
 
 c = g_utf8_get_char_validated (p, -1);
 g_return_val_if_fail (c != (gunichar) -1, 0);
-end = next = g_utf8_find_next_char (p, NULL);
+end = g_utf8_find_next_char (p, NULL);
 if (end == NULL) {
-/* Find the end... */
+/* Find the string terminating NULL */
 end = p;
-while (*end++);
+while (*++end);
 }
 
 if (!e->func (c, p, (end - p), ))
 unsupported++;
 len += clen;
-p = next;
+p = end;
 }
 
 if (out_unsupported)
-- 
2.14.0.434.g98096fd7a8-goog

___
ModemManager-devel mailing list
ModemManager-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/modemmanager-devel