Re: [PATCH v6 0/1] http: Add Accept-Language header if possible

2015-01-19 Thread Eric Sunshine
On Sunday, January 18, 2015, Yi EungJun  wrote:
> Add an Accept-Language header which indicates the user's preferred
> languages defined by $LANGUAGE, $LC_ALL, $LC_MESSAGES and $LANG.
>
> Examples:
>   LANGUAGE= -> ""
>   LANGUAGE=ko:en -> "Accept-Language: ko, en;q=0.9, *;q=0.1"
>   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *;q=0.1"
>   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *;q=0.1"
>
> This gives git servers a chance to display remote error messages in
> the user's preferred language.
>
> Limit the number of languages to 1,000 because q-value must not be
> smaller than 0.001, and limit the length of Accept-Language header to
> 4,000 bytes for some HTTP servers which cannot accept such long header.
>
> Signed-off-by: Yi EungJun 
> ---
> diff --git a/http.c b/http.c
> index 040f362..349b033 100644
> --- a/http.c
> +++ b/http.c
> @@ -986,6 +993,145 @@ static void extract_content_type(struct strbuf *raw, 
> struct strbuf *type,
> strbuf_addstr(charset, "ISO-8859-1");
>  }
>
> +static void write_accept_language(struct strbuf *buf)
> +{
> +   [...]
> +   const int MAX_DECIMAL_PLACES = 3;
> +   const int MAX_LANGUAGE_TAGS = 1000;
> +   const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
> +   struct strbuf *language_tags = NULL;
> +   int num_langs;

Mental note: 'num_langs' is not initialized.

> +   const char *s = get_preferred_languages();
> +
> +   /* Don't add Accept-Language header if no language is preferred. */
> +   if (!s)
> +   return;
> +
> +   /*
> +* Split the colon-separated string of preferred languages into
> +* language_tags array.
> +*/
> +   do {
> +   /* increase language_tags array to add new language tag */
> +   REALLOC_ARRAY(language_tags, num_langs + 1);

'num_langs' is used here uninitialized.

> +   strbuf_init(&language_tags[num_langs], 0);
> +
> +   /* collect language tag */
> +   for (; *s && (isalnum(*s) || *s == '_'); s++)
> +   strbuf_addch(&language_tags[num_langs], *s == '_' ? 
> '-' : *s);
> +
> +   /* skip .codeset, @modifier and any other unnecessary parts */
> +   while (*s && *s != ':')
> +   s++;
> +
> +   if (language_tags[num_langs].len > 0) {

Mental note: An empty ("") language tag is never allowed in language_tags[].

 > +   num_langs++;

This is a little bit ugly. At the top of the loop, you allocate space
in the array for a strbuf and initialize it. However, if the language
tag is empty (""), then 'num_langs' is never incremented, so the next
time through the loop, strbuf_init() is invoked on the same block of
memory (assuming the realloc was a no-op since the allocation size did
not change), overwriting whatever was there and possibly leaking
memory. In this particular case, by examining the parser code closely,
we can see that nothing was added to the strbuf, so nothing is being
leaked the next time around, given the current implementation of
strbuf.

However, this is potentially fragile. A change to the implementation
of strbuf in the future (for instance, if strbuf_init() allocates
memory immediately) could result in a leak here. Moreover, this
no-leak situation only holds true if no text at all has been added to
the strbuf after strbuf_init(). If someone changes the parser in the
future to operate a bit differently so that some text is added and
then removed from the strbuf, even though the end result still has
length is 0, then it will start leaking.

One way to make this more robust would be to have a separate strbuf
for collecting the language tag. When you encounter a non-empty tag,
only then grow the array and initialize the new strbuf in the array.
Finally, use strbuf_swap() to swap the collected language tag into the
new array position. Something like this:

struct strbuf tag = STRBUF_INIT;
do {
 for (; *s && (isalnum(*s) || *s == '_'); s++)
 strbuf_addch(&tag, *s == '_' ? '-' : *s);

[...]

if (tag.len) {
num_langs++;
REALLOC_ARRAY(language_tags, num_langs);
strbuf_init(&language_tags[num_langs], 0);
strbuf_swap(&tag, &language_tags[num_langs]);

if (num_langs >= ...)
break;
}
while (...);
strbuf_release(&tag);

> +   if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' 
> */
> +   break;
> +   }
> +   } while (*s++);
> +
> +   /* write Accept-Language header into buf */
> +   if (num_langs >= 1) {
> +   int i;
> +   int last_buf_len;

Mental note: 'last_buf_len' is not initialized.

> +   int max_q;
> +   int decimal_places;
> +   char q_format[32];
> +
> +   /* add '*' */
> +   REALLOC

[PATCH v6 0/1] http: Add Accept-Language header if possible

2014-12-22 Thread Yi EungJun
Changes since v5

>From Junio C Hamano's review:

* The tests use `ls-remote` instead of `clone` for tests; I copied the test
  code from ba8e63dc30a80656fddc616f714fb217ad220c04.

* Set cached_accept_langauge to NULL after free it.

>From Eric Sunshine's review:

* get_accept_language() returns a pointer to const char instead of strbuf; the
  type of cached_accept_language also has been changed to char* from strbuf*

* write_accept_language(), which is extracted from get_accept_language(),
  respects MAX_SIZE_OF_HEADER.

* The for-loop in write_accept_language() works correctly if lang_begin points
  an empty string.

>From Jeff King's advice:

* get_preferred_languages() considers LC_MESSAGES only if NO_GETTEXT is not
  defined.

* Remove the tests for LC_MESSAGES, LANG and LC_ALL.

Yi EungJun (1):
  http: Add Accept-Language header if possible

 http.c | 173 +
 remote-curl.c  |   2 +
 t/t5550-http-fetch-dumb.sh |  32 +
 3 files changed, 207 insertions(+)

-- 
2.2.0.375.gcd18ce6.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html