On Sun, Mar 4, 2018 at 3:14 PM,  <lars.schnei...@autodesk.com> wrote:
> Git recognizes files encoded with ASCII or one of its supersets (e.g.
> UTF-8 or ISO-8859-1) as text files. All other encodings are usually
> interpreted as binary and consequently built-in Git text processing
> tools (e.g. 'git diff') as well as most Git web front ends do not
> visualize the content.
> [...]
> Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
> ---
> diff --git a/convert.c b/convert.c
> @@ -978,6 +1051,25 @@ static int ident_to_worktree(const char *path, const 
> char *src, size_t len,
> +static const char *git_path_check_encoding(struct attr_check_item *check)
> +{
> +       [...]
> +       /*
> +        * Ensure encoding names are always upper case (e.g. UTF-8) to
> +        * simplify subsequent string comparisons.
> +        */
> +       return xstrdup_toupper(value);

xstrdup_toupper() allocates memory...

> +}
> @@ -1033,6 +1125,7 @@ struct conv_attrs {
>         enum crlf_action attr_action; /* What attr says */
>         enum crlf_action crlf_action; /* When no attr is set, use 
> core.autocrlf */
>         int ident;
> +       const char *working_tree_encoding; /* Supported encoding or default 
> encoding if NULL */

...which is assigned to 'const char *'...

>  };
> @@ -1064,6 +1158,7 @@ static void convert_attrs(struct conv_attrs *ca, const 
> char *path)
>                         else if (eol_attr == EOL_CRLF)
>                                 ca->crlf_action = CRLF_TEXT_CRLF;
>                 }
> +               ca->working_tree_encoding = git_path_check_encoding(ccheck + 
> 5);

...by this code, and eventually leaked.

It's too bad it isn't cleaned up (freed), but looking at the callers,
fixing this leak would be mildly noisy (though not particularly
invasive). How much do we care about this leak?

>         } else {
>                 ca->drv = NULL;
>                 ca->crlf_action = CRLF_UNDEFINED;
> diff --git a/t/t0028-working-tree-encoding.sh 
> b/t/t0028-working-tree-encoding.sh
> @@ -0,0 +1,135 @@
> +test_expect_success 'check $GIT_DIR/info/attributes support' '
> +       test_when_finished "rm -f test.utf8.raw test.utf32.raw 
> test.utf32.git" &&

It seems weird to be cleaning up files this test didn't create
(test.utf8.raw and test.utf32.raw).

> +       test_when_finished "git reset --hard HEAD" &&
> +
> +       echo "*.utf32 text working-tree-encoding=utf-32" 
> >.git/info/attributes &&
> +       git add test.utf32 &&
> +
> +       git cat-file -p :test.utf32 >test.utf32.git &&
> +       test_cmp_bin test.utf8.raw test.utf32.git
> +'
> +
> +test_expect_success 'check unsupported encodings' '
> +       test_when_finished "rm -f err.out" &&
> +       test_when_finished "git reset --hard HEAD" &&

Resetting to HEAD here is an important cleanup action, but tests don't
usually clean up files such as 'err.out' since such detritus doesn't
usually impact subsequent tests negatively. (Just an observation; no
re-roll needed.)

> +       echo "*.nothing text working-tree-encoding=" >>.gitattributes &&
> +       printf "nothing" >t.nothing &&
> +       git add t.nothing &&
> +
> +       echo "*.garbage text working-tree-encoding=garbage" >>.gitattributes 
> &&
> +       printf "garbage" >t.garbage &&
> +       test_must_fail git add t.garbage 2>err.out &&
> +       test_i18ngrep "fatal: failed to encode" err.out
> +'

Reply via email to