On 2013-07-04 19.19, brian m. carlson wrote:
> The commit code already contains code for validating UTF-8, but it does not
> check for invalid values, such as guaranteed non-characters and surrogates.  
> Fix
s/guaranteed non-characters/code points out of range/
> this by explicitly checking for and rejecting such characters.
Do we really reject them, or do we (only) warn about them?

Another question:
Now that we have a check for code points out of range (beyond U+10FFFF),
do we want an additional test case for that?
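
For the input of such a test, something along these lines might do. This is
only a sketch: it assumes U+110000 (the first code point past the limit),
whose 4-byte bit pattern would be F4 90 80 80, and it writes to a made-up
scratch file instead of "$HOME/invalid" so that it runs on its own:

```shell
# Hypothetical scratch file; a real test would reuse "$HOME/invalid" from the patch.
msg=$(mktemp)
# \364\220\200\200 is F4 90 80 80 in octal, the byte pattern for U+110000.
printf 'Commit message\n\nOut of range:\364\220\200\200\n' >"$msg"
# Dump the last five bytes (the four-byte sequence plus the newline) as hex.
hex=$(tail -c 5 "$msg" | od -An -tx1 | tr -d ' \n')
echo "$hex"
rm -f "$msg"
```

This should print f49080800a; the rest of the test could then follow the
surrogate test quoted below.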

> +test_expect_success 'UTF-8 invalid characters refused' '
Maybe:
 test_expect_success 'UTF-8 invalid surrogate' '

> +     test_when_finished "rm -f $HOME/stderr $HOME/invalid" && 
> +     rm -f "$HOME/stderr" &&
> +     echo "UTF-8 characters" >F &&
> +     printf "Commit message\n\nInvalid surrogate:\355\240\200\n" \
> +             >"$HOME/invalid" &&
> +     git commit -a -F "$HOME/invalid" \
> +             2>"$HOME"/stderr &&
> +     grep "did not conform" "$HOME"/stderr
> +'
> +
> +rm -f "$HOME/stderr"
Would it make sense to "grep on the fly", like this?
git commit -a -F "$HOME/invalid" 2>&1 | grep "did not conform"
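
One caveat with the pipe form, as far as I understand it: the shell reports
only the exit status of the last command in a pipeline, so a failing
"git commit" would go unnoticed. A small sketch with a stand-in function
(noisy_failure is made up for illustration, it is not git):

```shell
# Stand-in for a command that prints a warning on stderr and then fails.
noisy_failure () {
	echo "did not conform" >&2
	return 1
}

# The pipeline's status is grep's status, not noisy_failure's.
noisy_failure 2>&1 | grep "did not conform" >/dev/null
pipe_status=$?
echo "pipeline exit status: $pipe_status"
```

Without "set -o pipefail" or similar, this reports status 0 even though the
first command failed, so keeping the stderr file may be the safer choice.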

