Re: [Question] Is it normal for accented characters to be shown as decomposed Unicode on GNU/Linux?

2015-06-22 Thread Bastien Traverse
On 22/06/2015 17:04, Charles Bailey wrote:
> Note that these aren't decomposed (in the unicode decomposition
> sense) but are merely octal escaped representations of the utf-8
> encoded file names.

Thanks, I had read that term in a similar context (German umlauts) and
thought it correctly described the phenomenon. Searching for the keywords
"octal escape" returns more precise results :)

> My understanding is that this is normal and probably dates back (at least
> for status) as far as:
>
>   commit a734d0b10bd0f5554abb3acdf11426040cfc4df0
>   Author: Dmitry Potapov dpota...@gmail.com
>   Date:   Fri Mar 7 05:30:58 2008 +0300
>
>   Make private quote_path() in wt-status.c available as
>   quote_path_relative()
>
>   [...]
>
> The behaviour can be changed by setting the git config variable
> core.quotePath to false.

This is awesome, thank you. Indeed I just tried my test case with this
config option set to false and accented characters appear normally.

Thank you!
--
To unsubscribe from this list: send the line unsubscribe git in


Re: [Question] Is it normal for accented characters to be shown as decomposed Unicode on GNU/Linux?

2015-06-22 Thread Charles Bailey
On Mon, Jun 22, 2015 at 03:17:40PM +0200, Bastien Traverse wrote:
> test case:
> $ mkdir accent-test && cd !$
> $ git init
> $ touch rêve réunion
> $ git status
> On branch master
>
> Initial commit
>
> Untracked files:
>   (use "git add <file>..." to include in what will be committed)
>
>         "r\303\251union"
>         "r\303\252ve"

Note that these aren't decomposed (in the unicode decomposition
sense) but are merely octal escaped representations of the utf-8
encoded file names.
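
To make the escaping concrete, here is a small shell sketch (not part of the original exchange) that goes both ways: it dumps the UTF-8 bytes of "é" in octal, and it turns the escaped name back into text. It assumes a UTF-8 terminal and the standard printf, od and tr utilities; the variable names are only illustrative.

```shell
# Dump the bytes of "é" in octal: in UTF-8 it is the two bytes 303 251.
# tr strips the padding (spaces and trailing newline) that od adds.
octal=$(printf 'é' | od -An -to1 | tr -d ' \n')
echo "$octal"

# Go the other way: printf interprets \NNN octal escapes in its format
# string, so the escaped name turns back into the real UTF-8 bytes.
decoded=$(printf 'r\303\251union')
echo "$decoded"
```

So \303\251 is simply "é" written byte by byte, which is a different thing from Unicode decomposition (an "e" followed by a combining accent).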

My understanding is that this is normal and probably dates back (at least
for status) as far as:

commit a734d0b10bd0f5554abb3acdf11426040cfc4df0
Author: Dmitry Potapov dpota...@gmail.com
Date:   Fri Mar 7 05:30:58 2008 +0300

Make private quote_path() in wt-status.c available as
quote_path_relative()

[...]

The behaviour can be changed by setting the git config variable
core.quotePath to false.
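
For reference, a minimal sketch of toggling that setting in a throwaway repository. The scratch directory from mktemp and the variable names are assumptions made for illustration, and the exact output may vary a little between git versions.

```shell
# Work in a scratch repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo" && git init -q
touch rêve

# core.quotePath=true is the default; it is passed explicitly here so that a
# global setting cannot change the result. Non-ASCII bytes are octal-escaped
# and the whole name is wrapped in double quotes.
quoted=$(git -c core.quotePath=true status --porcelain)
echo "$quoted"

# Set the variable for this repository only; the name is then printed verbatim.
git config core.quotePath false
plain=$(git status --porcelain)
echo "$plain"
```

Adding --global to the git config call would apply the setting to every repository of the current user.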


[Question] Is it normal for accented characters to be shown as decomposed Unicode on GNU/Linux?

2015-06-22 Thread Bastien Traverse
Hi everybody,

I have a repository where some files and folders contain accented
characters due to being in French. Such names include rêve (dream),
réunion (meeting) etc.

Whether already in version control or not, git tools only show their
*decomposed* representation (I use a UTF-8 locale, see below), but don't
accept those representations as input (and auto-completion is broken for
those), which is a bit misleading (test case follows).

I've seen the threads about accented characters on OSX and the use of
'core.precomposeunicode', but as I'm running on GNU/Linux I thought this
shouldn't apply.

Since I've already had a problem in git with a weirdly encoded character
(see http://thread.gmane.org/gmane.comp.version-control.git/269710), I
wanted to get some feedback to determine whether my setup was the cause
of it or if it was normal to see decomposed file names in git. I found
in man git-status:

 If a filename contains whitespace or other nonprintable
 characters, that field will be quoted in the manner of a C string
 literal: surrounded by ASCII double quote (34) characters, and with
 interior special characters backslash-escaped.

So does everybody using accented characters see those in decomposed form
in git? And if so, why does some software built on top of it (like gitit [1])
not inherit those decomposed representations?

[1] http://gitit.net/

Thanks!

---
test case:
$ mkdir accent-test && cd !$
$ git init
$ touch rêve réunion
$ git status
On branch master

Initial commit

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        "r\303\251union"
        "r\303\252ve"
$ git add .
$ git commit -m "accent test"
[master (root commit) 0d776b7] accent test
 2 files changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 "r\303\251union"
 create mode 100644 "r\303\252ve"
$ git log --summary
commit 0d776b7a09d5384a76066999431507018e292efe
Author: Bastien Traverse <bastien@traverse.email>
Date:   2015-06-22 14:13:46 +0200

    accent test

 create mode 100644 "r\303\251union"
 create mode 100644 "r\303\252ve"
$ mv rêve reve
$ git status
On branch master
Changes not staged for commit:
  (use "git add/rm <file>..." to update what will be committed)
  (use "git checkout -- <file>..." to discard changes in working directory)

        deleted:    "r\303\252ve"

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        reve

no changes added to commit (use "git add" and/or "git commit -a")
$ git add [[TAB-TAB]]
"r\303\252ve"  reve
$ git add [[TAB]] --> git add \"r\\303\\252ve\"
fatal: pathspec '"r\303\252ve"' did not match any files
$ git add "r\303\252ve"
fatal: pathspec 'r\303\252ve' did not match any files
$ git add rêve reve OR git add .
$ git status
On branch master
Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        renamed:    "r\303\252ve" -> reve
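
The escaped form is display-only, which would explain why pasting it back as a pathspec fails. A hedged sketch of one way to feed such a name back to git, assuming a scratch repository created with mktemp (the variable names are illustrative): let printf turn the octal escapes back into bytes, and use -z when the raw names are needed for scripting.

```shell
# Scratch repository with one accented file name.
repo=$(mktemp -d)
cd "$repo" && git init -q
touch rêve

# printf converts the octal escapes back into the real UTF-8 bytes, so git
# receives the actual file name rather than the quoted display form.
git add -- "$(printf 'r\303\252ve')"

# -z makes ls-files print NUL-terminated, unquoted paths; tr turns the NULs
# into newlines for display.
staged=$(git ls-files -z | tr '\0' '\n')
echo "$staged"
```

Typing the name directly (git add rêve) works just as well; the printf form is only useful when all you have is the escaped string.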

I'm running an up-to-date Arch Linux with the following software versions
and locale config:

$ uname -a
Linux xxx 4.0.5-1-ARCH #1 SMP PREEMPT Sat Jun 6 18:37:49 CEST 2015
x86_64 GNU/Linux
$ bash --version
GNU bash, version 4.3.39(1)-release (x86_64-unknown-linux-gnu)
$ git --version
git version 2.4.3
$ locale
LANG=fr_FR.utf8
LC_CTYPE=fr_FR.utf8
LC_NUMERIC=fr_FR.utf8
LC_TIME=fr_FR.utf8
LC_COLLATE=fr_FR.utf8
LC_MONETARY=fr_FR.utf8
LC_MESSAGES=fr_FR.utf8
LC_PAPER=fr_FR.utf8
LC_NAME=fr_FR.utf8
LC_ADDRESS=fr_FR.utf8
LC_TELEPHONE=fr_FR.utf8
LC_MEASUREMENT=fr_FR.utf8
LC_IDENTIFICATION=fr_FR.utf8
LC_ALL=
$ localectl
   System Locale: LANG=fr_FR.UTF8
   VC Keymap: fr
  X11 Layout: fr
 X11 Variant: oss

Cheers