On Sun, Sep 20, 2020 at 11:06:21PM +0900, Yasuhito FUTATSUKI wrote: > On 2020/09/19 5:26, Yasuhito FUTATSUKI wrote: > > Hi, > > > > While I tried to make an example that escape_path() does not work as > > expected in specific locale such as ja_JP.SJIS or CP932 (suggested by > > Jun), I found it seems svn_cmdline__edit_file_externally() always > > passes the file name as UTF-8 even if the LC_CTYPE is other than > > UTF-8. > > > > I think it is also need svn_path_cstring_from_utf8() conversion for > > file_name in svn_cmdline__edit_file_externally(). > > As far as I read the code, svn_cmdline__edit_file_externally() is > called with paths come from svn_io_open_unique_file3() or > "local_abspath" from svn_client_conflict_t, so I believe it is > needed.
Your patch looks good. I have one question below: > I wrote a patch address it. > (attached fix-edit-file-externally-patch.txt) > [[[ > Fix file name to edit from utf8 to local style. > > * subversion/libsvn_subr/cmdline.c (svn_cmdline__edit_file_externally): > Apply svn_path_cstring_from_utf8() to target file name. > ]]] > > > Here is a reproucing script. (Using ja_JP.UTF8 and ja_JP.SJIS locale). > > As it is quite few system can use ja_JP.SJIS locale (I only know > FreeBSD and macOS) and there is also escape_path issue in SJIS locale, > I modified this script to use ja_JP.eucJP locale. It looks like apr_escape_shell() will escape 0x5c (backslash, \) when looking for ASCII character bytes. When 0x5c is escaped this breaks the SJIS-encoded byte sequence. Have you already tried always using escape_path() on the UTF-8 version of the string, and then converting the escaped path to the locale's encoding? In other words: First use escape_path, then use svn_path_cstring_from_utf8? Perhaps that will make SJIS work? Cheers, Stefan > [[[ > #!/bin/sh > # assuming UTF-8 encoding in this file > > testdir=/tmp/svn-conflict-edit-filename-test > > if [ ! -d ${testdir} ]; then > mkdir -p ${testdir} > fi > > reposdir=${testdir}/testrepo > reposurl=file://${reposdir} > :${svn:=svn} > > svnadmin create ${reposdir} > cat > ${testdir}/record_filename.sh <<EOF > #!/bin/sh > LC_CTYPE=C; export LC_CTYPE > echo \$* > ${testdir}/svn-conflict-edit-file-name.txt > exit 0 > EOF > > LC_CTYPE=ja_JP.UTF-8 ; export LC_CTYPE > > # add a file "予定表.txt" (it means schedule in Japanese) > # in UTF-8 working copy. > # "予定表.txt" represented in hex are followings: > # e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 (UTF-8) > # 97 5c 92 e8 95 5c 2e 74 78 74 (SJIS; contains two '\'(== 0x5c)) > # cd bd c4 ea 9c bd 2e 74 78 74 (EUC-JP) > schedfn_utf8="予定表.txt" > schedfn_sjis=`echo ${schedfn_utf8} | iconv -f utf-8 -t sjis` > schedfn_eucjp=`echo ${schedfn_utf8} | iconv -f utf-8 -t euc-jp` > > ${svn} checkout ${reposurl} ${testdir}/wc-utf-8 > cd ${testdir}/wc-utf-8 > > cat > ${schedfn_utf8} <<EOF > 2020/09/19 foo > EOF > > ${svn} add ${schedfn_utf8} > ${svn} commit -m 'add schedule memo.' > > # prepare EUC-JP locale wc. > (LC_CTYPE=ja_JP.eucJP; export LC_CTYPE ; \ > ${svn} checkout $reposurl ${testdir}/wc-eucjp) > > # update the file in UTF-8 wc and commit it > cat >> ${schedfn_utf8} <<EOF > 2020/09/20 bar > EOF > ${svn} commit -m 'add schedule at 2020/09/20' > > # add local modification in EUC-JP wc > LC_CTYPE=ja_JP.eucJP ; export LC_CTYPE > cd ${testdir}/wc-eucjp > cat >> ${schedfn_eucjp} <<EOF > 2020/09/21 baz > EOF > > ${svn} update --force-interactive --accept edit \ > --editor-cmd "/bin/sh ${testdir}/record_filename.sh" > > LC_CTYPE=C ; export LC_CTYPE > ls | od -t x1 > od -t x1 ${testdir}/svn-conflict-edit-file-name.txt > ]]] > > The result with unpatched svn (last 4 lines): > [[[ > 0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a > 0000013 > 0000000 e4 ba 88 e5 ae 9a e8 a1 a8 2e 74 78 74 0a > 0000016 > ]]] > > The result with patched svn (last 4 lines): > [[[ > 0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a > 0000013 > 0000000 cd bd c4 ea c9 bd 2e 74 78 74 0a > 0000013 > ]]] > > Cheers, > -- > Yasuhito FUTATSUKI <futat...@yf.bsclub.org> > Index: subversion/libsvn_subr/cmdline.c > =================================================================== > --- subversion/libsvn_subr/cmdline.c (revision 1881848) > +++ subversion/libsvn_subr/cmdline.c (working copy) > @@ -1400,6 +1400,7 @@ > apr_pool_t *pool) > { > const char *editor, *cmd, *base_dir, *file_name, *base_dir_apr; > + const char *file_name_local; > char *old_cwd; > int sys_err; > apr_status_t apr_err; > @@ -1423,9 +1424,11 @@ > return svn_error_wrap_apr > (apr_err, _("Can't change working directory to '%s'"), base_dir); > > + SVN_ERR(svn_path_cstring_from_utf8(&file_name_local, file_name, pool)); > /* editor is explicitly documented as being interpreted by the user's > shell, > and as such should already be quoted/escaped as needed. */ > - cmd = apr_psprintf(pool, "%s %s", editor, escape_path(pool, file_name)); > + cmd = apr_psprintf(pool, "%s %s", editor, > + escape_path(pool, file_name_local)); > sys_err = system(cmd); > > apr_err = apr_filepath_set(old_cwd, pool);