Author: rinrab
Date: Tue Jun 23 18:53:59 2026
New Revision: 1935602
Log:
Switch command-line programs to UTF-8 mode.
Merging from branch utf8-cmdline.
The Subversion library internally works with UTF-8 encoded strings, which
simplifies it a lot. However, command-line programs receive arguments in
locale encoding (cstring). Before this change, we converted the encoding as
we handled each individual argument. This approach has several limitations:
1) It's easy to miss some conversions, since we need to constantly keep in
mind that the arguments are in a non-UTF-8 encoding, unlike what the majority
of the codebase assumes. Let's keep it consistent and simple.
2) Even though we could perfectly handle non-ASCII (Unicode) strings in UTF-8
encoding under the hood, cstring squishes them, allowing only ASCII data.
Switching the command line to UTF-8 removes that layer of converting input
data back and forth, which means direct conversion from UTF-16 to UTF-8
without any complications.
Recently, we added a family of svn_cmdline__get_cstring_argv() functions to
resolve an issue with inconsistent encoding of what we assumed was cstring.
Now, it's time to get into UTF-8, so we are introducing
svn_cmdline__get_utf8_argv() for that. It uses svn_utf__win32_utf16_to_utf8()
on Windows, and svn_utf_cstring_to_utf8() on Unix.
This branch should not bring any behaviour changes on Unix, while Windows code
now supports Unicode command-line arguments.
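For readers unfamiliar with the Windows path, here is a minimal sketch of a
direct UTF-16 to UTF-8 conversion in portable C. It is not the code of
svn_utf__win32_utf16_to_utf8(); the name sketch_utf16_to_utf8(), its signature,
and the choice to replace unpaired surrogates with U+FFFD are assumptions made
only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT part of Subversion.  Encode the NUL-terminated
   UTF-16 string SRC as UTF-8 into DST (capacity DST_SIZE bytes, including
   the terminating NUL).  Return the number of bytes written, excluding the
   NUL, or (size_t)-1 if DST is too small. */
size_t
sketch_utf16_to_utf8(const uint16_t *src, char *dst, size_t dst_size)
{
  size_t n = 0;

  assert(src != NULL && dst != NULL);
  if (dst_size == 0)
    return (size_t)-1;

  while (*src)
    {
      uint32_t cp = *src++;
      size_t need;

      if (cp >= 0xD800 && cp <= 0xDBFF && *src >= 0xDC00 && *src <= 0xDFFF)
        {
          /* Valid surrogate pair: combine into one code point. */
          cp = 0x10000 + ((cp - 0xD800) << 10) + (uint32_t)(*src++ - 0xDC00);
        }
      else if (cp >= 0xD800 && cp <= 0xDFFF)
        {
          /* Unpaired surrogate: substitute U+FFFD (an assumption of this
             sketch, other policies are possible). */
          cp = 0xFFFD;
        }

      need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
      if (n + need + 1 > dst_size)
        return (size_t)-1;

      if (need == 1)
        {
          dst[n++] = (char)cp;
        }
      else if (need == 2)
        {
          dst[n++] = (char)(0xC0 | (cp >> 6));
          dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
      else if (need == 3)
        {
          dst[n++] = (char)(0xE0 | (cp >> 12));
          dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
          dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
      else
        {
          dst[n++] = (char)(0xF0 | (cp >> 18));
          dst[n++] = (char)(0x80 | ((cp >> 12) & 0x3F));
          dst[n++] = (char)(0x80 | ((cp >> 6) & 0x3F));
          dst[n++] = (char)(0x80 | (cp & 0x3F));
        }
    }

  dst[n] = '\0';
  return n;
}
```

The point of the sketch is that no locale code page is involved at any step, so
characters outside the current code page survive the conversion.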
* subversion/include/private/svn_cmdline_private.h
(svn_cmdline__win32_get_utf8_argv, svn_cmdline__default_get_utf8_argv):
Declare functions.
(svn_cmdline__get_utf8_argv): New define.
* subversion/include/svn_client.h
(svn_client_args_to_target_array3): Declare new function that does not
perform UTF-8 conversion, unlike its old variant.
(svn_client_args_to_target_array2): Deprecate.
* subversion/include/svn_opt.h
(svn_opt_args_to_target_array4, svn_opt_parse_revprop2): Declare new
functions that do not perform UTF-8 conversion unlike their old variant.
(svn_opt_args_to_target_array3, svn_opt_parse_revprop): Deprecate.
* subversion/libsvn_client/client.h
(svn_client__process_target_array): Declare new private function that holds
the common target-collection code, for use by both the UTF8 and non-UTF8
versions.
* subversion/libsvn_client/cmdline.c
(svn_client__process_target_array): Implement function, using body of
former svn_client_args_to_target_array2() function.
(svn_client_args_to_target_array3): Implement by forwarding targets to
svn_client__process_target_array() as they are without any conversions.
* subversion/libsvn_client/deprecated.c
(svn_client_args_to_target_array2): Implement backward compatibility
function in a similar way as svn_client_args_to_target_array3(), but also
converting targets to UTF8 encoding.
* subversion/libsvn_subr/cmdline.c
(svn_cmdline__win32_get_utf8_argv): New function that receives argv in the
native Windows encoding and converts it to UTF8 using
svn_utf__win32_utf16_to_utf8.
(svn_cmdline__default_get_utf8_argv): New function that performs exactly the
same encoding conversions as our command-line code used to do on Unix
platforms, since it is backed by svn_utf_cstring_to_utf8().
* subversion/libsvn_subr/deprecated.c
(svn_opt_args_to_target_array3): Use svn_opt_parse_all_args() method to back
the implementation.
(svn_opt_parse_revprop): Implement wrapper over svn_opt_parse_revprop2 with
UTF8 conversion.
* subversion/libsvn_subr/opt.c
(svn_opt_parse_revprop2): Bump version and remove UTF8 conversion.
* subversion/svn/auth-cmd.c
* subversion/svn/changelist-cmd.c
* subversion/svn/propdel-cmd.c
* subversion/svn/propedit-cmd.c
* subversion/svn/propget-cmd.c
* subversion/svn/propset-cmd.c
* subversion/svn/shelf-cmd.c
* subversion/svn/shelf2-cmd.c
(svn_cl__auth,
svn_cl__changelist,
svn_cl__propdel,
svn_cl__propedit,
svn_cl__propget,
svn_cl__propset,
get_next_argument): Remove conversion to UTF8 encoding of target arguments.
* subversion/svn/svn.c
(parse_compatible_version): Do not convert the encoding of value of the
opt_compatible_version argument.
(sub_main): Use svn_cmdline__get_utf8_argv instead of
svn_cmdline__get_cstring_argv to get normalized arguments.
(sub_main): Read opt_arg directly to utf8_opt_arg and remove all conversion
from cstring to UTF8.
(sub_main): For some options which want cstring, convert utf8_opt_arg back.
Those include --diff-cmd, --merge-cmd, and --editor-cmd.
(sub_main): Use a newer version of svn_opt_parse_revprop,
svn_opt_parse_revprop2, which does not convert strings to UTF8.
(sub_main): Do not utf8-ize first_arg.
(sub_main): Do not convert the message to UTF8.
* subversion/svn/util.c
(svn_cl__make_log_msg_baton): Consider --encoding argument only when the
message is read from a file, keeping it in UTF8 when --message is utilised
instead.
(svn_cl__args_to_target_array_print_reserved): Use
svn_client_args_to_target_array3 instead of svn_client_args_to_target_array2,
which doesn't convert result to utf8, since we are working with already
converted 'os'.
* subversion/svnadmin/svnadmin.c
(parse_args): No need to convert command-line arguments to UTF8.
(set_revprop): Explicitly state UTF8 encoding when invoking the
svn_subst_translate_string2() since the arguments should already be in this
encoding. This ensures that no double conversion would be done.
(subcommand_lslocks): Use svn_opt_args_to_target_array4() instead of
svn_opt_args_to_target_array3() to retrieve targets, since it does not
convert encodings.
(sub_main): Use svn_cmdline__get_utf8_argv() instead of
svn_cmdline__get_cstring_argv() to get UTF8 argv.
(sub_main): Read opt_arg directly to utf8_opt_arg and remove all conversion
from cstring to UTF8.
(sub_main): Do not utf8-ize first_arg.
* subversion/svnbench/svnbench.c
(sub_main): Convert args straight to UTF8, remove in-place conversions, and
silently migrate to handle the other arguments as UTF8.
* subversion/svnbench/null-list-cmd.c
(includes): Remove svn_utf.h to make sure no more UTF8 conversions are done.
* subversion/svnbench/util.c
(svn_cl__args_to_target_array_print_reserved): Use newer version of
svn_client_args_to_target_array(), which doesn't perform UTF8 conversion.
(sub_main): Directly convert argv to UTF8 and remove all following
conversions.
* subversion/svnlook/svnlook.c
(print_diff_tree): Don't convert diff_options to UTF8 twice.
(sub_main): Convert args straight to UTF8, remove in-place conversions, and
handle the other arguments as UTF8.
* subversion/svnfsfs/svnfsfs.c
* subversion/svnmucc/svnmucc.c
* subversion/svndumpfilter/svndumpfilter.c
* subversion/svnrdump/svnrdump.c
* subversion/svnserve/svnserve.c
* subversion/svnversion/svnversion.c
* tools/client-side/svn-mergeinfo-normalizer/svn-mergeinfo-normalizer.c
* tools/client-side/svnconflict/svnconflict.c
* tools/dev/svnraisetreeconflict/svnraisetreeconflict.c
* tools/dev/wc-ng/svn-wc-db-tester.c
* tools/server-side/svnauthz.c
(includes): Remove svn_utf.h where possible as we no longer use its
functions.
(sub_main): Convert all args straight to UTF8 and remove later conversion.
* subversion/svnsync/svnsync.c
(initialize_cmd, synchronize_cmd, copy_revprops_cmd, info_cmd): Use
svn_opt_args_to_target_array4() instead of svn_opt_args_to_target_array3()
and remove svn_utf_cstring_to_utf8() calls.
(sub_main): Convert args straight to UTF8, remove in-place conversions, and
handle the other arguments as UTF8.
* subversion/tests/cmdline/svntest/testcase.py
(_detect_utf8_locale): New local helper.
(RequireUtf8_deco): Add RequireUtf8 decorator that allows a test to be run in
a UTF8 environment to fix failures on Unix platforms.
* subversion/tests/cmdline/basic_tests.py
(): Import RequireUtf8 decorator from testcase.py
(argv_with_best_fit_chars):
Adapt the unit test to the arguments in utf-8 bytes.
(unicode_arguments_test): Use that RequireUtf8 decorator to force the test to
run in the UTF8 environment.
* subversion/tests/libsvn_subr/opt-test.c
(includes): Add svn_hash.h.
(test_svn_opt_parse_revprop): Add new test.
(test_funcs): Run the test.
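
The split between the new non-converting functions and their deprecated
predecessors follows one pattern: the new variant expects UTF-8 and converts
nothing, while the old one converts its input and then forwards. A minimal
sketch of that layering follows. All names here (sketch_latin1_to_utf8,
sketch_revprop_value2, sketch_revprop_value) are hypothetical, and the
stand-in conversion assumes an ISO-8859-1 "locale" purely to keep the example
self-contained; the real wrappers presumably rely on svn_utf_cstring_to_utf8()
and pools.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a locale-to-UTF-8 conversion (assumes ISO-8859-1 input).
   Returns bytes written excluding the NUL, or (size_t)-1 if DST is too
   small. */
size_t
sketch_latin1_to_utf8(const char *src, char *dst, size_t dst_size)
{
  size_t n = 0;

  if (dst_size == 0)
    return (size_t)-1;

  for (; *src; src++)
    {
      unsigned char c = (unsigned char)*src;
      size_t need = c < 0x80 ? 1 : 2;

      if (n + need + 1 > dst_size)
        return (size_t)-1;

      if (c < 0x80)
        {
          dst[n++] = (char)c;
        }
      else
        {
          dst[n++] = (char)(0xC0 | (c >> 6));
          dst[n++] = (char)(0x80 | (c & 0x3F));
        }
    }

  dst[n] = '\0';
  return n;
}

/* "New" variant: the caller guarantees UTF-8, so nothing is converted.
   Returns a pointer to the VALUE part of "NAME=VALUE", or NULL if there
   is no '='. */
const char *
sketch_revprop_value2(const char *utf8_spec)
{
  const char *eq;

  assert(utf8_spec != NULL);
  eq = strchr(utf8_spec, '=');
  return eq ? eq + 1 : NULL;
}

/* "Old" variant kept for compatibility: convert to UTF-8 into BUF first,
   then forward to the new variant.  Returns NULL if BUF is too small. */
const char *
sketch_revprop_value(const char *locale_spec, char *buf, size_t buf_size)
{
  if (sketch_latin1_to_utf8(locale_spec, buf, buf_size) == (size_t)-1)
    return NULL;
  return sketch_revprop_value2(buf);
}
```

With this layering, a caller that already holds UTF-8 data uses the new
variant and avoids converting the same string twice.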
Reviewed by: ivan
Modified:
subversion/trunk/ (props changed)
subversion/trunk/subversion/include/private/svn_cmdline_private.h
subversion/trunk/subversion/include/svn_client.h
subversion/trunk/subversion/include/svn_opt.h
subversion/trunk/subversion/libsvn_client/client.h
subversion/trunk/subversion/libsvn_client/cmdline.c
subversion/trunk/subversion/libsvn_client/deprecated.c
subversion/trunk/subversion/libsvn_subr/cmdline.c
subversion/trunk/subversion/libsvn_subr/deprecated.c
subversion/trunk/subversion/libsvn_subr/opt.c
subversion/trunk/subversion/libsvn_subr/utf8proc/ (props changed)
subversion/trunk/subversion/svn/auth-cmd.c
subversion/trunk/subversion/svn/changelist-cmd.c
subversion/trunk/subversion/svn/cl.h
subversion/trunk/subversion/svn/propdel-cmd.c
subversion/trunk/subversion/svn/propedit-cmd.c
subversion/trunk/subversion/svn/propget-cmd.c
subversion/trunk/subversion/svn/propset-cmd.c
subversion/trunk/subversion/svn/shelf-cmd.c
subversion/trunk/subversion/svn/shelf2-cmd.c
subversion/trunk/subversion/svn/svn.c
subversion/trunk/subversion/svn/util.c
subversion/trunk/subversion/svnadmin/svnadmin.c
subversion/trunk/subversion/svnbench/null-list-cmd.c
subversion/trunk/subversion/svnbench/svnbench.c
subversion/trunk/subversion/svnbench/util.c
subversion/trunk/subversion/svndumpfilter/svndumpfilter.c
subversion/trunk/subversion/svnfsfs/svnfsfs.c
subversion/trunk/subversion/svnlook/svnlook.c
subversion/trunk/subversion/svnmucc/svnmucc.c
subversion/trunk/subversion/svnrdump/svnrdump.c
subversion/trunk/subversion/svnserve/svnserve.c
subversion/trunk/subversion/svnsync/svnsync.c
subversion/trunk/subversion/svnversion/svnversion.c
subversion/trunk/subversion/tests/cmdline/basic_tests.py
subversion/trunk/subversion/tests/cmdline/svntest/testcase.py
subversion/trunk/subversion/tests/libsvn_subr/opt-test.c
subversion/trunk/tools/client-side/svn-mergeinfo-normalizer/svn-mergeinfo-normalizer.c
subversion/trunk/tools/client-side/svnconflict/svnconflict.c
subversion/trunk/tools/dev/svnraisetreeconflict/svnraisetreeconflict.c
subversion/trunk/tools/dev/wc-ng/svn-wc-db-tester.c
subversion/trunk/tools/server-side/svnauthz.c