On Mon, May 26, 2025 at 8:29 PM Branko Čibej <br...@apache.org> wrote:

> On 26. 5. 25 14:07, Timofei Zhakov wrote:
>
> On Tue, May 20, 2025 at 9:29 PM Branko Čibej <br...@apache.org> wrote:
>
>> On 20. 5. 25 20:31, rin...@apache.org wrote:
>>
>> Author: rinrab
>> Date: Tue May 20 18:31:35 2025
>> New Revision: 1925722
>>
>> URL: http://svn.apache.org/viewvc?rev=1925722&view=rev
>> Log:
>> On the 'utf8-cmdline-prototype' branch: add a new function that retrieves
>> target list from args, instead of converting the encoding, assumes that
>> args have already been converted.
>>
>>
>>
>> Once again, why don't you use apr_app_initialize() that does all that and
>> more, instead of perpetuating the same problem? As long as this is used on
>> Windows from main() instead of wmain(), there's no way it will work
>> consistently.
>>
>
> First I wanna mention that this exact change is still needed even if we
> decide to use apr app to host the binary. At this point there is no
> difference between using apr app and svn's implementation of wmain and main
> wrappers.
>
> Also, I'd like to argue about using apr app in general. The subversion's
> command line wasn't using this library during the whole history (as far as
> I know), so I don't think it's correct to start using it now. What would we
> do if we need some functionality apr app doesn't implement? I mean that
> would be a quite huge change...
>
> In my opinion, apr app is not something fully ready to use in subversion.
> For example, it doesn't wrap main on unix platforms, while it still can be
> executed with non-utf8-ized arguments. In wmain() apr_conv_ucs2_to_utf8()
> just ignores the error, which is bad (on win32).
>
> And finally, the code we have right now is pretty good, and it seems
> possible to support non-ASCII arguments (which I'm working on).
>
> I hope this makes some sense.
>
>
>
> OK, fine, but I still don't see where you're converting the argument list
> directly from UTF-16-LE to UTF-8. svn_cmdline__win32_get_cstring_argv
> doesn't do that, and it should.
>
> That's the main problem on Windows: that function is converting to the
> local multibyte character set, then svn_client_args_to_target_array2
> converts /that/ to UTF-8. That's fine-ish on Unix, where we don't convert
> twice, but completely brain-dead on Windows. And I freely admit I had a lot
> do do with that brain-deadedness.
>
> This is what should be happening, IMO:
>
>    1. deprecate svn_client_args_to_target_array2
>    2. rename your svn_client_args_to_target_array_utf8
>    svn_client_args_to_target_array3 and make it explicitly require that the
>    args are already in UTF-8
>    3. svn_cmdline__win32_get_cstring_argv must convert to UTF-8
>    4. svn_cmdline__default_get_cstring_argv must *also* convert to UTF-8
>
>
> You've done steps 1 and 2, 3 should be trivial and 4 will make the result
> compatible with 3.
>
> -- Brane
>

IMHO, it's much better to follow my way instead, in which I'm adding utf8
alternatives for those functions, without deprecating old ones, let's call
them cstring.

This is exactly what happened with svn_cmdline__get_cstring_argv(). I added
a new function svn_cmdline__get_utf8_argv(), that returns a utf8 string
instead. Then we can switch the one used in each executable one-by-one.

This approach makes it so we can smoothly switch all programs to use utf8
args, which simplifies the process of hacking this whole thing a lot.

The same thing with svn_client_args_to_target_array_utf8, and also this
difference in names is also helpful to differentiate between cstring and
utf8 versions much more than version bump. Also the old one has not been
deprecated, because someone may still want this function to convert
encoding.

Let's not overcomplicate things if possible.

-- 
Timofei Zhakov

Reply via email to