On Wed, May 28, 2025 at 12:51 PM Branko Čibej <br...@apache.org> wrote:

> On 26. 5. 25 16:58, rin...@apache.org wrote:
>
> Author: rinrab
> Date: Mon May 26 14:58:30 2025
> New Revision: 1925827
>
> URL: http://svn.apache.org/viewvc?rev=1925827&view=rev
> Log:
> On the 'utf8-cmdline-prototype' branch: convert opt_arg to utf8 once
> in the beginning of argument handling loop instead of doing this conversion
> while handling every argument.
>
> This is transitional code to convert them straight to utf8 from platform
> dependent encoding.
>
> Modified:
>     subversion/branches/utf8-cmdline-prototype/subversion/svn/svn.c
>
> Modified: subversion/branches/utf8-cmdline-prototype/subversion/svn/svn.c
> URL: 
> http://svn.apache.org/viewvc/subversion/branches/utf8-cmdline-prototype/subversion/svn/svn.c?rev=1925827&r1=1925826&r2=1925827&view=diff
> ==============================================================================
> --- subversion/branches/utf8-cmdline-prototype/subversion/svn/svn.c (original)
> +++ subversion/branches/utf8-cmdline-prototype/subversion/svn/svn.c Mon May 
> 26 14:58:30 2025
> @@ -2308,13 +2308,17 @@ sub_main(int *exit_code,
>            return SVN_NO_ERROR;
>          }
>
> +      if (opt_arg != NULL)
> +        SVN_ERR(svn_utf_cstring_to_utf8(&utf8_opt_arg, opt_arg, pool));
> +      else
> +        utf8_opt_arg = NULL;
> +
>
>
> utf8_opt_arg is already in the pool after this step, so ...
>
> [...]
>
>
>        case opt_auth_username:
> -        SVN_ERR(svn_utf_cstring_to_utf8(&opt_state.auth_username,
> -                                        opt_arg, pool));
> +        opt_state.auth_username = apr_pstrdup(pool, utf8_opt_arg);
>          break;
>        case opt_auth_password:
> -        SVN_ERR(svn_utf_cstring_to_utf8(&opt_state.auth_password,
> -                                        opt_arg, pool));
> +        opt_state.auth_password = apr_pstrdup(pool, utf8_opt_arg);
>          break;
>
>
> ... why are you copying it into the very same pool again?
>
> [...]
>
>        case 'x':
> -        SVN_ERR(svn_utf_cstring_to_utf8(&opt_state.extensions,
> -                                        opt_arg, pool));
> +        opt_state.extensions = apr_pstrdup(pool, utf8_opt_arg);
>          break;
>
>
> And here...
>
> [...]
>
> -        SVN_ERR(svn_utf_cstring_to_utf8(&utf8_opt_arg, opt_arg, pool));
>          opt_state.old_target = apr_pstrdup(pool, utf8_opt_arg);
>          break;
>        case opt_new_cmd:
> -        SVN_ERR(svn_utf_cstring_to_utf8(&utf8_opt_arg, opt_arg, pool));
>          opt_state.new_target = apr_pstrdup(pool, utf8_opt_arg);
>          break;
>
>
> These were from before with no explanation as to why they're being copied
> twice. :(
>
> [...]
>
>        case opt_show_item:
> -        SVN_ERR(svn_utf_cstring_to_utf8(&utf8_opt_arg, opt_arg, pool));
> -        opt_state.show_item = utf8_opt_arg;
> +        opt_state.show_item = apr_pstrdup(pool, utf8_opt_arg);
>          break;
>
>
> This was actually correct before your change. :D
>
>
> Despite the million lines of code of option parsing in sub_main(),
> utf8_opt_arg is converted and consumed exactly once per iteration. So all
> those apr_pstrdup() calls, the ones that were there before and the ones you
> added, could be removed.
>
>
> -- Brane
>

The apr_getopt_long() function doesn't guarantee whether option_arg would
be allocated in a pool or not. Most probably it just returns a pointer to
argv, which is not even in any pool (passed from the system) or returned
from from svn_cmdline__get_utf8_argv(), which may cause issues. Actually
no, but it's still better to copy it for sure.

Previously, they were duplicated by svn_utf_cstring_to_utf8(), which
allocates the result in a pool even if no actual conversion has been
performed.

For example, check out this block of code, existed in the trunk before:

      case opt_encoding:
        opt_state.encoding = apr_pstrdup(pool, opt_arg);
        break;

Basically I want to avoid or minimise changing behaviour in utf8-isation.

Also this may be cool to introduce an iterpool into the argument handling
loop. In this case it's mandatory to copy them to the right one.

-- 
Timofei Zhakov

Reply via email to