On Tue, Jun 10, 2025 at 8:00 PM Branko Čibej <br...@apache.org> wrote:
On 11. 6. 25 01:19, Nathan Hartman wrote:
On Tue, Jun 10, 2025 at 12:26 PM Branko Čibej <br...@apache.org>
wrote:
On 10. 6. 25 17:32, Timofei Zhakov wrote:
On Tue, Jun 10, 2025 at 5:24 PM Branko Čibej
<br...@apache.org> wrote:
On 10. 6. 25 16:01, Timofei Zhakov wrote:
On Mon, Jun 9, 2025 at 7:26 PM Branko Čibej
<br...@apache.org> wrote:
On 9. 6. 25 18:31, rin...@apache.org wrote:
Author: rinrab
Date: Mon Jun 9 16:31:15 2025
New Revision: 1926293
URL:http://svn.apache.org/viewvc?rev=1926293&view=rev
<http://svn.apache.org/viewvc?rev=1926293&view=rev>
Log:
Do not include apr_xlate.h into svn_utf.h, but manually wrap the
SVN_APR_*_CHARSET constants into APR_*_CHARSET.
-1. You can't presume that this will never change
in APR.
Fix the cmake build to handle include paths
correctly. This is not that fix. Please revert.
-- Brane
I want to make sure that you entirely understand this
change.
SVN_APR_*_CHARSETfamily has nothing similar with APR_*_CHARSETsince
they are being converted into APR constants
through the get_apr_xlate_charset()function.
This also means we can add our own shortcuts to SVN_APR_*_CHARSET,
which I'm planning to do for UTF-8.
As long as consumers of our library are not passing APR_*_CHARSETto
our utf routines, we are all fine.
Yes. "As long as". This stuff has been in our public
header since before Subversion was an ASF project. APR
is and always has been part of Subversion's public
interface.
It doesn't make us dependent on apr's constants. Please
withdraw your veto.
You're changing a public header in a
backward-incompatible way because the CMake build has
broken include paths.
You don't see anything wrong with that? Looks to me like
you got your priorities wrong.
-- Brane
Okay, I see. You are talking about removing this include,
not about the whole change, right? I agree that this is
wrong for backward compat to remove this from the private
header. However, the remaining part
(with SVN_APR_*_CHARSET) is correct. It's still an
improvement IMO, which we can use later to add new charset
shortcuts (like I said for utf8).
No, it is not correct, because this is *not* a private
header. It's public interface.
The proper fix would be to put something like this in
CMakeLists.txt:
include_directories($<TARGET_PROPERTY:apr::apr,INCLUDE_DIRECTORIES>
$<TARGET_PROPERTY:apr::aprutil,INCLUDE_DIRECTORIES>)
Right now, your changes for utf-8 argument parsing has
spilled into our public API in an incompatible way, just to
avoid fixing cmake.
-- Brane
Hi,
Timofei, your tremendous efforts are very much appreciated.
Brane, your vigilance and care is very much appreciated too.
Okay, so there's a speed bump here. Please, let's all be patient and
try to gain some clarity, and let's try to stay positive. We've seen
quite a bit of what *shouldn't* be done; let's try to focus on what
*should* be done...
Yes, obviously I agree that we should not change public APIs, nor
mask any problem by patching the wrong thing, such as fixing a build
problem. Btw I think build system issues are not the motivation for
this change...
It looks to me like this patch is meant to be one step in a sequence,
but we haven't seen the rest of the sequence, because we've paused
here. So we're not understanding the full picture in context.
So, if we could understand what the rest of the sequence will be?
What
problem are we trying to solve? What is the intended solution?
From my reading of the diff:
It looks like the intention is to "subclass" and extend APR's charset
stuff by doing two things:
(1) defining SVN_APR_LOCALE_CHARSET and SVN_APR_DEFAULT_CHARSET
"on top of" APR_LOCALE_CHARSET and APR_DEFAULT_CHARSET, to
set the
stage for adding further definitions (which we haven't seen yet),
and
(2) adding the function get_apr_xlate_charset(), which currently is a
no-op, but appears as though it intends to be extended later to
examine CHARSET and either pass it through unchanged or translate
it to a different code.
Am I correct above the above? If not, where am I mistaken?
In APR, APR_LOCALE_CHARSET and APR_DEFAULT_CHARSET are #define'd to
0 and 1, respectively, typecast to (const char *). The reason for
this
typecasting is so that you can either pass one of the above names, or
pass a string, like "ISO-8559-1".
Several things are not clear to me:
(1) How does APR intend for applications to extend beyond these two
codes?
It doesn't. These are magic pointers for the only two encodings
that the application can't know directly.
Perhaps it's one of these possibilities:
(a) APR intends for applications to use either APR_LOCALE_CHARSET
and APR_DEFAULT_CHARSET or use strings. Nothing else. Or,
This.
(b) APR intends for applications to add their own defines, and
perhaps use the above defines and/or strings also. Or,
Sure applications can define their own string constants, just not
override those two, which are special.
(c) APR hasn't considered this question. Or,
Yes, we did. :)
Thanks for your response. This is extremely helpful!
(d) APR hasn't considered this question because it should never
be necessary to do so. If so, what *are* we supposed to do?
Brane's insights here will be particularly appreciated.
I frankly don't understand why we'd need any more than what APR
gives us. Subversion certainly doesn't want to deal with anyhting
explicit other than UTF-8 (and, maybe in the future, the other
UTF's that tend to be used in text files, such as UTF-16-LE on
Windows).
Those constants are intended to make it easy for the application
to create a translator *from* whatever encoding the locale give
and the source compiler give us. Everything else, as far as we're
concerned, is either utf-8 (internally) or something specific
about a file's contents that we can get from svn:mime-type. Not
that we currently do, as far as I'm aware.
Ack.
(2) Perhaps more importantly, does APR lack something that SVN needs
in this area? If so, what?
Now, a suggestion. A picture is worth a thousand words, and in
software development, code is pictures. So perhaps things would be
more clear to everyone by doing something like:
(1) Revert the change on trunk.
(2) Fix the CMake build if indeed it's broken right now; I saw
Brane's response but I haven't checked. Fixing it will remove
anything CMake-related from the equation, so we can focus on the
charset issue without doubts.
The CMake build is broken with regard to include directories,
that's a given, and it's easy to fix. Perhaps not in such a
cmake-purist way that some people around here appear to advocate,
so let me just drop this reminder here: a build tool is a *tool*,
meant to make life easier for our users and *us*. The point is not
to try to get some sort of literary award for purism.
Yes, after my earlier reply I saw the GHA notification about the build
breaking on the revision that added back the #include in question.
(3) Create a branch and make this change + the further changes that
are supposed to build upon it on the branch...
Now, with a branch and with all the changes together, we will all be
a lot smarter because we will see everything in context. Right now,
we're seeing one change in a vacuum and making judgments without the
full information.
Even if the branch turns out to approach the problem from the wrong
direction, at least we'll be able to see the full picture and get
some
better feedback about how to do it right.
Again, I agree about not changing the public APIs. That's what we
should *not* do. I'm trying to get at: what *should* we do?
As I said, I see no need for any extra public constants just to
get our command-line tools to parse their arguments correctly. All
we need is for that is APR_LOCALE_CHARSET, or rather
SVN_APR_LOCALE_CHARSET, which our conversion functions already use.
If, on the other hand, this is not the case -- then I'd really
like to see some discussion and design proposal here on the list.
Not just throwing code at a branch or at trunk with no real
explanation and apparently no plan, because right now I have
absolutely no clue what's on the branch, what's on trunk, what
will be merged or if will be merged at all.
I admit I'm a bit confused also, though partly that's because a lot
has been happening here with Timofei's efforts. We've gotten
accustomed to things being kind of dormant for a while. I'm frankly
glad to see that change.
And yes, I also don't know why this change is on the trunk rather than
the branch, why the need for our own defines, etc. But I'm going to
assume there is a plan and I'll wait to hear from Timofei before
reaching any conclusions. I'm +1 to discussing the aim and intention
of this change and how it fits in with whatever it tries to achieve.
Once we understand that, we can all be more helpful, and, again,
insights from your extensive knowledge of the APR/SVN interaction, and
any other guidance you can offer, will be greatly appreciated.
Originally the idea was to fix a known bug. I can't help but feel
that this bug fixing has gone slightly out of control, and the
goal now is something else -- but I don't know what.
-- Brane
Cheers,
Nathan