On 11.06.2019 16:14, William A Rowe Jr wrote:
> On Tue, Jun 11, 2019 at 4:15 AM Branko Čibej <br...@apache.org> wrote:
>
>     On 07.06.2019 21:58, William A Rowe Jr wrote:
>     > I think the optimal way is to allocate a pair of APR thread-specific
>     > wchar buffers in each thread's pool on startup, and use those
>     > exclusively per-thread for wchar translations. We could be looking
>     > at 64k/thread exclusively for name translation, but it doesn't seem
>     > unreasonable.
>     >
>     > The alternative is to continue to use the stack; we surely don't want
>     > to lock on acquiring or allocating name translation buffers. /shrug
>
>     Since this is Windows, and there's no embedded Windows environment
>     left that I'm aware of that's still alive, we can continue with using
>     the stack ... but wouldn't it be so much better to alloca() the
>     required size instead of blindly burning through 64k every time?
>     Obviously that means counting the characters first.
>
>
> I don't think there is any need to do so. A name buffer needs 32k (right
> now we limit that to 8k, but if we wanted to satisfy the OS-acceptable
> pathname length, we would blow that out to 4x the current arbitrary
> limit). For a rename, make that 2x buffers. But there is no benefit to
> wasting cycles determining the string length ahead of allocation, because
> the very next call can run right past that limit and hit the wall on
> available stack. Demanding that there always be a potential 32k runway of
> available stack doesn't seem excessive.
>
> The point of the stack is that it contracts immediately on return. So we
> aren't burning through a 64k buffer - we are ensuring that we have been
> topped up to 64k remaining. The targets of these calls are all Win32 API
> invocations; we are never nesting them inside further big-buffer
> allocations of our own. What might happen in ntdll we have little control
> over.

Point. Agreed.

> We either reserve about 2x buffers for file name transliteration in heap
> per thread, or we use the thread stack. As long as we trust that our utf-8
> to ucs-2 logic is rock solid and the allocations and limits are correctly
> coded, this continues to be a safe approach.

Apropos of that, for 2.0 we're about to or have already ditched support
for versions of Windows that do not have native UTF-8/UTF-16 conversions
(ah, yes ... Windows has finally moved from UCS-2 to UTF-16). Wouldn't
this be the right time to switch to using Windows' functions instead of
staying with our own? Especially since, with the transition to UTF-16,
we have to deal correctly with surrogate pairs, something our current
code (IIRC) doesn't do.
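
To make the surrogate-pair point concrete, here is a rough, portable sketch
of what a correct UTF-8 to UTF-16 conversion into a fixed-size buffer has to
do. This is not APR's existing code, and sketch_utf8_to_utf16 is a made-up
name for illustration only. On Windows the native route would presumably be
MultiByteToWideChar() with CP_UTF8, which does this work for us.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT an APR function: convert a NUL-terminated UTF-8
 * string into UTF-16 code units in a caller-supplied buffer of 'outcap'
 * units. Returns the number of units written (excluding the terminating 0),
 * or -1 on malformed input or if the buffer is too small. */
static long sketch_utf8_to_utf16(const char *in, uint16_t *out, size_t outcap)
{
    /* Smallest code point allowed for each sequence length; anything
     * below it is an overlong encoding and is rejected. */
    static const uint32_t min_cp[4] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    while (*p) {
        uint32_t cp;
        int extra, i;

        /* The lead byte determines how many continuation bytes follow. */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return -1;
        p++;

        for (i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)  /* also catches a premature NUL */
                return -1;
            cp = (cp << 6) | (*p & 0x3F);
        }

        /* Reject overlong forms, encoded surrogates, and out-of-range. */
        if (cp < min_cp[extra] || cp > 0x10FFFF
            || (cp >= 0xD800 && cp <= 0xDFFF))
            return -1;

        if (cp < 0x10000) {
            /* BMP code point: one unit, plus room for the terminator. */
            if (n + 1 >= outcap) return -1;
            out[n++] = (uint16_t)cp;
        } else {
            /* Supplementary plane: this is the surrogate pair case that
             * a UCS-2-only converter cannot represent. */
            if (n + 2 >= outcap) return -1;
            cp -= 0x10000;
            out[n++] = (uint16_t)(0xD800 | (cp >> 10));
            out[n++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        }
    }

    if (n >= outcap) return -1;
    out[n] = 0;
    return (long)n;
}
```

For example, "A" followed by U+1F600 (UTF-8 bytes F0 9F 98 80) comes out as
three units: 0x0041, 0xD83D, 0xDE00. A 4-byte UTF-8 sequence only ever needs
two UTF-16 units, so a buffer sized in wide chars to the input length in
bytes (plus one for the terminator) is always enough.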

-- Brane
