On 11.06.2019 16:14, William A Rowe Jr wrote:
> On Tue, Jun 11, 2019 at 4:15 AM Branko Čibej <br...@apache.org> wrote:
>
> > On 07.06.2019 21:58, William A Rowe Jr wrote:
> > > I think the optimal way is to allocate a pair of apr thread-specific
> > > wchar buffers in each thread's pool on startup, and use those
> > > exclusively per-thread for wchar translations. We could be looking
> > > at 64k/thread exclusively for name translation, but it doesn't seem
> > > unreasonable.
> > >
> > > The alternative is to continue to use stack, we surely don't want to
> > > lock on acquiring or allocating name translation buffers. /shrug
> >
> > Since this is Windows, and there's no embedded Windows environment
> > left that I'm aware of that's still alive, we can continue with using
> > the stack ... but wouldn't it be so much better to alloca() the
> > required size instead of blindly burning through 64k every time?
> > Obviously that means counting the characters first.
>
> I don't think there is any need to do so. A name buffer needs 32k
> (right now we limit that to 8k, but if we wanted to satisfy the
> OS-acceptable pathname length, we would blow that out 4x the current
> arbitrary limit.) For a rename, make that 2x buffers. But there is no
> benefit to wasting cycles determining the string length ahead of
> allocation, because the very next call can run right past that limit
> and hit the wall on available stack. Demanding that there always be a
> potential 32k runway of available stack doesn't seem excessive.
>
> The point to the stack is that it contracts immediately on return. So
> we aren't burning through a 64k buffer - we are ensuring that we have
> been topped-up to 64k remaining. The targets of these calls are all
> Win32 API invocations, we are never nesting them inside further
> big-buffer allocations of our own. What might happen in ntdll we have
> little control over.

Point. Agreed.

> We either reserve about 2x buffers for file name transliteration in
> heap per thread, or we use the thread stack. As long as we trust that
> our utf-8 to ucs-2 logic is rock solid and the allocations and limits
> are correctly coded, this continues to be a safe approach.

Apropos of that, for 2.0 we're about to or have already ditched support
for versions of Windows that do not have native UTF-8/UTF-16 conversions
(ah, yes ... Windows has finally moved from UCS-2 to UTF-16). Wouldn't
this be the right time to switch to using Windows' functions instead of
staying with our own? Especially since, with the transition to UTF-16,
we have to deal correctly with surrogate pairs, something our current
code (IIRC) doesn't do.

-- Brane
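
To make the surrogate-pair point concrete, here is a minimal portable
sketch of what a UTF-8 to UTF-16 converter has to do for code points
above U+FFFF. It is an illustration only, not APR's current code: the
name sketch_utf8_to_utf16 and its NULL-means-count-only convention are
assumptions made up for this example. On Windows itself the native
equivalent would be MultiByteToWideChar() with CP_UTF8, which also
reports the required length when the output size is passed as 0.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not an APR function): convert NUL-terminated
 * UTF-8 to UTF-16. Returns the number of UTF-16 code units produced,
 * excluding the terminator, or (size_t)-1 on malformed input or when
 * the output (including its terminator) does not fit in outlen units.
 * If out is NULL, nothing is written and only the count is returned. */
size_t sketch_utf8_to_utf16(const char *in, uint16_t *out, size_t outlen)
{
    /* Smallest code point that legitimately needs 1..4 UTF-8 bytes. */
    static const uint32_t min_cp[] = { 0, 0x80, 0x800, 0x10000 };
    const unsigned char *p = (const unsigned char *)in;
    size_t n = 0;

    while (*p) {
        uint32_t cp;
        size_t units;
        int extra, i;

        /* The lead byte determines how many continuation bytes follow. */
        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return (size_t)-1;
        p++;

        for (i = 0; i < extra; i++, p++) {
            /* A truncated sequence ends here too: NUL is not 10xxxxxx. */
            if ((*p & 0xC0) != 0x80)
                return (size_t)-1;
            cp = (cp << 6) | (*p & 0x3F);
        }

        /* Reject overlong forms, encoded surrogates, and out-of-range. */
        if (cp < min_cp[extra] || cp > 0x10FFFF
            || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t)-1;

        /* Code points above U+FFFF take two UTF-16 units. */
        units = (cp >= 0x10000) ? 2 : 1;
        if (out) {
            if (n + units >= outlen)   /* keep room for the terminator */
                return (size_t)-1;
            if (units == 2) {
                cp -= 0x10000;
                out[n]     = (uint16_t)(0xD800 | (cp >> 10));   /* high */
                out[n + 1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low  */
            } else {
                out[n] = (uint16_t)cp;
            }
        }
        n += units;
    }

    if (out) {
        if (n >= outlen)
            return (size_t)-1;
        out[n] = 0;
    }
    return n;
}
```

Calling it with out set to NULL is the "count first" pass discussed
above; a second call with a buffer of at least count + 1 units does the
actual conversion. A UCS-2-only converter would have to reject or
mangle any input that takes the two-unit branch.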