On Thu, 25 Jul 2019 at 15:58, Ivan Zhakov <i...@visualsvn.com> wrote: > > On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <i...@visualsvn.com> wrote: > > > > On Wed, 3 Jul 2019 at 18:37, Joe Orton <jor...@redhat.com> wrote: > > > > > > On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote: > > > > They also make the existing old API unusable for many cases thus > > > > making a switch to the new API mandatory, even though it doesn't have > > > > to be so (see below). > > > > > > > > APR 1.x did not specify that the result of apr_dir_read() has to be > > > > allocated > > > > in dir->pool: > > > > > > > > > > > > https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8 > > > > > > > > This function does not accept a pool argument, and I think that it's > > > > natural > > > > to assume limited lifetime in such case. This is what the current major > > > > APR > > > > users (httpd, svn) do, and other users should probably be ready for > > > > that too, > > > > since on Win32 the subsequent calls to apr_dir_read() already > > > > invalidate the > > > > previous apr_finfo_t. > > > > > > Are there many examples in the APR API where returned strings are not > > > either pool-allocated or constant (process lifetime)? It seems unusual > > > and very surprising to me. > > > > > > A hypothetical API consumer can have made either assumption: > > > > > > a) of constant memory (true with limits on Win32, breaks for Unix), or > > > b) of pool-allocated string lifetime (works for Unix, breaks for Win32) > > > > > > neither is documented and both are broken by current implementations. It > > > is impossible to satisfy both so anybody can argue that fixing either > > > implementation to match the other is a regression. > > > > > > Doubling down on (a) over (b) seems strongly inferior to me, the API > > > cannot support this without introducing runtime overhead for all callers > > > (a new iterpool in the dir object), so I'd rather fix this properly with > > > a new API. > > > > > > I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at > > > the same as introducing _pread() if desired - making the strdup only > > > happen for _pread would only be a minor tweak, not a whole new subpool. > > > > > > > There is example of apr_hash_t.ITERATOR that is statically allocated in > > the hash object and returned to the caller. > > > > Also the native readdir() API behaves the same way [1]: > > [[[ > > The returned pointer, and pointers within the structure, might be > > invalidated > > or the structure or the storage areas might be overwritten by a subsequent > > call > > to readdir() on the same directory stream. > > ]]] > > > > From this perspective the current behavior of apr_dir_read on Unix looks > > inconsistent and surprising for the API user because it is impossible to use > > in a loop without unbounded memory usage. > > > > I strongly agree that the revved version of this function that accepts a > > POOL > > argument will be good from both usage and implementation terms. I would be > > +1 > > on adding such API. But I also think that unless we fix the original issue > > by using the preallocated buffer/iterpool, the whole thing would still be > > problematic for any existing API user. > > > > More detailed: if we take this opportunity to choose and document the API to > > consistently return a temporary result, then as the caller I can either > > process > > the data during iteration or manually put it into a long-living pool if > > that is > > necessary. I think that all current callers work with the API this way. And > > I > > can also switch to the new revved function to manually specify the pool and > > better control the allocations. If I would like to support both APR 1.x and > > 2.x > > the compatibility is also straightforward to achieve. > > > > If we instead choose an option where the results may be allocated > > in the dir object pool, then as the caller I cannot avoid unbounded memory > > usage when supporting both APR 1.x and 2.x. And this approach also > > introduces > > the unavoidable the unbounded memory usage for all Win32 users that > > do not update their code. > > > > [1] > > https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475 > > > Is there any feedback or thoughts on these proposed changes? > > Thanks! >
For what it's worth, I'm -1 on the changes done in r1862071 and r1862435 for the reasons listed above. PS: No idea if I can actually veto changes, because while I'm a committer on the APR project, I am not in its PMC. But "-1" properly expresses my technical opinion on this topic. -- Ivan Zhakov