On Thu, 25 Jul 2019 at 15:58, Ivan Zhakov <i...@visualsvn.com> wrote:
>
> On Mon, 8 Jul 2019 at 20:05, Ivan Zhakov <i...@visualsvn.com> wrote:
> >
> > On Wed, 3 Jul 2019 at 18:37, Joe Orton <jor...@redhat.com> wrote:
> > >
> > > On Wed, Jul 03, 2019 at 02:43:02PM +0300, Ivan Zhakov wrote:
> > > > They also make the existing old API unusable for many cases thus
> > > > making a switch to the new API mandatory, even though it doesn't have
> > > > to be so (see below).
> > > >
> > > > APR 1.x did not specify that the result of apr_dir_read() has to be 
> > > > allocated
> > > > in dir->pool:
> > > >
> > > >   
> > > > https://apr.apache.org/docs/apr/1.7/group__apr__dir.html#ga3e4ee253e0c712160bee10bfb9c8e4a8
> > > >
> > > > This function does not accept a pool argument, and I think that it's 
> > > > natural
> > > > to assume limited lifetime in such case. This is what the current major 
> > > > APR
> > > > users (httpd, svn) do, and other users should probably be ready for 
> > > > that too,
> > > > since on Win32 the subsequent calls to apr_dir_read() already 
> > > > invalidate the
> > > > previous apr_finfo_t.
> > >
> > > Are there many examples in the APR API where returned strings are not
> > > either pool-allocated or constant (process lifetime)?  It seems unusual
> > > and very surprising to me.
> > >
> > > A hypothetical API consumer can have made either assumption:
> > >
> > > a) of constant memory (true with limits on Win32, breaks for Unix), or
> > > b) of pool-allocated string lifetime (works for Unix, breaks for Win32)
> > >
> > > neither is documented and both are broken by current implementations. It
> > > is impossible to satisfy both so anybody can argue that fixing either
> > > implementation to match the other is a regression.
> > >
> > > Doubling down on (a) over (b) seems strongly inferior to me, the API
> > > cannot support this without introducing runtime overhead for all callers
> > > (a new iterpool in the dir object), so I'd rather fix this properly with
> > > a new API.
> > >
> > > I'd be fine with leaving Win32 apr_dir_read() behaviour as-is for 1.x at
> > > the same as introducing _pread() if desired - making the strdup only
> > > happen for _pread would only be a minor tweak, not a whole new subpool.
> > >
> >
> > There is example of apr_hash_t.ITERATOR that is statically allocated in
> > the hash object and returned to the caller.
> >
> > Also the native readdir() API behaves the same way [1]:
> > [[[
> > The returned pointer, and pointers within the structure, might be 
> > invalidated
> > or the structure or the storage areas might be overwritten by a subsequent 
> > call
> > to readdir() on the same directory stream.
> > ]]]
> >
> > From this perspective the current behavior of apr_dir_read on Unix looks
> > inconsistent and surprising for the API user because it is impossible to use
> > in a loop without unbounded memory usage.
> >
> > I strongly agree that the revved version of this function that accepts a 
> > POOL
> > argument will be good from both usage and implementation terms. I would be 
> > +1
> > on adding such API. But I also think that unless we fix the original issue
> > by using the preallocated buffer/iterpool, the whole thing would still be
> > problematic for any existing API user.
> >
> > More detailed: if we take this opportunity to choose and document the API to
> > consistently return a temporary result, then as the caller I can either 
> > process
> > the data during iteration or manually put it into a long-living pool if 
> > that is
> > necessary. I think that all current callers work with the API this way. And 
> > I
> > can also switch to the new revved function to manually specify the pool and
> > better control the allocations. If I would like to support both APR 1.x and 
> > 2.x
> > the compatibility is also straightforward to achieve.
> >
> > If we instead choose an option where the results may be allocated
> > in the dir object pool, then as the caller I cannot avoid unbounded memory
> > usage when supporting both APR 1.x and 2.x. And this approach also 
> > introduces
> > the unavoidable the unbounded memory usage for all Win32 users that
> > do not update their code.
> >
> > [1] 
> > https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html#tag_16_475
> >
> Is there any feedback or thoughts on these proposed changes?
>
> Thanks!
>

For what it's worth, I'm -1 on the changes done in r1862071 and
r1862435 for the reasons listed above.

PS: No idea if I can actually veto changes, because while I'm a
committer on the APR project, I am not in its PMC. But "-1" properly
expresses my technical opinion on this topic.

--
Ivan Zhakov

Reply via email to