On Mon, Apr 27, 2020 at 11:11 AM Joe Orton <[email protected]> wrote:
>
> On Sun, Apr 26, 2020 at 05:26:02PM +0200, Yann Ylavic wrote:
> > For FILE buckets, the behaviour of apr_bucket_setaside() is to take
> > *full* ownership of the underlying apr_file, that is: allocate/move
> > the file/cleanup to the new pool AND set the old file's fd to -1 (see
> > apr_file_setaside, [1]).
> >
> > This can be problematic in an output filters chain whenever a FILE
> > bucket gets splitted. Suppose that an FTYPE_RESOURCE output filter
> > splits a FILE bucket in two and sends the first one through the chain,
> > and then due to network congestion the core output filter needs to set
> > the FILE bucket aside. It happens that the second FILE bucket becomes
> > a dangling bucket (with fd == -1).
>
> Hang on, I don't follow this, how does data->fd end up as -1 in this
> case? The buckets split from the original FILE bucket have a common
> ->data since it is a refcounted bucket type. If any of them are
> setaside the common ->data->fd is replaced with a new apr_file_t
> allocated from a new (longer-lived) pool. No?
The call chain is:
apr_bucket_setaside(bucket, newpool)
=> file_bucket_setaside(bucket, newpool)
=> apr_file_setaside(&newfd, bucket->data->fd, newpool)
=> copy bucket->data->fd to newfd, move cleanup to newpool
=> bucket->data->fd->filedes = -1
=> bucket->data->fd = newfd
And since bucket->data is shared, all other buckets with the same data
get their filedes set to -1.
>
> > The case I'm facing is a bit different but similar. A handler is
> > saving large bodies into a file and inserts an input filter for
> > mod_proxy to use the file bucket as request body (much like
> > mod_request). The file is still needed/used after mod_proxy (in an
> > output filter), so I don't expect the file to be invalidated by
> > mod_proxy's output filtering.
>
> I do agree that it is surprising behaviour that the apr_file_t * is
> potentially invalidated after being wrapped in a FILE bucket though. Not
> sure how to avoid it, IIRC this is some optimisation to avoid excessive
> dup2() calls?
Yes I think so, that plus a new fd to take into account for ulimit nofile.
Alternatively we could play with apr_os_file_get()+apr_os_file_set(),
which will leave alone the fd and cleanup from the original pool, but
at that point I think we'd better let the caller responsible for the
lifetime..
Regards,
Yann.
>
> Regards, Joe
>