On Mon, Apr 27, 2020 at 02:29:55PM +0200, Yann Ylavic wrote:
> The call chain is:
>   apr_bucket_setaside(bucket, newpool)
>   => file_bucket_setaside(bucket, newpool)
>      => apr_file_setaside(&newfd, bucket->data->fd, newpool)
>         => copy bucket->data->fd to newfd, move cleanup to newpool
>         => bucket->data->fd->filedes = -1
>      => bucket->data->fd = newfd
> 
> And since bucket->data is shared, all other buckets with the same data
> get their filedes set to -1.

You mentioned this was a problem for split buckets, which I don't get.  
For a split FILE bucket bucket->data is common across copies, so this 
seems fine, the last "bucket->data->fd = newfd;" updates _all_ the split 
buckets.

I can see this is a problem *outside* of _split, so if you do something 
like:

    e1 = apr_brigade_insert_file(bb, fd, ...);
    e2 = apr_brigade_insert_file(bb, fd, ...);

and then a later apr_bucket_setaside(e1) will break e2 as you describe, 
because e1 and e2 _don't_ have a common ->data.

I can see in the list archive a discussion of apr_file_setaside() where 
cliffw suggested apr_file_setaside() should work like apr_mmap_dup() 
which doesn't appear to have this problem.

> > > The case I'm facing is a bit different but similar. A handler is 
> > > saving large bodies into a file and inserts an input filter for 
> > > mod_proxy to use the file bucket as request body (much like 
> > > mod_request). The file is still needed/used after mod_proxy (in an 
> > > output filter), so I don't expect the file to be invalidated by 
> > > mod_proxy's output filtering.
> >
> > I do agree that it is surprising behaviour that the apr_file_t * is
> > potentially invalidated after being wrapped in a FILE bucket though. Not
> > sure how to avoid it, IIRC this is some optimisation to avoid excessive
> > dup2() calls?
> 
> Yes I think so, that plus a new fd to take into account for ulimit nofile.
> Alternatively we could play with apr_os_file_get()+apr_os_file_set(),
> which will leave alone the fd and cleanup from the original pool, but
> at that point I think we'd better let the caller responsible for the
> lifetime..

I think within the constraint of the current APR/httpd buckets/filtering 
API it is probably necessary to say that you cannot insert a FILE bucket 
referring to the same apr_file_t * twice.  Or you should dup2 inbetween.

That is clearly the implication of the apr_file_setaside() API warning 
which extends to apr_bucket_setaside().  Alternatively you could say the 
apr_file_setaside() implementation is broken and should work like 
apr_mmap_dup().

(I think treating setaside as special for FILE is a bad road to start 
down - I should be able copy and paste the FILE implementation as a new 
MYFILE bucket type and have the core behave correctly, if not optimally)

Regards, Joe

Reply via email to