On 09/17/2011 02:01 AM, Amos Jeffries wrote: > On 16/09/11 16:29, Alex Rousskov wrote: >> On 09/15/2011 08:24 PM, Amos Jeffries wrote: >>> On 16/09/11 06:16, Alex Rousskov wrote: >>>> ------------------------------------------------------------ >>>> revno: 11740 >>>> committer: Alex Rousskov<[email protected]> >>>> branch nick: trunk >>>> timestamp: Thu 2011-09-15 12:16:59 -0600 >>>> message: >>>> Do not let cache manager requests kill SMP Squid using isOpen() >>>> assertion. >>>> >>>> As the comment above the close call implies, we have not imported >>>> the foreign socket descriptor into our fd_table yet. We must use >>>> raw close(2), just like the corresponding >>>> Mgr::Request::Request(msg) code that allocates request.conn, uses >>>> raw assignment to give that half-baked connection a descriptor. >>>> >>>> TODO: This direct manipulation of Connection::fd is ugly, and this >>>> half-baked connection will most likely cause more [hidden] problems >>>> down the road. For example, Mgr::Request destructor will assert in >>>> a similar way if the request object is destroyed before >>>> Action::respond() is called.>> modified: >>>> src/mgr/Action.cc >> >> >>> Problem is: why is the FD getting outside the IPC Coordinator/Strand >>> scope when it is not "valid" according to fd_table? >> >> >> It is not 100% clear where that IPC scope ends in this case, but the >> problem is not really with the scope borders. The bug was due to the >> [old] action creator saying "we are not going to import this raw fd so >> that each specific action can decide whether it actually needs an fd" >> and the [new] default action code saying "we are going to close this >> Comm connection" even though the Connection object was in that >> half-baked state. >> >> >>> - If it can't be read/write/close, it should not really be passed >>> round >>> at all. >>> - If it *can* be read/write/close, the ->close() is the right way to >>> close and remove the possible handlers. >> >> It can be read/written/closed, but it can be read/written/closed using >> Comm only after the fd is imported into Comm space. The old code did not >> import and used raw close(2) by default. The new code still did not >> import but used Comm close by default. >> >> >>> Correlation of these is that the code doing conn->fd = msg.getFd() when >>> unpacking the TypedMsgHdr should be (a) not, or (b) setting up fd_table >>> state sufficiently to match the Comm::Connection state. >> > > > Option (1) > >> Yes, that is why I left a TODO note: We can change the old code to >> import the fd into Comm [and create a full Connection object] as soon as >> possible. It would be a waste of cycles in most cases, but it will >> prevent half-baked Connection objects from lingering around and causing >> more asserts or fd leaks. >> > > option (2) >> The alternative is to use a "raw fd holder" class/object that would use >> raw close(2) unless a fully-baked Connection object is created. SSL code >> uses that approach for many temporary SSL things to avoid leaks. > > option (3): > Comm::Connection::close() could detect the fd_table flag which is > asserting right? then it could use ::close() instead of comm_close().
I do not like option (3) because it will prevent us from detecting bugs where a regular fd is closed twice (from two Connection objects) and other code logic problems. > option (4): > Or probably, add another Comm::Connection flag to indicate "partial > import" for the opening TCP flags. Which triggers the particular closing > method if still set on ->close(). >> Since this is not a performance-sensitive area of the code, the former >> (i.e., create a Connection object ASAP) may be better/simpler. > > Well, like you keep reminding me. Everything is performance sensitive > indirectly. I think that would be an exaggeration. It is difficult to imagine a realistic scenario where we need to super-optimize cache manager queries, but if such a need does surface, we can always migrate from simpler to more complex (option #1 to another option). > So I'm thinking option (4), with option (1) in specific cases would be > best. > Option (3) whenever we fiddle with the fd directly. Option (1) > whenever its known that the FD will be used. Possibly some time after > option (3) was done, so unsetting the half-imported flag if set earlier. > This way we avoid fd_table and related CPU consumptions entirely for > most of these. And still roughly follow the semantics of what the flags > mean: things done when opening this FD, so we can repeat the sequence in > comm_reset() as needed. I think we should keep it simple and use Option #1 for all cases. The fewer half-baked things we have, the better (this is not limited to asserts when closing a descriptor). If you insist, I can live with option #4. BTW, I think Ipc::ImportFdIntoComm() is now broken as well. If things go wrong, it calls conn->close() for a half-baked connection. If you work on this, please try to find all cases where Connection code changed close(2) to conn->close(). There may be more such bugs lurking there. Thank you, Alex.
