[ https://issues.apache.org/jira/browse/ZOOKEEPER-2891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael Han resolved ZOOKEEPER-2891. ------------------------------------ Resolution: Fixed Fix Version/s: (was: 3.4.10) 3.4.15 Issue resolved by pull request 999 [https://github.com/apache/zookeeper/pull/999] > Invalid processing of zookeeper_close for mutli-request > ------------------------------------------------------- > > Key: ZOOKEEPER-2891 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2891 > Project: ZooKeeper > Issue Type: Bug > Components: c client > Affects Versions: 3.4.10 > Environment: Linux ubuntu 4.4.0-87-generic > gcc (Ubuntu 5.4.0-6ubuntu1~16.04.4) 5.4.0 20160609 > https://github.com/apache/zookeeper.git > branch-3.4 > Reporter: Alexander A. Strelets > Assignee: Alexander A. Strelets > Priority: Critical > Labels: easyfix, pull-request-available > Fix For: 3.4.15 > > Time Spent: 5h 10m > Remaining Estimate: 0h > > ZooKeeper C Client *+single thread+* build > When I call _zookeeper_close()_ while there is a pending _multi_ request, I > expect the request completes with _ZCLOSING_ (-116) status. > But with the existing code I actually get the following: > - the program exits with _SIGABRT_ from _assert(entry)_ in > _deserialize_multi()_ > - and even if I remove this assertion and just break the enclosing loop, the > returned status is _ZOK_ but not _ZCLOSING_ > So, there are two defects with processing calls to _zookeeper_close()_ for > pending _multi_ requests: improper assertion in implementation and invalid > status in confirmation. > *+I propose two changes in the code:+* > 1. Screen _assert(entry)_ in _deserialize_multi()_ when it's called due to > _zookeeper_close()_ (note that _entry_ may normally become equal to _NULL_ in > this case), and as soon as _entry == NULL_, break the loop. To provide this, > _deserialize_multi()_ must be informed by the caller weather it's currently > the "normal" or the "special" case. > I propose adding a new parameter _rc_hint_ (return code hint) to > _deserialize_multi()_. When _deserialize_multi()_ is called in "normal" case, > _rc_hint_ is preset with _ZOK_ (0), and the behavior is absolutely the same > as with the existing code. But when it's called due to _zookeeper_close()_, > the _rc_hint_ is automatically preset with _ZCLOSING_ (-116) by the caller, > and this changes the behavior of _deserialize_multi()_ as described above. > How it works: > Let _zookeeper_close()_ is called while there is a pending _multi_ request. > Then function _deserialize_multi()_ is called for the so-called "Fake > response" on _multi_ request which is fabricated by the function > _free_completions()_. Such fake response includes only the header but zero > bytes for the body. Due to this _deserialize_MultiHeader(ia, "multiheader", > &mhdr)_, which is called repeatedly for each _completion_list_t *entry = > dequeue_completion(clist)_, does not assign the _mhdr_ and keeps _mhdr.done > == 0_ as it was originally initialized. Consequently the _while (!mhdr.done)_ > loop does not ever end, and finally falls into the _assert(entry)_ with > _entry == NULL_ when all fake sub-requests are "completed". > But if, as I propose, the caller made a hint to _deserialize_multi()_ that > it's actually the "special" case (that it processes the fake response indeed, > for example), with the proposed changes it would omit improper assertion and > break the loop on the first _entry == NULL_. Now at least > _deserialize_multi()_ exits and does not emit _SIGABRT_. > 2. Passthrough the "return code hint" _rc_hint_, as it was initially > specified by the caller, to the _deserialize_multi()_ return code, if the > hint is not _ZOK_ (0). > How it works: > With the existing code _deserialize_multi()_ returns unsuccessful _rc_-code > only if there is an error in processing some of subrequests. And if there are > no errors, it returns _ZOK_ (0) which is assigned as the default value to > _rc_ at the very beginning of the function. Indeed, in the case of fake > multi-response there are no errors in subresponses (because they are empty > and fake). So, _deserialize_multi()_ returns _ZOK_ (0). Then, with _rc = > deserialize_multi(xid, cptr, ia)_ in _deserialize_response()_ it overrides > the true _ZCLOSING_ status. > But if the true status (for example, _ZCLOSING_) is initially hinted to > _deserialize_multi()_, as I propose, _deserialize_multi()_ would reproduce it > back instead of irrelevant _ZOK_ (0). And consequently > _deserialize_response()_ would finally report the true status (particularly > _ZCLOSING_). > This is a proposed fix: https://github.com/apache/zookeeper/pull/999 > // Previously proposed fix: https://github.com/apache/zookeeper/pull/360 > [upd] > It looks like about the same problem is described in ZOOKEEPER-1636 > However, the patch proposed in this ticket also remedies the second linked > problem: reporting _ZCLOSING_ status (as required) to the multi-request > completion handler. -- This message was sent by Atlassian JIRA (v7.6.14#76016)