Re: [devel] [PATCH 0 of 1] Review Request for imm: Abort ccb when timeout on client while waiting for implementer [#1817]

Hung Nguyen Fri, 13 May 2016 03:17:09 -0700

Hi Anders,

We broadcast the ccb-abort message when receiving IMMND_EVT_A2ND_CL_TIMEOUT.
The client is only able to send ccb-apply message after the 
IMMND_EVT_A2ND_CL_TIMEOUT.
So the ccb-abort message will reach IMMD before the ccb-apply message.


BR,


Hung Nguyen - DEK Technologies


--------------------------------------------------------------------------------
From: Anders Bjornerstedt [email protected]
Sent: Friday, May 13, 2016 4:09PM
To: Hung Nguyen, Neelakanta Reddy, Zoran Milinkovic
     [email protected], [email protected], 
[email protected]
Cc: Opensaf-devel
     [email protected]
Subject: Re: [devel] [PATCH 0 of 1] Review Request for imm: Abort ccb when 
timeout on client while waiting for implementer [#1817]


Hi,

The test-application is in this case must either be a negative test-case, or a 
bug in the test, or a feature in a branch of the test.
In either case the test was "good" in that it exposed an error in the server 
(IMMND) even if it was not the tests aim to check for this error.
My point is that no sensible (correctly coded) application would send an apply 
after getting ERR_TIMEOUT on an operation request on a CCB.

Aborting the CCB on the server side is a cluster broadcast operation, i.e. it 
takes time itself.

A client that gets timeout due to say an extremely short timeout could send 
that rouge ccb-apply request quite soon so that
it arrives back at the *local* IMMND before the ccb  abort has been processed.

So probably the bug in the server (IMMND) needs to be fixed in a more bullet 
proof way than this patch does ?
The server needs to be bullet proof in general against client missbehavior.

Any attempt by a client such as in this case to invoke a ccbApply towards the 
server where the CCB is already aborted,
or in some unexpected state must not lead to an assert in the server, unless 
there really is an inconsistency in the server.

In this case there apppears to be an inconsistency in the server.
The local IMMND server does in fact detect that the client times out, very soon 
after the timeout.

So how can the client with this client id successfully send  a subsequent 
message that is accepted and processed by the local IMMND
causing the assert:

14:12:26 SC-2-1 osafimmnd[5140]: ImmModel.cc:5242: ccbApply: Assertion 
'reqConn==0 || (ccb->mOriginatingConn == reqConn)' failed.

It seems that this assert in the IMMND expresses a conditions about the state 
of a client that is not correct or too simple.

So to summarize: I am worried that there is a more general problem here.
IMMA clients (with a given client id at the local IMMND) that timeout should be 
discarded by the local IMMND as a local operation
when the local IMMND gets IMMND_EVT_A2ND_CL_TIMEOUT.

After this any attempt by the client to use that handle (client-id) must be 
rejected by the local server very early in the message processing.
The client should not even get a reply. The message should just be discarded.


/AndersBj


> ----Ursprungligt meddelande----
> Från : [email protected]
> Datum : 2016-05-13 - 05:33 (GMT)
> Till : [email protected], [email protected]
> Kopia : [email protected]
> Ämne : [devel] [PATCH 0 of 1] Review Request for imm: Abort ccb when timeout 
> on client while waiting for implementer [#1817]
>
> Summary: imm: Abort ccb when timeout on client while waiting for implementer 
> [#1817]
> Review request for Trac Ticket(s): 1817
> Peer Reviewer(s): Zoran, Neel
> Pull request to:
> Affected branch(es): 4.7, 5.0, 5.1
> Development branch: 5.1
>
> --------------------------------
> Impacted area       Impact y/n
> --------------------------------
> Docs                    n
> Build system            n
> RPM/packaging           n
> Configuration files     n
> Startup scripts         n
> SAF services            n
> OpenSAF services        y
> Core libraries          n
> Samples                 n
> Tests                   n
> Other                   n
>
>
> Comments (indicate scope for each "y" above):
> ---------------------------------------------
>
>
> changeset 88f1b81d425f2c561ddaba07d69598af4ad98349
> Author:       Hung Nguyen <[email protected]>
> Date: Fri, 13 May 2016 11:16:13 +0700
>
>       imm: Abort ccb when timeout on client while waiting for implementer 
> [#1817]
>
>       Timeout on client while waiting for implementer happens when
>       IMMA_SYNCR_TIMEOUT < IMMA_OI_CALLBACK_TIMEOUT. mOriginatingConn is 
> clear and
>       we can't send response in 
> ::ccbObjCreate/Modify/Del/CompletedContinuation().
>       If the client invokes more ccb operations, it will get ERR_TIMEOUT. We
>       better abort the ccb in this case. This is the same behavior as when 
> there's
>       a timeout from OI.
>
>
> Complete diffstat:
> ------------------
> osaf/services/saf/immsv/immnd/ImmModel.cc  |  18 +++++++++++++-----
> osaf/services/saf/immsv/immnd/ImmModel.hh  |   2 +-
> osaf/services/saf/immsv/immnd/immnd_evt.c  |  18 +++++++++++++++++-
> osaf/services/saf/immsv/immnd/immnd_init.h |   2 +-
> 4 files changed, 32 insertions(+), 8 deletions(-)
>
>
> Testing Commands:
> -----------------
>
>
>
> Testing, Expected Results:
> --------------------------
> Please see the ticket description.
>
>
> Conditions of Submission:
> -------------------------
> Ack from reviewers.
>
>
> Arch      Built     Started    Linux distro
> -------------------------------------------
> mips        n          n
> mips64      n          n
> x86         n          n
> x86_64      n          n
> powerpc     n          n
> powerpc64   n          n
>
>
> Reviewer Checklist:
> -------------------
> [Submitters: make sure that your review doesn't trigger any checkmarks!]
>
>
> Your checkin has not passed review because (see checked entries):
>
> ___ Your RR template is generally incomplete; it has too many blank entries
>     that need proper data filled in.
>
> ___ You have failed to nominate the proper persons for review and push.
>
> ___ Your patches do not have proper short+long header
>
> ___ You have grammar/spelling in your header that is unacceptable.
>
> ___ You have exceeded a sensible line length in your headers/comments/text.
>
> ___ You have failed to put in a proper Trac Ticket # into your commits.
>
> ___ You have incorrectly put/left internal data in your comments/files
>     (i.e. internal bug tracking tool IDs, product names etc)
>
> ___ You have not given any evidence of testing beyond basic build tests.
>     Demonstrate some level of runtime or other sanity testing.
>
> ___ You have ^M present in some of your files. These have to be removed.
>
> ___ You have needlessly changed whitespace or added whitespace crimes
>     like trailing spaces, or spaces before tabs.
>
> ___ You have mixed real technical changes with whitespace and other
>     cosmetic code cleanup changes. These have to be separate commits.
>
> ___ You need to refactor your submission into logical chunks; there is
>     too much content into a single commit.
>
> ___ You have extraneous garbage in your review (merge commits etc)
>
> ___ You have giant attachments which should never have been sent;
>     Instead you should place your content in a public tree to be pulled.
>
> ___ You have too many commits attached to an e-mail; resend as threaded
>     commits, or place in a public tree for a pull.
>
> ___ You have resent this content multiple times without a clear indication
>     of what has changed between each re-send.
>
> ___ You have failed to adequately and individually address all of the
>     comments and change requests that were proposed in the initial review.
>
> ___ You have a misconfigured ~/.hgrc file (i.e. username, email etc)
>
> ___ Your computer have a badly configured date and time; confusing the
>     the threaded patch review.
>
> ___ Your changes affect IPC mechanism, and you don't present any results
>     for in-service upgradability test.
>
> ___ Your changes affect user manual and documentation, your patch series
>     do not contain the patch that updates the Doxygen manual.
>
>
> ------------------------------------------------------------------------------
> Mobile security can be enabling, not merely restricting. Employees who
> bring their own devices (BYOD) to work are irked by the imposition of MDM
> restrictions. Mobile Device Manager Plus allows you to control only the
> apps on BYO-devices by containerizing them, leaving personal data untouched!
> https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
> _______________________________________________
> Opensaf-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/opensaf-devel
>

------------------------------------------------------------------------------
Mobile security can be enabling, not merely restricting. Employees who
bring their own devices (BYOD) to work are irked by the imposition of MDM
restrictions. Mobile Device Manager Plus allows you to control only the
apps on BYO-devices by containerizing them, leaving personal data untouched!
https://ad.doubleclick.net/ddm/clk/304595813;131938128;j
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Re: [devel] [PATCH 0 of 1] Review Request for imm: Abort ccb when timeout on client while waiting for implementer [#1817]

Reply via email to