Hi HansN,

Do you have any tips for creating an overload case?

I would like to test and observe the TIPC_DEST_DROPPABLE enabled and disabled cases.

-AVM


On 9/1/2016 9:12 AM, A V Mahesh wrote:
> Hi HansN,
>
> Sorry for the delay.
>
> I will test it and get back to you soon.
>
> -AVM
>
>
> On 8/31/2016 4:29 PM, Hans Nordebäck wrote:
>> Hi Mahesh,
>> Any updates on this?
>>
>> /Regards HansN
>>
>> -----Original Message-----
>> From: Anders Widell
>> Sent: den 25 augusti 2016 13:11
>> To: A V Mahesh <mahesh.va...@oracle.com>; Hans Nordebäck 
>> <hans.nordeb...@ericsson.com>; mathi.naic...@oracle.com
>> Cc: opensaf-devel@lists.sourceforge.net
>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>>
>> Hi!
>>
>> This is what the TIPC user documentation says about TIPC_DEST_DROPPABLE:
>> "This option governs the handling of messages sent by the socket if 
>> the message cannot be delivered to its destination, either because 
>> the receiver is congested or because the specified receiver does not 
>> exist.
>> If enabled, the message is discarded; otherwise the message is 
>> returned to the sender."
>>
>> This is what the TIPC user documentation says about the return value 
>> from the recvmsg() system call: "When used with a connectionless 
>> socket, a return value of 0 indicates the arrival of a returned data 
>> message that was originally sent by this socket."
>>
>> I think the documentation is pretty clear. If you set 
>> TIPC_DEST_DROPPABLE to true, the receiver can discard messages e.g. 
>> when the receive buffer is full. The sender will not be notified in 
>> this case. If TIPC_DEST_DROPPABLE is set to false, the message will 
>> be returned to the sender in case of a full receive buffer. The 
>> sender knows that it has received such a returned message when the 
>> recvmsg() call returns zero.
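
To make the point above concrete, here is a minimal sketch of how a sender could check the ancillary data of a received message for a TIPC_ERRINFO entry. It assumes the layout described in the TIPC documentation (two 32-bit values: the error code and the length of the returned data). The names tipc_reject_info and tipc_find_reject are hypothetical and are not part of MDS or TIPC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/* Hypothetical helper type: what TIPC_ERRINFO carries for a returned message. */
struct tipc_reject_info {
    uint32_t err;      /* TIPC error code, e.g. TIPC_ERR_OVERLOAD */
    uint32_t ret_len;  /* length of the returned data */
};

/*
 * Scan the control messages filled in by recvmsg() for TIPC_ERRINFO.
 * Returns 1 and fills *out if one is found, 0 otherwise.
 * Assumption: the payload is two 32-bit values (error code, data length).
 */
int tipc_find_reject(struct msghdr *msg, struct tipc_reject_info *out)
{
    struct cmsghdr *anc;

    assert(msg != NULL && out != NULL);

    for (anc = CMSG_FIRSTHDR(msg); anc != NULL; anc = CMSG_NXTHDR(msg, anc)) {
        uint32_t data[2];

        if (anc->cmsg_level != SOL_TIPC || anc->cmsg_type != TIPC_ERRINFO)
            continue;
        if (anc->cmsg_len < CMSG_LEN(sizeof(data)))
            continue;  /* too short to hold both values, skip it */

        /* memcpy avoids unaligned access into the control buffer */
        memcpy(data, CMSG_DATA(anc), sizeof(data));
        out->err = data[0];
        out->ret_len = data[1];
        return 1;
    }
    return 0;
}
```

A caller would typically invoke this after recvmsg() returns zero on a connectionless socket and then compare the err field against TIPC_ERR_OVERLOAD.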
>>
>> regards,
>> Anders Widell
>>
>> On 08/25/2016 11:30 AM, A V Mahesh wrote:
>>> Hi HansN,
>>>
>>>
>>> On 8/23/2016 5:22 PM, Hans Nordebäck wrote:
>>>
>>>> Hi Mahesh,
>>>>
>>>> Yes, this is my understanding too: if TIPC_DROPPABLE = true, TIPC may
>>>> drop messages silently when the receive socket buffer is full, and
>>>> does not return any ancillary message.
>>>> If TIPC_DROPPABLE = false, TIPC may drop the message but will send an
>>>> ancillary message to inform about TIPC_ERR_OVERLOAD.
>>> [AVM]
>>>
>>> My observation and understanding are different. Based on the TIPC code
>>> and the Linux TIPC 2.0 Programmer's Guide, the TIPC_ERR_OVERLOAD error
>>> is returned when TIPC is unable to enqueue an incoming message on the
>>> receiving socket's receive queue, irrespective of whether
>>> TIPC_DEST_DROPPABLE is enabled or disabled.
>>>
>>> The only difference between TIPC_DEST_DROPPABLE enabled and disabled
>>> is this: if TIPC_DEST_DROPPABLE is enabled, the message is discarded,
>>> the size returned by recvmsg() is zero, and the application gets the
>>> errors; if TIPC_DEST_DROPPABLE is disabled, the message is returned to
>>> the sender, which means the size returned by recvmsg() is the size of
>>> the data the user sent, and the application gets the errors.
>>>
>>> I checked the TIPC code and documentation and have not found any
>>> evidence that the TIPC_ERR_OVERLOAD error code is sent only if
>>> TIPC_DEST_DROPPABLE = false.
>>>
>>> Even while testing #1227
>>> (https://sourceforge.net/p/opensaf/mailman/message/33207717/), my
>>> observation and understanding was that an individual TIPC socket is
>>> only allowed to queue up OVERLOAD_LIMIT_BASE/2 messages of the lowest
>>> importance level before it starts rejecting them.
>>> Once a socket's receive queue length exceeds the maximum limit, the
>>> receiving socket sends out a reject message with the TIPC_ERR_OVERLOAD
>>> error code, with cmsg_type TIPC_ERRINFO/TIPC_RETDATA; the TIPC code and
>>> the Linux TIPC 2.0 Programmer's Guide confirm this.
>>>
>>> tipc/socket.c
>>> =======================================================
>>> /* Reject message if there isn't room to queue it */
>>>
>>> recv_q_len = (u32)atomic_read(&tipc_queue_size);
>>> if (unlikely(recv_q_len >= OVERLOAD_LIMIT_BASE)) {
>>>      if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE))
>>>          return TIPC_ERR_OVERLOAD;
>>> }
>>> recv_q_len = skb_queue_len(&sk->sk_receive_queue);
>>> if (unlikely(recv_q_len >= (OVERLOAD_LIMIT_BASE / 2))) {
>>>      if (rx_queue_full(msg, recv_q_len, OVERLOAD_LIMIT_BASE / 2))
>>>          return TIPC_ERR_OVERLOAD;
>>> }
>>> =======================================================
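
The two checks in the snippet above can be modelled in user space as follows. This is only a sketch for reasoning about when an overload can be provoked, not kernel code. It assumes that for the lowest importance level rx_queue_full() reduces to "queue length >= limit"; higher importance levels are not modelled. The function name and the limit_base parameter are hypothetical, and the real value of OVERLOAD_LIMIT_BASE depends on the kernel version.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of the reject decision for a lowest-importance message.
 *   global_q_len - messages queued across all TIPC sockets (tipc_queue_size)
 *   sock_q_len   - messages queued on the receiving socket
 *   limit_base   - stand-in for OVERLOAD_LIMIT_BASE (value is an assumption)
 * Returns true where the kernel snippet would return TIPC_ERR_OVERLOAD.
 */
bool low_importance_msg_rejected(uint32_t global_q_len, uint32_t sock_q_len,
                                 uint32_t limit_base)
{
    assert(limit_base >= 2);

    /* First check: the global receive queue limit */
    if (global_q_len >= limit_base)
        return true;

    /* Second check: the per-socket limit, half of the global one */
    if (sock_q_len >= limit_base / 2)
        return true;

    return false;
}
```

Under this model, a receiver that stops reading its socket while a sender keeps sending low-importance messages would hit the per-socket limit after limit_base / 2 queued messages.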
>>>
>>>
>>> 2.1.17. setsockopt() of  TIPC 2.0 Programmer's Guide
>>> =======================================================
>>> TIPC_DEST_DROPPABLE
>>> This option governs the handling of messages sent by the socket if the
>>> message cannot be delivered to its destination, either because the
>>> receiver is congested or because the specified receiver does not
>>> exist. If enabled, the message is discarded; otherwise the message is
>>> returned to the sender.
>>>
>>> By default, this option is disabled for SOCK_SEQPACKET and SOCK_STREAM
>>> socket types, and enabled for SOCK_RDM and SOCK_DGRAM, This
>>> arrangement ensures proper teardown of failed connections when
>>> connection-oriented data transfer is used, without increasing the
>>> complexity of connectionless data transfer.
>>>
>>> TIPC_SRC_DROPPABLE
>>> This option governs the handling of messages sent by the socket if
>>> link congestion occurs. If enabled, the message is discarded;
>>> otherwise the system queues the message for later transmission.
>>> By default, this option is disabled for SOCK_SEQPACKET, SOCK_STREAM,
>>> and SOCK_RDM socket types (resulting in "reliable" data transfer), and
>>> enabled for SOCK_DGRAM (resulting in "unreliable" data transfer).
>>> =======================================================
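
For reference, the option described above is switched with setsockopt() at level SOL_TIPC, in the same way the patch in this thread does for the BSR socket. The sketch below wraps that call and adds a getter so the effective setting can be verified during a test. The function names are hypothetical, and the caller is assumed to supply an already created AF_TIPC socket.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/tipc.h>

/*
 * Enable (1) or disable (0) TIPC_DEST_DROPPABLE on socket sd.
 * Returns 0 on success, -1 on failure with errno set by setsockopt().
 */
int tipc_set_dest_droppable(int sd, int enable)
{
    int val = enable ? 1 : 0;

    return setsockopt(sd, SOL_TIPC, TIPC_DEST_DROPPABLE, &val, sizeof(val));
}

/*
 * Read back the current TIPC_DEST_DROPPABLE setting of socket sd.
 * Returns 1 if enabled, 0 if disabled, -1 if getsockopt() fails.
 */
int tipc_get_dest_droppable(int sd)
{
    int val = 0;
    socklen_t len = sizeof(val);

    if (getsockopt(sd, SOL_TIPC, TIPC_DEST_DROPPABLE, &val, &len) != 0)
        return -1;

    assert(len == sizeof(val));  /* the option is a plain int */
    return val != 0;
}
```

Reading the value back right after socket creation is one way to confirm the documented defaults for SOCK_RDM on the kernel under test.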
>>>
>>> Now I will try to create an OVERLOAD case and update you soon with my
>>> latest observations.
>>>
>>> -AVM
>>>
>>>> Correcting this and adding an abort is not backward compatible, as
>>>> some services already handle flow control in some way; only log when
>>>> packages are dropped.
>>>> Regarding ticket #1960, there are other solutions than introducing
>>>> flow control in MDS, e.g. exposing an option to the service to choose
>>>> connection-oriented or connectionless.
>>>> The problem with dropped messages seems in one case to be related to
>>>> intensive logging by MDS.
>>>>
>>>> /Thanks HansN
>>>> -----Original Message-----
>>>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>>>> Sent: den 23 augusti 2016 11:27
>>>> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Anders Widell
>>>> <anders.wid...@ericsson.com>; mathi.naic...@oracle.com
>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>>>>
>>>> Hi HansN,
>>>>
>>>> It seems I am missing something; please allow me to understand.
>>>>
>>>> If I correctly understand your observation:
>>>>
>>>> With the current OpenSAF code (this #1957 patch NOT applied),
>>>> TIPC_DROPPABLE=true by default. While running OpenSAF with that binary,
>>>> when TIPC_ERR_OVERLOAD occurs, TIPC does not give the errors
>>>> TIPC_ERRINFO or TIPC_RETDATA, and the following code in the function
>>>> recvfrom_connectionless() is not being hit. Is my understanding right?
>>>>
>>>> =====================================================================
>>>>
>>>>
>>>> if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>        /* TIPC_ERRINFO - TIPC error code associated with a returned
>>>>           data message or a connection termination message  so abort */
>>>>        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>>>        abort();
>>>> } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>        /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC to
>>>>           return rejected messages to the sender )
>>>>           we will hit this when we implement MDS retransmit lost
>>>>           messages abort can be replaced with flow control logic*/
>>>>        for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>            m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>            cptr++;
>>>>        }
>>>>        /* TIPC_RETDATA -The contents of a returned data message so
>>>>           abort */
>>>>        m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>>>        abort();
>>>> }
>>>>
>>>> =====================================================================
>>>>
>>>>
>>>> -AVM
>>>>
>>>>
>>>> On 8/23/2016 1:08 PM, Hans Nordebäck wrote:
>>>>> Hi Mahesh,
>>>>>
>>>>> Please see response below with [HansN] /Thanks HansN
>>>>>
>>>>> -----Original Message-----
>>>>> From: A V Mahesh [mailto:mahesh.va...@oracle.com]
>>>>> Sent: den 23 augusti 2016 08:25
>>>>> To: Hans Nordebäck <hans.nordeb...@ericsson.com>; Anders Widell
>>>>> <anders.wid...@ericsson.com>; mathi.naic...@oracle.com
>>>>> Cc: opensaf-devel@lists.sourceforge.net
>>>>> Subject: Re: [PATCH 1 of 1] MDS: Log TIPC dropped messages [#1957]
>>>>>
>>>>> Hi HansN
>>>>>
>>>>> Please see response below with [AVM]
>>>>>
>>>>> -AVM
>>>>>
>>>>> On 8/23/2016 11:41 AM, Hans Nordebäck wrote:
>>>>>> Hi Mahesh,
>>>>>>
>>>>>> please see comments below.
>>>>>>
>>>>>> /Thanks HansN
>>>>>>
>>>>>>
>>>>>> On 08/23/2016 07:21 AM, A V Mahesh wrote:
>>>>>>> Hi HansN,
>>>>>>>
>>>>>>> Let us first discuss the error handling and abort; then we can come
>>>>>>> back to the interpretation of whether TIPC currently does or does
>>>>>>> not permit an application to send a multicast message with the
>>>>>>> "destination droppable" setting disabled.
>>>>>>>
>>>>>>> Let us disable TIPC_DEST_DROPPABLE, so that TIPC will try to
>>>>>>> return an undelivered multicast message to its sender and we can
>>>>>>> determine whether the issue is caused by TIPC_ERR_OVERLOAD. This
>>>>>>> helps in debugging, so that the application may increase
>>>>>>> SO_SNDBUF/SO_RCVBUF to reduce the problem.
>>>>>>>
>>>>>>> But we still need to abort(). The reason is that the current MDS
>>>>>>> implementation doesn't have flow control logic (no retry on
>>>>>>> error), so an application like AMF can go wrong and the cluster will
>>>>>>> go into an unstable/recoverable state.
>>>>>>>
>>>>>> [HansN] In the current implementation messages are dropped silently
>>>>>> and no abort is done.
>>>>> [AVM]  I can see abort(); in the current code. Do you mean abort(); is
>>>>> not working and the application (amf) is not exiting?
>>>>> [HansN] In case TIPC_DROPPABLE=true and messages are dropped
>>>>> (TIPC_ERR_OVERLOAD), no abort is performed; e.g. amfd detects this
>>>>> in the msg sanity chk and logs "invalid msg id ..."
>>>>> ====================================================================
>>>>> if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>>         /* TIPC_ERRINFO - TIPC error code associated with a returned
>>>>>            data message or a connection termination message  so abort */
>>>>>         m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_ERRINFO abort err :%s", strerror(errno) );
>>>>>         abort();
>>>>> } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>>         /* If we set TIPC_DEST_DROPPABLE off messge (configure TIPC
>>>>>            to return rejected messages to the sender )
>>>>>            we will hit this when we implement MDS retransmit lost
>>>>>            messages abort can be replaced with flow control logic*/
>>>>>         for (i = anc->cmsg_len - sizeof(*anc); i > 0; i--) {
>>>>>             m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n", *cptr);
>>>>>             cptr++;
>>>>>         }
>>>>>         /* TIPC_RETDATA -The contents of a returned data message  so
>>>>>            abort */
>>>>>         m_MDS_LOG_CRITICAL("MDTM: undelivered message condition ancillary data: TIPC_RETDATA abort err :%s", strerror(errno) );
>>>>>         abort();
>>>>> }
>>>>> ====================================================================
>>>>>> This patch enables logging
>>>>>> when packages are dropped to help in debugging. I don't agree that
>>>>>> we should also introduce abort, but instead:
>>>>>> 1) Implement a solution to handle dropped packages, ticket #1960
>>>>> [AVM]  This is nothing but flow control implementation in MDS, this
>>>>> is future enhancement
>>>>>
>>>>>> 2) Investigate why packages may be dropped, the receiving MDS
>>>>>> thread is a real time thread and should be able to consume a large
>>>>>> amount of incoming messages.
>>>>>> E.g. is the receiving MDS thread "live hanging" due to locks, file
>>>>>> I/O etc?
>>>>>>> This was the reason we haven't gone for it while addressing Ticket
>>>>>>> #1227
>>>>>>> (https://sourceforge.net/p/opensaf/mailman/message/33207717/)
>>>>>>> So currently we don't have any advantage of disabling
>>>>>>> TIPC_DEST_DROPPABLE and not allowing multicast messages.
>>>>>>>
>>>>>>> -AVM
>>>>>>>
>>>>>>>
>>>>>>> On 8/18/2016 2:43 PM, Hans Nordeback wrote:
>>>>>>>> osaf/libs/core/mds/mds_dt_tipc.c |  32
>>>>>>>> +++++++++++++++++++++++++-------
>>>>>>>>      1 files changed, 25 insertions(+), 7 deletions(-)
>>>>>>>>
>>>>>>>>
>>>>>>>> diff --git a/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>> b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>> --- a/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>> +++ b/osaf/libs/core/mds/mds_dt_tipc.c
>>>>>>>> @@ -320,6 +320,15 @@ uint32_t mdtm_tipc_init(NODE_ID nodeid,
>>>>>>>>                      m_MDS_LOG_INFO("MDTM: Successfully set
>>>>>>>> default socket option TIPC_IMP = %d", TIPCIMPORTANCE);
>>>>>>>>              }
>>>>>>>>      +        int droppable = 0;
>>>>>>>> +        if (setsockopt(tipc_cb.BSRsock, SOL_TIPC,
>>>>>>>> TIPC_DEST_DROPPABLE, &droppable, sizeof(droppable)) != 0) {
>>>>>>>> +                LOG_ER("MDTM: Can't set TIPC_DEST_DROPPABLE to
>>>>>>>> + zero
>>>>>>>> err :%s\n", strerror(errno));
>>>>>>>> +                m_MDS_LOG_ERR("MDTM: Can't set
>>>>>>>> + TIPC_DEST_DROPPABLE
>>>>>>>> to zero err :%s\n", strerror(errno));
>>>>>>>> +                osafassert(0);
>>>>>>>> +        } else {
>>>>>>>> +                m_MDS_LOG_NOTIFY("MDTM: Successfully set
>>>>>>>> TIPC_DEST_DROPPABLE to zero");
>>>>>>>> +        }
>>>>>>>> +
>>>>>>>>          return NCSCC_RC_SUCCESS;
>>>>>>>>      }
>>>>>>>>      @@ -563,6 +572,8 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>>>>          unsigned char *cptr;
>>>>>>>>          int i;
>>>>>>>>          int has_addr;
>>>>>>>> +    int anc_data[2];
>>>>>>>> +
>>>>>>>>          ssize_t sz;
>>>>>>>>            has_addr = (from != NULL) && (addrlen != NULL); @@
>>>>>>>> -591,19
>>>>>>>> +602,26 @@ ssize_t recvfrom_connectionless (int sd,
>>>>>>>>                     if the message was sent using a TIPC name or
>>>>>>>> name sequence as the
>>>>>>>>                     destination rather than a TIPC port ID So
>>>>>>>> abort for TIPC_ERRINFO and TIPC_RETDATA*/
>>>>>>>>                  if (anc->cmsg_type == TIPC_ERRINFO) {
>>>>>>>> -                /* TIPC_ERRINFO - TIPC error code associated 
>>>>>>>> with a
>>>>>>>> returned data message or a connection termination message  so
>>>>>>>> abort */
>>>>>>>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>>>>> condition ancillary data: TIPC_ERRINFO abort err :%s",
>>>>>>>> strerror(errno) );
>>>>>>>> -                abort();
>>>>>>>> +                anc_data[0] = *((unsigned int*)(CMSG_DATA(anc) +
>>>>>>>> 0));
>>>>>>>> +                if (anc_data[0] == TIPC_ERR_OVERLOAD) {
>>>>>>>> +                    LOG_CR("MDTM: undelivered message condition
>>>>>>>> ancillary data: TIPC_ERR_OVERLOAD");
>>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered
>>>>>>>> + message
>>>>>>>> condition ancillary data: TIPC_ERR_OVERLOAD");
>>>>>>>> +                } else {
>>>>>>>> +                    /* TIPC_ERRINFO - TIPC error code associated
>>>>>>>> with a returned data message or a connection termination message
>>>>>>>> so abort */
>>>>>>>> +                    LOG_CR("MDTM: undelivered message condition
>>>>>>>> ancillary data: TIPC_ERRINFO abort err : %d", anc_data[0]);
>>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: undelivered
>>>>>>>> + message
>>>>>>>> condition ancillary data: TIPC_ERRINFO abort err : %d",
>>>>>>>> anc_data[0]);
>>>>>>>> +                }
>>>>>>>>                  } else if (anc->cmsg_type == TIPC_RETDATA) {
>>>>>>>> -                /* If we set TIPC_DEST_DROPPABLE off messge
>>>>>>>> (configure TIPC to return rejected messages to the sender )
>>>>>>>> +                /* If we set TIPC_DEST_DROPPABLE off message
>>>>>>>> (configure TIPC to return rejected messages to the sender )
>>>>>>>>                         we will hit this when we implement MDS
>>>>>>>> retransmit lost messages  abort can be replaced with flow control
>>>>>>>> logic*/
>>>>>>>>                      for (i = anc->cmsg_len - sizeof(*anc); i > 0;
>>>>>>>> i--) {
>>>>>>>> -                    m_MDS_LOG_DBG("MDTM: returned byte 0x%02x\n",
>>>>>>>> *cptr);
>>>>>>>> +                    LOG_CR("MDTM: returned byte 0x%02x\n", 
>>>>>>>> *cptr);
>>>>>>>> +                    m_MDS_LOG_CRITICAL("MDTM: returned byte
>>>>>>>> 0x%02x\n", *cptr);
>>>>>>>>                          cptr++;
>>>>>>>>                      }
>>>>>>>>                      /* TIPC_RETDATA -The contents of a returned
>>>>>>>> data message  so abort */
>>>>>>>> -                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>>>>> condition ancillary data: TIPC_RETDATA abort err :%s",
>>>>>>>> strerror(errno) );
>>>>>>>> -                abort();
>>>>>>>> +                LOG_CR("MDTM: undelivered message condition
>>>>>>>> ancillary data: TIPC_RETDATA");
>>>>>>>> +                m_MDS_LOG_CRITICAL("MDTM: undelivered message
>>>>>>>> condition ancillary data: TIPC_RETDATA");
>>>>>>>>                  } else if (anc->cmsg_type == TIPC_DESTNAME) {
>>>>>>>>                      if (sz == 0) {
>>>>>>>>                          m_MDS_LOG_DBG("MDTM: recd bytes=0 on
>>>>>>>> received on sock, abnormal/unknown  condition. Ignoring");
>>
>


------------------------------------------------------------------------------
_______________________________________________
Opensaf-devel mailing list
Opensaf-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/opensaf-devel
