Hi,

I have not yet been able to reproduce this. But from reading the code it is
clear that there is a queue of at most 200 messages when using MDS/TCP. If
that queue grows beyond that limit, you hit an assert in the library! No such
queue exists in the MDS/TIPC case. Could you explain why it is there?

To me this is a bug and should be fixed. MDS could, for example, return an
error code instead of asserting.
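
As a rough sketch of what I mean (this is not the actual MDS code: the names
unsent_queue, unsent_queue_add, UNSENT_MAX and the RC_* values are made up for
illustration, standing in for the tcp_cb fields, DTM_INTRANODE_UNSENT_MSG and
NCSCC_RC_*), a bounded queue can reject the message and leave the decision to
the caller:

```c
#include <stddef.h>
#include <stdlib.h>

#define UNSENT_MAX 200 /* hypothetical; mirrors DTM_INTRANODE_UNSENT_MSG */

enum { RC_SUCCESS = 1, RC_FAILURE = 2 }; /* hypothetical stand-ins for NCSCC_RC_* */

struct unsent_msg {
    struct unsent_msg *next;
    const void *data; /* payload is borrowed, not owned by the queue */
    size_t len;
};

struct unsent_queue {
    struct unsent_msg *head;
    struct unsent_msg *tail;
    unsigned count;
};

/* Append a message. If the queue is full, leave it untouched and report
 * failure instead of asserting. */
int unsent_queue_add(struct unsent_queue *q, const void *data, size_t len)
{
    struct unsent_msg *m;

    if (q->count >= UNSENT_MAX)
        return RC_FAILURE; /* caller decides: retry later, drop, or disconnect */

    m = calloc(1, sizeof(*m));
    if (m == NULL)
        return RC_FAILURE;
    m->data = data;
    m->len = len;

    if (q->tail == NULL)
        q->head = m;
    else
        q->tail->next = m;
    q->tail = m;
    q->count++;
    return RC_SUCCESS;
}

/* Free all queued nodes and reset the queue to empty. */
void unsent_queue_clear(struct unsent_queue *q)
{
    struct unsent_msg *m = q->head;

    while (m != NULL) {
        struct unsent_msg *next = m->next;
        free(m);
        m = next;
    }
    q->head = NULL;
    q->tail = NULL;
    q->count = 0;
}
```

Unlike the current code, the limit is checked before the counter is touched,
so a rejected message does not change the queue state. This only helps if the
callers actually check the return code, of course.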

Do you agree?

Thanks,
Hans

On 10/15/2013 12:59 PM, A V Mahesh wrote:
> Hi Hans,
>
> Assuming you tried this in a UML environment, can you share the backtrace?
> BTW, I am not able to reproduce the problem even with a test that writes a
> burst of a larger number of records.
> Perhaps LOG is a slow receiver; I will get back to you once you share the
> backtrace.
>
> -AVM
>
>
> But I am not able to reproduce the log server crash.
> Can you please suggest the exact test case, i.e. which logtest writes a
> burst of 700 records with a 5 us interval?
>
> On 10/14/2013 6:52 PM, Hans Feldt wrote:
>>
>> Hi,
>>
>> Running the OpenSAF test program "logtest" against the latest OpenSAF
>> configured with MDS/TCP crashes the log server at the assert in
>> mds_mdtm_queue_add_unsent_msg():
>>
>>>     ++tcp_cb->mdtm_tcp_unsent_counter; /* Increment the counter to keep a tab on number of messages */
>>>     if (tcp_cb->mdtm_tcp_unsent_counter <= DTM_INTRANODE_UNSENT_MSG) {
>>>         if (NULL == hdr && NULL == tail) {
>>>             tcp_cb->mds_mdtm_msg_unsent_hdr = tmp;
>>>             tcp_cb->mds_mdtm_msg_unsent_tail = tmp;
>>>         } else {
>>>             tail->next = tmp;
>>>             tcp_cb->mds_mdtm_msg_unsent_tail = tmp;
>>>
>>>             /* Change the poll from POLLIN to POLLOUT */
>>>             pfd[0].events = pfd[0].events | POLLOUT;
>>>         }
>>>     } else {
>>>         syslog(LOG_ERR, " MDTM unsent message is more!=%d", DTM_INTRANODE_UNSENT_MSG);
>>>         assert(0);
>>>         return NCSCC_RC_FAILURE;
>>>     }
>>
>> $ grep DTM_INTRANODE_UNSENT_MSG include/*
>> include/mds_dt_tcp_disc.h:#define DTM_INTRANODE_UNSENT_MSG 200
>>
>> mds_mdtm_unsent_queue_add_send() is the only place where
>> mds_mdtm_queue_add_unsent_msg() is called.
>>
>> mds_mdtm_unsent_queue_add_send() can return an error code, but none of its
>> callers check it! I guess it should return void then and abort
>> internally.
>>
>> Can you explain what is going on?
>>
>> Thanks,
>> hans
>
>
>
