Hi,
Please see my responses to questions #3 and #4.
Regards, Vu
On 9/7/19 4:07 AM, William R Elliott wrote:
Hello,
We are using OpenSAF version 5.1.0. We have a cluster using TCP as the transport
mechanism, with the OpenSAF multicast feature enabled.
We would appreciate answers to the following questions:
1. Please provide a link or any document that gives details on how the
OpenSAF MDS layer works.
2. osafntfd ER ntfs_mds_msg_send FAILED - Trace of the problem.
Sep 3 19:57:26.107676 osafntfd [11558:NtfClient.cc:0202] << notificationReceived
Sep 3 19:57:26.107679 osafntfd [11558:NtfClient.cc:0147] >>
notificationReceived: 108 2
Sep 3 19:57:26.107685 osafntfd [11558:NtfFilter.cc:0464] >> checkFilter
Sep 3 19:57:26.107711 osafntfd [11558:ntfsv_mem.c:0769] >> ntfsv_get_ntf_header
Sep 3 19:57:26.107721 osafntfd [11558:ntfsv_mem.c:0790] << ntfsv_get_ntf_header
Sep 3 19:57:26.107726 osafntfd [11558:NtfFilter.cc:0071] T8
numNotificationClassIds: 0
Sep 3 19:57:26.107729 osafntfd [11558:NtfFilter.cc:0056] T8 num EventTypes: 1
Sep 3 19:57:26.107732 osafntfd [11558:NtfFilter.cc:0060] T2 EventTypes matches
Sep 3 19:57:26.107735 osafntfd [11558:NtfFilter.cc:0187] T8 num
notificationObjects: 0
Sep 3 19:57:26.107738 osafntfd [11558:NtfFilter.cc:0202] T8 num
NotifyingObjects: 0
Sep 3 19:57:26.107741 osafntfd [11558:NtfFilter.cc:0223] T2 hdfilter matches
Sep 3 19:57:26.107745 osafntfd [11558:NtfFilter.cc:0087] T8 numSi: 0
Sep 3 19:57:26.107748 osafntfd [11558:NtfFilter.cc:0471] << checkFilter
Sep 3 19:57:26.107751 osafntfd [11558:NtfClient.cc:0184] T2
NtfClient::notificationReceived notification 723 matches subscription 0, client
108
Sep 3 19:57:26.107756 osafntfd [11558:NtfNotification.cc:0105] T1 Subscription
0 added to list in notification 723 client 108, subscriptionList size is 1
Sep 3 19:57:26.107761 osafntfd [11558:NtfSubscription.cc:0211] >>
sendNotification
Sep 3 19:57:26.107764 osafntfd [11558:NtfSubscription.cc:0222] T3
send_notification_lib called, client 108, notification 723
Sep 3 19:57:26.107768 osafntfd [11558:ntfs_com.c:0284] >> send_notification_lib
Sep 3 19:57:26.107771 osafntfd [11558:ntfsv_mem.c:0769] >> ntfsv_get_ntf_header
Sep 3 19:57:26.107774 osafntfd [11558:ntfsv_mem.c:0790] << ntfsv_get_ntf_header
Sep 3 19:57:26.10 osafntfd [11558:ntfs_com.c:0286] T3 client id: 108,
not_id: 723
Sep 3 19:57:26.107781 osafntfd [11558:mds_c_sndrcv.c:0396] >> mds_send
Sep 3 19:57:26.107785 osafntfd [11558:mds_c_sndrcv.c:0403] << mds_send
Sep 3 19:57:26.107788 osafntfd [11558:mds_c_sndrcv.c:0681] >> mds_mcm_send
Sep 3 19:57:26.107791 osafntfd [11558:mds_c_sndrcv.c:0916] >>
mcm_pvt_normal_svc_snd
Sep 3 19:57:26.107794 osafntfd [11558:mds_c_sndrcv.c:0956] >>
mcm_pvt_normal_snd_process_common
Sep 3 19:57:26.107800 osafntfd [11558:mds_c_sndrcv.c:1699] >>
mds_mcm_process_disc_queue_checks
Sep 3 19:57:26.107804 osafntfd [11558:mds_c_sndrcv.c:1740] TR in else if
sub_info->tmr_flag !- true
Sep 3 19:57:26.107813 osafntfd [11558:mds_c_sndrcv.c:1747] TR
MDS_SND_RCV:Subscription exists but no timer running
Sep 3 19:57:26.107816 osafntfd [11558:mds_c_sndrcv.c:1749] TR MDS_SND_RCV :L
mds_mcm_process_disc_queue_checks
Sep 3 19:57:26.107819 osafntfd [11558:mds_c_sndrcv.c:1750] <<
mds_mcm_process_disc_queue_checks
Sep 3 19:57:26.107900 osafntfd [11558:mds_c_sndrcv.c:1048] <<
mcm_pvt_normal_snd_process_common
Sep 3 19:57:26.107937 osafntfd [11558:mds_c_sndrcv.c:0937] >>
mcm_pvt_normal_svc_snd
Sep 3 19:57:26.107941 osafntfd [11558:mds_c_sndrcv.c:0846] << mds_mcm_send
Sep 3 19:57:26.108141 osafntfd [11558:ntfs_mds.c:1290] ER ntfs_mds_msg_send
FAILED
Sep 3 19:57:26.108160 osafntfd [11558:ntfs_com.c:0308] ER ntfs_mds_msg_send to
ntfa failed rc: 2
Sep 3 19:57:26.108165 osafntfd [11558:NtfNotification.cc:0142] T1 Removing
subscription 0 client 108 from notification 723, subscriptionList size is 0
Sep 3 19:57:26.108169 osafntfd [11558:ntfs_com.c:0503] >> sendNotConfirmUpdate:
client: 108, subId: 0, notId: 723
a. The traces show that a notification is received for client id 108, and
then an mds_send is attempted for the same client id, but it fails because
there is no timer running.
b. What does a client represent? An OpenSAF process? An SU? A component?
c. What is the purpose of sending a message back after receipt of the
notification? Since it is not sent, it is discarded, and this does not seem to
have any impact on the cluster.
d. Hardcoded timers are defined in mds_main.c:
uint32_t MDS_QUIESCED_TMR_VAL = 80;
uint32_t MDS_AWAIT_ACTIVE_TMR_VAL = 18000;
uint32_t MDS_SUBSCRIPTION_TMR_VAL = 500;
uint32_t MDTM_REASSEMBLE_TMR_VAL = 500;
uint32_t MDTM_CACHED_EVENTS_TMR_VAL = 24000;
Could each one of these be explained? Can any of these be increased? If yes,
what effect would that have?
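For anyone who wants to locate the same failure in a longer osafntfd trace, the following is a minimal sketch, not part of OpenSAF. It assumes the record layout seen in the trace under question 2 ("Mon D HH:MM:SS.us process [pid:file:line] message") and that mail-wrapped continuation lines belong to the preceding record. The names RECORD_RE, join_records, find_errors and SAMPLE are hypothetical helpers chosen for illustration.

```python
import re

# Assumed record layout, taken from the trace above, e.g.
# "Sep 3 19:57:26.108141 osafntfd [11558:ntfs_mds.c:1290] ER ntfs_mds_msg_send"
RECORD_RE = re.compile(
    r"^(?P<month>[A-Z][a-z]{2})\s+(?P<day>\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+)\s+(?P<proc>\S+)\s+"
    r"\[(?P<pid>\d+):(?P<file>[^:\]]+):(?P<line>\d+)\]\s*(?P<msg>.*)$"
)


def join_records(lines):
    """Merge wrapped continuation lines back into one record each."""
    records = []
    for raw in lines:
        text = raw.strip()
        if not text:
            continue
        if RECORD_RE.match(text) or not records:
            records.append(text)          # start of a new record
        else:
            records[-1] += " " + text     # continuation of the previous one
    return records


def find_errors(lines, context=3):
    """Return (error_record, preceding_records) for every ER record."""
    records = join_records(lines)
    hits = []
    for i, rec in enumerate(records):
        m = RECORD_RE.match(rec)
        if m and m.group("msg").startswith("ER "):
            hits.append((rec, records[max(0, i - context):i]))
    return hits


# A few lines copied from the trace above, including the mail wrapping.
SAMPLE = """\
Sep 3 19:57:26.107941 osafntfd [11558:mds_c_sndrcv.c:0846] << mds_mcm_send
Sep 3 19:57:26.108141 osafntfd [11558:ntfs_mds.c:1290] ER ntfs_mds_msg_send
FAILED
Sep 3 19:57:26.108160 osafntfd [11558:ntfs_com.c:0308] ER ntfs_mds_msg_send to
ntfa failed rc: 2
""".splitlines()

if __name__ == "__main__":
    for err, before in find_errors(SAMPLE, context=1):
        print(err)
        for rec in before:
            print("    preceded by:", rec)
```

Run against a full trace file, this lists each ER record together with the records leading up to it, which makes it easier to see which MDS call returned just before the failure was logged.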
3. Are there limits on the size of a cluster, the number of SGs, the
number of SUs, or the number of components per SU? Testing shows that as the
number of SUs/components increases, the
erro