- **status**: review --> fixed
- **assigned_to**: elunlen --> nobody
- **Comment**:
commit 1c58a2106a55ad212a8e296424b1f20508eeb9cd
Author: Lennart Lund <[email protected]>
Date: Thu Oct 19 15:17:27 2017 +0200
smf: coredump and syslog flood after immnd crash [#2441]
When the OI handle is reinitialized, which is done in a separate thread, keep
the new handle in a local variable until the whole OI initialization, including
OI set, is done. When finished, the new handle can be published in the global
cb structure. Also protect the global variable change with the imm lock mutex.
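
The pattern described in the commit message can be sketched roughly as follows. This is only an illustration, not the actual SMF code: `demo_cb`, `demo_oi_init`, `demo_oi_set`, `demo_reinit_oi` and `demo_current_handle` are hypothetical names, and the two stubs stand in for the real IMM OI initialize and OI set calls. The point is that the handle other threads can see is replaced only after every step has succeeded, and only while the mutex is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an OI handle; not the real IMM type. */
typedef uint64_t demo_oi_handle_t;

/* Hypothetical stand-in for the global cb structure. */
struct demo_cb {
    pthread_mutex_t imm_lock;   /* plays the role of the imm lock mutex */
    demo_oi_handle_t oi_handle; /* handle visible to all other threads */
};

static struct demo_cb demo_cb = { PTHREAD_MUTEX_INITIALIZER, 1 };

/* Stub for "initialize a new OI handle". Only the reinit thread calls it,
 * so the unsynchronized counter is acceptable for this sketch. */
static bool demo_oi_init(demo_oi_handle_t *out)
{
    static demo_oi_handle_t next = 100;
    assert(out != NULL);
    *out = next++;
    return true;
}

/* Stub for "OI set"; the flag lets a caller simulate a failure. */
static bool demo_oi_set(demo_oi_handle_t handle, bool simulate_failure)
{
    (void)handle;
    return !simulate_failure;
}

/* Body of the reinitialization work (run in a separate thread in the
 * scenario described above). The new handle stays in a local variable
 * until both steps have succeeded. */
bool demo_reinit_oi(bool simulate_set_failure)
{
    demo_oi_handle_t new_handle = 0;

    if (!demo_oi_init(&new_handle))
        return false;
    if (!demo_oi_set(new_handle, simulate_set_failure))
        return false; /* global handle is left untouched */

    /* Publish the finished handle under the lock. */
    pthread_mutex_lock(&demo_cb.imm_lock);
    demo_cb.oi_handle = new_handle;
    pthread_mutex_unlock(&demo_cb.imm_lock);
    return true;
}

/* Readers take the same lock, so they never see a half-initialized handle. */
demo_oi_handle_t demo_current_handle(void)
{
    pthread_mutex_lock(&demo_cb.imm_lock);
    demo_oi_handle_t handle = demo_cb.oi_handle;
    pthread_mutex_unlock(&demo_cb.imm_lock);
    return handle;
}
```

With this shape, a failed reinitialization leaves the previously published handle in place, so other threads never pick up a handle for which OI set has not completed.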
---
**[tickets:#2441] smf: coredump and syslog flood after immnd crash**
**Status:** fixed
**Milestone:** 5.17.10
**Created:** Thu Apr 27, 2017 09:05 AM UTC by Rafael Odzakow
**Last Updated:** Wed Oct 18, 2017 01:41 PM UTC
**Owner:** nobody
Seen in opensaf version: 183d7c379a8f
short ID: 8190
SMF should handle the return code ERR_BAD_HANDLE more gracefully, probably by
reinitializing and creating a new handle. ERR_BAD_HANDLE can occur when IMMND
has crashed and is still reinitializing.
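
One possible shape for such handling is sketched below. It is a minimal illustration under stated assumptions, not the SMF implementation: `demo_rc`, `demo_imm`, `demo_update_attr`, `demo_reinit` and `demo_update_with_recovery` are all hypothetical, and the stubs only simulate a stale handle. The number of attempts is bounded so that a persistent failure is returned to the caller instead of being retried forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes; not the real SA_AIS_* values. */
enum demo_rc { DEMO_OK, DEMO_ERR_BAD_HANDLE, DEMO_ERR_OTHER };

/* Hypothetical client state: the handle we hold and the one the
 * simulated server currently accepts. */
struct demo_imm {
    unsigned long handle;
    unsigned long valid_handle;
    int reinit_count;
};

/* Stub for an attribute update: fails with BAD_HANDLE on a stale handle. */
static enum demo_rc demo_update_attr(const struct demo_imm *imm)
{
    return imm->handle == imm->valid_handle ? DEMO_OK : DEMO_ERR_BAD_HANDLE;
}

/* Stub for reinitialization: obtains a handle the server accepts. */
static bool demo_reinit(struct demo_imm *imm)
{
    imm->handle = imm->valid_handle;
    imm->reinit_count++;
    return true;
}

/* Try the update; on BAD_HANDLE reinitialize and try again, at most
 * max_attempts times. Any other result is returned immediately. */
enum demo_rc demo_update_with_recovery(struct demo_imm *imm, int max_attempts)
{
    enum demo_rc rc = DEMO_ERR_OTHER;

    assert(imm != NULL);
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = demo_update_attr(imm);
        if (rc != DEMO_ERR_BAD_HANDLE)
            return rc;
        if (!demo_reinit(imm))
            return rc; /* could not recover; report BAD_HANDLE */
    }
    return rc;
}
```

Returning the last error after the attempt budget is used up leaves the decision (fail the campaign, wait, or escalate) to the caller rather than asserting inside the update path.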
These lines are flooding the system and trace log:
~~~
5:09:43.862034 osafsmfd
[27207:../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/smfd_main.c:0107]
WA Lock failed eith EBUSY pthread_mutex_trylock for imm 16
5:09:43.862042 osafsmfd
[27207:../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/smfd_main.c:0101]
>> smfd_imm_trylock
~~~
SMF backtrace
~~~
### BT FULL ###
#0 0x00007f04a27e50c7 in raise () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f04a27e6478 in abort () from /lib64/libc.so.6
No symbol table info available.
#2 0x00007f04a4d1afee in __osafassert_fail (__file=<optimized out>,
__line=<optimized out>, __func=<optimized out>, __assertion=<optimized out>) at
../../../../../../opensaf/osaf/libs/core/leap/sysf_def.c:281
No locals.
#3 0x0000000000411f8f in updateImmAttr (dn=<optimized out>,
attributeName=0x47db5b "saSmfCmpgElapsedTime",
attrValueType=SA_IMM_ATTR_SATIMET, value=0x1d89cb8) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/smfd_campaign_oi.cc:773
rc = SA_AIS_ERR_BAD_OPERATION
__FUNCTION__ = "updateImmAttr"
#4 0x000000000040f129 in SmfCampaign::updateElapsedTime
(this=this@entry=0x1d89c90) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfCampaign.cc:930
updateTime = <optimized out>
diffTime = <optimized out>
timeStamp = {
tv_sec = 1490843664,
tv_usec = 109102
}
#5 0x000000000040f169 in SmfCampaign::stopElapsedTime (this=0x1d89c90) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfCampaign.cc:958
No locals.
#6 0x0000000000440bee in SmfCampState::changeState
(this=this@entry=0x7f048c014b70, i_camp=i_camp@entry=0x7f048c001220,
i_state=0x7f048c00d3d0) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfCampState.cc:224
__FUNCTION__ = "changeState"
newState = {
static npos = <optimized out>,
_M_dataplus = {
<std::allocator<char>> = {
<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
members of std::basic_string<char, std::char_traits<char>, std::allocator<char>
>::_Alloc_hider:
_M_p = 0x7f048c12ae48 "SmfCampStateExecFailed"
}
}
oldState = {
static npos = <optimized out>,
_M_dataplus = {
<std::allocator<char>> = {
<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
members of std::basic_string<char, std::char_traits<char>, std::allocator<char>
>::_Alloc_hider:
_M_p = 0x7f048c02f108 "SmfCampStateExecuting"
}
}
#7 0x0000000000443faa in SmfCampStateExecuting::procResult
(this=0x7f048c014b70, i_camp=0x7f048c001220, i_procedure=<optimized out>,
i_result=<optimized out>) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfCampState.cc:947
error = {
static npos = <optimized out>,
_M_dataplus = {
<std::allocator<char>> = {
<__gnu_cxx::new_allocator<char>> = {<No data fields>}, <No data fields>},
members of std::basic_string<char, std::char_traits<char>, std::allocator<char>
>::_Alloc_hider:
_M_p = 0x7f048c010f48 "Procedure safSmfProc=SingleStep_upgrade_SCs failed"
}
}
__FUNCTION__ = "procResult"
result = <optimized out>
#8 0x000000000042500a in SmfUpgradeCampaign::procResult (this=0x7f048c001220,
i_procedure=0x7f048c10afe0, i_result=SMF_PROC_FAILED) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfUpgradeCampaign.cc:955
__FUNCTION__ = "procResult"
campResult = <optimized out>
#9 0x000000000040cdb1 in SmfCampaignThread::processEvt (this=0x1d8d310) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfCampaignThread.cc:653
evt = 0x7f0490001b40
#10 0x000000000040cf48 in SmfCampaignThread::handleEvents
(this=this@entry=0x1d8d310) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfCampaignThread.cc:699
ret = <optimized out>
__FUNCTION__ = "handleEvents"
fds = {{
fd = 25,
events = 1,
revents = 1
}}
#11 0x0000000000408253 in SmfCampaignThread::main (this=this@entry=0x1d8d310)
at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfCampaignThread.cc:760
__FUNCTION__ = "main"
#12 0x0000000000408352 in SmfCampaignThread::main (info=0x1d8d310) at
../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/SmfCampaignThread.cc:109
__FUNCTION__ = "main"
self = 0x1d8d310
#13 0x00007f04a3a110a4 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#14 0x00007f04a289502d in clone () from /lib64/libc.so.6
No symbol table info available.
~~~
The following lines flooded the syslog on SC-1:
~~~
Mar 30 5:09:43.862001 osafsmfd
[27207:../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/smfd_main.c:0101]
>> smfd_imm_trylock
Mar 30 5:09:43.862034 osafsmfd
[27207:../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/smfd_main.c:0107]
WA Lock failed eith EBUSY pthread_mutex_trylock for imm 16
~~~
Coredump happened:
~~~
Mar 30 5:14:24.109139 osafsmfd
[27207:../../../../../../../opensaf/osaf/libs/agents/saf/imma/imma_oi_api.c:2519]
ER ERR_BAD_OPERATION: The SaImmOiHandleT is not associated with any
implementer name
Mar 30 5:14:24.109204 osafsmfd
[27207:../../../../../../../opensaf/osaf/services/saf/smfsv/smfd/smfd_campaign_oi.cc:0772]
ER updateImmAttr(): immutil_update_one_rattr FAILED, rc = 20, going to assert
~~~