- **status**: assigned --> review
---
** [tickets:#3322] log: log agent in main process is disabled after child
process exits**
**Status:** review
**Milestone:** 5.22.11
**Created:** Wed Oct 05, 2022 09:36 AM UTC by Hieu Hong Hoang
**Last Updated:** Wed Oct 05, 2022 09:36 AM UTC
**Owner:** Hieu Hong Hoang
**Attachments:**
- [loga](https://sourceforge.net/p/opensaf/tickets/3322/attachment/loga) (28.8
kB; application/octet-stream)
-
[osaflogd](https://sourceforge.net/p/opensaf/tickets/3322/attachment/osaflogd)
(527.4 kB; application/octet-stream)
While using log agent, if the new process was created by forking current
process, both processes have a same mds address. If the child process exits,
the destructor of log agent is invoked. It will unregister with mds so all
other subscribing services will detect this log agent is down. However the main
process is still using that mds address, all requests from main process become
invalid because logd thinks this log agent has already been down.
Log analysis:
* Main process initializes the mds address:
~~~
<143>1 2022-10-05T10:41:05.537677+02:00 SC-2 logtest 763 loga [meta
sequenceId="89"] 763:log/agent/lga_mds.cc:1287 >> lga_mds_init
<143>1 2022-10-05T10:41:05.537782+02:00 SC-2 logtest 763 loga [meta
sequenceId="90"] 763:log/agent/lga_mds.cc:1334 << lga_mds_init
~~~
* Duplicate the main process by using
https://man7.org/linux/man-pages/man2/fork.2.html . Then the duplicated process
exits and the destructor of log agent is invoked:
~~~
<143>1 2022-10-05T10:41:05.541101+02:00 SC-2 logtest 763 loga [meta
sequenceId="156"] 772:log/agent/lga_agent.cc:167 >> ~LogAgent
<143>1 2022-10-05T10:41:05.54126+02:00 SC-2 logtest 763 loga [meta
sequenceId="157"] 772:log/agent/lga_state.cc:160 >> stop_recovery2_thread
<143>1 2022-10-05T10:41:05.541297+02:00 SC-2 logtest 763 loga [meta
sequenceId="158"] 772:log/agent/lga_state.cc:166 TR stop_recovery2_thread
RecoveryState::kNormal no thread to stop
<143>1 2022-10-05T10:41:05.541315+02:00 SC-2 logtest 763 loga [meta
sequenceId="159"] 772:log/agent/lga_state.cc:183 << stop_recovery2_thread
<143>1 2022-10-05T10:41:05.541322+02:00 SC-2 logtest 763 loga [meta
sequenceId="160"] 772:log/agent/lga_util.cc:125 >> lga_shutdown
<143>1 2022-10-05T10:41:05.541329+02:00 SC-2 logtest 763 loga [meta
sequenceId="161"] 772:log/agent/lga_mds.cc:1351 >> lga_mds_deinit
<143>1 2022-10-05T10:41:05.541573+02:00 SC-2 logtest 763 loga [meta
sequenceId="162"] 772:log/agent/lga_mds.cc:1362 << lga_mds_deinit
~~~
* Logd detects this log agent is down and delete all clients of this log agent:
~~~
<143>1 2022-10-05T10:41:05.541593+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3386"] 452:log/logd/lgs_mds.cc:1230 T8 MDS DOWN dest:
2020f3aafb5a7, node ID: 2020f, svc_id: 21
<143>1 2022-10-05T10:41:05.541636+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3387"] 447:log/logd/lgs_evt.cc:415 >> proc_lga_updn_mds_msg
<143>1 2022-10-05T10:41:05.541648+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3388"] 447:log/logd/lgs_evt.cc:436 TR proc_lga_updn_mds_msg:
LGSV_LGS_EVT_LGA_DOWN mds_dest = 2020f3aafb5a7
<143>1 2022-10-05T10:41:05.541656+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3389"] 447:log/logd/lgs_evt.cc:338 >>
lgs_client_delete_by_mds_dest: mds_dest 2020f3aafb5a7
<143>1 2022-10-05T10:41:05.541663+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3390"] 447:log/logd/lgs_evt.cc:191 >> lgs_client_delete: client_id 9
<143>1 2022-10-05T10:41:05.541678+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3391"] 447:log/logd/lgs_evt.cc:213 T4 client_id: 9, REMOVE stream
id: 2
<143>1 2022-10-05T10:41:05.541686+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3392"] 447:log/logd/lgs_stream.cc:856 >> log_stream_close:
safLgStrCfg=saLogSystem,safApp=safLogService
<143>1 2022-10-05T10:41:05.541713+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3393"] 447:log/logd/lgs_stream.cc:922 << log_stream_close: rc=0,
numOpeners=7
<143>1 2022-10-05T10:41:05.541737+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3394"] 447:log/logd/lgs_evt.cc:239 << lgs_client_delete
<143>1 2022-10-05T10:41:05.541744+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3395"] 447:log/logd/lgs_evt.cc:348 << lgs_client_delete_by_mds_dest
~~~
* The main process send a writting request to logd:
~~~
<143>1 2022-10-05T10:41:07.541457+02:00 SC-2 logtest 763 loga [meta
sequenceId="182"] 763:log/agent/lga_mds.cc:1439 >> lga_mds_msg_async_send
<143>1 2022-10-05T10:41:07.541488+02:00 SC-2 logtest 763 loga [meta
sequenceId="183"] 763:log/agent/lga_mds.cc:789 >> lga_mds_enc
<143>1 2022-10-05T10:41:07.541516+02:00 SC-2 logtest 763 loga [meta
sequenceId="184"] 763:log/agent/lga_mds.cc:820 T2 msgtype: 0
<143>1 2022-10-05T10:41:07.541524+02:00 SC-2 logtest 763 loga [meta
sequenceId="185"] 763:log/agent/lga_mds.cc:834 T2 api_info.type: 4
<143>1 2022-10-05T10:41:07.541533+02:00 SC-2 logtest 763 loga [meta
sequenceId="186"] 763:log/agent/lga_mds.cc:862 << lga_mds_enc: total_bytes = 108
<143>1 2022-10-05T10:41:07.541581+02:00 SC-2 logtest 763 loga [meta
sequenceId="187"] 763:log/agent/lga_mds.cc:1462 << lga_mds_msg_async_send
~~~
* Logd skips the request due to invalid client id:
~~~
<143>1 2022-10-05T10:41:07.541929+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3420"] 447:log/logd/lgs_evt.cc:1313 >> proc_write_log_async_msg:
client_id 9, stream ID 2, node_name = SC-2
<143>1 2022-10-05T10:41:07.541941+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3421"] 447:log/logd/lgs_evt.cc:109 TR clm_node_id delete failed : 9
<143>1 2022-10-05T10:41:07.541948+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3422"] 447:log/logd/lgs_evt.cc:114 TR client_id: 9 lookup failed
<143>1 2022-10-05T10:41:07.541954+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3423"] 447:log/logd/lgs_evt.cc:1323 TR Bad client ID: 9
<143>1 2022-10-05T10:41:07.541962+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3424"] 447:log/logd/lgs_util.cc:387 >> lgs_send_write_log_ack
<141>1 2022-10-05T10:41:07.542052+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3425"] 447:log/logd/lgs_util.cc:406 NO Failed (2) to send of WRITE
ack to: 2020f3aafb5a7
<143>1 2022-10-05T10:41:07.542073+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3426"] 447:log/logd/lgs_util.cc:408 << lgs_send_write_log_ack
<143>1 2022-10-05T10:41:07.54208+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3427"] 447:log/logd/lgs_util.cc:416 >> lgs_free_write_log
<143>1 2022-10-05T10:41:07.542087+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3428"] 447:log/logd/lgs_util.cc:438 << lgs_free_write_log
<143>1 2022-10-05T10:41:07.542093+02:00 SC-1 osaflogd 447 osaflogd [meta
sequenceId="3429"] 447:log/logd/lgs_evt.cc:0 << proc_write_log_async_msg
~~~
---
Sent from sourceforge.net because [email protected] is
subscribed to https://sourceforge.net/p/opensaf/tickets/
To unsubscribe from further messages, a project admin can change settings at
https://sourceforge.net/p/opensaf/admin/tickets/options. Or, if this is a
mailing list, you can unsubscribe from the mailing list.
_______________________________________________
Opensaf-tickets mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-tickets