Hi Anders,
A new method like WaitForFileCloseWrite() with IN_CLOSE_WRITE can leave
nid unchanged, I can try this out.
Thanks,
Minh
On 22/04/17 00:03, Anders Widell wrote:
> Woundn't it be enough to wait for IN_CLOSE_WRITE only? Maybe we can
> introduce a new method in FileNotify, instead of extending
> WaitForFileCreation()?
>
> / Anders Widell
>
>
> On 04/21/2017 03:49 PM, minh chau wrote:
>> Hi Anders,
>>
>> It is because WaitForFileCreation() is also called in nid, where nid
>> does not really need to know IN_CLOSE.
>> To introduce IN_CLOSE in file_notify.cc, then we have to change nid
>> to only wait for IN_CREATE, and transportd waits for IN_CREATE |
>> IN_CLOSE.
>> If this sounds ok to you, I can make a V2 for the above change?
>>
>> thanks,
>> Minh
>>
>> On 21/04/17 23:27, Anders Widell wrote:
>>> Hi!
>>>
>>> Wouldn't this problem, as you already hint at, be solved if we
>>> simply change the inotify condition to wait for IN_CLOSE_WRITE? That
>>> sound like a better solution than introducing a retry loop here...
>>>
>>> regards,
>>>
>>> Anders Widell
>>>
>>>
>>> On 04/20/2017 08:14 AM, Minh Chau wrote:
>>>> At startup, osaftransportd waits for osafdtmd.pid file creation
>>>> and then reads dtm pid. If osafdtmd.pid has not been completedly
>>>> created but osaftransportd still receives IN_CREATE, osaftransported
>>>> will fail to read pid of dtmd. That results in a node reboot with
>>>> a reason as "osafdtmd failed to start".
>>>>
>>>> To ensure that osaftransportd reads a valid pid, osaftransportd
>>>> should also watch IN_CLOSE which is the event that dtmd closes
>>>> osafdtmd.pid. That change will also affect to how nid is using
>>>> file_notify; therefore, this patch adds a retry on reading pid
>>>> of dtmd if the estabished stream to read osafdtmd.pid is not ready.
>>>> ---
>>>> src/dtm/transport/transport_monitor.cc | 20 +++++++++++++++++---
>>>> 1 file changed, 17 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/src/dtm/transport/transport_monitor.cc
>>>> b/src/dtm/transport/transport_monitor.cc
>>>> index 26c0b44..635ff00 100644
>>>> --- a/src/dtm/transport/transport_monitor.cc
>>>> +++ b/src/dtm/transport/transport_monitor.cc
>>>> @@ -45,10 +45,24 @@ pid_t TransportMonitor::WaitForDaemon(const
>>>> std::string& daemon_name,
>>>> pidfile, user_fds, seconds_to_wait * kMillisPerSec);
>>>> if (rc == base::FileNotify::FileNotifyErrors::kOK) {
>>>> - if (!(std::ifstream{pidfile} >> pid).fail()) {
>>>> - if (IsDir(proc_path_ + std::to_string(pid))) {
>>>> - return pid;
>>>> + std::ifstream::iostate istate = std::ifstream::goodbit;
>>>> + int retry_cnt = 0;
>>>> + int retry_max = 5;
>>>> + do {
>>>> + if (retry_cnt > 0) {
>>>> + osaf_nanosleep(&kHundredMilliseconds);
>>>> }
>>>> + std::ifstream is{pidfile};
>>>> + if (!is.fail()) istate = (is >> pid).rdstate();
>>>> + } while ((retry_cnt++ < retry_max) && (pid == pid_t{-1}));
>>>> +
>>>> + if (pid == pid_t{-1}) {
>>>> + LOG_WA("Failed to read pid from file %s, rdstate:0x%x",
>>>> + pidfile.c_str(), static_cast<int>(istate));
>>>> + return pid_t{-1};
>>>> + }
>>>> + if (IsDir(proc_path_ + std::to_string(pid))) {
>>>> + return pid;
>>>> }
>>>> }
>>>> return pid_t{-1};
>>>
>>>
>>
>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel