I think osaftransportd must do this sequence:

* Open the file

* Lock the file

* Check if the file has a non-zero size - if the size is zero, close the 
file and treat it as non-existing.

Though your worry about deadlock is important from a security 
perspective. Locking shall be non-blocking as Hans says - otherwise 
there can be a risk for denial of service.

regards,

Anders Widell

On 05/03/2017 08:52 AM, Hans Nordebäck wrote:
> Hi Minh,
>
> the flock is used in non blocking mode, not sure but as I mentioned 
> earlier why not add a new function
>
> FileNotify::WaitForFileClose and change the return codes in 
> ProcessEvents to differentiate the events?
>
> /Thanks HansN
>
>
> On 05/03/2017 08:43 AM, minh chau wrote:
>> Hi Anders, Hans,
>>
>> Using flock() when reading pid file in osaftransportd, it may be 
>> causing a deadlock. When open() is called in __create_pidfile(), that 
>> will generate IN_CREATE towards osaftransportd. Right after that if 
>> osaftransportd takes the lock before __create_pidfile(), I think it 
>> will cause a deadlock?
>>
>> There might be two ways we can do:
>>
>> (1) It's a little bit change in the temporary approach, we first 
>> create original pid file as normally, at the end of 
>> __create_pidfile(), create a symlink temporary pid file targeting the 
>> original pid file. Then, osaftransportd now watches the temporary pid 
>> file, and FileNotify can continue using IN_CREATE (which is created 
>> by both symlink() and open(), don't have to add IN_MOVED_TO)
>>
>> (2) The other simple one is the retry of reading pid as sent out in 
>> V1, it does retry as similar as NID is trying to open fifo file after 
>> call WaitForFileCreation(). We can extend the waiting time of retries 
>> for osaftransportd as similar to NID, and the pid should be ready to 
>> read by that period of waiting time.
>>
>> Or (3)...?
>>
>> thanks,
>> Minh
>>
>> On 03/05/17 00:43, Anders Widell wrote:
>>> Yes, when reading this patch I realize that the pidfile creation 
>>> already uses flock() for atomicity. Maybe this is standard practice 
>>> when creating pid files, and if so we probably shouldn't do it in 
>>> another way because then we can have problems interacting with other 
>>> programs that operate on the pid files (Ubuntu start-stop-daemon and 
>>> LSB start_daemon). Maybe the solution is as simple as using flock() 
>>> when reading the pid file?
>>>
>>> regards,
>>>
>>> Anders Widell
>>>
>>>
>>> On 05/02/2017 03:10 PM, Hans Nordebäck wrote:
>>>> Hi Minh,
>>>>
>>>> some minor comments below, I discussed with Anders about using the 
>>>> existing flock instead of the rename.
>>>>
>>>> /BR HansN
>>>>
>>>>
>>>> On 04/28/2017 11:36 AM, Minh Chau wrote:
>>>>> At startup, osaftransportd waits for osafdtmd.pid file creation
>>>>> and then reads dtm pid. If osafdtmd.pid has not been completedly
>>>>> created but osaftransportd still receives IN_CREATE, osaftransported
>>>>> will fail to read pid of dtmd. That results in a node reboot with
>>>>> a reason as "osafdtmd failed to start".
>>>>>
>>>>> The patch implements an approach suggested by Anders Widell, which
>>>>> creates a completed temporary pid file first, then renames it to
>>>>> correct pid file name. Whenever osaftransportd is notified to read
>>>>> dtmd's pid, the data in pid file should be always safe to read. In
>>>>> addition to this, FileNotify needs to introduce IN_MOVED_TO event.
>>>>> ---
>>>>>   src/base/daemon.c       | 27 ++++++++++++++++++---------
>>>>>   src/base/file_notify.cc | 10 +++++++++-
>>>>>   2 files changed, 27 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/src/base/daemon.c b/src/base/daemon.c
>>>>> index 77a869561..2fd161ce0 100644
>>>>> --- a/src/base/daemon.c
>>>>> +++ b/src/base/daemon.c
>>>>> @@ -88,40 +88,49 @@ static int __create_pidfile(const char *pidfile)
>>>>>   {
>>>>>       FILE *file = NULL;
>>>>>       int fd, pid, rc = 0;
>>>>> +    char pidfiletmp[256] = {0};
>>>> [HansN] use NAME_MAX instead of 256
>>>>> +
>>>>> +    sprintf(pidfiletmp, "%s.tmp", pidfile);
>>>> [HansN] use snprintf and shouldn't pid be included in temp name?
>>>>>         /* open the file and associate a stream with it */
>>>>> -    if (((fd = open(pidfile, O_RDWR | O_CREAT, 0644)) == -1) ||
>>>>> +    if (((fd = open(pidfiletmp, O_RDWR | O_CREAT, 0644)) == -1) ||
>>>>>           ((file = fdopen(fd, "r+")) == NULL)) {
>>>>> -        syslog(LOG_ERR, "open failed, pidfile=%s, errno=%s", 
>>>>> pidfile,
>>>>> -               strerror(errno));
>>>>> +        syslog(LOG_ERR, "open failed, pidfiletmp=%s, errno=%s",
>>>>> +            pidfiletmp, strerror(errno));
>>>> [HansN] close(fd) is missing
>>>>>           return -1;
>>>>>       }
>>>>>         /* Lock the file */
>>>>>       if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
>>>>> -        syslog(LOG_ERR, "flock failed, pidfile=%s, errno=%s", 
>>>>> pidfile,
>>>>> -               strerror(errno));
>>>>> +        syslog(LOG_ERR, "flock failed, pidfiletmp=%s, errno=%s",
>>>>> +            pidfiletmp, strerror(errno));
>>>>>           fclose(file);
>>>>>           return -1;
>>>>>       }
>>>>>         pid = getpid();
>>>>>       if (!fprintf(file, "%d\n", pid)) {
>>>>> -        syslog(LOG_ERR, "fprintf failed, pidfile=%s, errno=%s", 
>>>>> pidfile,
>>>>> -               strerror(errno));
>>>>> +        syslog(LOG_ERR, "fprintf failed, pidfiletmp=%s, errno=%s",
>>>>> +             pidfiletmp, strerror(errno));
>>>>>           fclose(file);
>>>>>           return -1;
>>>>>       }
>>>>>       fflush(file);
>>>>>         if (flock(fd, LOCK_UN) == -1) {
>>>>> -        syslog(LOG_ERR, "flock failed, pidfile=%s, errno=%s", 
>>>>> pidfile,
>>>>> -               strerror(errno));
>>>>> +        syslog(LOG_ERR, "flock failed, pidfiletmp=%s, errno=%s",
>>>>> +            pidfiletmp, strerror(errno));
>>>>>           fclose(file);
>>>>>           return -1;
>>>>>       }
>>>>>       fclose(file);
>>>>>   +    if (rename(pidfiletmp, pidfile) != 0) {
>>>>> +        syslog(LOG_ERR, "rename failed, old=%s new=%s, error:%s",
>>>>> +            pidfiletmp, pidfile, strerror(errno));
>>>>> +        return -1;
>>>>> +    }
>>>>> +
>>>>>       return rc;
>>>>>   }
>>>>>   diff --git a/src/base/file_notify.cc b/src/base/file_notify.cc
>>>>> index e96be8b4a..80d1b7d54 100644
>>>>> --- a/src/base/file_notify.cc
>>>>> +++ b/src/base/file_notify.cc
>>>>> @@ -50,7 +50,7 @@ FileNotify::FileNotifyErrors 
>>>>> FileNotify::WaitForFileCreation(
>>>>>     SplitFileName(file_name);
>>>>>       if ((inotify_wd_ = inotify_add_watch(inotify_fd_, 
>>>>> file_path_.c_str(),
>>>>> -                                       IN_CREATE)) == -1) {
>>>>> +                                       IN_CREATE | IN_MOVED_TO)) 
>>>>> == -1) {
>>>>>       LOG_NO("inotify_add_watch failed: %s", strerror(errno));
>>>>>       return FileNotifyErrors::kError;
>>>>>     }
>>>>> @@ -144,6 +144,14 @@ FileNotify::FileNotifyErrors 
>>>>> FileNotify::ProcessEvents(
>>>>>                   return FileNotifyErrors::kOK;
>>>>>                 }
>>>>>               }
>>>>> +            if (event->mask & IN_MOVED_TO) {
>>>>> +              if (file_name_ == event->name) {
>>>>> +                TRACE("file name: %s moved to %s", 
>>>>> file_name_.c_str(),
>>>>> +                    file_path_.c_str());
>>>>> +                delete[] fds;
>>>>> +                return FileNotifyErrors::kOK;
>>>>> +              }
>>>>> +            }
>>>>>               if (event->mask & IN_IGNORED) {
>>>>>                 TRACE("IN_IGNORE received, (ignored)");
>>>>>               }
>>>>
>>>
>>>
>>>
>>
>



------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel

Reply via email to