I think osaftransportd must do this sequence:
* Open the file
* Lock the file
* Check if the file has a non-zero size - if the size is zero, close the
file and treat it as non-existing.
Though your worry about deadlock is important from a security
perspective. Locking shall be non-blocking as Hans says - otherwise
there can be a risk for denial of service.
regards,
Anders Widell
On 05/03/2017 08:52 AM, Hans Nordebäck wrote:
> Hi Minh,
>
> the flock is used in non blocking mode, not sure but as I mentioned
> earlier why not add a new function
>
> FileNotify::WaitForFileClose and change the return codes in
> ProcessEvents to differentiate the events?
>
> /Thanks HansN
>
>
> On 05/03/2017 08:43 AM, minh chau wrote:
>> Hi Anders, Hans,
>>
>> Using flock() when reading pid file in osaftransportd, it may be
>> causing a deadlock. When open() is called in __create_pidfile(), that
>> will generate IN_CREATE towards osaftransportd. Right after that if
>> osaftransportd takes the lock before __create_pidfile(), I think it
>> will cause a deadlock?
>>
>> There might be two ways we can do:
>>
>> (1) It's a little bit change in the temporary approach, we first
>> create original pid file as normally, at the end of
>> __create_pidfile(), create a symlink temporary pid file targeting the
>> original pid file. Then, osaftransportd now watches the temporary pid
>> file, and FileNotify can continue using IN_CREATE (which is created
>> by both symlink() and open(), don't have to add IN_MOVED_TO)
>>
>> (2) The other simple one is the retry of reading pid as sent out in
>> V1, it does retry as similar as NID is trying to open fifo file after
>> call WaitForFileCreation(). We can extend the waiting time of retries
>> for osaftransportd as similar to NID, and the pid should be ready to
>> read by that period of waiting time.
>>
>> Or (3)...?
>>
>> thanks,
>> Minh
>>
>> On 03/05/17 00:43, Anders Widell wrote:
>>> Yes, when reading this patch I realize that the pidfile creation
>>> already uses flock() for atomicity. Maybe this is standard practice
>>> when creating pid files, and if so we probably shouldn't do it in
>>> another way because then we can have problems interacting with other
>>> programs that operate on the pid files (Ubuntu start-stop-daemon and
>>> LSB start_daemon). Maybe the solution is as simple as using flock()
>>> when reading the pid file?
>>>
>>> regards,
>>>
>>> Anders Widell
>>>
>>>
>>> On 05/02/2017 03:10 PM, Hans Nordebäck wrote:
>>>> Hi Minh,
>>>>
>>>> some minor comments below, I discussed with Anders about using the
>>>> existing flock instead of the rename.
>>>>
>>>> /BR HansN
>>>>
>>>>
>>>> On 04/28/2017 11:36 AM, Minh Chau wrote:
>>>>> At startup, osaftransportd waits for osafdtmd.pid file creation
>>>>> and then reads dtm pid. If osafdtmd.pid has not been completedly
>>>>> created but osaftransportd still receives IN_CREATE, osaftransported
>>>>> will fail to read pid of dtmd. That results in a node reboot with
>>>>> a reason as "osafdtmd failed to start".
>>>>>
>>>>> The patch implements an approach suggested by Anders Widell, which
>>>>> creates a completed temporary pid file first, then renames it to
>>>>> correct pid file name. Whenever osaftransportd is notified to read
>>>>> dtmd's pid, the data in pid file should be always safe to read. In
>>>>> addition to this, FileNotify needs to introduce IN_MOVED_TO event.
>>>>> ---
>>>>> src/base/daemon.c | 27 ++++++++++++++++++---------
>>>>> src/base/file_notify.cc | 10 +++++++++-
>>>>> 2 files changed, 27 insertions(+), 10 deletions(-)
>>>>>
>>>>> diff --git a/src/base/daemon.c b/src/base/daemon.c
>>>>> index 77a869561..2fd161ce0 100644
>>>>> --- a/src/base/daemon.c
>>>>> +++ b/src/base/daemon.c
>>>>> @@ -88,40 +88,49 @@ static int __create_pidfile(const char *pidfile)
>>>>> {
>>>>> FILE *file = NULL;
>>>>> int fd, pid, rc = 0;
>>>>> + char pidfiletmp[256] = {0};
>>>> [HansN] use NAME_MAX instead of 256
>>>>> +
>>>>> + sprintf(pidfiletmp, "%s.tmp", pidfile);
>>>> [HansN] use snprintf and shouldn't pid be included in temp name?
>>>>> /* open the file and associate a stream with it */
>>>>> - if (((fd = open(pidfile, O_RDWR | O_CREAT, 0644)) == -1) ||
>>>>> + if (((fd = open(pidfiletmp, O_RDWR | O_CREAT, 0644)) == -1) ||
>>>>> ((file = fdopen(fd, "r+")) == NULL)) {
>>>>> - syslog(LOG_ERR, "open failed, pidfile=%s, errno=%s",
>>>>> pidfile,
>>>>> - strerror(errno));
>>>>> + syslog(LOG_ERR, "open failed, pidfiletmp=%s, errno=%s",
>>>>> + pidfiletmp, strerror(errno));
>>>> [HansN] close(fd) is missing
>>>>> return -1;
>>>>> }
>>>>> /* Lock the file */
>>>>> if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
>>>>> - syslog(LOG_ERR, "flock failed, pidfile=%s, errno=%s",
>>>>> pidfile,
>>>>> - strerror(errno));
>>>>> + syslog(LOG_ERR, "flock failed, pidfiletmp=%s, errno=%s",
>>>>> + pidfiletmp, strerror(errno));
>>>>> fclose(file);
>>>>> return -1;
>>>>> }
>>>>> pid = getpid();
>>>>> if (!fprintf(file, "%d\n", pid)) {
>>>>> - syslog(LOG_ERR, "fprintf failed, pidfile=%s, errno=%s",
>>>>> pidfile,
>>>>> - strerror(errno));
>>>>> + syslog(LOG_ERR, "fprintf failed, pidfiletmp=%s, errno=%s",
>>>>> + pidfiletmp, strerror(errno));
>>>>> fclose(file);
>>>>> return -1;
>>>>> }
>>>>> fflush(file);
>>>>> if (flock(fd, LOCK_UN) == -1) {
>>>>> - syslog(LOG_ERR, "flock failed, pidfile=%s, errno=%s",
>>>>> pidfile,
>>>>> - strerror(errno));
>>>>> + syslog(LOG_ERR, "flock failed, pidfiletmp=%s, errno=%s",
>>>>> + pidfiletmp, strerror(errno));
>>>>> fclose(file);
>>>>> return -1;
>>>>> }
>>>>> fclose(file);
>>>>> + if (rename(pidfiletmp, pidfile) != 0) {
>>>>> + syslog(LOG_ERR, "rename failed, old=%s new=%s, error:%s",
>>>>> + pidfiletmp, pidfile, strerror(errno));
>>>>> + return -1;
>>>>> + }
>>>>> +
>>>>> return rc;
>>>>> }
>>>>> diff --git a/src/base/file_notify.cc b/src/base/file_notify.cc
>>>>> index e96be8b4a..80d1b7d54 100644
>>>>> --- a/src/base/file_notify.cc
>>>>> +++ b/src/base/file_notify.cc
>>>>> @@ -50,7 +50,7 @@ FileNotify::FileNotifyErrors
>>>>> FileNotify::WaitForFileCreation(
>>>>> SplitFileName(file_name);
>>>>> if ((inotify_wd_ = inotify_add_watch(inotify_fd_,
>>>>> file_path_.c_str(),
>>>>> - IN_CREATE)) == -1) {
>>>>> + IN_CREATE | IN_MOVED_TO))
>>>>> == -1) {
>>>>> LOG_NO("inotify_add_watch failed: %s", strerror(errno));
>>>>> return FileNotifyErrors::kError;
>>>>> }
>>>>> @@ -144,6 +144,14 @@ FileNotify::FileNotifyErrors
>>>>> FileNotify::ProcessEvents(
>>>>> return FileNotifyErrors::kOK;
>>>>> }
>>>>> }
>>>>> + if (event->mask & IN_MOVED_TO) {
>>>>> + if (file_name_ == event->name) {
>>>>> + TRACE("file name: %s moved to %s",
>>>>> file_name_.c_str(),
>>>>> + file_path_.c_str());
>>>>> + delete[] fds;
>>>>> + return FileNotifyErrors::kOK;
>>>>> + }
>>>>> + }
>>>>> if (event->mask & IN_IGNORED) {
>>>>> TRACE("IN_IGNORE received, (ignored)");
>>>>> }
>>>>
>>>
>>>
>>>
>>
>
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Opensaf-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/opensaf-devel