> On April 11, 2016, 8:10 p.m., Alejandro Fernandez wrote: > > ambari-common/src/main/python/resource_management/libraries/functions/fcntl_based_process_lock.py, > > line 29 > > <https://reviews.apache.org/r/45872/diff/3/?file=1339352#file1339352line29> > > > > Can we make this more robust by catching the exception and logging it. > > Alejandro Fernandez wrote: > I believe the code should be resilient in all cases for all OS'es. Fail > gracefully and allow the operations to continue, even if they are in > parallel, so they can be retried.
Found that dpkg relates on fcntl for locking on /var/lib/dpkg/lock. So implicitly Ambari relates on fcntl support. There is a ticket about dpkg and NFS: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=149491 and it's closed as won't fix ("these reports are unreasonable wishes") It's and old report, to check if this is still an issue I tried to reproduce an fcntl fail on locking /var/lib/dpkg/lock. I tried this with NFSv3 and stopped nfslock (statd and rpc.lockd) but fcntl calls do not fail. So I downloaded dpkg source on Ubuntu 14.04 (apt-get source dpkg), changed code related to the fcntl call to imitate that it failed: dpkg-1.17.5ubuntu5.5/lib/dpkg/file.c:146 (void file_lock) ``` if (fcntl(*lockfd, lock_cmd, &fl) == -1) { if (errno == EACCES || errno == EAGAIN) ohshit(_("%s is locked by another process"), desc); else ohshite(_("unable to lock %s"), desc); } changed this to errno = ENOLCK; ohshite(_("!!! unable to lock %s"), desc); ``` compiled it using ```apt-get source --compile dpkg``` and reinstalled dpkg. Now "dpkg --install" fails with ``` root@u1402:~# ./usr/bin/dpkg --install dpkg: error: !!! unable to lock dpkg status database: No locks available ``` Don't think it's a good practice to ignore fcntl failures because it indicates that the lock is broken. But for parrallel execution with retries it may be acceptable. - Andrii ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/45872/#review128233 ----------------------------------------------------------- On April 11, 2016, 2:33 p.m., Andrew Onischuk wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/45872/ > ----------------------------------------------------------- > > (Updated April 11, 2016, 2:33 p.m.) > > > Review request for Ambari, Sumit Mohanty and Vitalyi Brodetskyi. > > > Bugs: AMBARI-15762 > https://issues.apache.org/jira/browse/AMBARI-15762 > > > Repository: ambari > > > Description > ------- > > **Problem** > Ambari executes component installs in parallel on a host. Each install process > in the post-install phase executes certain shared/common configuration > directory related steps (stacks/HDP/2.0.6/hooks/after- > INSTALL/scripts/shared_initialization.py:link_configs() -> ... -> /resource_ma > nagement/libraries/functions/conf_select.py:convert_conf_directories_to_symlin > ks() ) on the directories listed in conf_select.py:PACKAGE_DIRS > > The common configuration directory related processing fail in they are kicked > of in the same time for the same directory. > > In the case the following code snippet is executed in the same time by > multiple component installs > each will see the `backup_dir` as non existent and will try to execute the > "cp" command. One will succeed while the others fail. > > > > > Execute(("cp", "-R", "-p", old_conf, backup_dir), not_if = format("test > -e {backup_dir}"), sudo = True) > > > Similar behaviour is seen for the following code snippet: > > > > > Execute(as_sudo(["cp", "-R", "-p", os.path.join(old_conf, "*"), > versioned_conf], auto_escape=False), only_if = format("ls -d {old_conf}/*")) > > > **Possible solution** > Use the linux "lockfile" command or "mkdir /var/lock/mylock" (see example of > the mkdir solution here: <http://wiki.bash-hackers.org/howto/mutex>) > > > Diffs > ----- > > ambari-agent/src/main/python/ambari_agent/CustomServiceOrchestrator.py > 1bc045c > ambari-agent/src/test/python/ambari_agent/TestCustomServiceOrchestrator.py > 884612b > > ambari-agent/src/test/python/resource_management/TestFcntlBasedProcessLock.py > PRE-CREATION > > ambari-common/src/main/python/resource_management/libraries/functions/fcntl_based_process_lock.py > PRE-CREATION > > ambari-server/src/main/resources/stacks/HDP/2.0.6/hooks/after-INSTALL/scripts/params.py > 9f4971d > > ambari-server/src/main/resources/stacks/HDP/2.0.6/hooks/after-INSTALL/scripts/shared_initialization.py > c772ddb > > Diff: https://reviews.apache.org/r/45872/diff/ > > > Testing > ------- > > mvn clean test > > > Thanks, > > Andrew Onischuk > >
