lock"

Paul Baclace Mon, 03 Oct 2011 13:23:21 -0700

Two runs of whirr on EC2 yesterday randomly failed to install Hadoop components. First it occurred on the master node, but when it occurred in one slave and not another, I could find the diff of the /tmp/logs/ from jclouds. In a third run, everything worked fine. Same scripts driving whirr, same AMI, same number of nodes, same region, etc. Snippets of /tmp/logs/stderr.log shown below indicate that apt-get update had "Could not get lock /var/lib/dpkg/lock" on one slave, but not another.


This is a serious reliability issue.  What is non-deterministic here?


Paul

------------ slave 1 -------------------
+ register_cloudera_repo
+ which dpkg
+ cat
+ curl -s http://archive.cloudera.com/debian/archive.key
+ sudo apt-key add -
+ sudo apt-get update

E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?

+ which dpkg
+ apt-get update

E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?

+ apt-get -y install hadoop-0.20

-------------- slave 2 ---------------
+ register_cloudera_repo
+ which dpkg
+ cat
+ curl -s http://archive.cloudera.com/debian/archive.key
+ sudo apt-key add -
+ sudo apt-get update
+ which dpkg
+ apt-get update
+ apt-get -y install hadoop-0.20
dpkg-preconfigure: unable to re-open stdin:
+ cp -r /etc/hadoop-0.20/conf.empty /etc/hadoop-0.20/conf.dist

+ update-alternatives --install /etc/hadoop-0.20/conf hadoop-0.20-conf /etc/hadoop-0.20/conf.dist 90
+ install_cdh_hbase -c aws-ec2 -u http://apache.cs.utah.edu/hbase/hbase-0.90.3/hbase-0.90.3.tar.gz


-------------
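
The slave 1 trace shows the script carrying on to "apt-get -y install hadoop-0.20" after both "apt-get update" calls failed. If the cause is some other process briefly holding the dpkg lock (the logs above do not confirm this), one possible mitigation is to retry the apt-get steps instead of running them once. The following is only a minimal sketch: the retry helper is hypothetical and not part of whirr or jclouds, the attempt count and delay are arbitrary, and it assumes apt-get exits non-zero when it cannot take the lock.

```shell
#!/bin/sh
# Hypothetical helper (not part of whirr or jclouds): run a command up to
# max_attempts times, sleeping delay seconds between failed attempts.
retry() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    # Stop as soon as the command succeeds.
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Intended use in the install script (assumed, untested on EC2):
#   retry 10 5 sudo apt-get update
#   retry 10 5 sudo apt-get -y install hadoop-0.20
```

This only papers over the symptom; it does not explain which process holds the lock.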
