Paul - I am sorry you are affected by this issue, but there isn't much we can do when the external repositories are unavailable, except fail fast. Any other suggestions?
I know you are using a custom AMI - what if you install the JDK by default and override the install_java function with an empty one?

-- Andrei Savu

On Sat, Nov 12, 2011 at 9:20 PM, Paul Baclace <[email protected]> wrote:

> HOLDING dpkg lock::
>
> 0 S root 961 762 0 80 0 - 6669 poll_s 18:29 ? 00:00:00 apt-get -y install sun-java6-jdk
>
> (still holding after 45 minutes.)
>
> Root cause of problem is: The jdk installs are failing, timeout occurs, a
> retry (or just marching on) succeeds eventually, but that leaves dpkg
> locked, so no further installs occur.
>
> I launched 21 broken clusters this morning... :^(
>
> Paul
>
> On 20111112 11:02 , Paul Baclace wrote:
>
> Here is a guess: a remote depo went missing during an install, and the
> package system was left in a locked state, never to be cleared again.
>
> What if Whirr forced the dpkg lock clear? Does it rely on that lock for
> serialization?
>
> Paul
>
> On 20111112 10:44 , Paul Baclace wrote:
>
> I am seeing this error, not due to any change I made:
>
> E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
> E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?
>
> What causes this intermittent problem? At the moment, it is very
> repeatable.
>
> Paul
>
> On 20111111 22:23 , Andrei Savu wrote:
>
> Can you make the S3 files public? Is this happening on all machines?
>
> You should probably consider
> using whirr.instance-templates-max-percent-failures as described here:
> http://whirr.apache.org/docs/0.6.0/configuration-guide.html
>
> Cheers,
>
> -- Andrei Savu / andreisavu.ro
>
> On Sat, Nov 12, 2011 at 2:22 AM, Arun Ramakrishnan <[email protected]> wrote:
>
>> Guys,
>>
>> It looks like the apt hadoop packages aren't getting installed. Any ideas ?
>>
>> ###################################################
>>
>> 2011-11-11 12:31:31,893 DEBUG [jclouds.compute] (user thread 6) << stderr from jclouds-script-1321043482986 as [email protected]
>> sed: can't read /etc/hadoop-0.20/conf.dist/hadoop-env.sh: No such file or directory
>> sed: can't read /etc/hadoop-0.20/conf.dist/hadoop-env.sh: No such file or directory
>> chgrp: invalid group: `hadoop'
>> chgrp: invalid group: `hadoop'
>> E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
>> E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?
>> hadoop-0.20-datanode: unrecognized service
>> E: Could not get lock /var/lib/dpkg/lock - open (11: Resource temporarily unavailable)
>> E: Unable to lock the administration directory (/var/lib/dpkg/), is another process using it?
>> hadoop-0.20-tasktracker: unrecognized service
>>
>> ##################################################
>>
>> I am using a binaries that i built form 0.7 a few weeks back.
>>
>> Full log : http://incentica-public.s3.amazonaws.com/whirr-ccore44.log
>> Config : http://incentica-public.s3.amazonaws.com/whirr_cdh.properties
>>
>> This seems to happen non-deterministically and more so for larger
>> clusters 10+
>>
>> thanks
>> Arun
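
For reference, the install_java override suggested at the top of this thread could look roughly like the sketch below. It assumes the JDK is already baked into the custom AMI, and it assumes the file is picked up as a replacement for the stock function (for example from a functions/install_java.sh file under the Whirr installation directory; that location is an assumption here, not something confirmed in this thread). A shell function body cannot be literally empty, so the "empty" version just reports that it is skipping the install and returns success.

```shell
# Hypothetical override of install_java (the file location is an assumption).
# Assumes the JDK is preinstalled on the custom AMI, so no apt-get call is
# made and the dpkg lock is never taken by a JDK install.
install_java() {
  echo "install_java: JDK assumed preinstalled on the AMI, skipping install"
  return 0
}
```

With an override like this, the step that was hanging on "apt-get -y install sun-java6-jdk" is never run, so a missing external repository can no longer leave dpkg locked during the JDK install. Other package installs that hit the same repositories would still be affected.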
