[Bug 1195524] Re: race condition / transient failure to provision

2013-11-21 Thread Ben Howard
** Changed in: walinuxagent (Ubuntu Precise)
   Status: Fix Committed = Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-10-24 Thread Launchpad Bug Tracker
This bug was fixed in the package walinuxagent - 1.3.2-0ubuntu2~13.04.1

---
walinuxagent (1.3.2-0ubuntu2~13.04.1) raring-proposed; urgency=low

  * Backport of 1.3.2-0ubuntu5 from 13.10
* disable ephemeral disk formating by default (LP: #1231490)
* debian/patches/shadow_permissions.patch: apply the appropriate
  permissions to /etc/shadow (LP: #1188820).
* debian/patches/verbose_logging.patch: use the appropriate log
  faculty when using verbose logging (LP: #1193404).
* Mark bugs fixed in 1.3.2-0ubuntu3:
  debian/patches/config_for_cloud-init.patch:
  - fix for race condition between cloud-init and waagent  (LP: #1195524)
  - mount resource disk on /mnt (LP: #1193380)
  - move walinuxagent init functionality to cloud-init (LP: #1037723)
  * Add requirement of cloud-init (LP: #1037723).
 -- Ben Howard ben.how...@ubuntu.com   Thu, 10 Oct 2013 09:24:46 -0600

** Changed in: walinuxagent (Ubuntu Raring)
   Status: Fix Committed = Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-10-24 Thread Launchpad Bug Tracker
This bug was fixed in the package walinuxagent - 1.3.2-0ubuntu2~13.04.1

---
walinuxagent (1.3.2-0ubuntu2~13.04.1) raring-proposed; urgency=low

  * Backport of 1.3.2-0ubuntu5 from 13.10
* disable ephemeral disk formating by default (LP: #1231490)
* debian/patches/shadow_permissions.patch: apply the appropriate
  permissions to /etc/shadow (LP: #1188820).
* debian/patches/verbose_logging.patch: use the appropriate log
  faculty when using verbose logging (LP: #1193404).
* Mark bugs fixed in 1.3.2-0ubuntu3:
  debian/patches/config_for_cloud-init.patch:
  - fix for race condition between cloud-init and waagent  (LP: #1195524)
  - mount resource disk on /mnt (LP: #1193380)
  - move walinuxagent init functionality to cloud-init (LP: #1037723)
  * Add requirement of cloud-init (LP: #1037723).
 -- Ben Howard ben.how...@ubuntu.com   Thu, 10 Oct 2013 09:24:46 -0600

** Changed in: walinuxagent (Ubuntu Raring)
   Status: Fix Committed = Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-10-23 Thread Ben Howard
** Changed in: walinuxagent (Ubuntu Precise)
   Status: New = Fix Committed

** Changed in: walinuxagent (Ubuntu Precise)
   Importance: Undecided = Medium

** Changed in: walinuxagent (Ubuntu Raring)
   Importance: Undecided = Medium

** Changed in: walinuxagent (Ubuntu)
 Assignee: (unassigned) = Ben Howard (utlemming)

** Changed in: walinuxagent (Ubuntu Precise)
 Assignee: (unassigned) = Ben Howard (utlemming)

** Changed in: walinuxagent (Ubuntu Raring)
 Assignee: (unassigned) = Ben Howard (utlemming)

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-10-23 Thread Ben Howard
** Changed in: walinuxagent (Ubuntu Precise)
   Status: New = Fix Committed

** Changed in: walinuxagent (Ubuntu Precise)
   Importance: Undecided = Medium

** Changed in: walinuxagent (Ubuntu Raring)
   Importance: Undecided = Medium

** Changed in: walinuxagent (Ubuntu)
 Assignee: (unassigned) = Ben Howard (utlemming)

** Changed in: walinuxagent (Ubuntu Precise)
 Assignee: (unassigned) = Ben Howard (utlemming)

** Changed in: walinuxagent (Ubuntu Raring)
 Assignee: (unassigned) = Ben Howard (utlemming)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-10-22 Thread Stéphane Graber
Hello Scott, or anyone else affected,

Accepted walinuxagent into raring-proposed. The package will build now
and be available at
http://launchpad.net/ubuntu/+source/walinuxagent/1.3.2-0ubuntu2~13.04.1
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to
enable and use -proposed.  Your feedback will aid us getting this update
out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed.  In either case, details of your testing will help
us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Also affects: walinuxagent (Ubuntu Precise)
   Importance: Undecided
   Status: New

** Also affects: walinuxagent (Ubuntu Raring)
   Importance: Undecided
   Status: New

** Changed in: walinuxagent (Ubuntu Raring)
   Status: New = Fix Committed

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-10-22 Thread Stéphane Graber
Hello Scott, or anyone else affected,

Accepted walinuxagent into raring-proposed. The package will build now
and be available at
http://launchpad.net/ubuntu/+source/walinuxagent/1.3.2-0ubuntu2~13.04.1
in a few hours, and then in the -proposed repository.

Please help us by testing this new package.  See
https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to
enable and use -proposed.  Your feedback will aid us getting this update
out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug,
mentioning the version of the package you tested, and change the tag
from verification-needed to verification-done. If it does not fix the
bug for you, please add a comment stating that, and change the tag to
verification-failed.  In either case, details of your testing will help
us make a better decision.

Further information regarding the verification process can be found at
https://wiki.ubuntu.com/QATeam/PerformingSRUVerification .  Thank you in
advance!

** Also affects: walinuxagent (Ubuntu Precise)
   Importance: Undecided
   Status: New

** Also affects: walinuxagent (Ubuntu Raring)
   Importance: Undecided
   Status: New

** Changed in: walinuxagent (Ubuntu Raring)
   Status: New = Fix Committed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-08-19 Thread Ben Howard
Makring verification-done as part of SRU testing.

** Tags added: verification-done

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-08-19 Thread Ben Howard
Makring verification-done as part of SRU testing.

** Tags added: verification-done

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-08-07 Thread Ben Howard
** Description changed:

+ [Impact]: 
+ WALinuxAgent currently handles the provisioning of Ubuntu. This bug is fixed 
by Bug #1037723, which causes provisioning to be handled by Cloud-init. 
+ 
+ [Regression] : By moving provisioning functions to cloud-init, the
+ regression potential is low. Further, since this handled by a
+ configuration change in WALinuxAgent, current users will not be
+ affected.
+ 
+ [Test Case]: Boot on Windows Azure and make sure that the system
+ provisioned.
+ 
+ 
+ [ORIGINAL REPORT]
+ 
  when starting instances on azure, there is  a fair chance (seems more
  likely for some users thanothers) that the instance will fail to reach
  provisioned state.
  
  I do not have a a good guess as to why this is.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 13.04
  Package: walinuxagent 1.3.2-0ubuntu1
  ProcVersionSignature: Ubuntu 3.8.0-25.37-generic 3.8.13
  Uname: Linux 3.8.0-25-generic x86_64
  ApportVersion: 2.9.2-0ubuntu8.1
  Architecture: amd64
  Date: Fri Jun 28 01:29:25 2013
  MarkForUpload: True
  ProcEnviron:
-  TERM=screen
-  PATH=(custom, no user)
-  LANG=en_US.UTF-8
-  SHELL=/bin/bash
+  TERM=screen
+  PATH=(custom, no user)
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
  SourcePackage: walinuxagent
  UpgradeStatus: No upgrade log present (probably fresh install)

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-08-07 Thread Ben Howard
** Description changed:

+ [Impact]: 
+ WALinuxAgent currently handles the provisioning of Ubuntu. This bug is fixed 
by Bug #1037723, which causes provisioning to be handled by Cloud-init. 
+ 
+ [Regression] : By moving provisioning functions to cloud-init, the
+ regression potential is low. Further, since this handled by a
+ configuration change in WALinuxAgent, current users will not be
+ affected.
+ 
+ [Test Case]: Boot on Windows Azure and make sure that the system
+ provisioned.
+ 
+ 
+ [ORIGINAL REPORT]
+ 
  when starting instances on azure, there is  a fair chance (seems more
  likely for some users thanothers) that the instance will fail to reach
  provisioned state.
  
  I do not have a a good guess as to why this is.
  
  ProblemType: Bug
  DistroRelease: Ubuntu 13.04
  Package: walinuxagent 1.3.2-0ubuntu1
  ProcVersionSignature: Ubuntu 3.8.0-25.37-generic 3.8.13
  Uname: Linux 3.8.0-25-generic x86_64
  ApportVersion: 2.9.2-0ubuntu8.1
  Architecture: amd64
  Date: Fri Jun 28 01:29:25 2013
  MarkForUpload: True
  ProcEnviron:
-  TERM=screen
-  PATH=(custom, no user)
-  LANG=en_US.UTF-8
-  SHELL=/bin/bash
+  TERM=screen
+  PATH=(custom, no user)
+  LANG=en_US.UTF-8
+  SHELL=/bin/bash
  SourcePackage: walinuxagent
  UpgradeStatus: No upgrade log present (probably fresh install)

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1195524] Re: race condition / transient failure to provision

2013-07-23 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/walinuxagent

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-23 Thread Launchpad Bug Tracker
This bug was fixed in the package walinuxagent - 1.3.2-0ubuntu4

---
walinuxagent (1.3.2-0ubuntu4) saucy; urgency=low

  * debian/patches/shadow_permissions.patch: apply the appropriate
permissions to /etc/shadow (LP: #1188820).
  * debian/patches/verbose_logging.patch: use the appropriate log
faculty when using verbose logging (LP: #1193404).
  * Mark bugs fixed in 1.3.2-0ubuntu3:
debian/patches/config_for_cloud-init.patch:
- fix for race condition between cloud-init and waagent  (LP: #1195524)
- mount resource disk on /mnt (LP: #1193380)
- move walinuxagent init functionality to cloud-init (LP: #1037723)
 -- Ben Howard ben.how...@ubuntu.com   Tue, 23 Jul 2013 09:43:40 -0600

** Changed in: walinuxagent (Ubuntu)
   Status: Confirmed = Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-23 Thread Launchpad Bug Tracker
** Branch linked: lp:ubuntu/walinuxagent

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-23 Thread Launchpad Bug Tracker
This bug was fixed in the package walinuxagent - 1.3.2-0ubuntu4

---
walinuxagent (1.3.2-0ubuntu4) saucy; urgency=low

  * debian/patches/shadow_permissions.patch: apply the appropriate
permissions to /etc/shadow (LP: #1188820).
  * debian/patches/verbose_logging.patch: use the appropriate log
faculty when using verbose logging (LP: #1193404).
  * Mark bugs fixed in 1.3.2-0ubuntu3:
debian/patches/config_for_cloud-init.patch:
- fix for race condition between cloud-init and waagent  (LP: #1195524)
- mount resource disk on /mnt (LP: #1193380)
- move walinuxagent init functionality to cloud-init (LP: #1037723)
 -- Ben Howard ben.how...@ubuntu.com   Tue, 23 Jul 2013 09:43:40 -0600

** Changed in: walinuxagent (Ubuntu)
   Status: Confirmed = Fix Released

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-08 Thread Scott Moser
I'm fairly certain there is still a race condition that I described comment 8.
Please raise the hostname issue in another bug.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-08 Thread Scott Moser
I'm fairly certain there is still a race condition that I described comment 8.
Please raise the hostname issue in another bug.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-01 Thread Raphaël Badin
I did some testing and it turns out the hostname size seems to be the
problem!  I started a bunch of machines with a hostname of 64 characters
and a bunch of machines with a hostname of 25 characters (with the
script I linked above): all the machines with a hostname of 25 ended up
in the state Running and all the others were stuck Commissioning.
Just to be sure, I had someone else confirm that behavior.

I did some more testing and I can now say that Azure is ok with
hostnames up to 55 characters.  A machine with a hostname of size 56 or
more won't get out of the commissioning state.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-01 Thread Raphaël Badin
s/Commissioning/Provisioning/ sorry about that.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-01 Thread Raphaël Badin
I did some testing and it turns out the hostname size seems to be the
problem!  I started a bunch of machines with a hostname of 64 characters
and a bunch of machines with a hostname of 25 characters (with the
script I linked above): all the machines with a hostname of 25 ended up
in the state Running and all the others were stuck Commissioning.
Just to be sure, I had someone else confirm that behavior.

I did some more testing and I can now say that Azure is ok with
hostnames up to 55 characters.  A machine with a hostname of size 56 or
more won't get out of the commissioning state.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-07-01 Thread Raphaël Badin
s/Commissioning/Provisioning/ sorry about that.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-29 Thread Raphaël Badin
 2013/06/28 15:08:12 EnvMonitor: Detected host name change: ubuntu - 
 gwaclhostblkhljy4re3yp9swkdwp63kswkss9bqhn0zm3f3gunipzu5vwdr8qzw
 2013/06/28 15:08:12 Setting host name: 
 gwaclhostblkhljy4re3yp9swkdwp63kswkss9bqhn0zm3f3gunipzu5vwdr8qzw
 Is it expected?

Yes, this machine was created using (a slightly modified version of)
this script: http://bazaar.launchpad.net/~gwacl-
hackers/gwacl/trunk/view/head:/example/management/run.go

The name is randomly generated.  It is a bit long (64 characters) but I
guess this cannot be the cause of our problem.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-29 Thread Raphaël Badin
The doc (http://msdn.microsoft.com/en-
us/library/windowsazure/jj157194.aspx) explicitly says the hostname can
get 64 characters long: HostName: Required. Specifies the host name for
the VM. Host names are ASCII character strings 1 to 64 characters in
length. Used with the LinuxProvisioningConfigurationSet configuration
set.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-29 Thread Stephen A. Zarkos
Hello!

Thank you for confirming that.  I think the hostname just looked odd so
just wanted to confirm as we don't have access to this environment.

If the instance is still available we really need more logs to figure
out what went wrong.  This will help determine if there's something
wrong on the VM (disk or network issue) or a timing issue that we may be
able work around.

We know the timeout set on the socket by httplib is the system default
timeout for sockets, so if the following http request from
ReportRoleProperties() was 'blocked' it would throw an exception on a
timeout or other IO error.  However, clearly the EnvMonitor was not able
to complete in this case.  Most likely the ifup call could not complete
for some reason.  If this is not a reliable operation then perhaps
disabling the HostMonitor a good solution, of course the VMs other
services may not be aware of a hostname change.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-29 Thread Raphaël Badin
 2013/06/28 15:08:12 EnvMonitor: Detected host name change: ubuntu - 
 gwaclhostblkhljy4re3yp9swkdwp63kswkss9bqhn0zm3f3gunipzu5vwdr8qzw
 2013/06/28 15:08:12 Setting host name: 
 gwaclhostblkhljy4re3yp9swkdwp63kswkss9bqhn0zm3f3gunipzu5vwdr8qzw
 Is it expected?

Yes, this machine was created using (a slightly modified version of)
this script: http://bazaar.launchpad.net/~gwacl-
hackers/gwacl/trunk/view/head:/example/management/run.go

The name is randomly generated.  It is a bit long (64 characters) but I
guess this cannot be the cause of our problem.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-29 Thread Raphaël Badin
The doc (http://msdn.microsoft.com/en-
us/library/windowsazure/jj157194.aspx) explicitly says the hostname can
get 64 characters long: HostName: Required. Specifies the host name for
the VM. Host names are ASCII character strings 1 to 64 characters in
length. Used with the LinuxProvisioningConfigurationSet configuration
set.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-29 Thread Stephen A. Zarkos
Hello!

Thank you for confirming that.  I think the hostname just looked odd so
just wanted to confirm as we don't have access to this environment.

If the instance is still available we really need more logs to figure
out what went wrong.  This will help determine if there's something
wrong on the VM (disk or network issue) or a timing issue that we may be
able work around.

We know the timeout set on the socket by httplib is the system default
timeout for sockets, so if the following http request from
ReportRoleProperties() was 'blocked' it would throw an exception on a
timeout or other IO error.  However, clearly the EnvMonitor was not able
to complete in this case.  Most likely the ifup call could not complete
for some reason.  If this is not a reliable operation then perhaps
disabling the HostMonitor a good solution, of course the VMs other
services may not be aware of a hostname change.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Scott Moser
** Attachment added: waagent.log and waagent dir.
   
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+attachment/3717115/+files/waagent-info.tar

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Scott Moser
ok. so i have 2 other systems that are showing this failure now.
I was able to ssh into them, though.
walinux-agent had provisioned the user, populated ssh keys and then also 
started sshd (which it actually should not do).

it shouldn't start sshd because it is possibly doing that before sshd
has the required facilities up (sshd starts on 'filesystem or runlevel
[2345]).

that wouldn't seem to be the problem here, and actually has allowed us
into the instance to debug.

$ ls -tr --full-time /var/log/upstart/*.log
-rw-r- 1 root root  46 2013-06-28 15:07:55.843772000 + 
/var/log/upstart/container-detect.log
-rw-r- 1 root root  95 2013-06-28 15:07:56.095772000 + 
/var/log/upstart/console-setup.log
-rw-r- 1 root root 282 2013-06-28 15:07:56.183772000 + 
/var/log/upstart/procps-virtual-filesystems.log
-rw-r- 1 root root 118 2013-06-28 15:07:56.311772000 + 
/var/log/upstart/module-init-tools.log
-rw-r- 1 root root 282 2013-06-28 15:07:58.310376600 + 
/var/log/upstart/procps-static-network-up.log
-rw-r- 1 root root 110 2013-06-28 15:08:02.993943800 + 
/var/log/upstart/udev-fallback-graphics.log
-rw-r- 1 root root 158 2013-06-28 15:08:09.876561300 + 
/var/log/upstart/ureadahead-other.log
-rw-r- 1 root root  64 2013-06-28 15:09:30.346411301 + 
/var/log/upstart/rsyslog.log
-rw-r- 1 root root  64 2013-06-28 15:09:30.370411301 + 
/var/log/upstart/dbus.log

$ cat /proc/mounts 
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=335336k,nr_inodes=83834,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,relatime,size=137672k,mode=755 0 0
/dev/disk/by-uuid/65a0705a-7afe-482f-917d-c59e75cf0c52 / ext4 
rw,relatime,user_xattr,barrier=1,data=ordered,discard 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
/dev/sdb1 /mnt/resource ext4 rw,relatime,user_xattr,barrier=1,data=ordered 0 0

$ cat /etc/fstab
UUID=65a0705a-7afe-482f-917d-c59e75cf0c52   /ext4   
defaults,discard   0 0

mountall is not running.
$ sudo status mountall
mountall stop/waiting

$ ls -altr /var/run/landscape
ls: cannot access /var/run/landscape: No such file or directory

$ runlevel 
N 2

$ ps axw
..
root   389 1  0 15:07 ?00:00:00 upstart-udev-bridge --daemon
root   391 1  0 15:07 ?00:00:00 /sbin/udevd --daemon
root   508 1  0 15:07 ?00:00:00 /usr/bin/python 
/usr/sbin/waagent -daemo
root   574   391  0 15:07 ?00:00:00 /sbin/udevd --daemon
root   577   391  0 15:07 ?00:00:00 /sbin/udevd --daemon
root   598 2  0 15:07 ?00:00:00 [kpsmoused]
root   633 1  0 15:07 ?00:00:00 upstart-socket-bridge --daemon
root   906 2  0 15:08 ?00:00:00 [jbd2/sdb1-8]
root   907 2  0 15:08 ?00:00:00 [ext4-dio-unwrit]
root   931   508  0 15:08 ?00:00:00 [sh] defunct
root  1015 1  0 15:08 ?00:00:00 dhclient3 -e IF_METRIC=100 -pf 
/var/run/
root  1025 1  0 15:08 ?00:00:00 /bin/sh 
/etc/network/if-up.d/ntpdate
root  1028  1025  0 15:08 ?00:00:00 lockfile-create 
/var/lock/ntpdate-ifup
root  1121 1  0 15:09 ?00:00:00 /usr/sbin/sshd -D
syslog1137 1  0 15:09 ?00:00:00 rsyslogd -c5
102   1142 1  0 15:09 ?00:00:00 dbus-daemon --system --fork 
--activation
root  1200 1  0 15:09 tty4 00:00:00 /sbin/getty -8 38400 tty4
root  1207 1  0 15:09 tty5 00:00:00 /sbin/getty -8 38400 tty5
root  1214 1  0 15:09 tty2 00:00:00 /sbin/getty -8 38400 tty2
root  1215 1  0 15:09 tty3 00:00:00 /sbin/getty -8 38400 tty3
root  1218 1  0 15:09 tty6 00:00:00 /sbin/getty -8 38400 tty6
root  1248 1  0 15:09 ?00:00:00 
/usr/sbin/hv_kvp_daemon_3.2.0-48-virtual
root  1250 1  0 15:09 ?00:00:00 acpid -c /etc/acpi/events -s 
/var/run/ac
root  1251 1  0 15:09 ?00:00:00 cron
daemon1252 1  0 15:09 ?00:00:00 atd
root  1265 1  0 15:09 tty1 00:00:00 /sbin/getty -8 38400 tty1
whoopsie  1279 1  0 15:09 ?00:00:00 whoopsie
root  1308  1121  0 15:23 ?00:00:00 sshd: test [priv]
test  1412  1308  0 15:24 ?00:00:00 sshd: test@pts/0
test  1413  1412  0 15:24 pts/000:00:01 -bash
root  1755 2  0 15:33 ?00:00:00 [kworker/0:0]
root  2054 2  0 15:38 ?00:00:00 [kworker/0:2]
root  2274 2  0 15:43 ?00:00:00 [kworker/0:1]
test  2450  1413  0 15:45 pts/000:00:00 ps -ef

Note, it seems that 

Re: [Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread David Medberry
I noticed while doing the kdump debugging that you can ssh into a
cloud-image instance very early (well before the console allows login) on a
kvm based cloud-image instance. I was surprised that service was available
before console login was available. Not sure it is related to what your are
seeing on azure though.

(During kdump there is about a 40 second delay while the dump is
post-processed before console login is available--but ssh login is already
available and you can 'watch' the kdump post processing.)  This means (at
least to me) that ssh is on fairly early.


On Fri, Jun 28, 2013 at 10:06 AM, Scott Moser smo...@ubuntu.com wrote:

 ** Attachment added: waagent.log and waagent dir.

 https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+attachment/3717115/+files/waagent-info.tar

 --
 You received this bug notification because you are a member of Canonical
 Microsoft Azure Collaboration, which is subscribed to walinuxagent in
 Ubuntu.
 Matching subscriptions: walinxuagnet bugs
 https://bugs.launchpad.net/bugs/1195524

 Title:
   race condition / transient failure to provision

 To manage notifications about this bug go to:

 https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions



-- 
-dave

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Scott Moser
ok... i think ihave a reasonable description of what caused this specific hang.
The last entry in this /var/log/waagent.log was:
2013/06/28 14:50:12 Provisioning image using OVF settings in the DVD.
2013/06/28 14:50:12 Resource disk (/dev/sdb1) is mounted at /mnt/resource with 
fstype ext4
2013/06/28 14:50:14 Created user account: test
2013/06/28 14:50:15 EnvMonitor: Detected host name change: ubuntu - 
gwaclhost24lm4crqqc7kqfl9hya2ieuvxz9fvdqpn6ipfgdi4gh1p7snuwjezcw
2013/06/28 14:50:15 Setting host name: 
gwaclhost24lm4crqqc7kqfl9hya2ieuvxz9fvdqpn6ipfgdi4gh1p7snuwjezcw

In a *good* run, the entry after Created user account should be somethin
glike:

2013/06/28 16:38:40 Provisioning image using OVF settings in the DVD.
2013/06/28 16:38:40 Disabled SSH password-based authentication methods.
2013/06/28 16:38:41 Created user account: smoser
2013/06/28 16:38:42 EnvMonitor: Detected host name change: ubuntu - smoser0628p
2013/06/28 16:38:42 Setting host name: smoser0628p
2013/06/28 16:38:55 ERROR:CalledProcessError.  Error Code: 1
2013/06/28 16:38:55 ERROR:CalledProcessError.  Command string: service ssh 
status | grep running
2013/06/28 16:38:55 ERROR:CalledProcessError.  Command result: 
2013/06/28 16:38:55 Posted Role Properties. 
CertificateThumbprint=075533b9075b4130651b1f74c451ca9b
2013/06/28 16:38:55 Root password deleted.


Note, the ERROR information is coming from a differen thread in the process. 
you can just ignore it.
The key is  that it should be Posting Role Properties, but this instnace didn't 
get there.

an 'strace' of the pid, shows:
$ sudo strace -p 498
Process 498 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 321648}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0}

the other key bit of info here is that 'Setting host name' comes from
'UpdateAndPublishHostName', which , amoung other questionable
operations, does:

for ethernetInterface in PossibleEthernetInterfaces:
Run(ifdown  + ethernetInterface +   ifup  + 
ethernetInterface,chk_err=False) # We supress error logging on error.
self.RestoreRoutes()


So what I think happened here, is that the attempt to ReportRoleProperties 
(Posting Role Properties) ends up blocking/hanging in 
httplib.HTTPConnection(self.Endpoint), where no 'timeout' is specified.

One good thing is that this can be disabled via:
 Provisioning.MonitorHostName=n


** Changed in: walinuxagent (Ubuntu)
   Importance: Undecided = High

** Changed in: walinuxagent (Ubuntu)
   Status: New = Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Long Li
Can you also share the kernel logs, and all the files in
/var/lib/waagent?

In the waagent.log, the new host name seems uncommon:

2013/06/28 15:08:12 EnvMonitor: Detected host name change: ubuntu - 
gwaclhostblkhljy4re3yp9swkdwp63kswkss9bqhn0zm3f3gunipzu5vwdr8qzw
2013/06/28 15:08:12 Setting host name: 
gwaclhostblkhljy4re3yp9swkdwp63kswkss9bqhn0zm3f3gunipzu5vwdr8qzw

Is it expected?

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Stephen A. Zarkos
Aside from the kernel logs, please also include /var/log/syslog and any other 
relevant logs.  Long, I believe the contents of /var/lib/waagent were posted 
here:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+attachment/3717115/+files/waagent-info.tar

There is retry logic in httplib, and there would be logs if we
encountered issues/timeouts from ReportRoleProperties(), so I'm not sure
if the VM even got that far.  I see we also have a defunct 'sh' process,
possibly something suffered earlier when calling ifup.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Scott Moser
** Attachment added: waagent.log and waagent dir.
   
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+attachment/3717115/+files/waagent-info.tar

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Scott Moser
ok. so i have 2 other systems that are showing this failure now.
I was able to ssh into them, though.
walinux-agent had provisioned the user, populated ssh keys and then also 
started sshd (which it actually should not do).

it shouldn't start sshd because it is possibly doing that before sshd
has the required facilities up (sshd starts on 'filesystem or runlevel
[2345]).

that wouldn't seem to be the problem here, and actually has allowed us
into the instance to debug.

$ ls -tr --full-time /var/log/upstart/*.log
-rw-r- 1 root root  46 2013-06-28 15:07:55.843772000 + 
/var/log/upstart/container-detect.log
-rw-r- 1 root root  95 2013-06-28 15:07:56.095772000 + 
/var/log/upstart/console-setup.log
-rw-r- 1 root root 282 2013-06-28 15:07:56.183772000 + 
/var/log/upstart/procps-virtual-filesystems.log
-rw-r- 1 root root 118 2013-06-28 15:07:56.311772000 + 
/var/log/upstart/module-init-tools.log
-rw-r- 1 root root 282 2013-06-28 15:07:58.310376600 + 
/var/log/upstart/procps-static-network-up.log
-rw-r- 1 root root 110 2013-06-28 15:08:02.993943800 + 
/var/log/upstart/udev-fallback-graphics.log
-rw-r- 1 root root 158 2013-06-28 15:08:09.876561300 + 
/var/log/upstart/ureadahead-other.log
-rw-r- 1 root root  64 2013-06-28 15:09:30.346411301 + 
/var/log/upstart/rsyslog.log
-rw-r- 1 root root  64 2013-06-28 15:09:30.370411301 + 
/var/log/upstart/dbus.log

$ cat /proc/mounts 
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=335336k,nr_inodes=83834,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,relatime,size=137672k,mode=755 0 0
/dev/disk/by-uuid/65a0705a-7afe-482f-917d-c59e75cf0c52 / ext4 
rw,relatime,user_xattr,barrier=1,data=ordered,discard 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
/dev/sdb1 /mnt/resource ext4 rw,relatime,user_xattr,barrier=1,data=ordered 0 0

$ cat /etc/fstab
UUID=65a0705a-7afe-482f-917d-c59e75cf0c52   /ext4   
defaults,discard   0 0

mountall is not running.
$ sudo status mountall
mountall stop/waiting

$ ls -altr /var/run/landscape
ls: cannot access /var/run/landscape: No such file or directory

$ runlevel 
N 2

$ ps axw
..
root   389 1  0 15:07 ?00:00:00 upstart-udev-bridge --daemon
root   391 1  0 15:07 ?00:00:00 /sbin/udevd --daemon
root   508 1  0 15:07 ?00:00:00 /usr/bin/python 
/usr/sbin/waagent -daemo
root   574   391  0 15:07 ?00:00:00 /sbin/udevd --daemon
root   577   391  0 15:07 ?00:00:00 /sbin/udevd --daemon
root   598 2  0 15:07 ?00:00:00 [kpsmoused]
root   633 1  0 15:07 ?00:00:00 upstart-socket-bridge --daemon
root   906 2  0 15:08 ?00:00:00 [jbd2/sdb1-8]
root   907 2  0 15:08 ?00:00:00 [ext4-dio-unwrit]
root   931   508  0 15:08 ?00:00:00 [sh] defunct
root  1015 1  0 15:08 ?00:00:00 dhclient3 -e IF_METRIC=100 -pf 
/var/run/
root  1025 1  0 15:08 ?00:00:00 /bin/sh 
/etc/network/if-up.d/ntpdate
root  1028  1025  0 15:08 ?00:00:00 lockfile-create 
/var/lock/ntpdate-ifup
root  1121 1  0 15:09 ?00:00:00 /usr/sbin/sshd -D
syslog1137 1  0 15:09 ?00:00:00 rsyslogd -c5
102   1142 1  0 15:09 ?00:00:00 dbus-daemon --system --fork 
--activation
root  1200 1  0 15:09 tty4 00:00:00 /sbin/getty -8 38400 tty4
root  1207 1  0 15:09 tty5 00:00:00 /sbin/getty -8 38400 tty5
root  1214 1  0 15:09 tty2 00:00:00 /sbin/getty -8 38400 tty2
root  1215 1  0 15:09 tty3 00:00:00 /sbin/getty -8 38400 tty3
root  1218 1  0 15:09 tty6 00:00:00 /sbin/getty -8 38400 tty6
root  1248 1  0 15:09 ?00:00:00 
/usr/sbin/hv_kvp_daemon_3.2.0-48-virtual
root  1250 1  0 15:09 ?00:00:00 acpid -c /etc/acpi/events -s 
/var/run/ac
root  1251 1  0 15:09 ?00:00:00 cron
daemon1252 1  0 15:09 ?00:00:00 atd
root  1265 1  0 15:09 tty1 00:00:00 /sbin/getty -8 38400 tty1
whoopsie  1279 1  0 15:09 ?00:00:00 whoopsie
root  1308  1121  0 15:23 ?00:00:00 sshd: test [priv]
test  1412  1308  0 15:24 ?00:00:00 sshd: test@pts/0
test  1413  1412  0 15:24 pts/000:00:01 -bash
root  1755 2  0 15:33 ?00:00:00 [kworker/0:0]
root  2054 2  0 15:38 ?00:00:00 [kworker/0:2]
root  2274 2  0 15:43 ?00:00:00 [kworker/0:1]
test  2450  1413  0 15:45 pts/000:00:00 ps -ef

Note, it seems that 

Re: [Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread David Medberry
I noticed while doing the kdump debugging that you can ssh into a
cloud-image instance very early (well before the console allows login) on a
kvm based cloud-image instance. I was surprised that service was available
before console login was available. Not sure it is related to what your are
seeing on azure though.

(During kdump there is about a 40 second delay while the dump is
post-processed before console login is available--but ssh login is already
available and you can 'watch' the kdump post processing.)  This means (at
least to me) that ssh is on fairly early.


On Fri, Jun 28, 2013 at 10:06 AM, Scott Moser smo...@ubuntu.com wrote:

 ** Attachment added: waagent.log and waagent dir.

 https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+attachment/3717115/+files/waagent-info.tar

 --
 You received this bug notification because you are a member of Canonical
 Microsoft Azure Collaboration, which is subscribed to walinuxagent in
 Ubuntu.
 Matching subscriptions: walinxuagnet bugs
 https://bugs.launchpad.net/bugs/1195524

 Title:
   race condition / transient failure to provision

 To manage notifications about this bug go to:

 https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions



-- 
-dave

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Scott Moser
ok... i think ihave a reasonable description of what caused this specific hang.
The last entry in this /var/log/waagent.log was:
2013/06/28 14:50:12 Provisioning image using OVF settings in the DVD.
2013/06/28 14:50:12 Resource disk (/dev/sdb1) is mounted at /mnt/resource with 
fstype ext4
2013/06/28 14:50:14 Created user account: test
2013/06/28 14:50:15 EnvMonitor: Detected host name change: ubuntu - 
gwaclhost24lm4crqqc7kqfl9hya2ieuvxz9fvdqpn6ipfgdi4gh1p7snuwjezcw
2013/06/28 14:50:15 Setting host name: 
gwaclhost24lm4crqqc7kqfl9hya2ieuvxz9fvdqpn6ipfgdi4gh1p7snuwjezcw

In a *good* run, the entry after Created user account should be somethin
glike:

2013/06/28 16:38:40 Provisioning image using OVF settings in the DVD.
2013/06/28 16:38:40 Disabled SSH password-based authentication methods.
2013/06/28 16:38:41 Created user account: smoser
2013/06/28 16:38:42 EnvMonitor: Detected host name change: ubuntu - smoser0628p
2013/06/28 16:38:42 Setting host name: smoser0628p
2013/06/28 16:38:55 ERROR:CalledProcessError.  Error Code: 1
2013/06/28 16:38:55 ERROR:CalledProcessError.  Command string: service ssh 
status | grep running
2013/06/28 16:38:55 ERROR:CalledProcessError.  Command result: 
2013/06/28 16:38:55 Posted Role Properties. 
CertificateThumbprint=075533b9075b4130651b1f74c451ca9b
2013/06/28 16:38:55 Root password deleted.


Note, the ERROR information is coming from a differen thread in the process. 
you can just ignore it.
The key is  that it should be Posting Role Properties, but this instnace didn't 
get there.

an 'strace' of the pid, shows:
$ sudo strace -p 498
Process 498 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 321648}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0}) = 0 (Timeout)
select(0, NULL, NULL, NULL, {1, 0}

the other key bit of info here is that 'Setting host name' comes from
'UpdateAndPublishHostName', which , amoung other questionable
operations, does:

for ethernetInterface in PossibleEthernetInterfaces:
Run(ifdown  + ethernetInterface +   ifup  + 
ethernetInterface,chk_err=False) # We supress error logging on error.
self.RestoreRoutes()


So what I think happened here, is that the attempt to ReportRoleProperties 
(Posting Role Properties) ends up blocking/hanging in 
httplib.HTTPConnection(self.Endpoint), where no 'timeout' is specified.

One good thing is that this can be disabled via:
 Provisioning.MonitorHostName=n


** Changed in: walinuxagent (Ubuntu)
   Importance: Undecided = High

** Changed in: walinuxagent (Ubuntu)
   Status: New = Confirmed

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Long Li
Can you also share the kernel logs, and all the files in
/var/lib/waagent?

In the waagent.log, the new host name seems uncommon:

2013/06/28 15:08:12 EnvMonitor: Detected host name change: ubuntu - 
gwaclhostblkhljy4re3yp9swkdwp63kswkss9bqhn0zm3f3gunipzu5vwdr8qzw
2013/06/28 15:08:12 Setting host name: 
gwaclhostblkhljy4re3yp9swkdwp63kswkss9bqhn0zm3f3gunipzu5vwdr8qzw

Is it expected?

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-28 Thread Stephen A. Zarkos
Aside from the kernel logs, please also include /var/log/syslog and any other 
relevant logs.  Long, I believe the contents of /var/lib/waagent were posted 
here:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+attachment/3717115/+files/waagent-info.tar

There is retry logic in httplib, and there would be logs if we
encountered issues/timeouts from ReportRoleProperties(), so I'm not sure
if the VM even got that far.  I see we also have a defunct 'sh' process,
possibly something suffered earlier when calling ifup.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-27 Thread Scott Moser
as I'm trying to debug this instance that failed for me, and then came up after 
a 'vm restart', there is no indication that / was ever mounted RW the first 
time.  Ie, there is no evidence in /var/log of *anything* having run.
cloud-init starts on mounted MOUNTPOINT=/
and logs pretty much immediately to /var/log/cloud-init.log
but there is no timestamps other than this boot.

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-27 Thread Scott Moser
given my previous comment, i would have suspected that there was disk or kernel 
failure.
however, ssh-keyscan seemed to indicate ssh was running:
$ ssh-keyscan us-west-1.cloudapp.net 
# us-west-1.cloudapp.net SSH-2.0-OpenSSH_6.1p1 Debian-4
Connection closed by 137.135.115.232

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


Re: [Bug 1195524] Re: race condition / transient failure to provision

2013-06-27 Thread David Medberry
You can get kdump working in Azure (not that you really need it for this)
with the attached patch.

Ref: http://support.microsoft.com/kb/2858695


On Thu, Jun 27, 2013 at 8:01 PM, Scott Moser smo...@ubuntu.com wrote:

 given my previous comment, i would have suspected that there was disk or
 kernel failure.
 however, ssh-keyscan seemed to indicate ssh was running:
 $ ssh-keyscan us-west-1.cloudapp.net
 # us-west-1.cloudapp.net SSH-2.0-OpenSSH_6.1p1 Debian-4
 Connection closed by 137.135.115.232

 --
 You received this bug notification because you are a member of Canonical
 Microsoft Azure Collaboration, which is subscribed to walinuxagent in
 Ubuntu.
 Matching subscriptions: walinxuagnet bugs
 https://bugs.launchpad.net/bugs/1195524

 Title:
   race condition / transient failure to provision

 To manage notifications about this bug go to:

 https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions



-- 
-dave


** Attachment added: blacklist.hyperv.kdump.patch
   
https://bugs.launchpad.net/bugs/1195524/+attachment/3716552/+files/blacklist.hyperv.kdump.patch

-- 
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to walinuxagent in Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
Ubuntu-server-bugs mailing list
Ubuntu-server-bugs@lists.ubuntu.com
Modify settings or unsubscribe at: 
https://lists.ubuntu.com/mailman/listinfo/ubuntu-server-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-27 Thread Scott Moser
as I'm trying to debug this instance that failed for me, and then came up after 
a 'vm restart', there is no indication that / was ever mounted RW the first 
time.  Ie, there is no evidence in /var/log of *anything* having run.
cloud-init starts on mounted MOUNTPOINT=/
and logs pretty much immediately to /var/log/cloud-init.log
but there is no timestamps other than this boot.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1195524] Re: race condition / transient failure to provision

2013-06-27 Thread Scott Moser
given my previous comment, i would have suspected that there was disk or kernel 
failure.
however, ssh-keyscan seemed to indicate ssh was running:
$ ssh-keyscan us-west-1.cloudapp.net 
# us-west-1.cloudapp.net SSH-2.0-OpenSSH_6.1p1 Debian-4
Connection closed by 137.135.115.232

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


Re: [Bug 1195524] Re: race condition / transient failure to provision

2013-06-27 Thread David Medberry
You can get kdump working in Azure (not that you really need it for this)
with the attached patch.

Ref: http://support.microsoft.com/kb/2858695


On Thu, Jun 27, 2013 at 8:01 PM, Scott Moser smo...@ubuntu.com wrote:

 given my previous comment, i would have suspected that there was disk or
 kernel failure.
 however, ssh-keyscan seemed to indicate ssh was running:
 $ ssh-keyscan us-west-1.cloudapp.net
 # us-west-1.cloudapp.net SSH-2.0-OpenSSH_6.1p1 Debian-4
 Connection closed by 137.135.115.232

 --
 You received this bug notification because you are a member of Canonical
 Microsoft Azure Collaboration, which is subscribed to walinuxagent in
 Ubuntu.
 Matching subscriptions: walinxuagnet bugs
 https://bugs.launchpad.net/bugs/1195524

 Title:
   race condition / transient failure to provision

 To manage notifications about this bug go to:

 https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions



-- 
-dave


** Attachment added: blacklist.hyperv.kdump.patch
   
https://bugs.launchpad.net/bugs/1195524/+attachment/3716552/+files/blacklist.hyperv.kdump.patch

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1195524

Title:
  race condition / transient failure to provision

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/walinuxagent/+bug/1195524/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs