[Bug 1732917] Re: 17.1 update breaks EC2 nodes

2017-11-21 Thread James Ravn
Changing manage_etc_hosts seems to do something. If I set it to "yes",
and do "systemctl restart cloud-init", my /etc/hosts gets updated. The
main reason I do this is as an easy way to get the local hostname+fqdn
inside /etc/hosts, for resolution purposes. I could modify /etc/hosts
directly, but I thought cloud-init would be the best way to do it. Is it
not?

Cheers, James
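
For reference, the kind of entry being asked for (short hostname plus FQDN on one /etc/hosts line) can be sketched in a few lines of Python. This is only an illustration, not cloud-init's own code: the helper name hosts_entry, the 127.0.1.1 address (the usual Debian/Ubuntu convention) and the example FQDN are all assumptions.

```python
import socket


def hosts_entry(hostname, fqdn, address="127.0.1.1"):
    """Return one /etc/hosts line mapping address to the FQDN and short name.

    hosts_entry is a hypothetical helper; 127.0.1.1 is an assumed default.
    """
    # Avoid listing the same name twice when no domain part is configured.
    names = [fqdn] if fqdn == hostname else [fqdn, hostname]
    return "{} {}".format(address, " ".join(names))


if __name__ == "__main__":
    # Prints the line for the local machine; it does not modify /etc/hosts.
    print(hosts_entry(socket.gethostname(), socket.getfqdn()))
```

The sketch only renders the line, so it is safe to run; writing the result into /etc/hosts is left to whatever tool owns that file.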

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1732917

Title:
  17.1 update breaks EC2 nodes

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-init/+bug/1732917/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

[Bug 1732917] Re: 17.1 update breaks EC2 nodes

2017-11-21 Thread James Ravn
Thanks Scott. We set `manage_etc_hosts` to Yes when we configure new
hosts. That requires a restart of cloud-init to pick up the changes, as
far as I know.


[Bug 1732917] Re: 17.1 update breaks EC2 nodes

2017-11-17 Thread James Ravn
The additional failures in that log starting at 21:10 are from me trying
to debug it by adding some print statements in the associated libraries.
Nothing obvious: for whatever reason, fallback_nic is set to None,
which causes the error.
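
A minimal sketch of why a None interface name ends in a TypeError, assuming the path is built by plain string concatenation as the last traceback frame suggests. The function bodies below are simplified stand-ins, not the cloud-init source; the "/sys/class/net/" prefix and the safe_sys_dev_path guard are assumptions for illustration only.

```python
def sys_dev_path(devname, path=""):
    # Stand-in for the helper named in the traceback. The prefix is an
    # assumption; the point is the str + devname concatenation.
    return "/sys/class/net/" + devname + "/" + path


def safe_sys_dev_path(devname, path=""):
    """Hypothetical guard that fails early with a clearer message."""
    if devname is None:
        raise ValueError("no network interface name available (got None)")
    return sys_dev_path(devname, path)


def describe_failure(devname):
    """Return the built path, or a description of the TypeError raised."""
    try:
        return sys_dev_path(devname, "bonding_slave")
    except TypeError as exc:
        # The exact message text differs between Python versions.
        return "TypeError: {}".format(exc)


if __name__ == "__main__":
    print(describe_failure("eth0"))
    print(describe_failure(None))
```

Concatenating a str with None raises TypeError on every Python 3 release, although the wording of the message varies by version.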


[Bug 1732917] Re: 17.1 update breaks EC2 nodes

2017-11-17 Thread James Ravn
Hi Ryan, all I did was remove entries prior to restarting cloud-init
(the prior entries ended at Nov 15). But I can paste the entire thing
here if it helps.

** Attachment added: "cloud-init.log"
   https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1732917/+attachment/5010724/+files/cloud-init.log


[Bug 1732917] Re: 17.1 update breaks EC2 nodes

2017-11-17 Thread James Ravn
I tried running 'cloud-init collect-logs' but the command just hangs.


[Bug 1732917] Re: 17.1 update breaks EC2 nodes

2017-11-17 Thread James Ravn
** Attachment added: "cloud-init.log"
   https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1732917/+attachment/5010665/+files/cloud-init.log


[Bug 1732917] Re: 17.1 update breaks EC2 nodes

2017-11-17 Thread James Ravn
Hi Scott, I'm in the GMT time zone, so excuse the delay. Yes, this is
running on AWS, on t2 instances. I'll try to gather the logs for you.


[Bug 1732917] Re: 17.1 update breaks EC2 nodes

2017-11-17 Thread James Ravn
This only seems to happen when trying to restart cloud-init on a running
node after the update. Maybe some transient state that is incompatible
after the update? After rebooting the node, cloud-init can be restarted
successfully.


[Bug 1732917] [NEW] 17.1 update breaks EC2 nodes

2017-11-17 Thread James Ravn
Public bug reported:

We updated from 0.7.9 to 17.1 on Ubuntu 16.04. After that, cloud-init
fails to start.

Update:
Start-Date: 2017-11-15  06:03:19
Commandline: /usr/bin/unattended-upgrade
Upgrade: cloud-init:amd64 (0.7.9-233-ge586fe35-0ubuntu1~16.04.2, 17.1-27-geb292c18-0ubuntu1~16.04.1), ubuntu-standard:amd64 (1.361, 1.361.1), ubuntu-server:amd64 (1.361, 1.361.1), grub-legacy-ec2:amd64 (0.7.9-233-ge586fe35-0ubuntu1~16.04.2, 17.1-27-geb292c18-0ubuntu1~16.04.1), ubuntu-minimal:amd64 (1.361, 1.361.1)
End-Date: 2017-11-15  06:03:23

Failure:
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: 2017-11-17 13:53:12,947 - util.py[WARNING]: failed stage init
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: failed run of stage init
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: 
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: Traceback (most recent call last):
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 638, in status_wrapper
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: ret = functor(name, args)
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 357, in main_init
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: init.apply_network_config(bring_up=bool(mode != sources.DSMODE_LOCAL))
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 635, in apply_network_config
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: netcfg, src = self._find_networking_config()
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 622, in _find_networking_config
Nov 17 13:53:12 ip-10-50-198-224 systemd[1]: cloud-init.service: Unit entered failed state.
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: if self.datasource and hasattr(self.datasource, 'network_config'):
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceEc2.py", line 307, in network_config
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: net.get_interface_mac(self.fallback_nic): self.fallback_nic}
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 506, in get_interface_mac
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: if os.path.isdir(sys_dev_path(ifname, "bonding_slave")):
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 38, in sys_dev_path
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: return get_sys_class_path() + devname + "/" + path
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: TypeError: Can't convert 'NoneType' object to str implicitly
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: 
Nov 17 13:53:12 ip-10-50-198-224 systemd[1]: cloud-init.service: Failed with result 'exit-code'.


This is pretty serious. Is it normal to do a major version update on an
LTS release?

** Affects: cloud-init (Ubuntu)
 Importance: Undecided
 Status: New


[Bug 1668297] [NEW] Ubuntu 16.04 4.8.0 kernel crashing on EC2 instances at boot

2017-02-27 Thread James Ravn
Public bug reported:

After switching to the linux-hwe kernel on 16.04.2, we started observing
kernel crashes on boot on some of our EC2 instances (so far, it only
seems to happen on the newer M4 types). The instance becomes
unresponsive when this happens. It looks like a rapl issue - we have
blacklisted intel_rapl and intel_rapl_perf for now. Here is the trace:

general protection fault:  [#1] SMP
Modules linked in: intel_rapl_perf(+) i2c_piix4 input_leds parport_pc serio_raw mac_hid parport sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus crct10dif_pclmul ttm crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops aes_x86_64 lrw glue_helper ablk_helper cryptd drm ixgbevf psmouse pata_acpi floppy fjes
CPU: 2 PID: 20 Comm: cpuhp/2 Not tainted 4.8.0-39-generic #42~16.04.1-Ubuntu
Hardware name: Xen HVM domU, BIOS 4.2.amazon 11/11/2016
task: 8bee465a1d80 task.stack: 8bee465ac000
RIP: 0010:[]  [] rapl_cpu_online+0x63/0x71 [intel_rapl_perf]
RSP: :8bee465afe18  EFLAGS: 00010212
RAX: 0200 RBX: c0728730 RCX: 
RDX: 0200 RSI: 0200 RDI: 0200
RBP: 8bee465afe30 R08:  R09: 0001
R10: 8bee45ec2600 R11: 8bec41fbce00 R12: 6401b4899ff8202c
R13: 0002 R14: 8bee4fc0daa0 R15: 
FS:  () GS:8bee4fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 563b0d9f9dc8 CR3: 00020608e000 CR4: 001406e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Stack:
 c0728730 0002 004e 8bee465afe70
 95883d86 8bee4fc0daa0 8bee4fc0daa0 0002
 9663df60 8bee464b85f0 8bec47c19300 8bee465afe90
Call Trace:
 [] ? rapl_cpu_prepare+0x100/0x100 [intel_rapl_perf]
 [] cpuhp_invoke_callback+0x46/0x110
 [] cpuhp_thread_fun+0x41/0x100
 [] smpboot_thread_fn+0x105/0x160
 [] ? sort_range+0x30/0x30
 [] kthread+0xd8/0xf0
 [] ret_from_fork+0x1f/0x40
 [] ? kthread_create_on_node+0x1e0/0x1e0
Code: 23 00 00 4c 8b a4 ca 10 01 00 00 48 c7 c2 80 a0 00 00 48 01 c2 e8 6e 56 50 d5 3b 05 fc 67 03 d6 7c 0e f0 4c 0f ab 2d 4d 23 00 00 <45> 89 6c 24 08 5b 31 c0 41 5c 41 5d 5d c3 0f 1f 44 00 00 55 48
RIP  [] rapl_cpu_online+0x63/0x71 [intel_rapl_perf]
 RSP 
---[ end trace cd71880c1b07dfa5 ]---
BUG: unable to handle kernel paging request at 7957b4e8
IP: [] __wake_up_common+0x2b/0x90
PGD 0
Oops:  [#2] SMP
Modules linked in: intel_rapl_perf(+) i2c_piix4 input_leds parport_pc serio_raw mac_hid parport sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus crct10dif_pclmul ttm crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops aes_x86_64 lrw glue_helper ablk_helper cryptd drm ixgbevf psmouse pata_acpi floppy fjes
CPU: 2 PID: 20 Comm: cpuhp/2 Tainted: G  D 4.8.0-39-generic #42~16.04.1-Ubuntu
Hardware name: Xen HVM domU, BIOS 4.2.amazon 11/11/2016
task: 8bee465a1d80 task.stack: 8bee465ac000
RIP: 0010:[]  [] __wake_up_common+0x2b/0x90
RSP: :8bee465afe38  EFLAGS: 00010086
RAX: 0282 RBX: 8bee465aff10 RCX: 
RDX: 7957b4e8 RSI: 0003 RDI: 8bee465aff10
RBP: 8bee465afe70 R08:  R09: 
R10: 8bee45ec2600 R11: 022f R12: 8bee465aff18
R13: 0282 R14:  R15: 0003
FS:  () GS:8bee4fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7957b4e8 CR3: 5fc06000 CR4: 001406e0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Stack:
 0001465a1d80  8bee465aff10 8bee465aff08
 0282   8bee465afe80
 958c6e43 8bee465afea8 958c78c7 8bee465a24d8
Call Trace:
 [] __wake_up_locked+0x13/0x20
 [] complete+0x37/0x50
 [] mm_release+0xbf/0x140
 [] do_exit+0x14d/0xb50
 [] rewind_stack_do_exit+0x17/0x20
 [] ? kthread_create_on_node+0x1e0/0x1e0
Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 4c 8d 67 08 53 41 89 f7 48 83 ec 10 89 55 cc 48 8b 57 08 4c 89 45 d0 49 39 d4 <48> 8b 32 74 45 41 89 ce 48 8d 42 e8 4c 8d 6e e8 eb 03 49 89 d5
RIP  []