[Bug 1732917] Re: 17.1 update breaks EC2 nodes
Changing manage_etc_hosts seems to do something. If I set it to "yes", and do "systemctl restart cloud-init", my /etc/hosts gets updated. The main reason I do this is as an easy way to get the local hostname+fqdn inside /etc/hosts, for resolution purposes. I could modify /etc/hosts directly, but I thought cloud-init would be the best way to do it. Is it not? Cheers, James -- You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. https://bugs.launchpad.net/bugs/1732917 Title: 17.1 update breaks EC2 nodes To manage notifications about this bug go to: https://bugs.launchpad.net/cloud-init/+bug/1732917/+subscriptions -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1732917] Re: 17.1 update breaks EC2 nodes
Thanks Scott. We set `manage_etc_hosts` to Yes when we configure new hosts. That requires a restart of cloud-init to pick up the changes, as far as I know.
[Bug 1732917] Re: 17.1 update breaks EC2 nodes
The additional failures in that log starting at 21:10 are me trying to debug it by adding some print statements in the associated libraries. Nothing obvious - for whatever reason the fallback_nic is set to None which causes the error.
[Bug 1732917] Re: 17.1 update breaks EC2 nodes
Hi Ryan, all I did was remove entries prior to restarting cloud-init (the prior entries ended at Nov 15). But I can paste the entire thing here if it helps. ** Attachment added: "cloud-init.log" https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1732917/+attachment/5010724/+files/cloud-init.log
[Bug 1732917] Re: 17.1 update breaks EC2 nodes
I tried running 'cloud-init collect-logs' but the command just hangs.
[Bug 1732917] Re: 17.1 update breaks EC2 nodes
** Attachment added: "cloud-init.log" https://bugs.launchpad.net/ubuntu/+source/cloud-init/+bug/1732917/+attachment/5010665/+files/cloud-init.log
[Bug 1732917] Re: 17.1 update breaks EC2 nodes
Hi Scott, GMT time zone here so excuse the delay. Yes this is running on AWS, on t2 instances. I'll try to gather the logs for you.
[Bug 1732917] Re: 17.1 update breaks EC2 nodes
This only seems to happen when trying to restart cloud-init on a running node after the update. Maybe some transient state that is incompatible after the update? After rebooting the node, cloud-init can be restarted successfully.
[Bug 1732917] [NEW] 17.1 update breaks EC2 nodes
Public bug reported:

We updated from 0.7.9 to 17.1 on Ubuntu 16.04. After that, cloud-init fails to start.

Update:

Start-Date: 2017-11-15 06:03:19
Commandline: /usr/bin/unattended-upgrade
Upgrade: cloud-init:amd64 (0.7.9-233-ge586fe35-0ubuntu1~16.04.2, 17.1-27-geb292c18-0ubuntu1~16.04.1), ubuntu-standard:amd64 (1.361, 1.361.1), ubuntu-server:amd64 (1.361, 1.361.1), grub-legacy-ec2:amd64 (0.7.9-233-ge586fe35-0ubuntu1~16.04.2, 17.1-27-geb292c18-0ubuntu1~16.04.1), ubuntu-minimal:amd64 (1.361, 1.361.1)
End-Date: 2017-11-15 06:03:23

Failure:

Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: 2017-11-17 13:53:12,947 - util.py[WARNING]: failed stage init
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: failed run of stage init
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: Traceback (most recent call last):
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 638, in status_wrapper
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:     ret = functor(name, args)
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/cmd/main.py", line 357, in main_init
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:     init.apply_network_config(bring_up=bool(mode != sources.DSMODE_LOCAL))
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 635, in apply_network_config
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:     netcfg, src = self._find_networking_config()
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/stages.py", line 622, in _find_networking_config
Nov 17 13:53:12 ip-10-50-198-224 systemd[1]: cloud-init.service: Unit entered failed state.
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:     if self.datasource and hasattr(self.datasource, 'network_config'):
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/sources/DataSourceEc2.py", line 307, in network_config
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:     net.get_interface_mac(self.fallback_nic): self.fallback_nic}
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 506, in get_interface_mac
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:     if os.path.isdir(sys_dev_path(ifname, "bonding_slave")):
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:   File "/usr/lib/python3/dist-packages/cloudinit/net/__init__.py", line 38, in sys_dev_path
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:     return get_sys_class_path() + devname + "/" + path
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]: TypeError: Can't convert 'NoneType' object to str implicitly
Nov 17 13:53:13 ip-10-50-198-224 cloud-init[11795]:
Nov 17 13:53:12 ip-10-50-198-224 systemd[1]: cloud-init.service: Failed with result 'exit-code'.

This is pretty serious. Is it normal to do a major version update on an LTS release?

** Affects: cloud-init (Ubuntu)
   Importance: Undecided
   Status: New
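The traceback above ends in a string concatenation where the interface name is None. The following is a minimal sketch of that failure mode, not the cloud-init source: `sys_dev_path` here is a simplified stand-in for the function named in the traceback, the `/sys/class/net/` prefix is an assumed value for `get_sys_class_path()`, and `checked_sys_dev_path` is a hypothetical guard added only for illustration (it is not the fix that was shipped).

```python
# Assumed stand-in for what get_sys_class_path() returns.
SYS_CLASS_NET = "/sys/class/net/"


def sys_dev_path(devname, path=""):
    """Simplified version of the concatenation shown in the traceback.

    Raises TypeError when devname is None, because str + None is invalid.
    """
    return SYS_CLASS_NET + devname + "/" + path


def checked_sys_dev_path(devname, path=""):
    """Hypothetical guard: fail early with a clearer message."""
    if devname is None:
        raise ValueError("no fallback NIC available; cannot build sysfs path")
    return sys_dev_path(devname, path)


def raises_type_error(devname):
    """Return True if building the path for devname raises TypeError."""
    try:
        sys_dev_path(devname, "bonding_slave")
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(raises_type_error(None))    # a None fallback NIC triggers the error
    print(raises_type_error("eth0"))  # a real interface name does not
```

The exact wording of the TypeError message depends on the Python version, so the sketch checks only the exception type.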
[Bug 1668297] [NEW] Ubuntu 16.04 4.8.0 kernel crashing on EC2 instances at boot
Public bug reported:

After switching to the linux-hwe kernel on 16.04.2, we started observing kernel crashes on boot on some of our EC2 instances (so far, it only seems to happen on the newer M4 types). The instance becomes unresponsive when this happens.

It looks like a rapl issue - we have blacklisted intel_rapl and intel_rapl_perf for now.

Here is the trace:

general protection fault: [#1] SMP
Modules linked in: intel_rapl_perf(+) i2c_piix4 input_leds parport_pc serio_raw mac_hid parport sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus crct10dif_pclmul ttm crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops aes_x86_64 lrw glue_helper ablk_helper cryptd drm ixgbevf psmouse pata_acpi floppy fjes
CPU: 2 PID: 20 Comm: cpuhp/2 Not tainted 4.8.0-39-generic #42~16.04.1-Ubuntu
Hardware name: Xen HVM domU, BIOS 4.2.amazon 11/11/2016
task: 8bee465a1d80 task.stack: 8bee465ac000
RIP: 0010:[] [] rapl_cpu_online+0x63/0x71 [intel_rapl_perf]
RSP: :8bee465afe18 EFLAGS: 00010212
RAX: 0200 RBX: c0728730 RCX:
RDX: 0200 RSI: 0200 RDI: 0200
RBP: 8bee465afe30 R08: R09: 0001
R10: 8bee45ec2600 R11: 8bec41fbce00 R12: 6401b4899ff8202c
R13: 0002 R14: 8bee4fc0daa0 R15:
FS: () GS:8bee4fc0() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 563b0d9f9dc8 CR3: 00020608e000 CR4: 001406e0
DR0: DR1: DR2:
DR3: DR6: fffe0ff0 DR7: 0400
Stack: c0728730 0002 004e 8bee465afe70 95883d86 8bee4fc0daa0 8bee4fc0daa0 0002 9663df60 8bee464b85f0 8bec47c19300 8bee465afe90
Call Trace:
[] ? rapl_cpu_prepare+0x100/0x100 [intel_rapl_perf]
[] cpuhp_invoke_callback+0x46/0x110
[] cpuhp_thread_fun+0x41/0x100
[] smpboot_thread_fn+0x105/0x160
[] ? sort_range+0x30/0x30
[] kthread+0xd8/0xf0
[] ret_from_fork+0x1f/0x40
[] ? kthread_create_on_node+0x1e0/0x1e0
Code: 23 00 00 4c 8b a4 ca 10 01 00 00 48 c7 c2 80 a0 00 00 48 01 c2 e8 6e 56 50 d5 3b 05 fc 67 03 d6 7c 0e f0 4c 0f ab 2d 4d 23 00 00 <45> 89 6c 24 08 5b 31 c0 41 5c 41 5d 5d c3 0f 1f 44 00 00 55 48
RIP [] rapl_cpu_online+0x63/0x71 [intel_rapl_perf]
RSP
---[ end trace cd71880c1b07dfa5 ]---
BUG: unable to handle kernel paging request at 7957b4e8
IP: [] __wake_up_common+0x2b/0x90
PGD 0
Oops: [#2] SMP
Modules linked in: intel_rapl_perf(+) i2c_piix4 input_leds parport_pc serio_raw mac_hid parport sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi autofs4 btrfs raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus crct10dif_pclmul ttm crc32_pclmul drm_kms_helper ghash_clmulni_intel syscopyarea sysfillrect sysimgblt aesni_intel fb_sys_fops aes_x86_64 lrw glue_helper ablk_helper cryptd drm ixgbevf psmouse pata_acpi floppy fjes
CPU: 2 PID: 20 Comm: cpuhp/2 Tainted: G D 4.8.0-39-generic #42~16.04.1-Ubuntu
Hardware name: Xen HVM domU, BIOS 4.2.amazon 11/11/2016
task: 8bee465a1d80 task.stack: 8bee465ac000
RIP: 0010:[] [] __wake_up_common+0x2b/0x90
RSP: :8bee465afe38 EFLAGS: 00010086
RAX: 0282 RBX: 8bee465aff10 RCX:
RDX: 7957b4e8 RSI: 0003 RDI: 8bee465aff10
RBP: 8bee465afe70 R08: R09:
R10: 8bee45ec2600 R11: 022f R12: 8bee465aff18
R13: 0282 R14: R15: 0003
FS: () GS:8bee4fc0() knlGS:
CS: 0010 DS: ES: CR0: 80050033
CR2: 7957b4e8 CR3: 5fc06000 CR4: 001406e0
DR0: DR1: DR2:
DR3: DR6: fffe0ff0 DR7: 0400
Stack: 0001465a1d80 8bee465aff10 8bee465aff08 0282 8bee465afe80 958c6e43 8bee465afea8 958c78c7 8bee465a24d8
Call Trace:
[] __wake_up_locked+0x13/0x20
[] complete+0x37/0x50
[] mm_release+0xbf/0x140
[] do_exit+0x14d/0xb50
[] rewind_stack_do_exit+0x17/0x20
[] ? kthread_create_on_node+0x1e0/0x1e0
Code: 0f 1f 44 00 00 55 48 89 e5 41 57 41 56 41 55 41 54 4c 8d 67 08 53 41 89 f7 48 83 ec 10 89 55 cc 48 8b 57 08 4c 89 45 d0 49 39 d4 <48> 8b 32 74 45 41 89 ce 48 8d 42 e8 4c 8d 6e e8 eb 03 49 89 d5
RIP []