Re: [lxc-users] Compiling 1.03 in Ubuntu server

Michael H. Warfield Thu, 08 May 2014 12:04:28 -0700

On Thu, 2014-05-08 at 14:24 -0400, CDR wrote:
> The trace is showing it. Look for the word lxc-start


When you do that, point it out.  Highlight it in the message.  You can
see it, but there's a lot of noise in the trace dump below.  I couldn't
spot it and had to save the message as an mbox dump and search it.
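
For anyone else who needs to dig a keyword out of a quoted dump like this,
here is a minimal sketch.  It assumes the message text has already been
read into a string; the helper name find_in_trace and the sample lines are
only for illustration.  It strips the leading "> >>" quoting and reports
the lines that mention the keyword.

```python
import re

# Leading "> >> " style quote markers added by mail clients.
QUOTE_PREFIX = re.compile(r"^(?:>\s?)+")


def find_in_trace(text, needle):
    """Return (line_number, unquoted_line) pairs that contain needle."""
    hits = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = QUOTE_PREFIX.sub("", raw)
        if needle in line:
            hits.append((number, line))
    return hits


# A few lines copied from the quoted trace, used as sample input.
sample = """\
> >> [ 3798.345928] sysfs: cannot create duplicate filename
> >> [ 3798.345989] CPU: 11 PID: 6963 Comm: lxc-start Not tainted
> >> [ 3798.346014] Call Trace:
"""

for number, line in find_in_trace(sample, "lxc-start"):
    print(f"{number}: {line}")
```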

> On Thu, May 8, 2014 at 2:09 PM, Tamas Papp <[email protected]> wrote:
> >
> > On 05/08/2014 08:06 PM, CDR wrote:
> >> Ubuntu server blows up with LXC, and I am using the very latest kernel, 
> >> 3.14.2
> >>
> >>
> >> [ 3798.345926] WARNING: CPU: 11 PID: 6963 at
> >> /home/apw/COD/linux/fs/sysfs/dir.c:52 sysfs_warn_dup+0x91/0xb0()
> >> [ 3798.345928] sysfs: cannot create duplicate filename
> >> '/devices/pci0000:00/0000:00:05.0/0000:02:00.1/net/eth1/upper_eth0'
> >> [ 3798.345930] Modules linked in: macvlan veth xt_conntrack ipt_REJECT
> >> ip6_tables ebtable_nat ebtables xt_CHECKSUM iptable_mangle
> >> ipt_MASQUERADE iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4
> >> nf_nat_ipv4 nf_nat nf_conntrack xt_tcpudp bridge stp llc
> >> iptable_filter ip_tables x_tables dm_crypt gpio_ich dcdbas
> >> intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul crc32_pclmul
> >> ghash_clmulni_intel aesni_intel aes_x86_64 bnep rfcomm lrw gf128mul
> >> psmouse glue_helper bluetooth ablk_helper cryptd 6lowpan_iphc
> >> serio_raw joydev ipmi_si i7core_edac acpi_power_meter edac_core
> >> lpc_ich mac_hid parport_pc ppdev lp parport ses enclosure hid_generic
> >> usbhid hid usb_storage bnx2 megaraid_sas wmi
> >> [ 3798.345989] CPU: 11 PID: 6963 Comm: lxc-start Not tainted
> >> 3.14.2-031402-generic #201404262053
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

All that's saying is that this was the command in progress in user space
at the time of the kernel fault.
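
As a rough illustration of what that header line carries, the sketch below
pulls out the CPU, PID and command name.  The regular expression is an
assumption based only on the format of the line quoted above, and
parse_header is a hypothetical helper, not part of any kernel tooling.

```python
import re

# Matches the "CPU: <n> PID: <n> Comm: <name>" part of the warning header.
HEADER = re.compile(r"CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) Comm: (?P<comm>\S+)")


def parse_header(line):
    """Return cpu, pid and comm from a trace header line, or None."""
    match = HEADER.search(line)
    if match is None:
        return None
    return {
        "cpu": int(match["cpu"]),
        "pid": int(match["pid"]),
        "comm": match["comm"],
    }


header = "[ 3798.345989] CPU: 11 PID: 6963 Comm: lxc-start Not tainted"
print(parse_header(header))
```

The "comm" value only names the task that was running on that CPU; it
says nothing by itself about where the bug lives.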

> >> [ 3798.345991] Hardware name: Dell Inc. PowerEdge R910/0KYD3D, BIOS
> >> 2.10.0 08/29/2013
> >> [ 3798.345993]  0000000000000034 ffff885fc4c0b5f8 ffffffff8175505c
> >> 0000000000000007
> >> [ 3798.346002]  ffff885fc4c0b648 ffff885fc4c0b638 ffffffff8106cb5c
> >> ffff883fd02c44b0
> >> [ 3798.346008]  ffff887fd2b03000 ffff887fd2b03000 ffff883fd02c44b0
> >> 0000000000000001
> >> [ 3798.346014] Call Trace:
> >> [ 3798.346025]  [<ffffffff8175505c>] dump_stack+0x46/0x58
> >> [ 3798.346033]  [<ffffffff8106cb5c>] warn_slowpath_common+0x8c/0xc0
> >> [ 3798.346037]  [<ffffffff8106cc46>] warn_slowpath_fmt+0x46/0x50
> >> [ 3798.346044]  [<ffffffff8137c750>] ? strlcat+0x60/0x80
> >> [ 3798.346047]  [<ffffffff81245d41>] sysfs_warn_dup+0x91/0xb0
> >> [ 3798.346051]  [<ffffffff812460c0>] 
> >> sysfs_do_create_link_sd.isra.2+0xd0/0xe0
> >> [ 3798.346054]  [<ffffffff812460f5>] sysfs_create_link+0x25/0x50
> >> [ 3798.346060]  [<ffffffff816497b8>] netdev_adjacent_sysfs_add+0x58/0x70
> >> [ 3798.346068]  [<ffffffff816502d4>] netdev_adjacent_rename_links+0xa4/0xc0
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The above 5 lines are not giving me a warm and fuzzy feeling.  What is
that container doing with sysfs?

> >> [ 3798.346071]  [<ffffffff816503c3>] dev_change_name+0xd3/0x240
> >> [ 3798.346078]  [<ffffffff8165e24b>] do_setlink+0x72b/0x790
> >> [ 3798.346082]  [<ffffffff8165fffc>] rtnl_newlink+0x48c/0x6a0
> >> [ 3798.346085]  [<ffffffff8165fc61>] ? rtnl_newlink+0xf1/0x6a0
> >> [ 3798.346093]  [<ffffffff811694a3>] ? get_page_from_freelist+0x443/0x630
> >> [ 3798.346099]  [<ffffffff8116db00>] ? __pagevec_lru_add_fn+0x220/0x220
> >> [ 3798.346102]  [<ffffffff8165fa85>] rtnetlink_rcv_msg+0x165/0x250
> >> [ 3798.346108]  [<ffffffff8163d337>] ? __alloc_skb+0x87/0x2a0
> >> [ 3798.346112]  [<ffffffff8165f920>] ? __rtnl_unlock+0x20/0x20
> >> [ 3798.346120]  [<ffffffff8167ce59>] netlink_rcv_skb+0xa9/0xd0
> >> [ 3798.346123]  [<ffffffff8165cba5>] rtnetlink_rcv+0x25/0x40
> >> [ 3798.346127]  [<ffffffff8167c3b8>] netlink_unicast+0x128/0x1d0
> >> [ 3798.346130]  [<ffffffff8167caf4>] netlink_sendmsg+0x364/0x440
> >> [ 3798.346138]  [<ffffffff8163478f>] sock_sendmsg+0xaf/0xc0
> >> [ 3798.346146]  [<ffffffff81188cb9>] ? __do_fault+0x409/0x500
> >> [ 3798.346150]  [<ffffffff81634e9c>] ___sys_sendmsg+0x3ac/0x3c0
> >> [ 3798.346155]  [<ffffffff8118d173>] ? handle_mm_fault+0xb3/0x160
> >> [ 3798.346160]  [<ffffffff8176619c>] ? __do_page_fault+0x28c/0x550
> >> [ 3798.346165]  [<ffffffff8111df5c>] ? acct_account_cputime+0x1c/0x20
> >> [ 3798.346171]  [<ffffffff810a66b9>] ? account_user_time+0x99/0xb0
> >> [ 3798.346175]  [<ffffffff810a6d3d>] ? vtime_account_user+0x5d/0x70
> >> [ 3798.346183]  [<ffffffff811ed6f3>] ? __fdget+0x13/0x20
> >> [ 3798.346187]  [<ffffffff816358b9>] __sys_sendmsg+0x49/0x90
> >> [ 3798.346190]  [<ffffffff81635919>] SyS_sendmsg+0x19/0x20
> >> [ 3798.346197]  [<ffffffff8176b6bf>] tracesys+0xe1/0xe6
> >> [ 3798.346199] ---[ end trace 99513b106fc1cfe0 ]---

To my eye, this looks like a sysfs problem (which may very well be
container related) deep down in the kernel.  It could be deeper.  It's
passing through some netlink layers.  Under no circumstances should a
user space application be able to trigger a fault like this.

By definition, it has to be a kernel fault, maybe triggered by
lxc-start, though I'm not sure I see how.  Even if a user application is
doing something wrong, it should never be capable of causing a fault
like this.  So the fault itself shows that something is not being
handled properly in the kernel and, ergo, you have a kernel problem.
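
To make the path through the netlink layers easier to see, here is a
small sketch that lists the frames of the call trace from the system call
down to the warning.  It assumes the quoting has already been stripped
and that wrapped lines have been rejoined.  It also assumes the usual
convention that frames marked with "?" are addresses found on the stack
that the unwinder could not confirm, so it skips them.  The name
reliable_frames is hypothetical.

```python
import re

# One stack frame: "[<address>] [? ]function+0xoffset/0xsize".
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(?P<guess>\?\s+)?(?P<func>[\w.]+)\+0x")


def reliable_frames(trace):
    """Return function names of frames that are not marked with '?'."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.search(line)
        if match and not match["guess"]:
            frames.append(match["func"])
    return frames


# A subset of the frames from the trace above, used as sample input.
sample = """\
[ 3798.346044]  [<ffffffff8137c750>] ? strlcat+0x60/0x80
[ 3798.346047]  [<ffffffff81245d41>] sysfs_warn_dup+0x91/0xb0
[ 3798.346054]  [<ffffffff812460f5>] sysfs_create_link+0x25/0x50
[ 3798.346071]  [<ffffffff816503c3>] dev_change_name+0xd3/0x240
[ 3798.346082]  [<ffffffff8165fffc>] rtnl_newlink+0x48c/0x6a0
[ 3798.346130]  [<ffffffff8167caf4>] netlink_sendmsg+0x364/0x440
[ 3798.346190]  [<ffffffff81635919>] SyS_sendmsg+0x19/0x20
"""

# The trace lists the innermost frame first, so reverse it to read
# from the system call entry down to the warning.
path = list(reversed(reliable_frames(sample)))
print(" -> ".join(path))
```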

> >> On Thu, May 8, 2014 at 12:29 PM, Tamas Papp <[email protected]> wrote:
> >>
> >
> > Why do you think, it's lxc related?
> >
> > t

Regards,
Mike
-- 
Michael H. Warfield (AI4NB) | (770) 978-7061 |  [email protected]
   /\/\|=mhw=|\/\/          | (678) 463-0932 |  http://www.wittsend.com/mhw/
   NIC whois: MHW9          | An optimist believes we live in the best of all
 PGP Key: 0x674627FF        | possible worlds.  A pessimist is sure of it!

_______________________________________________
lxc-users mailing list
[email protected]
http://lists.linuxcontainers.org/listinfo/lxc-users