What is this error? "Cannot connect because we still have 5" Could it be
the cause?

for i in $(cat cpeg); do
  ssh root@$i 'hostname && zgrep "Cannot connect because we still have " /var/log/cloud/agent/agent.log* |
    awk "{print \$1}" | uniq -c | sort'
done

cpegh0001

      1 /var/log/cloud/agent/agent.log:2013-11-06

      3 /var/log/cloud/agent/agent.log.2013-10-02.gz:2013-10-02

cpegh0002

      3 /var/log/cloud/agent/agent.log.2013-10-06.gz:2013-10-02

      5 /var/log/cloud/agent/agent.log.2013-09-30.gz:2013-09-30

cpegh0003

      2 /var/log/cloud/agent/agent.log.2013-09-30.gz:2013-09-30

      3 /var/log/cloud/agent/agent.log.2013-10-02.gz:2013-10-02

     72 /var/log/cloud/agent/agent.log.2013-06-06.gz:2013-06-06

cpegh0004

    159 /var/log/cloud/agent/agent.log.2013-10-21.gz:2013-10-20

     35 /var/log/cloud/agent/agent.log.2013-09-23.gz:2013-09-06

cpegh0005

      3 /var/log/cloud/agent/agent.log.2013-10-02.gz:2013-10-02

cpegh0006

      1 /var/log/cloud/agent/agent.log.2013-06-25.gz:2013-06-25

      1 /var/log/cloud/agent/agent.log.2013-10-18.gz:2013-10-18

      4 /var/log/cloud/agent/agent.log.2013-09-30.gz:2013-09-30

cpegh0007

      2 /var/log/cloud/agent/agent.log:2013-11-06

cpegh0008

     27 /var/log/cloud/agent/agent.log.2013-10-19.gz:2013-10-19

      2 /var/log/cloud/agent/agent.log.2013-09-30.gz:2013-09-30

cpegh0009

     15 /var/log/cloud/agent/agent.log.2013-11-08.gz:2013-11-08

     36 /var/log/cloud/agent/agent.log.2013-11-09.gz:2013-11-09

      3 /var/log/cloud/agent/agent.log.2013-10-21.gz:2013-10-21

     43 /var/log/cloud/agent/agent.log.2013-11-05.gz:2013-11-05

     44 /var/log/cloud/agent/agent.log.2013-11-02.gz:2013-11-02

      4 /var/log/cloud/agent/agent.log.2013-11-06.gz:2013-11-06

     53 /var/log/cloud/agent/agent.log:2013-11-13

      5 /var/log/cloud/agent/agent.log.2013-11-03.gz:2013-11-03

cpegh0010

     41 /var/log/cloud/agent/agent.log.2013-11-06.gz:2013-11-06

cpegh0011

cpegh0012

      1 /var/log/cloud/agent/agent.log.2013-10-21.gz:2013-10-21

cpegh0013

      3 /var/log/cloud/agent/agent.log.2013-10-21.gz:2013-10-21

cpegh0015

cpegh0016
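
The per-file counts above are hard to compare across hosts, since the file
name and the date inside it do not always match. A minimal sketch that
collapses them into per-day totals, assuming each matching log line starts
with a YYYY-MM-DD date as in the output above (count_by_day is a
hypothetical helper name, not part of the agent):

```shell
# count_by_day: hypothetical helper. Reads zgrep output on stdin, where the
# first field is either "file:date" or just "date", and prints
# "count date" with the busiest day first.
count_by_day() {
  # Strip an optional "file:" prefix from the first field, keep the date.
  awk '{ sub(/^.*:/, "", $1); print $1 }' |
    sort |        # uniq only merges adjacent lines, so sort first
    uniq -c |
    sort -rn      # highest count first
}
```

It would be used by piping the zgrep output from each host into it, for
example: ssh root@$i 'zgrep "Cannot connect because we still have "
/var/log/cloud/agent/agent.log*' | count_by_day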





On Wed, Nov 13, 2013 at 8:34 PM, Timothy Ehlers <ehle...@gmail.com> wrote:

> We are experiencing massive instability and cannot determine what's
> causing this.
>
> Every so often jsvc triggers the following in our system logs:
>
> Nov 13 18:59:31 cpegh0009 kernel: [15188599.258955] BUG: soft lockup -
> CPU#24 stuck for 22s! [jsvc:60385]
>  Nov 13 18:59:31 cpegh0009 kernel: [15188599.266229] Modules linked in:
> mptctl mptbase vhost_net macvtap macvlan 8021q garp ip6table_filter
> ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
> xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables nfsd
> kvm_amd kvm ghash_clmulni_intel aesni_intel cryptd aes_x86_64 nfs microcode
> psmouse radeon serio_raw ttm drm_kms_helper amd64_edac_mod joydev drm
> edac_core fam15h_power k10temp edac_mce_amd i2c_algo_bit sp5100_tco
> i2c_piix4 hpilo hpwdt lockd bridge stp mac_hid llc fscache auth_rpcgss
> acpi_power_meter nfs_acl bonding sunrpc lp parport hid_generic usbhid hid
> pata_atiixp ixgbe dca hpsa mdio
>  Nov 13 18:59:31 cpegh0009 kernel: [15188599.266322] CPU 24
>  Nov 13 18:59:31 cpegh0009 kernel: [15188599.266323] Modules linked in:
> mptctl mptbase vhost_net macvtap macvlan 8021q garp ip6table_filter
> ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat
> nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT
> xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables nfsd
> kvm_amd kvm ghash_clmulni_intel aesni_intel cryptd aes_x86_64 nfs microcode
> psmouse radeon serio_raw ttm drm_kms_helper amd64_edac_mod joydev drm
> edac_core fam15h_power k10temp edac_mce_amd i2c_algo_bit sp5100_tco
> i2c_piix4 hpilo hpwdt lockd bridge stp mac_hid llc fscache auth_rpcgss
> acpi_power_meter nfs_acl bonding sunrpc lp parport hid_generic usbhid hid
> pata_atiixp ixgbe dca hpsa mdio
>  Nov 13 18:59:31 cpegh0009 kernel: [15188599.266378]
>  Nov 13 18:59:31 cpegh0009 kernel: [15188599.266382] Pid: 60385, comm:
> jsvc Not tainted 3.5.0-23-generic #35~precise1-Ubuntu HP ProLiant DL585
>
> I am not sure if this is the cause of the high load or an aftereffect.
>
> 03:25:01 PM  runq-sz  plist-sz  ldavg-1  ldavg-5  ldavg-15  blocked
> 06:45:01 PM       31       982    36.95    39.33     41.50        0
> 06:55:01 PM       17      1000    28.53    37.28     40.06        0
> 07:05:01 PM       60       954   114.52    91.36     63.66        0
> 07:15:01 PM       48       961    29.55    53.94     60.76        0
> 07:25:01 PM       12       895    13.23    24.64     42.47        0
> 07:35:01 PM        5       772     8.02    13.32     28.31        0
>
>
> We run Ubuntu 12.04.3 LTS on HP DL585s with 64 AMD cores and 0.5 TB of RAM.
> Each will host approx 40~50 VMs (CentOS 5 guests).
>
> Agent version is:
> Version: 1:4.0.2
>
> Any ideas?
>
> Perhaps gathering CPU usage data on the jsvc PID?
>
> --
> Tim Ehlers
>
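
On the idea of gathering CPU data for the jsvc PID: a rough sketch that
reads the per-thread counters straight from /proc, assuming a Linux host.
thread_cpu_ticks is a hypothetical helper name. It prints cumulative CPU
ticks (user + system) per thread, so it needs to be run twice a few seconds
apart and the two outputs compared to see which threads are busy right now.

```shell
# thread_cpu_ticks: hypothetical helper. Prints "TID ticks" for every thread
# of the given PID, highest cumulative CPU time first.
# Usage: thread_cpu_ticks PID
thread_cpu_ticks() {
  pid=$1
  for stat in /proc/"$pid"/task/*/stat; do
    [ -r "$stat" ] || continue          # thread may have exited meanwhile
    tid=${stat%/stat}
    tid=${tid##*/}
    # Drop "pid (comm) " up to the last ")" so a comm with spaces is safe.
    # After that, utime and stime are fields 12 and 13 (in clock ticks).
    sed 's/^.*) //' "$stat" |
      awk -v tid="$tid" '{ print tid, $12 + $13 }'
  done | sort -k2,2 -rn
}
```

For example, thread_cpu_ticks 60385 would cover the PID from the soft
lockup message above, if that process is still running. top -H -p PID gives
a similar live per-thread view interactively.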



-- 
Tim Ehlers
