Re: time issues and ZFS
Daniel, Have you run tests with the machdep.idle value changed, and fiddling kern.eventtimer.periodic / kern.eventtimer.idletick ? adrian ___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: time issues and ZFS
Daniel, Have you run tests with the machdep.idle value changed, and fiddling kern.eventtimer.periodic / kern.eventtimer.idletick ? Adrian, not yet, for several reasons: 1- as I explained, I can't really force the problem; it happens when we run some zfs scripts, like mirror, but we have to wait till enough changes have happened on the source, usually after 24 hours. 2- changing to LAPIC seems to have solved the problem. 3- I'm now learning all I can about event timers and you have not answered some of my questions :-) danny
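For anyone who wants to run the experiment Adrian suggests, a minimal sketch of the knobs involved (FreeBSD 9.x sysctl names from the thread; the chosen values are just one example combination to test, not a recommendation):

```shell
# Show the current idle method and the available ones
sysctl machdep.idle machdep.idle_available

# Show the event timer mode knobs Adrian mentioned
sysctl kern.eventtimer.periodic kern.eventtimer.idletick

# One example test combination: plain HLT idle loop with periodic ticks,
# kept ticking while idle (revert by setting the old values back)
sysctl machdep.idle=hlt
sysctl kern.eventtimer.periodic=1
sysctl kern.eventtimer.idletick=1
```

These settings are lost on reboot; add them to /etc/sysctl.conf to persist a combination that tests well.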
Re: kvm vlan virtio problem
Hi! The same warning shows up in our setup: Jan 21 23:40:46 host kernel: WARNING: at net/core/dev.c:1712 skb_gso_segment+0x1df/0x2b0() (Tainted: G W --- ) Jan 21 23:40:46 host kernel: Hardware name: System Product Name Jan 21 23:40:46 host kernel: tun: caps=(0x1b0049, 0x0) len=4452 data_len=4380 ip_summed=0 [...] KVM host: CentOS 6.3, Linux kernel 2.6.32-279.19.1.el6.x86_64 VM guest: FreeBSD 9.1, virtio-kmod-9.1-0.242658 Disabling TSO on vtnet0 stops the warnings on the KVM host. Is there any progress on this issue? Best regards Franz On 04.11.2012 02:53, Bryan Venteicher wrote: Hi, - Original Message - From: Bane Ivosev bane.ivo...@pmf.uns.ac.rs To: Bryan Venteicher bry...@daemoninthecloset.org, freebsd-stable@freebsd.org Sent: Saturday, November 3, 2012 7:58:57 PM Subject: Re: kvm vlan virtio problem thanks bryan, i don't have vlans inside guest. as you suggested i disabled nic tso (ifconfig vtnet0 -tso and sysctl net.inet.tcp.tso=0) and so far, so good! everything is ok now. once again, thanks a lot. No need to disable TSO globally. So it seems if_vtnet is sending down TSO frames that don't have the appropriate flags set. I'll look into it more in the next couple of days. Thanks for the quick reply back. Bryan On 11/03/2012 09:40 PM, Bryan Venteicher wrote: Hi, - Original Message - From: Bane Ivosev bane.ivo...@pmf.uns.ac.rs To: freebsd-stable@freebsd.org Sent: Saturday, November 3, 2012 2:12:25 PM Subject: kvm vlan virtio problem hi, i have several kvm ubuntu 12.04 and centos 6 hosts with standard bridged network setup. same problem on each server with freebsd 9 amd64 guest and virtio nic: soon after guest start, the host syslog fills with this message at a very high rate. the guest works without any problem. with the e1000 guest driver everything is ok. does anyone have/had the same problem? thanks. I have a vague recollection of looking at something similar last year... Do you have a VLAN configured in the guest as well? What is the ifconfig output? 
Does disabling TSO on the vtnetX device make these messages go away? Bryan kernel: [2337728.094141] [ cut here ] kernel: [2337728.094144] WARNING: at /build/buildd/linux-3.2.0/net/core/dev.c:1955 skb_gso_segment+0x341/0x3b0() kernel: [2337728.094146] Hardware name: System x3550 M3 -[7944K3G]- kernel: [2337728.094148] 802.1Q VLAN Support: caps=(0x30195833, 0x0) len=3196 data_len=0 ip_summed=0 kernel: [2337728.094149] Modules linked in: dm_snapshot ip6table_filter ip6_tables ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_CHECKSUM iptable_mangle xt_tcpudp iptable_filter ip_tables x_tables kvm_intel kvm dm_crypt nfsd nfs lockd fscache auth_rpcgss nfs_acl sunrpc nls_iso8859_1 nls_cp437 vfat fat 8021q garp bridge stp serio_raw cdc_ether usbnet vhost_net macvtap i7core_edac macvlan shpchp ioatdma edac_core dca tpm_tis lp parport mac_hid btrfs zlib_deflate libcrc32c usbhid hid megaraid_sas bnx2 kernel: [2337728.094177] Pid: 8685, comm: vhost-8683 Tainted: G W3.2.0-31-generic #50-Ubuntu kernel: [2337728.094179] Call Trace: kernel: [2337728.094180]IRQ [81066d7f] warn_slowpath_common+0x7f/0xc0 kernel: [2337728.094185] [81066e76] warn_slowpath_fmt+0x46/0x50 kernel: [2337728.094188] [8153f581] skb_gso_segment+0x341/0x3b0 kernel: [2337728.094191] [81542ee1] dev_hard_start_xmit+0xc1/0x540 kernel: [2337728.094196] [a01c0150] ? br_flood+0xc0/0xc0 [bridge] kernel: [2337728.094199] [8154360a] dev_queue_xmit+0x2aa/0x420 kernel: [2337728.094203] [a01c01e2] br_dev_queue_push_xmit+0x92/0xd0 [bridge] kernel: [2337728.094208] [a01c0278] br_forward_finish+0x58/0x60 [bridge] kernel: [2337728.094212] [a01c042b] __br_forward+0xab/0xd0 [bridge] kernel: [2337728.094217] [a01c04ed] br_forward+0x5d/0x70 [bridge] kernel: [2337728.094221] [a01c11c2] br_handle_frame_finish+0x182/0x2a0 [bridge] kernel: [2337728.094226] [a01c14a8] br_handle_frame+0x1c8/0x270 [bridge] kernel: [2337728.094231] [a01c12e0] ? 
br_handle_frame_finish+0x2a0/0x2a0 [bridge] kernel: [2337728.094234] [81540892] __netif_receive_skb+0x1e2/0x520 kernel: [2337728.094237] [81540ff1] process_backlog+0xb1/0x190 kernel: [2337728.094240] [815422e4] net_rx_action+0x134/0x290 kernel: [2337728.094242] [8165a4fe] ? _raw_spin_lock+0xe/0x20 kernel: [2337728.094245] [8106e528] __do_softirq+0xa8/0x210 kernel: [2337728.094248] [81664d6c] call_softirq+0x1c/0x30 kernel: [2337728.094249]EOI [81015305] do_softirq+0x65/0xa0 kernel:
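For reference, the workaround discussed throughout this thread is applied on the FreeBSD guest as follows (vtnet0 is the interface name from the reports; adjust to your setup):

```shell
# Disable TSO on the virtio interface only (takes effect immediately)
ifconfig vtnet0 -tso

# Or the heavier global switch, which Bryan notes should not be necessary
sysctl net.inet.tcp.tso=0

# To make the per-interface setting survive a reboot, append -tso to the
# interface line in /etc/rc.conf, e.g.:
#   ifconfig_vtnet0="DHCP -tso"
```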
FreeBSD 9.1 - openldap slapd lockups, mutex problems
Hi. (I am sending this to the stable list, because it may be kernel related.) On 9.1-RELEASE I am witnessing lockups of the openldap slapd daemon. The slapd runs for some days and then hangs, consuming high amounts of CPU. In this state slapd can only be restarted by SIGKILL.
# procstat -kk 71195
PID TID COMM TDNAME KSTACK
71195 149271 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d do_wait+0x678 __umtx_op_wait+0x68 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 194998 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _cv_wait_sig+0x12e seltdwait+0x110 kern_select+0x6ef sys_select+0x5d amd64_syscall+0x546 Xfast_syscall+0xf7
71195 195544 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 196183 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_timedwait_sig+0x19 _sleep+0x2d4 userret+0x9e doreti_ast+0x1f
71195 197966 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 198446 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 198453 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 198563 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 199520 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 200038 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 200670 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 200674 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 200675 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 201179 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 201180 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 201181 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 201183 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
71195 201189 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d _do_lock_umutex+0x5e8 do_lock_umutex+0x17c __umtx_op_wait_umutex+0x63 amd64_syscall+0x546 Xfast_syscall+0xf7
When I try to stop slapd through the rc script, I can see in the logs that the process is waiting for a thread to terminate - indefinitely. Other multithreaded server processes are running on the server without problems (apache-worker, mysqld, bind, etc.) On UFS2 slapd runs fine, without showing the error. Things I have tried already to stop the lockups: - running openldap-server23, openldap24, both with different BDB backend versions. - tuning the BDB Init File - reducing
Re: time issues and ZFS
Thus spake Daniel Braniss da...@cs.huji.ac.il: In the meantime here is some info: Intel(R) Xeon(R) CPU E5645: running with no problems LAPIC(600) HPET(450) HPET1(440) HPET2(440) HPET3(440) i8254(100) RTC(0) Intel(R) Xeon(R) CPU X5550: this is the problematic one, at least for the moment HPET(450) HPET1(440) HPET2(440) HPET3(440) LAPIC(400) i8254(100) RTC(0) Does anyone know why the LAPIC is given a lower priority than HPET in this case? If you have an LAPIC, it should always be preferred to HPET, unless something is seriously wrong with it... Julian
Re: time issues and ZFS
On Tue, Jan 22, 2013 at 7:27 AM, Julian Stecklina jstec...@os.inf.tu-dresden.de wrote: Does anyone know why the LAPIC is given a lower priority than HPET in this case? If you have an LAPIC, it should always be preferred to HPET, unless something is seriously wrong with it... On many processors the LAPIC timer does not work correctly in states lower than C1. There are many processors that will automatically enter a C1E mode when the processor is idle, and in that state I have seen the LAPIC timer run slower than the programmed frequency, causing time to move too slowly on idle FreeBSD systems.
busy on all disks that are a part of the ZFS pools without any load
Hello. I have a problem with all disks that are part of ZFS pools. There is a busy state (6-7%) on all of them in iostat. But only there! There isn't any load on the disks at all, and the other system utilities, such as gstat, 'zpool iostat' and 'systat -iostat', all report zero busy status. 24 disks - 8 ZFS pools. All 24 disks are on the 3Ware controller in JBOD mode (each disk works as a single disk, without any hardware RAID):
twa0: 3ware 9000 series Storage Controller port 0xd800-0xd8ff mem 0xf600-0xf7ff,0xfaedf000-0xfaed irq 16 at device 0.0 on pci2
twa0: INFO: (0x15: 0x1300): Controller details:: Model 9650SE-24M8, 24 ports, Firmware FE9X 4.08.00.006, BIOS BE9X 4.08.00.001
# uname -a
FreeBSD gfs521 9.1-STABLE FreeBSD 9.1-STABLE #5 r245163:
# iostat -xzt da,scsi
extended device statistics
device r/s w/s kr/s kw/s qlen svc_t %b
da0 12.6 2.5 1248.5 173.2 0 29.0 7
da1 12.6 2.6 1227.7 173.2 0 22.6 6
da2 12.5 2.5 1233.3 173.2 0 29.3 7
da3 10.2 2.4 994.7 165.5 0 28.1 6
da4 10.5 2.4 1035.5 165.5 0 28.0 6
da5 10.7 2.4 1049.8 165.5 0 28.6 6
da6 14.6 2.5 1418.9 165.2 0 28.4 8
da7 14.4 2.5 1387.2 165.2 0 28.6 8
da8 14.3 2.5 1376.7 165.2 0 27.8 8
da9 10.8 2.5 1065.2 161.7 0 27.0 6
da10 11.0 2.5 1100.9 161.7 0 27.5 6
da11 10.4 2.5 1015.1 161.7 0 27.6 6
da12 13.5 2.4 1365.8 168.7 0 28.9 7
da13 13.9 2.4 1364.2 168.7 0 26.9 7
da14 13.9 2.4 1373.9 168.7 0 27.1 7
da15 13.6 2.6 1308.5 165.3 0 24.5 7
da16 14.3 2.5 1417.0 165.3 0 24.9 7
da17 14.0 2.5 1376.6 165.3 0 25.1 7
da18 17.0 2.4 1697.2 164.4 0 19.8 6
da19 16.0 2.4 1578.0 164.4 0 20.2 6
da20 16.5 2.4 1635.6 164.4 0 23.5 7
da21 8.7 2.5 802.8 186.3 0 27.2 6
da22 8.7 2.5 800.1 186.3 0 26.9 6
da23 8.6 2.5 797.0 186.3 0 27.1 6
# gstat
0 0 0 0 0.0 0 0 0.0 0.0| da0
0 0 0 0 0.0 0 0 0.0 0.0| da1
0 0 0 0 0.0 0 0 0.0 0.0| da2
0 0 0 0 0.0 0 0 0.0 0.0| da3
0 0 0 0 0.0 0 0 0.0 0.0| da4
0 0 0 0 0.0 0 0 0.0 0.0| da5
0 0 0 0 0.0 0 0 0.0 0.0| da6
0 0 0 0 0.0 0 0 0.0 0.0| da7
0 0 0 0 0.0 0 0 0.0 0.0| da8
0 0 0 0 0.0 0 0 0.0 0.0| da9
0 0 0 0 0.0 0 0 0.0 0.0| da10
0 0 0 0 0.0 0 0 0.0 0.0| da11
0 0 0 0 0.0 0 0 0.0 0.0| da12
0 0 0 0 0.0 0 0 0.0 0.0| da13
0 0 0 0 0.0 0 0 0.0 0.0| da14
0 0 0 0 0.0 0 0 0.0 0.0| da15
0 0 0 0 0.0 0 0 0.0 0.0| da16
0 0 0 0 0.0 0 0 0.0 0.0| da17
0 0 0 0 0.0 0 0 0.0 0.0| da18
0 0 0 0 0.0 0 0 0.0 0.0| da19
0 0 0 0 0.0 0 0 0.0 0.0| da20
0 0 0 0 0.0 0 0 0.0 0.0| da21
0 0 0 0 0.0 0 0 0.0 0.0| da22
0 0 0 0 0.0 0 0 0.0 0.0| da23
# zpool iostat
            capacity     operations    bandwidth
pool    alloc   free   read  write   read  write
------  -----  -----  -----  -----  -----  -----
data1   11.4G  5.43T      0      0      0      0
data2   9.05G  5.43T      0      0      0      0
data3   10.1G  5.43T      0      0      0      0
data4   4.15G  5.43T      0      0      0      0
data5   11.9G  5.43T      0      0      0      0
data6   10.1G  5.43T      0      0      0      0
data7   76.1G  5.36T      0      0      0      0
data8   5.38M  5.44T      0      0      0      0
# zpool status -xv
all pools are healthy
Thanks for any ideas!
Re: busy on all disks that are a part of the ZFS pools without any load
on 22/01/2013 16:01 Oleksii Tsvietnov said the following: # iostat -xzt da,scsi [... iostat output quoted in full in the previous message snipped ...] These are values since boot, they do not reflect current system load. -- Andriy Gapon
Re: time issues and ZFS
On 01/22/13 07:27, Julian Stecklina wrote: [...] Does anyone know why the LAPIC is given a lower priority than HPET in this case? If you have an LAPIC, it should always be preferred to HPET, unless something is seriously wrong with it... Julian This may help: The problem with the LAPIC timer is that it stops working when the CPU goes to C3 or a deeper idle state. These states are not enabled by default, so unless you enabled them explicitly, it is safe to use LAPIC. In any case, a present 9-STABLE system should prevent you from using an unsafe C-state if the LAPIC timer is used. From all other perspectives LAPIC is preferable, as it is faster and easier to operate than HPET. The latest CPUs fixed the LAPIC timer problem, so I don't think that switching to it will be pessimistic in the foreseeable future. -- Alexander Motin
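To act on mav's advice, the active event timer can be inspected and overridden at runtime (a sketch using the 9.x sysctl names; check kern.eventtimer.choice first to confirm LAPIC is actually offered on your box):

```shell
# List the event timers the kernel found, with their priorities
sysctl kern.eventtimer.choice

# See which one is currently in use, then switch to LAPIC
sysctl kern.eventtimer.timer
sysctl kern.eventtimer.timer=LAPIC

# To select it at every boot, add to /boot/loader.conf:
#   kern.eventtimer.timer=LAPIC
```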
Re: busy on all disks that are a part of the ZFS pools without any load
These are values since boot, they do not reflect current system load. From what I could see, these values usually change from 0 to 60. Why did they freeze at 7%?
Re: busy on all disks that are a part of the ZFS pools without any load
on 22/01/2013 16:47 Oleksii Tsvietnov said the following: These are values since boot, they do not reflect current system load. From what I could see, these values usually change from 0 to 60. Why did they freeze at 7%? That's the average value since boot to now? -- Andriy Gapon
Re: busy on all disks that are a part of the ZFS pools without any load
On 01/22/2013 04:51 PM, Andriy Gapon wrote: That's the average value since boot to now? Maybe... Does iostat's busy always show the average value since boot?
Re: busy on all disks that are a part of the ZFS pools without any load
on 22/01/2013 17:05 Oleksii Tsvietnov said the following: On 01/22/2013 04:51 PM, Andriy Gapon wrote: That's the average value since boot to now? Maybe... Does iostat's busy always show the average value since boot? Use the -w option to see the current state (starting from the second screen). Manual pages rule :-) -- Andriy Gapon
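In other words (iostat(8) behaviour; -w sets the wait interval in seconds):

```shell
# First report: averages since boot.
# Every subsequent report: activity during the last 1-second interval,
# which is the instantaneous view gstat and 'zpool iostat' already give.
iostat -xzt da -w 1
```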
Re: FreeBSD 9.1 - openldap slapd lockups, mutex problems
On 01/22/13 05:19, Kai Gallasch wrote: Hi. (I am sending this to the stable list, because it may be kernel related.) On 9.1-RELEASE I am witnessing lockups of the openldap slapd daemon. The slapd runs for some days and then hangs, consuming high amounts of CPU. In this state slapd can only be restarted by SIGKILL. # procstat -kk 71195 PID TID COMM TDNAME KSTACK 71195 149271 slapd - mi_switch+0x186 sleepq_catch_signals+0x2cc sleepq_wait_sig+0x16 _sleep+0x29d do_wait+0x678 __umtx_op_wait+0x68 amd64_syscall+0x546 Xfast_syscall+0xf7 On UFS2 slapd runs fine, without showing the error. Has anyone else running openldap-server on FreeBSD 9.1 inside a jail seen similar problems? I have seen openldap spin the CPU and even run out of memory and get killed on some of our test systems running ~9.1-REL with ZFS. No jails. I'm not sure what would have put load on our test systems other than nightly scripts. I had to focus my attention on other servers, so I don't have one to inspect at this point, but I won't be surprised if I see this in production. Thanks for the tip about it being ZFS related, and I'll let you know if I find anything out. This is mostly a me too reply.
Re: Failsafe on kernel panic
I started investigating IPMI; so far I can configure an IP for IPMI from FreeBSD. My question is how to access it? Can it be done in-band, attached to one of the IBM NICs on the board, or only out of band? In case of OOB, does anyone know if the iLO plug is plain RJ45 on IBM servers (specifically x3250/x3550)? Thanks in advance Sami On 20 Jan 2013 16:07, Willem Jan Withagen w...@digiware.nl wrote: On 17-1-2013 4:18, Ian Lepore wrote: On Wed, 2013-01-16 at 23:27 +0200, Sami Halabi wrote: Thank you for your response, very helpful. one question - how do i configure auto-reboot once a kernel panic occurs? Sami From src/sys/conf/NOTES, this may be what you're looking for... # # Don't enter the debugger for a panic. Intended for unattended operation # where you may want to enter the debugger from the console, but still want # the machine to recover from a panic. # options KDB_UNATTENDED But I think it only has meaning if you have option KDB in effect; otherwise it should just reboot itself after a 15 second pause. Well, it is not the magical fix-all solution. Last night I had to drive to the colo (lucky for me, a 5 min drive) because I could not get a system to reboot/recover from a crash. Upon arrival the system was crashed and halted on the message: rebooting in 15 sec. But those 15 secs had already gone by some 10-20 minutes earlier. Physically rebooting or resetting ended up in the same position: rebooting in 15 sec. Without ever getting to actually rebooting. So if I (you) have servers 2 hours away, I usually try to work on upgrading/rebooting during business hours, when remote hands can get me out of trouble. IPMI is another nice way of getting at the server in these cases. But that requires a lot more infra and tinkering. --WjW
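On the in-band question, one common approach (a sketch, not IBM-specific; the channel number 1 and the addresses here are assumptions to adapt) is to load FreeBSD's ipmi(4) driver and configure the BMC's LAN channel from the running OS with sysutils/ipmitool:

```shell
# Load the IPMI driver so /dev/ipmi0 appears
# (add ipmi_load="YES" to /boot/loader.conf to make this permanent)
kldload ipmi

# Inspect and configure the BMC LAN channel; "1" is a guess -
# the channel number varies per board
ipmitool lan print 1
ipmitool lan set 1 ipsrc static
ipmitool lan set 1 ipaddr 192.0.2.10
ipmitool lan set 1 netmask 255.255.255.0
ipmitool lan set 1 defgw ipaddr 192.0.2.1
```

Whether that address then answers on a shared on-board NIC or only on the dedicated management port depends on how the board wires the BMC, which is exactly the x3250/x3550 question above.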
Re: time issues and ZFS
Hi! As I said before, the problem with non-HLT loops with the event timer in -9 and -head is that it calls the idle function inside a critical section (critical_enter and critical_exit), which blocks interrupts from occurring. The EI;HLT instruction pair on i386/amd64 atomically and correctly handles things, from what I've been told. However, there's no atomic way to do this using ACPI sleeping, so there's a small window where an interrupt may come in but isn't handled; the CPU waits for the next interrupt to occur before it'll wake up and respond to that interrupt. I kept hitting my head against this when doing network testing. :( Now - specifically for timekeeping it shouldn't matter; that's to do with whether the counters are reliable or not (and heck, whether they are even in lock-step across CPUs.) But extra latency could show up weirdly, hence why I was asking you to try different timer configurations and idle loops. Thanks, Adrian On 22 January 2013 01:55, Daniel Braniss da...@cs.huji.ac.il wrote: Daniel, Have you run tests with the machdep.idle value changed, and fiddling kern.eventtimer.periodic / kern.eventtimer.idletick ? Adrian, not yet, for several reasons: 1- as I explained, I can't really force the problem; it happens when we run some zfs scripts, like mirror, but we have to wait till enough changes have happened on the source, usually after 24 hours. 2- changing to LAPIC seems to have solved the problem. 3- I'm now learning all I can about event timers and you have not answered some of my questions :-) danny
Re: kvm vlan virtio problem
Hi, - Original Message - Hi! The same warning shows up in our setup: Jan 21 23:40:46 host kernel: WARNING: at net/core/dev.c:1712 skb_gso_segment+0x1df/0x2b0() (Tainted: G W --- ) Jan 21 23:40:46 host kernel: Hardware name: System Product Name Jan 21 23:40:46 host kernel: tun: caps=(0x1b0049, 0x0) len=4452 data_len=4380 ip_summed=0 [...] KVM host: CentOS 6.3, Linux kernel 2.6.32-279.19.1.el6.x86_64 VM guest: FreeBSD 9.1, virtio-kmod-9.1-0.242658 Disabling TSO on vtnet0 stops the warnings on the KVM host. Is there any progress on this issue? It seems the only way this could happen is if the FreeBSD TCP/IP stack sent down an mbuf with CSUM_TSO set and CSUM_TCP not set; this doesn't seem possible from looking at FreeBSD's tcp_output(). I'll try to look closer at this during this week. Bryan Best regards Franz
Re: kvm vlan virtio problem
Hi, - Original Message - Hi! The same warning shows up in our setup: Jan 21 23:40:46 host kernel: WARNING: at net/core/dev.c:1712 skb_gso_segment+0x1df/0x2b0() (Tainted: GW --- ) Jan 21 23:40:46 host kernel: Hardware name: System Product Name Jan 21 23:40:46 host kernel: tun: caps=(0x1b0049, 0x0) len=4452 data_len=4380 ip_summed=0 [...] KVM host: CentOS 6.3, Linux kernel 2.6.32-279.19.1.el6.x86_64 VM guest: FreeBSD 9.1, virtio-kmod-9.1-0.242658 Disabling TSO on vtnet0 stops the warnings on the KVM host. Is there any progress on this issue? Alright, I tried to recreate this on Ubuntu 12.10 without any luck. Please describe your network configuration. On my Linux host, my VLAN interface looks like: eth0.100 Link encap:Ethernet HWaddr 6c:f0:49:05:2b:6d inet6 addr: fe80::6ef0:49ff:fe05:2b6d/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:3119867 errors:0 dropped:0 overruns:0 frame:0 TX packets:3790183 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:166813040 (166.8 MB) TX bytes:5435432448 (5.4 GB) That is plugged into this bridge: br100 Link encap:Ethernet HWaddr 6c:f0:49:05:2b:6d inet addr:192.168.99.101 Bcast:192.168.99.255 Mask:255.255.255.0 inet6 addr: fe80::6ef0:49ff:fe05:2b6d/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:14 errors:0 dropped:0 overruns:0 frame:0 TX packets:18 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:876 (876.0 B) TX bytes:1420 (1.4 KB) With the tap device created by QEMU for my FreeBSD guest: vnet1 Link encap:Ethernet HWaddr fe:54:00:ec:4f:4e inet6 addr: fe80::fc54:ff:feec:4f4e/64 Scope:Link UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:800284 errors:0 dropped:0 overruns:0 frame:0 TX packets:3119877 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:500 RX bytes:5238099122 (5.2 GB) TX bytes:210492002 (210.4 MB) All this tied together: # brctl show br100 bridge name bridge id STP enabled interfaces br100 
8000.6cf049052b6d no eth0.100 vnet1 Does this approximate your configuration? What's the output of `ethtool -k` for your VLAN, bridge, and vnet interfaces? Bryan Best regards Franz
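To collect what Bryan asks for, something like this on the KVM host (interface names taken from his example; substitute your own):

```shell
# Dump the offload settings for each hop in the bridged path
for ifc in eth0.100 br100 vnet1; do
    echo "== $ifc =="
    ethtool -k "$ifc"
done

# If needed, offloads can also be toggled per interface, e.g. on the tap:
#   ethtool -K vnet1 tso off gso off
```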