Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-06 Thread John Baldwin
On Tuesday, November 05, 2013 3:14:22 pm Oliver Pinter wrote: Hmm, it seems the bottleneck is not in geom or cam, but in the em driver or the networking stack. The scenario is: machine A: dd if=/dev/ada1 bs=1M | nc -l; machine B: nc IP | dd of=/dev/null bs=1M. Hmm, when

Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-06 Thread Adrian Chadd
.. the main reason to use machdep.idle=hlt is that it is a different code path. But to ensure you're always going via the hlt codepath, you _first_ have to disable mwait. The idle code first decides whether to run mwait or idle, then if it doesn't choose mwait, it chooses machdep.idle. That's
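
The ordering described above (disable mwait first, then select hlt) can be sketched as a small shell script. This is only an illustration: the sysctl names machdep.idle_mwait and machdep.idle come from this thread, while the run and force_hlt_idle helpers and the APPLY variable are hypothetical names made up for the example. By default it only prints the commands; on a FreeBSD box, as root, APPLY=1 would actually run them.

```shell
#!/bin/sh
# Hypothetical helper: print the command unless APPLY=1, in which case run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Hypothetical wrapper for the order discussed in the thread.
force_hlt_idle() {
    # Step 1: disable mwait so the idle code does not pick it first.
    run sysctl machdep.idle_mwait=0
    # Step 2: with mwait out of the way, select the hlt idle path.
    run sysctl machdep.idle=hlt
}

force_hlt_idle
```

Keeping the dry run as the default makes it easy to check what would change before touching a machine that is already being benchmarked.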

[10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-05 Thread Oliver Pinter
Hi all! The machine is a Haswell machine, and the disk performance was very poor (20-30 MByte/sec). When I changed kern.eventtimer.idletick from 0 to 1, performance returned to normal (70-90 MByte/sec). The default eventtimer was LAPIC. On another machine, a Q9300, this was fully

Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-05 Thread Adrian Chadd
Hi! Can you do 'sysctl dev.cpu' please? I'd like to see what sleep state(s) your CPU is entering. Thanks! -adrian On 5 November 2013 06:07, Oliver Pinter oliver.p...@gmail.com wrote: Hi all! The machine is a Haswell machine, the disc performance was very poor (20-30MByte/sec). When I

Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-05 Thread Oliver Pinter
op@perpetua ~ sysctl dev.cpu
dev.cpu.0.%desc: ACPI CPU
dev.cpu.0.%driver: cpu
dev.cpu.0.%location: handle=\_PR_.CPU0
dev.cpu.0.%pnpinfo: _HID=none _UID=0
dev.cpu.0.%parent: acpi0
dev.cpu.0.coretemp.delta: 59
dev.cpu.0.coretemp.resolution: 1
dev.cpu.0.coretemp.tjmax: 100.0C

Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-05 Thread Adrian Chadd
Ok, so it's only hitting C1. It's not going into C2. Is this a dual core CPU with hyperthreading enabled, or a quad core CPU? How about changing the idle loop from acpi to hlt, see if that fixes things? (Without tweaking the event timer logic.) I'm worried that what you're seeing here are

Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-05 Thread Adrian Chadd
and sysctl machdep.idle_mwait=0 -adrian On 5 November 2013 10:12, Adrian Chadd adr...@freebsd.org wrote: Ok, so it's only hitting C1. It's not going into C2. Is this a dual core CPU with hyperthreading enabled, or a quad core CPU? How about changing the idle loop from acpi to hlt, see if

Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-05 Thread Oliver Pinter
On 11/5/13, Adrian Chadd adr...@freebsd.org wrote: Ok, so it's only hitting C1. It's not going into C2. Is this a dual core CPU with hyperthreading enabled, or a quad core CPU? quad core, i5-4670 How about changing the idle loop from acpi to hlt, see if that fixes things? (Without tweaking

Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-05 Thread Oliver Pinter
dmesg corrected On 11/5/13, Oliver Pinter oliver.p...@gmail.com wrote: On 11/5/13, Adrian Chadd adr...@freebsd.org wrote: Ok, so it's only hitting C1. It's not going into C2. Is this a dual core CPU with hyperthreading enabled, or a quad core CPU? quad core, i5-4670 How about changing

Re: [10-STABLE, 11-CURRENT] something wrong between cam and eventtimer or geom and eventtimer

2013-11-05 Thread Oliver Pinter
Hmm, it seems the bottleneck is not in geom or cam, but in the em driver or the networking stack. The scenario is: machine A: dd if=/dev/ada1 bs=1M | nc -l; machine B: nc IP | dd of=/dev/null bs=1M. Hmm, when dd-ing from /dev/zero and switching idletick back to 0, then the performance