Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-09-04 Thread John Baldwin
On Thursday, August 08, 2013 10:41:12 am Eric van Gyzen wrote:
 On 08/08/2013 09:19, Eric van Gyzen wrote:
  On 08/06/2013 14:23, J David wrote:
  On Tue, Aug 6, 2013 at 1:59 PM, Eric van Gyzen e...@vangyzen.net wrote:
  on an otherwise idle amd64 system with 4 CPUs.  The first command in the
  build.log file:
 
  rm -rf /usr/obj/home/freebsd/tmp
 
  took over three minutes.  It should have taken about three /seconds/.
 
  uptime reported a load average of around 1.00.
  top showed no threads (user or kernel) using CPU.
  iostat showed an average of less than 20 tps on ada0.
  rm was usually in the RUN state.
  We are looking at something similar.  Would you be able to try to
  reproduce it using a kernel with:
 
  nooptions  SCHED_ULE
  options    SCHED_4BSD
 
  to see if it makes a difference?  It seems to, but the problem is
  inconsistent enough that I can't be sure.
  The 4BSD scheduler does /not/ exhibit this problem.  I tested with the
  latest releng/9.2 (r254054) and an otherwise GENERIC config.
 
 To be thorough, I built a GENERIC kernel at the same rev, and it still
 exhibits the problem.

Please try this change:

Index: sched_ule.c
===================================================================
--- sched_ule.c (revision 255020)
+++ sched_ule.c (working copy)
@@ -243,7 +243,7 @@ struct tdq {
     int     tdq_transferable;   /* Transferable thread count. */
     short   tdq_switchcnt;      /* Switches this tick. */
     short   tdq_oldswitchcnt;   /* Switches last tick. */
-    u_char  tdq_lowpri;         /* Lowest priority thread. */
+    u_short tdq_lowpri;         /* Lowest priority thread. */
     u_char  tdq_ipipending;     /* IPI pending. */
     u_char  tdq_idx;            /* Current insert index. */
     u_char  tdq_ridx;           /* Current removal index. */
@@ -2323,7 +2323,7 @@ sched_choose(void)
         tdq->tdq_lowpri = td->td_priority;
         return (td);
     }
-    tdq->tdq_lowpri = PRI_MAX_IDLE;
+    tdq->tdq_lowpri = PRI_MAX_IDLE + 1;
     return (PCPU_GET(idlethread));
 }
 

-- 
John Baldwin
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-08 Thread Eric van Gyzen
On 08/06/2013 14:23, J David wrote:
 On Tue, Aug 6, 2013 at 1:59 PM, Eric van Gyzen e...@vangyzen.net wrote:
 on an otherwise idle amd64 system with 4 CPUs.  The first command in the
 build.log file:

 rm -rf /usr/obj/home/freebsd/tmp

 took over three minutes.  It should have taken about three /seconds/.

 uptime reported a load average of around 1.00.
 top showed no threads (user or kernel) using CPU.
 iostat showed an average of less than 20 tps on ada0.
 rm was usually in the RUN state.
 We are looking at something similar.  Would you be able to try to
 reproduce it using a kernel with:

 nooptions SCHED_ULE
 options   SCHED_4BSD

 to see if it makes a difference?  It seems to, but the problem is
 inconsistent enough that I can't be sure.

The 4BSD scheduler does /not/ exhibit this problem.  I tested with the
latest releng/9.2 (r254054) and an otherwise GENERIC config.

Eric


Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-08 Thread Eric van Gyzen
On 08/08/2013 09:19, Eric van Gyzen wrote:
 On 08/06/2013 14:23, J David wrote:
 On Tue, Aug 6, 2013 at 1:59 PM, Eric van Gyzen e...@vangyzen.net wrote:
 on an otherwise idle amd64 system with 4 CPUs.  The first command in the
 build.log file:

 rm -rf /usr/obj/home/freebsd/tmp

 took over three minutes.  It should have taken about three /seconds/.

 uptime reported a load average of around 1.00.
 top showed no threads (user or kernel) using CPU.
 iostat showed an average of less than 20 tps on ada0.
 rm was usually in the RUN state.
 We are looking at something similar.  Would you be able to try to
 reproduce it using a kernel with:

 nooptions  SCHED_ULE
 options    SCHED_4BSD

 to see if it makes a difference?  It seems to, but the problem is
 inconsistent enough that I can't be sure.
 The 4BSD scheduler does /not/ exhibit this problem.  I tested with the
 latest releng/9.2 (r254054) and an otherwise GENERIC config.

To be thorough, I built a GENERIC kernel at the same rev, and it still
exhibits the problem.

Eric


Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-07 Thread David Xu

On 2013/08/06 05:15, Dave Mischler wrote:

I have an i5-2500 machine 8GB RAM now running 9.2-RC1 amd64 with the
GENERIC kernel. Today, while still running 9.2-BETA2, I updated my
source tree and started building world with idprio 31 and I looked back
a while later and all the CPU cores and disk were essentially idle, and
hardly any progress had been made on the build. I stopped and restarted
the build without the idle priority setting and it ran fine. Anybody
else seen any of this? Anybody know about any fairly recent changes that
might account for it?

I did a rm -rf /usr/src /usr/obj and loaded a new source tree before
going to RC1.  I still see odd behavior at RC1.  Sometimes it works just
like it should (i.e. compute bound processes use most/all of the
available CPU time), but a lot of the time both the CPU and disk are
idle (e.g. CPU 97.8% idle, disk 1% busy per systat).  I don't think I
ever saw this behavior before while running make buildworld -j4.  Can
anyone else confirm/rebut my findings?  Thanks.




Idle priority should never be used; it can cause long-term priority
inversion in the kernel and make the system slower.





Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-07 Thread Andriy Gapon
on 06/08/2013 00:15 Dave Mischler said the following:
 I have an i5-2500 machine 8GB RAM now running 9.2-RC1 amd64 with the
 GENERIC kernel. Today, while still running 9.2-BETA2, I updated my
 source tree and started building world with idprio 31 and I looked back
 a while later and all the CPU cores and disk were essentially idle, and
 hardly any progress had been made on the build. I stopped and restarted
 the build without the idle priority setting and it ran fine. Anybody
 else seen any of this? Anybody know about any fairly recent changes that
 might account for it?
 
 I did a rm -rf /usr/src /usr/obj and loaded a new source tree before
 going to RC1.  I still see odd behavior at RC1.  Sometimes it works just
 like it should (i.e. compute bound processes use most/all of the
 available CPU time), but a lot of the time both the CPU and disk are
 idle (e.g. CPU 97.8% idle, disk 1% busy per systat).  I don't think I
 ever saw this behavior before while running make buildworld -j4.  Can
 anyone else confirm/rebut my findings?  Thanks.

Are you sure that you really want to use idprio for the goal you want to
achieve?  If yes, are you sure that you want to use idprio 31 specifically?
With sched_ule, idprio 31 is equivalent to the priority of a completely idle
system, so the scheduler is within its rights to run the idle (do nothing)
thread instead of your thread(s).

P.S.
https://wiki.freebsd.org/AvgThreadPriorityRanges
-- 
Andriy Gapon


Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-07 Thread Eric van Gyzen
On 08/07/2013 04:09, Andriy Gapon wrote:
 on 06/08/2013 00:15 Dave Mischler said the following:
 I have an i5-2500 machine 8GB RAM now running 9.2-RC1 amd64 with the
 GENERIC kernel. Today, while still running 9.2-BETA2, I updated my
 source tree and started building world with idprio 31 and I looked back
 a while later and all the CPU cores and disk were essentially idle, and
 hardly any progress had been made on the build. I stopped and restarted
 the build without the idle priority setting and it ran fine. Anybody
 else seen any of this? Anybody know about any fairly recent changes that
 might account for it?

 I did a rm -rf /usr/src /usr/obj and loaded a new source tree before
 going to RC1.  I still see odd behavior at RC1.  Sometimes it works just
 like it should (i.e. compute bound processes use most/all of the
 available CPU time), but a lot of the time both the CPU and disk are
 idle (e.g. CPU 97.8% idle, disk 1% busy per systat).  I don't think I
 ever saw this behavior before while running make buildworld -j4.  Can
 anyone else confirm/rebut my findings?  Thanks.
 Are you sure that you really want to use idprio for the goal you want to
 achieve?  If yes, are you sure that you want to use idprio 31 specifically?
 With sched_ule, idprio 31 is equivalent to the priority of a completely idle
 system, so the scheduler is within its rights to run the idle (do nothing)
 thread instead of your thread(s).

That sounds like a bug to me, or a POLA violation at least.  A user
thread should never have the same priority as the idle threads, because
a user thread, by definition, has work to do.

From the rtprio(1) examples:

 To make depend while not disturbing other machine usage:
   idprio 31 make depend

 P.S.
 https://wiki.freebsd.org/AvgThreadPriorityRanges

Nice!  Thank you for writing it and sending the link.

Eric


Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-06 Thread Eric van Gyzen
On 08/05/2013 16:15, Dave Mischler wrote:
 I have an i5-2500 machine 8GB RAM now running 9.2-RC1 amd64 with the
 GENERIC kernel. Today, while still running 9.2-BETA2, I updated my
 source tree and started building world with idprio 31 and I looked back
 a while later and all the CPU cores and disk were essentially idle, and
 hardly any progress had been made on the build. I stopped and restarted
 the build without the idle priority setting and it ran fine. Anybody
 else seen any of this? Anybody know about any fairly recent changes that
 might account for it?

 I did a rm -rf /usr/src /usr/obj and loaded a new source tree before
 going to RC1.  I still see odd behavior at RC1.  Sometimes it works just
 like it should (i.e. compute bound processes use most/all of the
 available CPU time), but a lot of the time both the CPU and disk are
 idle (e.g. CPU 97.8% idle, disk 1% busy per systat).  I don't think I
 ever saw this behavior before while running make buildworld -j4.  Can
 anyone else confirm/rebut my findings?  Thanks.

I can confirm your findings, on 9.1-RELEASE-p5 amd64 GENERIC.

I ran

$ idprio 31 make buildworld buildkernel > /tmp/build.log 2>&1 < /dev/null &

on an otherwise idle amd64 system with 4 CPUs.  The first command in the
build.log file:

rm -rf /usr/obj/home/freebsd/tmp

took over three minutes.  It should have taken about three /seconds/.

uptime reported a load average of around 1.00.
top showed no threads (user or kernel) using CPU.
iostat showed an average of less than 20 tps on ada0.
rm was usually in the RUN state.

/home/freebsd (src) is UFS+SUJ.
/usr/obj is UFS+SU.
/tmp/build.log is tmpfs.

Both UFS file systems are on ada0:

ada0 at ata2 bus 0 scbus2 target 0 lun 0
ada0: WDC WD2502ABYS-18B7A0 02.03B05 ATA-8 SATA 2.x device
ata2: ATA channel at channel 0 on atapci0
atapci0: Intel 5 Series/3400 Series PCH SATA300 controller

CPU: Intel(R) Xeon(R) CPU   X3430  @ 2.40GHz (2394.04-MHz K8-class CPU)
real memory  = 8589934592 (8192 MB)
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)

FreeBSD 9.1-RELEASE-p5 #0 r+94f2ad5: Tue Aug  6 09:40:22 CDT 2013
root@srv5:/usr/obj/home/freebsd/sys/GENERIC  amd64

/boot/loader.conf contains:

console="comconsole vidconsole"
comconsole_speed=115200
comconsole_port=0x2f8

/etc/sysctl.conf is empty.

I'll update to releng/9.2 and try again.

Eric


Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-06 Thread Eric van Gyzen
On 08/06/2013 10:31, Eric van Gyzen wrote:
 On 08/05/2013 16:15, Dave Mischler wrote:
 I have an i5-2500 machine 8GB RAM now running 9.2-RC1 amd64 with the
 GENERIC kernel. Today, while still running 9.2-BETA2, I updated my
 source tree and started building world with idprio 31 and I looked back
 a while later and all the CPU cores and disk were essentially idle, and
 hardly any progress had been made on the build. I stopped and restarted
 the build without the idle priority setting and it ran fine. Anybody
 else seen any of this? Anybody know about any fairly recent changes that
 might account for it?

 I did a rm -rf /usr/src /usr/obj and loaded a new source tree before
 going to RC1.  I still see odd behavior at RC1.  Sometimes it works just
 like it should (i.e. compute bound processes use most/all of the
 available CPU time), but a lot of the time both the CPU and disk are
 idle (e.g. CPU 97.8% idle, disk 1% busy per systat).  I don't think I
 ever saw this behavior before while running make buildworld -j4.  Can
 anyone else confirm/rebut my findings?  Thanks.
 I can confirm your findings, on 9.1-RELEASE-p5 amd64 GENERIC.

 I ran

 $ idprio 31 make buildworld buildkernel > /tmp/build.log 2>&1 < /dev/null &

 on an otherwise idle amd64 system with 4 CPUs.  The first command in the
 build.log file:

 rm -rf /usr/obj/home/freebsd/tmp

 took over three minutes.  It should have taken about three /seconds/.

 uptime reported a load average of around 1.00.
 top showed no threads (user or kernel) using CPU.
 iostat showed an average of less than 20 tps on ada0.
 rm was usually in the RUN state.

 /home/freebsd (src) is UFS+SUJ.
 /usr/obj is UFS+SU.
 /tmp/build.log is tmpfs.

 Both UFS file systems are on ada0:

 ada0 at ata2 bus 0 scbus2 target 0 lun 0
 ada0: WDC WD2502ABYS-18B7A0 02.03B05 ATA-8 SATA 2.x device
 ata2: ATA channel at channel 0 on atapci0
 atapci0: Intel 5 Series/3400 Series PCH SATA300 controller

 CPU: Intel(R) Xeon(R) CPU   X3430  @ 2.40GHz (2394.04-MHz K8-class CPU)
 real memory  = 8589934592 (8192 MB)
 FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
 FreeBSD/SMP: 1 package(s) x 4 core(s)

 FreeBSD 9.1-RELEASE-p5 #0 r+94f2ad5: Tue Aug  6 09:40:22 CDT 2013
 root@srv5:/usr/obj/home/freebsd/sys/GENERIC  amd64

 /boot/loader.conf contains:

 console="comconsole vidconsole"
 comconsole_speed=115200
 comconsole_port=0x2f8

 /etc/sysctl.conf is empty.

 I'll update to releng/9.2 and try again.

I see more-or-less the same behavior on 9.2-RC1 (r253912).  It seems to
be faster than 9.1, but it's still much slower than I would expect.

idprio 30 is much, much faster than 31.  It's about as fast as I would
expect (for this idle machine).  So, the problem seems to affect only
idprio 31.  (Off-by-one / fencepost problem?)

CPU-bound processes, such as c++, seem to run at the normal speeds, so
the problem seems to affect system- or I/O-bound work.

Can anyone try this on a 9.0-RELEASE system?

Eric


Re: unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-06 Thread J David
On Tue, Aug 6, 2013 at 1:59 PM, Eric van Gyzen e...@vangyzen.net wrote:
 on an otherwise idle amd64 system with 4 CPUs.  The first command in the
 build.log file:

 rm -rf /usr/obj/home/freebsd/tmp

 took over three minutes.  It should have taken about three /seconds/.

 uptime reported a load average of around 1.00.
 top showed no threads (user or kernel) using CPU.
 iostat showed an average of less than 20 tps on ada0.
 rm was usually in the RUN state.

We are looking at something similar.  Would you be able to try to
reproduce it using a kernel with:

nooptions   SCHED_ULE
options SCHED_4BSD

to see if it makes a difference?  It seems to, but the problem is
inconsistent enough that I can't be sure.

Thanks!


unexpected idprio 31 behavior on 9.2-BETA2 and 9.2-RC1

2013-08-05 Thread Dave Mischler
I have an i5-2500 machine 8GB RAM now running 9.2-RC1 amd64 with the
GENERIC kernel. Today, while still running 9.2-BETA2, I updated my
source tree and started building world with idprio 31 and I looked back
a while later and all the CPU cores and disk were essentially idle, and
hardly any progress had been made on the build. I stopped and restarted
the build without the idle priority setting and it ran fine. Anybody
else seen any of this? Anybody know about any fairly recent changes that
might account for it?

I did a rm -rf /usr/src /usr/obj and loaded a new source tree before
going to RC1.  I still see odd behavior at RC1.  Sometimes it works just
like it should (i.e. compute bound processes use most/all of the
available CPU time), but a lot of the time both the CPU and disk are
idle (e.g. CPU 97.8% idle, disk 1% busy per systat).  I don't think I
ever saw this behavior before while running make buildworld -j4.  Can
anyone else confirm/rebut my findings?  Thanks.

