Re: In-kernel PPPoE

2010-12-07 Thread Alexander Motin

On 06.12.2010 03:45, David Rhodus wrote:

On Sun, Dec 5, 2010 at 6:32 PM, Andriy Gapon a...@freebsd.org wrote:

on 05/12/2010 22:30 Julian Elischer said the following:

On 12/5/10 9:30 AM, Bernd Walter wrote:

On Sun, Dec 05, 2010 at 12:14:21PM -0500, Pierre Lamy wrote:

Just curious about why the in-kernel PPPoE interface was never ported
from NetBSD or OpenBSD, to FreeBSD. Does anyone know why?

Maybe because everyone who cares about in-kernel uses the FreeBSD
in-kernel ng_pppoe via mpd?


  From using it for a long time in OpenBSD I always found it quite stable
and easy to use.

The same is true with mpd/ng_pppoe.

While I like mpd, I should point out that the regular 'in source' ppp that
comes with FreeBSD also uses the in-kernel netgraph pppoe module.  I use it
24 x 7 on my gateway, as I never got around to installing mpd and it did
the job.


BTW, there is a rumor that mpd may become an 'in source' program too.


Does mpd work in -current? Last time I tried, netgraph had problems with mpd.


Sure it does! What is the problem?

--
Alexander Motin
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: trying to use xz on manuals.

2010-12-07 Thread Alex Kozlov
On Mon, Dec 06, 2010 at 10:50:44PM -0800, Tim Kientzle wrote:
 On Dec 6, 2010, at 11:17 AM, Chuck Swiger wrote:
 On Dec 6, 2010, at 9:13 AM, Alex Kozlov wrote:
 On Tue, Dec 07, 2010 at 02:03:50AM +0900, Norikatsu Shigemura wrote:
.xz is smaller than .gz, but the effect is only about 96.2% :-(.
 Some time ago I did similar tests. Changing compression for the base man
  pages to bz2 or xz doesn't make much sense.
 Oh, agreed.  The issue with small files is that they will always take up
 at least one sector [*]; different compression routines don't gain any
  benefit if they don't change the number of sectors needed to store the file.
 More than half of the manpages end up as 1K .gz catman files as it is;
 ~90% are 2K or smaller.
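
To make the sector argument above concrete, here is a minimal Python sketch. The 2048-byte allocation unit and the two small page sizes are assumptions chosen for illustration (the thread does not state the filesystem parameters); the two CC.1 sizes are the ones from the ls output in this thread.

```python
def allocated_bytes(size, unit=2048):
    """Round a file size up to whole allocation units (at least one)."""
    units = max(1, -(-size // unit))  # ceiling division
    return units * unit


if __name__ == "__main__":
    # Hypothetical small man page: the .xz is about 96.2% of the .gz size.
    small_gz, small_xz = 900, 866
    # Sizes of CC.1.gz and CC.1.xz as listed in this thread.
    big_gz, big_xz = 161663, 131580

    for label, gz, xz in (("small", small_gz, small_xz),
                          ("CC.1", big_gz, big_xz)):
        saved = allocated_bytes(gz) - allocated_bytes(xz)
        print(f"{label}: gz uses {allocated_bytes(gz)}, "
              f"xz uses {allocated_bytes(xz)}, saved {saved}")
```

With these assumed numbers the small page occupies one unit either way, so the smaller .xz saves nothing on disk; only the large page crosses enough unit boundaries to save space.
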
 It might make sense if XZ decompression were significantly
 faster than GZip decompression.  (Especially since man pages
 are decompressed much more often than they are compressed.)
It's not.

Biggest man page from the base system, FreeBSD 9.0-CURRENT Sat Oct 23 amd64,
CPU: Pentium(R) Dual-Core CPU T4400 @ 2.20GHz (2194.55-MHz K8-class CPU),
average of 3 tries:

$ ls -l CC.1*
-rw-r--r--  1 kozlov  kozlov  584775 Dec  7 09:14 CC.1
-rw-r--r--  1 kozlov  kozlov  161663 Dec  7 09:14 CC.1.gz
-rw-r--r--  1 kozlov  kozlov  131580 Dec  7 09:13 CC.1.xz
$ cat CC.1.?z > /dev/null
$ time xzcat CC.1.xz > /dev/null

real    0m0.032s
user    0m0.028s
sys     0m0.000s
$ time gzcat CC.1.gz > /dev/null

real    0m0.012s
user    0m0.008s
sys     0m0.000s
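
For anyone who wants to repeat this kind of comparison without the shell's time builtin, a rough sketch follows. It is an assumption-laden stand-in, not the measurement above: it uses Python's gzip and lzma modules instead of gzcat and xzcat, and a synthetic roff-like sample (make_sample is a made-up helper) instead of CC.1, so the absolute numbers will not match. It reports the best of several runs rather than an average.

```python
import gzip
import lzma
import time


def make_sample(lines=20000):
    """Build synthetic, repetitive roff-like text (not a real man page)."""
    parts = [f".It Fl option{i % 97}\nDescription of option number {i}.\n"
             for i in range(lines)]
    return "".join(parts).encode("ascii")


def best_time(func, payload, repeats=5):
    """Return the fastest of several wall-clock timings of func(payload)."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        func(payload)
        best = min(best, time.perf_counter() - start)
    return best


if __name__ == "__main__":
    raw = make_sample()
    gz_blob = gzip.compress(raw)
    xz_blob = lzma.compress(raw)  # default container format is .xz
    print(f"raw {len(raw)}  gz {len(gz_blob)}  xz {len(xz_blob)} bytes")
    print(f"gzip decompress: {best_time(gzip.decompress, gz_blob) * 1000:.3f} ms")
    print(f"xz   decompress: {best_time(lzma.decompress, xz_blob) * 1000:.3f} ms")
```
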


--
Adios


Lock order reversal .

2010-12-07 Thread Mehmet Erol Sanliturk
A Dmesg.TXT with a lock order reversal message is attached.

Thank you very much.

Mehmet Erol Sanliturk
Copyright (c) 1992-2010 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 9.0-CURRENT-201011 #0: Wed Nov  3 17:44:48 UTC 2010
r...@farrell.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
WARNING: WITNESS option enabled, expect reduced performance.
CPU: Intel(R) Core(TM)2 Quad CPU Q6600  @ 2.40GHz (2397.65-MHz K8-class CPU)
  Origin = GenuineIntel  Id = 0x6fb  Family = 6  Model = f  Stepping = 11
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
  AMD Features=0x20100800<SYSCALL,NX,LM>
  AMD Features2=0x1<LAHF>
  TSC: P-state invariant
real memory  = 2147483648 (2048 MB)
avail memory = 2024038400 (1930 MB)
Event timer LAPIC quality 400
ACPI APIC Table: INTEL  DG965WH 
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
FreeBSD/SMP: 1 package(s) x 4 core(s)
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
 cpu2 (AP): APIC ID:  2
 cpu3 (AP): APIC ID:  3
ioapic0: Changing APIC ID to 2
ioapic0 Version 2.0 irqs 0-23 on motherboard
kbd1 at kbdmux0
acpi0: INTEL DG965WH on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x408-0x40b on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0
cpu2: ACPI CPU on acpi0
cpu3: ACPI CPU on acpi0
acpi_button0: Sleep Button on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
vgapci0: VGA-compatible display port 0x3410-0x3417 mem 0x9020-0x902f,0x8000-0x8fff irq 16 at device 2.0 on pci0
agp0: Intel G965 SVGA controller on vgapci0
agp0: aperture size is 256M, detected 7676k stolen memory
pci0: simple comms at device 3.0 (no driver attached)
em0: Intel(R) PRO/1000 Network Connection 7.1.7 port 0x30e0-0x30ff mem 0x9030-0x9031,0x90324000-0x90324fff irq 20 at device 25.0 on pci0
em0: Using an MSI interrupt
acquiring duplicate lock of same type: network driver
 1st dev_spec-swflag_mutex @ /usr/src/sys/dev/e1000/e1000_ich8lan.c:778
 2nd dev_spec-nvm_mutex @ /usr/src/sys/dev/e1000/e1000_ich8lan.c:744
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
kdb_backtrace() at kdb_backtrace+0x37
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x8de
_mtx_lock_flags() at _mtx_lock_flags+0x79
e1000_acquire_nvm_ich8lan() at e1000_acquire_nvm_ich8lan+0x1e
e1000_read_nvm_ich8lan() at e1000_read_nvm_ich8lan+0x76
e1000_post_phy_reset_ich8lan() at e1000_post_phy_reset_ich8lan+0x1b1
e1000_reset_hw_ich8lan() at e1000_reset_hw_ich8lan+0x4c1
em_attach() at em_attach+0x120f
device_attach() at device_attach+0x69
bus_generic_attach() at bus_generic_attach+0x1a
acpi_pci_attach() at acpi_pci_attach+0x14f
device_attach() at device_attach+0x69
bus_generic_attach() at bus_generic_attach+0x1a
acpi_pcib_attach() at acpi_pcib_attach+0x1a7
acpi_pcib_acpi_attach() at acpi_pcib_acpi_attach+0x1fd
device_attach() at device_attach+0x69
bus_generic_attach() at bus_generic_attach+0x1a
acpi_attach() at acpi_attach+0xaa0
device_attach() at device_attach+0x69
bus_generic_attach() at bus_generic_attach+0x1a
nexus_acpi_attach() at nexus_acpi_attach+0x69
device_attach() at device_attach+0x69
bus_generic_new_pass() at bus_generic_new_pass+0xd6
bus_set_pass() at bus_set_pass+0x7a
configure() at configure+0xa
mi_startup() at mi_startup+0x77
btext() at btext+0x2c
em0: Ethernet address: 00:1c:c0:1e:c4:05
uhci0: Intel 82801H (ICH8) USB controller USB-D port 0x30c0-0x30df irq 16 at device 26.0 on pci0
uhci0: LegSup = 0x2f00
usbus0: Intel 82801H (ICH8) USB controller USB-D on uhci0
uhci1: Intel 82801H (ICH8) USB controller USB-E port 0x30a0-0x30bf irq 21 at device 26.1 on pci0
uhci1: LegSup = 0x2f00
usbus1: Intel 82801H (ICH8) USB controller USB-E on uhci1
ehci0: Intel 82801H (ICH8) USB 2.0 controller USB2-B mem 0x90325c00-0x90325fff irq 18 at device 26.7 on pci0
usbus2: EHCI version 1.0
usbus2: Intel 82801H (ICH8) USB 2.0 controller USB2-B on ehci0
pci0: multimedia, HDA at device 27.0 (no driver attached)
pcib1: ACPI PCI-PCI bridge at device 28.0 on pci0
pci1: ACPI PCI bus on pcib1
pcib2: ACPI PCI-PCI bridge at device 28.1 on pci0
pci2: ACPI PCI bus on pcib2
atapci0: Marvell 88SX6101 UDMA133 controller port 0x2018-0x201f,0x2024-0x2027,0x2010-0x2017,0x2020-0x2023,0x2000-0x200f mem 0x9010-0x901001ff irq 17 at device 0.0 on pci2
ata2: ATA channel 0 on atapci0
pcib3: ACPI PCI-PCI bridge at device 28.2 on pci0
pci3: ACPI PCI bus on pcib3
pcib4: ACPI PCI-PCI bridge at device 28.3 on pci0
pci4: ACPI PCI bus on pcib4
pcib5: ACPI PCI-PCI bridge at device 

Re: Lock order reversal .

2010-12-07 Thread Garrett Cooper
On Dec 7, 2010, at 12:26 AM, Mehmet Erol Sanliturk m.e.sanlit...@gmail.com 
wrote:

 A Dmesg.TXT is attached having a lock order reversal .

The mount LOR is well known. The duplicate lock held WITNESS warning with 
em(4) might be of interest though.
Thanks,
-Garrett


Re: Lock order reversal .

2010-12-07 Thread Erik Cederstrand

Den 07/12/2010 kl. 10.20 skrev Garrett Cooper:

 On Dec 7, 2010, at 12:26 AM, Mehmet Erol Sanliturk m.e.sanlit...@gmail.com 
 wrote:
 
 A Dmesg.TXT is attached having a lock order reversal .
 
The mount LOR is well known.

I see that this is the standard response to lots of LOR reports. It seems to 
be one of the most-reported errors on CURRENT (and it's certainly a loud one), 
but I think a lot of people waste time researching the error and browsing 
Bjoern's LOR page, only to get the above response (not picking on you, Garrett).

Do we have the possibility of silencing well-known and presumably harmless 
LORs if there isn't sufficient motivation to fix the source?

Erik

Re: In-kernel PPPoE

2010-12-07 Thread Bjoern A. Zeeb

On Tue, 7 Dec 2010, Alexander Motin wrote:


Does mpd work in -current? Last time I tried, netgraph had problems with mpd.


Sure it does! What is the problem?


There have been several reports (incl. panics) on various lists like
net, stable, ... during the last months, mostly for 8.x (and HEAD).
None of them was really followed up, to my memory.

/bz

--
Bjoern A. Zeeb  Welcome a new stage of life.
ks Going to jail sucks -- bz All my daemons like it!
  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/jails.html


Re: Lock order reversal .

2010-12-07 Thread Attilio Rao
2010/12/7 Erik Cederstrand e...@cederstrand.dk:

 Den 07/12/2010 kl. 10.20 skrev Garrett Cooper:

 On Dec 7, 2010, at 12:26 AM, Mehmet Erol Sanliturk m.e.sanlit...@gmail.com 
 wrote:

 A Dmesg.TXT is attached having a lock order reversal .

    The mount LOR is well known.

 I see that this is the standard response to lots of LOR reports. It seems to 
 be one of the most-reported errors on CURRENT (and it's certainly a loud 
 one), but I think a lot of people waste time researching the error and 
 browsing Bjoern's LOR page, only to get the above response (not picking on 
 you, Garrett).

 Do we have the possibility of silencing well-known and presumably harmless 
 LORs if there isn't sufficient motivation to fix the source?

Witness has an 'internal blessing list' that we never wanted to use, so
that the reports keep popping up as a reminder.
Actually, the fact that a LOR is 'known' doesn't mean it is 'analyzed'.
The very few 'analyzed but harmless' cases in the past have been
handled via _NOWITNESS flags, I guess.

Attilio


-- 
Peace can only be achieved by understanding - A. Einstein


Re: Process accounting/timing has broken recently

2010-12-07 Thread John Baldwin
On Monday, December 06, 2010 7:11:28 pm David Xu wrote:
 John Baldwin wrote:
  On Sunday, December 05, 2010 6:18:29 pm Steve Kargl wrote:

  Sometime in the last 7-10 days, some one made a
  change that has broken process accounting/timing.
 
  laptop:kargl[42] foreach i ( 0 1 2 3 4 5 6 7 8 9 )
  foreach? time ./testf
  foreach? end
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         69.55 real        38.39 user        30.94 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         68.82 real        40.95 user        27.60 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         69.14 real        38.90 user        30.02 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         68.79 real        40.59 user        27.99 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         68.93 real        39.76 user        28.96 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         68.71 real        41.21 user        27.29 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         69.05 real        39.68 user        29.15 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         68.99 real        39.98 user        28.80 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         69.02 real        39.64 user        29.16 sys
  Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
         69.38 real        37.49 user        31.67 sys
 
  testf is a numerically intensive program that tests the
  accuracy of expf() in a tight loop.  User time varies
  by ~3 seconds on my lightly loaded 2 GHz core2 duo processor.
  I'm fairly certain that the code does not suddenly grow/lose
  6 GFLOP of operations.
  
 
  The user/sys thing is a hack (and has been).  We sample the PC at stathz
  (~128 Hz) to figure out a user vs sys split and use that to divide up the
  total runtime (which actually is fairly accurate).  All you need is for the
  clock ticks to fire just a bit differently between runs to get a swing in
  user vs system time.
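
The sampling effect described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the kernel's accounting code: every statclock tick is treated as an independent coin flip that lands in user mode with a fixed probability, and the 69-second runtime and 0.57 user fraction are illustrative values loosely based on the numbers in this thread.

```python
import random

STATHZ = 128  # sampling rate mentioned in the thread (~128 Hz)


def estimate_split(total_runtime, true_user_fraction, rng, stathz=STATHZ):
    """Divide an accurately known total runtime by sampled tick counts."""
    ticks = int(total_runtime * stathz)
    user_ticks = sum(1 for _ in range(ticks)
                     if rng.random() < true_user_fraction)
    user = total_runtime * user_ticks / ticks
    return user, total_runtime - user


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the demo is repeatable
    for run in range(5):
        user, system = estimate_split(69.0, 0.57, rng)
        print(f"run {run}: {user:6.2f} user {system:6.2f} sys "
              f"{user + system:6.2f} total")
```

In this model the total stays fixed in every run while the user and system columns wander, which is the behaviour described above. The model does not attempt to reproduce the size of the swing reported in the thread.
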
 
  What I would like is to keep separate raw bintimes for user vs system time
  in the raw data instead, but that would involve checking the CPU ticker
  more often (e.g. twice for each syscall, interrupt, and trap in addition to
  the current once per context switch).  So far folks seem to be more worried
  about the extra overhead than the loss of accuracy.
 

 Adding any instruction to the global syscall path should be done with
 caution.  It is worse than before: think about a threaded application
 where a userland thread has locked a mutex and then makes a system call.
 The overhead added to the system call path now directly affects the
 application's performance, because the time window in which the mutex is
 held is longer than before.  I have seen that some people like to fiddle
 with the system call path; that should be done with caution.

OTOH, the current getrusage(2) stats cannot be trusted.  The only meaningful
thing you can do is to sum them since the total is known to be accurate at
least.

If it wouldn't make things so messy I'd consider a new kernel option
'ACCURATE_RUSAGE' or some such.
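
For userland code that wants to follow the advice above and rely only on the sum, a minimal sketch using Python's resource module (a wrapper around getrusage(2)) could look like this. busy_work is a made-up placeholder workload, and the remark about the split being an estimate reflects the discussion in this thread rather than anything the module itself guarantees.

```python
import resource


def cpu_seconds(who=resource.RUSAGE_SELF):
    """Return (user, system, total) CPU seconds from getrusage(2)."""
    usage = resource.getrusage(who)
    return usage.ru_utime, usage.ru_stime, usage.ru_utime + usage.ru_stime


def busy_work(n=300_000):
    """Burn some CPU so that the counters move."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    _, _, before = cpu_seconds()
    busy_work()
    user, system, after = cpu_seconds()
    # Per the thread, treat the user/sys split as an estimate only.
    print(f"user {user:.3f}  sys {system:.3f}")
    print(f"total CPU used by busy_work: {after - before:.3f} s")
```
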

-- 
John Baldwin


Re: Extracting tgz file: Attempt to write to an empty file

2010-12-07 Thread Tim Kientzle
On Nov 29, 2010, at 9:19 AM, Sergey Kandaurov wrote:
 I see these errors when tar (not limited to but including the version
 from FreeBSD -current)
 
 # bsdtar -xf ~/arch.tgz
 ./: Attempt to write to an empty file
 ./.cpan/: Attempt to write to an empty file
 ./.cpan/CPAN/: Attempt to write to an empty file
 ./.cpan/build/: Attempt to write to an empty file

I just committed a fix to -CURRENT (r216258).

Please try it and let me know if this fixes the problem for you.

Tim



Re: trying to use xz on manuals.

2010-12-07 Thread Norikatsu Shigemura
On Mon, 6 Dec 2010 22:50:44 -0800
Tim Kientzle t...@kientzle.com wrote:
  Some time ago I did similar tests. Changing compression for the base man
  pages to bz2 or xz doesn't make much sense.
  Oh, agreed.  The issue with small files is that they will always take up at 
  least one sector [*]; different compression routines don't gain any benefit 
  if they don't change the number of sectors needed to store the file.
  More than half of the manpages end up as 1K .gz catman files as it is; ~90% 
  are 2K or smaller.
 It might make sense if XZ decompression were significantly
 faster than GZip decompression.  (Especially since man pages
 are decompressed much more often than they are compressed.)

Oh, that's good!
But this setting breaks the pkg-plist of ports.
Some ports may follow bsd.own.mk (COMPRESS_CMD, COMPRESS_EXT),
but they assume that MANEXT is .gz :-(.

Thank you.

-- 
Norikatsu Shigemura n...@freebsd.org


Re: Lock order reversal .

2010-12-07 Thread Julian Elischer

On 12/7/10 3:41 AM, Attilio Rao wrote:

2010/12/7 Erik Cederstrand e...@cederstrand.dk:

Den 07/12/2010 kl. 10.20 skrev Garrett Cooper:


On Dec 7, 2010, at 12:26 AM, Mehmet Erol Sanliturk m.e.sanlit...@gmail.com
wrote:


A Dmesg.TXT is attached having a lock order reversal .

The mount LOR is well known.

I see that this is the standard response to lots of LOR reports. It seems to 
be one of the most-reported errors on CURRENT (and it's certainly a loud one), 
but I think a lot of people waste time researching the error and browsing 
Bjoern's LOR page, only to get the above response (not picking on you, Garrett).

Do we have the possibility of silencing well-known and presumably harmless 
LORs if there isn't sufficient motivation to fix the source?

Witness has an 'internal blessing list' we never wanted to use in
order to keep them popping up as reminder.
Actually, the fact the LOR is 'known' doesn't mean it is 'analyzed'.
The very few 'Analyzed but harmless' cases in the past have been
handled via _NOWITNESS flags I guess.


The problem is that the witness output tells you the second case (the
reversed case), but it doesn't have any clues about the first case (the one
that was the other way around).

An extended witness might use a lot of memory but associate with each lock a
'last place called when a lock was already held' that might give a clue as
to where the other instance was. I'm not volunteering to write it, but it
might be very worthwhile. I'd certainly like to hear other ideas as well.
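
As a rough illustration of the idea above, here is a toy lock-order tracker in Python. It is a hypothetical userland model, not WITNESS code: LockOrderChecker and the file:line strings are invented for the example. It remembers where each lock order was first observed, so that a later reversal can report both places.

```python
class LockOrderChecker:
    """Toy lock-order tracker; not the kernel's WITNESS implementation."""

    def __init__(self):
        self.first_seen = {}  # (held, acquired) -> place that order was first seen
        self.held = []        # locks currently held, in acquisition order

    def acquire(self, name, where):
        """Record an acquisition and return a list of reversal reports."""
        reports = []
        for held in self.held:
            if (name, held) in self.first_seen:
                # The opposite order was seen earlier: report both places.
                reports.append({
                    "locks": (held, name),
                    "reversal_at": where,
                    "first_order_at": self.first_seen[(name, held)],
                })
            else:
                self.first_seen.setdefault((held, name), where)
        self.held.append(name)
        return reports

    def release(self, name):
        self.held.remove(name)


if __name__ == "__main__":
    checker = LockOrderChecker()
    checker.acquire("A", "f.c:10")
    checker.acquire("B", "f.c:11")   # establishes the order A -> B
    checker.release("B")
    checker.release("A")
    checker.acquire("B", "g.c:20")
    for report in checker.acquire("A", "g.c:21"):  # B -> A is a reversal
        print(report)
```

The memory cost mentioned above shows up here as the first_seen table, which grows with the number of distinct lock pairs observed.
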




Attilio






Re: trying to use xz on manuals.

2010-12-07 Thread Chuck Swiger
On Dec 6, 2010, at 11:30 PM, Alex Kozlov wrote:
 On Mon, Dec 06, 2010 at 10:50:44PM -0800, Tim Kientzle wrote:
 It might make sense if XZ decompression were significantly
 faster than GZip decompression.  (Especially since man pages
 are decompressed much more often than they are compressed.)
 
 It's not.

Agreed, gzip is faster than XZ, but for manpages the difference is so small 
that a human won't notice any difference.  The slowest machine I have around is 
a Pentium III @ 933 MHz, and it shows (typical results from 5 trials, on a 
FreeBSD 7.4-PRERELEASE system):

$ time gzcat CC.1.gz > /dev/null
real    0m0.021s
user    0m0.013s
sys     0m0.007s

$ time xzcat CC.1.xz > /dev/null
real    0m0.063s
user    0m0.055s
sys     0m0.007s

Regards,
-- 
-Chuck

PS: I installed bash just to get millisecond-accuracy for the timing.  :-)  Is 
there any way to convince the default /bin/sh or /usr/bin/time to output the 
same...?


Re: Lock order reversal .

2010-12-07 Thread Matthew Fleming
On Tue, Dec 7, 2010 at 9:18 AM, Julian Elischer jul...@freebsd.org wrote:
 On 12/7/10 3:41 AM, Attilio Rao wrote:

 2010/12/7 Erik Cederstrand e...@cederstrand.dk:

 Den 07/12/2010 kl. 10.20 skrev Garrett Cooper:

 On Dec 7, 2010, at 12:26 AM, Mehmet Erol Sanliturk
 m.e.sanlit...@gmail.com wrote:

 A Dmesg.TXT is attached having a lock order reversal .

    The mount LOR is well known.

 I see that this is the standard response to lots of LOR reports. It
 seems to be one of the most-reported errors on CURRENT (and it's certainly a
 loud one), but I think a lot of people waste time researching the error and
 browsing Bjoern's LOR page, only to get the above response (not picking on
 you, Garrett).

 Do we have the possibility of silencing well-known and presumably
 harmless LORs if there isn't sufficient motivation to fix the source?

 Witness has an 'internal blessing list' we never wanted to use in
 order to keep them popping up as reminder.
 Actually, the fact the LOR is 'known' doesn't mean it is 'analyzed'.
 The very few 'Analyzed but harmless' cases in the past have been
 handled via _NOWITNESS flags I guess.

 The problem is that the witness output tells you the second case (the
 reversed case), but it doesn't have any clues about the first case (the one
 that was the other way around).

 An extended witness might use a lot of memory but associate with each lock a
 'last place called when a lock was already held' that might give a clue as
 to where the other instance was. I'm not volunteering to write it, but it
 might be very worthwhile. I'd certainly like to hear other ideas as well.

I have a small patch against stable/7 that adds a single bit to each
witness structure so that, if the normal lock order is ever
encountered after a reversal, the stack is printed.  It doesn't help
when the order is defined statically, though.

I could try to roll this up against -CURRENT this weekend.

Thanks,
matthew


Re: Process accounting/timing has broken recently

2010-12-07 Thread David Xu

John Baldwin wrote:

On Monday, December 06, 2010 7:11:28 pm David Xu wrote:

John Baldwin wrote:

On Sunday, December 05, 2010 6:18:29 pm Steve Kargl wrote:
  

Sometime in the last 7-10 days, some one made a
change that has broken process accounting/timing.

laptop:kargl[42] foreach i ( 0 1 2 3 4 5 6 7 8 9 )
foreach? time ./testf
foreach? end
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       69.55 real        38.39 user        30.94 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       68.82 real        40.95 user        27.60 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       69.14 real        38.90 user        30.02 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       68.79 real        40.59 user        27.99 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       68.93 real        39.76 user        28.96 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       68.71 real        41.21 user        27.29 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       69.05 real        39.68 user        29.15 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       68.99 real        39.98 user        28.80 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       69.02 real        39.64 user        29.16 sys
Max ULP: 0.501607 for x in [-18.00:88.70] with dx = 1.067100e-04
       69.38 real        37.49 user        31.67 sys

testf is a numerically intensive program that tests the
accuracy of expf() in a tight loop.  User time varies
by ~3 seconds on my lightly loaded 2 GHz core2 duo processor.
I'm fairly certain that the code does not suddenly grow/lose
6 GFLOP of operations.

The user/sys thing is a hack (and has been).  We sample the PC at stathz
(~128 Hz) to figure out a user vs sys split and use that to divide up the
total runtime (which actually is fairly accurate).  All you need is for the
clock ticks to fire just a bit differently between runs to get a swing in
user vs system time.


What I would like is to keep separate raw bintimes for user vs system time
in the raw data instead, but that would involve checking the CPU ticker more
often (e.g. twice for each syscall, interrupt, and trap in addition to the
current once per context switch).  So far folks seem to be more worried
about the extra overhead than the loss of accuracy.


  

Adding any instruction to the global syscall path should be done with
caution.  It is worse than before: think about a threaded application
where a userland thread has locked a mutex and then makes a system call.
The overhead added to the system call path now directly affects the
application's performance, because the time window in which the mutex is
held is longer than before.  I have seen that some people like to fiddle
with the system call path; that should be done with caution.


OTOH, the current getrusage(2) stats cannot be trusted.  The only meaningful
thing you can do is to sum them since the total is known to be accurate at
least.

If it wouldn't make things so messy I'd consider a new kernel option
'ACCURATE_RUSAGE' or some such.


Our getrusage is already very slow: every time, it needs to iterate over
the thread list with the process SLOCK held.  I saw some MySQL versions
that use getrusage heavily, and they are horribly slow.  I think an
ACCURATE_RUSAGE option would make it worse?


