Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-12-01 Thread Eirik Øverby

On Dec 1, 2005, at 04:12 , Michael Vince wrote:

Some apps that make frequent queries of the system time, for example
MySQL, are well known to be slower on FreeBSD than on Linux because
the call is more expensive there. Maybe Tomcat is another such app;
this can be doubly the case depending on your JSP and servlet code.


True, but on equal hardware it should perform equally.

If you are on good hardware, are using 6, and keep your system's time
updated via ntp, you might want to try changing
kern.timecounter.hardware from ACPI-fast to TSC (listed as TSC(-100)
in kern.timecounter.choice) and running a benchmark. This has already
been shown to increase MySQL performance by a significant amount.


I will try this, though it will not solve my original problem (and  
the subject is somewhat misleading now, as this seems to be  
independent of kernel revisions).


Also, some new experimental low-precision time code has been added to
the current source tree to see how much performance can be gained.
Weirdly enough, some people have argued against it, I guess for a
wide range of reasons: they just have crap hardware and don't care
about performance, they don't like the extra code to maintain, or,
like Red Hat fanatics, they just like having an easy way to bad-mouth
FreeBSD performance. I think most people would agree, though, that it
has to be done, or else they have to choose to believe FreeBSD isn't
about performance among other goals.


I will not join this discussion ;)

With 6 you can also use the new thr threading library. Try mapping
libpthread to libthr in your libmap.conf for testing, for example:

[/usr/local/jdk1.4.2/]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

I have been doing some 'ab' testing of libthr with Apache2 compiled
for the worker MPM and have seen some really interesting differences
in server load: loads of about 40 with pthread and around 5 with thr
under certain ab tests, with the exact same test.


Too bad this causes jdk1.5.0-amd64 to crash...
Application startup times were significantly reduced, but only the
times it actually managed to start without failing. By the 2nd or 3rd
transaction at the latest, Java dumps core. :(


And as current load testing is done without Apache in between, this  
is moot..


/Eirik




Mike


Eirik Øverby wrote:

Update: The diff below was made after making sure both systems  
are  running the exact same kernel. Behavior is the same. Building  
new  kernels (6-STABLE) now to get out of the BETA stage.


/Eirik

On Nov 28, 2005, at 22:53 , Eirik Øverby wrote:


Firmware versions are equal. BIOS settings are equal.
However, a diff of the dmesgs shows (apart from MAC address
differences):

30c30
< Timecounter ACPI-safe frequency 3579545 Hz quality 1000
---
> Timecounter ACPI-fast frequency 3579545 Hz quality 1000

What on earth is that all about? The slow box has the ACPI-fast
timecounter...


/Eirik

On Nov 28, 2005, at 22:14 , Kris Kennaway wrote:


On Mon, Nov 28, 2005 at 09:54:30PM +0100, Eirik Øverby wrote:


Hi,

I think I have found the culprit. There must be some sort of
difference between the machines after all (BIOS revision?), because
while on one machine the interrupt rate for the bge card stays very
low (2 to be exact) during maximum load, the other machine goes
beyond 1000 and keeps rising constantly. This might also explain why
performance slowly degrades over time on that machine, and response
times vary wildly, while the fast machine responds nicely within
1-2 seconds no matter the load and testing time.

I will have to investigate this more closely. Is there a way to force
the NIC to polling mode (I'm assuming that is the difference, an IRQ
rate of 2 is too low for a heavily loaded server if the NIC is
interrupt-driven)?

Anything else I could look at?



BIOS update.

Kris











___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-30 Thread Chris

Clearly they're not 100% equal, but (100-epsilon)%.  Your job is to
identify the origin of the epsilon :-)



Yea yea ;) Working on it..
Is there a way to force ACPI-safe on the slower system?


I'm upgrading BIOSes on both boxes now, even though they seem equal.  
Then I'll see what ACPI debug output shows me. If you have any other  
hints or ideas, please let me know...  thanks so far.


I missed the beginning of this thread, so sorry if this is an
impractical or stupid suggestion, but could you swap hard disks
between the machines? At least that might tell you whether it is a
hardware/BIOS problem or an operating system problem.


Chris


Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-30 Thread Michael Vince
Some apps that make frequent queries of the system time, for example
MySQL, are well known to be slower on FreeBSD than on Linux because
the call is more expensive there. Maybe Tomcat is another such app;
this can be doubly the case depending on your JSP and servlet code.

If you are on good hardware, are using 6, and keep your system's time
updated via ntp, you might want to try changing
kern.timecounter.hardware from ACPI-fast to TSC (listed as TSC(-100)
in kern.timecounter.choice) and running a benchmark. This has already
been shown to increase MySQL performance by a significant amount.

Also, some new experimental low-precision time code has been added to
the current source tree to see how much performance can be gained.
Weirdly enough, some people have argued against it, I guess for a
wide range of reasons: they just have crap hardware and don't care
about performance, they don't like the extra code to maintain, or,
like Red Hat fanatics, they just like having an easy way to bad-mouth
FreeBSD performance. I think most people would agree, though, that it
has to be done, or else they have to choose to believe FreeBSD isn't
about performance among other goals.


With 6 you can also use the new thr threading library. Try mapping
libpthread to libthr in your libmap.conf for testing, for example:

[/usr/local/jdk1.4.2/]
libpthread.so.2 libthr.so.2
libpthread.so   libthr.so

I have been doing some 'ab' testing of libthr with Apache2 compiled
for the worker MPM and have seen some really interesting differences
in server load: loads of about 40 with pthread and around 5 with thr
under certain ab tests, with the exact same test.


Mike


Eirik Øverby wrote:

Update: The diff below was made after making sure both systems are  
running the exact same kernel. Behavior is the same. Building new  
kernels (6-STABLE) now to get out of the BETA stage.


/Eirik

On Nov 28, 2005, at 22:53 , Eirik Øverby wrote:


Firmware versions are equal. BIOS settings are equal.
However, a diff of the dmesgs shows (apart from MAC address
differences):

30c30
< Timecounter ACPI-safe frequency 3579545 Hz quality 1000
---
> Timecounter ACPI-fast frequency 3579545 Hz quality 1000

What on earth is that all about? The slow box has the ACPI-fast
timecounter...


/Eirik

On Nov 28, 2005, at 22:14 , Kris Kennaway wrote:


On Mon, Nov 28, 2005 at 09:54:30PM +0100, Eirik Øverby wrote:


Hi,

I think I have found the culprit. There must be some sort of
difference between the machines after all (BIOS revision?), because
while on one machine the interrupt rate for the bge card stays very
low (2 to be exact) during maximum load, the other machine goes
beyond 1000 and keeps rising constantly. This might also explain why
performance slowly degrades over time on that machine, and response
times vary wildly, while the fast machine responds nicely within
1-2 seconds no matter the load and testing time.

I will have to investigate this more closely. Is there a way to  force
the NIC to polling mode (I'm assuming that is the difference, an IRQ
rate of 2 is too low for a heavily loaded server if the NIC is
interrupt-driven)?

Anything else I could look at?



BIOS update.

Kris









Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread Eirik Oeverby



On Mon, 28 Nov 2005, Kris Kennaway wrote:


On Mon, Nov 28, 2005 at 10:53:00PM +0100, Eirik Øverby wrote:

Firmware versions are equal. BIOS settings are equal.
However, a diff of the dmesgs shows (apart from MAC address differences):

30c30
< Timecounter ACPI-safe frequency 3579545 Hz quality 1000
---
> Timecounter ACPI-fast frequency 3579545 Hz quality 1000


What on earth is that all about? The slow box has the ACPI-fast
timecounter...


Could be ACPI bugs on your system:


Yes, but the other system is 100% equal - hardware, bios config, bios and 
bootblock revision, controller bioses, etc. etc.

It all matches.

Should I complain to HP?

/Eirik




BIOS update.


Kris




Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread Kris Kennaway
On Tue, Nov 29, 2005 at 09:46:09AM +0100, Eirik Oeverby wrote:
 
 
 On Mon, 28 Nov 2005, Kris Kennaway wrote:
 
 On Mon, Nov 28, 2005 at 10:53:00PM +0100, Eirik ?verby wrote:
 Firmware versions are equal. BIOS settings are equal.
 However, a diff of the dmesgs show (apart from MAC address differences):
 
 30c30
  Timecounter ACPI-safe frequency 3579545 Hz quality 1000
 ---
 Timecounter ACPI-fast frequency 3579545 Hz quality 1000
 
 What on earth is that all about? The slow box has the ACPI-fast
 timecounter...
 
 Could be ACPI bugs on your system:
 
 Yes, but the other system is 100% equal - hardware, bios config, bios and 
 bootblock revision, controller bioses, etc. etc.
 It all matches.

Clearly they're not 100% equal, but (100-epsilon)%.  Your job is to
identify the origin of the epsilon :-)

 Should I complain to HP?

If you think you'll get anywhere, it might be worth pursuing.

Kris




Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread Eirik Øverby


On Nov 29, 2005, at 10:15 , Kris Kennaway wrote:


On Tue, Nov 29, 2005 at 09:46:09AM +0100, Eirik Oeverby wrote:



On Mon, 28 Nov 2005, Kris Kennaway wrote:


On Mon, Nov 28, 2005 at 10:53:00PM +0100, Eirik Øverby wrote:

Firmware versions are equal. BIOS settings are equal.
However, a diff of the dmesgs shows (apart from MAC address
differences):

30c30
< Timecounter ACPI-safe frequency 3579545 Hz quality 1000
---
> Timecounter ACPI-fast frequency 3579545 Hz quality 1000


What on earth is that all about? The slow box has the ACPI-fast
timecounter...


Could be ACPI bugs on your system:


Yes, but the other system is 100% equal - hardware, bios config,
bios and bootblock revision, controller bioses, etc. etc.
It all matches.


Clearly they're not 100% equal, but (100-epsilon)%.  Your job is to
identify the origin of the epsilon :-)


Yea yea ;) Working on it..
Is there a way to force ACPI-safe on the slower system?

/Eirik




Should I complain to HP?


If you think you'll get anywhere, it might be worth pursuing.

Kris




Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread Joseph Koshy
EØ Yea yea ;) Working on it..
EØ Is there a way to force ACPI-safe on the slower system?

# sysctl kern.timecounter.hardware=one of the values from
kern.timecounter.choice

--
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy


Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread Eirik Øverby

On Nov 29, 2005, at 10:44 , Joseph Koshy wrote:


EØ Yea yea ;) Working on it..
EØ Is there a way to force ACPI-safe on the slower system?

# sysctl kern.timecounter.hardware=one of the values from
kern.timecounter.choice


kern.timecounter.choice: TSC(-100) ACPI-fast(1000) i8254(0) dummy(-100)


ACPI-safe is not among the choices. Which means I can't choose it, I  
presume.
I'm compiling up new kernels with ACPI_DEBUG right now, once they are  
installed, what can I do to determine differences in DSDT tables  
etc.? Or whatever else is different?


/Eirik
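
The choice string quoted in this message packs each timecounter name together with its quality in parentheses. A minimal Python sketch of reading it, assuming the text comes from `sysctl -n kern.timecounter.choice`; the helper names `parse_choices` and `can_select` are made up for illustration and are not part of any FreeBSD tool:

```python
import re

def parse_choices(choice: str) -> dict:
    """Map timecounter name -> quality from a kern.timecounter.choice string."""
    # Each entry looks like NAME(QUALITY); tolerate whitespace before "(",
    # since mail clients sometimes wrap the line there.
    return {name: int(q) for name, q in re.findall(r"(\S+?)\s*\((-?\d+)\)", choice)}

def can_select(choice: str, wanted: str) -> bool:
    """True if `wanted` is one of the names the kernel offers."""
    return wanted in parse_choices(choice)

# String copied from the message above.
choice = "TSC(-100) ACPI-fast(1000) i8254(0) dummy(-100)"
counters = parse_choices(choice)
best = max(counters, key=counters.get)  # highest quality wins
print(counters)
print("highest quality:", best)
print("ACPI-safe selectable:", can_select(choice, "ACPI-safe"))
```

Only names that appear in this list are accepted by kern.timecounter.hardware, which is why ACPI-safe cannot simply be picked here.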



--
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy






Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread Kris Kennaway
On Tue, Nov 29, 2005 at 10:25:07AM +0100, Eirik ?verby wrote:
 
 On Nov 29, 2005, at 10:15 , Kris Kennaway wrote:
 
 On Tue, Nov 29, 2005 at 09:46:09AM +0100, Eirik Oeverby wrote:
 
 
 On Mon, 28 Nov 2005, Kris Kennaway wrote:
 
 On Mon, Nov 28, 2005 at 10:53:00PM +0100, Eirik ?verby wrote:
 Firmware versions are equal. BIOS settings are equal.
 However, a diff of the dmesgs show (apart from MAC address  
 differences):
 
 30c30
  Timecounter ACPI-safe frequency 3579545 Hz quality 1000
 ---
 Timecounter ACPI-fast frequency 3579545 Hz quality 1000
 
 What on earth is that all about? The slow box has the ACPI-fast
 timecounter...
 
 Could be ACPI bugs on your system:
 
 Yes, but the other system is 100% equal - hardware, bios config,  
 bios and
 bootblock revision, controller bioses, etc. etc.
 It all matches.
 
 Clearly they're not 100% equal, but (100-epsilon)%.  Your job is to
 identify the origin of the epsilon :-)
 
 Yea yea ;) Working on it..
 Is there a way to force ACPI-safe on the slower system?

I think someone already mentioned this..see the
kern.timecounter.hardware and other kern.timecounter sysctls.

Kris




Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread Kris Kennaway
On Tue, Nov 29, 2005 at 10:48:33AM +0100, Eirik Øverby wrote:
 On Nov 29, 2005, at 10:44 , Joseph Koshy wrote:
 
 EØ Yea yea ;) Working on it..
 EØ Is there a way to force ACPI-safe on the slower system?
 
 # sysctl kern.timecounter.hardware=one of the values from
 kern.timecounter.choice
 
 kern.timecounter.choice: TSC(-100) ACPI-fast(1000) i8254(0) dummy 
 (-100)
 
 ACPI-safe is not among the choices. Which means I can't choose it, I  
 presume.
 I'm compiling up new kernels with ACPI_DEBUG right now, once they are  
 installed, what can I do to determine differences in DSDT tables  
 etc.? Or whatever else is different?

There is documentation somewhere on www.freebsd.org about how to
obtain necessary ACPI debugging information...sorry, I don't remember
specifically where, maybe in the handbook or elsewhere in the docs
section.

Kris





Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread Eirik Øverby


On Nov 29, 2005, at 11:37 , Kris Kennaway wrote:


On Tue, Nov 29, 2005 at 10:25:07AM +0100, Eirik ?verby wrote:


On Nov 29, 2005, at 10:15 , Kris Kennaway wrote:


On Tue, Nov 29, 2005 at 09:46:09AM +0100, Eirik Oeverby wrote:



On Mon, 28 Nov 2005, Kris Kennaway wrote:


On Mon, Nov 28, 2005 at 10:53:00PM +0100, Eirik Øverby wrote:

Firmware versions are equal. BIOS settings are equal.
However, a diff of the dmesgs shows (apart from MAC address
differences):

30c30
< Timecounter ACPI-safe frequency 3579545 Hz quality 1000
---
> Timecounter ACPI-fast frequency 3579545 Hz quality 1000


What on earth is that all about? The slow box has the ACPI-fast
timecounter...


Could be ACPI bugs on your system:


Yes, but the other system is 100% equal - hardware, bios config,
bios and
bootblock revision, controller bioses, etc. etc.
It all matches.


Clearly they're not 100% equal, but (100-epsilon)%.  Your job is to
identify the origin of the epsilon :-)


Yea yea ;) Working on it..
Is there a way to force ACPI-safe on the slower system?


I think someone already mentioned this..see the
kern.timecounter.hardware and other kern.timecounter sysctls.


I have now forced ACPI-safe on the slow system, to match the fast one.
Too bad though, it made absolutely zero difference.

I'm upgrading BIOSes on both boxes now, even though they seem equal.  
Then I'll see what ACPI debug output shows me. If you have any other  
hints or ideas, please let me know...  thanks so far.


/Eirik



Kris




Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-29 Thread pete wright
On 11/29/05, Eirik Øverby [EMAIL PROTECTED] wrote:

 On Nov 29, 2005, at 11:37 , Kris Kennaway wrote:

  On Tue, Nov 29, 2005 at 10:25:07AM +0100, Eirik ?verby wrote:
 
  On Nov 29, 2005, at 10:15 , Kris Kennaway wrote:
 
  On Tue, Nov 29, 2005 at 09:46:09AM +0100, Eirik Oeverby wrote:
 
 
  On Mon, 28 Nov 2005, Kris Kennaway wrote:
 
  On Mon, Nov 28, 2005 at 10:53:00PM +0100, Eirik ?verby wrote:
  Firmware versions are equal. BIOS settings are equal.
  However, a diff of the dmesgs show (apart from MAC address
  differences):
 
  30c30
   Timecounter ACPI-safe frequency 3579545 Hz quality 1000
  ---
  Timecounter ACPI-fast frequency 3579545 Hz quality 1000
 
  What on earth is that all about? The slow box has the ACPI-fast
  timecounter...
 
  Could be ACPI bugs on your system:
 
  Yes, but the other system is 100% equal - hardware, bios config,
  bios and
  bootblock revision, controller bioses, etc. etc.
  It all matches.
 
  Clearly they're not 100% equal, but (100-epsilon)%.  Your job is to
  identify the origin of the epsilon :-)
 
  Yea yea ;) Working on it..
  Is there a way to force ACPI-safe on the slower system?
 
  I think someone already mentioned this..see the
  kern.timecounter.hardware and other kern.timecounter sysctls.

 I have now forced ACPI-safe on the slow system, to match the fast one.
 Too bad though, it made absolutely zero difference.

 I'm upgrading BIOSes on both boxes now, even though they seem equal.
 Then I'll see what ACPI debug output shows me. If you have any other
 hints or ideas, please let me know...  thanks so far.

 /Eirik

Have you tried turning off ACPI at boot time? Is there an option to
turn it off in the BIOS? This is an HP box, correct? I have had some
fun in the past chasing down hard-to-reproduce ACPI problems on HP
hardware; after much software troubleshooting I realized that turning
some knobs in the BIOS got the machines to a stable state (in my case
I turned off USB auto-detection).

HTH
-pete



 
  Kris




--
~~o0OO0o~~
Pete Wright
www.nycbug.org
NYC's *BSD User Group


Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Joseph Koshy
On 11/26/05, Eirik Øverby [EMAIL PROTECTED] wrote:
EØ [Cross-posting after lack of response on -stable]

The first step would be to do some performance debugging.

 - What do top/vmstat/systat say about what the OS and
   apps are doing?  Is the CPU pegged at 100%?  What's
   the load seen by the disks?  Is the RAID in good health?
 - Any unusual messages in /var/log/messages?  Any errors
   shown by the network interfaces (I'm assuming the
   application is using the network).
 - A brief description of the workload presented by
   the app would help.

--
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy


Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Eirik Øverby

On Nov 28, 2005, at 14:45 , Joseph Koshy wrote:


On 11/26/05, Eirik Øverby [EMAIL PROTECTED] wrote:
EØ [Cross-posting after lack of response on -stable]

The first step would be to do some performance debugging.


Yep.


 - What do top/vmstat/systat say about what the OS and
   apps are doing?  Is the CPU pegged at 100%?  What's
   the load seen by the disks?  Is the RAID in good health?


vmstat output from system idle times is found below; I think it is
rather interesting. To your other questions: the CPU usage is
comparable on both systems. Not pegged at 100%, but load seems to
stabilize around 0.5. Disk load is minimal on the application
servers, somewhat more on the database servers, but those are not
interesting here (they are not the bottleneck, and they perform
equally). The RAIDs are in good health on both systems.


The vmstat output is interesting.
From the fast system (6.0-BETA3, ~idle):
[EMAIL PROTECTED] ~# vmstat -w 5
procs     memory     page                        disks   faults              cpu
r b w     avm    fre  flt  re  pi  po   fr    sr da0 pa0    in     sy     cs  us  sy   id
1 0 0 2439220  38048   14   0   0   0   14     0   0   0   170    141    437   0   0  100
0 0 0 2439220  38028    2   0   0   0    3     0   2   0   192     94    475   0   0  100
0 0 0 2439220  37916    1   0   0   0    6     0   1   0   291    925    926   5   0   94
0 0 0 2439220  37916    0   0   0   0    0     0   0   0   185     91    458   0   0  100
0 0 0 2439220  37820    1   0   0   0    6     0   3   0   289   1163   1124   6   0   94
0 0 0 2439220  37820    0   0   0   0    0     0   0   0   183     91    454   0   0  100


From the slow system (6.0-BETA3, ~idle):
[EMAIL PROTECTED] ~# vmstat -w 5
procs     memory     page                        disks   faults              cpu
r b w     avm    fre  flt  re  pi  po   fr    sr da0 pa0    in     sy     cs  us  sy   id
0 0 1 2468180  51660   15   0   0   0   18     4   0   0  1048   3200   5130   0   0  100
0 0 0 2468180  51660    1   0   0   0    0     0   0   0  1004   3068   5063   0   0  100
0 0 0 2468180  51660    0   0   0   0    0     0   0   0  1003   3094   5057   0   0  100
0 0 0 2468180  51660    0   0   0   0    0     0   1   0  1005   3068   5065   0   0  100
0 0 0 2468180  51656    1   0   0   0    0     0   0   0  1002   3090   5054   0   1   99
0 0 0 2468180  51656    0   0   0   0    0     0   0   0  1002   3064   5053   0   0  100


*loads* more context switches than on the BETA-3 system. I have not  
yet tried this during load; I have to wait for the testing window for  
that. But perhaps this helps? What do I look for next?
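
To put a number on the idle difference, here is a small Python sketch that averages the `in`, `sy` and `cs` columns of the two idle listings. The values are copied from the vmstat output quoted in this message; the helper name `column_means` is only illustrative.

```python
from statistics import mean

# (in, sy, cs) for each vmstat line, copied from the idle output above.
fast_idle = [(170, 141, 437), (192, 94, 475), (291, 925, 926),
             (185, 91, 458), (289, 1163, 1124), (183, 91, 454)]
slow_idle = [(1048, 3200, 5130), (1004, 3068, 5063), (1003, 3094, 5057),
             (1005, 3068, 5065), (1002, 3090, 5054), (1002, 3064, 5053)]

def column_means(samples):
    """Return the mean of each column (in, sy, cs) across all samples."""
    return tuple(mean(col) for col in zip(*samples))

fast = column_means(fast_idle)
slow = column_means(slow_idle)
for name, f, s in zip(("in", "sy", "cs"), fast, slow):
    print(f"{name}: fast {f:.0f}, slow {s:.0f}, ratio {s / f:.1f}x")
```

With these samples the slow box shows several times more interrupts, system calls and context switches than the fast one while both are nominally idle.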



 - Any unusual messages in /var/log/messages?  Any errors
   shown by the network interfaces (I'm assuming the
   application is using the network).


No errors shown that I can determine.


 - A brief description of the workload presented by
   the app would help.


This is a web application (payment gateway) that receives an HTTP
POST, does some processing, asks an external service for a piece of
information, then returns the gathered information to the client. The
call to the external service can be eliminated, but doing so does not
change the performance profile.
How the application works internally is impossible for me to say;
it's 3rd party. I can say, after asking them, that it is "moderately
threaded", whatever that means. My interpretation is that the
heaviest threading happens in Tomcat itself, with up to 150
concurrent connection threads running.


Thanks,
/Eirik



--
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy







Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Eirik Øverby


On Nov 28, 2005, at 15:54 , Joseph Koshy wrote:


EØ *loads* more context switches than on the BETA-3 system.
EØ I have not yet tried this during load

 - Which scheduler have you configured (BSD or ULE)?


Running GENERIC/SMP kernels, with BSD scheduler.
Speaking of which; is there a way to extract the kernel configuration  
from a running kernel or kernel binary?



 - What do the interrupt statistics show?  Any interrupt
   storms?  Please check the mailing lists for a prior
   discussion on interrupt storms on some motherboards.


Slow system:
interrupt            total   rate
irq1: atkbd0             4      0
irq14: ata0             46      0
irq24: ciss0        337166      1
irq28: bge0        8038794     35
cpu0: timer      446869052   1999
cpu1: timer      446861051   1999
Total            902106113   4037

Fast system:
interrupt            total   rate
irq1: atkbd0             6      0
irq14: ata0             46      0
irq24: ciss0       7465831      1
irq28: bge0       20764380      2
lapic0: timer  14827978729   2000
lapic1: timer  14827970729   2000
Total          29684179721   4003

No significant differences I'd say. Anything else I can do to dig  
deeper?
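
Read side by side, the two tables do differ in one row. A hedged Python sketch that parses `vmstat -i`-style rows (name, total, rate) and lists the sources whose rates differ between the two boxes; `parse_intr` and `rate_diffs` are hypothetical helper names, and the rows are retyped from the output above with the columns separated. The timer rows are left out because they are named differently on the two systems.

```python
def parse_intr(text):
    """Parse 'name  total  rate' rows into {name: (total, rate)}."""
    rows = {}
    for line in text.strip().splitlines():
        # The name itself contains a space ("irq28: bge0"), so split from the right.
        name, total, rate = line.rsplit(None, 2)
        rows[name] = (int(total), int(rate))
    return rows

def rate_diffs(a, b):
    """Yield (name, rate_a, rate_b) for sources present in both with different rates."""
    for name in a:
        if name in b and a[name][1] != b[name][1]:
            yield name, a[name][1], b[name][1]

slow = parse_intr("""
irq1: atkbd0 4 0
irq14: ata0 46 0
irq24: ciss0 337166 1
irq28: bge0 8038794 35
""")
fast = parse_intr("""
irq1: atkbd0 6 0
irq14: ata0 46 0
irq24: ciss0 7465831 1
irq28: bge0 20764380 2
""")
for name, s, f in rate_diffs(slow, fast):
    print(f"{name}: slow {s}/s, fast {f}/s")
```

The rate column of `vmstat -i` is an average since boot, so a gap like the one on bge0 is easier to judge from samples taken while the load test is running.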



 - Could you post the dmesg output from the systems (I
   presume there aren't any significant differences).


dmesg from slow system follows. I do not have a dmesg for the fast  
system; I cannot boot it now either. However, I have compared them  
before, and they are 100% equal. Seems to be very close in serial  
numbers, probably same production run.


Copyright (c) 1992-2005 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights  
reserved.

FreeBSD 6.0-STABLE #0: Sat Nov 26 01:52:00 CET 2005
[EMAIL PROTECTED]:/usr/obj/amd64/usr/src/sys/SMP
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: AMD Opteron(tm) Processor 250 (2405.47-MHz K8-class CPU)
  Origin = AuthenticAMD  Id = 0x20f51  Stepping = 1
   
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,b25,LM,3DNow+,3DNow>
real memory  = 1073717248 (1023 MB)
avail memory = 1024946176 (977 MB)
ACPI APIC Table: HP 0083
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
cpu0 (BSP): APIC ID:  0
cpu1 (AP): APIC ID:  1
MADT: Forcing active-low polarity and level trigger for SCI
ioapic0 Version 1.1 irqs 0-23 on motherboard
ioapic1 Version 1.1 irqs 24-27 on motherboard
ioapic2 Version 1.1 irqs 28-31 on motherboard
ioapic3 Version 1.1 irqs 32-35 on motherboard
ioapic4 Version 1.1 irqs 36-39 on motherboard
acpi0: HP A05 on motherboard
acpi0: Power Button (fixed)
pci_link0: ACPI PCI Link LNKA irq 5 on acpi0
pci_link1: ACPI PCI Link LNKB irq 7 on acpi0
pci_link2: ACPI PCI Link LNKC irq 0 on acpi0
pci_link3: ACPI PCI Link LNKD irq 3 on acpi0
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 32-bit timer at 3.579545MHz port 0x908-0x90b on acpi0
cpu0: ACPI CPU on acpi0
cpu1: ACPI CPU on acpi0
pcib0: ACPI Host-PCI bridge on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge at device 3.0 on pci0
pci1: ACPI PCI bus on pcib1
ohci0: OHCI (generic) USB controller mem 0xf7df-0xf7df0fff irq  
19 at device 0.0 on pci1

ohci0: [GIANT-LOCKED]
usb0: OHCI version 1.0, legacy support
usb0: SMM does not respond, resetting
usb0: OHCI (generic) USB controller on ohci0
usb0: USB revision 1.0
uhub0: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 3 ports with 3 removable, self powered
ohci1: OHCI (generic) USB controller mem 0xf7de-0xf7de0fff irq  
19 at device 0.1 on pci1

ohci1: [GIANT-LOCKED]
usb1: OHCI version 1.0, legacy support
usb1: SMM does not respond, resetting
usb1: OHCI (generic) USB controller on ohci1
usb1: USB revision 1.0
uhub1: AMD OHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 3 ports with 3 removable, self powered
pci1: base peripheral at device 2.0 (no driver attached)
pci1: base peripheral at device 2.2 (no driver attached)
pci1: display, VGA at device 3.0 (no driver attached)
isab0: PCI-ISA bridge at device 4.0 on pci0
isa0: ISA bus on isab0
atapci0: AMD 8111 UDMA133 controller port  
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x2000-0x200f at device 4.1 on pci0

ata0: ATA channel 0 on atapci0
ata1: ATA channel 1 on atapci0
pci0: bridge at device 4.3 (no driver attached)
pcib2: ACPI PCI-PCI bridge at device 7.0 on pci0
pci2: ACPI PCI bus on pcib2
ciss0: HP Smart Array 6i port 0x5000-0x50ff mem  
0xf7ef-0xf7ef1fff,0xf7e8-0xf7eb irq 24 at device 4.0 on pci2

ciss0: [GIANT-LOCKED]
pci0: base peripheral, 

Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Eirik Øverby

Follow-up:
I've now run vmstat during load, which confirms the findings of
vmstat during idle time.


Slow system - one sample before and after load start included:
procs     memory     page                        disks   faults              cpu
r b w     avm    fre  flt  re  pi  po   fr    sr da0 pa0    in     sy     cs  us  sy   id
3 0 0 2468572  45476   14   0   0   0   18     4   0   0  1049   3201   5132   0   0  100
0 0 1 2468572  42388    1   0   0   0  154     0   5   0  6852  19813  19970  22   8   70
1 0 0 2468572  39332    1   0   0   0  155     0  11   0  6823  19661  19886  23   7   71
2 0 0 2468432  36336    1   0   0   0  160     0   6   0  7031  20356  20534  19   7   74
0 0 0 2468432  33228    1   0   0   0  156     0   5   0  6685  19420  19613  20   7   73
2 0 0 2468432  29928    1   0   0   0  164     0   5   0  7105  20483  20673  21   7   71
1 0 0 2468432  53568    1   0   0   0  153  1308   5   0  6688  19278  19537  21   8   72
1 0 1 2468432  50580    2   0   0   0  150     0   6   0  6408  18430  18693  24   7   69
0 0 0 2468432  47748    2   0   0   0  143     0   6   0  6323  18098  18328  26   7   67
0 0 0 2468432  45056    1   0   0   0  136     0   5   0  5607  17122  17062  16   7   77
0 0 0 2468432  45040    0   0   0   0    0     0   0   0  1093   3172   5164   0   0  100


Fast system:
procs     memory     page                        disks   faults              cpu
r b w     avm    fre  flt  re  pi  po   fr    sr da0 pa0    in     sy     cs  us  sy   id
0 0 0 2439276  39708    1   0   0   0    6     0   1   0   281   1029    992   6   1   93
0 0 0 2439276  39380    7   0   0   0   16     0   1   0   665   1341   1714   2   1   98
0 0 0 2439276  36472    5   0   0   0  145     0   6   0  5569  12409  14821  21   7   72
0 0 0 2439276  33512    1   0   0   0  149     0   5   0  5862  12597  15532  15   6   79
0 0 0 2439276  30600    1   0   0   0  146     0   4   0  5682  12655  15102  19   7   74
2 0 0 2439276  54144    1   0   0   5  152  1310  10   0  6006  12908  15964  17   6   77
0 0 0 2439276  51176    2   0   0   0  151     0   7   0  5348  11899  14190  22   6   72
2 0 0 2439276  48104   98   0   0   0  248     0   5   0  5924  12889  15757  15   7   78
1 0 0 2439276  45172    1   0   0   0  147     0   5   0  5882  12660  15624  16   7   77
2 0 0 2439276  42276    1   0   0   0  145     0   5   0  5558  12477  14864  21   6   73
0 0 0 2439276  39300    1   0   0   0  149     0   5   0  5842  12660  15556  14   7   79
0 0 0 2439276  36348    1   0   0   0  150     0   8   0  5659  12562  15042  21   5   74
0 0 0 2439276  33404    1   0   0   0  150     0   7   0  5868  12642  15536  14   6   80
0 0 0 2439276  30588    1   0   0   0  142     0   6   0  5449  11961  14487  19   7   74
0 0 0 2439276  30588    0   0   0   0    0     0   0   0   227    246    565   0   0  100


I'm tempted to upgrade the fast system to 6-STABLE (same rev as the
slow one). Even the slow system performs adequately, but the upgrade
might help me isolate any potential hardware differences.


/Eirik

On Nov 28, 2005, at 15:54 , Joseph Koshy wrote:


EØ *loads* more context switches than on the BETA-3 system.
EØ I have not yet tried this during load

 - Which scheduler have you configured (BSD or ULE)?
 - What do the interrupt statistics show?  Any interrupt
   storms?  Please check the mailing lists for a prior
   discussion on interrupt storms on some motherboards.
 - Could you post the dmesg output from the systems (I
   presume there aren't any significant differences).

Please CC -stable too.

--
FreeBSD Volunteer, http://people.freebsd.org/~jkoshy






Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Eirik Øverby

Hi,

I think I have found the culprit. There must be some sort of  
difference between the machines after all (BIOS revision?), because  
while on one machine the interrupt rate for the bge card stays very  
low (2 to be exact) during maximum load, the other machine goes  
beyond 1000 and keeps rising constantly. This might also explain why  
performance slowly degrades over time on that machine, and response  
times vary wildly, while the fast machine responds nicely within  
1-2 seconds no matter the load and testing time.


I will have to investigate this more closely. Is there a way to force  
the NIC to polling mode (I'm assuming that is the difference, an IRQ  
rate of 2 is too low for a heavily loaded server if the NIC is  
interrupt-driven)?


Anything else I could look at?

Also, the interrupt rates for the CPUs stay at 2000 sharp on the fast
system, but fluctuate somewhat on the other.
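
For comparing the two boxes side by side, the per-device rates can be
pulled out of `vmstat -i`. This is only a minimal sketch: it assumes the
usual layout in which the rate is the last field on each line, and the
helper name irq_rate is made up for illustration.

```shell
# irq_rate DEVICE: read `vmstat -i` output on stdin and print the rate
# column (assumed to be the last field) of every line mentioning DEVICE.
irq_rate() {
    awk -v dev="$1" 'index($0, dev) { print $NF }'
}

# On each box, while the load test is running:
#   vmstat -i | irq_rate bge0
#   vmstat -i | irq_rate timer    # name as it appears in the first column
```

Sampling this every few seconds on both machines would show whether the
bge rate really keeps climbing on the slow one.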


/Eirik



Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Kris Kennaway
On Mon, Nov 28, 2005 at 09:54:30PM +0100, Eirik Øverby wrote:
 Hi,
 
 I think I have found the culprit. There must be some sort of  
 difference between the machines after all (BIOS revision?), because  
 while on one machine the interrupt rate for the bge card stays very  
 low (2 to be exact) during maximum load, the other machine goes  
 beyond 1000 and keeps rising constantly. This might also explain why  
 performance slowly degrades over time on that machine, and response  
 times vary wildly, while the fast machine responds nicely within  
 1-2 seconds no matter the load and testing time.
 
 I will have to investigate this more closely. Is there a way to force  
 the NIC to polling mode (I'm assuming that is the difference, an IRQ  
 rate of 2 is too low for a heavily loaded server if the NIC is  
 interrupt-driven)?
 
 Anything else I could look at?

BIOS update.

Kris




Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Eirik Øverby

Firmware versions are equal. BIOS settings are equal.
However, a diff of the dmesgs shows (apart from MAC address differences):

30c30
< Timecounter ACPI-safe frequency 3579545 Hz quality 1000
---
> Timecounter ACPI-fast frequency 3579545 Hz quality 1000

What on earth is that all about? The slow box has the ACPI-fast  
timecounter...
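
One way to produce such a comparison without the MAC address noise is to
mask the addresses before diffing. A rough sketch, assuming each box's
dmesg output has been saved to a file; the helper names and the file names
in the usage comment are placeholders, not anything from the systems above.

```shell
# mask_macs FILE: print FILE with anything shaped like an Ethernet
# address replaced by a fixed placeholder.
mask_macs() {
    sed -E 's/([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}/<MAC>/g' "$1"
}

# diff_dmesg FIRST SECOND: diff two saved dmesg files, ignoring MACs.
# Returns diff's exit status (0 = identical, 1 = differences found).
diff_dmesg() {
    a=$(mktemp) && b=$(mktemp) || return 2
    mask_macs "$1" > "$a"
    mask_macs "$2" > "$b"
    diff "$a" "$b"
    status=$?
    rm -f "$a" "$b"
    return $status
}

# Usage (file names are hypothetical):
#   diff_dmesg dmesg.fast dmesg.slow
```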


/Eirik



Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Eirik Øverby
Update: The diff below was made after making sure both systems are  
running the exact same kernel. Behavior is the same. Building new  
kernels (6-STABLE) now to get out of the BETA stage.


/Eirik

On Nov 28, 2005, at 22:53 , Eirik Øverby wrote:


Firmware versions are equal. BIOS settings are equal.
However, a diff of the dmesgs shows (apart from MAC address
differences):


30c30
< Timecounter ACPI-safe frequency 3579545 Hz quality 1000
---
> Timecounter ACPI-fast frequency 3579545 Hz quality 1000

What on earth is that all about? The slow box has the ACPI-fast  
timecounter...


/Eirik



Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Scot Hetzel
On 11/28/05, Eirik Øverby [EMAIL PROTECTED] wrote:
 Update: The diff below was made after making sure both systems are
 running the exact same kernel. Behavior is the same. Building new
 kernels (6-STABLE) now to get out of the BETA stage.

 /Eirik

 On Nov 28, 2005, at 22:53 , Eirik Øverby wrote:

  Firmware versions are equal. BIOS settings are equal.
  However, a diff of the dmesgs shows (apart from MAC address
  differences):
 
  30c30
  < Timecounter ACPI-safe frequency 3579545 Hz quality 1000
  ---
  > Timecounter ACPI-fast frequency 3579545 Hz quality 1000
 
  What on earth is that all about? The slow box has the ACPI-fast
  timecounter...
 

Use sysctl to find out which timecounters are available on the slow box:

sysctl kern.timecounter
:
kern.timecounter.hardware: ACPI-fast
kern.timecounter.choice: TSC(800) ACPI-fast(1000) i8254(0) dummy(-100)
:

Then try setting the timecounter on the slow box to ACPI-safe (if
available) using sysctl:

sysctl kern.timecounter.hardware=ACPI-safe

If this fixes the problem, then add the following to /etc/sysctl.conf:

kern.timecounter.hardware=ACPI-safe
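
The "if available" check can also be scripted. The following is only a
sketch, assuming kern.timecounter.choice keeps the NAME(quality) format
shown above; has_timecounter is a made-up helper name, and setting the
sysctl needs root.

```shell
# has_timecounter CHOICES NAME: succeed if NAME appears in a
# kern.timecounter.choice string such as "TSC(800) ACPI-fast(1000)".
has_timecounter() {
    for entry in $1; do
        # Strip the "(quality)" suffix and compare the bare name.
        [ "${entry%%\(*}" = "$2" ] && return 0
    done
    return 1
}

# On the slow box:
#   choices=$(sysctl -n kern.timecounter.choice)
#   if has_timecounter "$choices" ACPI-safe; then
#       sysctl kern.timecounter.hardware=ACPI-safe
#   else
#       echo "ACPI-safe not offered: $choices"
#   fi
```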

Scot

--
DISCLAIMER:
No electrons were maimed while sending this message. Only slightly bruised.


Re: Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-28 Thread Kris Kennaway
On Mon, Nov 28, 2005 at 10:53:00PM +0100, Eirik Øverby wrote:
 Firmware versions are equal. BIOS settings are equal.
 However, a diff of the dmesgs shows (apart from MAC address differences):
 
 30c30
 < Timecounter ACPI-safe frequency 3579545 Hz quality 1000
 ---
 > Timecounter ACPI-fast frequency 3579545 Hz quality 1000
 
 What on earth is that all about? The slow box has the ACPI-fast  
 timecounter...

Could be ACPI bugs on your system:

 BIOS update.

Kris




Reduced java/tomcat performance 6-beta3 - 6-stable ?

2005-11-25 Thread Eirik Øverby

Hi all,

are there any obvious changes between 6.0-BETA3 and 6.0-RELEASE /
6.0-STABLE that I should be aware of, that could cause a quite noticeable
decline in performance (and a change in performance patterns) for
java/tomcat?


On a BETA-3 system I'm seeing, with the particular application we're  
running, about 28 transactions/second over a 10 minute interval. With  
-RELEASE and -STABLE I'm lucky to reach 24, and it'll usually wobble  
around 20.
Another oddity is that where the BETA-3 system starts out with good  
performance from the beginning when running load tests, the -RELEASE  
and -STABLE systems need a good 20 seconds to reach their max,  
starting out very low (3-10 transactions/second for the first 10  
seconds or so).
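
To put the figures above in relative terms, the drop can be computed
directly. A small sketch using awk for the arithmetic; pct_drop is just an
illustrative helper name.

```shell
# pct_drop BASE NEW: percentage by which NEW falls short of BASE
pct_drop() {
    awk -v base="$1" -v new="$2" \
        'BEGIN { printf "%.1f\n", (base - new) / base * 100 }'
}

pct_drop 28 24   # best case on -RELEASE/-STABLE vs. BETA-3
pct_drop 28 20   # typical case on -RELEASE/-STABLE vs. BETA-3
```

This prints 14.3 and 28.6, i.e. a loss of roughly 14% at best and closer
to 29% in the typical case.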


This is on HP DL385 servers with dual 2.4 GHz Opteron CPUs, running
FreeBSD-amd64 from 15k RPM drives in cached RAID.


Hardware and software configuration (apart from the base system),
network configuration and latencies, database access, etc. are
identical on all systems.


Any ideas?

Thanks,
/Eirik
