Watch for the Domino 6.5 Redbook. I had a chance to spend some
time researching and understanding the performance of Linux guests
under z/VM and how to improve storage requirements. The write-up
will be in this redbook. The last day of the residency is tomorrow, so
I would suspect a redpiece in 60 days or so.
Could someone help me out? We are in the process of purchasing a zSeries and plan on
running Linux. I received this email about performance problems on the zSeries due to
working set size. What is the working set size? And what size zSeries machines are having
the problem?
[EMAIL PROTECTED]
On Thu, 28 Aug 2003, Mike Lovins wrote:
Could someone help me out? We are in the process of purchasing a zSeries and plan
on running Linux. I received this email about performance problems on the zSeries
due to working set size. What is the working set size? And what size zSeries machines are
You can get some tips on it at
http://www.redbooks.ibm.com/redbooks/SG246926.html
- Original Message -
From: John Summerfield [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Thursday, August 28, 2003 5:24 PM
Subject: Re: zSeries performance heavily dependent on working set size
Dave Rivers wrote:
One of the tests was the typical dhrystone test... that's
the one I had numbers reported for.
Well, dhrystone has such a small working set that this shouldn't
really affect anything. The problem with dhrystone is really
that its main loop doesn't actually *do* anything,
An interesting test might be a ray tracer. Tends to have interesting working
sets for large images, and the code/data separation (and locality in
general) should be fairly substantial.
-- db
David Boyes
Sine Nomine Associates
Dave Rivers wrote:
I may, however, be mis-attributing the reason for the
performance benefit. There was some thought that
gcc's use of relative instructions (which should also
be fine in zSeries) might be the culprit...
Admittedly - these are all guesses, and could use
the watchful
Peter Vander Woude wrote:
Yes, on the zSeries machines, the effect of code/data proximity can be huge. If
the data that is being updated is within 256 bytes of the instruction
that is updating it, there is a huge performance impact.
Yes, I'm certainly aware of the effects of storing into or near
the
What you are seeing is the result of a badly configured
paging subsystem - look at your DASD performance when you
run this.
What happens: Linux touches all its pages when it boots.
These pages then, over time, get paged out. Then you run your
program - and all those pages get paged back in. Please
Dave Rivers wrote:
On a per-function basis - but not within functions; because
gcc points R13 at the literal pool; which can be quite large
(and different from the code location in sufficiently large
functions.)
Separating code and literal pool would appear likely to cause
a net win on
Most has been said, but I think responders failed to make the link to
typical Linux virtual machines. In a way, a large Linux machine will
behave like the second-worst-case scenario that Jim describes. Apart
from the initial load that Barton outlined, a large Linux virtual machine
will continuously
On Mon, 11 Aug 2003, Dennis Wicks wrote:
Do you have some real numbers to back up that claim?
Many people make the mistake of comparing the one-time-cost
of a programmer changing a program to the recurring cost of
hardware upgrades. There may be installation charges and there
will most
Typical first reaction - it's the paging subsystem and
it's not tuned correctly.
However, the problem I describe is NOT related to
system paging or ANY PARTICULAR OS or DASD tuning.
During my experiments, there was NO paging going on as
reported by vmstats. The program was running in an
LPAR. The
What you describe is a very common problem with OSes that implement virtual
memory. It's typically pretty much OK when your program's data space fits in
real memory, but once you run beyond that, performance will most definitely
be much worse when your inner loop runs through all the pages. After
Typical first reaction - it's the paging subsystem and
it's not tuned correctly.
However, the problem I describe is NOT related to
system paging or ANY PARTICULAR OS or DASD tuning.
During my experiments, there was NO paging going on as
reported by vmstats.
You're probably overflowing the
I talked with one of the hardware gurus about what I
am seeing.
On a G6, the linesize (amount of data fetched from
memory at a time) is 256 bytes, the L1 cache is 256 KB
(1024 lines), and the L2 cache is 4 MB/6 CP (8 MB
total). The zSeries has the same linesize of 256
bytes, an L1-I
Rod,
I agree with you that the best solution to the issue of an actual
application that is doing this type of programming would be to correct
the program. While throwing hardware at it may seem easier, I'm always
amazed by programmers' reactions when I do point out that by making
minor changes to a
- Original Message -
From: Jim Sibley [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Monday, August 11, 2003 3:44 PM
Subject: Re: zSeries performance heavily dependent on working set size
Typical first reaction - it's the paging subsystem and
it's not tuned correctly.
I think in your
On Tue, 2003-08-12 at 11:57, Thomas David Rivers wrote:
Some versions of gcc don't do well with locality-of-reference
for functions that reference many literals. There is a global
pool of literals which can be far away. This approach could
easily artificially inflate the working
(would it be possible for you to configure your mail software to add
attribution lines when quoting? it makes it much easier to follow who said
what)
On Mon, Aug 11, 2003 at 05:22:11PM -0700, Jim Sibley wrote:
I think that the word you're searching for is pathological.
It's only pathological
performance heavily dependent on working set size
What you describe is a very common problem with OSes that implement virtual
memory. It's typically pretty much OK when your program's data space fits in
real memory, but once you run beyond that, performance will most definitely
be much worse when
I've recently run into this effect in some performance tests on FLEX-ES
based systems.
It turns out that FLEX-ES exhibits processor cache behavior very similar to
the
zSeries systems, being very affected by programs that do a lot of store
into/near code
operations. We have an advantage in that we
Typo - this line should end with semicolon
int bytes, byteAddr, byteINpage;
=
Jim Sibley
Implementor of Linux on zSeries in the beautiful Silicon Valley
Computers are useless. They can only give answers. - Pablo Picasso
Consider what happens when you reverse the page and
byteINpage loops:
for (byteINpage=0; byteINpage<4096; byteINpage++)
for (page=0; page<bytes; page=page+4096)
where you touch a byte in each page before going to
the second byte. The working set becomes terrible.
And so does the performance.
And
Subject: Re: zSeries performance heavily dependent on working set size
Typical first reaction - it's the paging subsystem and
it's not tuned correctly.
However, the problem I describe is NOT related to
system paging or ANY PARTICULAR OS or DASD tuning.
During my experiments, there was NO paging
What you are seeing is the result of a badly configured
paging subsystem
What happens: Linux touches all its pages when it boots.
These pages then, over time, get paged out. Then you run your
program - and all those pages get paged back in.
Applications with more consistent working sets would not
Subject: Re: zSeries performance heavily dependent on working set size
Do you have some real numbers to back up that claim?
Many people make the mistake of comparing the one-time-cost
of a programmer changing a program to the recurring cost of
hardware upgrades. There may be installation charges
In following up on some performance problems on the
zSeries, we've noticed that the zSeries is very
sensitive to working set size, especially for writes.
This may explain some of the poor performance that
people ascribe to the zSeries.
Is locality of reference as sensitive on other
platforms? How
Ulrich,
Yes, on the zSeries machines, the effect of code/data proximity can be huge. If
the data that is being updated is within 256 bytes of the instruction
that is updating it, there is a huge performance impact. Moving the
data to being outside of that range (or having the code get the
storage for the
Dave Rivers wrote:
On a per-function basis - but not within functions; because
gcc points R13 at the literal pool; which can be quite large
(and different from the code location in sufficiently large
functions.)
Separating code and literal pool would appear likely to cause
a net win
On Mon, 11 Aug 2003, Kris Van Hees wrote:
What you describe is a very common problem with OSes that implement virtual
memory. It's typically pretty much OK when your program's data space fits in
real memory, but once you run beyond that, performance will most definitely
be much worse when
Erm... far be it from me to argue with the master on the
performance of the actual physical hardware, but isn't the
point here that they've generated a worst-possible-case
scenario (is this called a degenerate case these days?)
I think that the word you're searching for is pathological.
It's
To: [EMAIL PROTECTED]
Subject: Re: zSeries performance heavily dependent on working set size
Rod,
I agree with you that the best solution to the issue of an actual
application that is doing this type of programming would be to correct
the program. While throwing hardware at it may seem easier, I'm
be, but the memory itself and the installation is a
one-time cost that can be amortized over a few years.
-Original Message-
From: Dennis Wicks [mailto:[EMAIL PROTECTED]
Sent: Monday, August 11, 2003 12:04 PM
To: [EMAIL PROTECTED]
Subject: Re: zSeries performance heavily dependent on working set size
prices adding more memory is
usually cheaper than paying a programmer to rework his program.
-Original Message-
From: Kris Van Hees [mailto:[EMAIL PROTECTED]
Sent: Monday, August 11, 2003 11:26 AM
To: [EMAIL PROTECTED]
Subject: Re: zSeries performance heavily dependent on working set size
On Mon, Aug 11, 2003 at 10:06:03PM +0200, Rod Furey wrote:
Erm... far be it from me to argue with the master on the
performance of the actual physical hardware, but isn't the
point here that they've generated a worst-possible-case
scenario (is this called a degenerate case these days?)
I
What would you do to improve Jim's program?
-Original Message-
From: Peter Vander Woude [mailto:[EMAIL PROTECTED]
Sent: Monday, August 11, 2003 1:24 PM
To: [EMAIL PROTECTED]
Subject: Re: zSeries performance heavily dependent on working set size
Rod,
I agree with you that the best