Re: zSeries performance heavily dependent on working set size

Fargusson.Alan Thu, 14 Aug 2003 18:36:27 -0700

Yes, but not ware I can get at them right now.

This may depend on location, but around here programmers are very expensive.  We are 
not allowed to just fire them when we don't need them anymore, so the cost tends to go 
on for long periods of time.


I don't know what the MMC would be, but the memory itself and the installation is a 
one time cost that can be amortized over a few years.

-----Original Message-----
From: Dennis Wicks [mailto:[EMAIL PROTECTED]
Sent: Monday, August 11, 2003 12:04 PM
To: [EMAIL PROTECTED]
Subject: Re: zSeries performance heavily dependent on working set size


Do you have some real numbers to back up that claim?

Many people make the mistake of comparing the one-time-cost
of a programmer changing a program to the recurring cost of
hardware upgrades. There may be installation charges and there
will most certainly be an increase in MMC for the foreseeable
future.

Also, once you have your programmers trained, (they still do
that don't they?) they will not write any "bad" programs from
then on out, but if you just throw hardware at the problem it
is still a problem and will likely increase in severity as the
"bad" programming practices proliferate throughout the system.

Good Luck!
Dennis





                    "Fargusson.Alan"
                    <[EMAIL PROTECTED]       To:     [EMAIL PROTECTED]
                    tb.ca.gov>              cc:
                    Sent by: Linux on       Subject:     Re: zSeries performance 
heavily dependent on working set size
                    390 Port
                    <[EMAIL PROTECTED]
                    IST.EDU>


                    08/11/2003 01:31
                    PM
                    Please respond to
                    Linux on 390 Port






Or you could add more real storage.  At todays prices adding more memory is
usually cheaper than paying a programmer to rework his program.

-----Original Message-----
From: Kris Van Hees [mailto:[EMAIL PROTECTED]
Sent: Monday, August 11, 2003 11:26 AM
To: [EMAIL PROTECTED]
Subject: Re: zSeries performance heavily dependent on working set size


What you describe is a very common problem with OSes that implement virtual
memory. It's typically pretty much OK when your program's data space fits
in
real memory, but once you run beyond that, performance will most definitely
be much worse when your inner loop runs through all the pages.  After all,
you
are hitting a page fault at a lot of pages, causing major paging activity.
And when the working set itself is generally larger than available real
memory,
you end up paging out earlier touched pages in favour of new pages, only to
reverse that again when the next offset loop starts.  You just keep cycling
through the pages being swapped in and out.

As far as I know, no OS with virtual memory has a real solution to this
because
there is no way for the OS to know how to solve the problem.  It's a bad
design
on the programmer's part :)  Some performance issues can only be solved by
using algorithms implemented in a way that takes these issues into account.

        Kris

On Mon, Aug 11, 2003 at 11:17:15AM -0700, Jim Sibley wrote:
> In following up on some performance problems on the
> zSeries, we've noticed that the zSeries is very
> sensitive to working set size, especially for writes.
> This may explain some of the "poor" performance that
> people ascribe to the zSeries.
>
> Is locality of reference as senstive on other
> platforms? How does this effect such things as VM EC
> guests, data base programs with large tables, and java
> programs?
>
> Consider the loop below:
>
> It riffles through memory sequentially a byte at a
> time. It writes (flag=1) to each byte within a page
> before going to the next page. In other words, it has
> a "nice" worksing set size. (See complete program
> below for details).
>
> byteSingle=255;
> time(&start);
>   for (i=1;i<=iter;i++) {
>      hits=0;
>         for (page=0;page<bytes;page=page+4096)
>      for (byteINpage=0;byteINpage<4096;byteINpage++)
>         {
>            hits++;
>            byteAddr=page+byteINpage;
>            if (flag) byteArray[byteAddr]=byteSingle;
>            else byteSingle= byteArray[byteAddr];
>        }
>     }
> time(&finish);
>
> Consider what happens when you reverise the page and
> byteINpage loops:
>
>      for (byteINpage=0;byteINpage<4096;byteINpage++)
>     for (page=0;page<bytes;page=page+4096)
>
> where you touch a byte in each page before going to
> the second byte. The working set becomes "terrible".
> And so does the performance.
>
> Complete program:
>
> /*
>   MemSeq - memory loop, reading or writing,
> sequentially
>
>   [EMAIL PROTECTED]
>   this is uncopyrighted material, August 5, 2003
>   use at your own risk
>
> */
> #include <stdlib.h>
> #include <stdio.h>
> #include <time.h>
>
> void usage();
> int main(
>   int    argc,
>   char * argv[])
> {
> char   byteArray[ 512*1024*1024 ];
> char   byteSingle;
> int    i,j;
> int    next;
> int    mb=1;
> int    bytes, byteAddr, byteINpage
> int hits;
> int    page;
> int    iter=0;
> int    flag=0;
> time_t start;
> time_t finish;
>
> /* pick apart args */
>   for (next = 1; next < argc; next++)
>   if (memcmp(argv[next], "-m", 2) == 0)
>      {
>        mb = atol(argv[next][2]? &argv[next][2]:
> argv[++next]);
>      }
>     else if (memcmp(argv[next], "-i", 2) == 0)
>      {
>        iter = atol(argv[next][2]? &argv[next][2]:
> argv[++next]);
>      }
>     else if (memcmp(argv[next], "-w", 2) == 0) flag=1;
>     else if (memcmp(argv[next], "-r", 2) == 0) flag=0;
>     else usage();
>
> bytes = mb*1024*1024;
>
> printf("MemSeq ");
> if ( flag ) printf("Write");
> else printf("Read " );
>
> printf( " %d MB, %d iterations ", mb,iter);
>
> /* initiliaze memory to something other than all zeros
> */
> j=0;
> if ( iter > 0)
>    for (i=0;i<bytes;i=i++) {
>       byteArray[i]=j;
>       j++;
>       if (j == 256) j=0;
>       }
> /* riffle through memory, writing or reading,
> sequentially */
> /* the page and byteINpage loops are reversed from
> MemPage */
> byteSingle=255;
> time(&start);
>   for (i=1;i<=iter;i++) {
>      hits=0;
>         for (page=0;page<bytes;page=page+4096)
>      for (byteINpage=0;byteINpage<4096;byteINpage++)
>         {
>            hits++;
>            byteAddr=page+byteINpage;
>            if (flag) byteArray[byteAddr]=byteSingle;
>            else byteSingle= byteArray[byteAddr];
>        }
>     }
> time(&finish);
>
> printf(" %.0f seconds %d bytes/iter
> \n",difftime(finish,start),hits);
>
> }
> void
> usage()
> {
>   printf( "MemSeq USAGE:\n");
>   printf(
>     "MemSeq [-i KB ] [-m MB [-w | -r] \n"
>     " -i iterations (default=0 iterations) \n"
>     "      (default is 1 \n"
>     " -m bytes to malloc in MB (default=1 MB)\n"
>     " -w memory write\n"
>     " -r memory read (default)\n"
>     );
>   exit(1);
> }
>
>
>
>
> =====
> Jim Sibley
> Implementor of Linux on zSeries in the beautiful Silicon Valley
>
> "Computer are useless.They can only give answers." Pablo Picasso
>
> __________________________________
> Do you Yahoo!?
> Yahoo! SiteBuilder - Free, easy-to-use web site design software
> http://sitebuilder.yahoo.com

--
Never underestimate a Mage with:
 - the Intelligence to cast Magic Missile,
 - the Constitution to survive the first hit, and
 - the Dexterity to run fast enough to avoid being hit a second time.

Re: zSeries performance heavily dependent on working set size

Reply via email to