That is an internal link to the bug report. Here is the external link: http://bugs.opensolaris.org/view_bug.do?bug_id=6583268
-Prakash.

Prakash Sangappa wrote:
> Are you running Solaris 10u3+? In that case this problem may be due to
>
>   6583268 tmpfs tries too hard to reserve memory
>   <http://monaco.sfbay/detail.jsp?cr=6583268>
>
> This is currently fixed in Nevada. I guess it will be back-ported in an
> S10 patch.
>
> -Prakash.
>
> adrian cockcroft wrote:
>
>> How fast do disks turn? You get one page per revolution. Adding more
>> swap disks would only help if there were more than one thread trying
>> to read the data. The Ultra 1 had a nice fast 7200 rpm SCSI disk...
>>
>> Adrian
>>
>> On 8/15/07, Peter C. Norton <[EMAIL PROTECTED]> wrote:
>>
>> What is confusing me in this case is that the speeds seem
>> unreasonably slow. I would expect, for instance, the speed of swap
>> access to scale with the speed of the disk, CPU, etc.
>>
>> In this case, the speed of swap looks like it has stayed about what
>> it was on an Ultra 1 while everything else has shot ahead, leading me
>> to feel there is a specific limit in there that could be easily
>> removed or tuned so this is not as visible.
>>
>> Thanks,
>>
>> -Peter
>>
>> On Wed, Aug 15, 2007 at 03:49:33PM -0700, adrian cockcroft wrote:
>> > I've seen this before and it's expected behavior; this is my
>> > explanation:
>> >
>> > Pages are written to swap using large sequential writes, in
>> > physical-memory scanner order, so they are guaranteed to be randomly
>> > jumbled on disk. This means that swap is as fast as possible for
>> > writes and as slow as possible for reads, which will be random seeks
>> > one page at a time. It's been this way forever. Anything that
>> > swaps/pages out will be horribly slow on the way back in.
>> >
>> > Add enough RAM to never swap, or possibly mount a real disk or a
>> > solid-state disk for /tmp.
>> >
>> > Adrian
>> >
>> > On 8/15/07, Peter C. Norton <[EMAIL PROTECTED]> wrote:
>> > >
>> > > On Wed, Aug 15, 2007 at 04:29:54PM -0400, Jim Mauro wrote:
>> > > >
>> > > > What would be interesting here is the paging statistics during
>> > > > your test case. What do "vmstat 1" and "vmstat -p 1" look like
>> > > > while you're generating this behavior?
>> > > >
>> > > > Is it really the case that reading/writing from swap is slow, or
>> > > > simply that the system on the whole is slow because it's dealing
>> > > > with a sustained memory deficit?
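[A quick back-of-envelope check of Adrian's one-page-per-revolution
point above. The 7200 rpm figure is his; the 8 KB page size and the
single-reader assumption are illustrative, not from the thread:

#include <stdio.h>

/* If a randomly-read swap page comes back at one page per disk
 * revolution, single-threaded page-in bandwidth is just rotational
 * speed times page size, no matter how fast the rest of the machine
 * has become. */
int main(void)
{
    const double rpm          = 7200.0;      /* Adrian's disk speed    */
    const double page_bytes   = 8192.0;      /* assumed 8 KB page size */
    const double revs_per_sec = rpm / 60.0;  /* 120 revolutions/second */

    const double bytes_per_sec = revs_per_sec * page_bytes;
    printf("single-threaded page-in: ~%.0f KB/s\n", bytes_per_sec / 1024.0);
    return 0;
}

That works out to roughly 960 KB/s, which can be compared with the pi
column (KB paged in per second) in the vmstat output that follows.]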
>> > >
>> > > It tends to look something like this:
>> > >
>> > > $ vmstat 1
>> > >  kthr      memory            page            disk          faults      cpu
>> > >  r b w    swap    free  re mf   pi po fr de sr  cd  cd m1 m1   in  sy   cs us sy id
>> > >  0 0 0  19625708 3285120 1  4    1  1  1  0  6   1   1 11 11  455 260  247  4  0 96
>> > >  0 1 70 16001276  645628 2 27 3428  0  0  0  0 442 447  0  0 3012 516 1982 97  3  0
>> > >  0 1 70 16001276  642208 0  0 3489  0  0  0  0 437 432  0  0 3074 381 2002 97  3  0
>> > >  0 1 70 16001276  638964 0  0 3343  0  0  0  0 417 417  0  0 2997 350 1914 98  2  0
>> > >  0 1 70 16001276  635504 0  0 3442  0  0  0  0 430 434  0  0 3067 536 2016 97  3  0
>> > >  0 1 70 16001276  632076 0  0 3434  0  0  0  0 429 425  0  0 3164 885 2125 97  3  0
>> > >  0 1 70 16001276  628548 0  0 3549  0  0  0  0 445 445  0  0 3185 582 2105 97  3  0
>> > >  0 1 70 16001276  625104 0  0 3459  0  0  0  0 463 469  0  0 3376 594 2100 97  3  0
>> > >
>> > > $ vmstat -p 1
>> > >      memory           page          executable      anonymous      filesystem
>> > >    swap    free   re mf fr de sr  epi epo epf  api apo apf  fpi fpo fpf
>> > >  19625616 3285052  1  4  1  0  6    0   0   0    0   0   0    1   0   1
>> > >  16001244  440392 21 31  0  0  0    0   0   0    0   0   0 2911   0   0
>> > >  16001244  437120 21  0  0  0  0    0   0   0    0   0   0 3188   0   0
>> > >  16001244  433592 14  0  0  0  0    0   0   0    0   0   0 3588   0   0
>> > >  16001244  429732 28  0  0  0  0    0   0   0    0   0   0 3712   0   0
>> > >  16001244  426036 18  0  0  0  0    0   0   0    0   0   0 3679   0   0
>> > >  16001244  422448  2  0  0  0  0    0   0   0    0   0   0 3468   0   0
>> > >  16001244  418980  5  0  0  0  0    0   0   0    0   0   0 3435   0   0
>> > >  16001244  416012  8  0  0  0  0    0   0   0    0   0   0 2855   0   0
>> > >  16001244  412648  8  0  0  0  0    0   0   0    0   0   0 3256   0   0
>> > >  16001244  409292 31  0  0  0  0    0   0   0    0   0   0 3426   0   0
>> > >  16001244  405760 10  0  0  0  0    0   0   0    0   0   0 3602   0   0
>> > >
>> > > > Also, I'd like to understand better what you're looking to
>> > > > optimize for. In general, "tuning" for swap is a pointless
>> > > > exercise (and it's not my contention that that is what you're
>> > > > looking to do; I'm not actually sure), because the IO performance
>> > > > of the swap device is really a second-order effect of having a
>> > > > memory working-set size larger than physical RAM, which means the
>> > > > kernel spends a lot of time doing memory management.
>> > >
>> > > I think we're trying to optimize so that use of swap has as little
>> > > impact as possible. With multiple large Java processes needing to
>> > > run in as little time as possible, and with business demands that
>> > > make it impossible to keep the overall RSS under real memory 100%
>> > > of the time, we want to minimize the impact of page-ins.
>> > >
>> > > > The poor behavior of swap may really be just a symptom of other
>> > > > activities related to memory management.
>> > >
>> > > Possibly.
>> > >
>> > > > What kind of machine is this, and what does CPU utilization look
>> > > > like when you're inducing this behavior?
>> > >
>> > > These are a variety of systems: IBM 360, Sun v20z, and x4100 (we
>> > > have M1s and M2s; I have personally only tested on M1 systems).
>> > > This behavior seems consistent on all of them.
>> > >
>> > > The program we're using to pin memory is this:
>> > >
>> > > #include <stdio.h>
>> > > #include <stdlib.h>
>> > > #include <unistd.h>
>> > >
>> > > int main(int argc, char** argv)
>> > > {
>> > >     if (argc != 2) {
>> > >         printf("Bad args\n");
>> > >         return 1;
>> > >     }
>> > >
>> > >     /* Buffer size in bytes, from the command line. */
>> > >     const int count = atoi(argv[1]);
>> > >     if (count <= 3) {
>> > >         printf("Bad count: %s\n", argv[1]);
>> > >         return 1;
>> > >     }
>> > >
>> > >     /* Malloc: count bytes hold count >> 2 four-byte ints. */
>> > >     const int nints = count >> 2;
>> > >     int* buf = (int*)malloc(count);
>> > >     if (buf == NULL) {
>> > >         perror("Failed to malloc");
>> > >         return 1;
>> > >     }
>> > >
>> > >     /* Init: write every word so all pages are really allocated. */
>> > >     for (int i = 0; i < nints; i++) {
>> > >         buf[i] = rand();
>> > >     }
>> > >
>> > >     /* Maintain working set: rewrite the whole buffer forever. */
>> > >     for (;;) {
>> > >         for (int i = 0; i < nints; i++) {
>> > >             buf[i]++;
>> > >         }
>> > >         /* sleep(1); */
>> > >     }
>> > >
>> > >     return 0;
>> > > }
>> > >
>> > > Nothing too complex. Reads and writes to /tmp and /var/tmp in our
>> > > tests were all done with dd.
>> > >
>> > > I am following up with Sun support on this, but in the meantime I
>> > > am curious whether you or anyone else out there sees the same
>> > > behavior.
>> > >
>> > > Thanks,
>> > >
>> > > -Peter
>> > >
>> > > --
>> > > The 5 year plan:
>> > > In five years we'll make up another plan.
>> > > Or just re-use this one.

_______________________________________________
perf-discuss mailing list
perf-discuss@opensolaris.org
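[For concreteness, the tests described above might be driven with
commands along these lines; the file and program names, sizes, and
compiler flags are illustrative guesses, not taken from the thread:

$ gcc -std=c99 -o pinmem pinmem.c    # build the pinning program above
$ ./pinmem 2000000000 &              # pin ~2 GB of anonymous memory
$ dd if=/dev/zero of=/tmp/bigfile bs=1024k count=2048   # write 2 GB to tmpfs-backed /tmp
$ dd if=/tmp/bigfile of=/dev/null bs=1024k              # read it back: the slow, seek-bound path

Since the program sizes its buffer with atoi(), a single instance tops
out at INT_MAX bytes, about 2 GB; larger working sets need several
instances running at once.]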