Rob van der Heij <[EMAIL PROTECTED]> writes:
> Yes, in your configuration you should define expanded storage. It's
> for providing a hierarchy in storage management as well as a
> circumvention to reduce the impact of contention under the bar.
> Especially when the total active memory of your Linux server is
> getting close to 2G (and unless you do things, eventually the entire
> Linux virtual machine main memory will appear active to VM).
> 25% has been suggested as a starting point, but measurements should
> help you determine the right value. The right value depends a lot on
> what Linux is doing. And make sure to disable MDC into expanded
> storage as I suggested yesterday.

note if you have 16gbytes of expanded store and 16gbytes of regular
storage ... then only stuff in the 16gbytes of regular store can be
used/executed. stuff in expanded store has to be brought into regular
store to be accessed (and something in regular store pushed out
... possibly exchanging places with stuff in expanded store).

if you have 32gbytes of regular store ... then everything in regular
store can be directly used/executed ... w/o being moved around.

expanded store was introduced on the 3090 because of a physical memory
packaging problem. a lot more electronic memory could be attached
cost-effectively to a 3090 than could be physically packaged within
the normal processor execution/access latency requirements.

rather than going to something like numa & sci ... it was packaged on
a special wide bus (with longer latency), and special software was
introduced to manage pages. it might be considered akin to electronic
paging drums or controller caches ... but the movement was
significantly more efficient being done with a high-performance
synchronous instruction rather than very expensive asynchronous i/o
handling. as an aside, when 800mbit/sec hippi i/o was attached to the 3090 ...
it was crafted into the expanded storage bus using peek/poke semantics
since it was the only interface on the 3090 that was capable of
handling the data rate.

part of the issue was some cache and i/o studies performed by SJC in
the late 70s and early 80s. a special vm system was built that
efficiently captured all record references (much more efficiently than
monitor) and this was deployed on a number of systems in the san jose
area (standard product vm/cms ... but also some number of production mvs
systems running virtually).

the detailed i/o trace information was captured for weeks of data on
different systems. various cache, record, and paging models were built
and run with the actual trace data. for a given, fixed amount of
electronic store, the most efficient use of that electronic store was
a single global system cache ... dividing the same amount of
storage into (partitioned) channel, controller, and/or drive caches was
always less efficient than having a single large global system cache.

this also supports the issue i raised as an undergraduate in the 60s
with the enhancements i had done to cp/67. the original cp/67 had
very inefficient thrashing controls and a very inefficient replacement
algorithm. about
that time, there was some literature published about working set for
controlling thrashing and "local" LRU replacement algorithms. For
cp/67, I implemented a highly efficient "global" LRU replacement
algorithm and my own variation on working set for thrashing controls.

However, in much the same way that my global LRU replacement algorithm
was much more efficient than a local LRU replacement algorithm ... the
I/O cache simulation studies showed that a single global cache was
more efficient than any partitioned cache implementation (given the
same amount of fixed electronic storage).
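
as a rough illustration of the partitioned-vs-global point (a toy
sketch of my own, not the trace-driven models used at sjc; the device
counts, reference skew and cache capacities are made-up numbers), the
following python fragment replays one synthetic reference stream
against a single global LRU cache and against the same total capacity
split evenly into per-device caches. the global cache can shift
capacity toward the busier devices, so its hit ratio comes out at
least as good:

# toy sketch only -- not the actual sjc trace-driven simulator.
# compares the hit ratio of one global lru cache against the same total
# capacity split into per-device (partitioned) lru caches.
from collections import OrderedDict
import random

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.refs = 0

    def reference(self, key):
        self.refs += 1
        if key in self.entries:
            self.entries.move_to_end(key)         # most recently used
            self.hits += 1
        else:
            if len(self.entries) >= self.capacity:
                self.entries.popitem(last=False)  # evict least recently used
            self.entries[key] = True

def synthetic_trace(n_refs=200000, n_devices=4, records_per_device=5000):
    weights = [8, 4, 2, 1]            # some devices much busier than others
    for _ in range(n_refs):
        dev = random.choices(range(n_devices), weights)[0]
        yield dev, random.randrange(records_per_device)

random.seed(1)
total_capacity, n_devices = 2000, 4
global_cache = LRUCache(total_capacity)
per_device = [LRUCache(total_capacity // n_devices) for _ in range(n_devices)]

for dev, rec in synthetic_trace():
    global_cache.reference((dev, rec))   # one big system cache
    per_device[dev].reference(rec)       # same capacity carved up per device

part_hits = sum(c.hits for c in per_device)
part_refs = sum(c.refs for c in per_device)
print("global hit ratio:      %.3f" % (global_cache.hits / global_cache.refs))
print("partitioned hit ratio: %.3f" % (part_hits / part_refs))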

Somewhat in the same time frame as the electronic cache studies
(better than ten years after i had done the original global LRU work),
there was a big uproar over a draft stanford phd thesis that involved
global LRU replacement strategies. There was significant pushback
against granting the stanford phd on the grounds that the global LRU
strategies were in conflict with the local LRU stuff that had been
published in the literature in the late 60s. After much conflict, the
stanford Phd thesis was finally approved and the person was awarded
their phd.

in any case, back to the original example. if you have 16gbytes of
normal storage and 16gbytes of expanded storage, then there can be a
total of 32gbytes of virtual pages resident in electronic storage, but
only 16gbytes of those virtual pages can be used at any one time.  Any
access to virtual pages in expanded storage (at best) requires moving
a page from expanded to normal and a page from normal to expanded
(exchanging pages).

however, if you configure 32gbytes of normal storage and no expanded
storage ... then you can also have 32gbytes of virtual pages resident
in electronic storage ... but all 32gbytes of virtual pages are
usable directly (no fiddling with moving pages back & forth between
expanded storage and normal storage).
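
purely to make the arithmetic concrete (a toy sketch with made-up,
scaled-down frame counts and a random reference pattern; not a model
of a real vm paging subsystem), the python fragment below runs the
same reference stream against a split configuration and against a
single store of the combined size. both keep the same pages in
electronic storage; the split configuration just pays for a steady
stream of exchanges:

# toy sketch: main store of M frames plus expanded store of E frames,
# versus a single main store of M+E frames. pages found in expanded
# storage must be exchanged with a main-store page before use.
from collections import OrderedDict
import random

def run(main_frames, expanded_frames, trace):
    main, expanded = OrderedDict(), OrderedDict()
    exchanges = disk_reads = 0
    for page in trace:
        if page in main:
            main.move_to_end(page)                # directly usable
            continue
        if page in expanded:                      # exchange expanded <-> main
            expanded.pop(page)
            exchanges += 1
        else:                                     # not in electronic storage
            disk_reads += 1
        if len(main) >= main_frames:              # push a main-store lru page down
            victim, _ = main.popitem(last=False)
            if expanded_frames:
                if len(expanded) >= expanded_frames:
                    expanded.popitem(last=False)  # falls all the way out to disk
                expanded[victim] = True
        main[page] = True
    return exchanges, disk_reads

random.seed(1)
# 2400 distinct pages touched at random: more than fits in "16g" of main
# store alone, but less than the total electronic storage either way
trace = [random.randrange(2400) for _ in range(100000)]
print("16g main + 16g expanded (exchanges, disk reads):", run(1600, 1600, trace))
print("32g main +  0  expanded (exchanges, disk reads):", run(3200, 0, trace))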

the possible exception is if the paging supervisor has some
deficiencies in identifying virtual pages with a variety of activity
levels ... and there is going to be access to more total virtual pages
than real pages (resulting in some real page i/o). reducing real
storage by allocating some to expanded storage can fix some page
management problems by forcing the kernel to more frequently "trim"
what it considers the number of active pages in an address space.  the
downside of this trimming is mitigated by the trimmed pages being moved
back & forth to expanded storage rather than all the way to disk. with
more frequent trimming, the code might do a
better job of deciding which pages go to disk and which stay in
electronic storage some place. the hope is that bad decisions about
what is on disk and what is in memory are reduced and the better
decisions offset both the more frequent trimming and the
overhead of the brownian motion of any pages going to & fro between
expanded storage and normal storage.

of course the ideal situation is to not have expanded storage at all
(eliminating the unnecessary overhead of moving pages back & forth)
and simply do a much more sophisticated job of managing all the pages
in a single global storage.

for some additional topic drift, a side effect of the i/o record trace
work was that it was noticed that there were daily, weekly and monthly
cycles ... where collections of data that weren't normally being used
on a constant basis would have clustered bursty use. some of this
later showed up in places like adsm (now tsm) having to do with
migration of clusters of data (that were used together) as part of a
"container". past collected postings on having done the internal
backup system that eventually morphed into the workstation datasave
product and then into adsm (and since renamed tsm).
http://www.garlic.com/~lynn/subtopic.html#backup

past mention of the detailed i/o cache work:
http://www.garlic.com/~lynn/99.html#104 Fixed Head Drive (Was: Re:Power distribution (Was: Re: A primeval C compiler)
http://www.garlic.com/~lynn/99.html#105 Fixed Head Drive (Was: Re:Power distribution (Was: Re: A primeval C compiler)
http://www.garlic.com/~lynn/2003g.html#55 Advantages of multiple cores on single chip
http://www.garlic.com/~lynn/2003n.html#33 Cray to commercialize Red Storm
http://www.garlic.com/~lynn/2004g.html#13 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004g.html#20 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004i.html#0 Hard disk architecture: are outer cylinders still faster than
http://www.garlic.com/~lynn/2004i.html#1 Hard disk architecture: are outer cylinders still faster than inner cylinders?
http://www.garlic.com/~lynn/2004q.html#76 Athlon cache question
http://www.garlic.com/~lynn/2005.html#2 Athlon cache question
http://www.garlic.com/~lynn/2005m.html#12 IBM's mini computers--lack thereof
http://www.garlic.com/~lynn/2005m.html#13 IBM's mini computers--lack thereof
http://www.garlic.com/~lynn/2005m.html#28 IBM's mini computers--lack thereof
http://www.garlic.com/~lynn/2005m.html#55 54 Processors?
http://www.garlic.com/~lynn/2005n.html#23 Code density and performance?

past mention of the stanford phd global lru work:
http://www.garlic.com/~lynn/98.html#2 CP-67 (was IBM 360 DOS (was Is Win95 without DOS...))
http://www.garlic.com/~lynn/99.html#18 Old Computers
http://www.garlic.com/~lynn/2001c.html#10 Memory management - Page replacement
http://www.garlic.com/~lynn/2002c.html#16 OS Workloads : Interactive etc
http://www.garlic.com/~lynn/2002k.html#63 OT (sort-of) - Does it take math skills to do data processing ?
http://www.garlic.com/~lynn/2003f.html#30 Alpha performance, why?
http://www.garlic.com/~lynn/2003f.html#55 Alpha performance, why?
http://www.garlic.com/~lynn/2003g.html#0 Alpha performance, why?
http://www.garlic.com/~lynn/2003k.html#8 z VM 4.3
http://www.garlic.com/~lynn/2003k.html#9 What is timesharing, anyway?
http://www.garlic.com/~lynn/2004.html#25 40th anniversary of IBM System/360 on 7 Apr 2004
http://www.garlic.com/~lynn/2004g.html#13 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004q.html#73 Athlon cache question
http://www.garlic.com/~lynn/2005d.html#37 Thou shalt have no other gods before the ANSI C standard
http://www.garlic.com/~lynn/2005d.html#48 Secure design
http://www.garlic.com/~lynn/2005f.html#47 Moving assembler programs above the line
http://www.garlic.com/~lynn/2005h.html#10 Exceptions at basic block boundaries
http://www.garlic.com/~lynn/2005n.html#23 Code density and performance?

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/
