The shared L2 reduces the penalty for those situations when you can't avoid
dispatching on a new engine, which is precisely when the system is very
busy.  This is one of the reasons for the difference in utilization.  As
the machine gets busier, other machines are forced into L2-to-L2 or remote
L3-to-local-L1 (victim cache) transfers, which carry a high penalty.  On z
the migration is from the shared L2 to L1.

The less affinity scheduling delays dispatching, the more the system
behaves like a multiple-server, single-queue system, which is the optimum
case.  The more scheduling delays dispatching, the more the system behaves
like multiple single-server, single-queue systems, which will not perform
well if the load has skew or high variability.  Thus, if the affinities
are hardened (often done in skewless benchmark runs), skew will cause some
CPUs to overload while others sit idle.  If there is no affinity, there
are more cache migrations.  In between, there is a combination of the two
cases, and it is a matter of the migration penalty vs. the queueing
penalty of affinity scheduling.  Of course, this is yet another reason
that relative capacity is workload dependent.
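The queueing argument above can be illustrated with textbook M/M/c
formulas.  This is a sketch, not a model of any particular machine: it
assumes Poisson arrivals and exponential service, and the per-CPU arrival
rates in `skewed` are invented purely to illustrate load skew.  It
compares one shared queue feeding four servers (no affinity delay) against
four hardened per-CPU queues carrying skewed load.

```python
import math

def mm1_wait(lam, mu):
    """Mean queueing delay (excluding service) in an M/M/1 queue."""
    assert lam < mu, "queue must be stable"
    return lam / (mu * (mu - lam))

def mmc_wait(lam, mu, c):
    """Mean queueing delay in an M/M/c queue: one shared queue, c servers."""
    rho = lam / (c * mu)
    assert rho < 1, "queue must be stable"
    a = lam / mu
    # Erlang C: probability that an arrival has to wait
    partial = sum(a**k / math.factorial(k) for k in range(c))
    tail = (a**c / math.factorial(c)) / (1 - rho)
    erlang_c = tail / (partial + tail)
    return erlang_c / (c * mu - lam)

mu = 1.0          # service rate per CPU (normalized)
c = 4             # number of CPUs
lam_total = 3.0   # total arrival rate -> 75% average utilization

# No affinity delay: one shared queue, any idle CPU takes the next work unit.
shared = mmc_wait(lam_total, mu, c)

# Hardened affinity with skew: work pinned unevenly to CPUs
# (rates are an invented example; they sum to lam_total).
skewed = [0.95, 0.85, 0.7, 0.5]
avg_skewed_wait = sum(l * mm1_wait(l, mu) for l in skewed) / lam_total

print(f"shared-queue mean wait: {shared:.2f}")
print(f"skewed per-CPU queues:  {avg_skewed_wait:.2f}")
```

With these numbers the shared queue waits about half a service time, while
the skewed per-CPU queues average over eight, driven almost entirely by
the 95%-busy CPU.  In practice the shared queue's extra cache migrations
have to be weighed against this queueing penalty, which is exactly the
tradeoff described above.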

Another aspect of z's common L2 is that it always holds a copy of the data
in the L1s attached to it, so snooping is avoided.  High-end System x
systems (X460 class) accomplish this by keeping a shadow directory that
covers the on-chip caches.


Joe Temple
Distinguished Engineer
Sr. Certified IT Specialist
[EMAIL PROTECTED]
845-435-6301  295/6301   cell 914-706-5211
Home office 845-338-1448  Home 845-338-8794



From: Alan Altmark/Endicott/[EMAIL PROTECTED]
Sent by: Linux on 390 Port <[EMAIL PROTECTED]>
To: LINUX-390@VM.MARIST.EDU
Date: 05/18/2006 08:55 AM
Subject: Re: Who's been reading our list...
Please respond to: Linux on 390 Port <[EMAIL PROTECTED]>

On Thursday, 05/18/2006 at 10:03 ZE2, Martin Schwidefsky
<[EMAIL PROTECTED]> wrote:
> The cache is a different story. Mainframes have the advantage of a
> shared level 2 cache compared to x86. If a process migrates from one
> processor to another, the cache lines of the process just have to be
> loaded from level 2 cache to level 1 cache again before they can be
> accessed. On x86 it goes over memory.

The cache designs on the mainframe change from generation to generation to
deal with more work, changes in the relationship of CPU speed to memory
speed, and more CPUs.  You want the benefits of cache, but you want to
minimize the serialization/synchronization effects on the processors. This
is why we do our best to dispatch a virtual machine on the same CPU as was
used in the previous time slice.  The relationship between the CPUs and a
particular cache is not always equal, but it is always best if you use the
same CPU again.

Alan Altmark
z/VM Development
IBM Endicott

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or
visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
