On Apr 8, 2008, at 10:25 AM, Martin Probst wrote:

> I guess once you're into concurrency, everything argues for immutable
> data structures.

Some ad lib observations on memory sharing, concurrency, and the JVM:

- Probably the hardest part about updating the Java memory model was  
making sure the rules for immutable objects (final fields) would work  
properly without synchronization.  It was necessary, because Java  
needs to work well on multiprocessors.
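
The guarantee at stake can be sketched in a few lines of Java (class and field names here are mine, purely illustrative): once a constructor finishes, any thread that sees a reference to the object is guaranteed to see its final fields fully initialized, even though the reference was published through a plain, unsynchronized field.

```java
// Sketch: an immutable point whose final fields are safely visible to any
// thread that sees a reference to it, with no synchronization -- the
// JSR-133 final-field guarantee.
final class Point {
    private final int x;
    private final int y;
    Point(int x, int y) { this.x = x; this.y = y; }
    int x() { return x; }
    int y() { return y; }
}

public class FinalFieldDemo {
    static Point shared;  // published via a plain field; finals are still safe

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> shared = new Point(3, 4));
        writer.start();
        writer.join();
        // Any thread that observes `shared` non-null is guaranteed to see
        // x == 3 and y == 4, because both fields are final.
        System.out.println(shared.x() + "," + shared.y());
    }
}
```

Note the guarantee holds only if `this` does not escape the constructor; leak it early and the rules no longer apply.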

- Hotspot put in thread-local allocation buffers a decade ago, again  
to avoid contention by partitioning memory.  We weren't quite  
as painfully cache conscious back then, but cache locality was also a  
consideration.  (Still, somebody needs to experiment more with cache- 
conscious coloring for TLABs.  Sounds like a master's project to me.)
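
The fast path those buffers enable is roughly bump-pointer arithmetic, which can be sketched like this (names and layout are mine; the real logic lives in HotSpot's C++ runtime):

```java
// Rough sketch of a TLAB-style bump-pointer allocation fast path.  Each
// thread allocates from its own buffer, so the hot path needs no atomic
// operations at all; contention is confined to the refill slow path.
public class TlabSketch {
    static final class Tlab {
        final long[] heapWords;  // stand-in for a chunk of heap
        int top;                 // next free word

        Tlab(int sizeWords) { heapWords = new long[sizeWords]; }

        // Returns the offset of the new "object", or -1 if the buffer is
        // exhausted -- at which point the real VM refills the TLAB from the
        // shared heap, paying for a lock or CAS only on that slow path.
        int allocate(int words) {
            if (top + words > heapWords.length) return -1;
            int obj = top;
            top += words;  // pure thread-local bump: no contention
            return obj;
        }
    }

    public static void main(String[] args) {
        Tlab tlab = new Tlab(8);
        System.out.println(tlab.allocate(4));  // 0
        System.out.println(tlab.allocate(4));  // 4
        System.out.println(tlab.allocate(1));  // -1: buffer full, slow path
    }
}
```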

- An immutable object is not immutable at first.  The trick is that  
the mutable part of its lifecycle is private, and so the system can  
cleverly protect it from contention, e.g., using TLABs.  (Occurs to  
me that "larvatus" = "hidden"; an immutable object has a larval stage  
during which it is constructed.)  This can be generalized to other  
controlled mutation phases; that's what GC safepoints do, for  
example:  Allow immutable objects to be temporarily mutated.  There  
might be an interesting "phased immutability" software abstraction to  
be investigated along these lines.
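
One everyday approximation of the larval stage is the builder pattern: mutation is confined to a private, single-threaded construction phase, after which the object is frozen and can be shared freely.  A minimal sketch (class names illustrative, not from any VM):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// "Phased immutability" by hand: a mutable larval Builder, touched only by
// the creating thread, which freezes into an immutable adult object.
public class LarvalDemo {
    static final class FrozenList {
        private final List<String> items;  // final: safe to share unsynchronized

        private FrozenList(List<String> items) {
            this.items = Collections.unmodifiableList(items);
        }

        List<String> items() { return items; }

        // Larval phase: freely mutable, but never published to other threads.
        static final class Builder {
            private final List<String> pending = new ArrayList<>();
            Builder add(String s) { pending.add(s); return this; }
            // Metamorphosis: defensive copy, then immutable forever after.
            FrozenList freeze() { return new FrozenList(new ArrayList<>(pending)); }
        }
    }

    public static void main(String[] args) {
        FrozenList frozen = new FrozenList.Builder().add("a").add("b").freeze();
        System.out.println(frozen.items());
    }
}
```

The interesting question above is whether the runtime, rather than the programmer, could manage such phases, as it already does at GC safepoints.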

- The JVM needs to know about immutability in order to optimize it  
properly.  However, there are interesting (semi-)global analyses  
which can help it discover immutability even when it is not clearly  
marked in the source program.  (Optimistic ones are the coolest, and  
require a framework for dependency checking and deoptimization to  
recover.  Hotspot is good at that, and can get better.  I think there  
are PhD projects there.)

- In modern memory systems, main memory is usually many tens of  
cycles away from each CPU.  Caches try to hide this.  As a corollary,  
there are at least two protocols in the hidden network that connects  
memories, caches, and CPUs, and they differ greatly in intrinsic  
performance characteristics.  In essence, one protocol is (often)  
called RTS (read to share) and one is RTO (read to own).  RTO means  
your CPU & cache have to handshake with the rest of the world to make  
sure you have an exclusive right to a block of memory (cache line),  
before you can either read or write it.  Your program incurs this  
cost because it intends to write the memory.  But you want to use RTS  
when you can, because any number of CPUs can RTS the same block of  
memory (cache line) and happily work on the same data, as long as it  
remains immutable.  (I'm tempted to think RTS:RTO :: UDP:TCP; there  
are some basic similarities I suppose.)  In any case, when memory is  
shared, immutable memory is faster than mutable, and the JVM works  
hard to exploit the difference.
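
The RTO cost is easy to provoke from Java: two threads bumping adjacent fields force one cache line to ping-pong between cores in owned state, while padding the fields onto separate lines lets each core keep its own.  A small experiment in that spirit (the hand-rolled padding and field names are mine; exact timings vary by CPU, so the code prints them rather than asserting a ratio):

```java
// False-sharing demo: adjacent hot fields share a cache line (RTO ping-pong);
// padded fields land on separate lines and stop contending.
public class FalseSharingDemo {
    static final class Adjacent { volatile long a; volatile long b; }
    static final class Padded {
        volatile long a;
        long p1, p2, p3, p4, p5, p6, p7;  // push `a` and `b` onto different lines
        volatile long b;
    }

    static long run(Runnable r1, Runnable r2) throws InterruptedException {
        Thread t1 = new Thread(r1), t2 = new Thread(r2);
        long start = System.nanoTime();
        t1.start(); t2.start(); t1.join(); t2.join();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws InterruptedException {
        final int N = 5_000_000;
        Adjacent adj = new Adjacent();
        Padded pad = new Padded();
        // Each field is written by exactly one thread, so the final sums are
        // deterministic even though the timings are not.
        long tAdj = run(() -> { for (int i = 0; i < N; i++) adj.a++; },
                        () -> { for (int i = 0; i < N; i++) adj.b++; });
        long tPad = run(() -> { for (int i = 0; i < N; i++) pad.a++; },
                        () -> { for (int i = 0; i < N; i++) pad.b++; });
        System.out.println("adjacent ns=" + tAdj + " padded ns=" + tPad);
        System.out.println(adj.a + adj.b == 2L * N && pad.a + pad.b == 2L * N);
    }
}
```

On most multicore hardware the padded version runs noticeably faster, which is the RTS-vs-RTO difference made visible from pure Java.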

Best,
-- John

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups "JVM 
Languages" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/jvm-languages?hl=en
-~----------~----~----~----~------~----~------~--~---