On Apr 8, 2008, at 10:25 AM, Martin Probst wrote:

> I guess once you're into concurrency, everything argues for immutable data structures.
Some ad lib observations on memory sharing, concurrency, and the JVM:

- Probably the hardest part about updating the Java memory model was making sure the rules for immutable objects (final fields) would work properly without synchronization. It was necessary, because Java needs to work well on multiprocessors.

- Hotspot put in thread-local allocation buffers a decade ago, again to avoid contention by memory partitioning. We weren't quite as painfully cache-conscious back then, but cache locality was also a consideration. (Still, somebody needs to experiment more with cache-conscious coloring for TLABs. Sounds like a master's project to me.)

- An immutable object is not immutable at first. The trick is that the mutable part of its lifecycle is private, and so the system can cleverly protect it from contention, e.g., using TLABs. (It occurs to me that "larvatus" = "hidden"; an immutable object has a larval stage during which it is constructed.) This can be generalized to other controlled mutation phases; that's what GC safepoints do, for example: they allow immutable objects to be temporarily mutated. There might be an interesting "phased immutability" software abstraction to be investigated along these lines.

- The JVM needs to know about immutability in order to optimize it properly. However, there are interesting (semi-)global analyses which can help it discover immutability even when it is not clearly marked in the source program. (Optimistic ones are the coolest, and require a framework for dependency checking and deoptimization to recover. Hotspot is good at that, and can get better. I think there are PhD projects there.)

- In modern memory systems, main memory is usually many tens of cycles away from each CPU. Caches try to hide this. As a corollary, there are at least two protocols in the hidden network that connects memories, caches, and CPUs, and they differ greatly in intrinsic performance characteristics.
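To make the "larval stage" idea concrete, here is a minimal Java sketch (class and method names are illustrative, not from Hotspot): the mutable phase is confined to a thread-private builder, and once construction finishes, the final-field rules of the Java memory model guarantee that any thread that sees the object sees its fields fully initialized, with no synchronization.

```java
// A sketch of phased immutability: mutation happens only in the
// private "larval" Builder; the published Point is immutable, and
// its final fields are safely visible across threads (JLS 17.5).
final class Point {
    public final int x;
    public final int y;

    private Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    // Larval stage: mutable, but confined to the constructing thread.
    static final class Builder {
        private int x, y;
        Builder x(int x) { this.x = x; return this; }
        Builder y(int y) { this.y = y; return this; }
        // Metamorphosis: freeze into the immutable adult form.
        Point build() { return new Point(x, y); }
    }
}
```

The key point is that no other thread can observe the object during its mutable phase, so the JVM (and the hardware) never has to arbitrate contended writes to it.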
In essence, one protocol is (often) called RTS (read to share) and one is RTO (read to own). RTO means your CPU and cache have to handshake with the rest of the world to make sure you have an exclusive right to a block of memory (cache line) before you can either read or write it. Your program incurs this cost because it intends to write the memory. But you want to use RTS when you can, because any number of CPUs can RTS the same block of memory (cache line) and happily work on the same data, as long as it remains immutable. (I'm tempted to think RTS:RTO :: UDP:TCP; there are some basic similarities, I suppose.) In any case, when memory is shared, immutable memory is faster than mutable, and the JVM works hard to exploit the difference.

Best,
-- John
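The RTS/RTO difference can be felt from plain Java. The micro-benchmark sketch below (names and sizes are illustrative, and the timings will vary by machine and JVM) has several threads either read the same cache line, which stays shared, or write adjacent slots of it, which forces each writer to repeatedly acquire exclusive ownership of the line:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: threads hammering one ~64-byte cache line.
// Read-only traffic lets every CPU keep a shared copy (RTS-style);
// writes to adjacent slots ping-pong exclusive ownership (RTO-style).
class ShareVsOwn {
    static final long[] line = new long[8];   // roughly one cache line
    static final int ITERS = 2_000_000;
    static final AtomicLong sink = new AtomicLong(); // defeats dead-code elimination

    static long run(boolean write, int nThreads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        CountDownLatch done = new CountDownLatch(nThreads);
        long start = System.nanoTime();
        for (int t = 0; t < nThreads; t++) {
            final int slot = t % line.length;
            pool.execute(() -> {
                long sum = 0;
                for (int i = 0; i < ITERS; i++) {
                    if (write) line[slot]++;       // needs exclusive ownership per write
                    else       sum += line[slot];  // line can stay shared by all readers
                }
                sink.addAndGet(sum);
                done.countDown();
            });
        }
        done.await();
        pool.shutdown();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) throws Exception {
        int n = Runtime.getRuntime().availableProcessors();
        System.out.printf("read-only: %d ms%n", run(false, n) / 1_000_000);
        System.out.printf("writing:   %d ms%n", run(true, n) / 1_000_000);
    }
}
```

On a typical multicore machine the writing run is markedly slower, even though each thread touches its own slot, because the slots share a cache line; that slowdown is the cost of RTO traffic that immutable data never pays.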
