I agree with most of his arguments but not all of them. Memory subsystems have always been a major concern. For the past 35 years, many designs have gone through simulation before anything was burned onto a chip, especially SMP designs with shared memory, given the cost of prototyping.
Simulations used "typical" workloads to identify which strategies should be implemented in hardware. Shared memory subsystems proved to be the main bottleneck every time. Any choice of strategy could create bottlenecks if the actual workload differed somewhat from the "average" simulated ones. Threads appeared much later, and this alone makes you wonder how much more pressure they put on shared memory subsystems.

The numbers Martin cites as speed increases (myth #x, CPUs are not getting faster) simply reflect better or worse hardware strategy choices that decreased idle time. Manufacturers are doing the same job as a decade ago: simulating typical workloads and finding how to optimize them as much as possible. We cannot push past the physical barriers anymore; we need a technology upgrade (atom level, quantum level, whatever). But such a change is no more elegant than a brute-force approach: it will reshape performance and create a new barrier farther away.

Having tuned generated machine code back when computers the size of domestic appliances were the norm, I would not like to go back to that age of analyzing the hardware to twist my software accordingly. Relying on hardware architectures to craft software is not, to me, a path with a bright future. Server hardware changes often enough that such an approach would mess up designs and be unsustainable in the long run.

We may end up someday with "tunable" hardware that adapts to workloads. By tunable I mean giving the hardware instructions on how to behave. This would be better than the above. We had such options with traditional compilers. I think that significant optimizations have to be decided at a higher level. I doubt that any of that can be implemented at the hardware level alone, leaving the hardware to decide on the fly. That sounds like magic, too good to be true. I am also quite convinced that optimizing single-threaded and multi-threaded processing in hardware are antagonistic goals given the current hardware designs.
Martin hits a number of significant nails on the head: we need to be aware of hardware limitations, and we need to measure the impact of our choices and adjust them within reachable goals. However, achieving 10% or less CPU idle time on a specific server architecture is not, to me, a goal. I'm interested in the constraints I have to meet (business and physical ones) and in having a large playground in which to meet them, whether that involves using more powerful or specially designed hardware or using better software designs.

Luc P.

> Martin's point about immutable and persistent data structures is further
> developed in his interview on infoq
> <http://www.infoq.com/interviews/reactive-system-design-martin-thompson>, you
> can skim to point #9 if you're in a hurry.
>
> Overall what he says is that in terms of scalability of the development
> activity, immutability and persistence are great ideas, since we don't
> have to deal with non-deterministic behaviour any more. When one needs
> to scale the running system, meaning increasing the rate at which the
> persistent data structure is updated, these can lead to performance
> issues in various ways:
> - longer GC pauses because persistence increases the number of objects
>   that are neither very short-lived nor long-lived,
> - contention because the root of the tree of the persistent data
>   structure becomes the focal point of concurrency,
> - increased CPU cache misses since persistent data structures are trees
>   that increasingly span larger non-contiguous and non-sequential parts of
>   memory
>
> Of these the last point is probably the most painful, since there's no
> way to deal with it unless one reconsiders the whole persistent data
> structure.
>
> In other words increasing the number of threads and cores may eventually
> lower throughput because the time taken for dealing with these issues
> (GC pauses, locking, cache misses) grows larger than the time taken for
> useful computation.
> I can't back up any of this with actual data and experience. However I
> think this old thread about poor performance on multicore
> <https://groups.google.com/forum/#%21topic/clojure/48W2eff3caU> does
> provide a clear picture of the problem, which becomes even clearer with
> actual stats showing CPUs were 83% idle
> <https://groups.google.com/d/msg/clojure/48W2eff3caU/FBFQp2vrWFgJ>, i.e.
> waiting for memory.
>
> Also one should view Martin's other videos on infoq
> <http://www.infoq.com/author/Martin-Thompson> to get a better
> understanding of his arguments. He's actually quite positive about
> Clojure in general. It's just that depending on the scalability and
> performance requirements, persistent data structures may not provide a
> satisfactory answer and could even lower throughput.
>
>
> On 14/03/14 18:01, ?????? ????????? wrote:
> > He talks about simple things actually.
> >
> > When you have any sort of immutable data structure and you want to
> > change it from multiple threads, you just must have a mutable
> > reference which points to the current version of that data structure.
> > Now, updates to that mutable reference are fundamentally serial.
> > Whatever synchronization strategy you choose, be that optimistic
> > updates (atom) or queuing (agent) or locks, you inevitably will have
> > contention with a large number of threads. When you run into that you
> > will also have a hundred ways to solve the problem.
> >
> > There is nothing magical about persistent data structures on
> > multi-core machines :)
> >
> > On Thursday, 13 March 2014, 20:58:54 UTC+4,
> > Andy C wrote:
> >
> > Hi,
> >
> > So the other day I came across this presentation:
> > <http://www.infoq.com/presentations/top-10-performance-myths>
> >
> > The guy seems to be smart and to know what he talks about; however,
> > when at 0:22:35 he touches on the performance (or lack thereof) of
> > persistent data structures on multi-core machines, I feel puzzled.
> >
> > He seems to have a point but really does not back it with any
> > details. There is also a claim that STM does not cooperate well
> > with GC. Is it true?
> >
> > Thanks,
> > Andy
> >
> > --
> You received this message because you are subscribed to the Google
> Groups "Clojure" group.
> To post to this group, send email to clojure@googlegroups.com
> Note that posts from new members are moderated - please be patient with your
> first post.
> To unsubscribe from this group, send email to
> clojure+unsubscr...@googlegroups.com
> For more options, visit this group at
> http://groups.google.com/group/clojure?hl=en
> ---
> You received this message because you are subscribed to the Google Groups
> "Clojure" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to clojure+unsubscr...@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>

--
Softaddicts<lprefonta...@softaddicts.ca> sent by ibisMail from my iPad!
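The remark quoted in this thread, that an immutable structure shared between threads still needs one mutable reference whose updates are fundamentally serial, can be illustrated with a rough sketch. Everything below is an assumption made for illustration: the `Atom` class and the `run` helper are hypothetical names, only loosely modeled on the optimistic-update behaviour of a Clojure atom, the lock merely emulates an atomic compare-and-set, and a plain tuple stands in for a persistent structure (without structural sharing). It shows the retry loop, not real multi-core performance.

```python
import threading


class Atom:
    """Hypothetical stand-in for an atom: one mutable reference
    pointing at the current immutable value."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()  # emulates an atomic compare-and-set
        self.attempts = 0              # update attempts, retries included

    def deref(self):
        return self._value

    def compare_and_set(self, expected, new):
        # Succeeds only if nobody replaced the root since it was read.
        with self._lock:
            self.attempts += 1
            if self._value is expected:
                self._value = new
                return True
            return False

    def swap(self, fn):
        # Optimistic update: build a new version, retry on conflict.
        while True:
            old = self.deref()
            new = fn(old)
            if self.compare_and_set(old, new):
                return new


def run(n_threads=4, updates_per_thread=500):
    # An immutable tuple plays the role of the persistent structure.
    root = Atom(())

    def worker(tid):
        for i in range(updates_per_thread):
            root.swap(lambda old: old + ((tid, i),))

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return root


if __name__ == "__main__":
    root = run()
    print("successful updates:", len(root.deref()))
    print("attempts (incl. retries):", root.attempts)
```

Every successful update has to go through the single `compare_and_set` on the root, so the gap between attempts and successful updates is work that was computed and then thrown away. How large that gap gets depends on the runtime and the number of threads, which is why measuring on the target system matters more than any figure from a sketch like this.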