Re: STM and persistent data structures performance on mutli-core archs

2014-04-10 Thread Justin Smith
One could quantify this with Category Theory. In order to map from one domain to another reversibly, the target domain must have at least as many elements as the source domain, if not more. An abstraction which simplifies by reducing the number of elements in play is guaranteed to be leaky. Of

Re: STM and persistent data structures performance on mutli-core archs

2014-03-31 Thread Andy C
This is a slightly different result as this time I measure elapsed time (see appendix and excuse not so nice code) as opposed to a clock time. Results are similar (unless you have more processes than cores). I am planning to release the code to github soon. +--++---+ | # of

Re: STM and persistent data structures performance on mutli-core archs

2014-03-31 Thread Andy C
Memory access patterns make a huge a difference to memory throughput. I've explored this in some detail in the following blog. http://mechanical-sympathy.blogspot.co.uk/2012/08/memory-access-patterns-are-important.html Thanks for sharing. From the Clojure perspective and using reducers, it

Re: STM and persistent data structures performance on mutli-core archs

2014-03-30 Thread François Rey
On 30/03/14 07:40, Andy C wrote: Here are results where numbers are normalized gains. ++---++ | # of processes | random | linear| ++---++ |1 | 1.00| 1.00 |

Re: STM and persistent data structures performance on mutli-core archs

2014-03-30 Thread Martin Thompson
Memory access patterns make a huge a difference to memory throughput. I've explored this in some detail in the following blog. http://mechanical-sympathy.blogspot.co.uk/2012/08/memory-access-patterns-are-important.html On Sunday, 30 March 2014 06:40:24 UTC+1, Andy C wrote: Hi, So this is a

Re: STM and persistent data structures performance on mutli-core archs

2014-03-29 Thread Andy C
Hi, So this is a follow-up. I claimed that 1 CPU core can saturate the memory but it turns out I was wrong, at least to some extend. Driven by curiosity I decided to do some measurements and test my somewhat older MBP 2.2GHz Inter Core i7. While it obviously all depends on the hardware, I thought

Re: STM and persistent data structures performance on mutli-core archs

2014-03-20 Thread François Rey
On 20/03/14 04:03, Andy C wrote: So, the following test puzzles me. Not because it takes virtually the same time (I know that Fork/Join is not cheap and memory is probably the biggest bottleneck here). But because I do not get why map (as opposed to r/ma) uses all 8 cores on my MacBookPro.

Re: STM and persistent data structures performance on mutli-core archs

2014-03-20 Thread László Török
into uses reduce under the hood, you probably need fold to have the computation to run on FJ: (def a (time (r/foldcat (r/map #( Math/sin (* % %)) l) this runs more than 2x as fast on my macbook pro Las On Thu, Mar 20, 2014 at 10:34 AM, François Rey fmj...@gmail.com wrote: On 20/03/14

Re: STM and persistent data structures performance on mutli-core archs

2014-03-20 Thread François Rey
On 20/03/14 12:26, László Török wrote: into uses reduce under the hood, you probably need fold to have the computation to run on FJ: Yes, now I see all my CPUs being maxed-out. francois@laptop:~$ perf stat java -cp ~/.m2/repository/org/clojure/clojure/1.5.1/clojure-1.5.1.jar clojure.main -e

Re: STM and persistent data structures performance on mutli-core archs

2014-03-19 Thread Raoul Duke
I like FSMs, but they do not compose well. some have argued for generative grammars that generate the fsm, because it is generally easier to compose grammars, and then generate the final fsm. iiuc. -- You received this message because you are subscribed to the Google Groups Clojure group. To

Re: STM and persistent data structures performance on mutli-core archs

2014-03-19 Thread John Mastro
Due to the path-copy semantics, the contention gets driven to the root of the tree. Out of curiosity, would reference counting (rather than or in addition to normal GC) help with this? Or is reference counting problematic in a highly concurrent environment? It seems like reference cycles

Re: STM and persistent data structures performance on mutli-core archs

2014-03-19 Thread Martin Thompson
1. Due to the path-copy semantics, the contention gets driven to the root of the tree. Out of curiosity, would reference counting (rather than or in addition to normal GC) help with this? Or is reference counting problematic in a highly concurrent environment? It seems like

Re: STM and persistent data structures performance on mutli-core archs

2014-03-19 Thread Andy C
So, the following test puzzles me. Not because it takes virtually the same time (I know that Fork/Join is not cheap and memory is probably the biggest bottleneck here). But because I do not get why map (as opposed to r/ma) uses all 8 cores on my MacBookPro. All of them seem to be running

Re: STM and persistent data structures performance on mutli-core archs

2014-03-19 Thread Andy C
On Wed, Mar 19, 2014 at 11:14 AM, Raoul Duke rao...@gmail.com wrote: I like FSMs, but they do not compose well. some have argued for generative grammars that generate the fsm, because it is generally easier to compose grammars, and then generate the final fsm. iiuc. I thought about it too

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Martin Thompson
As a co-author of the reactive manifesto I'd like to point out that reactive can be considered a superset of async. Good reactive applications are event driven and non-blocking. They are also responsive, resilient, and scalable which async can help with but does not prescribe. What are the bad

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Martin Thompson
In my personal experience I cannot get within 10X the throughput, or latency, of mutable data models when using persistent data models. Hi Martin, Thanks for finding this thread :-). Let me ask a reversed question. Given you come from a persistent data model where code remains

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Andy C
As a co-author of the reactive manifesto I'd like to point out that reactive can be considered a superset of async. Good reactive applications are event driven and non-blocking. They are also responsive, resilient, and scalable which async can help with but does not prescribe. What are the

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Gary Trakhman
Martin, You recommend message-passing approaches like Erlang's as generally superior, but I'm curious if there's any more specific thoughts to the relative tradeoffs from shared-memory by default vs message-passing, ie, where you might rely on hardware-level copies (cache coherence) for

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Andy C
I've never heard of imperative model. I'm aware of imperative programming. Can you expand on what you mean? I meant mutable data model. Sorry for mixing up terms. http://blog.codinghorror.com/separating-programming-sheep-from-non-programming-goats/ Hope this helps clarify. It does.

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Gary Trakhman
If it's any consolation, queues or delays are used in hardware to overcome limitations in hardware, too :-). Specifically, I'm thinking of anti-jitter stuff in audio circuits. On Tue, Mar 18, 2014 at 12:45 PM, Andy C andy.coolw...@gmail.com wrote: I've never heard of imperative model. I'm

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Martin Thompson
As a co-author of the reactive manifesto I'd like to point out that reactive can be considered a superset of async. Good reactive applications are event driven and non-blocking. They are also responsive, resilient, and scalable which async can help with but does not prescribe. What are

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Laurent PETIT
2014-03-18 17:45 GMT+01:00 Andy C andy.coolw...@gmail.com: I've never heard of imperative model. I'm aware of imperative programming. Can you expand on what you mean? I meant mutable data model. Sorry for mixing up terms.

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Raoul Duke
some sort of FSM. Perhaps concurrency could be modeled using FSMs, but I do not believe it is always a simple transition. http://www.cis.upenn.edu/~stevez/papers/LZ06b.pdf :-) -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group,

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread François Rey
On 18/03/14 18:03, Martin Thompson wrote: Our use of language in the technology industry could, for sure, be better. Take simple examples like RAM where random should be arbitrary, or don't get me started on people who misuse the term agnostic ;-) I would even say our use of abstractions in

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Raoul Duke
The thing is that our industry is based on layers upon layers of abstractions, whether at the physical level (integrated circuits, interfaces, etc.) or at the software level: binary (1GL) abstracted into assembly (2GL), then C language (3GL), etc. Virtual machines is now another you maybe

Re: STM and persistent data structures performance on mutli-core archs

2014-03-18 Thread Andy C
On Tue, Mar 18, 2014 at 11:06 AM, Raoul Duke rao...@gmail.com wrote: some sort of FSM. Perhaps concurrency could be modeled using FSMs, but I do not believe it is always a simple transition. http://www.cis.upenn.edu/~stevez/papers/LZ06b.pdf I like FSMs, but they do not compose well. A.

Re: STM and persistent data structures performance on mutli-core archs

2014-03-17 Thread François Rey
On 16/03/14 18:24, Softaddicts wrote: I think that significant optimizations have to be decided at a higher level. I doubt that any of that can be implemented at the hardware level alone and let it decide on the fly. This sounds like magic, too good to be true. I am also quite convinced that

Re: STM and persistent data structures performance on mutli-core archs

2014-03-17 Thread Luc Prefontaine
I would say that guidelines can help but curiosity is mandatory. I used to sleep with books on my bedside table about hardware design, operating system internals, io subsystems, networking, ... I did this for 15 years after I started to work with computers. Aside from work related stuff of

Re: STM and persistent data structures performance on mutli-core archs

2014-03-17 Thread Andy C
From what I understand, a single core can easily saturate a memory bus. At the same time L2 and L3 caches are so small as compared to GB of memory systems yet growing them does not necessarily help either due to larger latencies. It all limits the number of practical applications which could

Re: STM and persistent data structures performance on mutli-core archs

2014-03-17 Thread Luc Prefontaine
In the old days, the principle of locality was prevalent. Most data allocation was static or if not contiguous (the stack). Code was also contiguous being most of the time statically compiled on several pages. Memory was costly, its size remained reasonable. Today, data is scattered everywhere

Re: STM and persistent data structures performance on mutli-core archs

2014-03-17 Thread Andy C
Today, data is scattered everywhere in a huge memory, its location changes frequently, code can be dynamically compiled, get loaded by small chunks, It describes my case actually - I am loading about 2-8GB of stuff in the memory and to tons of mapping, grouping by and filtering. That

Re: STM and persistent data structures performance on mutli-core archs

2014-03-17 Thread Andy C
In my personal experience I cannot get within 10X the throughput, or latency, of mutable data models when using persistent data models. Hi Martin, Thanks for finding this thread :-). Let me ask a reversed question. Given you come from a persistent data model where code remains reasonably

Re: STM and persistent data structures performance on mutli-core archs

2014-03-16 Thread Softaddicts
I agree with most of his arguments but not all of them. Memory subsystems have always been a major concern. Since 35 years ago, many designs went through simulation before burning anything on chip. Especially SMP designs with shared memory given the cost of prototyping. Simulations used typical

Re: STM and persistent data structures performance on mutli-core archs

2014-03-15 Thread François Rey
On 15/03/14 01:59, Andy C wrote: Maybe one day this idea http://en.wikipedia.org/wiki/Lisp_machine will come back, I mean in a new form .. That reminds me of this Urbit project http://www.urbit.org/, which is nowhere near usefulness at present, but deserves to be mentioned as a (radical)

Re: STM and persistent data structures performance on mutli-core archs

2014-03-15 Thread François Rey
Martin's point about immutable and persistent data structures is further developed in his interview on infoq http://www.infoq.com/interviews/reactive-system-design-martin-thompson, you can skim to point #9 if you're in a hurry. Overall what he says is that in terms of scalability of the

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Raoul Duke
cough cough erlang cough ahem -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Эльдар Габдуллин
He talks about simple things actually. When you have any sort of immutable data structure and you want to change it from multiple threads you just must have a mutable reference which points to the current version of that data structure. Now, updates to that mutable reference are fundamentally

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Gary Trakhman
On Fri, Mar 14, 2014 at 12:58 PM, Raoul Duke rao...@gmail.com wrote: cough cough erlang cough ahem Care to elaborate? :-) -- You received this message because you are subscribed to the Google Groups Clojure group. To post to this group, send email to clojure@googlegroups.com Note that

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Raoul Duke
cough cough erlang cough ahem Care to elaborate? :-) Now his point is that GC acts a super GIL which effectively kills all the hard work done on the language and application design level. erlang's approach to gc / sharing data / multi process / smp is i think an interesting sweet spot. there

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Gary Trakhman
Shared memory is kind of a degenerate case of message-passing when you think about cache-coherency protocols and such. I'd argue erlang's message-passing isn't inherently superior, but kind of obfuscates the issue. What's the sequential fraction of an arbitrary erlang program, can you even know

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Raoul Duke
i am probably out of my depth here, i do not have extensive real-world experience with the various ways to approach parallelism and concurrency (to be distinguished of course), more run of the mill stuff. so if i sound like i'm missing your point or am clueless i ask for your patience :-) What's

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Gary Trakhman
On Fri, Mar 14, 2014 at 1:35 PM, Raoul Duke rao...@gmail.com wrote: i am probably out of my depth here, i do not have extensive real-world experience with the various ways to approach parallelism and concurrency (to be distinguished of course), more run of the mill stuff. so if i sound like

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Gary Trakhman
Everything has tradeoffs. One more example was when Martin explained how he was able to get a 10x performance boost by pinning a JVM to each socket in a server, then pinning that JVM's memory to the RAM sockets closest to that CPU socket. Then he carefully setup shared memory message passing via

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Raoul Duke
that closely match or can be massaged to match or 'have sympathy' for the hardware realities. I think this can get lost when we stray too far. i wish this were somehow more modeled, composed, and controllable up in our ides and source code, rather than being esoteric tweaky options in the

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Gary Trakhman
Yes I agree, that was my next thought. I wish we could do these knobs in a single JVM. On Fri, Mar 14, 2014 at 2:20 PM, Raoul Duke rao...@gmail.com wrote: that closely match or can be massaged to match or 'have sympathy' for the hardware realities. I think this can get lost when we stray

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread guns
On Fri 14 Mar 2014 at 02:15:07PM -0400, Gary Trakhman wrote: Everything has tradeoffs. One more example was when Martin explained how he was able to get a 10x performance boost by pinning a JVM to each socket in a server, then pinning that JVM's memory to the RAM sockets closest to that CPU

Re: STM and persistent data structures performance on mutli-core archs

2014-03-14 Thread Andy C
Maybe one day this idea http://en.wikipedia.org/wiki/Lisp_machine will come back, I mean in a new form .. In any case, I think that it would be great to see some performance benchmarks for STM A. -- You received this message because you are subscribed to the Google Groups Clojure group. To

Re: STM and persistent data structures performance on mutli-core archs

2014-03-13 Thread Timothy Baldridge
I talked to Martin after a CodeMesh, and had a wonderful discussion with him about performance from his side of the issue. From his side I mean super high performance. You have to get a bit of background on some of his work (at least the work he talks about), much of what he does is high

Re: STM and persistent data structures performance on mutli-core archs

2014-03-13 Thread Andy C
On Thu, Mar 13, 2014 at 11:24 AM, Timothy Baldridge tbaldri...@gmail.comwrote: I talked to Martin after a CodeMesh, and had a wonderful discussion with him about performance from his side of the issue. From his side I mean super high performance. [...] Hi Tim, Thanks for explaining the