Re: STM and persistent data structures performance on multi-core archs

Martin Thompson Tue, 18 Mar 2014 04:04:41 -0700

>> In my personal experience I cannot get within 10X the throughput, or
>> latency, of mutable data models when using persistent data models.
>
> Hi Martin,
> Thanks for finding this thread :-). Let me ask a reversed question. Given 
> you come from a persistent data model where code remains reasonably simple. 
> How much effort it really takes to make an imperative model working well 
> with relatively low number of defects? How deep you go to make sure that 
> data structures fit CPU architecture in terms of topology as well as of 
> size of caches? And how it scales in terms of writing the code itself (I 
> mean are code alternations are easy straightforward or you have to write it 
> from scratch)?
>


I've never heard of an "imperative model". I'm aware of imperative
programming. Can you expand on what you mean?

I think two points are being conflated in this thread. We started by
discussing the claim that "immutable data (actually persistent data
structures) solve the parallel problem". "Parallel" here refers to thread
counts rising as core counts increase in CPUs, but the real issue is
concurrent access to, and mutation of, data structures in the micro and
domain models in the macro. This is being conflated with the performance
of data structures in general.

I believe that concurrent access to data structures should not be the 
default design approach. The greatest complexity in any system often comes 
from concurrent access to state, be it persistent or not. It is better to
have private data structures within processing contexts (threads, 
processes, services, etc.), which communicate via messages. Within these 
private processing contexts concurrent access is not an issue and thus 
non-concurrent data structures can be employed. With non-concurrent access 
it is easy to employ rich data structures like basic arrays, 
open-addressing hash maps, B+ and B* trees, bitsets, bloom filters, etc., 
without which many applications would be unusable due to performance and 
memory constraints.
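
To make the message-passing idea concrete, here is a minimal sketch in
Java. It is not part of the original post: CounterService, its message
types, and the queue capacity are hypothetical choices made purely for
illustration. One worker thread owns a plain HashMap, and other threads
only ever hand it immutable messages through a queue, so the map itself
needs no locking.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;

/** Hypothetical example: counters whose state is owned by exactly one thread. */
final class CounterService {
    /** Immutable messages are the only things that cross thread boundaries. */
    private interface Message {}

    private static final class Increment implements Message {
        final String key;
        final long delta;
        Increment(String key, long delta) { this.key = key; this.delta = delta; }
    }

    private static final class Query implements Message {
        final String key;
        final CompletableFuture<Long> reply;
        Query(String key, CompletableFuture<Long> reply) { this.key = key; this.reply = reply; }
    }

    private static final class Stop implements Message {}

    // Bounded inbox; the capacity of 1024 is an arbitrary illustrative value.
    private final BlockingQueue<Message> inbox = new ArrayBlockingQueue<>(1024);

    // Plain, non-concurrent map: only the worker thread ever reads or writes it.
    private final Map<String, Long> counts = new HashMap<>();

    private CounterService() {}

    /** Creates the service and starts its single owning thread. */
    static CounterService start() {
        CounterService service = new CounterService();
        Thread worker = new Thread(service::run, "counter-worker");
        worker.setDaemon(true);
        worker.start();
        return service;
    }

    /** Worker loop: processes messages one at a time, in arrival order. */
    private void run() {
        try {
            while (true) {
                Message m = inbox.take();
                if (m instanceof Increment) {
                    Increment inc = (Increment) m;
                    counts.merge(inc.key, inc.delta, Long::sum);
                } else if (m instanceof Query) {
                    Query q = (Query) m;
                    q.reply.complete(counts.getOrDefault(q.key, 0L));
                } else {
                    return; // Stop message: leave the loop and end the thread.
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private void send(Message m) {
        try {
            inbox.put(m); // Blocks when the inbox is full (simple back-pressure).
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while sending", e);
        }
    }

    void increment(String key, long delta) { send(new Increment(key, delta)); }

    /** Asks the owning thread for a value and waits for its reply. */
    long get(String key) {
        CompletableFuture<Long> reply = new CompletableFuture<>();
        send(new Query(key, reply));
        return reply.join();
    }

    void stop() { send(new Stop()); }
}
```

A caller would use it as CounterService.start(), then increment("a", 2)
and get("a"). Because messages are handled in arrival order, a query
sent by a thread after its own increments observes them. The same shape
works with an array, an open-addressing map, or a bitset in place of the
HashMap, since the structure never leaves its owning thread.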

When working in the context of single threaded access to data structures I 
see using a blend of functional, set theory, OO, and imperative programming 
techniques as the best way to go. Right tool for the job. I see the effort 
levels as very similar when choosing the appropriate technique that fits a 
problem. Many times I have seen messy code and pain result from
inappropriate techniques applied blindly, like religion, to problems that
are just not a good fit.

So, to answer your question more directly: if I need concurrent access to
shared data structures, I prefer them to be persistent from a usability
perspective, provided the performance constraints of my application are
satisfied. If I need greater performance I find non-persistent data
structures can give better performance for a marginal increase in 
complexity. I've never quantified it but it feels like percentage points 
rather than factors. I frame this in the context that the large step in 
complexity here is choosing to have concurrent access to the data. It is 
the elephant in the room. Those who are not good at concurrent programming 
should just not be in this space in the first place, because if they are, it
will get ugly either way. I'd recommend reading the findings of the "double 
hump" paper from Brunel University on people's natural ability in
programming.

http://blog.codinghorror.com/separating-programming-sheep-from-non-programming-goats/

Hope this helps clarify.

Martin...


-- 
You received this message because you are subscribed to the Google
Groups "Clojure" group.
To post to this group, send email to clojure@googlegroups.com
Note that posts from new members are moderated - please be patient with your 
first post.
To unsubscribe from this group, send email to
clojure+unsubscr...@googlegroups.com
For more options, visit this group at
http://groups.google.com/group/clojure?hl=en