Hello,

I'm interested in hearing use cases and parallelism problems where Spark
was *not* a good fit for you. This is an effort to understand the limits
of MapReduce-style parallelism.

Some broad things that pop out:
-- recursion (see the sketch after this list)
-- problems where the task graph is not known ahead of time
-- some graph problems
(http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html)
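
To make the recursion point concrete, here is a minimal sketch in Scala
(the input data and convergence threshold are invented for illustration)
of the usual workaround: the driver runs a loop of separate Spark jobs,
so the overall task graph is never expressed to Spark up front, and the
RDD lineage grows with every pass.

import org.apache.spark.sql.SparkSession

object IterativeSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("iterative-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Hypothetical data and threshold, purely for illustration.
    var values = sc.parallelize(Seq(100.0, 50.0, 10.0))
    var delta = Double.MaxValue

    // Each pass is a separate Spark job launched from the driver; the
    // number of passes depends on the data, so the full task graph is
    // never known ahead of time, and lineage grows each round.
    while (delta > 1.0) {
      val next = values.map(_ / 2.0).cache()
      delta = values.zip(next).map { case (a, b) => math.abs(a - b) }.sum()
      values = next
    }

    println(values.collect().mkString(", "))
    spark.stop()
  }
}

Checkpointing can truncate the growing lineage, but the loop itself still
lives in the driver, outside Spark's view of the computation.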

This is not in any way an attack on Spark. It's an amazing tool that does
its job very well. I'm just curious where it starts breaking down. Let me
know if you have any experiences!

Thanks very much,
Ben
