H2O launches an internal Task in the single-digit microsecond range. Because of this, we can launch 100,000s (millions?) of Tasks a second, leading to fine-grained data parallelism and high CPU utilization. This is a big piece of our single-node speed. Some other distributed Task-launching solutions I've seen tend to require a network hop per task, which means roughly 10ms to launch a task and a limit of a few 1000 Tasks/sec. That in turn requires tasks that are much larger and coarser than H2O's, leading to much lower CPU utilization.
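To put rough numbers on that, here is a back-of-the-envelope sketch, not H2O code. It assumes a launcher issues tasks strictly one after another, so the rate is just one second divided by the per-launch latency. The 5-microsecond figure is a hypothetical stand-in for "single-digit microseconds", and the function names are made up for illustration. Reading "a few 1000 Tasks/sec" at 10ms per launch as meaning dozens of launches in flight at once is my own assumption.

```python
def launches_per_second(latency_us: int) -> int:
    """Serial launch rate for one launcher, assuming launches do not overlap."""
    return 1_000_000 // latency_us


def launches_in_flight(target_rate: int, latency_us: int) -> int:
    """Concurrent launches needed to sustain target_rate (ceiling division)."""
    return -(-target_rate * latency_us // 1_000_000)


# Assumed 5 us internal launch vs. a 10 ms network hop per task.
fast = launches_per_second(5)              # 200000 launches/sec
slow = launches_per_second(10_000)         # 100 launches/sec
needed = launches_in_flight(3000, 10_000)  # 30 launches in flight
```

Under these assumptions a single launcher at 10ms per task manages about 100 tasks/sec, so a few thousand per second already needs tens of launches overlapping, while microsecond launches get into the 100,000s from one thread.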

Also, I'm getting 200-microsecond pings between my datacenter machines, down from 10msec. It's decent commodity hardware, nothing special. Meaning: H2O can launch a task on an entire 32-node cluster in about 1msec, starting from a single driving node (log-tree fanout, depth 5, 200-microsecond single UDP packet launch per hop, 1-microsecond internal task launch).
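The arithmetic behind that 1msec figure can be sketched as below. This is a simplified model, not H2O's implementation: it assumes the set of nodes that have received the task doubles on every round (which gives depth 5 for 32 nodes) and that each round costs one UDP hop plus one internal task launch. The function names are hypothetical.

```python
def tree_depth(nodes: int) -> int:
    """Rounds needed to reach `nodes` nodes when the reached set doubles each round."""
    depth, reached = 0, 1  # start with only the driving node
    while reached < nodes:
        reached *= 2
        depth += 1
    return depth


def cluster_launch_us(nodes: int, hop_us: int = 200, task_us: int = 1) -> int:
    """Estimated time to launch on every node: one hop plus one task launch per round."""
    return tree_depth(nodes) * (hop_us + task_us)


depth = tree_depth(32)          # 5
total = cluster_launch_us(32)   # 1005 microseconds, i.e. about 1 msec
```

With a 10msec hop instead of 200 microseconds, the same model gives roughly 50msec for the 32 nodes, which is why the ping time dominates everything else here.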

And this latency matters when the work itself is lots and lots of "small" jobs, as is common when a DSL such as Mahout or Spark/Scala or R is driving simple operators over bulk data.

Cliff


On 4/30/2014 3:35 PM, Dmitriy Lyubimov wrote:
This is kind of old news. They all do, and have for years now. I've been building a system that does real-time distributed pipelines (~30 ms to start all steps in a pipeline + in-core complexity) for years. Note that a node-to-node hop in clouds usually has a mean of about 10ms, so microseconds are kind of out of the question for network performance reasons in real life, except for private racks. The only thing that doesn't do this is the MR variety of Hadoop.
