"detailed description of H2O's programming and execution model."
No *formal* documentation for this exists; there's been no time to write
such a thing.
There are easy-to-find slide decks & video talks. Here are two:
- http://www.infoq.com/presentations/api-memory-analytics
- http://www.infoq.com/interviews/click-0xdata
Summary:
- A high-performance in-memory K/V store (cache hits are ~150ns;
misses depend on network transfer times). Supports exact, full JMM
semantics & transactions. Used to hold the Big Data & to control
computations.
- Big Data support via Frames/Vecs/Chunks - see the above slides for a
graphical overview; compression "is an implementation feature" but is
not visible in the execution model except as speed or size constraints.
- A well-tuned data-ingestion system
- Map/Reduce coding style; uses Java 1.7's Fork/Join on a single node,
but distributed across nodes. Maps are fine-grained F/J tasks and can
produce both a Big output (distributed parallel writing to Frames/Vecs)
and a Small output (anything in a POJO). Reductions are also
fine-grained, and happen anytime 2 maps are done... so there is no
separate "reduction" phase. Not the Hadoop M/R - no sort or shuffle
steps; everything is in DRAM.
- REST/JSON access to most algos & coding. A web-browser/HTML UI sits
over that.
- Internal DSL - a work in progress. Right now it converts a subset of
the R language to ASTs, then executes the ASTs. Covers a fairly large
subset of the bulk/array operators in R, and expressions built thereof.
Includes 1st-class functions and e.g. GroupBy (ddply in R lingo).
Expressions like "|apply(someFrame,2,function(x){
ifelse(is.na(x),mean(x),x)})|" will replace NA's in "someFrame" with the
mean of the column. It's R syntax (or very close to R), not Scala.
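The K/V access pattern described above can be sketched in plain Java. This is a hypothetical stand-in, not H2O's actual API: reads hit a node-local cache first (the fast, ~local-memory case) and fall back to the key's home node on a miss, caching the result so the network transfer is paid once. The "home node" here is faked by a second in-process map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch (not the real H2O API) of the K/V access pattern:
// reads check a node-local cache first; on a miss the value is fetched
// from its home node (faked here by a second map) and cached locally.
public class KVSketch {
    private final Map<String, byte[]> localCache = new ConcurrentHashMap<>();
    private final Map<String, byte[]> homeNode  = new ConcurrentHashMap<>(); // stand-in for the cluster

    public void put(String key, byte[] value) {
        homeNode.put(key, value);    // writes land at the key's home node
        localCache.put(key, value);  // and update the local cache
    }

    public byte[] get(String key) {
        byte[] v = localCache.get(key);          // cache hit: local-memory speed
        if (v == null) {
            v = homeNode.get(key);               // cache miss: "network" fetch
            if (v != null) localCache.put(key, v);
        }
        return v;
    }
}
```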
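The Frame/Vec/Chunk layout can likewise be sketched: one logical column (a Vec) stored as an array of fixed-size chunks, with element access mapping a global row index to a (chunk, offset) pair. In the real system the chunks are distributed across nodes and compressed; neither is modeled in this toy class, and the names and chunk size are illustrative only.

```java
import java.util.Arrays;

// Hypothetical sketch of the Vec/Chunk layout: a column split into
// fixed-size chunks; at(i) maps a global row index to (chunk, offset).
// Real chunks are distributed and compressed -- not modeled here.
public class VecSketch {
    static final int CHUNK_LEN = 4;   // tiny, for illustration
    private final double[][] chunks;
    private final long len;

    public VecSketch(double[] data) {
        this.len = data.length;
        int nChunks = (data.length + CHUNK_LEN - 1) / CHUNK_LEN;
        chunks = new double[nChunks][];
        for (int c = 0; c < nChunks; c++) {
            int start = c * CHUNK_LEN;
            int end = Math.min(start + CHUNK_LEN, data.length);
            chunks[c] = Arrays.copyOfRange(data, start, end);
        }
    }

    public double at(long i) {        // global row -> (chunk, offset)
        return chunks[(int) (i / CHUNK_LEN)][(int) (i % CHUNK_LEN)];
    }

    public long length() { return len; }
}
```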
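The map/reduce-over-Fork/Join bullet can be illustrated with standard java.util.concurrent, independent of H2O: each leaf task is a fine-grained "map" over one slice, and the "reduce" combines two results as soon as both halves finish, with no separate reduction phase and no sort/shuffle. The class name and cutoff are made up for the sketch.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical sketch of the map/reduce style described above, using
// Java's Fork/Join: leaves are fine-grained "maps" over slices, and a
// "reduce" (here: +) runs as soon as two sibling maps are done.
public class MapReduceSketch extends RecursiveTask<Double> {
    private static final int CUTOFF = 8;  // slice size for a leaf "map"
    private final double[] data;
    private final int lo, hi;

    public MapReduceSketch(double[] data, int lo, int hi) {
        this.data = data; this.lo = lo; this.hi = hi;
    }

    @Override protected Double compute() {
        if (hi - lo <= CUTOFF) {          // leaf: the "map" over one slice
            double sum = 0;
            for (int i = lo; i < hi; i++) sum += data[i];
            return sum;
        }
        int mid = (lo + hi) >>> 1;        // split; halves run in parallel
        MapReduceSketch left  = new MapReduceSketch(data, lo, mid);
        MapReduceSketch right = new MapReduceSketch(data, mid, hi);
        left.fork();                      // run left half asynchronously
        double r = right.compute();       // compute right half in this thread
        double l = left.join();           // wait for the forked half
        return l + r;                     // the "reduce": combine two maps
    }

    public static double sum(double[] data) {
        return new ForkJoinPool().invoke(new MapReduceSketch(data, 0, data.length));
    }
}
```

In H2O the same pattern runs per-chunk and is additionally distributed across nodes; this sketch only shows the single-node Fork/Join half of the story.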
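For concreteness, the effect of the apply/ifelse expression above - replace each NA with the mean of its column - can be written out in plain Java (not the DSL itself; NA is modeled as NaN, and the per-column mean is taken over the non-NA values, i.e. mean(x, na.rm=TRUE) semantics):

```java
// Plain-Java rendering of the R expression
//   apply(someFrame, 2, function(x) ifelse(is.na(x), mean(x), x))
// with the per-column mean computed over non-NA (non-NaN) values.
public class ImputeSketch {
    public static double[][] imputeColumnMeans(double[][] frame) {
        int rows = frame.length, cols = frame[0].length;
        double[][] out = new double[rows][cols];
        for (int c = 0; c < cols; c++) {
            double sum = 0; int n = 0;
            for (int r = 0; r < rows; r++)          // mean of non-NA entries
                if (!Double.isNaN(frame[r][c])) { sum += frame[r][c]; n++; }
            double mean = (n == 0) ? Double.NaN : sum / n;
            for (int r = 0; r < rows; r++)          // the ifelse(is.na(x), ...)
                out[r][c] = Double.isNaN(frame[r][c]) ? mean : frame[r][c];
        }
        return out;
    }
}
```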
Cliff
On 5/1/2014 10:13 AM, Dmitriy Lyubimov wrote:
I'd be happy to see a concept of how to bring the operations of the DSL
onto H2O, as well as a detailed description of H2O's programming and
execution model.
+1.
--sebastian