[ 
https://issues.apache.org/jira/browse/MAHOUT-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956898#comment-13956898
 ] 

Dmitriy Lyubimov commented on MAHOUT-1500:
------------------------------------------

bq. What is the "Algebraic DSL"? Is that the one which came with the scala 
bindings (with "%*%" operator etc.)?

There are two sets of operators. The first is for mahout-math (in-core); I call 
it the Scala bindings, and it lives in the math-scala module. It doesn't 
actually do much: it just provides syntactic sugar for passing things off to 
in-core cost-based optimizers (where they are implemented). 
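
To make the "syntactic sugar" point concrete, here is a minimal self-contained 
sketch of the idea. It is not the math-scala code: Mat, MatOps, dense and times 
are hypothetical names made up for this illustration, and only the "%*%" 
operator name comes from the discussion above. The operator layer adds names 
and forwards; the work stays in ordinary methods.

```scala
object InCoreDslSketch {
  // Hypothetical stand-in for an in-core matrix type (assumption, not Mahout's Matrix).
  final case class Mat(rows: Vector[Vector[Double]]) {
    def nrow: Int = rows.length
    def ncol: Int = if (rows.isEmpty) 0 else rows.head.length

    // The actual work lives in plain methods, outside the DSL layer.
    def times(that: Mat): Mat = {
      require(ncol == that.nrow, "inner dimensions must agree")
      val cols = that.rows.transpose
      Mat(rows.map(r => cols.map(c => r.zip(c).map { case (x, y) => x * y }.sum)))
    }

    def transpose: Mat = Mat(rows.transpose)
  }

  // The DSL layer: operator names only, each forwarding to a method above.
  implicit class MatOps(val m: Mat) extends AnyVal {
    def %*%(that: Mat): Mat = m.times(that)
    def t: Mat = m.transpose
  }

  // Hypothetical row-wise constructor for small literals.
  def dense(rows: Seq[Double]*): Mat = Mat(rows.map(_.toVector).toVector)

  val a: Mat = dense(Seq(1.0, 2.0), Seq(3.0, 4.0))

  // Reads like algebra, but is only a.transpose.times(a) underneath.
  val ata: Mat = a.t %*% a
}
```

With this shape, dropping the implicit class would leave all functionality 
intact and only make expressions more verbose, which is the sense in which the 
bindings "don't do much".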

The second set of DSL operators, which look identical to the in-core set, is 
for distributed stuff. (On the diagram the two are not visually separated, 
other than that part of the DSL sits over the in-core optimizer and part of it 
over the distributed optimizer.)

bq. Today, what distinguishes "Logical translation layer" vs "Physical 
translation layer" in the code? What parts of the code is considered to be the 
"Logical translation layer"? 

Well, you need to keep in perspective that the distributed optimizer part was 
done in about 3 days and is now fairly tightly bound to the Spark code, so the 
separation is not very clean at this point, and will not be until we introduce 
another engine (which is coming). Obviously, when the second engine is 
introduced, this needs to be abstracted into a separate module without Spark 
dependencies.

Logical translation is everything in drm.plan (operators implementing 
DrmLike[]). 
Physical translation to Spark is CheckpointedDrm, CheckpointAction and 
everything in the blas package (the actual Spark-specific support for the 
physical plan after the optimization run). 

bq. Is the selection of "physical translation layer" a run-time decision?

Yes, it is a run-time optimizer action based on operand types, geometry (size), 
orientation and partitioning. (Very similar, in fact, to what happens in the Pig 
graph, except that such graph rewrites are much more elegant in Scala.)
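
As a rough illustration of the logical/physical split and of a run-time 
rewrite, here is a toy sketch. Everything in it is an assumption made for the 
example: the node names (Source, Times, AtA, BroadcastTimes, ShuffleTimes), the 
BroadcastLimit threshold and the plan function are hypothetical and are not the 
classes in drm.plan or the blas package. It covers only operand shape and size, 
not partitioning.

```scala
object PlanSketch {
  // Logical layer: an engine-neutral description of the computation.
  sealed trait Logical { def nrow: Long; def ncol: Long }
  final case class Source(name: String, nrow: Long, ncol: Long) extends Logical
  final case class Transpose(a: Logical) extends Logical {
    def nrow: Long = a.ncol
    def ncol: Long = a.nrow
  }
  final case class Times(a: Logical, b: Logical) extends Logical {
    def nrow: Long = a.nrow
    def ncol: Long = b.ncol
  }

  // Physical layer: hypothetical engine-specific operators.
  sealed trait Physical
  final case class Load(name: String) extends Physical
  final case class TransposeOp(a: Physical) extends Physical
  final case class AtA(a: Physical) extends Physical // fused A'A, A' is never built
  final case class BroadcastTimes(a: Physical, small: Physical) extends Physical
  final case class ShuffleTimes(a: Physical, b: Physical) extends Physical

  // Assumed cutoff (in elements) below which the right operand is broadcast.
  val BroadcastLimit: Long = 1000000L

  // The rewrite itself: pattern matching on the logical graph at run time.
  def plan(op: Logical): Physical = op match {
    case Source(name, _, _) => Load(name)
    // Orientation: a product of a transpose with its own operand gets a fused operator.
    case Times(Transpose(x), y) if x == y => AtA(plan(x))
    // Geometry: a small right-hand side is shipped whole to every task.
    case Times(a, b) if b.nrow * b.ncol <= BroadcastLimit =>
      BroadcastTimes(plan(a), plan(b))
    case Times(a, b) => ShuffleTimes(plan(a), plan(b))
    case Transpose(a) => TransposeOp(plan(a))
  }
}
```

The same logical expression can therefore end up as different physical 
operators depending on the sizes seen at run time, and a second engine would 
only need its own Physical family and its own plan function.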



> H2O integration
> ---------------
>
>                 Key: MAHOUT-1500
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1500
>             Project: Mahout
>          Issue Type: Improvement
>            Reporter: Anand Avati
>             Fix For: 1.0
>
>
> Integration with h2o (github.com/0xdata/h2o) in order to exploit its high 
> performance computational abilities.
> Start with providing implementations of AbstractMatrix and AbstractVector, 
> and more as we make progress.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
