[
https://issues.apache.org/jira/browse/MAHOUT-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13956898#comment-13956898
]
Dmitriy Lyubimov commented on MAHOUT-1500:
------------------------------------------
bq. What is the "Algebraic DSL"? Is that the one which came with the scala
bindings (with "%*%" operator etc.)?
There are two sets of operators. The first is for mahout-math (in-core); I
call it the Scala bindings, and it lives in the math-scala module. It doesn't
do much, actually: it just provides syntactic sugar for passing things off to
the in-core cost-based optimizers (where they are implemented).
The second set of DSL operators (which looks identical to the in-core set) is
for distributed stuff. (On the diagram the two are not visually separated,
other than that part of the DSL sits over the in-core optimizer and part of it
over the distributed optimizer.)
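To make the "same operators, two back ends" point concrete, here is a toy
sketch in plain Scala. It is not the Mahout code: InCore, Dist, DistLeaf and
DistTimes are hypothetical stand-ins invented for illustration. It only shows
how one operator spelling ("%*%") can compute immediately on an in-core
operand while merely recording an expression for a distributed one.

```scala
// Toy illustration only: none of these types are Mahout classes.
object DslSketch {
  // Hypothetical stand-in for an in-core matrix.
  final case class InCore(rows: Vector[Vector[Double]]) {
    def nrow: Int = rows.length
    def ncol: Int = if (rows.isEmpty) 0 else rows.head.length
  }

  // Hypothetical stand-in for a distributed matrix expression.
  // Applying an operator only records a node; nothing is computed.
  sealed trait Dist
  final case class DistLeaf(name: String) extends Dist
  final case class DistTimes(a: Dist, b: Dist) extends Dist

  // Sugar for in-core operands: %*% multiplies right away.
  implicit class InCoreOps(val a: InCore) extends AnyVal {
    def %*%(b: InCore): InCore = {
      require(a.ncol == b.nrow, "inner dimensions must match")
      val bColumns = b.rows.transpose
      InCore(a.rows.map { row =>
        bColumns.map(col => row.zip(col).map { case (x, y) => x * y }.sum)
      })
    }
  }

  // Same operator on distributed operands: it just builds the expression.
  implicit class DistOps(val a: Dist) extends AnyVal {
    def %*%(b: Dist): Dist = DistTimes(a, b)
  }
}
```

With `import DslSketch._` in scope, `a %*% b` reads the same for both operand
kinds; only the operand types decide whether it is evaluated or deferred.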
bq. Today, what distinguishes "Logical translation layer" vs "Physical
translation layer" in the code? What parts of the code is considered to be the
"Logical translation layer"?
Well, you need to keep in perspective that the distributed optimizer part was
done in about 3 days and is currently fairly tightly bound to the Spark code,
so the separation will not be very clean until we introduce another engine
(which is coming). Obviously, when the second engine is introduced, this needs
to be abstracted into a separate module without Spark dependencies.
Logical translation is everything in drm.plan (operators implementing
DrmLike[]).
Physical translation to Spark is CheckpointedDrm, CheckpointAction and
everything in the blas package (the actual Spark-specific support for the
physical plan after the optimizer run).
bq. Is the selection of "physical translation layer" a run-time decision?
Yes, it is a run-time optimizer action based on operand types, geometry (size),
orientation and partitioning. (It is in fact very similar to what happens in a
Pig graph, except that such graph rewrites are much more elegant in Scala.)
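As a rough illustration of what such a run-time rewrite can look like in
Scala, here is a minimal self-contained sketch. Everything in it is assumed
for the example: the Logical and Physical node types, the AtA operator and the
SlimThreshold value are hypothetical and are not the classes in drm.plan or
the blas package. It only shows a pattern match choosing a physical operator
from the shape and geometry of the logical plan.

```scala
// Toy illustration only: hypothetical plan nodes, not Mahout's.
object PlanSketch {
  // Logical operators carry geometry so the optimizer can inspect it.
  sealed trait Logical { def nrow: Long; def ncol: Long }
  final case class Leaf(name: String, nrow: Long, ncol: Long) extends Logical
  final case class Transpose(a: Logical) extends Logical {
    def nrow: Long = a.ncol
    def ncol: Long = a.nrow
  }
  final case class Times(a: Logical, b: Logical) extends Logical {
    require(a.ncol == b.nrow, "inner dimensions must match")
    def nrow: Long = a.nrow
    def ncol: Long = b.ncol
  }

  // Physical operators that an engine-specific layer would execute.
  sealed trait Physical
  final case class Scan(name: String) extends Physical
  final case class TransposePhys(a: Physical) extends Physical
  final case class AtA(a: Physical, slim: Boolean) extends Physical
  final case class GenericTimes(a: Physical, b: Physical) extends Physical

  // Made-up cut-off for treating a matrix as "slim" (few columns).
  val SlimThreshold: Long = 1000L

  // Rewrite the logical plan into a physical one at run time.
  def translate(plan: Logical): Physical = plan match {
    case Leaf(name, _, _) => Scan(name)
    // A' %*% A with the same operand gets a specialized operator,
    // picked by geometry (the number of columns of A).
    case Times(Transpose(a), b) if a == b =>
      AtA(translate(a), slim = a.ncol <= SlimThreshold)
    case Times(a, b) => GenericTimes(translate(a), translate(b))
    case Transpose(a) => TransposePhys(translate(a))
  }
}
```

For example, `translate(Times(Transpose(a), a))` with a tall, narrow `a`
selects the specialized `AtA` node, while an ordinary product falls through to
`GenericTimes`.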
> H2O integration
> ---------------
>
> Key: MAHOUT-1500
> URL: https://issues.apache.org/jira/browse/MAHOUT-1500
> Project: Mahout
> Issue Type: Improvement
> Reporter: Anand Avati
> Fix For: 1.0
>
>
> Integration with h2o (github.com/0xdata/h2o) in order to exploit its high
> performance computational abilities.
> Start with providing implementations of AbstractMatrix and AbstractVector,
> and more as we make progress.