[ https://issues.apache.org/jira/browse/MAHOUT-1500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13962017#comment-13962017 ]
Dmitriy Lyubimov commented on MAHOUT-1500:
------------------------------------------
After reviewing the newly announced https://github.com/tdunning/h2o-matrix and
making a willful conjecture that it is what this issue is about (since it is
still not explicitly confirmed on this Jira), I am changing my vote to -0.
Here are the components of my vote.
(1) +1 Do-ocracy -- those who are willing to do things, and (what is especially
important in our case) provide continued support for them, deserve a component
+1 to begin with.
(2) Big +1 on using h2o as an external dependency. I don't think we want to be
in the business of creating, maintaining, or merging with distributed execution
engines; we should just be translating high-level ML semantics to them.
(3) +0 in-core API stability: This work must not change or deprecate in-core
API contracts in a way that forces existing mahout-math users into unreasonable
migration and refactoring steps and/or a performance decline. Mahout-math is
one of the few still very valuable components, so this is important. (The
current state of things does not introduce such changes.)
(4) +0 in-core API augmentation: This work must not create API duplication
(alternatives to existing contracts) or augmented API contracts that are either
not adequately backed by the existing multitude of in-core matrix types or do
not make sense for in-core structures. (The current state of things does not
introduce such changes.)
(5) -1 I still maintain that the major Matrix and Vector in-core contracts do
not provide an adequate basis, nor are they a good fit, for building a
shared-nothing generic environment. Thus, further partitioning of the Matrix
and Vector contract sets is required if distributed structures must share the
same hierarchy base with in-core ones. However, doing so would contradict
positions (3) and (4) above. This is why I maintain that the least painful way
to address this is to create a separate hierarchy base for H2OMatrix, which
would share some of the high-level algebraic contracts with the in-core
contracts while bearing identical semantics.
This concern seems to be shared even by the authors of the code, if I am not
misinterpreting the meaning of the comments here.
{code:title="H2OMatrix.java"}
// Single-element accessors. Calling these likely indicates a huge performance bug.
@Override public double getQuick(int row, int column) {
  return _fr.vecs()[column].at(row);
}
@Override public void setQuick(int row, int column, double value) {
  _fr.vecs()[column].set(row,value); _fr.vecs()[column].
{code}
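To make the proposal in (5) a bit more concrete, here is a rough sketch of what
a separate hierarchy base could look like. Every name in it (MatrixAlgebra,
InCoreMatrixSketch, DistributedMatrixSketch, DenseSketch, PartitionedSketch) is
hypothetical and exists neither in mahout-math nor in h2o-matrix; the
"distributed" side is merely simulated with a list of in-core row blocks that
are assumed to be partitioned identically. The only point is that the shared
algebraic contracts live in one place, while single-element accessors exist
solely on the in-core side.

```java
import java.util.ArrayList;
import java.util.List;

// All types below are hypothetical illustrations, not Mahout or H2O API.

/** High-level algebraic contracts shared by both hierarchies. */
interface MatrixAlgebra<M extends MatrixAlgebra<M>> {
    int numRows();
    int numCols();
    M plus(M other);
    M times(double scalar);
    double sum();
}

/** In-core contract: shared algebra plus cheap single-element access. */
interface InCoreMatrixSketch extends MatrixAlgebra<InCoreMatrixSketch> {
    double getQuick(int row, int column);
    void setQuick(int row, int column, double value);
}

/** Distributed contract: shared algebra only, deliberately no element accessors. */
interface DistributedMatrixSketch extends MatrixAlgebra<DistributedMatrixSketch> {
}

/** Minimal dense in-core implementation. */
final class DenseSketch implements InCoreMatrixSketch {
    private final double[][] data;

    DenseSketch(int rows, int cols) { data = new double[rows][cols]; }

    public int numRows() { return data.length; }
    public int numCols() { return data.length == 0 ? 0 : data[0].length; }
    public double getQuick(int row, int column) { return data[row][column]; }
    public void setQuick(int row, int column, double value) { data[row][column] = value; }

    public InCoreMatrixSketch plus(InCoreMatrixSketch other) {
        DenseSketch out = new DenseSketch(numRows(), numCols());
        for (int r = 0; r < numRows(); r++)
            for (int c = 0; c < numCols(); c++)
                out.data[r][c] = data[r][c] + other.getQuick(r, c);
        return out;
    }

    public InCoreMatrixSketch times(double scalar) {
        DenseSketch out = new DenseSketch(numRows(), numCols());
        for (int r = 0; r < numRows(); r++)
            for (int c = 0; c < numCols(); c++)
                out.data[r][c] = data[r][c] * scalar;
        return out;
    }

    public double sum() {
        double s = 0.0;
        for (double[] row : data) for (double v : row) s += v;
        return s;
    }
}

/** Stand-in for a distributed matrix: row blocks that are only processed blockwise. */
final class PartitionedSketch implements DistributedMatrixSketch {
    private final List<InCoreMatrixSketch> blocks;

    PartitionedSketch(List<InCoreMatrixSketch> blocks) { this.blocks = blocks; }

    public int numRows() {
        int n = 0;
        for (InCoreMatrixSketch b : blocks) n += b.numRows();
        return n;
    }

    public int numCols() { return blocks.isEmpty() ? 0 : blocks.get(0).numCols(); }

    public DistributedMatrixSketch plus(DistributedMatrixSketch other) {
        // Assumption of this sketch: both operands use the same implementation and partitioning.
        PartitionedSketch o = (PartitionedSketch) other;
        List<InCoreMatrixSketch> out = new ArrayList<>();
        for (int i = 0; i < blocks.size(); i++) out.add(blocks.get(i).plus(o.blocks.get(i)));
        return new PartitionedSketch(out);
    }

    public DistributedMatrixSketch times(double scalar) {
        List<InCoreMatrixSketch> out = new ArrayList<>();
        for (InCoreMatrixSketch b : blocks) out.add(b.times(scalar));
        return new PartitionedSketch(out);
    }

    public double sum() {
        double s = 0.0;
        for (InCoreMatrixSketch b : blocks) s += b.sum();
        return s;
    }
}

final class SeparateHierarchyDemo {
    /** Generic code written once against the shared algebra; works for either hierarchy. */
    static <M extends MatrixAlgebra<M>> M doubled(M m) { return m.plus(m); }

    public static void main(String[] args) {
        InCoreMatrixSketch a = new DenseSketch(2, 2);
        a.setQuick(0, 0, 1.0);
        a.setQuick(1, 1, 3.0);
        List<InCoreMatrixSketch> blocks = new ArrayList<>();
        blocks.add(a);
        DistributedMatrixSketch d = new PartitionedSketch(blocks);
        System.out.println(doubled(a).sum());
        System.out.println(doubled(d).sum());
    }
}
```

With this split, a call such as d.getQuick(0, 0) on the distributed type simply
does not compile, so the "huge performance bug" the comment above warns about
cannot be written by accident, and the in-core contracts stay untouched as
required by (3) and (4).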
I reserve the right to change my vote if components of my vote are affected by
future changes.
I will not raise objections or add points based on performance.
> H2O integration
> ---------------
>
> Key: MAHOUT-1500
> URL: https://issues.apache.org/jira/browse/MAHOUT-1500
> Project: Mahout
> Issue Type: Improvement
> Reporter: Anand Avati
> Fix For: 1.0
>
>
> Integration with h2o (github.com/0xdata/h2o) in order to exploit its high
> performance computational abilities.
> Start with providing implementations of AbstractMatrix and AbstractVector,
> and more as we make progress.