[
https://issues.apache.org/jira/browse/MAHOUT-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643481#comment-14643481
]
ASF GitHub Bot commented on MAHOUT-1570:
----------------------------------------
Github user dlyubimov commented on the pull request:
https://github.com/apache/mahout/pull/137#issuecomment-125352778
Also on the topic of test suite coverage: we need to pass our standard
tests. The base classes for them are:
https://github.com/apache/mahout/blob/master/math-scala/src/test/scala/org/apache/mahout/math/decompositions/DistributedDecompositionsSuiteBase.scala
https://github.com/apache/mahout/blob/master/math-scala/src/test/scala/org/apache/mahout/math/drm/DrmLikeOpsSuiteBase.scala
https://github.com/apache/mahout/blob/master/math-scala/src/test/scala/org/apache/mahout/math/drm/DrmLikeSuiteBase.scala
https://github.com/apache/mahout/blob/master/math-scala/src/test/scala/org/apache/mahout/math/drm/RLikeDrmOpsSuiteBase.scala
The technique here is to use these suites as the base classes for a
backend-specific distributed test suite (you may want to see how it was done
for Spark and H2O). This is our basic assertion that our main algorithms pass
on a toy problem for a given backend.
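To illustrate the base-class technique described above, here is a minimal, self-contained sketch of the pattern. It is not the actual Mahout code: `ToyContext`, `ToySuiteBase`, `LocalBackendFixture` and `LocalToySuite` are hypothetical stand-ins for illustration only, and the real suites are built on ScalaTest and Mahout's distributed context rather than on these toy traits. The idea is that the toy problems are written once against an abstract context, and each backend only supplies a fixture that provides that context.

```scala
// Hypothetical stand-in for a backend's distributed context (not Mahout's API).
trait ToyContext {
  def backendName: String
  def sum(xs: Seq[Double]): Double
}

// Backend-agnostic base suite: the toy problems are written once, against the
// abstract context only.
trait ToySuiteBase {
  def ctx: ToyContext

  def tests: Seq[(String, () => Unit)] = Seq(
    "sum of a toy vector" -> (() => assert(ctx.sum(Seq(1.0, 2.0, 3.0)) == 6.0)),
    "sum of an empty vector" -> (() => assert(ctx.sum(Seq.empty) == 0.0))
  )

  // Runs every inherited test and reports which backend it ran on.
  // A failing assertion throws, so only passing runs produce a full report.
  def runAll(): Seq[String] = tests.map { case (name, body) =>
    body()
    s"${ctx.backendName}: $name passed"
  }
}

// Backend-specific fixture: the only part a new backend has to supply.
trait LocalBackendFixture {
  val ctx: ToyContext = new ToyContext {
    val backendName = "local"
    def sum(xs: Seq[Double]): Double = xs.sum
  }
}

// The concrete suite adds no tests of its own; it inherits them all.
class LocalToySuite extends ToySuiteBase with LocalBackendFixture
```

Calling `new LocalToySuite().runAll()` runs every inherited test against the local fixture. A suite for another backend would, under the same assumptions, differ only in the fixture trait it mixes in.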
> Adding support for Apache Flink as a backend for the Mahout DSL
> ---------------------------------------------------------------
>
> Key: MAHOUT-1570
> URL: https://issues.apache.org/jira/browse/MAHOUT-1570
> Project: Mahout
> Issue Type: Improvement
> Reporter: Till Rohrmann
> Assignee: Alexey Grigorev
> Labels: DSL, flink, scala
> Fix For: 0.11.0
>
>
> With the finalized abstraction of the Mahout DSL plans from the backend
> operations (MAHOUT-1529), it should be possible to integrate further backends
> for the Mahout DSL. Apache Flink would be a suitable candidate to act as a
> good execution backend.
> With respect to the implementation, the biggest difference between Spark and
> Flink at the moment is probably the incremental rollout of plans, which is
> triggered by Spark's actions and which is not supported by Flink yet.
> However, the Flink community is working on this issue. For the moment, it
> should be possible to circumvent this problem by writing intermediate results
> required by an action to HDFS and reading from there.