[ https://issues.apache.org/jira/browse/MAHOUT-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14965411#comment-14965411 ]
ASF GitHub Bot commented on MAHOUT-1570:
----------------------------------------
Github user andrewpalumbo commented on the pull request:
https://github.com/apache/mahout/pull/161#issuecomment-149637044
With the memory bumped up to 4G (inside IntelliJ), the stack trace shows
FlinkOpAtA as the source of the OOM error in dals:
```
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93)
at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140)
at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1876)
at java.io.ObjectOutputStream$BlockDataOutputStream.flush(ObjectOutputStream.java:1821)
at java.io.ObjectOutputStream.flush(ObjectOutputStream.java:718)
at java.io.ObjectOutputStream.close(ObjectOutputStream.java:739)
at org.apache.flink.util.InstantiationUtil.serializeObject(InstantiationUtil.java:319)
at org.apache.flink.util.InstantiationUtil.writeObjectToConfig(InstantiationUtil.java:268)
at org.apache.flink.runtime.operators.util.TaskConfig.setStubWrapper(TaskConfig.java:273)
at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.createDataSourceVertex(JobGraphGenerator.java:864)
at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:260)
at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.preVisit(JobGraphGenerator.java:103)
at org.apache.flink.optimizer.plan.SourcePlanNode.accept(SourcePlanNode.java:87)
at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
at org.apache.flink.optimizer.plan.SingleInputPlanNode.accept(SingleInputPlanNode.java:199)
at org.apache.flink.optimizer.plan.OptimizedPlan.accept(OptimizedPlan.java:127)
at org.apache.flink.optimizer.plantranslate.JobGraphGenerator.compileJobGraph(JobGraphGenerator.java:170)
at org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:176)
at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:54)
at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:789)
at org.apache.flink.api.java.DataSet.collect(DataSet.java:408)
at org.apache.mahout.flinkbindings.blas.FlinkOpAtA$.slim(FlinkOpAtA.scala:57)
at org.apache.mahout.flinkbindings.blas.FlinkOpAtA$.at_a(FlinkOpAtA.scala:39)
at org.apache.mahout.flinkbindings.FlinkEngine$.flinkTranslate(FlinkEngine.scala:124)
at org.apache.mahout.flinkbindings.FlinkEngine$.toPhysical(FlinkEngine.scala:91)
```
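For context on where the heap is going: the frames above place the OOM inside JobGraphGenerator.createDataSourceVertex / TaskConfig.setStubWrapper, i.e. while Java-serializing a data source's input format into the job graph itself. Flink's collection-backed sources (fromCollection and friends) carry their whole element collection inside that serialized stub, so if the DRM under test was parallelized from an in-memory matrix, the matrix gets copied into a ByteArrayOutputStream during plan translation, before any task runs. A minimal sketch of that failure mode, independent of Mahout (class name, sizes, and data are illustrative assumptions, not taken from the dals test):
```scala
// Sketch only (not the Mahout test code): shows how a collection-backed
// source embeds its data in the serialized job graph.
import org.apache.flink.api.scala._

object CollectionSourceOom {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Illustrative size. Every element of this Seq is Java-serialized into
    // the data-source vertex's TaskConfig (setStubWrapper) during plan
    // translation, before any task runs.
    val rows: Seq[Array[Double]] =
      Seq.tabulate(100000)(i => Array.fill(500)(i.toDouble))

    // collect() triggers JobGraphGenerator.compileJobGraph, which is where
    // the OutOfMemoryError in the trace above is thrown.
    val rowSums = env.fromCollection(rows).map(_.sum).collect()
    println(rowSums.take(3))
  }
}
```
If that is what is happening here, bumping the heap only delays the failure; reading the test matrix from a file-based source (or shrinking the parallelized data) would avoid embedding it in the plan.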
> Adding support for Apache Flink as a backend for the Mahout DSL
> ---------------------------------------------------------------
>
> Key: MAHOUT-1570
> URL: https://issues.apache.org/jira/browse/MAHOUT-1570
> Project: Mahout
> Issue Type: Improvement
> Reporter: Till Rohrmann
> Assignee: Alexey Grigorev
> Labels: DSL, flink, scala
> Fix For: 0.11.1
>
>
> With the finalized abstraction of the Mahout DSL plans from the backend
> operations (MAHOUT-1529), it should be possible to integrate further backends
> for the Mahout DSL. Apache Flink would be a suitable candidate for an
> execution backend.
> With respect to the implementation, the biggest difference between Spark and
> Flink at the moment is probably the incremental rollout of plans, which is
> triggered by Spark's actions and which Flink does not support yet. However,
> the Flink community is working on this. For the moment, it should be possible
> to circumvent the problem by writing intermediate results required by an
> action to HDFS and reading them back from there (a sketch of this round trip
> follows below).
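A hedged illustration of that round trip (the paths, names, and two-plan structure are assumptions for this sketch, not Mahout's actual implementation): materialize the intermediate DataSet with an explicit sink and execute(), then build the next plan from a fresh read of the persisted path.
```scala
// Sketch of the "write to HDFS, read back" workaround; not Mahout code.
import org.apache.flink.api.scala._
import org.apache.flink.core.fs.FileSystem.WriteMode

object IntermediateRoundTrip {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val scratch = "hdfs:///tmp/mahout-intermediate" // assumed scratch path

    // Plan 1: compute the intermediate result and persist it.
    val intermediate = env.fromElements(1, 2, 3, 4).map(_ * 2)
    intermediate.writeAsText(scratch, WriteMode.OVERWRITE)
    env.execute("materialize intermediate")

    // Plan 2: the "action" resumes from the persisted result instead of
    // relying on incremental rollout of the upstream plan.
    val resumed = env.readTextFile(scratch).map(_.toInt)
    println(resumed.collect().sorted)
  }
}
```
The explicit env.execute() between the two plans is the key move: it forces the first plan to run to completion, so the second plan can treat the persisted files as a plain source.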