[
https://issues.apache.org/jira/browse/FLINK-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andra Lungu reopened FLINK-2293:
--------------------------------
Unfortunately, it still fails, and this time I am sure I pulled the latest
changes from master.
This is the tail of the Job Manager log:
Vertices(GSAJaccard.java:67)) -> Map (Map at cleanupEdges(SplitVertex.java:252)) (155/224) (a781650bddb6ef3640ceade61e385a35) switched from CANCELING to CANCELED
22:48:20,706 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN Join(Join at getTriplets(Graph.java:335)) -> Map (Map at computeJaccardForSplitVertices(GSAJaccard.java:67)) -> Map (Map at cleanupEdges(SplitVertex.java:252)) (154/224) (473136130456da82aa3fc3c28dc9cd4e) switched from CANCELING to CANCELED
22:48:20,712 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN Join(Join at getTriplets(Graph.java:335)) -> Map (Map at computeJaccardForSplitVertices(GSAJaccard.java:67)) -> Map (Map at cleanupEdges(SplitVertex.java:252)) (224/224) (4269d4b6a626781e45d1232a3846be44) switched from CANCELING to CANCELED
22:48:20,732 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - CHAIN Join(Join at getTriplets(Graph.java:335)) -> Map (Map at computeJaccardForSplitVertices(GSAJaccard.java:67)) -> Map (Map at cleanupEdges(SplitVertex.java:252)) (67/224) (7a18869b576feca07305c178cc3ecd1f) switched from CANCELING to CANCELED
22:48:20,732 INFO org.apache.flink.runtime.jobmanager.JobManager - Status of job 3a7b88c3b6065fdd7f1be155905caa5f (Node Splitting GSA Jaccard Similarity Measure) changed to FAILED.
java.lang.ArithmeticException: / by zero
at org.apache.flink.runtime.operators.hash.MutableHashTable.insertIntoTable(MutableHashTable.java:836)
at org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:819)
at org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
at org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:544)
at org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
at org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:173)
at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
at java.lang.Thread.run(Thread.java:722)
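
For context on how "/ by zero" can surface from a hash table even though the
user code never divides: in Java, both integer division and the integer
remainder operator throw ArithmeticException when the right-hand operand is
zero. The sketch below is only an illustration. It assumes, without
confirmation from this trace, that a bucket position is derived as the hash
code modulo the number of buckets, and that the bucket count has somehow
become zero. The class BucketIndexSketch and the method bucketIndex are
hypothetical names and are not part of Flink.

```java
public class BucketIndexSketch {

    // Hypothetical helper (not Flink code): maps a hash code to a bucket
    // position using the remainder operator. Throws ArithmeticException
    // when numBuckets is zero.
    static int bucketIndex(int hashCode, int numBuckets) {
        return hashCode % numBuckets;
    }

    public static void main(String[] args) {
        // Normal case: a non-zero bucket count.
        System.out.println(bucketIndex(42, 8)); // prints 2

        // Assumed failure case: the bucket count is zero.
        try {
            bucketIndex(42, 0);
        } catch (ArithmeticException e) {
            System.out.println("Caught: " + e.getMessage());
        }
    }
}
```

If this assumption holds, the question to investigate would be why the bucket
count is zero when the table is rebuilt from a spilled partition.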
> Division by Zero Exception
> --------------------------
>
> Key: FLINK-2293
> URL: https://issues.apache.org/jira/browse/FLINK-2293
> Project: Flink
> Issue Type: Bug
> Components: Local Runtime
> Affects Versions: 0.9, 0.10
> Reporter: Andra Lungu
> Assignee: Stephan Ewen
> Priority: Critical
> Fix For: 0.10, 0.9.1
>
>
> I am running an algorithm that simulates a Gather Sum Apply iteration to
> perform Triangle Count. (Why simulate it? Because only a single superstep
> is needed, so using the runGatherSumApply function in Graph would add
> useless overhead.)
> What happens, at a high level:
> 1). Select neighbors with ID greater than the one corresponding to the
> current vertex;
> 2). Propagate the received values to neighbors with higher ID;
> 3). Compute the number of triangles by checking whether
> trgVertex.getValue().get(srcVertex.getId());
> As you can see, I *do not* perform any division at all.
> The code is here:
> https://github.com/andralungu/gelly-partitioning/blob/master/src/main/java/example/GSATriangleCount.java
> For small graphs (50 MB at most), the computation finishes nicely with the
> correct result. For a 10 GB graph, however, I got this:
> java.lang.ArithmeticException: / by zero
> at org.apache.flink.runtime.operators.hash.MutableHashTable.insertIntoTable(MutableHashTable.java:836)
> at org.apache.flink.runtime.operators.hash.MutableHashTable.buildTableFromSpilledPartition(MutableHashTable.java:819)
> at org.apache.flink.runtime.operators.hash.MutableHashTable.prepareNextPartition(MutableHashTable.java:508)
> at org.apache.flink.runtime.operators.hash.MutableHashTable.nextRecord(MutableHashTable.java:544)
> at org.apache.flink.runtime.operators.hash.NonReusingBuildFirstHashMatchIterator.callWithNextKey(NonReusingBuildFirstHashMatchIterator.java:104)
> at org.apache.flink.runtime.operators.MatchDriver.run(MatchDriver.java:173)
> at org.apache.flink.runtime.operators.RegularPactTask.run(RegularPactTask.java:496)
> at org.apache.flink.runtime.operators.RegularPactTask.invoke(RegularPactTask.java:362)
> at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
> at java.lang.Thread.run(Thread.java:722)
> See the full log here: https://gist.github.com/andralungu/984774f6348269df7951
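
The three steps in the quoted description can be illustrated on a small
in-memory graph. This is a plain-Java sketch and not the Gelly code linked
above. The class TriangleCountSketch and the method countTriangles are
hypothetical names, and the reading of step 3 (a triangle is counted when the
target vertex is adjacent to an ID that was propagated to it) is an
assumption about the intent.

```java
import java.util.*;

public class TriangleCountSketch {

    // Counts triangles in an undirected graph given as {source, target} pairs.
    static long countTriangles(int[][] edges) {
        // Build adjacency sets for the undirected graph.
        Map<Integer, Set<Integer>> neighbors = new HashMap<>();
        for (int[] e : edges) {
            neighbors.computeIfAbsent(e[0], k -> new HashSet<>()).add(e[1]);
            neighbors.computeIfAbsent(e[1], k -> new HashSet<>()).add(e[0]);
        }

        // Step 1: each vertex v selects its neighbors u with a higher ID
        // and sends them its own ID.
        Map<Integer, Set<Integer>> received = new HashMap<>();
        for (Map.Entry<Integer, Set<Integer>> entry : neighbors.entrySet()) {
            int v = entry.getKey();
            for (int u : entry.getValue()) {
                if (u > v) {
                    received.computeIfAbsent(u, k -> new HashSet<>()).add(v);
                }
            }
        }

        long triangles = 0;
        for (Map.Entry<Integer, Set<Integer>> entry : received.entrySet()) {
            int u = entry.getKey();
            // Step 2: u propagates the received IDs to neighbors w with a higher ID.
            for (int w : neighbors.get(u)) {
                if (w <= u) {
                    continue;
                }
                // Step 3: v, u, w form a triangle if w is also adjacent to v.
                for (int v : entry.getValue()) {
                    if (neighbors.get(w).contains(v)) {
                        triangles++;
                    }
                }
            }
        }
        return triangles;
    }

    public static void main(String[] args) {
        // One triangle (1, 2, 3) plus a dangling edge (3, 4).
        int[][] edges = {{1, 2}, {2, 3}, {1, 3}, {3, 4}};
        System.out.println(countTriangles(edges)); // prints 1
    }
}
```

Because IDs only ever travel toward higher-numbered vertices, each triangle
is counted exactly once, and no step involves a division.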