[ https://issues.apache.org/jira/browse/HAMA-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15265532#comment-15265532 ]
Edward J. Yoon commented on HAMA-941:
-------------------------------------

First, the boundary score factor appears to always be 0.0, although it is meant to be a user-defined parameter. Second, if the vertex count is at most 1 (vC <= 1), the score should be 1.0.

Please apply my patch and test again. Do you see any more bugs?

{code}
diff --git a/ml/src/main/java/org/apache/hama/ml/semiclustering/SemiClusteringVertex.java b/ml/src/main/java/org/apache/hama/ml/semiclustering/SemiClusteringVertex.java
index 9a905c1..38481fd 100644
--- a/ml/src/main/java/org/apache/hama/ml/semiclustering/SemiClusteringVertex.java
+++ b/ml/src/main/java/org/apache/hama/ml/semiclustering/SemiClusteringVertex.java
@@ -71,7 +71,7 @@
         candidates.add(msg);
         if (!msg.contains(this.getVertexID())
-            && msg.size() == semiClusterMaximumVertexCount) {
+            && msg.size() < semiClusterMaximumVertexCount) {
           SemiClusterMessage msgNew = WritableUtils.clone(msg, this.getConf());
           msgNew.addVertex(this);
           msgNew.setSemiClusterId("C"
@@ -149,14 +149,15 @@
    * @return the value to calcualte the Score of a semi-cluster.
    */
   public double semiClusterScoreCalcuation(SemiClusterMessage message) {
-    double iC = 0.0, bC = 0.0, fB = 0.0, sC = 0.0;
-    int vC = 0, eC = 0;
+    // TODO fB is the bounday score factor. This should be configurable by user
+    // the default is 0.5
+    double iC = 0.0, bC = 0.0, fB = 0.5, sC = 0.0;
+    int vC = 0;
     vC = message.size();
     for (Vertex<Text, DoubleWritable, SemiClusterMessage> v : message
         .getVertexList()) {
       List<Edge<Text, DoubleWritable>> eL = v.getEdges();
       for (Edge<Text, DoubleWritable> e : eL) {
-        eC++;
         if (message.contains(e.getDestinationVertexID())
             && e.getValue() != null) {
           iC = iC + e.getValue().get();
@@ -165,8 +166,12 @@
         }
       }
     }
+
     if (vC > 1)
-      sC = ((iC - fB * bC) / ((vC * (vC - 1)) / 2)) / eC;
+      sC = ((iC - fB * bC) / ((vC * (vC - 1)) / 2));
+    else
+      sC = 1.0;
+
     return sC;
   }
{code}

> Semiclustering Termination
> --------------------------
>
>                 Key: HAMA-941
>                 URL: https://issues.apache.org/jira/browse/HAMA-941
>             Project: Hama
>          Issue Type: Improvement
>          Components: examples, graph
>            Reporter: Edward J. Yoon
>            Priority: Minor
>
> Currently the Semiclustering example terminates when the number of
> iterations exceeds the predefined maximum iteration threshold.
> The app should stop if there are no cluster changes (I guess). Please check
> and improve it.
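
For reference, the scoring rule from the patch in the comment above can be tried outside Hama with a small standalone sketch. The class name SemiClusterScoreSketch, the method score, and its parameters are hypothetical and not part of SemiClusteringVertex; the sketch assumes the inner edge weight (iC) and boundary edge weight (bC) have already been summed, and takes the boundary score factor (fB) as an argument, which the patch leaves as a TODO.

```java
public class SemiClusterScoreSketch {

  // Default boundary score factor, matching the value hard-coded in the patch.
  public static final double DEFAULT_BOUNDARY_FACTOR = 0.5;

  /**
   * Hypothetical standalone version of the patched score formula.
   *
   * @param innerWeight    sum of edge weights inside the semi-cluster (iC)
   * @param boundaryWeight sum of edge weights leaving the semi-cluster (bC)
   * @param vertexCount    number of vertices in the semi-cluster (vC)
   * @param boundaryFactor boundary score factor (fB), assumed user-supplied
   */
  public static double score(double innerWeight, double boundaryWeight,
      int vertexCount, double boundaryFactor) {
    // A semi-cluster with zero or one vertex gets the fixed score 1.0.
    if (vertexCount <= 1) {
      return 1.0;
    }
    // Number of vertex pairs, vC * (vC - 1) / 2, computed in floating point
    // to avoid int overflow for large clusters.
    double pairs = vertexCount * (vertexCount - 1.0) / 2.0;
    return (innerWeight - boundaryFactor * boundaryWeight) / pairs;
  }

  public static void main(String[] args) {
    // Single-vertex cluster: the score is fixed regardless of edge weights.
    System.out.println(score(0.0, 3.0, 1, DEFAULT_BOUNDARY_FACTOR));
    // Three vertices, inner weight 6.0, boundary weight 2.0.
    System.out.println(score(6.0, 2.0, 3, DEFAULT_BOUNDARY_FACTOR));
  }
}
```

With the factor passed in, a value of 0.0 reproduces the behaviour reported in the comment, where boundary edges have no effect on the score.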