[
https://issues.apache.org/jira/browse/MAHOUT-1612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385456#comment-14385456
]
ASF GitHub Bot commented on MAHOUT-1612:
----------------------------------------
Github user smarthi commented on a diff in the pull request:
https://github.com/apache/mahout/pull/55#discussion_r27347144
--- Diff:
integration/src/main/java/org/apache/mahout/utils/clustering/JsonClusterWriter.java
---
@@ -70,20 +71,30 @@ public void write(ClusterWritable clusterWritable)
throws IOException {
Map<String, Object> res = Maps.newHashMap();
// get top terms
- List<Object> topTerms = getTopFeaturesList(clusterWritable.getValue()
+ if (dictionary != null) {
+ List<Object> topTerms = getTopFeaturesList(clusterWritable.getValue()
.getCenter(), dictionary, numTopFeatures);
- res.put("top_terms", topTerms);
-
+ res.put("top_terms", topTerms);
+ } else {
+ res.put("top_terms", Lists.newArrayList());
+ }
+
// get human-readable cluster representation
- Cluster cluster = clusterWritable.getValue();
- Map<String,Object> fmtStr = cluster.asJson(dictionary);
- res.put("cluster_id", cluster.getId());
- res.put("cluster", fmtStr);
-
- // get points
- List<Object> points = getPoints(cluster, dictionary);
- res.put("points", points);
-
+ Cluster cluster = clusterWritable.getValue();
+ if (dictionary != null) {
+ Map<String,Object> fmtStr = cluster.asJson(dictionary);
+ res.put("cluster_id", cluster.getId());
+ res.put("cluster", fmtStr);
+
+ // get points
+ List<Object> points = getPoints(cluster, dictionary);
+ res.put("points", points);
+ } else {
+ res.put("cluster_id", cluster.getId());
+ res.put("cluster", new HashMap<String,Object>());
+ res.put("points", Lists.newArrayList());
--- End diff --
Replace with new ArrayList<>().
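
For illustration only, here is a minimal, self-contained sketch of the pattern under review: guard the dictionary lookup with a null check and fall back to an empty list built with the JDK's diamond syntax instead of Guava's Lists.newArrayList(). The class NullDictionarySketch, the method describe, and the use of a plain String[] as the dictionary are hypothetical stand-ins, not the actual JsonClusterWriter code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class NullDictionarySketch {

  // Hypothetical stand-in for the writer logic in the diff; not Mahout code.
  // dictionary may be null, as it is when clusterdump runs without a dictionary.
  static Map<String, Object> describe(int clusterId, String[] dictionary, int[] topIndexes) {
    Map<String, Object> res = new HashMap<>();
    res.put("cluster_id", clusterId);

    if (dictionary != null) {
      // Resolve term indexes to terms only when a dictionary is available.
      List<Object> topTerms = new ArrayList<>();
      for (int index : topIndexes) {
        topTerms.add(dictionary[index]);
      }
      res.put("top_terms", topTerms);
    } else {
      // Plain JDK replacement for Lists.newArrayList(), as the review suggests.
      res.put("top_terms", new ArrayList<>());
    }
    return res;
  }

  public static void main(String[] args) {
    // No dictionary: the result still carries an empty top_terms list.
    System.out.println(describe(7, null, new int[] {0, 1}));
    // With a dictionary: index 1 resolves to the term "b".
    System.out.println(describe(7, new String[] {"a", "b"}, new int[] {1}));
  }
}
```

Either form yields an empty mutable list; the JDK form just avoids a Guava call where the diamond operator (Java 7 and later) already does the job.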
> NullPointerException happens during JSON output format for clusterdumper
> ------------------------------------------------------------------------
>
> Key: MAHOUT-1612
> URL: https://issues.apache.org/jira/browse/MAHOUT-1612
> Project: Mahout
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 0.9
> Reporter: Guo Ruijing
> Assignee: Suneel Marthi
> Labels: legacy
> Fix For: 0.10.0
>
>
> 1. download datafile from:
> http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
> 2. put data file on hdfs:
> hdfs dfs -mkdir testdata
> hdfs dfs -put synthetic_control.data testdata/
> 3. run a mahout clustering job:
> mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
> 4. run clusterdump with JSON format:
> mahout clusterdump -i output/clusters*-final -p output/clusteredPoints -o /tmp/report -of JSON
> expected:
> clusterdump with JSON format should succeed, the same as with CSV and TEXT
> actual:
> clusterdump with JSON format throws a NullPointerException