[ https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237336#comment-14237336 ]
ASF GitHub Bot commented on MAHOUT-1493:
----------------------------------------
Github user andrewpalumbo commented on a diff in the pull request:
https://github.com/apache/mahout/pull/32#discussion_r21431045
--- Diff: h2o/src/main/java/org/apache/mahout/h2obindings/H2OHelper.java ---
@@ -327,9 +327,11 @@ public static H2ODrm drmFromMatrix(Matrix m, int minHint, int exactHint) {
labels = frame.anyVec().makeZero();
Vec.Writer writer = labels.open();
Map<Integer,String> rmap = reverseMap(map);
-
- for (long r = 0; r < m.rowSize(); r++) {
- writer.set(r, rmap.get(r));
+ // TODO: fix BUG here... h20 water.fvec.Vec does not accept String values
+ // TODO: need a new distributed data structure for storing String keys.
+ for (int r = 0; r < m.rowSize(); r++) {
+ //writer.set(r, rmap.get(r));
+ labels.chunkForRow(r).set(r, rmap.get(r));
--- End diff --
Thanks again for looking at this, Anand. With the patch applied and the h2o
tests enabled, I'm getting the following failures. Appreciate it!
- NB Aggregator *** FAILED ***
java.lang.IllegalArgumentException: Not a String
at water.fvec.Chunk.set_impl(Chunk.java:189)
at water.fvec.Chunk.set0(Chunk.java:158)
at water.fvec.Chunk.set(Chunk.java:105)
at org.apache.mahout.h2obindings.H2OHelper.drmFromMatrix(H2OHelper.java:334)
at org.apache.mahout.h2obindings.H2OEngine$.drmParallelizeWithRowLabels(H2OEngine.scala:83)
at org.apache.mahout.math.drm.package$.drmParallelizeWithRowLabels(package.scala:67)
at org.apache.mahout.classifier.naivebayes.NBTestBase$$anonfun$2.apply$mcV$sp(NBTestBase.scala:90)
at org.apache.mahout.classifier.naivebayes.NBTestBase$$anonfun$2.apply(NBTestBase.scala:69)
at org.apache.mahout.classifier.naivebayes.NBTestBase$$anonfun$2.apply(NBTestBase.scala:69)
at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
...
- Model DFS Serialization *** FAILED ***
java.lang.IllegalArgumentException: Not a String
at water.fvec.Chunk.set_impl(Chunk.java:189)
at water.fvec.Chunk.set0(Chunk.java:158)
at water.fvec.Chunk.set(Chunk.java:105)
at org.apache.mahout.h2obindings.H2OHelper.drmFromMatrix(H2OHelper.java:334)
at org.apache.mahout.h2obindings.H2OEngine$.drmParallelizeWithRowLabels(H2OEngine.scala:83)
at org.apache.mahout.math.drm.package$.drmParallelizeWithRowLabels(package.scala:67)
at org.apache.mahout.classifier.naivebayes.NBModel.dfsWrite(NBModel.scala:144)
at org.apache.mahout.classifier.naivebayes.NBTestBase$$anonfun$3.apply$mcV$sp(NBTestBase.scala:142)
at org.apache.mahout.classifier.naivebayes.NBTestBase$$anonfun$3.apply(NBTestBase.scala:117)
at org.apache.mahout.classifier.naivebayes.NBTestBase$$anonfun$3.apply(NBTestBase.scala:117)
...
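A side note on the diff above: it changes the loop index from long to int
while rmap is declared as Map<Integer,String>. In plain Java, passing a long
to Map.get boxes it to a Long, which never equals an Integer key, so the
lookup returns null. Whether that motivated the change is not stated in this
thread, and it does not explain the "Not a String" failure in the traces. The
sketch below only demonstrates the language behavior; the RowLabelLookup class
and its labels are hypothetical, and it makes no H2O or Mahout calls.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, self-contained illustration; not part of H2OHelper.
public class RowLabelLookup {

    // Builds a small row-index -> label map, standing in for rmap in the diff.
    static Map<Integer, String> sampleLabels() {
        Map<Integer, String> rmap = new HashMap<>();
        rmap.put(0, "row-a");
        rmap.put(1, "row-b");
        return rmap;
    }

    // Map.get takes an Object, so the long index is boxed to a Long.
    // Long.valueOf(0) does not equal Integer.valueOf(0), so this finds nothing.
    static String lookupWithLongIndex(Map<Integer, String> rmap, long r) {
        return rmap.get(r);
    }

    // The int index is boxed to an Integer and matches the key type.
    static String lookupWithIntIndex(Map<Integer, String> rmap, int r) {
        return rmap.get(r);
    }

    public static void main(String[] args) {
        Map<Integer, String> rmap = sampleLabels();
        System.out.println(lookupWithLongIndex(rmap, 0L)); // prints "null"
        System.out.println(lookupWithIntIndex(rmap, 0));   // prints "row-a"
    }
}
```

With a long index every lookup silently misses, so a loop written that way
would hand null labels to whatever writer consumes them.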
> Port Naive Bayes to the Spark DSL
> ---------------------------------
>
> Key: MAHOUT-1493
> URL: https://issues.apache.org/jira/browse/MAHOUT-1493
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Reporter: Sebastian Schelter
> Assignee: Andrew Palumbo
> Fix For: 1.0
>
> Attachments: MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch,
> MAHOUT-1493.patch, MAHOUT-1493a.patch
>
>
> Port our Naive Bayes implementation to the new Spark DSL. This shouldn't
> require more than a few lines of code.