[
https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14426396#comment-14426396
]
ASF GitHub Bot commented on MAHOUT-1493:
----------------------------------------
Github user andrewpalumbo commented on a diff in the pull request:
https://github.com/apache/mahout/pull/111#discussion_r27781100
--- Diff: spark/src/main/scala/org/apache/mahout/drivers/TestNBDriver.scala ---
@@ -76,7 +77,7 @@ object TestNBDriver extends MahoutSparkDriver {
/** Read the test set from inputPath/part-x-00000 sequence file of form <Text,VectorWritable> */
private def readTestSet: DrmLike[_] = {
val inputPath = parser.opts("input").asInstanceOf[String]
- val trainingSet = drm.drmDfsRead(inputPath)
+ val trainingSet = drm.drmDfsRead(inputPath).par(auto = true)
--- End diff --
@dlyubimov, @pferrel: does adding `.par(auto = true)` here make sense?
Otherwise there are no explicit partitioning instructions for the input data.
> Port Naive Bayes to the Spark DSL
> ---------------------------------
>
> Key: MAHOUT-1493
> URL: https://issues.apache.org/jira/browse/MAHOUT-1493
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Reporter: Sebastian Schelter
> Assignee: Andrew Palumbo
> Labels: DSL, h2o, scala
> Fix For: 0.10.0
>
> Attachments: MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch,
> MAHOUT-1493.patch, MAHOUT-1493a.patch
>
>
> Port our Naive Bayes implementation to the new Spark DSL. It shouldn't require
> more than a few lines of code.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)