[ 
https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14426396#comment-14426396
 ] 

ASF GitHub Bot commented on MAHOUT-1493:
----------------------------------------

Github user andrewpalumbo commented on a diff in the pull request:

    https://github.com/apache/mahout/pull/111#discussion_r27781100
  
    --- Diff: spark/src/main/scala/org/apache/mahout/drivers/TestNBDriver.scala ---
    @@ -76,7 +77,7 @@ object TestNBDriver extends MahoutSparkDriver {
       /** Read the test set from inputPath/part-x-00000 sequence file of form <Text,VectorWritable> */
       private def readTestSet: DrmLike[_] = {
         val inputPath = parser.opts("input").asInstanceOf[String]
    -    val trainingSet = drm.drmDfsRead(inputPath)
    +    val trainingSet = drm.drmDfsRead(inputPath).par(auto = true)
    --- End diff --
    
    @dlyubimov, @pferrel: does adding `.par(auto = true)` here make sense?
    Otherwise the input data gets no explicit partitioning instructions.


> Port Naive Bayes to the Spark DSL
> ---------------------------------
>
>                 Key: MAHOUT-1493
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1493
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>            Reporter: Sebastian Schelter
>            Assignee: Andrew Palumbo
>              Labels: DSL, h2o, scala
>             Fix For: 0.10.0
>
>         Attachments: MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch, 
> MAHOUT-1493.patch, MAHOUT-1493a.patch
>
>
> Port our Naive Bayes implementation to the new Spark DSL. This shouldn't
> require more than a few lines of code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
