[
https://issues.apache.org/jira/browse/MAHOUT-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14425996#comment-14425996
]
ASF GitHub Bot commented on MAHOUT-1493:
----------------------------------------
GitHub user andrewpalumbo opened a pull request:
https://github.com/apache/mahout/pull/111
MAHOUT-1493: Add CLI options for --overwrite and --alphaI to NB Drivers
Presently `mahout spark-trainnb` will not complete if a model already
exists in the output directory. This patch adds an `--overwrite`
option to overwrite a model in the given output directory.
It also:
- adds `.par(auto = true)` to the input DRM
- adds a `delete(...)` method to `Hadoop1HDFSUtil`, which does not handle
any IO exceptions
- adds an almost trivial `--alphaI` option to set the Laplace smoothing
factor from the CLI
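For readers unfamiliar with the smoothing factor, here is a minimal sketch of
generic additive (Laplace) smoothing. It is not taken from the patch and is not
claimed to match the exact weighting Mahout's Naive Bayes applies; the object
name `LaplaceSmoothingSketch` and the `smoothed` helper are hypothetical and
only illustrate what a value such as `alphaI` controls.

```scala
// Hypothetical illustration of additive smoothing; not Mahout's implementation.
object LaplaceSmoothingSketch {

  /** Smoothed estimate of P(term | label).
    *
    * @param termCount  observed weight of the term within the label
    * @param labelTotal total observed weight of all terms within the label
    * @param vocabSize  number of distinct terms (features)
    * @param alphaI     smoothing factor; 1.0 is classic Laplace smoothing
    */
  def smoothed(termCount: Double, labelTotal: Double, vocabSize: Int, alphaI: Double): Double =
    (termCount + alphaI) / (labelTotal + alphaI * vocabSize)

  def main(args: Array[String]): Unit = {
    // Toy counts for three terms in one label; the second term was never seen.
    val counts = Array(3.0, 0.0, 1.0)
    val total = counts.sum
    val probs = counts.map(c => smoothed(c, total, counts.length, alphaI = 1.0))
    // The unseen term gets a small non-zero probability instead of zero.
    println(probs.mkString(", "))
  }
}
```

A larger smoothing factor pulls the estimates toward a uniform distribution,
while a value close to zero leaves unseen terms with almost no probability mass.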
This patch will complete the full port of the old MapReduce Naive Bayes to
the `math-scala` and `spark` modules.
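The overwrite behaviour described above amounts to removing any existing output
path before the model is written. The following is a rough sketch of that idea
using the standard Hadoop `FileSystem` API; it is not the code from this pull
request. The object name `OverwriteSketch` and the method `deleteIfExists` are
hypothetical, and the sketch assumes `hadoop-common` is on the classpath. Like
the `delete(...)` method mentioned in the description, it lets any
`IOException` propagate to the caller.

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical helper; names are illustrative and not part of Mahout.
object OverwriteSketch {

  /** Recursively deletes `pathUri` if it exists.
    *
    * Returns true if something was deleted, false if the path was absent.
    * IO exceptions are deliberately not caught here.
    */
  def deleteIfExists(pathUri: String, conf: Configuration = new Configuration()): Boolean = {
    val path = new Path(pathUri)
    // Resolve the file system (HDFS, local, ...) that owns this path.
    val fs = path.getFileSystem(conf)
    if (fs.exists(path)) fs.delete(path, true) else false
  }
}
```

A driver could call such a helper on the model output directory only when the
overwrite flag is set, and otherwise leave an existing model untouched.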
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/andrewpalumbo/mahout MAHOUT-1493g
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/mahout/pull/111.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #111
----
commit 37f40e1ba32871227cc025effe87b135b5c28f31
Author: Andrew Palumbo <[email protected]>
Date: 2015-04-05T20:56:48Z
set .par(auto = true) on input data in CLI Drivers
commit 149b0d0ef4ee5ea6e161b669f09d4fa2eeb2fff3
Author: Andrew Palumbo <[email protected]>
Date: 2015-04-05T22:12:27Z
add CLI driver options for -ow and -alphaI. added a delete(...) method in
Hadoop1HDFSUtil.
commit 59ca8c1b2e960b5bfc10de39d5cd8e2bb0042c12
Author: Andrew Palumbo <[email protected]>
Date: 2015-04-05T22:14:58Z
Adjust Example accordingly
----
> Port Naive Bayes to the Spark DSL
> ---------------------------------
>
> Key: MAHOUT-1493
> URL: https://issues.apache.org/jira/browse/MAHOUT-1493
> Project: Mahout
> Issue Type: Bug
> Components: Classification
> Reporter: Sebastian Schelter
> Assignee: Andrew Palumbo
> Labels: DSL, h2o, scala
> Fix For: 0.10.0
>
> Attachments: MAHOUT-1493.patch, MAHOUT-1493.patch, MAHOUT-1493.patch,
> MAHOUT-1493.patch, MAHOUT-1493a.patch
>
>
> Port our Naive Bayes implementation to the new spark dsl. Shouldn't require
> more than a few lines of code.