[
https://issues.apache.org/jira/browse/MAHOUT-451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916762#action_12916762
]
Drew Farris commented on MAHOUT-451:
------------------------------------
I have a patch that changes this to work using the Hadoop FileSystem API. I
plan to get this posted and tested by Monday for inclusion in 0.4.
> Simple utility to split bayes input into training/test sets
> -----------------------------------------------------------
>
> Key: MAHOUT-451
> URL: https://issues.apache.org/jira/browse/MAHOUT-451
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Affects Versions: 0.3
> Reporter: Drew Farris
> Assignee: Drew Farris
> Priority: Minor
> Attachments: MAHOUT-451.patch, MAHOUT-451.patch
>
>
> Provides a simple utility that you point at a directory containing files in
> Bayes classifier input format. Given the number of documents to write to the
> test set, for each input file it will produce files in two output
> directories, one containing training data with the test documents removed and
> a second containing the test documents.
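For illustration only, the per-file split described above might look roughly like the sketch below. It is not the code in MAHOUT-451.patch. It assumes one document per line, held in memory, and it assumes the test documents are picked at random from a seeded shuffle; the issue text does not say how they are chosen. The names SplitSketch, Split and split are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of a per-file training/test split; not the MAHOUT-451 patch. */
public class SplitSketch {

    /** Holds the two halves of one split. */
    public static final class Split {
        public final List<String> training;
        public final List<String> test;

        Split(List<String> training, List<String> test) {
            this.training = training;
            this.test = test;
        }
    }

    /**
     * Moves testSize documents into the test set and leaves the rest as training data.
     * Assumption: each list element is one document (one line of an input file).
     * Assumption: test documents are chosen at random; the seed makes runs repeatable.
     */
    public static Split split(List<String> documents, int testSize, long seed) {
        if (testSize < 0 || testSize > documents.size()) {
            throw new IllegalArgumentException(
                "testSize must be between 0 and " + documents.size());
        }

        // Shuffle the document indices and mark the first testSize of them as test documents.
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        boolean[] isTest = new boolean[documents.size()];
        for (int i = 0; i < testSize; i++) {
            isTest[indices.get(i)] = true;
        }

        // Walk the documents in their original order so both outputs keep that order.
        List<String> training = new ArrayList<>();
        List<String> test = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            if (isTest[i]) {
                test.add(documents.get(i));
            } else {
                training.add(documents.get(i));
            }
        }
        return new Split(training, test);
    }
}
```

A caller would read each input file into a list of lines, call split once per file, and write the two resulting lists to files of the same name under the training and test output directories.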