Simple utility to split Bayes input into training/test sets
-----------------------------------------------------------
Key: MAHOUT-451
URL: https://issues.apache.org/jira/browse/MAHOUT-451
Project: Mahout
Issue Type: New Feature
Components: Classification
Affects Versions: 0.3
Reporter: Drew Farris
Priority: Minor
Provides a simple utility that you point at a directory containing files in
Bayes classifier input format. Given the number of documents to write to the
test set, for each input file it produces files in two output directories:
one containing the training data with the test documents removed, and a second
containing the test documents.
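
The split described above could be sketched roughly as follows. This is an
illustrative Python sketch, not the code attached to this issue. It assumes
that each line of an input file is one document, and that test documents are
picked at random with a fixed seed; the issue states neither. The function
name split_input and all of its parameters are hypothetical.

```python
import random
from pathlib import Path


def split_input(input_dir, train_dir, test_dir, test_count, seed=0):
    """Split each file in input_dir into a training file and a test file.

    Assumption: one document per line. Up to test_count documents per
    input file are moved to the test set; the rest stay in training.
    """
    input_dir, train_dir, test_dir = Path(input_dir), Path(train_dir), Path(test_dir)
    train_dir.mkdir(parents=True, exist_ok=True)
    test_dir.mkdir(parents=True, exist_ok=True)
    rng = random.Random(seed)  # fixed seed so the split is repeatable

    for path in sorted(p for p in input_dir.iterdir() if p.is_file()):
        lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
        # Never ask for more test documents than the file contains.
        k = min(test_count, len(lines))
        test_idx = set(rng.sample(range(len(lines)), k))

        train = [doc for i, doc in enumerate(lines) if i not in test_idx]
        test = [doc for i, doc in enumerate(lines) if i in test_idx]

        # Output files keep the input file's name in each directory.
        (train_dir / path.name).write_text("".join(train), encoding="utf-8")
        (test_dir / path.name).write_text("".join(test), encoding="utf-8")
```

Calling split_input("bayes-input", "train", "test", 100) would write, for every
file in bayes-input, a same-named file under train and under test. Selecting
by line index rather than by content keeps duplicate documents from being
dropped from both sets.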
--
This message is automatically generated by JIRA.