Simple utility to split Bayes input into training/test sets
-----------------------------------------------------------

                 Key: MAHOUT-451
                 URL: https://issues.apache.org/jira/browse/MAHOUT-451
             Project: Mahout
          Issue Type: New Feature
          Components: Classification
    Affects Versions: 0.3
            Reporter: Drew Farris
            Priority: Minor


Provides a simple utility that you point at a directory containing files in
Bayes classifier input format. Given the number of documents to write to the
test set, it produces, for each input file, files in two output directories:
one containing the training data with the test documents removed, and a second
containing the test documents.
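
The behaviour described above can be sketched roughly as follows. This is an
illustration only, not the code attached to the issue. It assumes that each
line of an input file is one document, and that the test documents are chosen
at random; the issue does not say how they are selected. The function names
(split_file, split_directory) and the seed parameter are hypothetical.

```python
import random
from pathlib import Path


def split_file(input_file, train_dir, test_dir, test_count, seed=0):
    """Split one input file into a training file and a test file.

    Assumption: every line of the input file is one document.
    Returns (number of training documents, number of test documents).
    """
    input_file = Path(input_file)
    lines = input_file.read_text(encoding="utf-8").splitlines(keepends=True)

    # Never ask for more test documents than the file contains.
    test_count = min(test_count, len(lines))

    # Assumption: test documents are picked at random (seeded for repeatability).
    rng = random.Random(seed)
    test_indexes = set(rng.sample(range(len(lines)), test_count))

    train_dir, test_dir = Path(train_dir), Path(test_dir)
    train_dir.mkdir(parents=True, exist_ok=True)
    test_dir.mkdir(parents=True, exist_ok=True)

    # Each output file keeps the name of the input file it came from.
    train_path = train_dir / input_file.name
    test_path = test_dir / input_file.name
    with open(train_path, "w", encoding="utf-8") as train, \
            open(test_path, "w", encoding="utf-8") as test:
        for index, line in enumerate(lines):
            (test if index in test_indexes else train).write(line)

    return len(lines) - test_count, test_count


def split_directory(input_dir, train_dir, test_dir, test_count, seed=0):
    """Apply split_file to every regular file directly inside input_dir."""
    results = {}
    for path in sorted(Path(input_dir).iterdir()):
        if path.is_file():
            results[path.name] = split_file(
                path, train_dir, test_dir, test_count, seed
            )
    return results
```

For example, split_directory("bayes-input", "train", "test", 20) would move up
to 20 documents from each file under bayes-input into test/ and write the
remainder to train/ (the directory names here are placeholders).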
