Using "split" without partitioning the data to train/test

Mahmood Naderan Mon, 31 Mar 2014 07:21:12 -0700

Hi,
In an older version of Mahout, I used wikipediaDataSetCreator on an input to 
create the training data:
    
    mahout wikipediaDataSetCreator -i wiki-tr/chunks -o tr-input -c labels.txt


and then fed the tr-input to the trainclassifier using

    mahout trainclassifier -i tr-input -o wikimodel


Now, in Mahout 0.9, I see some examples that use "split" to set aside 80% of 
the input as training data and the remaining 20% as test data:

    mahout split -i input-vectors --trainingOutput tr-vectors --testOutput ts-vectors --randomSelectionPct 20
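
To make clear what kind of partitioning is meant here, below is a minimal sketch of a random 20% hold-out in plain Python. It only illustrates the idea and is not how Mahout implements "split"; the helper name random_split and its parameters are hypothetical.

```python
import random

def random_split(items, test_pct, seed=0):
    """Hypothetical helper: hold out test_pct percent of items for testing."""
    rng = random.Random(seed)      # fixed seed so the split is repeatable
    shuffled = list(items)
    rng.shuffle(shuffled)          # randomize the order before cutting
    n_test = round(len(shuffled) * test_pct / 100)
    # The first n_test shuffled items become the test set, the rest the training set.
    return shuffled[n_test:], shuffled[:n_test]

train, test = random_split(range(100), test_pct=20)
print(len(train), len(test))
```

With 100 input items and test_pct=20, the training part gets 80 items and the test part gets 20, with no item appearing in both.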

My question is: how can I use "split" to prepare the input without partitioning 
it into training and test parts? I want to use one file as the training input 
and a separate file as the test input.


 
Regards,
Mahmood
