Thanks! That helped quite a bit, and I'm further along. It's not quite working
yet, but closer.
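
For anyone following along, here is a minimal sketch of the two-step workflow using the JLD.jl suggestion quoted below. It is an illustration under assumptions, not a confirmed working setup: the file name "forest_model.jld", the key "model", and the random stand-in data are all hypothetical, and the second half would normally live in a separate script that also does `using DecisionTree` so the tree types are defined before loading.

```julia
using DecisionTree
using JLD

# Hypothetical stand-in data; the real script would use the clarity.csv columns.
features = rand(100, 40)
labels = rand(0:1, 100)

# App 1: train the forest (2 random features, 10 trees, 0.5 of samples per tree)
# and persist it. File name and key are placeholders.
model = build_forest(labels, features, 2, 10, 0.5)
save("forest_model.jld", "model", model)

# App 2: reload the stored forest and apply it to a single sample.
restored = load("forest_model.jld", "model")
outcome = apply_forest(restored, vec(features[1, :]))
println(outcome)
```

The point of the split is that only the first half needs the large training DataFrame; the second half needs just the saved file and one feature vector.
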
On Sunday, September 27, 2015 at 6:33:50 AM UTC-4, Tim Holy wrote:
>
> You can try JLD.jl.
>
> --Tim
>
> On Saturday, September 26, 2015 05:31:18 PM Marc Stein wrote:
> > I'm just starting out with Julia, so please forgive me if this is a
> > simplistic question.
> >
> > I'm using the DecisionTree package which generates an Ensemble of
> > DecisionTrees in the code below:
> >
> >
> > ######
> >
> > using DataFrames
> > using DecisionTree
> >
> > clarity = readtable("/Users/marcstein/Active/julia/clarity.csv");
> > head(clarity)
> >
> > labels = array(clarity[:, 41]);
> > features = array(clarity[:, 1:40]);
> >
> > # Random Forest Classifier
> >
> > # train random forest classifier
> > # using 2 random features, 10 trees, and 0.5 portion of samples per tree (optional)
> >
> > model = build_forest(labels, features, 2, 10, 0.5)
> >
> > # apply learned model
> >
> > outcome = apply_forest(model,
> >     [2,761,0,0,2,1.32,74,0,365,3,2,15,10,1,0,1,24,36,2000,0,1,1,0,0,0,1,0,0,0,1,
> >      5,1,0,2,0,2,220,221,220,221])
> >
> > # run n-fold cross validation for forests
> > # using 2 random features, 10 trees, 3 folds and 0.5 of samples per tree (optional)
> >
> > accuracy = nfoldCV_forest(labels, features, 2, 10, 3, 0.5)
> >
> > score = (mean(accuracy[1:3]))
> >
> > println(outcome)
> > println(score)
> >
> > ######
> >
> >
> > This code works fine. But because the DataFrame that is the training set
> > is quite large, I want to build the model and store it in one app and then
> > load the model and generate the outcome in a second app.
> >
> > It seems like this should be simple, just persist the model in a file and
> > pass it into the apply_forest method.
> >
> > I can't find a way, though, to persist the model. If I try
> >
> > writedlm(outfile,model)
> >
> > I get:
> >
> > EnsembleERROR: `start` has no method matching start(::Ensemble)
> > in writedlm at datafmt.jl:535
> > in writedlm at datafmt.jl:554
> >
> > If I try:
> >
> > print(outfile,model)
> >
> > the output file contains:
> >
> > Ensemble of Decision Trees
> > Trees: 10
> > Avg Leaves: 117.8
> > Avg Depth: 20.8
> >
> > which is a summary of the Ensemble, not the individual elements.
> >
> > I'm clearly missing something, but I haven't been able to figure it out
> > so far.
> >
> > Any suggestions would be greatly appreciated!
>
>