On Wed, Jul 4, 2012 at 12:09 AM, Caspar Hsieh <[email protected]> wrote:

> Hi, Ted Dunning
>
> I commented out the line "Collections.shuffle(files);" in TrainNewsGroups.java
> so that the model is trained on the examples in the same order each time.
>

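This will prevent effective learning.  Without the shuffle, the examples
arrive grouped by newsgroup, which online training handles badly.  You must
shuffle the data at least once.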


>
> After recompiling the code and rerunning the experiment, the models are
> still not the same each time. :(
>

There is also non-determinism in the AdaptiveLogisticRegression in the
CrossFoldLearner.


> And I made sure the vectors passed to training have the same values and
> order each time.
>
> How do I fix the order of examples?
>

One thing that might help is to tell RandomUtils that you are running a
test, by calling RandomUtils.useTestSeed() before training starts.  This
will make the random number generators that Mahout controls become
deterministic.

You can also pass a seeded random number generator as the second argument
to the shuffle call to get determinism there as well.
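
A minimal sketch of the seeded shuffle, using only java.util.  The class
name SeededShuffle, the file names, and the seed value are made up for
illustration; in TrainNewsGroups the list would be the real list of
training files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class SeededShuffle {
    // Returns a shuffled copy of the input. The same seed always
    // produces the same order, so runs become repeatable.
    static <T> List<T> shuffled(List<T> items, long seed) {
        List<T> copy = new ArrayList<T>(items);
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the list of training files.
        List<String> files = Arrays.asList("doc1", "doc2", "doc3", "doc4", "doc5");
        long seed = 42L; // arbitrary fixed seed, an assumption for this sketch

        List<String> first = shuffled(files, seed);
        List<String> second = shuffled(files, seed);

        // Both shuffles used the same seed, so the orders match.
        System.out.println(first.equals(second)); // prints "true"
    }
}
```

The data is still shuffled, so learning is not hurt, but the order is the
same on every run.  Changing the seed gives a different, equally
repeatable order.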
