On Wed, Jul 4, 2012 at 12:09 AM, Caspar Hsieh <[email protected]> wrote:
> Hi, Ted Dunning
>
> I comment the line "Collection.shuffle(files);" in TrainNewsGroups.java,
> let the model trained with same order of examples each time.
>

This will prevent effective learning. You must shuffle the data at least once.

> After recompile the code and redo the experiment, the model are still not
> the same each time. :(
>

There is also non-determinism in the AdaptiveLogisticRegression in the CrossFoldLearner.

> And I make sure the vectors before input to train are the same value and
> order each time.
>
> How do I fixed the order of examples??
>

One thing that might help is to tell RandomUtils that you are running a test. This will make the random number generators that Mahout controls become deterministic. You can add a random number generator argument to the shuffle call to get determinism there as well.
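For the shuffle part, a minimal sketch of what a seeded shuffle could look like, using only the standard java.util API. The class name DeterministicShuffle, the helper shuffledCopy, the file names, and the seed 42 are made up for illustration and are not from TrainNewsGroups.java. The comment about RandomUtils.useTestSeed() assumes that method exists in your Mahout version, so check the RandomUtils javadoc before relying on it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

public class DeterministicShuffle {

    // Hypothetical helper: returns a shuffled copy of items whose order
    // depends only on the seed, so repeated runs see the same order.
    public static <T> List<T> shuffledCopy(List<T> items, long seed) {
        List<T> copy = new ArrayList<>(items);
        // Collections.shuffle(List, Random) takes its randomness from the
        // supplied generator instead of an internally seeded default one.
        Collections.shuffle(copy, new Random(seed));
        return copy;
    }

    public static void main(String[] args) {
        // Assumption: in a Mahout program you would also call
        // RandomUtils.useTestSeed() before building the learner, so that
        // the generators Mahout creates are deterministic as well.
        List<String> files = Arrays.asList("a.txt", "b.txt", "c.txt", "d.txt", "e.txt");

        List<String> first = shuffledCopy(files, 42L);
        List<String> second = shuffledCopy(files, 42L);

        // Same seed and same input give the same shuffled order.
        System.out.println(first.equals(second)); // prints true
    }
}
```

The data is still shuffled once, which is what learning needs, but the order is now reproducible from run to run. A different seed gives a different fixed order.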
