On Thu, Jul 7, 2011 at 2:20 PM, hakeem <[email protected]> wrote:

> Because I have so few documents, I run the set of documents through train()
> in epochs -- up to 1000 times, shuffling the order of the documents on each
> epoch.

Fair.

> My questions:
> 1) Are these results surprising to you? Or, should they be expected given
> the small size of my data set?

They are surprising.

> 2) How might I tweak the OnlineLogisticRegression settings to accommodate my
> small data set?

You didn't mention how you encode the data, nor what kind of features you have. Is this a standard data set? Can you post the data so that we can turn it into a worked example?
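
For reference while the data is pending, here is a minimal sketch of the epoch scheme described in the quoted text: repeated passes over a small set, with the order reshuffled on every pass. It is a plain-Java stand-in, not Mahout's OnlineLogisticRegression; the class name EpochTrainingSketch, the toy data, the fixed learning rate, and the seed are all assumptions made for illustration, and the sketch has no regularization or learning-rate decay.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical stand-in for an online logistic regression learner.
// It only illustrates epoch-wise training with per-epoch shuffling.
public class EpochTrainingSketch {

    // Logistic function: maps a raw score to a probability in (0, 1).
    static double sigmoid(double z) {
        return 1.0 / (1.0 + Math.exp(-z));
    }

    // Probability that x belongs to class 1 under weights w.
    // w[0] is the bias term; w[j + 1] pairs with feature x[j].
    static double predict(double[] w, double[] x) {
        double z = w[0];
        for (int j = 0; j < x.length; j++) {
            z += w[j + 1] * x[j];
        }
        return sigmoid(z);
    }

    // Runs `epochs` passes of plain SGD over the data, reshuffling the
    // visiting order before each pass. Returns the learned weights.
    static double[] train(double[][] x, int[] y, int epochs,
                          double learningRate, long seed) {
        double[] w = new double[x[0].length + 1];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < x.length; i++) {
            order.add(i);
        }
        Random rng = new Random(seed); // seeded so runs are repeatable
        for (int epoch = 0; epoch < epochs; epoch++) {
            Collections.shuffle(order, rng);
            for (int i : order) {
                // Gradient step on the log-likelihood of one example.
                double error = y[i] - predict(w, x[i]);
                w[0] += learningRate * error;
                for (int j = 0; j < x[i].length; j++) {
                    w[j + 1] += learningRate * error * x[i][j];
                }
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // Toy, linearly separable data (made up for this sketch).
        double[][] x = {{0.0, 0.1}, {0.2, 0.0}, {0.9, 1.0}, {1.0, 0.8}};
        int[] y = {0, 0, 1, 1};
        double[] w = train(x, y, 1000, 0.1, 42L);
        for (int i = 0; i < x.length; i++) {
            System.out.printf("label=%d p=%.3f%n", y[i], predict(w, x[i]));
        }
    }
}
```

If a loop of this shape separates a toy set but the real data still gives poor results, the feature encoding is the first thing worth checking, which is why seeing the data would help.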
