Downsampling negatives should make little difference to accuracy. It can, however, substantially reduce training time.
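One caveat if you need calibrated probabilities rather than just a ranking: a model trained on downsampled negatives will overestimate P(Y=1), because the odds it learns are inflated by 1/keep_rate. Here is a rough sketch of the sampling and the correction in plain Python. The function names and the keep_rate parameter are my own for illustration, not from any particular library, and it assumes negatives are dropped uniformly at random.

```python
import random


def downsample_negatives(examples, keep_rate, seed=0):
    """Keep every positive (label 1) and a random fraction of negatives (label 0).

    `examples` is a list of (features, label) pairs; `keep_rate` is the
    probability of keeping each negative, in (0, 1]. Hypothetical helper.
    """
    rng = random.Random(seed)
    return [(x, y) for x, y in examples if y == 1 or rng.random() < keep_rate]


def correct_probability(p_sampled, keep_rate):
    """Map a probability from a model trained on downsampled data back to
    the original class balance.

    Keeping negatives at rate keep_rate divides the true odds by keep_rate,
    so the predicted odds are multiplied by keep_rate to undo it. This is
    the same as adding log(keep_rate) to the model's intercept.
    """
    scaled = p_sampled * keep_rate
    return scaled / (scaled + (1.0 - p_sampled))
```

For example, if you keep 1% of the negatives and the model says 0.5 for some patient, correct_probability(0.5, 0.01) puts the estimate on the original data at roughly 1%.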
Sent from my iPhone

On Jul 11, 2011, at 6:56, Svetlomir Kasabov <[email protected]> wrote:

> Hello,
>
> I plan using logistic regression for predicting the probability that a
> patient will be given a drug Y. The problem is, that patients don't get that
> drug so often and I have many more training examples with Y=0 than examples
> with Y=1. Do you think I should keep the number of negative examples equal to
> that of positive examples? Or should I ignore that number difference and give
> my logistic regression model all of the training examples ?
>
> Thanks!
>
> Svetlomir.
