As the title says, I'm confused about why some algorithms support partial_fit and some don't.
For regression models, I found that SGD does but RF (random forest) doesn't.
Is it down to a difference between the algorithms? I thought SGD can partial_fit because of
gradient descent, or is there another reason?
Thanks
___
It's not necessarily unique to stochastic gradient descent; it's more that some
other algorithms are generally not well suited for "partial_fit". For SGD,
partial_fit is a more natural thing to do since the training loss is estimated
from minibatches anyway -- i.e., SGD already proceeds step by step.
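
To make the "step by step" point concrete, here is a minimal sketch in plain Python. `TinySGDRegressor` is a hypothetical toy class written for illustration, not scikit-learn's implementation; the learning rate, batch size, and the synthetic target `y = 3*x0 - 2*x1 + 1` are all assumptions. The point is only that the whole model state is a weight vector and a bias, so each call to `partial_fit` can pick up exactly where the previous one left off.

```python
import random


class TinySGDRegressor:
    """Hypothetical toy linear regressor trained with SGD on squared error."""

    def __init__(self, n_features, lr=0.01):
        self.w = [0.0] * n_features  # weights persist between partial_fit calls
        self.b = 0.0                 # bias persists as well
        self.lr = lr

    def predict_one(self, xi):
        return sum(wj * xj for wj, xj in zip(self.w, xi)) + self.b

    def partial_fit(self, X, y):
        # One gradient step per sample; nothing about earlier batches is
        # needed except the current values of self.w and self.b.
        for xi, yi in zip(X, y):
            err = self.predict_one(xi) - yi
            for j, xij in enumerate(xi):
                self.w[j] -= self.lr * err * xij
            self.b -= self.lr * err
        return self


rng = random.Random(0)
model = TinySGDRegressor(n_features=2, lr=0.05)

# Feed the data as a stream of small batches, never all at once.
for _ in range(2000):
    X = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(8)]
    y = [3 * x0 - 2 * x1 + 1 for x0, x1 in X]
    model.partial_fit(X, y)
```

After the loop, `model.w` and `model.b` should be close to the coefficients used to generate the data. A random forest has no comparably small state that a new batch can simply nudge: each tree's splits are chosen from the data it was grown on, which is one way to see why an incremental update is less natural there.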