Hi, I have a quick question regarding the regression + SGD technique.
So if we take, say, logistic regression + SGD, we can use it to regress a certain dependent variable. But what if the variable being regressed has significant variance? Take this heartbreaking example: people come into a store, some of them end up buying something for some $$, but most don't buy anything (sale $ = 0). Suppose I use SGD regression to regress the sale $ on a bunch of individual regressors (the person's profile, the store's theme/focus, etc.). Obviously this regressand has very high variance.

But if I can hope to converge on the expected value of the sale, then I would be able to predict, say, daily sales for individual stores based on the number of people who visited per day -- or, for that matter, over whatever interval, as long as we know how many people were there (which basically makes my manager happy for the moment). Another thing I want to try is to come up with E(sale) for every new person coming into the store, before he or she makes any deal, based on various regressors such as the person's profile, store focus, etc.

So intuitively I feel that SGD must converge on E(regressand) even in cases where variance(regressand) is quite high, since SGD basically minimizes the squared error (which is closely related to the variance). Is that correct? I am not quite sure whether that is backed by the math of stochastic gradient descent. Could anybody (perhaps Ted?) chime in on the subject?

Another question: would there be a difference between SGD + MLE vs. SGD + least squares for high-variance regressands?

Thank you very much in advance.
-Dmitriy
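For what it's worth, here is a small numeric sketch of the intuition in question: plain SGD with a squared-error loss on a zero-inflated "sale $" target, checked against the known conditional mean. All numbers and the data-generating process (an 80% no-purchase rate, a single made-up "customer profile" score as the regressor) are invented purely for illustration, not taken from any real store data.

```python
import numpy as np

# Sketch: does least-squares SGD recover E[sale | x] when most sales are $0?
rng = np.random.default_rng(0)

n = 200_000
x = rng.uniform(0.0, 1.0, size=n)       # hypothetical "customer profile" score
buys = rng.random(n) < 0.2              # most visitors buy nothing
amount = 10.0 + 40.0 * x                # amount spent *if* the person buys
y = np.where(buys, amount, 0.0)         # sale $: mostly zeros, high variance

# True conditional mean: E[y | x] = 0.2 * (10 + 40 x) = 2 + 8 x
w = b = 0.0
lr = 0.01
w_sum = b_sum = 0.0
tail = n // 2                           # average the second half of the iterates
for t in range(n):
    err = (w * x[t] + b) - y[t]         # gradient of the squared-error loss
    w -= lr * err * x[t]
    b -= lr * err
    if t >= tail:                       # iterate averaging tames gradient noise
        w_sum += w
        b_sum += b

w_avg = w_sum / (n - tail)
b_avg = b_sum / (n - tail)
print(w_avg, b_avg)                     # lands near the true slope 8, intercept 2
```

The squared-error loss is minimized in expectation by the conditional mean, which is why the averaged iterates land near (8, 2) despite the per-sample noise being huge; the high variance only slows convergence, it does not bias the limit. On the MLE question: under a Gaussian noise model, maximizing the likelihood and minimizing least squares give the same estimator, so that particular pairing would make no difference.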
