On that subject, does anyone have any resources re: feature engineering for churn analysis?
On Thu, Jul 25, 2013 at 4:12 AM, Sayed Seliman <[email protected]> wrote:
> Hi,
>
> Mahout is a customer requirement.
> Can I use logistic regression with Mahout?
> How do I have to prepare my data so that it can be processed with
> logistic regression?
>
> Thanks
>
>
> -----Original Message-----
> From: Fernando Fernández [mailto:[email protected]]
> Sent: Thursday, 25 July 2013 07:20
> To: [email protected]
> Subject: Re: churn analysis
>
> If you don't know where to start, I would recommend starting with
> something more conventional than an HMM, which can be tricky to fully
> understand and explain. A logistic regression model can perform very
> well if the predictors are built with care. I also wouldn't start with
> Mahout unless it is a requirement from a client (some clients are so
> thrilled about "big data" that they want to use Mahout even if it's
> overkill for most predictive analytics tasks...). You will probably not
> need more than 100k-200k records to build a pretty good model. An
> undersampling scheme can also be good for the model (not necessary, but
> it won't hurt) and will lead you to that sample size anyway.
>
> If you need to go with Mahout, there is an SGD implementation of
> logistic regression in Mahout.
>
> The key point in building a good churn model, though, is how you build
> the predictor variables; after that, any binary classification model
> would do the trick.
>
>
> 2013/7/24 <[email protected]>
>
> > I've not used Mahout to do it, but in the past colleagues have used
> > an HMM to discover customers who are in an "about to churn" state.
> > This was used to populate a target list for winback intervention
> > (they're about to churn, so contact them and offer something - or
> > just help - to keep them). I tried the Mahout HMM earlier in the
> > year, but got discouraged by some odd behaviour which I have still
> > not managed to delve into.
> >
> > The problem that we saw with churn analysis for our domain was that
> > most churners leave with no event on their account in the recent past.
> > Essentially there are external factors generating churn over the
> > whole population (competitor offers, demographics, economics), which
> > means that the domain model is not accessible from the data. So,
> > while a much better than "random" predictor can be built, it only
> > barely costed in to operate, and is sufficiently far from a conclusive
> > knockdown winner to allow homebrew.spreadsheet.witchcraft alternatives
> > to pop up and be given air time by people not familiar with the idea
> > that if you flip 1000 coins in the air at once some of them are going
> > to keep coming up as heads for a bit. One way round this is "more
> > data, better data", which is kinda where I came in on Mahout and
> > HMMs.
> >
> > So, my suggestion would be:
> >
> > - Look at your data: do your churners have events in an actionable
> >   period (this depends on your domain) that could be the basis of a
> >   signal? If there are enough of them in this category to power a
> >   business case based on intervention and win back, you're on... if
> >   not, then more data, better data is needed..
> > - Are there strong correlations between the last event and the churn?
> >   Then use a decision tree or similar to classify churn prospects from
> >   stables - if you get a good predictor, no need to do more; if not,
> >   then..
> > - Try an HMM; it could help you find groups of sequences of actions
> >   that lead to churning (repeated contacts, escalations, resorting to
> >   letter writing etc.). But check that Mahout's one is sound and works
> >   for you (I am not confident that I did enough work to say that my
> >   problems weren't a case of "problem between screen and chair", so if
> >   you get things working then superduper!)
> >
> > Hope that helps you,
> >
> > Simon
> >
> >
> > ________________________________________
> > From: Sayed Seliman [[email protected]]
> > Sent: 24 July 2013 21:37
> > To: [email protected]
> > Subject: churn analysis
> >
> > Hi,
> >
> > what are your experiences in building churn analysis system with mahout ?
> >
> > What do you suggest to implement ?
> >
> > Any success story implementing churn analysis system with mahout ?
> >
> > thanks
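
For anyone who wants to try Fernando's suggestion from the thread above (undersample the non-churners, then fit a logistic regression with SGD) before committing to Mahout, here is a minimal sketch in plain Python. It does not use Mahout's API. The two predictors (a scaled complaint count and a scaled inactivity measure), the synthetic data, and all function names are made up purely for illustration; real predictors would have to come from your own domain.

```python
import math
import random


def undersample(rows, labels, seed=0):
    """Keep every churner (label 1) and at most as many random non-churners."""
    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    keep = pos + rng.sample(neg, min(len(pos), len(neg)))
    rng.shuffle(keep)
    return [rows[i] for i in keep], [labels[i] for i in keep]


def sigmoid(z):
    # Logistic function written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic_sgd(rows, labels, epochs=50, lr=0.1, seed=0):
    """Plain stochastic gradient descent on the log-loss; returns (weights, bias)."""
    rng = random.Random(seed)
    w = [0.0] * len(rows[0])
    b = 0.0
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, rows[i])) + b)
            err = p - labels[i]  # gradient of the log-loss w.r.t. the logit
            w = [wj - lr * err * xj for wj, xj in zip(w, rows[i])]
            b -= lr * err
    return w, b


def predict_proba(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


# Synthetic, made-up customers: churn gets more likely with more
# complaints and longer inactivity. Both predictors are scaled to [0, 1].
data_rng = random.Random(42)
rows, labels = [], []
for _ in range(5000):
    complaints = data_rng.randint(0, 5)
    inactivity = data_rng.random()
    p_churn = sigmoid(-5.0 + 0.8 * complaints + 2.0 * inactivity)
    rows.append([complaints / 5.0, inactivity])
    labels.append(1 if data_rng.random() < p_churn else 0)

bal_rows, bal_labels = undersample(rows, labels)
model = fit_logistic_sgd(bal_rows, bal_labels)

risky = predict_proba(model, [1.0, 1.0])  # many complaints, long inactivity
safe = predict_proba(model, [0.0, 0.0])   # no complaints, recently active
```

One thing to keep in mind: training on an undersampled set inflates the predicted churn probabilities relative to the true base rate, so the scores are best used for ranking customers (for example, to build a winback target list) unless you recalibrate them.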
