I also think that essentially all of the power you are going to get from an
HMM can be captured by other means, such as sparse event-sequence features.
Sparse logistic regression or a large-scale random forest can work wonders on
these features, provided the model has a good means of regularization (L_1,
i.e. lasso, or elastic net for logistic regression; the native characteristics
of the method for random forests).
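
A minimal sketch of what a sparse event-sequence feature could look like,
assuming each customer's history is an ordered list of event names. The
function name, the "seq=" prefix and the sample events are hypothetical and
only for illustration; this is not Mahout code.

```python
from collections import Counter


def event_sequence_features(events, max_n=2):
    """Count event n-grams (n = 1..max_n) in one customer's ordered history.

    Returns a dict holding only the n-grams that actually occur, which is
    what keeps the representation sparse.
    """
    features = Counter()
    for n in range(1, max_n + 1):
        # Slide a window of length n over the event list.
        for i in range(len(events) - n + 1):
            features["seq=" + ">".join(events[i:i + n])] += 1
    return dict(features)


# Hypothetical history for one customer, oldest event first.
history = ["call", "call", "escalation", "letter"]
features = event_sequence_features(history)
```

Dicts like this can then be hashed or one-hot encoded into a sparse vector
and handed to an L_1-regularized logistic regression or a random forest.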



On Wed, Jul 24, 2013 at 10:19 PM, Fernando Fernández <
[email protected]> wrote:

> If you don't know where to start, I would recommend starting with something
> more conventional than an HMM, which can be tricky to fully understand and
> explain. A logistic regression model can perform very well if the predictors
> are built with care. I also wouldn't start with Mahout unless it is a
> requirement from a client (some clients are so thrilled about "big data"
> that they want to use Mahout even when it's overkill for most predictive
> analytics tasks...). You will probably not need more than 100k-200k records
> to build a pretty good model. An undersampling scheme can also be good for
> the model (not necessary, but it won't hurt) and will lead you to that
> sample size anyway.
>
> If you need to go with Mahout, it has an SGD implementation of logistic
> regression.
>
> The key point in building a good churn model, though, is how you build the
> predictor variables; after that, any binary classification model will do
> the trick.
>
>
> 2013/7/24 <[email protected]>
>
> > I've not used Mahout to do it, but in the past colleagues have used an HMM
> > to discover customers who are in an "about to churn" state. This was used
> > to populate a target list for winback intervention (they're about to
> > churn, so contact them and offer something - or just help - to keep them).
> > I tried the Mahout HMM earlier in the year, but got discouraged by some
> > odd behaviour which I have still not managed to delve into.
> >
> > The problem that we saw with churn analysis for our domain was that most
> > churners leave with no event on their account in the recent past.
> > Essentially there are external factors generating churn over the whole
> > population (competitor offers, demographics, economics), which mean that
> > the domain model is not accessible from the data. So, while a much better
> > than "random" predictor can be built, it only barely costed in to operate,
> > and it is sufficiently far from a conclusive knockdown winner to allow
> > homebrew.spreadsheet.witchcraft alternatives to pop up and be given air
> > time by people not familiar with the idea that if you flip 1000 coins in
> > the air at once, some of them are going to keep coming up heads for a
> > bit. One way round this is "more data, better data", which is kinda where
> > I came in on Mahout and HMMs.
> >
> > So, my suggestion would be:
> >
> > - Look at your data: do your churners have events in an actionable period
> >   (this depends on your domain) that could be the basis of a signal? If
> >   enough of them fall into this category to power a business case based on
> >   intervention and winback, you're on. If not, then more data, better data
> >   is needed.
> > - Are there strong correlations between the last event and the churn? If
> >   so, use a decision tree or similar to classify churn prospects from
> >   stables. If you get a good predictor there is no need to do more; if
> >   not, then...
> > - Try an HMM. It could help you find groups of sequences of actions that
> >   lead to churning (repeated contacts, escalations, resorting to letter
> >   writing, etc.). But check that Mahout's one is sound and works for you (I
> >   am not confident that I did enough work to say that my problems weren't
> >   a case of "problem between screen and chair", so if you get things
> >   working then superduper!)
> >
> > Hope that helps you,
> >
> > Simon
> >
> >
> >
> > ________________________________________
> > From: Sayed Seliman [[email protected]]
> > Sent: 24 July 2013 21:37
> > To: [email protected]
> > Subject: churn analysis
> >
> > Hi,
> >
> >
> >
> > What are your experiences in building a churn analysis system with Mahout?
> >
> > What do you suggest implementing?
> >
> > Any success stories implementing a churn analysis system with Mahout?
> >
> >
> >
> > thanks
> >
>
