I was wondering if anyone could help me with an interesting problem.
I am trying to forecast customer life span for a set of data.

Basically, we have 8 years of data and thousands of rows for a
subscription service. The three raw variables are as follows.

a) Starting Date of subscription
b) Cancellation Date of subscription
c) Demographic segment that a customer belongs to. We have 66
categorical values such as 01, 02, etc. These segments are given to
us by an outside firm that appends a segment to each customer record
based on variables such as what kind of car the customer drives, how
much education she has, or how much she earns.

I am interested in predicting the number of months a customer would
stay with the product. I was thinking I could use the following
variables in my regression model.

Dependent Variable: NumberOfMonths (derived from taking the
difference between the starting and ending dates of the subscription,
for both cancelled customers and customers who are still with us)
Independent Variables:
a) Status (whether a customer has cancelled (0) or is still with us (1))
b) Demographic Segment
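
A minimal Python sketch of this derivation, to make the setup concrete.
It assumes whole months are counted, and that customers without a
cancellation date are measured up to a data-extraction date. The
SNAPSHOT date, the function names, and the example dates are all
hypothetical and not part of our data.

```python
from datetime import date

# Hypothetical extraction date of the data set (an assumption).
SNAPSHOT = date(2008, 1, 1)

def months_between(start, end):
    """Whole months elapsed from start to end."""
    months = (end.year - start.year) * 12 + (end.month - start.month)
    if end.day < start.day:
        months -= 1  # the final month is not yet complete
    return months

def derive(start, cancel=None, snapshot=SNAPSHOT):
    """Return (NumberOfMonths, Status) for one customer.

    Status uses the coding above: 0 = cancelled, 1 = still with us.
    """
    if cancel is not None:
        return months_between(start, cancel), 0
    # No cancellation date: measure up to the snapshot date instead.
    return months_between(start, snapshot), 1

# Made-up example customers.
cancelled = derive(date(2005, 3, 15), date(2006, 9, 14))
active = derive(date(2005, 3, 15))
```

For a customer with Status = 1, the value returned is only the time
observed up to the snapshot date.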

Questions:
Q1) Is it OK to calculate the "NumberOfMonths" variable from the
starting and ending dates of the subscription? The reason I ask is
that for customers who have not cancelled their subscription yet, it
will only give the number of months they have been with us so far.
Of course, this information (whether the subscription has been
cancelled) will simultaneously be captured in the "Status"
independent variable (0 or 1).

Q2) I don't know how to use the "Demographic Segment" independent
variable, since there are 66 different numeric codes for these
segments. Should I use 65 (= 66 - 1) dummy variables? If I do use 65
dummy variables, my regression equation may be not only extremely
long but also potentially meaningless (with so many variables).
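
For reference, a small Python sketch of what the 65-dummy coding would
look like. It assumes the codes run from "01" to "66" and that "01" is
the reference level. Both are assumptions for illustration, and the
column names are hypothetical.

```python
def dummy_code(segment, levels, reference):
    """Return 0/1 indicators for every level except the reference level."""
    return {f"seg_{lv}": int(segment == lv) for lv in levels if lv != reference}

# Assumed segment codes "01" .. "66".
levels = [f"{i:02d}" for i in range(1, 67)]

row_02 = dummy_code("02", levels, reference="01")
row_ref = dummy_code("01", levels, reference="01")
```

Each customer gets 65 indicator columns. A customer in the reference
segment has all of them equal to 0, and every other customer has
exactly one equal to 1.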

Q3) What extra information do you think I may need in order to create
this model?

Q4) Should I use the starting year as well in my model?

Forecasting customer life span for a subscription service seems to be
a common business problem, and I was wondering if anyone has any
canned solutions or could provide me with pointers.