Thanks for your suggestions, folks. I have made some progress.

In my dataset PRODUCT:
Y           - dependent variable (NUMBER_OF_MONTHS; it can be any
              positive integer)
SEGMENT     - categorical independent variable (takes values
              01, 02, ..., 60)
STATUS      - censoring indicator: 1 = censored; 0 = uncensored.
PURCHASE_DT - the date when the customer purchased the product.
CANCEL_DT   - the date when the customer canceled the product;
              missing if the customer has not canceled yet.
CANCEL      - equal to CANCEL_DT when it is present; otherwise
              today's date.
*The censored obs are all RIGHT-CENSORED*


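/* My SAS code */
DATA PRODUCT;
  SET PRODUCT;
  /* STATUS=0 means CANCELLED (event observed),
     STATUS=1 means NOT CANCELLED (right-censored at today's date) */
  IF CANCEL_DT EQ . THEN DO; CANCEL = TODAY();   STATUS = 1; END;
  ELSE                  DO; CANCEL = CANCEL_DT; STATUS = 0; END;
  /* INTCK counts the month boundaries crossed between the two dates */
  Y = INTCK('MONTH', PURCHASE_DT, CANCEL);
  FORMAT CANCEL DATE9.;
  IF Y GE 3;  /* keep only obs with at least 3 months */
RUN;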

PROC LIFEREG data=PRODUCT;
  CLASS SEGMENT;
  MODEL Y*STATUS(1)=SEGMENT / DIST= weibull;
  OUTPUT OUT=PROD_OUT P=PREDICTED_Y;
RUN;
QUIT;
/**********/
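
For reference, my understanding is that P= on the OUTPUT statement
returns an estimated quantile of the survival time, and that the
default quantile is 0.5, so PREDICTED_Y should be a predicted median
rather than a mean. Assuming the usual accelerated failure time form
that LIFEREG uses for the Weibull (the symbols below are my own
notation, not taken from the output), the p-th quantile for a customer
with covariates x would be:

```latex
\log T = x^{\top}\beta + \sigma\,\varepsilon,
\qquad
t_p = \exp\!\bigl( x^{\top}\beta + \sigma \,\log(-\log(1-p)) \bigr)
```

Here \varepsilon follows the standard extreme value distribution and
\sigma is the scale parameter. For p = 0.5 the last term is
\sigma \log(\log 2), roughly -0.367 \sigma.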

My questions are:

1. How do I decide which distribution to use? I tried exponential,
Weibull, normal, etc.

2. About 75% of my observations are censored (there are 110,000
observations in total in my dataset PRODUCT).

3. The PREDICTED_Y values are really huge numbers, like 170, 200, etc.,
which are above what I expected. I suspect this is due to the large
number of right-censored obs in my dataset. I have heard that heavy
censoring can lead to highly extrapolated predictions. Is there a way
to handle such censoring problems? Also, is it really a problem, or is
it OK to have this kind of situation?

4. If anybody knows of a better way of getting PREDICTED_Y, or
different ways of analysis, please let me know.
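
In case it helps with question 1, here is a minimal sketch of how the
candidate distributions could be fitted side by side so that the log
likelihoods in the printed output can be compared. The macro name
FIT_DIST is made up for illustration; the DIST= keywords are standard
LIFEREG options. As far as I know, the log likelihoods are only
directly comparable among the models fitted to log(Y) (exponential,
Weibull, lognormal, log-logistic), not against DIST=NORMAL, which
models Y itself.

```sas
/* Hypothetical helper: refit the same model under one distribution */
%MACRO FIT_DIST(DIST);
  TITLE "LIFEREG fit with DIST=&DIST";
  PROC LIFEREG DATA=PRODUCT;
    CLASS SEGMENT;
    MODEL Y*STATUS(1) = SEGMENT / DIST=&DIST;
  RUN;
%MEND FIT_DIST;

/* Compare the log likelihood reported for each fit */
%FIT_DIST(EXPONENTIAL);
%FIT_DIST(WEIBULL);
%FIT_DIST(LNORMAL);
%FIT_DIST(LLOGISTIC);
TITLE;
```

The exponential is a special case of the Weibull (scale fixed at 1),
so that pair at least could be compared with a likelihood ratio test.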

Thanks a lot,
AJ
