Todd answered me privately, but was willing to continue the discussion in public. You can see his detailed answers to my questions below.
Cutting down "Customer Events" and "Business Events" to a shorter list of the most important events would be a good thing.

Here's an idea: you might want to have models trained on typical customer profiles, say "Male, 20-30 yrs old, in the Midwest US". You could have a bunch of these and train each one only on relevant life cycle events, picking only those that match its profile for training. Then you would have several models that know the patterns for those profiles. As new life cycle events are fed into each profile's model, abnormal events should show up as anomalous.

This would be a very interesting experiment. I have not attempted something like this before.

Thanks,
---------
Matt Taylor
OS Community Flag-Bearer
Numenta

---------- Forwarded message ----------
From: Todd Schlosser <[email protected]>
Date: Thu, Oct 15, 2015 at 4:07 PM
Subject: Re: Customer lifecycle use-case
To: Matthew Taylor <[email protected]>

Hi Matt,

Thank you for the reply. I'll answer your questions in order, and provide a sample record set (see below). I hope this helps!

1. How many "Customer Events" are there total?

This could be as many as a business has processes, or just a few. This would be considered a categorical variable such as ("Visit", "Register", "Questionnaire", "Checkout", "Order", "Refund", "Renew", "Cancel", etc.). It all depends on the complexity of the customer lifecycle, but it could be kept simple by limiting it to success events, or to those events that impact revenue ("Order", "Refund", "Renew").

2. How many "Business Events" are there total?

This is a similar question, but it could be limited to those events which communicate with the customer ("Welcome Email", "Sales Call", "Renewal Reminder", etc.).

3. What exactly are "Customer Attributes"?

Either categorical variables ("Gender", "State", "Age"), or some behavioral measures which define the state of the user, such as ("# Orders", "# Visits", "# Customer Care Calls", "# Renewals").

4. How many events are within a typical customer lifecycle?

Similar to question 1, this depends on how complicated the customer lifecycle is and which events are relevant (e.g. revenue-generating events).

This is a typical classification problem where the likelihood of an event is determined by the series of events that preceded it, and by the attribute space of the customer. I've simplified the dataset below, removing "Business Events", and assuming there were millions of customers, each with a series of events in their own customer lifecycle.

Question: What is the most likely event to occur after "Questionnaire" for the following customer (Age=42, Gender=Male, State=CA, Registered=1)?

Customer ID  Timestamp   Customer Event       Age      Gender   State
1000         10/15/2015  Visit                Unknown  Unknown  Unknown
1000         10/18/2015  Register             42       Male     CA
1000         10/27/2015  Click Email          42       Male     CA
1000         11/5/2015   Subscribe            42       Male     CA
1000         11/10/2015  Renew                42       Male     CA
1000         11/11/2015  Click Email          42       Male     CA
1000         11/18/2015  Visit                42       Male     CA
1000         11/22/2015  Questionnaire        42       Male     CA
1000         11/23/2015  Order                42       Male     CA
1000         11/25/2015  Refund               42       Male     CA
1001         11/27/2015  Cancel Subscription  42       Male     CA

On Thursday, October 15, 2015 8:31 AM, Matthew Taylor <[email protected]> wrote:

Hi Todd,

I think there is potential there, but I have a few questions:

1. How many "Customer Events" are there total?
2. How many "Business Events" are there total?
3. What exactly are "Customer Attributes"?
4. How many events are within a typical customer lifecycle?

---------
Matt Taylor
OS Community Flag-Bearer
Numenta

On Thu, Oct 15, 2015 at 6:13 AM, Todd Schlosser via nupic <[email protected]> wrote:

---------- Forwarded message ----------
From: Todd Schlosser <[email protected]>
To: <[email protected]>
Cc:
Date: Thu, 15 Oct 2015 06:12:35 -0700
Subject: Customer Behavior Prediction with Nupic

I’ve developed a robust data set in Hadoop called the ‘CrystalBall’ which describes in detail the history of the customer lifecycle for every customer captured.
I’m trying to determine whether NuPIC could be used to predict the next event in the customer lifecycle from a set of records which describe a time-sequenced set of customer attributes and activities. Those attributes could be demographic, psychographic, or behavioral, or even some targeted business activity such as an email or a promotion.

For example, if I had a data set which had:

1. Customer ID
2. Timestamp
3. Customer Event (Visit, Registration, Subscription, Refund, Renewal, Optout, Expiration)
4. Business Event (Ad, Email, Discount, A/B Test, etc.)
5. Customer Attributes (V1 – Vn)

I would then want NuPIC to predict the next Customer Event which is most likely to occur. Is that something NuPIC could be used for?

Thank you.
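
For anyone following the thread who wants a point of comparison before trying this in NuPIC, here is a minimal sketch of the "most likely next event" question as a plain transition-count baseline. It is not NuPIC and not an HTM model: it only counts which event follows which within each customer's sequence, and it ignores the customer attributes entirely. The function names (build_transitions, most_likely_next) are made up for illustration, and the rows are the sample records from Todd's message, assumed to be in chronological order as listed.

```python
from collections import Counter, defaultdict

# Sample rows from the thread: (customer_id, timestamp, event).
# Assumption: rows are already in chronological order, as listed in the message.
ROWS = [
    (1000, "10/15/2015", "Visit"),
    (1000, "10/18/2015", "Register"),
    (1000, "10/27/2015", "Click Email"),
    (1000, "11/5/2015", "Subscribe"),
    (1000, "11/10/2015", "Renew"),
    (1000, "11/11/2015", "Click Email"),
    (1000, "11/18/2015", "Visit"),
    (1000, "11/22/2015", "Questionnaire"),
    (1000, "11/23/2015", "Order"),
    (1000, "11/25/2015", "Refund"),
    (1001, "11/27/2015", "Cancel Subscription"),
]


def build_transitions(rows):
    """Count event -> next-event transitions within each customer's sequence."""
    by_customer = defaultdict(list)
    for customer_id, _timestamp, event in rows:
        by_customer[customer_id].append(event)

    transitions = defaultdict(Counter)
    for events in by_customer.values():
        # Pair each event with the one that immediately follows it.
        for prev, nxt in zip(events, events[1:]):
            transitions[prev][nxt] += 1
    return transitions


def most_likely_next(transitions, event):
    """Return (next_event, estimated_probability), or None if `event`
    was never observed with a following event."""
    counts = transitions.get(event)
    if not counts:
        return None
    nxt, n = counts.most_common(1)[0]
    return nxt, n / sum(counts.values())


transitions = build_transitions(ROWS)
print(most_likely_next(transitions, "Questionnaire"))
```

With a single customer history the answer is trivially whatever followed "Questionnaire" in that history; the baseline only becomes informative across many customers. Because it looks at just one preceding event, it also cannot tell apart sequences that differ further back, which is the kind of longer-range context a sequence-memory model is meant to capture.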

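
Matt's suggestion of one model per customer profile, with unusual events flagged as anomalous, can also be sketched without NuPIC. The version below is a rough illustration under stated assumptions: each profile's "model" is just a table of transition counts (a stand-in, not an HTM model), the anomaly score is one minus the observed frequency of a transition, and the profile bucketing (gender, age decade, state) is a hypothetical choice. The training sequences are toy data invented for the example, not records from the CrystalBall data set.

```python
from collections import Counter, defaultdict


class ProfileModel:
    """Transition counts for one customer profile (a stand-in, not an HTM model)."""

    def __init__(self):
        self.transitions = defaultdict(Counter)

    def train(self, events):
        # Learn which event follows which within one customer's sequence.
        for prev, nxt in zip(events, events[1:]):
            self.transitions[prev][nxt] += 1

    def anomaly_score(self, prev, nxt):
        """Return 1 - P(nxt | prev); 1.0 if `prev` was never seen with a successor."""
        counts = self.transitions.get(prev)
        if not counts:
            return 1.0
        return 1.0 - counts[nxt] / sum(counts.values())


def profile_key(age, gender, state):
    """Hypothetical bucketing: gender, age decade, and state."""
    decade = (age // 10) * 10
    return (gender, f"{decade}-{decade + 9}", state)


# Toy training sequences, invented for illustration only.
training = [
    ((42, "Male", "CA"), ["Visit", "Register", "Subscribe", "Renew"]),
    ((45, "Male", "CA"), ["Visit", "Register", "Subscribe", "Renew"]),
    ((47, "Male", "CA"), ["Visit", "Register", "Subscribe", "Cancel"]),
]

# One model per profile; each sees only the sequences matching its profile.
models = defaultdict(ProfileModel)
for attrs, events in training:
    models[profile_key(*attrs)].train(events)

model = models[profile_key(42, "Male", "CA")]
print(model.anomaly_score("Subscribe", "Renew"))   # common transition, low score
print(model.anomaly_score("Subscribe", "Refund"))  # never seen, highest score
```

In a real experiment each ProfileModel would be replaced by whatever sequence model is being evaluated; the part worth keeping is the routing of each customer's events to the model for that customer's profile.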