Hi Ted,

I don't understand the composite features and super-products that you
mentioned. Please explain a bit. Are you pointing to a specific data
mining method?

Thanks,
Nishant

On Mon, Nov 28, 2011 at 5:44 AM, Ted Dunning <[email protected]> wrote:
> There are several good ways to deal with this.  The idea of super-products,
> which are composite features that are derived from history, is a good one.
> I would recommend that you limit the number of such super features by
> first finding which products cooccur within a reasonable time window more
> than you would expect.
>
> The cooccurrence analysis system in Mahout can be misused for this analysis
> by building one document per user per sliding window period.  This is a bit
> flawed since the sliding windows overlap and thus the appearance of a
> transaction in multiple documents is not really an indication of
> independent appearances.  Also, the intermediate window documents are much
> larger than you might like and they won't take ordering into account.
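
A minimal sketch of the "one document per user per sliding window" idea
described above, to make the overlap problem concrete. The function name,
the 90-day window and the 30-day step are assumptions chosen for
illustration; nothing here is Mahout code.

```python
from datetime import date, timedelta


def window_documents(purchases, window_days=90, step_days=30):
    """Group purchases into one document per (customer, window start).

    purchases: iterable of (customer_id, purchase_date, product) tuples.
    Each window covers [start, start + window_days) and windows overlap
    whenever step_days < window_days.
    """
    docs = {}
    ordered = sorted(purchases, key=lambda p: p[1])
    if not ordered:
        return docs
    start, last = ordered[0][1], ordered[-1][1]
    while start <= last:
        end = start + timedelta(days=window_days)
        for customer, when, product in ordered:
            if start <= when < end:
                docs.setdefault((customer, start), []).append(product)
        start += timedelta(days=step_days)
    return docs


purchases = [
    (101, date(2007, 6, 30), "Product 1"),
    (101, date(2007, 8, 12), "Product 3"),
]
docs = window_documents(purchases)
```

With these two purchases, "Product 3" lands in both the window starting
30 June and the one starting 30 July, which is the double counting
mentioned above.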
>
> A better approach is to adapt the current code.  The basic data you need to
> collect are:
>
> - the number of times each product appears in a single user's transaction
> history before another product
>
> - the number of times each product appears in a transaction history after
> another product
>
> - the number of times product i appears after product j
>
> You can then use the LLR code in Mahout to find cases where a product
> sequence occurs anomalously often.  You can then use a Bloom filter or
> similar data structure to analyze histories so that you emit product and
> super-products as input to a conventional collaborative filtering analysis.
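
A rough sketch of the counting, LLR scoring and super-product emission
steps described above. It is plain Python, not the Mahout API: the `llr`
function re-implements the usual entropy form of the log-likelihood ratio
for a 2x2 table, "before/after" is taken to mean adjacent purchases only,
a plain set stands in for the Bloom filter, and the `a>b` token format and
all function names are assumptions.

```python
from collections import Counter
from math import log


def x_log_x(x):
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    """Unnormalized Shannon entropy of a set of counts."""
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio for a 2x2 contingency table."""
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    return max(0.0, 2.0 * (row + col - mat))


def score_pairs(histories):
    """Score each adjacent ordered pair (a then b) over all histories."""
    pair, first, second = Counter(), Counter(), Counter()
    for items in histories.values():
        for a, b in zip(items, items[1:]):
            pair[(a, b)] += 1   # a immediately followed by b
            first[a] += 1       # a appears before another product
            second[b] += 1      # b appears after another product
    total = sum(pair.values())
    scores = {}
    for (a, b), k11 in pair.items():
        k12 = first[a] - k11            # a followed by something else
        k21 = second[b] - k11           # b preceded by something else
        k22 = total - k11 - k12 - k21   # pairs involving neither role
        scores[(a, b)] = llr(k11, k12, k21, k22)
    return scores


def emit_tokens(items, super_products):
    """Emit plain products plus an 'a>b' token for each selected pair."""
    tokens = list(items)
    tokens += [f"{a}>{b}" for a, b in zip(items, items[1:])
               if (a, b) in super_products]
    return tokens


histories = {101: ["P1", "P3", "P4"], 102: ["P3", "P5", "P5"]}
scores = score_pairs(histories)
# The cutoff of 3.0 is arbitrary; pick one that suits your data.
selected = {p for p, s in scores.items() if s > 3.0}
tokens = {user: emit_tokens(items, selected)
          for user, items in histories.items()}
```

Note that a large LLR only says the pair count is surprising. To keep
pairs that occur more often than expected, you would also check that the
observed count exceeds the expected one before selecting a pair. The
emitted tokens can then be fed to the collaborative filtering step as if
they were ordinary items.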
>
>
> The second major approach to this problem is to build a separate classifier
> for each product of interest.  I wouldn't recommend that if you have lots
> of possible products, but this can work very well if you have a reasonably
> small number of products (say a few hundred or thousand) that you might be
> about to recommend.
>
>
> On Sun, Nov 27, 2011 at 2:09 AM, Nishant Chandra
> <[email protected]> wrote:
>
>> Use case is related to purchase transactions.
>>
>> Sample data set:
>> Customer ID   Acquisition time     Products
>> 101           30 June 2007         Product 1
>> 101           12 August 2007       Product 3
>> 101           20 December 2008     Product 4
>> 102           10 September 2008    Product 3
>> 102           12 September 2008    Product 5
>> 102           20 January 2009      Product 5.....
>>
>> Sample rule:
>> Rule ID   Consequent   Antecedents                Support %   Confidence %
>> Rule 1    Product 4    Product 1 then Product 3   57.1        75.0
>>
>> I want to identify rules such as: after acquiring product 1 and then
>> product 3, customers have an increased likelihood
>> (75%) of purchasing product 4 next.
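
For reference, one way such a rule could be evaluated, as a sketch under
stated assumptions: support is taken as the fraction of customers whose
history contains the antecedents in order, and confidence as the fraction
of those customers who buy the consequent at some later point. Other tools
may define these differently, and the function names are made up for this
example. The 57.1 and 75.0 figures above come from data not shown in this
thread, so the toy data below will not reproduce them.

```python
def match_end(history, pattern):
    """Index just past the earliest in-order match of pattern, or None."""
    pos = 0
    for item in pattern:
        try:
            pos = history.index(item, pos) + 1
        except ValueError:
            return None
    return pos


def rule_stats(histories, antecedents, consequent):
    """Return (support, confidence) for 'antecedents then consequent'."""
    matched = followed = 0
    for items in histories.values():
        end = match_end(items, antecedents)
        if end is None:
            continue
        matched += 1
        # Look for the consequent only after the antecedents have occurred.
        if consequent in items[end:]:
            followed += 1
    support = matched / len(histories) if histories else 0.0
    confidence = followed / matched if matched else 0.0
    return support, confidence


# The six sample rows above, ordered by acquisition time per customer.
histories = {
    101: ["Product 1", "Product 3", "Product 4"],
    102: ["Product 3", "Product 5", "Product 5"],
}
support, confidence = rule_stats(
    histories, ["Product 1", "Product 3"], "Product 4")
```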
>>
>> Thanks,
>> Nishant
>>
>>
>> On Sun, Nov 27, 2011 at 3:27 PM, Paritosh Ranjan <[email protected]>
>> wrote:
>> > Can you tell something about your use case?
>> >
>> > Paritosh
>> >
>> > On 27-11-2011 15:14, Nishant Chandra wrote:
>> >>
>> >> Hi,
>> >>
>> >> Is there any implementation for Sequential Pattern Mining in Mahout? I
>> >> see there is an implementation of Sequential Pattern Mining but I am
>> >> unsure if it can be used for my use case.
>> >>
>> >> Thanks,
>> >> Nishant
>> >>
>> >>
>> >
>> >
>>
>
