Hi Pat,
I have a question about building recommendations in practice. I look forward to
your reply, and thank you.
/rec/si/input/data.txt:
--------------------------
u1,purchase,iphone
u1,purchase,ipad
u2,purchase,nexus
u2,purchase,galaxy
u3,purchase,surface
u4,purchase,iphone
u4,purchase,galaxy
u1,view,iphone
u1,view,ipad
u1,view,nexus
u1,view,galaxy
u2,view,iphone
u2,view,ipad
u2,view,nexus
u2,view,galaxy
u3,view,surface
u3,view,nexus
u4,view,iphone
u4,view,ipad
u4,view,galaxy
u1,addToCart,iphone
u1,addToCart,ipad
u1,addToCart,nexus
u2,addToCart,iphone
u2,addToCart,nexus
u2,addToCart,galaxy
u3,addToCart,surface
u4,addToCart,iphone
u4,addToCart,galaxy
I ran the following command:
mahout spark-itemsimilarity -i /rec/si/input/data.txt -o /rec/si/output \
  -f1 purchase -f2 view -os -ic 2 -fc 1 -td ,
It created two directories, /rec/si/output/indicator-matrix and
/rec/si/output/cross-indicator-matrix, with the following contents:
/rec/si/output/indicator-matrix/part-00000
--------------------------
galaxy nexus
surface
iphone ipad
nexus galaxy
ipad iphone
/rec/si/output/cross-indicator-matrix/part-00000
--------------------------
galaxy galaxy,iphone,nexus,ipad
surface surface,nexus
iphone galaxy,iphone,nexus,ipad
nexus galaxy,iphone,nexus,ipad
ipad galaxy,iphone,nexus,ipad
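
For context on where these rows come from: spark-itemsimilarity scores item pairs with a log-likelihood ratio (LLR) over a 2x2 cooccurrence table. The Python sketch below is only an illustration under that assumption, not Mahout's code, and the names x_log_x, entropy, and llr are my own. It uses the entropy form of LLR. With the user counts taken by hand from data.txt, it agrees (to floating-point rounding) with the strengths quoted from the docs further down this thread: 1.7260924347106847 for galaxy/nexus and 4.498681156950466 for surface/surface.

```python
from math import log


def x_log_x(x):
    # x * ln(x), with the usual convention that 0 * ln(0) = 0.
    return 0.0 if x == 0 else x * log(x)


def entropy(*counts):
    # Unnormalized Shannon entropy of a list of counts.
    return x_log_x(sum(counts)) - sum(x_log_x(c) for c in counts)


def llr(k11, k12, k21, k22):
    # k11: users with both A and B, k12: A only, k21: B only, k22: neither.
    row = entropy(k11 + k12, k21 + k22)
    col = entropy(k11 + k21, k12 + k22)
    mat = entropy(k11, k12, k21, k22)
    # Clamp tiny negative values caused by rounding.
    return max(0.0, 2.0 * (row + col - mat))


# purchase/purchase: galaxy bought by u2 and u4, nexus bought by u2 only.
galaxy_nexus = llr(1, 1, 0, 2)
# purchase/view: surface bought by u3 only, surface viewed by u3 only.
surface_surface = llr(1, 0, 0, 3)
print(galaxy_nexus, surface_surface)
```

With -os (omit strength) these scores are dropped from the output, which is why only the item IDs appear in the listings above.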
I then ran a second command:
mahout spark-itemsimilarity -i /rec/si/input/data.txt -o /rec/si/output2 \
  -f1 addToCart -os -ic 2 -fc 1 -td ,
/rec/si/output2/indicator-matrix/part-00000
--------------------------
galaxy iphone
surface
iphone galaxy,nexus,ipad
nexus iphone,ipad
ipad nexus,iphone
From the outputs above I created an index or table; here is the table
(test.rec) that I used for testing:
+----+---------+----------+--------------------------+-------------------+
| id | item | purchase | view | addToCart |
+----+---------+----------+--------------------------+-------------------+
| 1 | galaxy | nexus | galaxy,iphone,nexus,ipad | iphone |
| 2 | surface | | surface,nexus | |
| 3 | iphone | ipad | galaxy,iphone,nexus,ipad | galaxy,nexus,ipad |
| 4 | nexus | galaxy | galaxy,iphone,nexus,ipad | iphone,ipad |
| 5 | ipad | iphone | galaxy,iphone,nexus,ipad | nexus,iphone |
+----+---------+----------+--------------------------+-------------------+
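
As a rough illustration of how the three outputs were merged into this table, here is a small Python sketch. It assumes each part-file line is an item ID, whitespace, and then a comma-separated list of IDs, exactly as pasted above; the helper names parse_rows and build_table are hypothetical and are not part of Mahout.

```python
# Part-file contents copied from the listings in this message.
INDICATOR = """\
galaxy nexus
surface
iphone ipad
nexus galaxy
ipad iphone
"""
CROSS_INDICATOR = """\
galaxy galaxy,iphone,nexus,ipad
surface surface,nexus
iphone galaxy,iphone,nexus,ipad
nexus galaxy,iphone,nexus,ipad
ipad galaxy,iphone,nexus,ipad
"""
ADD_TO_CART = """\
galaxy iphone
surface
iphone galaxy,nexus,ipad
nexus iphone,ipad
ipad nexus,iphone
"""


def parse_rows(text):
    # Map each item ID to its list of similar item IDs (empty if none).
    rows = {}
    for line in text.strip().splitlines():
        parts = line.split(None, 1)
        rows[parts[0]] = parts[1].split(",") if len(parts) > 1 else []
    return rows


def build_table(purchase, view, add_to_cart):
    # One row per item, one multi-valued field per action.
    items = dict.fromkeys([*purchase, *view, *add_to_cart])
    return {
        item: {
            "purchase": purchase.get(item, []),
            "view": view.get(item, []),
            "addToCart": add_to_cart.get(item, []),
        }
        for item in items
    }


table = build_table(
    parse_rows(INDICATOR), parse_rows(CROSS_INDICATOR), parse_rows(ADD_TO_CART)
)
for item, fields in table.items():
    print(item, fields)
```

Keeping each field as a list of IDs, rather than one comma-joined string, makes exact ID matching possible later.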
Now suppose a user has viewed "nexus" and "ipad", and I want to show him
"other people also viewed" recommendations. Which of the following queries is
right? Or is there another way?
1. select item from test.rec where view like '%nexus%' or view like '%ipad%'
order by id;
+---------+
| item |
+---------+
| galaxy |
| surface |
| iphone |
| nexus |
| ipad |
+---------+
2. select item from test.rec where view like '%nexus%' and view like '%ipad%'
order by id;
+--------+
| item |
+--------+
| galaxy |
| iphone |
| nexus |
| ipad |
+--------+
3. select distinct view from test.rec where view like '%nexus%' or view like
'%ipad%' order by id;
+--------------------------+
| view |
+--------------------------+
| galaxy,iphone,nexus,ipad |
| surface,nexus |
+--------------------------+
4. select distinct view from test.rec where view like '%nexus%' and view like
'%ipad%' order by id;
+--------------------------+
| view |
+--------------------------+
| galaxy,iphone,nexus,ipad |
+--------------------------+
5. select distinct view from test.rec where item = 'nexus' or item = 'ipad'
order by id;
+--------------------------+
| view |
+--------------------------+
| galaxy,iphone,nexus,ipad |
+--------------------------+
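
For comparison, here is a Python sketch of the approach described in the advice quoted further down this thread: match the user's view history against the view field, rank the matching items, and optionally filter out items already in the history. It is only a sketch. The name recommend is hypothetical, and counting matched IDs is a simplified stand-in for a search engine's relevance score, which would weight terms differently. It matches whole IDs rather than substrings, so it behaves like query 1 (OR) with a ranking added.

```python
# The "view" column of the test table, as lists of item IDs.
VIEW_FIELD = {
    "galaxy": ["galaxy", "iphone", "nexus", "ipad"],
    "surface": ["surface", "nexus"],
    "iphone": ["galaxy", "iphone", "nexus", "ipad"],
    "nexus": ["galaxy", "iphone", "nexus", "ipad"],
    "ipad": ["galaxy", "iphone", "nexus", "ipad"],
}


def recommend(history, field, filter_history=True):
    wanted = set(history)
    scored = []
    for item, ids in field.items():
        # Score = number of history IDs found in this item's field (OR query).
        score = len(wanted & set(ids))
        if score == 0:
            continue
        if filter_history and item in wanted:
            continue
        scored.append((score, item))
    # Stable sort: higher scores first, table order kept among ties.
    scored.sort(key=lambda pair: -pair[0])
    return [item for _, item in scored]


filtered = recommend(["nexus", "ipad"], VIEW_FIELD)
unfiltered = recommend(["nexus", "ipad"], VIEW_FIELD, filter_history=False)
print(filtered)
print(unfiltered)
```

On this tiny data set, surface ranks last because only "nexus" matches its view field, while the other items match both history IDs.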
I have only been learning Mahout for a short time and do not understand it very
well yet.
Thanks.
On Sep 19, 2014, at 22:41, pol <[email protected]> wrote:
> Hi Pat,
> I made a spelling mistake, as you said. I was referring to this
> example:
> http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html
>
> Now I see that my understanding is right. Thank you again.
>
>
> On Sep 19, 2014, at 22:14, Pat Ferrel <[email protected]> wrote:
>
>> First it looks like some misspelled IDs
>>
>> ipad != iPad
>> iphone != iPhone
>>
>> Second, you have to treat purchase as the primary action and view as the
>> secondary action. This will create two indicator matrices in two different
>> directories, as the docs say. Use the command line in the docs for two
>> actions.
>>
>> Notice:
>> --filter1 purchase \ # word that flags input for the primary action
>> --filter2 view \ # word that flags input for the secondary action
>>
>> This tells the job to create an indicator matrix from lines with "purchase"
>> and a cross-indicator matrix from lines with "view".
>>
>> Read the "More Complex Input" section.
>>
>>
>> On Sep 19, 2014, at 1:27 AM, pol <[email protected]> wrote:
>>
>> Hi Pat,
>>
>> Thank you very much! I understand a little better now. In this example:
>>
>> item purchase view
>> --------------------------------------------------------
>> galaxy nexus galaxy iphone nexus iPad
>> surface surface nexus
>> iPhone ipad galaxy iphone nexus ipad
>> nexus galaxy galaxy iphone nexus ipad
>> iPad iphone galaxy iphone nexus iPad
>>
>> When a user views "surface", "surface" is recommended for him to view;
>> when a user purchases "nexus" and "iPad", "galaxy" and "iPhone" are
>> recommended for him to purchase.
>> Of course, the recommendation results are not filtered here. Is my
>> understanding right?
>>
>> Thanks.
>>
>> On Sep 19, 2014, at 04:40, Pat Ferrel <[email protected]> wrote:
>>
>>> You create the indicator and cross-indicator matrices with --omitStrength.
>>> Then, if you are using a database with Solr or Elasticsearch, you will
>>> create a table:
>>>
>>> item ID, list of indicator Item IDs, list of cross-indicator item IDs
>>>
>>> 3 columns. All IDs will be like "nexus" in the example; they are your
>>> application's item IDs. The second and third columns contain lists of item
>>> IDs. There are several ways you can do this: either by using a multi-valued
>>> field (array of IDs) or a space-delimited string, depending on how you want
>>> to integrate with your search engine and database. Check the instructions
>>> for your particular search engine.
>>>
>>> At the time you want to recommend, take the user's history of the primary
>>> action (purchase in the example) and map it to the "list of indicator Item
>>> IDs" field. Take the user's history of the secondary action (view in the
>>> example) and map it to the "list of cross-indicator Item IDs" field. Then
>>> perform the search engine query and you'll get back a list of item IDs to
>>> recommend. Filter out any items that the user has in their history (if you
>>> wish) and recommend the items in the order they were returned.
>>>
>>> This blog explains more:
>>> http://occamsmachete.com/ml/2014/09/09/mahout-on-spark-whats-new-in-recommenders-part-2/
>>>
>>> Ted’s book gives an example architecture:
>>> https://www.mapr.com/practical-machine-learning
>>>
>>> On Sep 18, 2014, at 10:00 AM, pol <[email protected]> wrote:
>>>
>>> Hi, All
>>> I saw spark-itemsimilarity doc at
>>> http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html,
>>> but I don't understand how to create a recommender with
>>> spark-itemsimilarity. I don't understand the "3 Creating a Recommender"
>>> chapter.
>>> For input of the form:
>>> u1,purchase,iphone
>>> u1,purchase,ipad
>>> u2,purchase,nexus
>>> u2,purchase,galaxy
>>> u3,purchase,surface
>>> u4,purchase,iphone
>>> u4,purchase,galaxy
>>> u1,view,iphone
>>> u1,view,ipad
>>> u1,view,nexus
>>> u1,view,galaxy
>>> u2,view,iphone
>>> u2,view,ipad
>>> u2,view,nexus
>>> u2,view,galaxy
>>> u3,view,surface
>>> u3,view,nexus
>>> u4,view,iphone
>>> u4,view,ipad
>>> u4,view,galaxy
>>> output
>>> out-path
>>> |-- indicator-matrix - TDF part files
>>> \-- cross-indicator-matrix - TDF part-files
>>> The indicator matrix will contain the lines:
>>> galaxy\tnexus:1.7260924347106847
>>> ipad\tiphone:1.7260924347106847
>>> nexus\tgalaxy:1.7260924347106847
>>> iphone\tipad:1.7260924347106847
>>> surface
>>> The cross-indicator matrix will contain:
>>> iphone\tnexus:1.7260924347106847 iphone:1.7260924347106847
>>> ipad:1.7260924347106847 galaxy:1.7260924347106847
>>> ipad\tnexus:0.6795961471815897 iphone:0.6795961471815897
>>> ipad:0.6795961471815897 galaxy:0.6795961471815897
>>> nexus\tnexus:0.6795961471815897 iphone:0.6795961471815897
>>> ipad:0.6795961471815897 galaxy:0.6795961471815897
>>> galaxy\tnexus:1.7260924347106847 iphone:1.7260924347106847
>>> ipad:1.7260924347106847 galaxy:1.7260924347106847
>>> surface\tsurface:4.498681156950466 nexus:0.6795961471815897
>>> ————----
>>> Now, if u4 views nexus, how do I recommend items for u4 from the output
>>> above?
>>>
>>> Thanks.