> On Nov 24, 2015, at 12:21 PM, Niklas Ekvall <niklas.ekv...@gmail.com> wrote:
>
> Okay!
>
> No pre-filter, and the user/item ids should start from 0 and go up to as
> many users and items as there are. So, all the data we have should go into
> Mahout and we filter inside Mahout... correct?

Yes, but I wouldn't filter. Even with only a small number of events, the recs will very likely be better than random.

>
> We do the same pre-filter for Spark item-similarity, is that wrong too?

No, spark-itemsimilarity uses string ids.

>
> Best regards, Niklas
>
> On Tuesday, November 24, 2015, Pat Ferrel <p...@occamsmachete.com> wrote:
>
>> I wouldn’t pre-filter, but in any case the ids input to hadoop-mahout need
>> to follow those rules.
>>
>> The new recommender I mentioned has no such requirements; it uses string
>> IDs.
>>
>> On Nov 24, 2015, at 11:44 AM, Niklas Ekvall <niklas.ekv...@gmail.com> wrote:
>>
>> No, it does not start from 0 and does not cover all numbers between 0 and
>> the number of items/users. We prefilter (a user must have bought at least
>> 5 products and a product must have been bought by 3 users) before we use
>> Mahout on the dataset. Therefore we start with user 3, then it jumps to
>> user 5, etc.
>>
>> Is this wrong? Should we use all data as input to Mahout and do the
>> filtering inside Mahout?
>>
>> We use the second latest version of Mahout!
>>
>> Best regards, Niklas
>>
>> On Tuesday, November 24, 2015, Pat Ferrel <p...@occamsmachete.com> wrote:
>>
>>> Do your ids start with 0 and cover all numbers between 0 and the number
>>> of items - 1 (same for user ids)?
>>> The old hadoop-mahout code required ordinal ids starting at 0
>>>
>>> On Nov 24, 2015, at 8:19 AM, Niklas Ekvall <niklas.ekv...@gmail.com> wrote:
>>>
>>> Hi Pat,
>>>
>>> Here is some input:
>>>
>>> 3 7414
>>> 3 12682
>>> 3 18947
>>> 3 19980
>>> 3 26975
>>> 3 54635
>>> 3 67789
>>> 3 73212
>>> 3 118932
>>> 3 138846
>>> 3 141268
>>> 5 3
>>> 5 2123
>>> 5 37955
>>> 5 39975
>>> 5 113289
>>> 6 3
>>> 6 456
>>> 6 2188
>>> 6 2496
>>> 6 6194
>>> 6 6361
>>> 6 6768
>>> 6 6919
>>> 6 6920
>>> 6 7257
>>> 6 7705
>>> 6 7706
>>> 6 11788
>>>
>>> And some output:
>>>
>>> 3 [122086:1.0,1846:1.0,74638:1.0,63240:1.0,87540:1.0,2742:1.0,2981:1.0,8325:1.0,145598:1.0,49675:1.0,131388:1.0,72113:1.0,3493:1.0,56131:1.0,30422:1.0,87829:1.0,111190:1.0,13597:1.0,83436:1.0,61772:1.0]
>>> 5 [32349:1.0,29413:1.0,111896:1.0,61845:1.0,50016:1.0,1607:1.0,15237:1.0,133229:1.0,65805:1.0,34034:1.0,133071:1.0,28894:1.0,18658:1.0,32095:1.0,4402:1.0,47522:1.0,31022:1.0,23936:1.0,6243:1.0,53214:1.0]
>>> 6 [40756:1.0,34420:1.0,31153:1.0,114717:1.0,53945:1.0,71148:1.0,26095:1.0,112941:1.0,55284:1.0,111346:1.0,112201:1.0,65759:1.0,133127:1.0,61378:1.0,16413:1.0,113289:1.0,49675:1.0,14995:1.0,141028:1.0,27506:1.0]
>>>
>>> Best regards, Niklas
>>>
>>> 2015-11-24 16:48 GMT+01:00 Pat Ferrel <p...@occamsmachete.com>:
>>>
>>>> Sounds like you may not have the input right. Recommendations should be
>>>> sorted by the strength and so shouldn’t all be 1 unless the data is very
>>>> odd.
>>>>
>>>> Can you give us a small sample of the input?
>>>>
>>>> BTW a newer recommender using Mahout’s Spark-based code and a search
>>>> engine is here:
>>>> https://github.com/PredictionIO/template-scala-parallel-universal-recommendation
>>>> A single-machine install script is here:
>>>> https://docs.prediction.io/start/
>>>>
>>>> On Nov 24, 2015, at 2:16 AM, Niklas Ekvall <niklas.ekv...@gmail.com> wrote:
>>>>
>>>> Hello Mahout Users!
>>>>
>>>> I currently use Mahout - Recommenditembased with Log-similarity to
>>>> produce personal recommendations for trigger emails in an offline mode.
>>>> But when I produce e.g. 50 recommendations, the rank values of the
>>>> recommendations are always of magnitude 1. Why is this so? And is the
>>>> first recommendation in this list the best one, or is there some
>>>> randomness in this list?
>>>>
>>>> Best regards,
>>>>
>>>> Niklas Ekvall
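
For anyone hitting the same ordinal-id requirement discussed in this thread, here is a minimal sketch (not Mahout code) of one way to remap arbitrary user and item ids to contiguous ids starting at 0 before writing the input file, and to map recommended item ids back afterwards. The helper names `build_index`, `remap_pairs` and `invert` are hypothetical, and the sample pairs are a few rows taken from the input posted above.

```python
from typing import Dict, Iterable, List, Tuple


def build_index(ids: Iterable[int]) -> Dict[int, int]:
    """Assign each distinct id a 0-based ordinal in order of first appearance."""
    index: Dict[int, int] = {}
    for raw in ids:
        if raw not in index:
            index[raw] = len(index)
    return index


def remap_pairs(
    pairs: List[Tuple[int, int]],
) -> Tuple[List[Tuple[int, int]], Dict[int, int], Dict[int, int]]:
    """Return (user, item) pairs rewritten with contiguous 0-based ids,
    plus the two lookup tables used for the rewrite."""
    user_index = build_index(u for u, _ in pairs)
    item_index = build_index(i for _, i in pairs)
    remapped = [(user_index[u], item_index[i]) for u, i in pairs]
    return remapped, user_index, item_index


def invert(index: Dict[int, int]) -> Dict[int, int]:
    """Build the ordinal -> original id table for translating results back."""
    return {ordinal: raw for raw, ordinal in index.items()}


# A few (user, item) rows from the sample input in this thread.
pairs = [(3, 7414), (3, 12682), (5, 3), (5, 2123), (6, 3), (6, 456)]
remapped, user_index, item_index = remap_pairs(pairs)

for user, item in remapped:
    print(user, item)
```

Keeping `user_index` and `item_index` around (for example, saved next to the input file) is what lets the ordinal ids in the recommender output be translated back to the original ids with `invert`.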