Millions of clusters are not likely to make sense. Tens of thousands might just make sense.
On Fri, Jun 14, 2013 at 9:48 AM, Neetha <[email protected]> wrote:

> Thank you for the reply. At Ted: if we are talking in the sense of a
> millions of users there will be a millions of cluster , do this clustering
> be feasible.
>
> On Fri, Jun 14, 2013 at 1:27 AM, Ted Dunning <[email protected]> wrote:
>
> > Thanks Grant. Exactly correct.
> >
> > Some pig or hive action is indicated here. Or write a map-reduce where the
> > reducer does the vector generation.
> >
> > On Thu, Jun 13, 2013 at 7:13 PM, Grant Ingersoll <[email protected]> wrote:
> >
> > > I think Ted was implying just write a script to aggregate the Movielens
> > > data by user id. Should be pretty straightforward.
> > >
> > > On Jun 13, 2013, at 10:05 AM, Neetha <[email protected]> wrote:
> > >
> > > > Thank you, for the reply. How can we group the user.
> > > >
> > > > On Thu, Jun 13, 2013 at 3:41 PM, Ted Dunning <[email protected]> wrote:
> > > >
> > > >> You need to group by user before converting to vector to get sensible
> > > >> clustering.
> > > >>
> > > >> On Wed, Jun 12, 2013 at 1:06 PM, Grant Ingersoll <[email protected]> wrote:
> > > >>
> > > >>> The CSVVectorIterator in the Integration package will take in a CSV file
> > > >>> and produce vectors. It assumes that each row is the equivalent of a
> > > >>> DenseVector (does MovieLens fit that?) If you need otherwise, I'd suggest
> > > >>> starting with the code and modifying to fit your needs.
> > > >>>
> > > >>> -Grant
> > > >>>
> > > >>> On Jun 12, 2013, at 6:11 AM, Neetha <[email protected]> wrote:
> > > >>>
> > > >>>> Hi,
> > > >>>>
> > > >>>> I am using 1m movielens.
> > > >>>>
> > > >>>> I need to run the K-means clustering using mahout and hadoop. Actually,
> > > >>>> 1st step in the clustering is to convert into a sequence file, then into
> > > >>>> vector format and then apply the clustering algorithm. My doubt is, Is
> > > >>>> there any need to convert the movielens rating.csv file into a sequence
> > > >>>> file. If needed what are the commands for applying clustering technique
> > > >>>> using mahout and the hadoop.
> > > >>>>
> > > >>>> Thanking you,
> > > >>>> Neetha Suan Thampi
> > > >>>
> > > >>> --------------------------------------------
> > > >>> Grant Ingersoll | @gsingers
> > > >>> http://www.lucidworks.com
> > >
> > > --------------------------------------------
> > > Grant Ingersoll | @gsingers
> > > http://www.lucidworks.com
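
For anyone following the thread, the "aggregate by user id" step suggested above could look roughly like the sketch below. It is only an illustration, not code from Mahout: the function name group_ratings_by_user is made up, and it assumes each input line has the form UserID::MovieID::Rating::Timestamp (the layout of the MovieLens 1M ratings file). If your file is comma-separated, pass sep="," instead. It builds one sparse vector (movie id to rating) per user in memory; writing those vectors out in whatever format Mahout expects is a separate step not shown here.

```python
from collections import defaultdict


def group_ratings_by_user(lines, sep="::"):
    """Group rating lines into one sparse vector per user.

    Hypothetical helper: each line is assumed to hold
    user id, movie id, rating and (optionally) a timestamp,
    separated by `sep`. Returns {user_id: {movie_id: rating}}.
    """
    vectors = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Only the first three fields are needed; the timestamp is ignored.
        user_id, movie_id, rating = line.split(sep)[:3]
        vectors[int(user_id)][int(movie_id)] = float(rating)
    return dict(vectors)


# Tiny made-up sample in the assumed format.
sample = [
    "1::1193::5::978300760",
    "1::661::3::978302109",
    "2::1193::4::978298413",
]
user_vectors = group_ratings_by_user(sample)
```

Each entry in user_vectors is then one row to cluster, so the number of points equals the number of users rather than the number of ratings. For data too large for one machine, the same grouping is what the Pig, Hive, or map-reduce suggestion in the thread would do, with the user id as the key.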
