[jira] [Comment Edited] (BIGTOP-1272) BigPetStore: Productionize the Mahout recommender

jay vyas (JIRA) Fri, 04 Jul 2014 06:31:18 -0700

    [ 
https://issues.apache.org/jira/browse/BIGTOP-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14052437#comment-14052437
 ]


jay vyas edited comment on BIGTOP-1272 at 7/4/14 1:29 PM:
----------------------------------------------------------

I think a simpler way that will work is:

{{hadoop fs -copyFromLocal pig-without-hadoop*.jar 
hdfs://localhost:1234/tmp/pig.jar}}
{{/usr/lib/hadoop/bin/hadoop jar hive-pig/bigpetstore-1.3.10.jar 
org.bigtop.bigpetstore.etl.PigCSVCleaner -libjars 
hdfs://localhost:1234/tmp/pig.jar bigpetstore bigpetstore_cleaned}} 

We can try that out.  IIRC, that's how I run it in some test scripts.  The 
alternative:

{{export HADOOP_CLASSPATH=/usr/lib/pig/pig-0.12.0.2.0.6.1-101-withouthadoop.jar 

hadoop jar ……}}

Either way is (I think) equivalent, but -libjars might be easier since you 
don't have to copy the file to every node on the cluster; you just copy the jar 
once into whatever DFS you are using. 
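
To compare the two options side by side, here is a minimal dry-run sketch. It only prints the two command lines and does not touch a cluster. The helper names (libjars_cmd, classpath_cmd) and the variables are hypothetical; the paths and the namenode address are copied from the commands above. One assumption worth checking: -libjars is a generic option handled by Hadoop's GenericOptionsParser, so it only takes effect if PigCSVCleaner is launched through ToolRunner (or parses generic options itself).

```shell
#!/bin/sh
# Dry-run sketch: prints the two invocations instead of running them.
# All values below are taken from the comment; adjust them for your cluster.
JOB_JAR=hive-pig/bigpetstore-1.3.10.jar
MAIN_CLASS=org.bigtop.bigpetstore.etl.PigCSVCleaner
PIG_JAR_DFS=hdfs://localhost:1234/tmp/pig.jar
PIG_JAR_LOCAL=/usr/lib/pig/pig-0.12.0.2.0.6.1-101-withouthadoop.jar

# Option 1 (hypothetical helper): pass the Pig jar, already copied into the
# DFS, with -libjars. Assumes the main class honors generic options.
libjars_cmd() {
    echo "hadoop jar $JOB_JAR $MAIN_CLASS -libjars $PIG_JAR_DFS $1 $2"
}

# Option 2 (hypothetical helper): put the local Pig jar on HADOOP_CLASSPATH
# for this one invocation.
classpath_cmd() {
    echo "HADOOP_CLASSPATH=$PIG_JAR_LOCAL hadoop jar $JOB_JAR $MAIN_CLASS $1 $2"
}

# Print both variants for the input and output directories used above.
libjars_cmd bigpetstore bigpetstore_cleaned
classpath_cmd bigpetstore bigpetstore_cleaned
```

Once the printed lines look right, paste one of them into a shell on a node that has the hadoop client installed.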


was (Author: jayunit100):
I think a simpler way that will work is:

{{hadoop fs -copyFromLocal pig-without-hadoop*.jar 
hdfs://localhost:1234/tmp/pig.jar}}
{{/usr/lib/hadoop/bin/hadoop jar hive-pig/bigpetstore-1.3.10.jar 
org.bigtop.bigpetstore.etl.PigCSVCleaner -libjars 
hdfs://localhost:1234/tmp/pig.jar bigpetstore bigpetstore_cleaned}} 

We can try that out.  IIRC, that's how I run it in some test scripts.  The 
alternative:

{{export HADOOP_CLASSPATH=/usr/lib/pig/pig-0.12.0.2.0.6.1-101-withouthadoop.jar 
/usr/lib/hadoop/bin/hadoop jar hive-pig/bigpetstore-1.3.10.jar 
org.bigtop.bigpetstore.etl.PigCSVCleaner bigpetstore bigpetstore_cleaned
bps=$?}}





> BigPetStore: Productionize the Mahout recommender
> -------------------------------------------------
>
>                 Key: BIGTOP-1272
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1272
>             Project: Bigtop
>          Issue Type: New Feature
>          Components: Blueprints
>    Affects Versions: backlog
>            Reporter: jay vyas
>         Attachments: BIGTOP-1272.patch, BIGTOP-1272.patch, arch.jpeg
>
>
> BIGTOP-1271 adds patterns into the data that guarantee that a meaningful 
> type of product recommendation can be given for at least *some* customers, 
> since we know that there are going to be many customers who only bought 1 
> product, and also customers that bought 2 or more products -- even in a 
> dataset of size 10, due to the Gaussian distribution of purchases that is 
> also in the dataset generator. 
> The current Mahout recommender code is statically valid: it runs to 
> completion in local unit tests if a Hadoop 1x tarball is present, but 
> otherwise it hasn't been tested at scale.  So, let's get it working.  This 
> JIRA will also comprise:
> - deciding whether to use Mahout 2x for unit tests (the default on the Mahout 
> Maven repo is the 1x impl) and whether or not Bigtop should host a Mahout 2x jar.  
> After all, Bigtop builds a Mahout 2x jar as part of its packaging process, 
> and BigPetStore might thus need a Mahout 2x jar in order to test against the 
> right set of Bigtop releases.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
