[ 
https://issues.apache.org/jira/browse/BIGTOP-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067372#comment-14067372
 ] 

jay vyas commented on BIGTOP-1272:
----------------------------------

Unfortunately, we can't really run this on a Hadoop cluster because of the 
dependencies, unless you can come up with a way to build them into an uber-jar 
(I tried that; it failed because of a META-INF issue, not sure what it was).

1) I resolved at least some of the dependencies (JFairy, CommonsLang3, and 
Scala lib 2.10) and added them manually to hadoop/lib, but other dependencies 
were still missing (org.yaml...).

2) I also tried to create a fat jar with Gradle, but that failed because of a 
META-INF issue in the jar file.  Maybe we *can* get a fat jar solution to work?
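
One possibility, not confirmed as the cause here: fat jars often break when 
signed dependencies bring their META-INF signature files (*.SF, *.DSA, *.RSA) 
along, because those signatures no longer match the merged jar. A minimal 
sketch of a fat jar block that drops them, assuming the Java plugin and its 
`runtime` configuration (newer Gradle versions call it `runtimeClasspath`):

```groovy
// Sketch only; not taken from the attached build.gradle.
apply plugin: 'java'

jar {
    // Unpack every runtime dependency into the application jar.
    from {
        configurations.runtime.collect { it.isDirectory() ? it : zipTree(it) }
    }
    // Drop signature files inherited from signed dependencies.
    exclude 'META-INF/*.SF', 'META-INF/*.DSA', 'META-INF/*.RSA'
}
```

If the failure was something else in META-INF, this will not help, so it is 
worth checking the actual error message first.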

[~bhashit] So even though the code should work, I cannot deploy it on a cluster 
in any easy way. We will have to come up with a reliable way to deploy this jar 
file.  
Maybe Gradle can copy all the jars to a directory in /build/, and then as part 
of the instructions we can say users need to point to that directory using 
-libjars?
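
A rough sketch of that idea, assuming the Java plugin and its `runtime` 
configuration (newer Gradle versions call it `runtimeClasspath`). The task 
names `copyDeps` and `printLibJars` and the `build/deps` directory are made up 
for illustration:

```groovy
// Hypothetical task names and output directory; not from the attached build.gradle.
task copyDeps(type: Copy) {
    // Copy every resolved runtime dependency jar into build/deps.
    from configurations.runtime
    into "$buildDir/deps"
}

task printLibJars(dependsOn: copyDeps) {
    doLast {
        // -libjars expects a comma-separated list of jar paths, not a directory.
        println fileTree("$buildDir/deps").files.collect { it.absolutePath }.join(',')
    }
}
```

The output of `gradle -q printLibJars` could then be passed as the -libjars 
value. Note that -libjars is handled by Hadoop's GenericOptionsParser, so it 
only takes effect if the driver class is run through ToolRunner.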

So, we will need to revise the instructions and build.gradle to create 
something that is easy to deploy on a Hadoop cluster.  

Let me know what ideas you have here, or just attach an updated patch and I'll 
test it.  Thanks!

> BigPetStore: Productionize the Mahout recommender
> -------------------------------------------------
>
>                 Key: BIGTOP-1272
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1272
>             Project: Bigtop
>          Issue Type: New Feature
>          Components: Blueprints
>    Affects Versions: backlog
>            Reporter: jay vyas
>         Attachments: BIGTOP-1272.patch, BIGTOP-1272.patch, BIGTOP-1272.patch, 
> arch.jpeg, build.gradle
>
>
> BIGTOP-1271 adds patterns into the data that guarantee that a meaningful 
> type of product recommendation can be given for at least *some* customers, 
> since we know that there are going to be many customers who only bought 1 
> product, and also customers that bought 2 or more products -- even in a 
> dataset size of 10, due to the Gaussian distribution of purchases that is 
> also in the dataset generator. 
> The current Mahout recommender code is statically valid: it runs to 
> completion in local unit tests if a Hadoop 1x tarball is present, but 
> otherwise it hasn't been tested at scale.  So, let's get it working.  This 
> JIRA will also comprise:
> - deciding whether to use Mahout 2x for unit tests (default on Mahout Maven 
> repo is the 1x impl) and whether or not Bigtop should host a Mahout 2x jar.  
> After all, Bigtop builds a Mahout 2x jar as part of its packaging process, 
> and BigPetStore might thus need a Mahout 2x jar in order to test against the 
> right set of Bigtop releases.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
