[jira] [Commented] (BIGTOP-1270) BigPetStore: Productionize the Hive portion

jay vyas (JIRA) Thu, 22 May 2014 08:11:22 -0700

    [ https://issues.apache.org/jira/browse/BIGTOP-1270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14005995#comment-14005995 ]


jay vyas commented on BIGTOP-1270:
----------------------------------

After looking into this further, I think the Hive tooling can simply be 
replaced with a Hive script, which we maintain but don't run as part of the 
official pipeline. The reason is that Hive adds a huge testing burden to 
BigPetStore, given the complexity of maintaining a local Hive and Hadoop 
tarball on developer machines.

Then we can wait for HIVE-1776 to come out, and once it does, Bigtop can 
integrate it.

> BigPetStore: Productionize the Hive portion
> -------------------------------------------
>
>                 Key: BIGTOP-1270
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1270
>             Project: Bigtop
>          Issue Type: New Feature
>          Components: Blueprints
>    Affects Versions: backlog
>            Reporter: jay vyas
>
> The Hive portion of the BigPetStore blueprint app builds a "view" over the 
> cleaned data that Mahout can then use to do product recommendations.
> The Hive code in BigPetStore only runs locally. Let's add the necessary 
> configuration hooks and/or (if we have to) externalize the Hive script itself 
> from Java so that it's easy to run directly on a cluster.
> And let's actually run it on some kind of cluster at scale. The contract 
> for the Hive portion is an output file with three numbers per line, like this: 
> {noformat}
> 100 30021 1
> 100 212341 1
> ...
> {noformat}
> This signifies that customer=100 likes both of the products "30021" and "212341". 
>  
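
The output contract quoted above could be checked with something like the following. This is only a rough sketch, not code from BigPetStore: the function name parse_preferences is hypothetical, and it assumes the three numbers are whitespace-separated and that the third column is a flag where 1 means "likes". The description does not spell out either point.

```python
from collections import defaultdict


def parse_preferences(lines):
    """Group liked product ids by customer id.

    Assumes each record is 'customer product flag', separated by
    whitespace, with flag == 1 meaning the customer likes the product.
    """
    liked = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) != 3:
            continue  # skip blank lines and anything that is not a 3-field record
        customer, product, flag = (int(field) for field in fields)
        if flag == 1:
            liked[customer].append(product)
    return dict(liked)


if __name__ == "__main__":
    sample = ["100 30021 1", "100 212341 1"]
    print(parse_preferences(sample))
```

A check like this could run against the script's output file without needing a local Hive or Hadoop install, since it only reads the text records.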



--
This message was sent by Atlassian JIRA
(v6.2#6252)
