[ 
https://issues.apache.org/jira/browse/BIGTOP-1272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14062987#comment-14062987
 ] 

jay vyas commented on BIGTOP-1272:
----------------------------------

Finished reading the patch, and it looks like it's all there. 

But after running the Mahout integration test, I get an 
{{InvalidInputException}}. I suspect this is related to the formatting of the 
Pig stage's output, but I'm not sure: 

{noformat}
    Command line arguments: {--alpha=[0.8], --endPhase=[2147483647], 
--implicitFeedback=[false], --input=[bps_integration_/cleaned/Mahout], 
--lambda=[0.1], --numFeatures=[2], --numIterations=[5], 
--numThreadsPerSolver=[1], --output=[bps_integration_/Mahout/AlsFactorization], 
--startPhase=[0], --tempDir=[/tmp/mahout_1405475399824]}
    Command line arguments: {--alpha=[0.8], --endPhase=[2147483647], 
--implicitFeedback=[false], --input=[bps_integration_/cleaned/Mahout], 
--lambda=[0.1], --numFeatures=[2], --numIterations=[5], 
--numThreadsPerSolver=[1], --output=[bps_integration_/Mahout/AlsFactorization], 
--startPhase=[0], --tempDir=[/tmp/mahout_1405475399824]}
    mapred.input.dir is deprecated. Instead, use 
mapreduce.input.fileinputformat.inputdir
    mapred.input.dir is deprecated. Instead, use 
mapreduce.input.fileinputformat.inputdir
    mapred.compress.map.output is deprecated. Instead, use 
mapreduce.map.output.compress
    mapred.compress.map.output is deprecated. Instead, use 
mapreduce.map.output.compress
    mapred.output.dir is deprecated. Instead, use 
mapreduce.output.fileoutputformat.outputdir
    mapred.output.dir is deprecated. Instead, use 
mapreduce.output.fileoutputformat.outputdir
    Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - 
already initialized
    Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - 
already initialized
    Cleaning up the staging area 
file:/tmp/hadoop-bigpetstore/mapred/staging/bigpetstore1132639609/.staging/job_local1132639609_0002
    Cleaning up the staging area 
file:/tmp/hadoop-bigpetstore/mapred/staging/bigpetstore1132639609/.staging/job_local1132639609_0002

org.apache.bigtop.bigpetstore.BigPetStoreMahoutIT > testPetStorePipeline FAILED
    org.apache.hadoop.mapreduce.lib.input.InvalidInputException at 
BigPetStoreMahoutIT.java:69

1 test completed, 1 failed
:integrationTest FAILED

{noformat}

Will dig into this some more.
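
One thing that may be worth ruling out first: Hadoop's {{FileInputFormat}} raises {{InvalidInputException}} when an input path does not exist or an input pattern matches no files, so the failure could also mean the Pig stage never wrote anything to the {{--input}} path shown in the log above. Since the job ran in local mode, that path is on the local filesystem. A minimal sketch of such a check follows; the helper name {{describe_input_dir}} is hypothetical, and the assumption that the path is relative to the directory the integration test runs from is mine.

```python
from pathlib import Path


def describe_input_dir(path):
    """Return (exists, data_files) for a local job input directory.

    Files whose names start with "_" or "." (e.g. _SUCCESS, .part-00000.crc)
    are skipped, mirroring the default hidden-file filter in FileInputFormat.
    """
    root = Path(path)
    if not root.is_dir():
        return False, []
    data_files = sorted(
        p.name
        for p in root.iterdir()
        if p.is_file() and not p.name.startswith(("_", "."))
    )
    return True, data_files


if __name__ == "__main__":
    # Path copied from the --input argument in the log; assumed to be
    # relative to the directory the integration test runs from.
    exists, files = describe_input_dir("bps_integration_/cleaned/Mahout")
    print("input dir exists:", exists)
    print("data files:", files)
```

If the directory is missing, the problem is probably upstream of Mahout; if it exists and holds data files, the formatting of those files is the more likely culprit.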

> BigPetStore: Productionize the Mahout recommender
> -------------------------------------------------
>
>                 Key: BIGTOP-1272
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1272
>             Project: Bigtop
>          Issue Type: New Feature
>          Components: Blueprints
>    Affects Versions: backlog
>            Reporter: jay vyas
>         Attachments: BIGTOP-1272.patch, BIGTOP-1272.patch, BIGTOP-1272.patch, 
> arch.jpeg
>
>
> BIGTOP-1271 adds patterns into the data that guarantee that a meaningful 
> type of product recommendation can be given for at least *some* customers, 
> since we know that there are going to be many customers who only bought 1 
> product, and also customers that bought 2 or more products, even in a 
> dataset of size 10, due to the Gaussian distribution of purchases that is 
> also in the dataset generator. 
> The current Mahout recommender code is statically valid: it runs to 
> completion in local unit tests if a Hadoop 1x tarball is present, but 
> otherwise it hasn't been tested at scale. So, let's get it working. This 
> JIRA will also comprise:
> - deciding whether to use Mahout 2x for unit tests (the default on the Mahout 
> Maven repo is the 1x impl) and whether or not Bigtop should host a Mahout 2x 
> jar. After all, Bigtop builds a Mahout 2x jar as part of its packaging 
> process, and BigPetStore might thus need a Mahout 2x jar in order to test 
> against the right set of Bigtop releases.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
