[ https://issues.apache.org/jira/browse/BIGTOP-1470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14154186#comment-14154186 ]

jay vyas edited comment on BIGTOP-1470 at 10/1/14 1:28 AM:
-----------------------------------------------------------

Okay, here is my minimal reproduction.  The conclusion is that certain Mahout 
jobs don't work with the Bigtop 0.8.0 release, in particular the ones that use 
*HadoopUtil* for their operations.  Thanks for providing the synthetic control 
test, [~rvs]; that really helped to narrow it down.

*thoughts?*

Here are the details.

*Roman's example works* 
{noformat}
[vagrant@bigtop1 vagrant]$ mahout org.apache.mahout.clustering.syntheticcontrol.canopy.Job
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/lib/hadoop/bin/hadoop and 
HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.9-job.jar
14/10/01 01:23:32 WARN driver.MahoutDriver: No 
org.apache.mahout.clustering.syntheticcontrol.canopy.Job.props found on 
classpath, will use command-line arguments only
14/10/01 01:23:32 INFO canopy.Job: Running with default arguments
14/10/01 01:23:34 INFO common.HadoopUtil: Deleting output
14/10/01 01:23:34 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
longer used.
14/10/01 01:23:34 INFO client.RMProxy: Connecting to ResourceManager at 
bigtop1.vagrant/10.10.10.11:8032
14/10/01 01:23:34 WARN mapreduce.JobSubmitter: Hadoop command-line option 
parsing not performed. Implement the Tool interface and execute your 
application with ToolRunner to remedy this.
14/10/01 01:23:35 INFO input.FileInputFormat: Total input paths to process : 1
14/10/01 01:23:35 INFO mapreduce.JobSubmitter: number of splits:1
14/10/01 01:23:35 INFO mapreduce.JobSubmitter: Submitting tokens for job: 
job_1412124745008_0004
14/10/01 01:23:35 INFO impl.YarnClientImpl: Submitted application 
application_1412124745008_0004
14/10/01 01:23:35 INFO mapreduce.Job: The url to track the job: 
http://bigtop1.vagrant:20888/proxy/application_1412124745008_0004/
14/10/01 01:23:35 INFO mapreduce.Job: Running job: job_1412124745008_0004
14/10/01 01:23:43 INFO mapreduce.Job: Job job_1412124745008_0004 running in 
uber mode : false
14/10/01 01:23:43 INFO mapreduce.Job:  map 0% reduce 0%
{noformat}

*but this example invocation fails immediately!*
{noformat}
[vagrant@bigtop1 vagrant]$ mahout splitDataset --input x --output y
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /usr/lib/hadoop/bin/hadoop and 
HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /usr/lib/mahout/mahout-examples-0.9-job.jar
14/10/01 01:23:57 INFO common.AbstractJob: Command line arguments: 
{--endPhase=[2147483647], --input=[x], --output=[y], --probePercentage=[0.1], 
--startPhase=[0], --tempDir=[temp], --trainingPercentage=[0.9]}
14/10/01 01:23:57 WARN mapred.JobConf: The variable mapred.child.ulimit is no 
longer used.
14/10/01 01:23:57 INFO Configuration.deprecation: mapred.input.dir is 
deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
14/10/01 01:23:57 INFO Configuration.deprecation: mapred.compress.map.output is 
deprecated. Instead, use mapreduce.map.output.compress
14/10/01 01:23:57 INFO Configuration.deprecation: mapred.output.dir is 
deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
interface org.apache.hadoop.mapreduce.JobContext, but class was expected
        at 
org.apache.mahout.common.HadoopUtil.getCustomJobName(HadoopUtil.java:174)
        at org.apache.mahout.common.AbstractJob.prepareJob(AbstractJob.java:587)
        at org.apache.mahout.common.AbstractJob.prepareJob(AbstractJob.java:572)
        at 
org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter.run(DatasetSplitter.java:90)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at 
org.apache.mahout.cf.taste.hadoop.als.DatasetSplitter.main(DatasetSplitter.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at 
org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
{noformat}

So it appears that some Mahout jobs don't exhibit the JobContext failure while 
others do; possibly it's the jobs that specifically use HadoopUtil to create 
custom job names.
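The IncompatibleClassChangeError above is the classic symptom of bytecode compiled against Hadoop 1, where org.apache.hadoop.mapreduce.JobContext was a concrete class, running on Hadoop 2, where it became an interface. A minimal reflective sketch for checking which side of that change a given classpath is on, assuming nothing beyond the JDK (JobContextCheck and kind are made-up names, not part of Mahout or Hadoop):

```java
// Sketch: report whether a named type on the current classpath is a class or
// an interface. On a Hadoop 1 classpath, org.apache.hadoop.mapreduce.JobContext
// is a class; on Hadoop 2 it is an interface -- exactly the mismatch that
// IncompatibleClassChangeError reports at runtime.
public class JobContextCheck {

    static String kind(String className) {
        try {
            return Class.forName(className).isInterface() ? "interface" : "class";
        } catch (ClassNotFoundException e) {
            return "not on classpath";
        }
    }

    public static void main(String[] args) {
        // Run this with the same classpath the mahout wrapper script builds up:
        System.out.println("JobContext: "
                + kind("org.apache.hadoop.mapreduce.JobContext"));
        // JDK types included only so the sketch runs without Hadoop installed:
        System.out.println("java.util.List: " + kind("java.util.List"));           // interface
        System.out.println("java.util.ArrayList: " + kind("java.util.ArrayList")); // class
    }
}
```

If the check prints "interface" while the job jar was built against the Hadoop 1 API (or vice versa), any code path that touches the changed type, such as HadoopUtil.getCustomJobName, will throw.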


was (Author: jayunit100):
Okay, here is my minimal reproduction.

> Mahout 0.9 couldn't work with hadoop 2.x
> ----------------------------------------
>
>                 Key: BIGTOP-1470
>                 URL: https://issues.apache.org/jira/browse/BIGTOP-1470
>             Project: Bigtop
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 0.8.0
>            Reporter: Hu Liu,
>            Priority: Blocker
>
> I built Mahout 0.9 with Bigtop and found it couldn't work with hadoop 2.x.
> {code}
> Exception in thread "main" java.lang.IncompatibleClassChangeError: Found 
> interface org.apache.hadoop.mapreduce.JobContext, but class was expected
>     at 
> org.apache.mahout.common.HadoopUtil.getCustomJobName(HadoopUtil.java:174)
>     at org.apache.mahout.common.AbstractJob.prepareJob(AbstractJob.java:614)
>     at 
> org.apache.mahout.cf.taste.hadoop.preparation.PreparePreferenceMatrixJob.run(PreparePreferenceMatrixJob.java:73)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at 
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.run(RecommenderJob.java:164)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>     at 
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob.main(RecommenderJob.java:322)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>     at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:152)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>     at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
> {code}
> The root cause is that Bigtop passes the wrong property, hadoop2.version, when 
> building Mahout.
> {code}
> mvn clean install -Dmahout.skip.distribution=false -DskipTests 
> -Dhadoop2.version=$HADOOP_VERSION "$@"
> {code} 
> For Mahout 0.9, the correct property is hadoop.version, as defined in Mahout's 
> pom.xml:
> {code}
> <hadoop.version>1.2.1</hadoop.version>
> {code}
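For reference, a corrected invocation would override hadoop.version instead of hadoop2.version. This is only a sketch of the reporter's suggested fix: $HADOOP_VERSION is whatever version Bigtop resolves, and whether Mahout 0.9's build also needs a Hadoop-2 profile enabled is not verified here.

{code}
mvn clean install -Dmahout.skip.distribution=false -DskipTests \
    -Dhadoop.version=$HADOOP_VERSION "$@"
{code}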



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
