Hi, I wrote a MapReduce program with the Hadoop Java API.
When I submitted the job to the cluster, I got the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/codehaus/jackson/map/JsonMappingException
    at org.apache.hadoop.mapreduce.Job$1.run(Job.java:489)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapreduce.Job.connect(Job.java:487)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:475)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at com.ipinyou.data.preprocess.mapreduce.ExtractFeatureFromURLJob.main(ExtractFeatureFromURLJob.java:52)
Caused by: java.lang.ClassNotFoundException: org.codehaus.jackson.map.JsonMappingException
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    ... 8 more

I found that the missing classes are in jackson-core-asl-1.5.2 and jackson-mapper-asl-1.5.2, so I added these two jars to the project and resubmitted the job. This time I got the following errors:

Jun 7, 2012 4:18:55 PM org.apache.hadoop.metrics.jvm.JvmMetrics init
INFO: Initializing JVM Metrics with processName=JobTracker, sessionId=
Jun 7, 2012 4:18:55 PM org.apache.hadoop.util.NativeCodeLoader <clinit>
WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Jun 7, 2012 4:18:55 PM org.apache.hadoop.mapred.JobClient copyAndConfigureFiles
WARNING: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
Jun 7, 2012 4:18:55 PM org.apache.hadoop.mapred.JobClient$2 run
INFO: Cleaning up the staging area file:/tmp/hadoop-huanchen/mapred/staging/huanchen757608919/.staging/job_local_0001
Exception in thread "main" org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/data/huanchen/pagecrawler/url
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:231)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:248)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1127)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at com.ipinyou.data.preprocess.mapreduce.ExtractFeatureFromURLJob.main(ExtractFeatureFromURLJob.java:51)

Note that the error is "Input path does not exist: file:/..." instead of "Input path does not exist: hdfs:/...". Does this mean the job is not actually connecting to the Hadoop cluster? And is the first "NoClassDefFoundError: org/codehaus/jackson/map/JsonMappingException" error caused by the same thing?
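For reference, here is a stripped-down sketch of my driver. The mapper/reducer setup is omitted, the output path and the host/port values in the commented-out conf.set lines are made up, and the question is whether I need something like those lines (or the cluster's config files on the classpath) for the paths to resolve against HDFS instead of the local filesystem:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ExtractFeatureFromURLJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // I assumed the cluster's core-site.xml / mapred-site.xml would be
        // picked up from the classpath. If they are not, fs.default.name
        // presumably stays at its default (file:///) and the job runs in the
        // LocalJobRunner, which would explain the "file:/" in the error.
        // conf.set("fs.default.name", "hdfs://namenode:9000");  // hypothetical host/port
        // conf.set("mapred.job.tracker", "jobtracker:9001");    // hypothetical host/port

        Job job = new Job(conf, "ExtractFeatureFromURL");
        job.setJarByClass(ExtractFeatureFromURLJob.class);
        // Mapper/reducer and output key/value setup omitted for brevity.
        FileInputFormat.addInputPath(job, new Path("/data/huanchen/pagecrawler/url"));
        FileOutputFormat.setOutputPath(job, new Path("/data/huanchen/pagecrawler/url_out")); // output path simplified
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

The log lines "Initializing JVM Metrics with processName=JobTracker, sessionId=" and the staging path "file:/tmp/.../job_local_0001" also look to me like the LocalJobRunner rather than the real JobTracker.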
Does anyone have any ideas? Thank you!

Best,
Huanchen

2012-06-07
huanchen.zhang