Hello Fu Bin-zhang,

The error message is very weird because FileSplit is a class derived from InputSplit, and the conversion is legal. However, I've seen this message several times. The error is most likely related to the location of the Hadoop tmp directory. Could you please compress your $HADOOP_HOME/conf folder and send it to me? There is no need to broadcast it to the list; just send it to me directly.
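In case it helps, here is a minimal sketch of one way to package the folder, assuming a POSIX shell and a tar with gzip support. The helper name pack_hadoop_conf and the archive name are placeholders chosen for illustration; they are not part of Hadoop, Mahout, or CloudSuite.

```shell
# Hypothetical helper (not part of Hadoop or CloudSuite): archive the conf
# directory found under a Hadoop install into a gzip-compressed tarball.
#   $1 = Hadoop install directory (for example "$HADOOP_HOME")
#   $2 = output archive path (placeholder name)
pack_hadoop_conf() {
    hadoop_home="$1"
    archive="$2"
    # Fail early if the directory is missing, e.g. when HADOOP_HOME is unset.
    if [ ! -d "$hadoop_home/conf" ]; then
        echo "no conf directory under '$hadoop_home'" >&2
        return 1
    fi
    # -C switches into the install directory so the archive contains only conf/...
    tar -czf "$archive" -C "$hadoop_home" conf
}
```

It could be called as pack_hadoop_conf "$HADOOP_HOME" hadoop-conf.tar.gz, and the resulting archive attached to the reply.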
Regards,
Djordje
________________________________________
From: Fu Bin-zhang [[email protected]]
Sent: Wednesday, April 11, 2012 4:11 PM
To: [email protected]
Subject: A question about "data analytics"

Hi all,

I am trying to run the data analytics benchmark. I followed the instructions on the CloudSuite website. Everything was OK until the 7th step, "create the category-based split of the Wikipedia dataset". The error is "java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit". I failed to find the answer with Google. Can anybody give me a hint? Thanks in advance.

The output is:
-------------------------------
hadoop@debian-98:~$ $MAHOUT_HOME/bin/mahout wikipediaDataSetCreator -i wikipedia/chunks -o wikipediainput -c $MAHOUT_HOME/examples/temp/categories.txt
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using HADOOP_HOME=/home/hadoop/hadoop-0.20.2
No HADOOP_CONF_DIR set, using /home/hadoop/hadoop-0.20.2/conf
MAHOUT-JOB: /home/hadoop/mahout-distribution-0.6/examples/target/mahout-examples-0.6-job.jar
12/04/11 06:55:10 WARN driver.MahoutDriver: No wikipediaDataSetCreator.props found on classpath, will use command-line arguments only
12/04/11 06:55:12 INFO bayes.WikipediaDatasetCreatorDriver: Input: wikipedia/chunks Out: wikipediainput Categories: /home/hadoop/mahout-distribution-0.6/examples/temp/categories.txt
12/04/11 06:55:13 INFO common.HadoopUtil: Deleting wikipediainput
12/04/11 06:55:13 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/04/11 06:55:15 INFO input.FileInputFormat: Total input paths to process : 7
12/04/11 06:55:17 INFO mapred.JobClient: Running job: job_201204110624_0002
12/04/11 06:55:18 INFO mapred.JobClient: map 0% reduce 0%
12/04/11 06:55:44 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000003_0, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:55:48 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000000_0, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:55:48 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000001_0, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:55:51 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000002_0, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:55:54 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000003_1, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:56:03 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000001_1, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:56:03 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000000_1, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:56:03 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000002_1, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:56:06 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000003_2, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:56:15 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000002_2, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:56:18 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000000_2, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:56:18 INFO mapred.JobClient: Task Id : attempt_201204110624_0002_m_000001_2, Status : FAILED
java.lang.ClassCastException: org.apache.hadoop.mapreduce.lib.input.FileSplit cannot be cast to org.apache.hadoop.mapred.InputSplit
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:323)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
12/04/11 06:56:24 INFO mapred.JobClient: Job complete: job_201204110624_0002
12/04/11 06:56:24 INFO mapred.JobClient: Counters: 3
12/04/11 06:56:24 INFO mapred.JobClient: Job Counters
12/04/11 06:56:24 INFO mapred.JobClient: Launched map tasks=14
12/04/11 06:56:24 INFO mapred.JobClient: Data-local map tasks=14
12/04/11 06:56:24 INFO mapred.JobClient: Failed map tasks=1
12/04/11 06:56:24 INFO driver.MahoutDriver: Program took 74439 ms (Minutes: 1.24065)
----------------
Fu, Binzhang
2012-04-11
