[ 
https://issues.apache.org/jira/browse/MAHOUT-91?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Ingersoll resolved MAHOUT-91.
-----------------------------------

    Resolution: Fixed

Committed.

> Wikipedia Example has incorrect input Key
> -----------------------------------------
>
>                 Key: MAHOUT-91
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-91
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>            Reporter: Grant Ingersoll
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 0.1
>
>
> Running the WikipediaDataSetCreator 
> {code}
>  bin/hadoop jar ~/projects/lucene/mahout/mahout-clean/examples/build/ 
> org.apache.mahout.examples.classifiers.cbayes.WikipediaDatasetCreator -i 
> wikipediadump -o wikipediainput -c 
> ~/projects/lucene/mahout/mahout-clean/examples/src/test/resources/country.txt 
> {code}
> yielded:
> 08/10/31 11:15:26 INFO mapred.JobClient: Task Id : 
> attempt_200810301619_0001_m_000000_0, Status : FAILED
> java.lang.ClassCastException: org.apache.hadoop.io.LongWritable cannot be 
> cast to org.apache.hadoop.io.Text
>         at 
> org.apache.mahout.classifier.bayes.WikipediaDatasetCreatorMapper.map(WikipediaDatasetCreatorMapper.java:41)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:47)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:227)
>         at 
> org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2207)
> The fix is:
> {code}
> Index: 
> src/main/java/org/apache/mahout/classifier/bayes/WikipediaDatasetCreatorMapper.java
> ===================================================================
> --- 
> src/main/java/org/apache/mahout/classifier/bayes/WikipediaDatasetCreatorMapper.java
>  (revision 709230)
> +++ 
> src/main/java/org/apache/mahout/classifier/bayes/WikipediaDatasetCreatorMapper.java
>  (working copy)
> @@ -20,6 +20,7 @@
>  import org.apache.commons.lang.StringEscapeUtils;
>  import org.apache.hadoop.io.DefaultStringifier;
>  import org.apache.hadoop.io.Text;
> +import org.apache.hadoop.io.LongWritable;
>  import org.apache.hadoop.mapred.JobConf;
>  import org.apache.hadoop.mapred.MapReduceBase;
>  import org.apache.hadoop.mapred.Mapper;
> @@ -39,11 +40,11 @@
>  import java.util.Set;
>  
>  public class WikipediaDatasetCreatorMapper extends MapReduceBase implements
> -    Mapper<Text, Text, Text, Text> {
> +    Mapper<LongWritable, Text, Text, Text> {
>  
>    private static Set<String> countries = null;
>    
> -  public void map(Text key, Text value,
> +  public void map(LongWritable key, Text value,
>        OutputCollector<Text, Text> output, Reporter reporter)
>        throws IOException {
>      String document = value.toString();
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

Reply via email to