Hi,
I can't sort this out! I'm using Hadoop CDH3u6 and trying to get ES to
index my data. I tried both raw JSON and MapWritable, and I always get the
same kind of error:
java.lang.Exception: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: [org.elasticsearch.hadoop.serialization.field.MapWritableFieldExtractor@35b5f7bd] cannot extract value from object [org.apache.hadoop.io.MapWritable@11c757a1]
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:349)
Caused by: org.elasticsearch.hadoop.EsHadoopIllegalArgumentException: [org.elasticsearch.hadoop.serialization.field.MapWritableFieldExtractor@35b5f7bd] cannot extract value from object [org.apache.hadoop.io.MapWritable@11c757a1]
    at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk$FieldWriter.write(TemplatedBulk.java:49)
    at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.writeTemplate(TemplatedBulk.java:101)
    at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.write(TemplatedBulk.java:77)
    at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:130)
    at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:161)
    at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:531)
    at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
    at my.jobs.index.IndexMapper.map(IndexMapper.java:27)
    at my.jobs.index.IndexMapper.map(IndexMapper.java:19)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:648)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:322)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:218)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:724)
Everything seems right to me. Here is the configuration of the indexing job:
Job job = new Job(getConf(), "Indexing into Elastic search.");
job.setJarByClass(getClass());
DomainRankDriver.loadLibrariesToDistributedCache(job);
Path input = new Path(args[0]);
FileInputFormat.addInputPath(job, input);
FileOutputFormat.setOutputPath(job, new Path(args[1]));
// Used by ES-hadoop to take Text as Json
job.setOutputFormatClass(EsOutputFormat.class);
// job.setMapOutputValueClass(Text.class);
job.setMapOutputValueClass(MapWritable.class);
job.setMapperClass(IndexMapper.class);
job.setNumReduceTasks(0);
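For reference, the ES-specific settings are set on the job configuration
roughly like this (the index/type name and node address below are
placeholders, not my exact values):

```java
Configuration conf = job.getConfiguration();
// Target index/type -- placeholder value
conf.set("es.resource", "myindex/mytype");
// Address of an Elasticsearch node -- placeholder value
conf.set("es.nodes", "localhost:9200");
// Recommended when writing to ES: disable speculative execution,
// so duplicate task attempts don't write the same documents twice
conf.setBoolean("mapred.map.tasks.speculative.execution", false);
conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
```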
And my simple mapper:
@Override
public void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
    MapWritable map = new MapWritable();
    map.put(new Text("test"), new Text("value"));
    context.write(new LongWritable(), map);
}
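One thing I wonder about: as far as I understand, MapWritableFieldExtractor
is only invoked when ES-Hadoop has to pull a field out of the document, for
example when es.mapping.id is set. If that is the case here (an assumption
on my part, since my full config isn't shown), the sketch below illustrates
the requirement; the "id" field name is hypothetical:

```java
// Assumption: the job configuration contains something like
//   conf.set("es.mapping.id", "id");
// Then every emitted MapWritable must contain that key; if it is
// missing, MapWritableFieldExtractor cannot extract the value and
// fails with the exact error shown above.
MapWritable map = new MapWritable();
map.put(new Text("test"), new Text("value"));
map.put(new Text("id"), new Text("doc-1")); // required if es.mapping.id=id
context.write(new LongWritable(), map);
```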
Any clue where I should look next? I'm stuck.
Thanks,
Aurelien
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/7f6545ab-d6d9-4fdf-8923-0b60e0ea5297%40googlegroups.com.