Hi Costin,
I didn't find a Jira link or any issue tracker on that help page.
Anyway, I created two small Java classes: one for the MR job and one IT test
that runs the job and reproduces the exception (so there's no need to bother
with the command line). Plus a dummy input file, just enough to get inside
the mapper. I attached them; if you need more info, let me know.
The dummy input file goes in src/test/resources/input/input.txt so the test
can read it.
I tested this in a Gradle project (inside my existing one), with the
elasticsearch-hadoop 2.0.2 dependency and Java 7.
You can see the exception thrown in the console when running it from the IDE
as a JUnit test.
Cheers,
Kamil.
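
P.S. For completeness, the workaround I mentioned below (Dec 15) was to make
the MapWritable homogeneous by writing every value as Text. A minimal sketch
of such a mapper (the class name, field names, and values are only
illustrative; this is not the attached code):

    // Workaround sketch: every value coerced to Text, so the MapWritable
    // only ever contains a single value type.
    private static class AllTextMapper extends Mapper<LongWritable, Text, NullWritable, MapWritable> {
        @Override
        public void map(LongWritable lineNr, Text line, Context context)
                throws IOException, InterruptedException {
            MapWritable esMap = new MapWritable();
            esMap.put(new Text("key"), new Text("key"));
            esMap.put(new Text("attr1"), new Text("val1"));
            esMap.put(new Text("attr2"), new Text(String.valueOf(1L)));   // was LongWritable
            esMap.put(new Text("attr3"), new Text(String.valueOf(true))); // was BooleanWritable
            esMap.put(new Text("attr4"), new Text(String.valueOf(3.0)));  // was DoubleWritable
            context.write(NullWritable.get(), esMap);
        }
    }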
On Tue, Dec 16, 2014 at 12:39 PM, Costin Leau <[email protected]> wrote:
>
> Having multiple types shouldn't be an issue - ES is a document store so
> it's pretty common to have different types.
> In other words, this is not the intended behavior - can you please create
> a small sample/snippet that reproduces the error and raise an issue for it [1]?
>
> Thanks!
>
> [1] http://www.elasticsearch.org/guide/en/elasticsearch/hadoop/master/troubleshooting.html
>
>
> On 12/15/14 6:03 PM, Kamil Dziublinski wrote:
>
>> Hi,
>>
>> I had only one jar on the classpath and none on the Hadoop cluster.
>> I did have different types of values in my MapWritable, though; it turns
>> out this was the problem.
>> The key was always Text, but the value in that map was Text, LongWritable,
>> BooleanWritable or DoubleWritable, depending on the type.
>> When I changed every value to Text, it started working.
>>
>> Is this intended behaviour?
>>
>> Cheers,
>> Kamil.
>>
>> On Friday, December 12, 2014 8:37:03 PM UTC+1, Costin Leau wrote:
>>
>> Hi,
>>
>> This error is typically tied to a classpath issue - make sure you have
>> only one elasticsearch-hadoop jar version in your classpath and on the
>> Hadoop cluster.
>>
>> On 12/12/14 5:56 PM, Kamil Dziublinski wrote:
>> > Hi guys,
>> >
>> > I am trying to run an MR job that reads from HDFS and stores into an
>> > Elasticsearch cluster.
>> >
>> > I am getting the following error:
>> > Error: org.elasticsearch.hadoop.serialization.EsHadoopSerializationException: Cannot handle type [class org.apache.hadoop.io.MapWritable], instance [org.apache.hadoop.io.MapWritable@3879429f] using writer [org.elasticsearch.hadoop.mr.WritableValueWriter@3fc8f1a2]
>> >     at org.elasticsearch.hadoop.serialization.builder.ContentBuilder.value(ContentBuilder.java:259)
>> >     at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.doWriteObject(TemplatedBulk.java:68)
>> >     at org.elasticsearch.hadoop.serialization.bulk.TemplatedBulk.write(TemplatedBulk.java:55)
>> >     at org.elasticsearch.hadoop.rest.RestRepository.writeToIndex(RestRepository.java:130)
>> >     at org.elasticsearch.hadoop.mr.EsOutputFormat$EsRecordWriter.write(EsOutputFormat.java:159)
>> >     at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
>> >     at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
>> >     at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
>> >     at com.teradata.cybershot.mr.es.userprofile.EsOnlineProfileMapper.map(EsOnlineProfileMapper.java:35)
>> >     at com.teradata.cybershot.mr.es.userprofile.EsOnlineProfileMapper.map(EsOnlineProfileMapper.java:20)
>> >     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
>> >     at org.apache.hadoop.mapreduce.lib.input.DelegatingMapper.run(DelegatingMapper.java:55)
>> >     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
>> >     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
>> >     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:167)
>> >     at java.security.AccessController.doPrivileged(Native Method)
>> >     at javax.security.auth.Subject.doAs(Subject.java:415)
>> >     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1554)
>> >     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:162)
>> >
>> > We are using CDH 5.1.0 and the es-hadoop 2.0.2 dependency.
>> >
>> > I have this set in my job configuration:
>> > job.setOutputFormatClass(EsOutputFormat.class);
>> > job.setMapOutputValueClass(MapWritable.class);
>> >
>> > together with the nodes and resource props, as described on the ES page.
>> >
>> > In my mapper I simply call context.write(NullWritable.get(), esMap),
>> > where esMap is an org.apache.hadoop.io.MapWritable.
>> >
>> > I don't know why it's failing, as everything looks OK to me. Maybe you
>> > will have some ideas.
>> >
>> > Thanks in advance,
>> > Kamil.
>> >
>>
>> --
>> Costin
>>
>>
>
> --
> Costin
>
>
input.txt (the dummy input file):
test1
test2
test3
// EsOutputJob.java: MR job that writes MapWritable documents to
// Elasticsearch through EsOutputFormat.
package es.test;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.Properties;

import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.CommandLineParser;
import org.apache.commons.cli.HelpFormatter;
import org.apache.commons.cli.OptionBuilder;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.PosixParser;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.serde2.io.DoubleWritable;
import org.apache.hadoop.io.BooleanWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.MapWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.util.Tool;
import org.elasticsearch.hadoop.mr.EsOutputFormat;

public class EsOutputJob extends Configured implements Tool {

    static final String MAPPING_ID = "key";
    static final String INPUT_PATH = "input.path";

    private final Properties props;

    public EsOutputJob() {
        super();
        props = new Properties();
    }

    public EsOutputJob(Properties props) {
        super();
        this.props = props;
    }

    @SuppressWarnings("static-access")
    @Override
    public int run(String[] args) throws Exception {
        // -p loads properties from the classpath, -P from an external file,
        // -D sets/overrides individual properties
        Options options = new Options();
        options.addOption("p", true, "properties filename from the classpath");
        options.addOption("P", true, "external properties filename");
        options.addOption(OptionBuilder.withArgName("property=value").hasArgs(2).withValueSeparator()
                .withDescription("use value for given property").create("D"));
        CommandLineParser parser = new PosixParser();
        CommandLine cmd = parser.parse(options, args);
        if (!(cmd.hasOption('p') || cmd.hasOption('P'))) {
            HelpFormatter formatter = new HelpFormatter();
            formatter.printHelp(this.getClass().getSimpleName(), options);
            return 1;
        }
        if (cmd.hasOption('p')) {
            props.load(this.getClass().getClassLoader().getResourceAsStream(cmd.getOptionValue('p')));
        }
        if (cmd.hasOption('P')) {
            File file = new File(cmd.getOptionValue('P'));
            FileInputStream fStream = new FileInputStream(file);
            props.load(fStream);
        }
        props.putAll(cmd.getOptionProperties("D"));
        return run() ? 0 : 1;
    }

    boolean run() throws Exception {
        Configuration conf = getConf();
        disableSpeculativeExecution(conf);
        // es-hadoop settings: target index/type and the field used as document id
        conf.set("es.resource", "test/rows");
        conf.set("es.mapping.id", MAPPING_ID);
        Job job = Job.getInstance(conf, EsOutputJob.class.getSimpleName());
        job.setJarByClass(EsOutputJob.class);
        job.setInputFormatClass(TextInputFormat.class);
        String path = props.getProperty(INPUT_PATH);
        Path inputPath = new Path(path);
        TextInputFormat.setInputPaths(job, inputPath);
        job.setNumReduceTasks(0); // map-only job
        job.setMapperClass(EsOutputMapper.class);
        job.setOutputFormatClass(EsOutputFormat.class);
        job.setMapOutputValueClass(MapWritable.class);
        // submits and waits
        return job.waitForCompletion(true);
    }

    private void disableSpeculativeExecution(Configuration conf) {
        conf.setBoolean("mapred.map.tasks.speculative.execution", false);
        conf.setBoolean("mapred.reduce.tasks.speculative.execution", false);
    }

    private static class EsOutputMapper extends Mapper<LongWritable, Text, NullWritable, MapWritable> {

        @Override
        public void map(LongWritable lineNr, Text line, Context context) throws IOException, InterruptedException {
            MapWritable esMap = new MapWritable();
            esMap.put(new Text(EsOutputJob.MAPPING_ID), new Text("key"));
            esMap.put(new Text("attr1"), new Text("val1"));
            // uncomment these to make it run without the exception
            // esMap.put(new Text("attr2"), new Text("val2"));
            // esMap.put(new Text("attr3"), new Text("val3"));
            // esMap.put(new Text("attr4"), new Text("val4"));
            // ... and comment out these to make it run without the exception
            esMap.put(new Text("attr2"), new LongWritable(1L));
            esMap.put(new Text("attr3"), new BooleanWritable(true));
            // note: DoubleWritable here is the Hive one
            // (org.apache.hadoop.hive.serde2.io), not org.apache.hadoop.io
            esMap.put(new Text("attr4"), new DoubleWritable(3.0));
            context.write(NullWritable.get(), esMap);
        }
    }
}
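
(For the record: EsOutputJob has no main method, since I only launch it
through the IT below. If you want to run it from the command line as well,
a minimal entry point via Hadoop's ToolRunner would look roughly like this;
it is not part of the attached file:)

    // hypothetical main for EsOutputJob
    public static void main(String[] args) throws Exception {
        int exitCode = org.apache.hadoop.util.ToolRunner.run(new Configuration(), new EsOutputJob(), args);
        System.exit(exitCode);
    }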
// EsOutputJobIT.java: runs the job above locally against the dummy input file.
package es.test;

import java.io.File;
import java.util.Properties;

import org.apache.hadoop.conf.Configuration;
import org.junit.Test;

public class EsOutputJobIT {

    @Test
    public void testWithSnapshot() throws Exception {
        String inputPath = getClass().getResource("/input/input.txt").getPath();
        Configuration conf = new Configuration();
        executeJob(conf, new File(inputPath));
    }

    private void executeJob(Configuration conf, File inputPath) throws Exception {
        Properties props = new Properties();
        props.setProperty(EsOutputJob.INPUT_PATH, inputPath.getAbsolutePath());
        EsOutputJob producer = new EsOutputJob(props);
        // run the MR job in-process instead of submitting it to a cluster
        conf.set("mapreduce.framework.name", "local");
        producer.setConf(conf);
        producer.run();
    }
}
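
Note: even with mapreduce.framework.name=local the job still needs a
reachable Elasticsearch node; es-hadoop defaults to localhost:9200. If your
cluster lives elsewhere, set es.nodes on the Configuration before running,
for example (the address is just a placeholder for your setup):

    // assumes an Elasticsearch node at this host:port; adjust as needed
    conf.set("es.nodes", "localhost:9200");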