Hi,
All I'm doing is building a map and passing that to Gson for serialization.
A snippet from my map method:
logEntryMap.put("cs(User-Agent)", values[9]);
context.write(NullWritable.get(), new Text(gson.toJson(logEntryMap)));
values[] is a String array. Everything that goes into the map that gets
serialized is a string.
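
For readers who want to reproduce the shape of the JSON being emitted, here is a minimal, dependency-free sketch of the step described above. It is an illustration under assumptions, not the actual job code: the class LogEntryJson and the helpers buildEntry, toJson and quote are hypothetical names, and toJson is a hand-written stand-in for gson.toJson that handles String values only. The Hadoop context.write call is left out so the sketch runs on its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: every name here is an assumption, not from the job.
final class LogEntryJson {

    // Builds the map for one parsed log line; only the field shown in the
    // snippet above is filled in.
    static Map<String, String> buildEntry(String[] values) {
        Map<String, String> logEntryMap = new LinkedHashMap<>();
        logEntryMap.put("cs(User-Agent)", values[9]);
        return logEntryMap;
    }

    // Stand-in for gson.toJson(map): every value is written as a JSON string.
    // Null values are skipped here for simplicity.
    static String toJson(Map<String, String> map) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : map.entrySet()) {
            if (e.getValue() == null) {
                continue;
            }
            if (!first) {
                sb.append(',');
            }
            first = false;
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
        }
        return sb.append('}').toString();
    }

    // Wraps a string in double quotes and escapes the characters JSON requires.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n");  break;
                case '\r': sb.append("\\r");  break;
                case '\t': sb.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    public static void main(String[] args) {
        String[] values = new String[10];
        java.util.Arrays.fill(values, "-");
        values[9] = "Mozilla/5.0";
        System.out.println(toJson(buildEntry(values)));
    }
}
```

The point of the sketch is that every value leaves the mapper as a quoted JSON string, so any further type interpretation of a field happens after the document has left the job.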
I do have es.input.json set to true. This failure doesn't occur until more
than 100,000,000 records are in the index, so it's happening late in the load
process. The part that I find strange is that the field in question isn't
in my mapping, and I've not touched the default mapping. I'm not sure why
it would try to parse it as anything other than a string.
I'll turn on TRACE logging and see what happens.
Brian
On Wed, Mar 19, 2014 at 5:35 PM, Costin Leau <[email protected]> wrote:
> Hi,
>
> How do you pass the json to es-hadoop? Do you have an example? By the way,
> you can enable TRACE logging on org.elasticsearch.hadoop and see everything
> that es-hadoop does, including the data that goes over the wire.
> My guess is that the conversion of logs to JSON creates some extra
> artifacts which are later interpreted as Writable objects (instead of raw
> JSON) by es-hadoop.
> Make sure you tell es-hadoop that its source is JSON (by setting
> es.input.json to true).
> The logs will likely confirm (or not) the above :)
>
> Cheers,
>
>
> On 3/19/14 11:14 PM, Brian Stempin wrote:
>
>> Hi List,
>> I have an ES cluster that takes in some data from our logs. We use
>> Hadoop to parse the individual log entries into JSON
>> strings and then bulk-insert them using ES's output format. For whatever
>> reason, ES attempts to parse base64 strings as
>> dates and fails. Here's a line from one of my Hadoop logs:
>>
>> java.lang.IllegalStateException: Found unrecoverable error [Bad
>> Request(400) - MapperParsingException[failed to parse [csUriParams.d]];
>> nested: MapperParsingException[failed to parse date field [REDACTED BASE64
>> STRING], tried both date format [dateOptionalTime], and timestamp number
>> with locale []]; nested: IllegalArgumentException[Invalid format: "
>> Y2lkPURFJml0ZW1zPWE2NTJjLXgxZTFj..."]; ]; Bailing out..
>>
>> at org.elasticsearch.hadoop.rest.RestClient.retryFailedEntries(RestClient.java:145)
>> at org.elasticsearch.hadoop.rest.RestClient.bulk(RestClient.java:120)
>> at org.elasticsearch.hadoop.rest.RestRepository.sendBatch(RestRepository.java:147)
>>
>> <SNIP>
>>
>>
>> csUriParams.d does not appear in my mapping, so I never explicitly asked
>> for it to be treated as a date.
>>
>> Any idea why ES is trying to treat it as a date?
>>
>> Thanks,
>> Brian
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to
>> [email protected] <mailto:elasticsearch+
>> [email protected]>.
>>
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/49e5fe0b-
>> cec3-4914-b8d6-99440dd5fb69%40googlegroups.com
>> <https://groups.google.com/d/msgid/elasticsearch/49e5fe0b-
>> cec3-4914-b8d6-99440dd5fb69%40googlegroups.com?utm_medium=
>> email&utm_source=footer>.
>>
>> For more options, visit https://groups.google.com/d/optout.
>>
> --
> Costin
>
>