[
https://issues.apache.org/jira/browse/FLUME-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14150783#comment-14150783
]
Otis Gospodnetic commented on FLUME-2476:
-----------------------------------------
I checked with Pavel offline to see what this is about, and it comes down to
two things:
# There is a bug here: JSON in the event body is handled incorrectly.
# A question about how to fix it: should he keep the existing decode+encode
approach and just fix the bug, or should he also take the opportunity to
rewrite things so that decode+encode is not needed (and, of course, make sure
the incorrect JSON handling is fixed in the new implementation)?
Hoping one of the committers can provide guidance.
> ContentBuilderUtil.appendField incorrectly manages json-like data
> -----------------------------------------------------------------
>
> Key: FLUME-2476
> URL: https://issues.apache.org/jira/browse/FLUME-2476
> Project: Flume
> Issue Type: Bug
> Components: Sinks+Sources
> Affects Versions: v1.5.0.1
> Environment: elasticsearch 1.1.0, elasticsearch 0.90.13
> Reporter: Pavel Zalunin
> Priority: Blocker
>
> org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer.getContentBuilder
> returns an incorrect value when the Event body contains well-formed JSON
> (possibly as a substring).
> Example:
> {code}
> ElasticSearchDynamicSerializer ser = new ElasticSearchDynamicSerializer();
> Event event = new SimpleEvent();
> event.setBody("{\"true\":\"false\"}".getBytes());
> System.out.println(ser.getContentBuilder(event).string());
> //prints:
> //{"body":"org.elasticsearch.common.xcontent.XContentBuilder@31a5fdb9"}
> {code}
> I tried to trace the origin of the problem and found this chunk of code
> (flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java):
> {code}
> public static void addComplexField(XContentBuilder builder, String fieldName,
>     XContentType contentType, byte[] data) throws IOException {
>   XContentParser parser = null;
>   try {
>     XContentBuilder tmp = jsonBuilder();
>     parser = XContentFactory.xContent(contentType).createParser(data);
>     parser.nextToken();
>     tmp.copyCurrentStructure(parser);
>     // here field(String, String) is called, because there is no
>     // field(String, XContentBuilder) method;
>     // maybe tmp.string() should be here instead?
>     builder.field(fieldName, tmp);
> {code}
> This makes it impossible to send any string that contains JSON to the
> elasticsearch sink using this serializer.
> Maybe I'm wrong, but what is the benefit of decoding and then re-encoding a
> chunk of JSON data?
> If it is really needed, maybe an option could be added to
> ElasticSearchDynamicSerializer that forces it to use
> ContentBuilderUtil.addSimpleField instead of ContentBuilderUtil.appendField?
> I can volunteer to create a patch once it is decided what the best way to
> avoid this issue is.
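
The symptom in the report (a class name followed by "@" and a hash inside the
"body" field) is what Java's default Object.toString() produces. Below is a
minimal, self-contained sketch of how that can happen. MiniBuilder and all of
its methods are hypothetical stand-ins written for this illustration; they are
not the Elasticsearch XContentBuilder API. The assumption that the real call
falls through to a catch-all overload is inferred from the printed output, not
confirmed against the Elasticsearch source.

```java
// Hypothetical illustration only: MiniBuilder is NOT the Elasticsearch API.
class ToStringFallbackDemo {

    /** Tiny stand-in for a JSON builder with a String overload and a catch-all. */
    static final class MiniBuilder {
        private final StringBuilder out = new StringBuilder("{");
        private boolean first = true;

        private void name(String n) {
            if (!first) out.append(',');
            first = false;
            out.append('"').append(n).append("\":");
        }

        /** String overload: writes the value as a quoted, escaped JSON string. */
        MiniBuilder field(String name, String value) {
            name(name);
            String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
            out.append('"').append(escaped).append('"');
            return this;
        }

        /** Catch-all overload: any other type is written via its toString(). */
        MiniBuilder field(String name, Object value) {
            return field(name, String.valueOf(value));
        }

        /** Writes an already-serialized JSON fragment verbatim, without quoting. */
        MiniBuilder rawField(String name, String json) {
            name(name);
            out.append(json);
            return this;
        }

        /** Returns the serialized document. MiniBuilder does not override toString(). */
        String string() {
            return out + "}";
        }
    }

    /** Plays the role of the temporary builder holding the parsed event body. */
    static MiniBuilder tmp() {
        return new MiniBuilder().field("true", "false");
    }

    /** Passing the builder itself selects the Object overload, so toString() is written. */
    static String buggy() {
        return new MiniBuilder().field("body", tmp()).string();
    }

    /** Passing tmp.string() writes the JSON text as an escaped string value. */
    static String asString() {
        return new MiniBuilder().field("body", tmp().string()).string();
    }

    /** Writing the fragment raw nests it as a JSON object instead. */
    static String asRaw() {
        return new MiniBuilder().rawField("body", tmp().string()).string();
    }

    public static void main(String[] args) {
        System.out.println(buggy());
        System.out.println(asString());
        System.out.println(asRaw());
    }
}
```

In this sketch, buggy() reproduces the shape of the reported output, asString()
corresponds to the reporter's "maybe tmp.string()" suggestion, and asRaw() shows
the alternative of nesting the parsed structure. Which behaviour the sink should
have is the open question for the committers.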