Pavel Zalunin created FLUME-2476:
------------------------------------

             Summary: ContentBuilderUtil.appendField incorrectly manages json-like data
                 Key: FLUME-2476
                 URL: https://issues.apache.org/jira/browse/FLUME-2476
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
    Affects Versions: v1.5.0.1
         Environment: elasticsearch 1.1.0, elasticsearch 0.90.13
            Reporter: Pavel Zalunin
            Priority: Blocker


org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer.getContentBuilder
returns an incorrect value when the Event body contains well-formed JSON
(possibly as a substring).

Example:
{code}
ElasticSearchDynamicSerializer ser = new ElasticSearchDynamicSerializer();
Event event = new SimpleEvent();
event.setBody("{\"true\":\"false\"}".getBytes());
System.out.println(ser.getContentBuilder(event).string());
//prints:
//{"body":"org.elasticsearch.common.xcontent.XContentBuilder@31a5fdb9"}
{code} 

I tried to find the origin of the problem and found this chunk of code
(flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java):
{code}
public static void addComplexField(XContentBuilder builder, String fieldName,
      XContentType contentType, byte[] data) throws IOException {
    XContentParser parser = null;
    try {
      XContentBuilder tmp = jsonBuilder();
      parser = XContentFactory.xContent(contentType).createParser(data);
      parser.nextToken();
      tmp.copyCurrentStructure(parser);
      // here field(String, String) is called,
      // because there is no method (String, XContentBuilder);
      // maybe tmp.string() should be here instead?
      builder.field(fieldName, tmp);
{code}
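
The symptom can be illustrated without Elasticsearch on the classpath. The sketch below is a standalone stand-in, not the Elasticsearch API: MiniBuilder, passBuilder and passContent are hypothetical names, and it assumes the real builder ends up writing value.toString() for an argument type it has no dedicated overload for. Under that assumption, passing the builder object produces the ClassName@hash text seen in the example, while passing its content (the tmp.string() idea) produces the JSON as an escaped string.

```java
final class ToStringFallbackDemo {

    /** Hypothetical stand-in for a JSON builder; this is NOT the Elasticsearch class. */
    static final class MiniBuilder {
        private final StringBuilder out = new StringBuilder();

        /** Appends already-serialized JSON text. */
        MiniBuilder raw(String json) {
            out.append(json);
            return this;
        }

        /** Generic overload: any value is written as a string via toString(). */
        MiniBuilder field(String name, Object value) {
            String text = value.toString().replace("\\", "\\\\").replace("\"", "\\\"");
            out.append("{\"").append(name).append("\":\"").append(text).append("\"}");
            return this;
        }

        /** Returns the accumulated content; toString() is deliberately not overridden. */
        String string() {
            return out.toString();
        }
    }

    /** Passes the builder object itself, as in the quoted code. */
    static String passBuilder(String body) {
        MiniBuilder tmp = new MiniBuilder().raw(body);
        return new MiniBuilder().field("body", tmp).string();
    }

    /** Passes the builder's content, as suggested with tmp.string(). */
    static String passContent(String body) {
        MiniBuilder tmp = new MiniBuilder().raw(body);
        return new MiniBuilder().field("body", tmp.string()).string();
    }

    public static void main(String[] args) {
        String body = "{\"true\":\"false\"}";
        System.out.println(passBuilder(body)); // object identity text, not the JSON
        System.out.println(passContent(body)); // the JSON, escaped inside a string
    }
}
```

Note that with the second variant the body is indexed as one string value rather than as a nested object, so it may or may not be what the serializer is meant to produce.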

This makes it impossible to send any string that contains JSON to the
Elasticsearch sink using this serializer.

Maybe I'm wrong, but what is the benefit of decoding and then re-encoding a
chunk of JSON data?

If it is really needed, maybe an option could be added to
ElasticSearchDynamicSerializer that forces the use of
ContentBuilderUtil.addSimpleField instead of ContentBuilderUtil.appendField?
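
A rough sketch of how such an option might be read, only to make the idea concrete. The property name forceSimpleField and the class below are hypothetical; nothing like this exists in the serializer today, and the default is chosen to keep the current behaviour.

```java
import java.util.HashMap;
import java.util.Map;

final class ForceSimpleFieldSketch {

    /** Hypothetical property name; not an existing Flume option. */
    static final String FORCE_SIMPLE = "forceSimpleField";

    /** Reads the hypothetical flag, defaulting to the current behaviour (false). */
    static boolean forceSimple(Map<String, String> config) {
        return Boolean.parseBoolean(config.getOrDefault(FORCE_SIMPLE, "false"));
    }

    /** Names the ContentBuilderUtil method the serializer would call for the body. */
    static String chooseMethod(Map<String, String> config) {
        return forceSimple(config) ? "addSimpleField" : "appendField";
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        System.out.println(chooseMethod(config)); // flag absent
        config.put(FORCE_SIMPLE, "true");
        System.out.println(chooseMethod(config)); // flag set
    }
}
```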

I can volunteer to create a patch once it is decided what the best way to
avoid this issue is.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
