[ 
https://issues.apache.org/jira/browse/FLUME-2476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14151132#comment-14151132
 ] 

Edward Sargisson commented on FLUME-2476:
-----------------------------------------

[~pavel.zalunin], [~otis]

I'm not a committer but I take an interest in this code.

A patch would be welcome in this area, but I would recommend care to ensure 
that every case is covered: a lot of comments and defects come from this spot. 
For example, the code needs to handle JSON data, text data and XML data, as 
well as errors arising from any of these. See FLUME-2126 for an example.
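
As a rough illustration of that coverage, here is a minimal sketch of a checklist 
a patch's tests might exercise. Everything in it is an assumption made for 
illustration: the class name, the sample bodies and the Function-based 
serializer hook are hypothetical and are not part of Flume.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical helper, not part of Flume: one sample body per input category.
class SerializerCaseSketch {
    // The bodies below are made up for this sketch.
    static Map<String, String> sampleBodies() {
        Map<String, String> cases = new LinkedHashMap<>();
        cases.put("json", "{\"true\":\"false\"}");
        cases.put("json-substring", "prefix {\"true\":\"false\"} suffix");
        cases.put("text", "plain log line");
        cases.put("xml", "<event><level>INFO</level></event>");
        cases.put("malformed-json", "{\"unterminated\":");
        return cases;
    }

    // Runs the serializer under test over every sample and returns the
    // names of the cases where it threw instead of producing output.
    static List<String> failingCases(Function<byte[], String> serializer) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, String> c : sampleBodies().entrySet()) {
            try {
                serializer.apply(c.getValue().getBytes(StandardCharsets.UTF_8));
            } catch (RuntimeException e) {
                failed.add(c.getKey());
            }
        }
        return failed;
    }
}
```

A real test would pass in a function that wraps the serializer being patched 
and would also check the produced output, not only that nothing was thrown.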


> ContentBuilderUtil.appendField incorrectly manages json-like data
> -----------------------------------------------------------------
>
>                 Key: FLUME-2476
>                 URL: https://issues.apache.org/jira/browse/FLUME-2476
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.5.0.1
>         Environment: elasticsearch 1.1.0, elasticsearch 0.90.13
>            Reporter: Pavel Zalunin
>            Priority: Blocker
>
> There is a problem in 
> org.apache.flume.sink.elasticsearch.ElasticSearchDynamicSerializer.getContentBuilder:
> it returns an incorrect value when the Event body contains well-formed JSON 
> (possibly only as a substring).
> Example:
> {code}
> ElasticSearchDynamicSerializer ser = new ElasticSearchDynamicSerializer();
> Event event = new SimpleEvent();
> event.setBody("{\"true\":\"false\"}".getBytes());
> System.out.println(ser.getContentBuilder(event).string());
> //prints:
> //{"body":"org.elasticsearch.common.xcontent.XContentBuilder@31a5fdb9"}
> {code} 
> I tried to find the origin of the problem and found this chunk of code 
> (flume-ng-sinks/flume-ng-elasticsearch-sink/src/main/java/org/apache/flume/sink/elasticsearch/ContentBuilderUtil.java):
> {code}
> public static void addComplexField(XContentBuilder builder, String fieldName,
>       XContentType contentType, byte[] data) throws IOException {
>     XContentParser parser = null;
>     try {
>       XContentBuilder tmp = jsonBuilder();
>       parser = XContentFactory.xContent(contentType).createParser(data);
>       parser.nextToken();
>       tmp.copyCurrentStructure(parser);
>       // Here field(String, String) is called, because there is no
>       // method(String, XContentBuilder).
>       // Maybe tmp.string() should be here instead?
>       builder.field(fieldName, tmp);
> {code}
> This makes it impossible to send any string that contains JSON to the 
> Elasticsearch sink using this serializer.
> Maybe I'm wrong, but what is the benefit of decoding and then re-encoding a 
> chunk of JSON data?
> If it is really needed, maybe an option could be added to 
> ElasticSearchDynamicSerializer that forces it to use 
> ContentBuilderUtil.addSimpleField instead of ContentBuilderUtil.appendField?
> I can volunteer to create a patch once it is decided what the best way to 
> avoid this issue is.
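
The symptom in the issue above (an object identity such as 
XContentBuilder@31a5fdb9 ending up as the field value) is what one would expect 
when a builder is passed to a generic overload that falls back to toString(). 
The sketch below reproduces that effect with a toy class. ToyBuilder and 
FieldOverloadDemo are hypothetical stand-ins and do not use the Elasticsearch 
API, so this only illustrates the mechanism and does not confirm which 
overload Flume actually hits.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical stand-in for a JSON builder; this is NOT the Elasticsearch API.
class ToyBuilder {
    private final Map<String, String> fields = new LinkedHashMap<>();

    // String overload: the value is escaped and quoted.
    ToyBuilder field(String name, String value) {
        String escaped = value.replace("\\", "\\\\").replace("\"", "\\\"");
        fields.put(name, "\"" + escaped + "\"");
        return this;
    }

    // Catch-all overload: any other object is stringified with toString().
    ToyBuilder field(String name, Object value) {
        return field(name, String.valueOf(value));
    }

    // Renders the collected fields as a flat JSON object.
    String string() {
        StringJoiner out = new StringJoiner(",", "{", "}");
        fields.forEach((k, v) -> out.add("\"" + k + "\":" + v));
        return out.toString();
    }
}

class FieldOverloadDemo {
    static ToyBuilder nested() {
        return new ToyBuilder().field("true", "false");
    }

    // Resolves to field(String, Object), so the nested builder is stored
    // as its default toString() form, "ToyBuilder@<hash>".
    static String viaObjectOverload() {
        return new ToyBuilder().field("body", nested()).string();
    }

    // Passing the rendered string (the tmp.string() idea from the issue)
    // resolves to field(String, String) and stores escaped JSON text.
    static String viaStringOverload() {
        return new ToyBuilder().field("body", nested().string()).string();
    }

    public static void main(String[] args) {
        System.out.println(viaObjectOverload());
        System.out.println(viaStringOverload());
    }
}
```

In this toy, passing the rendered string stores the JSON as an escaped string 
value and not as a nested object. Whether that is the desired result for the 
sink is a separate design question.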



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
