-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/47098/
-----------------------------------------------------------
(Updated May 24, 2016, 4:11 p.m.)
Review request for Flume.
Changes
-------
Added the documentation change.
Repository: flume-git
Description
-------
When a header value in the JSON payload is null, the handler throws a
NullPointerException and Flume stops processing further events.
For example:
[{
  "headers" : {
    "timestamp" : "434324343",
    "host" : null
  },
  "body" : "random_body"
}]
The fix works as follows:
1. If a header has a null value in the JSON, Flume replaces it with a
replacement string.
2. The default replacement string is an empty string.
3. To override the default, set the "handler.nullReplacementHeader" property
in the Flume config.
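
The three steps above can be sketched roughly as follows. This is an illustration only, not the code from the patch: the class and method names are hypothetical, and a plain Map stands in for the Flume handler configuration. Only the property name "handler.nullReplacementHeader" and the empty-string default come from the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; the names below are not taken from the patch.
class NullHeaderReplacementSketch {

    // Property name as given in the description. How the real handler
    // receives it from the Flume config is not modelled here.
    static final String CONFIG_KEY = "handler.nullReplacementHeader";

    // Step 2: the default replacement is an empty string.
    static final String DEFAULT_REPLACEMENT = "";

    // Step 3: a configured value overrides the default.
    static String resolveReplacement(Map<String, String> config) {
        String configured = config.get(CONFIG_KEY);
        return configured != null ? configured : DEFAULT_REPLACEMENT;
    }

    // Step 1: copy the headers, substituting the replacement for null values.
    static Map<String, String> replaceNullHeaders(Map<String, String> headers,
                                                  String replacement) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : headers.entrySet()) {
            String value = entry.getValue();
            result.put(entry.getKey(), value == null ? replacement : value);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("timestamp", "434324343");
        headers.put("host", null);

        Map<String, String> config = new LinkedHashMap<>();
        // Property not set: null becomes the empty string.
        System.out.println(replaceNullHeaders(headers, resolveReplacement(config)));

        config.put(CONFIG_KEY, "abc");
        // Property set to "abc": null becomes "abc".
        System.out.println(replaceNullHeaders(headers, resolveReplacement(config)));
    }
}
```

The sketch returns a new map rather than mutating its input, so the original headers stay available for logging or debugging. Whether the patch does the same is not stated here.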
Diffs (updated)
-----
flume-ng-core/src/main/java/org/apache/flume/source/http/HTTPSource.java
b520b03
flume-ng-core/src/main/java/org/apache/flume/source/http/HTTPSourceConfigurationConstants.java
86caf7d
flume-ng-core/src/main/java/org/apache/flume/source/http/JSONHandler.java
197f66a
flume-ng-core/src/main/java/org/apache/flume/source/http/NullHeaderReplacement.java
PRE-CREATION
flume-ng-core/src/test/java/org/apache/flume/source/http/TestJSONHandler.java
455781c
flume-ng-doc/sphinx/FlumeUserGuide.rst e04fd50
Diff: https://reviews.apache.org/r/47098/diff/
Testing
-------
The following are the test cases:
1. A header has a null value in the JSON and handler.nullReplacementHeader is
not set in the Flume config. The default value (an empty string) replaces the null.
[{
  "headers" : {
    "timestamp" : "434324343",
    "host" : null
  },
  "body" : "random_body"
}]
Output in HDFS: {timestamp=434324343, host=} random_body
2. The header is not null in the JSON and handler.nullReplacementHeader is not
set in the Flume config. The replacement logic is not applied.
[{
  "headers" : {
    "timestamp" : "434324343",
    "host" : 1
  },
  "body" : "random_body"
}]
Output in HDFS: {timestamp=434324343, host=1} random_body
3. A header has a null value in the JSON and handler.nullReplacementHeader=abc
is set in the Flume config. The null header value is replaced by abc.
[{
  "headers" : {
    "timestamp" : "434324343",
    "host" : null
  },
  "body" : "random_body"
}]
Output in HDFS: {timestamp=434324343, host=abc} random_body
4. A header has a null value in the JSON and handler.nullReplacementHeader=1 is
set in the Flume config. The null header value is replaced by the string "1".
[{
  "headers" : {
    "timestamp" : "434324343",
    "host" : null
  },
  "body" : "random_body"
}]
Output in HDFS: {timestamp=434324343, host=1} random_body
5. The header is not null in the JSON and handler.nullReplacementHeader is also
set in the Flume config. The replacement logic is not applied.
[{
  "headers" : {
    "timestamp" : "434324343",
    "host" : 1
  },
  "body" : "random_body"
}]
Output in HDFS: {timestamp=434324343, host=1} random_body
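
The five cases above can also be written as a small table-driven check. The sketch below is not the contents of TestJSONHandler.java. The helper names are hypothetical, and the replacement value "abc" used for case 5 is an assumption, since the description does not say which value was configured there. Header values are modelled as Object so that the numeric host of cases 2 and 5 can be represented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the five test cases; not the patch's test code.
class NullHeaderCasesSketch {

    // Converts parsed JSON header values to strings; null becomes the replacement.
    static Map<String, String> toEventHeaders(Map<String, Object> jsonHeaders,
                                              String replacement) {
        Map<String, String> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : jsonHeaders.entrySet()) {
            Object value = entry.getValue();
            out.put(entry.getKey(), value == null ? replacement : String.valueOf(value));
        }
        return out;
    }

    // Builds the headers used in every case, varying only the "host" value.
    static Map<String, Object> headersWithHost(Object host) {
        Map<String, Object> headers = new LinkedHashMap<>();
        headers.put("timestamp", "434324343");
        headers.put("host", host);
        return headers;
    }

    public static void main(String[] args) {
        // Columns: host value in JSON, configured replacement (null = not set),
        // expected headers as they appear in the outputs listed above.
        Object[][] cases = {
            {null, null,  "{timestamp=434324343, host=}"},
            {1,    null,  "{timestamp=434324343, host=1}"},
            {null, "abc", "{timestamp=434324343, host=abc}"},
            {null, "1",   "{timestamp=434324343, host=1}"},
            {1,    "abc", "{timestamp=434324343, host=1}"}, // "abc" is assumed
        };
        for (Object[] c : cases) {
            String replacement = c[1] == null ? "" : (String) c[1];
            String actual = toEventHeaders(headersWithHost(c[0]), replacement).toString();
            if (!actual.equals(c[2])) {
                throw new AssertionError("expected " + c[2] + " but got " + actual);
            }
            System.out.println(actual + " random_body");
        }
    }
}
```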
File Attachments
----------------
flume-2620
https://reviews.apache.org/media/uploaded/files/2016/05/09/0eff1d56-caf3-4d36-bb45-6b9e7fd6a1ff__FLUME-2620-1.patch
Thanks,
neerja khattar