Hello! I am Chinese and my English is poor, so please bear with me. My Flume code is on GitHub: https://github.com/hotfey/flume.ng.1.5.2
I run four source Flume agents on four machines. Each one reads log files as its source and uses an Avro sink. One collector agent on another machine uses an Avro source and an HDFS sink.

I wrote a class that implements RegexExtractorInterceptorSerializer; it is attached (and also on GitHub). Every line in my log files starts with a timestamp, so every event does too. I implemented RegexExtractorInterceptorSerializer only so that the HDFS directories are created according to that timestamp. For example, the timestamp 28/Jul/2015 should create the HDFS directory .../2015/07/28.

But when I start all the agents, I do not know how to make sure my implementation is thread-safe. For example, if the timestamp from one source machine is 28/Jul/2015 and the timestamp from another is 21/Jun/2015, the directory that gets created may be any of .../2015/06/21, .../2015/06/28, .../2015/07/21 or .../2015/07/28.

Can you give me some advice about this? That's all. Thanks, and best wishes!
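
To make the conversion described above concrete, here is a minimal sketch of one thread-safe way to turn 28/Jul/2015 into 2015/07/28. It is not the code from the repository. The class name TimestampPath and the method toHdfsPath are hypothetical, and it assumes Java 8 or later and that the regex has already extracted only the dd/MMM/yyyy part of the line. The idea is to keep no mutable state shared between threads: DateTimeFormatter is immutable, unlike SimpleDateFormat, which is not safe to share across threads and can mix up fields from different calls.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper, not part of Flume: converts a log timestamp into a date path.
final class TimestampPath {
    // DateTimeFormatter instances are immutable, so sharing them between threads is safe.
    private static final DateTimeFormatter IN =
            DateTimeFormatter.ofPattern("dd/MMM/yyyy", Locale.ENGLISH);
    private static final DateTimeFormatter OUT =
            DateTimeFormatter.ofPattern("yyyy/MM/dd", Locale.ENGLISH);

    private TimestampPath() {
    }

    // Uses only local variables, so concurrent calls cannot see each other's dates.
    static String toHdfsPath(String timestamp) {
        LocalDate date = LocalDate.parse(timestamp, IN);
        return date.format(OUT);
    }
}
```

A serializer's serialize(String) method could delegate to a helper like this. On an older Java version, a similar effect might be had by creating a new SimpleDateFormat inside the method on every call, or by keeping one per thread in a ThreadLocal, instead of storing it in a shared field.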
