Hi tinawenqiao,

Thanks for moving this conversation from GitHub to the Flume dev list. I believe this is the best place to discuss development efforts. As mentioned, we generally don't accept pull requests, so please create one or more JIRA issues (one per distinct issue you would like to address) and attach your patch to each of them, as described in detail at https://cwiki.apache.org/confluence/display/FLUME/How+to+Contribute

Regarding your proposed changes:

1) Sounds like a good improvement for TaildirSource. Please check how the Spooling Directory Source supports something similar through its deserializer: https://flume.apache.org/FlumeUserGuide.html#spooling-directory-source
   Your logic could be part of a new deserializer.

2) Sounds like a good improvement for TaildirSource, but it would be good to avoid inventing a new pattern syntax. (On a side note, please check out the latest development on the Spooling Directory Source: it can now check a directory subtree recursively, so it might already have what you want to achieve.)

3) The bugfix sounds awesome.

4) This looks very specific to your use case. I believe it could be a little more generalised and/or driven by configuration parameter(s).

Cheers,
Attila

Attila Simon
Software Engineer
Email: [email protected]

On Wed, Jul 6, 2016 at 5:42 AM, 黄鹏程 <[email protected]> wrote:

> Fantastic Features! Support for this pull!
>
> ------------------ Original Message ------------------
> From: "文乔" <[email protected]>
> Sent: Wednesday, July 6, 2016, 11:38 AM
> To: "dev" <[email protected]>
> Subject: Add Support multiline and recursive directory in
> TaildirSource(Flume-1.7). And make the buffersize be configured
>
> Hi, all:
> I submitted a pull request to flume-1.7 on GitHub. The address is
> https://github.com/apache/flume/pull/54 .
> The changes are as follows:
>
> 1. Support multiline. Users can define the start regex of a multiline event.
>    Add a parameter REGEX_START in TaildirSourceConfigurationConstants.java.
>    REGEX_START is used for generating Flume events that contain multiple
>    lines in the body of each event. The parameter determines the start of
>    an event. The default value is "". If the value is set to "", each line
>    ending with '\n' becomes one Flume event.
>    Sample usage:
>    agent.sources.taildirsource.lineStartRegex = \\s?\\d\\d\\d\\d-\\d\\d-\\d\\d\\s\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d
>
> 2. Support recursive directory. Wildcards are allowed in the directory
>    name.
>    Modify the function getMatchFiles() in ReliableTaildirEventReader.java
>    to support this functionality.
>    Sample usage:
>    agent.sources.taildirsource.filegroups.f1 = /Users/wenqiao/work/flume/apache-flume-1.7.0-SNAPSHOT-bin/conf/*/01/[ab].log
>
> 3. Fix the bug that occurs if a line's length exceeds 8192 bytes. Make the
>    buffer size configurable.
>    Add a parameter BUFFER_SIZE in TaildirSourceConfigurationConstants.java.
>    BUFFER_SIZE defines the maximum number of bytes for the content of one
>    Flume event body. The default size is 8192.
>
> 4. Put the filePath, hostname, and IP into the headers of a Flume event if
>    the headers do not already contain those keys.
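
To make item 1 of the quoted proposal concrete, here is a minimal Java sketch of how a line-start regex could group lines into event bodies. It is only an illustration under stated assumptions: the class MultilineGrouper and its group() method are hypothetical names, not code from the pull request or from Flume, and the real TaildirSource works on bytes read incrementally from files rather than on an in-memory list of lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of Flume or the pull request.
class MultilineGrouper {

    // Groups raw lines into event bodies. A line whose beginning matches
    // startPattern opens a new event; any other line is appended to the
    // event that is currently open. A null pattern keeps the default
    // behaviour of one event per line.
    static List<String> group(List<String> lines, Pattern startPattern) {
        List<String> events = new ArrayList<>();
        if (startPattern == null) {
            events.addAll(lines);
            return events;
        }
        StringBuilder current = null;
        for (String line : lines) {
            // lookingAt() anchors the match at the start of the line only.
            boolean startsEvent = startPattern.matcher(line).lookingAt();
            if (startsEvent || current == null) {
                if (current != null) {
                    events.add(current.toString());
                }
                current = new StringBuilder(line);
            } else {
                current.append('\n').append(line);
            }
        }
        if (current != null) {
            events.add(current.toString());
        }
        return events;
    }

    public static void main(String[] args) {
        // Same pattern as the sample lineStartRegex above, written as a Java string literal.
        Pattern start = Pattern.compile(
            "\\s?\\d\\d\\d\\d-\\d\\d-\\d\\d\\s\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d");
        List<String> lines = Arrays.asList(
            "2016-07-06 11:38:00,123 ERROR something failed",
            "java.lang.IllegalStateException: boom",
            "    at com.example.Foo.bar(Foo.java:42)",
            "2016-07-06 11:38:01,456 INFO recovered");
        for (String event : group(lines, start)) {
            System.out.println("--- event ---");
            System.out.println(event);
        }
    }
}
```

In this sketch the stack trace lines stay attached to the ERROR line that precedes them, and the INFO line starts a second event. Keeping the grouping logic in a small, separate unit like this would also make it easier to move into a deserializer, as suggested in point 1 of the reply.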
