Hi tinawenqiao,

Thanks for moving this conversation from GitHub to the Flume dev list. I
believe this is the best place to discuss development efforts. As
mentioned, we generally don't accept pull requests, so please create one
or more JIRA issues (depending on how many separate issues you would like
to address) and attach your patch to each of them, as described in detail
on the How to Contribute page:
https://cwiki.apache.org/confluence/display/FLUME/How+to+Contribute

Regarding your proposed changes:
1) Sounds like a good improvement for TaildirSource. Please check how
Spooling Directory Source supports something similar using a deserializer:
https://flume.apache.org/FlumeUserGuide.html#spooling-directory-source
Your logic could be part of a new deserializer.
2) Sounds like a good improvement for TaildirSource, but it would be good
to avoid inventing a new pattern syntax. On a side note, please check
out the latest development on Spooling Directory Source: it can now
check a directory subtree recursively, so it might already have what
you want to achieve.
3) The bugfix sounds awesome.
4) This looks very specific to your use case. I believe it could be
made a little more general and/or driven by configuration
parameter(s).
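
To illustrate point 2: a minimal sketch of reusing an existing pattern
syntax, assuming the standard java.nio glob syntax would be acceptable for
file groups. GlobFileFinder and find() are hypothetical names used only for
illustration; they are not code from Flume or from the pull request, and
matching relative to a root directory is an assumption of this sketch.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper, not part of Flume or of the pull request. */
final class GlobFileFinder {

  // Walks the tree under root and returns the regular files whose path,
  // taken relative to root, matches the given java.nio glob pattern.
  static List<Path> find(Path root, String glob) throws IOException {
    PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
    try (Stream<Path> paths = Files.walk(root)) {
      return paths
          .filter(Files::isRegularFile)
          .filter(p -> matcher.matches(root.relativize(p)))
          .sorted()
          .collect(Collectors.toList());
    }
  }

  // Usage: java GlobFileFinder <root-dir> <glob>
  public static void main(String[] args) throws IOException {
    for (Path p : find(Paths.get(args[0]), args[1])) {
      System.out.println(p);
    }
  }
}
```

In this syntax "*" does not cross a directory boundary while "**" does, so
a pattern such as */01/[ab].log covers the wildcard-in-directory case, and
a pattern starting with "**" would cover a recursive subtree.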


Cheers,
Attila

Attila Simon
Software Engineer
Email:   [email protected]




On Wed, Jul 6, 2016 at 5:42 AM, 黄鹏程 <[email protected]> wrote:
> Fantastic features! I support this pull request!
>
>
>
>
> ------------------ Original Message ------------------
> From: "文乔";<[email protected]>;
> Sent: Wednesday, July 6, 2016, 11:38 AM
> To: "dev"<[email protected]>;
>
> Subject: Add support for multiline events and recursive directories in
> TaildirSource (Flume 1.7), and make the buffer size configurable
>
>
>
> Hi all,
>    I submitted a pull request to flume-1.7 on GitHub. The address is
> https://github.com/apache/flume/pull/54 .
>    The changes are as follows:
>    1.  Support multiline. Users can define the start regex of a multiline event.
>         Add a parameter REGEX_START in
> TaildirSourceConfigurationConstants.java. REGEX_START is used for generating
> Flume events whose body contains multiple lines. The parameter
> determines the start of an event. The default value is "". If the value is set to
> "", each line ending in '\n' becomes one Flume event.
>         The sample usage:
>         agent.sources.taildirsource.lineStartRegex =  
> \\s?\\d\\d\\d\\d-\\d\\d-\\d\\d\\s\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d
>
>    2.   Support recursive directories. Wildcards are allowed in the directory
> name.
>          Modify the function getMatchFiles() in 
> ReliableTaildirEventReader.java to support this functionality.
>          The sample usage:
>          agent.sources.taildirsource.filegroups.f1 = 
> /Users/wenqiao/work/flume/apache-flume-1.7.0-SNAPSHOT-bin/conf/*/01/[ab].log
>    3.   Fix the bug that occurs if a line's length exceeds 8192 bytes. Make the buffer
> size configurable.
>          Add a parameter BUFFER_SIZE in
> TaildirSourceConfigurationConstants.java. BUFFER_SIZE defines the
> max number of bytes for one Flume event body's content. The default size is 8192.
>
>
>
>     4.  Put the filePath, hostname, and IP into the headers of a Flume event if
> the headers do not already contain those keys.
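
For reference, the behaviour described in item 1 of the quoted message
(a start regex that marks the first line of each event) could be sketched
roughly as below. This is only an illustration under assumptions:
MultilineGrouper and group() are hypothetical names, the code is not taken
from the pull request, and treating lines before the first match as their
own event is a choice made here, not something the message specifies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Flume or of the pull request. */
final class MultilineGrouper {

  // Groups raw lines into event bodies. A line whose beginning matches
  // startRegex opens a new event; any other line is appended to the
  // current event. An empty startRegex keeps one event per line.
  static List<String> group(List<String> lines, String startRegex) {
    List<String> events = new ArrayList<>();
    if (startRegex.isEmpty()) {
      events.addAll(lines);
      return events;
    }
    Pattern start = Pattern.compile(startRegex);
    StringBuilder current = null;
    for (String line : lines) {
      if (start.matcher(line).lookingAt()) {
        // Flush the previous event before starting a new one.
        if (current != null) {
          events.add(current.toString());
        }
        current = new StringBuilder(line);
      } else if (current == null) {
        // Assumption: lines seen before any start marker form an event.
        current = new StringBuilder(line);
      } else {
        current.append('\n').append(line);
      }
    }
    if (current != null) {
      events.add(current.toString());
    }
    return events;
  }

  public static void main(String[] args) {
    // Same regex as the sample lineStartRegex value in the message.
    String regex = "\\s?\\d\\d\\d\\d-\\d\\d-\\d\\d\\s\\d\\d:\\d\\d:\\d\\d,\\d\\d\\d";
    List<String> lines = Arrays.asList(
        "2016-07-06 11:38:00,123 ERROR request failed",
        "java.lang.IllegalStateException: bad state",
        "\tat com.example.Foo.bar(Foo.java:1)",
        "2016-07-06 11:38:01,456 INFO recovered");
    for (String event : group(lines, regex)) {
      System.out.println("--- event ---");
      System.out.println(event);
    }
  }
}
```

With a timestamp-style start regex, a stack trace stays attached to the log
line that precedes it instead of being split into one event per line.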
