[
https://issues.apache.org/jira/browse/FLUME-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580590#comment-16580590
]
zhenzhao wang commented on FLUME-3268:
--------------------------------------
[~fszabo] hdfs.batchSize only controls the flush frequency. The sink still
calls the append RPC once for each event. What we are going to do is batch
events for the HDFS append. I found some problems with the current pull
request while trying to refactor the code; I will let you know when it is
ready for review.
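
To illustrate the difference, here is a minimal sketch. It is not the code in
the pull request and does not use Flume or HDFS classes: CountingWriter,
drainPerEvent, drainMicroBatched, and the microBatchSize parameter are all
hypothetical names chosen for illustration. The sketch only counts how many
append and flush calls each approach would make for the same events.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: none of these names come from Flume or the patch.
class MicroBatchSketch {

    /** Hypothetical stand-in for an HDFS writer that only counts calls. */
    static class CountingWriter {
        int appendCalls = 0;
        int flushCalls = 0;
        int eventsWritten = 0;

        void append(List<String> events) {
            appendCalls++;
            eventsWritten += events.size();
        }

        void flush() {
            flushCalls++;
        }
    }

    /** One append call per event; flush after every batchSize events. */
    static void drainPerEvent(List<String> events, int batchSize,
                              CountingWriter writer) {
        int sinceFlush = 0;
        for (String event : events) {
            writer.append(Collections.singletonList(event));
            sinceFlush++;
            if (sinceFlush >= batchSize) {
                writer.flush();
                sinceFlush = 0;
            }
        }
        if (sinceFlush > 0) {
            writer.flush(); // flush the trailing partial batch
        }
    }

    /** One append call per microBatchSize events; same flush rule as above. */
    static void drainMicroBatched(List<String> events, int batchSize,
                                  int microBatchSize, CountingWriter writer) {
        List<String> buffer = new ArrayList<>();
        int sinceFlush = 0;
        for (String event : events) {
            buffer.add(event);
            if (buffer.size() >= microBatchSize) {
                writer.append(buffer);
                sinceFlush += buffer.size();
                buffer.clear();
                if (sinceFlush >= batchSize) {
                    writer.flush();
                    sinceFlush = 0;
                }
            }
        }
        if (!buffer.isEmpty()) {
            writer.append(buffer); // append the trailing partial micro batch
            sinceFlush += buffer.size();
        }
        if (sinceFlush > 0) {
            writer.flush();
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            events.add("event-" + i);
        }
        CountingWriter perEvent = new CountingWriter();
        drainPerEvent(events, 50, perEvent);
        CountingWriter batched = new CountingWriter();
        drainMicroBatched(events, 50, 10, batched);
        System.out.println("per-event:     appends=" + perEvent.appendCalls
                + " flushes=" + perEvent.flushCalls);
        System.out.println("micro-batched: appends=" + batched.appendCalls
                + " flushes=" + batched.flushCalls);
    }
}
```

With 100 events, a flush interval of 50, and a micro batch of 10, the sketch
makes 100 append calls in the per-event case and 10 in the micro-batched case,
with 2 flushes in both. Whether fewer append calls translate into the speedup
we measured depends on the real writer, which this sketch does not model.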
> Introducing micro batch processing to HDFSEventSink
> ---------------------------------------------------
>
> Key: FLUME-3268
> URL: https://issues.apache.org/jira/browse/FLUME-3268
> Project: Flume
> Issue Type: New Feature
> Reporter: zhenzhao wang
> Priority: Major
> Attachments: FLUME-3268-0.patch
>
>
> In our test with HDFSEventSink, we found that we could increase the draining
> speed of HDFSSink by up to 4x by introducing micro batch processing. With the
> micro batch processing feature, events are written to HDFS in batches
> instead of one by one.