[jira] [Commented] (FLUME-2352) HDFSCompressedDataStream should support appendBatch

Hari Shreedharan (JIRA) Fri, 12 Sep 2014 17:23:32 -0700

    [ 
https://issues.apache.org/jira/browse/FLUME-2352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132363#comment-14132363
 ]


Hari Shreedharan commented on FLUME-2352:
-----------------------------------------

Also, a batch size of 200000 is not realistic. I'd like to see whether batch 
sizes between 1000 and 10000 show any difference with events of about 500 bytes.
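
A rough way to try that range outside Flume is sketched below. It is not the 
attached patch and does not touch HDFSCompressedDataStream or BucketWriter. It 
assumes plain java.util.zip gzip, in-memory output, and synthetic 500-byte 
events, and it models "one by one" as a flush after every event, which may not 
match what the Flume code path actually does. The class and method names 
(BatchSizeSketch, makeEvents, compressedSize) are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import java.util.zip.GZIPOutputStream;

public class BatchSizeSketch {

    // Hypothetical test data: `count` events of `size` bytes each,
    // drawn from a small alphabet so they compress roughly like log text.
    static byte[][] makeEvents(int count, int size) {
        Random rnd = new Random(42); // fixed seed so runs are repeatable
        byte[][] events = new byte[count][size];
        for (byte[] e : events) {
            for (int i = 0; i < size; i++) {
                e[i] = (byte) ('a' + rnd.nextInt(8));
            }
        }
        return events;
    }

    // Gzips all events into memory and returns the compressed size in bytes.
    // flushPerEvent=true is the assumed stand-in for appending one by one.
    static int compressedSize(byte[][] events, boolean flushPerEvent) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        // syncFlush=true makes flush() push pending compressed data to the sink.
        try (GZIPOutputStream gz = new GZIPOutputStream(sink, true)) {
            for (byte[] e : events) {
                gz.write(e);
                if (flushPerEvent) {
                    gz.flush();
                }
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return sink.size();
    }

    public static void main(String[] args) {
        for (int batchSize : new int[] {1000, 5000, 10000}) {
            byte[][] events = makeEvents(batchSize, 500);
            for (boolean perEvent : new boolean[] {true, false}) {
                compressedSize(events, perEvent); // warm-up pass, result ignored
                long start = System.nanoTime();
                int bytes = compressedSize(events, perEvent);
                long micros = (System.nanoTime() - start) / 1_000;
                System.out.printf("batch=%d flushPerEvent=%b time=%dus out=%dB%n",
                        batchSize, perEvent, micros, bytes);
            }
        }
    }
}
```

Timings from a loop like this are noisy because of JIT warm-up and GC, so 
they only give a first impression. A harness such as JMH, and a run against 
the real sink with the patch applied, would be needed for numbers worth 
quoting on the issue.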

> HDFSCompressedDataStream should support appendBatch
> ---------------------------------------------------
>
>                 Key: FLUME-2352
>                 URL: https://issues.apache.org/jira/browse/FLUME-2352
>             Project: Flume
>          Issue Type: Improvement
>          Components: Sinks+Sources
>    Affects Versions: v1.5.0
>            Reporter: chenshangan
>            Assignee: chenshangan
>             Fix For: v1.6.0
>
>         Attachments: FLUME-2352.patch
>
>
> Compressing events in a batch is much more efficient than compressing them 
> one by one.
> I set hdfs.batchSize to 200000. When I use appendBatch() in BucketWriter, the 
> append operation takes less than 1 second, while appending one by one can 
> take 10 seconds.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
