[ 
https://issues.apache.org/jira/browse/FLUME-2128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13804799#comment-13804799
 ] 

Hari Shreedharan edited comment on FLUME-2128 at 10/24/13 11:26 PM:
--------------------------------------------------------------------

Hey Ted,

Thinking about it, let's just keep one param, rollSize, and use it to check the 
size of the files, even in the case of a normal data stream. Also, can you add a 
couple of tests with the minicluster (so we know it works even without mocking)?


was (Author: hshreedharan):
Hey Ted,

Thinking about it, let's just keep one param, rollSize, and use it to check the 
size of the files, even in the case of a normal data stream. 

> HDFS Sink rollSize is calculated based off of uncompressed size of cumulative 
> events.
> -------------------------------------------------------------------------------------
>
>                 Key: FLUME-2128
>                 URL: https://issues.apache.org/jira/browse/FLUME-2128
>             Project: Flume
>          Issue Type: Bug
>          Components: Sinks+Sources
>    Affects Versions: v1.4.0, v1.3.1
>            Reporter: Jeff Lord
>            Assignee: Ted Malaska
>              Labels: features
>         Attachments: FLUME-2128-0.patch, FLUME-2128-1.patch
>
>
> The HDFS sink rollSize parameter is compared against uncompressed event sizes.
> As a result, if you are using compression and expect your files on HDFS to be 
> rolled based on the value set for rollSize, your files will be much smaller 
> than expected due to compression.
> We should take into account whether compression is enabled and, if so, roll 
> based on the compressed size on HDFS.
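
The effect described above can be illustrated with a small sketch. This is 
not Flume code: it is a standalone Python example using the standard gzip 
module, and the names ROLL_SIZE, should_roll and sizes_after_events, as well 
as the sample log line, are hypothetical and chosen only for illustration. It 
writes repetitive events through a gzip stream and compares a roll decision 
based on the uncompressed byte count with one based on the compressed bytes 
actually produced.

```python
import gzip
import io

# Hypothetical roll threshold in bytes, standing in for the sink's rollSize.
ROLL_SIZE = 64 * 1024


def should_roll(byte_count, roll_size=ROLL_SIZE):
    """Roll when a positive threshold has been reached (assumed semantics)."""
    return roll_size > 0 and byte_count >= roll_size


def sizes_after_events(events):
    """Return (uncompressed_bytes, compressed_bytes) for the given events."""
    raw = io.BytesIO()
    uncompressed = 0
    with gzip.GzipFile(fileobj=raw, mode="wb") as gz:
        for event in events:
            gz.write(event)
            uncompressed += len(event)
    # Closing the GzipFile flushes the trailer but leaves `raw` open.
    return uncompressed, len(raw.getvalue())


# 2000 identical 49-byte log lines: highly compressible input.
events = [b"2013-10-24 23:26:00 INFO request served in 12 ms\n"] * 2000
uncompressed, compressed = sizes_after_events(events)

print("uncompressed bytes:", uncompressed)
print("compressed bytes:  ", compressed)
print("roll on uncompressed count:", should_roll(uncompressed))
print("roll on compressed count:  ", should_roll(compressed))
```

With input this repetitive, the uncompressed count crosses the threshold while 
the compressed output is still far below it, which is why files rolled on the 
uncompressed count end up much smaller on disk than the configured size.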



--
This message was sent by Atlassian JIRA
(v6.1#6144)
