[ https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15138478#comment-15138478 ]
Daniel Templeton commented on HDFS-9782:
----------------------------------------
In HDFS-9780, [~andrew.wang] suggested that the interval be specified in
milliseconds. Given that most intervals are going to be on the order of hours,
a unit of milliseconds seems cruel. How many milliseconds are in a day? Plus,
any interval shorter than about 10 minutes risks creating a problematic number
of small files. For these reasons I'm going to ignore Andrew's suggestion and
go with minutes as the unit for the interval.
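
To put numbers on the unit question, here is a small sketch of the arithmetic. The class and method names are hypothetical and are not part of RollingFileSystemSink; the one-file-per-roll assumption is mine.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, for illustration only; not Hadoop code.
final class RollIntervalUnits {
  // Number of milliseconds an operator would have to type for a one-day interval.
  static long millisPerDay() {
    return TimeUnit.DAYS.toMillis(1);
  }

  // Rolls per day for a given interval in minutes, assuming one file per roll.
  static long rollsPerDay(long intervalMinutes) {
    if (intervalMinutes <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }
    return TimeUnit.DAYS.toMinutes(1) / intervalMinutes;
  }
}
```

A one-day interval is 86400000 milliseconds but only 1440 minutes, and a 10-minute interval already means 144 rolls per day for each writer under the assumption above.
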
> RollingFileSystemSink should have configurable roll interval
> ------------------------------------------------------------
>
> Key: HDFS-9782
> URL: https://issues.apache.org/jira/browse/HDFS-9782
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
>
> Right now it defaults to rolling at the top of every hour. Instead that
> interval should be configurable. The interval should also allow for some
> play so that all hosts don't try to flush their files simultaneously.
> I'm filing this in HDFS because I suspect it will involve touching the HDFS
> tests. If it turns out not to, I'll move it into common instead.
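
One possible reading of the request above, as a rough sketch rather than the eventual patch: align rolls to interval boundaries (as the current top-of-the-hour behavior does) and add a random per-host offset so hosts do not all flush at once. All names and parameters here are assumptions made for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not the RollingFileSystemSink implementation.
final class RollScheduleSketch {
  // Next interval boundary strictly after nowMillis, shifted by offsetMillis.
  // Assumes nowMillis >= 0 and intervalMinutes > 0.
  static long nextRoll(long nowMillis, long intervalMinutes, long offsetMillis) {
    long intervalMillis = TimeUnit.MINUTES.toMillis(intervalMinutes);
    long boundary = (nowMillis / intervalMillis + 1) * intervalMillis;
    return boundary + offsetMillis;
  }

  // Random "play" in [0, maxOffsetMillis]; zero disables the offset.
  static long randomOffset(long maxOffsetMillis) {
    if (maxOffsetMillis <= 0) {
      return 0L;
    }
    return ThreadLocalRandom.current().nextLong(maxOffsetMillis + 1);
  }
}
```

A caller would pick the offset once per host, for example nextRoll(System.currentTimeMillis(), 60, randomOffset(30000)), so each host rolls hourly but at a slightly different moment.
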