[ 
https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163795#comment-15163795
 ] 

Andrew Wang commented on HDFS-9782:
-----------------------------------

bq. In most clusters, this is not needed. It's only the large (1000-ish node) 
clusters that will need to worry about staggering the rolls. And then how much 
staggering is required depends heavily on the cluster. I think 0 is a 
reasonable default.

Is there a downside to having it non-zero for small clusters? It's better to 
have defaults that work for all cluster sizes. If your concern is the linking 
between the interval and the offset, we could make the offset configuration a 
percent of the interval.
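
To make the percentage idea concrete, here is a minimal sketch, not code from any of the attached patches. The class name, method names, and the 0-100 percent range are all assumptions made for illustration. The point is that the maximum stagger scales with whatever interval is configured, so one non-zero default could serve both small and large clusters.

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper; none of these names come from the patch.
class RollOffsetSketch {

  /** Largest allowed offset, expressed as a percentage of the roll interval. */
  static long maxOffsetMillis(long intervalMillis, int offsetPercent) {
    if (intervalMillis <= 0) {
      throw new IllegalArgumentException("interval must be positive");
    }
    if (offsetPercent < 0 || offsetPercent > 100) {
      throw new IllegalArgumentException("percent must be in [0, 100]");
    }
    // multiplyExact throws instead of silently overflowing.
    return Math.multiplyExact(intervalMillis, (long) offsetPercent) / 100;
  }

  /** Random per-host offset in [0, maxOffsetMillis], so hosts do not roll together. */
  static long randomOffsetMillis(long intervalMillis, int offsetPercent) {
    long max = maxOffsetMillis(intervalMillis, offsetPercent);
    return ThreadLocalRandom.current().nextLong(max + 1);
  }
}
```

With a one-hour interval and 10 percent, each host would pick an offset of at most six minutes; with 0 percent the offset is always zero, which matches the currently proposed default.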

One nit: we can use TimeUnit.convert rather than adding the new constants. I 
also agree with Robert and would prefer that we didn't add this unit parsing 
code at all, but that's not a blocker.
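
As a rough illustration of the TimeUnit.convert suggestion (the wrapper class and method name are hypothetical, only java.util.concurrent.TimeUnit is real), the conversion to milliseconds needs no hand-written constants:

```java
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper, shown only to illustrate the JDK call.
class IntervalConversionSketch {

  /** Converts a value in the given unit to milliseconds using the JDK. */
  static long toMillis(long value, TimeUnit unit) {
    // convert(sourceDuration, sourceUnit) returns the duration in MILLISECONDS.
    return TimeUnit.MILLISECONDS.convert(value, unit);
  }
}
```

For example, toMillis(1, TimeUnit.HOURS) yields 3600000 without a separate millis-per-hour constant.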

Also, BPServiceActor#Scheduler is an example of how we can unit test a 
scheduler like this without sleeps. Food for thought.
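
A minimal sketch of the general idea, assuming the scheduling arithmetic is pulled into a small class that takes the current time as a parameter. This is not the actual BPServiceActor code, and the class and method names are hypothetical. Because the time is passed in, a test can ask "what happens at time T" directly instead of sleeping until T.

```java
// Hypothetical scheduler; times are milliseconds since an arbitrary epoch.
class RollSchedulerSketch {
  private final long intervalMillis;
  private final long offsetMillis;

  RollSchedulerSketch(long intervalMillis, long offsetMillis) {
    if (intervalMillis <= 0 || offsetMillis < 0 || offsetMillis >= intervalMillis) {
      throw new IllegalArgumentException("need 0 <= offset < interval");
    }
    this.intervalMillis = intervalMillis;
    this.offsetMillis = offsetMillis;
  }

  /** First roll time strictly after nowMillis: an interval boundary plus the offset. */
  long nextRollTime(long nowMillis) {
    long intervalStart = (nowMillis / intervalMillis) * intervalMillis;
    long candidate = intervalStart + offsetMillis;
    // If this interval's roll time has already passed, use the next interval's.
    return candidate > nowMillis ? candidate : candidate + intervalMillis;
  }

  /** True once the supplied clock reading has reached the scheduled time. */
  boolean isRollDue(long nowMillis, long scheduledMillis) {
    return nowMillis >= scheduledMillis;
  }
}
```

A test can then construct the scheduler with a one-hour interval and a one-minute offset and check nextRollTime for a handful of fixed timestamps, with no Thread.sleep involved.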

> RollingFileSystemSink should have configurable roll interval
> ------------------------------------------------------------
>
>                 Key: HDFS-9782
>                 URL: https://issues.apache.org/jira/browse/HDFS-9782
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>         Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, 
> HDFS-9782.003.patch, HDFS-9782.004.patch
>
>
> Right now it defaults to rolling at the top of every hour.  Instead that 
> interval should be configurable.  The interval should also allow for some 
> play so that all hosts don't try to flush their files simultaneously.
> I'm filing this in HDFS because I suspect it will involve touching the HDFS 
> tests.  If it turns out not to, I'll move it into common instead.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
