[ 
https://issues.apache.org/jira/browse/HDFS-9782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166672#comment-15166672
 ] 

Daniel Templeton commented on HDFS-9782:
----------------------------------------

Thanks, [~andrew.wang]!  HADOOP-8608 appears to be exactly what we want.

Fair point about needing to deal with a GC pause, but having the offset on by 
default still strikes me as a potentially nasty surprise.  I'm trying to think 
about this in terms of customer experience.  We know that there's no noticeable 
performance impact at 200 nodes.  We're just assuming that we'll run into 
issues at larger scale, but we don't actually know for sure.  It just seems 
wrong to me to add this little bit of unexpected uncertainty into the mix for 
all users when we suspect that a handful of users might run into the issue.  
Also consider that an admin running a 1000-node cluster is going to be a bit 
more careful when changing configuration settings than someone with a 10-node 
cluster.  The big cluster's admin is less likely to be surprised by needing to 
turn on the offset than the little cluster's admin will be about it being on by 
default.

The above raises the question of what the default should be.  If we think it's 
1 second or even 10 seconds, I'll stop arguing now and turn it on by default.  
I assumed we'd want something more like 1 minute.  At a minute, that's long 
enough that some user will trip over it and be confused.  I don't think more 
than 1 minute is a reasonable default for several reasons, one of which is that 
it could interact badly with a short roll interval.  (I don't think it makes 
sense to set the offset as a percentage of the roll interval, because the need 
for the offset is independent of the length of the roll interval.)
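
To make the trade-off concrete, here is a rough sketch of how a roll time 
with a random offset might be computed.  This is not the code from the 
attached patches; the class name, method names, and parameters are all 
hypothetical, and it assumes intervals are aligned to multiples of the 
interval length since the epoch (so a 1-hour interval lands on the top of 
the hour in UTC).

```java
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical illustration only; not the RollingFileSystemSink implementation.
public class RollOffsetSketch {

    /** Start of the next interval strictly after nowMillis. */
    public static long nextBoundary(long nowMillis, long intervalMillis) {
        return (nowMillis / intervalMillis + 1) * intervalMillis;
    }

    /**
     * Next roll time: the next interval boundary plus a random delay in
     * [0, maxOffsetMillis).  A maxOffsetMillis of 0 disables the offset.
     */
    public static long nextRollTime(long nowMillis, long intervalMillis,
                                    long maxOffsetMillis) {
        long offset = maxOffsetMillis > 0
            ? ThreadLocalRandom.current().nextLong(maxOffsetMillis)
            : 0L;
        return nextBoundary(nowMillis, intervalMillis) + offset;
    }

    /** Latest time by which every host will have started its roll. */
    public static long latestRollTime(long nowMillis, long intervalMillis,
                                      long maxOffsetMillis) {
        return nextBoundary(nowMillis, intervalMillis) + maxOffsetMillis;
    }
}
```

With a 1-hour interval and a 1-minute maximum offset, each host rolls 
somewhere in the first minute after the top of the hour.  With a roll 
interval shorter than the maximum offset, a roll could be delayed past the 
following boundary, which is the bad interaction mentioned above.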

bq. What kind of timeliness do we really require? Would it be acceptable if we 
did not synchronize rolling, but rolled more frequently?

The use case behind this JIRA requires log rolls at the top of every hour with 
a known time by which the logs are guaranteed to be available.  Having a 1 
minute offset as the default is fine for this use case.  The discussion we're 
having here is about all the use cases we haven't seen yet.  Sorry to suck up 
time on such a trivial detail, but I think it's worth getting right.

> RollingFileSystemSink should have configurable roll interval
> ------------------------------------------------------------
>
>                 Key: HDFS-9782
>                 URL: https://issues.apache.org/jira/browse/HDFS-9782
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Daniel Templeton
>            Assignee: Daniel Templeton
>         Attachments: HDFS-9782.001.patch, HDFS-9782.002.patch, 
> HDFS-9782.003.patch, HDFS-9782.004.patch
>
>
> Right now it defaults to rolling at the top of every hour.  Instead that 
> interval should be configurable.  The interval should also allow for some 
> play so that all hosts don't try to flush their files simultaneously.
> I'm filing this in HDFS because I suspect it will involve touching the HDFS 
> tests.  If it turns out not to, I'll move it into common instead.



