[ https://issues.apache.org/jira/browse/HADOOP-12759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15132810#comment-15132810 ]
Andrew Wang commented on HADOOP-12759:
--------------------------------------
bq. On the probing logic, the reason I do it that way is to get synchronization
across daemons. I let HDFS sort out who gets any given file name. If I list
files first, the list of files could change by the time I go to create the file.
I mentioned this offhand in my previous comment, but how about this: try the
create once; if it fails, list the directory to find the last element n, try
n+1, and keep probing linearly from there until it works. That adds no overhead
in the common case (no collisions), and we skip to the end if there is a
conflict. The intent is to avoid a full linear probe.
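To make the proposal concrete, here is a rough sketch of the probe order I have in mind. It is not code from the patch: the local filesystem and java.nio stand in for HDFS, and ProbeSketch, createUnique, lastSuffix, and the "base.N" naming are all made-up names for illustration. The only property it leans on is that the create call fails atomically if the file already exists, so the filesystem still decides who gets a given name.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

public class ProbeSketch {
  /** Highest N among files named base.N in dir, or 0 if there are none. */
  static int lastSuffix(Path dir, String base) throws IOException {
    String prefix = base + ".";
    int max = 0;
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
      for (Path p : entries) {
        String name = p.getFileName().toString();
        if (!name.startsWith(prefix)) {
          continue;
        }
        try {
          max = Math.max(max, Integer.parseInt(name.substring(prefix.length())));
        } catch (NumberFormatException e) {
          // Not one of our numbered files; ignore it.
        }
      }
    }
    return max;
  }

  /** Creates base, or base.N for the first free N at or past the end of the list. */
  static Path createUnique(Path dir, String base) throws IOException {
    try {
      // Common case: no collision, so one create call and no listing.
      return Files.createFile(dir.resolve(base));
    } catch (FileAlreadyExistsException e) {
      // Collision: fall through and skip to the end of the existing files.
    }
    int n = lastSuffix(dir, base) + 1;
    while (true) {
      try {
        // The listing may already be stale, so the create call is still
        // what arbitrates between competing daemons.
        return Files.createFile(dir.resolve(base + "." + n));
      } catch (FileAlreadyExistsException e) {
        n++; // Someone else won this name; probe the next one.
      }
    }
  }
}
```

The listing is only a hint for where to start. If another daemon creates n+1 between the list and the create, we just probe forward from there, so the synchronization argument from the quoted comment still holds.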
bq. Any heartburn about a half-second sleep?
Tolerable heartburn, but I was hoping for a solution that advances a fake
clock and then wakes up the sleeping thread. I'll still +1 though if you don't
want to change this.
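For reference, this is roughly the shape I was picturing for the test hook. It is only a sketch under my own assumptions, not how the sink is written today: FakeClock, Roller, awaitUntil, and the rolled semaphore are hypothetical names, and the real sink would close and rotate the file where this just releases a permit. The background thread waits on the clock's monitor instead of sleeping, so a test can advance time and wake it without any real delay.

```java
import java.util.concurrent.Semaphore;

public class FakeClockSketch {
  /** Hypothetical test clock: time only moves when a test calls advance(). */
  static final class FakeClock {
    private long nowMillis;

    synchronized void advance(long millis) {
      nowMillis += millis;
      notifyAll(); // Wake any thread blocked in awaitUntil().
    }

    synchronized void awaitUntil(long deadlineMillis) throws InterruptedException {
      // Loop guards against spurious wakeups and early advances.
      while (nowMillis < deadlineMillis) {
        wait();
      }
    }
  }

  /** Stand-in for the background thread that rolls at each deadline. */
  static final class Roller implements Runnable {
    private final FakeClock clock;
    private final long intervalMillis;
    private long nextDeadline;
    /** One permit is released per roll so a test can observe it. */
    final Semaphore rolled = new Semaphore(0);

    Roller(FakeClock clock, long firstDeadlineMillis, long intervalMillis) {
      this.clock = clock;
      this.nextDeadline = firstDeadlineMillis;
      this.intervalMillis = intervalMillis;
    }

    @Override
    public void run() {
      try {
        while (true) {
          clock.awaitUntil(nextDeadline);
          rolled.release(); // The real sink would close and rotate here.
          nextDeadline += intervalMillis;
        }
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // Treat interrupt as shutdown.
      }
    }
  }
}
```

A test would start the Roller on a daemon thread, call clock.advance(...) past the deadline, and then block on rolled.tryAcquire(...) with a generous timeout, which returns as soon as the roll happens rather than after a fixed sleep.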
> RollingFileSystemSink should eagerly rotate directories
> -------------------------------------------------------
>
> Key: HADOOP-12759
> URL: https://issues.apache.org/jira/browse/HADOOP-12759
> Project: Hadoop Common
> Issue Type: Improvement
> Affects Versions: 2.8.0
> Reporter: Daniel Templeton
> Assignee: Daniel Templeton
> Priority: Critical
> Attachments: YARN-4664.001.patch
>
>
> The RollingFileSystemSink only rolls over to a new directory if a new metrics
> record comes in. The issue is that HDFS does not update the file size until
> it's closed (HDFS-5478), and if no new metrics record comes in, then the file
> size will never be updated.
> This JIRA is to add a background thread to the sink that will eagerly close
> the file at the top of the hour.