Eric Czech created FLUME-2725:
---------------------------------

             Summary: HDFS Sink does not use configured timezone for rounding
                 Key: FLUME-2725
                 URL: https://issues.apache.org/jira/browse/FLUME-2725
             Project: Flume
          Issue Type: Bug
          Components: Sinks+Sources
            Reporter: Eric Czech
            Priority: Minor


When the BucketPath used by an HDFS sink is configured with a roundUnit and a 
roundValue > 1 (e.g. 6 hours), the "roundDown" function used by BucketPath 
does not round the timestamp correctly.

That function calls TimestampRoundDownUtil, which creates a Calendar instance 
using the *local* timezone to truncate the unix timestamp, rather than the 
TimeZone the sink was configured to use when converting dates to paths. That 
timezone is already available in the BucketPath class; it just isn't passed 
to TimestampRoundDownUtil.
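
A minimal sketch of what a timezone-aware hour rounding could look like, 
assuming the utility were given the sink's TimeZone. The class name 
ZoneAwareRoundDown and the method roundDownHours(long, int, TimeZone) are 
hypothetical names for illustration, not Flume's actual API; only the 
java.util.Calendar and java.util.TimeZone calls are real.

```java
import java.util.Calendar;
import java.util.TimeZone;

// Hypothetical helper for illustration; not the actual Flume class.
final class ZoneAwareRoundDown {

    // Rounds timestampMillis down to the nearest multiple of roundDownHours,
    // measured from midnight in the given zone. A null zone falls back to
    // the JVM default, which is the behaviour described in this ticket.
    static long roundDownHours(long timestampMillis, int roundDownHours, TimeZone zone) {
        if (roundDownHours <= 0 || roundDownHours > 24) {
            throw new IllegalArgumentException("roundDownHours must be in 1..24");
        }
        Calendar cal = (zone == null)
                ? Calendar.getInstance()       // local (default) timezone
                : Calendar.getInstance(zone);  // configured timezone
        cal.setTimeInMillis(timestampMillis);
        int hour = cal.get(Calendar.HOUR_OF_DAY);
        cal.set(Calendar.HOUR_OF_DAY, (hour / roundDownHours) * roundDownHours);
        cal.set(Calendar.MINUTE, 0);
        cal.set(Calendar.SECOND, 0);
        cal.set(Calendar.MILLISECOND, 0);
        return cal.getTimeInMillis();
    }
}
```

For 2015-07-01T15:30:00Z (1435764600000 ms) and 6-hour rounding, passing 
TimeZone.getTimeZone("UTC") yields 12:00 UTC, while passing 
TimeZone.getTimeZone("America/New_York") yields 06:00 local time, which is 
10:00 UTC.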

The net effect is that if a Flume JVM runs on a system whose clock is on US 
Eastern time (UTC-4 during daylight saving) while writing, say, 6-hour 
directories in UTC, the directories are written with the hours 04, 10, 16, 22 
rather than the expected 00, 06, 12, 18.
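
The arithmetic behind those directory hours can be checked with a small 
sketch. It uses java.time rather than Flume's own code; the class 
BucketHourDemo and the method utcBucketHours are made-up names, and the date 
used in the note below is an arbitrary summer day chosen as an assumption.

```java
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoUnit;
import java.util.SortedSet;
import java.util.TreeSet;

// Illustration only; this is not Flume code.
final class BucketHourDemo {

    // For every hour of the given UTC day, round down to a multiple of
    // roundHours in roundingZone, then report the hour of the result as it
    // would be rendered in UTC. Returns the distinct hours, sorted.
    static SortedSet<Integer> utcBucketHours(LocalDate utcDay, int roundHours, ZoneId roundingZone) {
        SortedSet<Integer> hours = new TreeSet<>();
        ZonedDateTime start = utcDay.atStartOfDay(ZoneOffset.UTC);
        for (int h = 0; h < 24; h++) {
            ZonedDateTime local = start.plusHours(h).withZoneSameInstant(roundingZone);
            ZonedDateTime rounded = local
                    .truncatedTo(ChronoUnit.HOURS)
                    .withHour((local.getHour() / roundHours) * roundHours);
            hours.add(rounded.withZoneSameInstant(ZoneOffset.UTC).getHour());
        }
        return hours;
    }
}
```

Calling utcBucketHours(LocalDate.of(2015, 7, 1), 6, ZoneId.of("America/New_York")) 
gives [4, 10, 16, 22], while the same call with ZoneOffset.UTC gives 
[0, 6, 12, 18]. A fixed UTC-5 offset (ZoneOffset.ofHours(-5)) would instead 
give [5, 11, 17, 23].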

I found a workaround: passing "-Duser.timezone=<hdfs_sink_timezone>" as a JVM 
system property. I still wanted to create a ticket, since it seems like it 
would take minimal effort to carry the configured timezone down into the 
rounding utility as well.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)