Johannes Alberti created HIVE-17363:
---------------------------------------

             Summary: Metrics output JSON_FILE issues with 
{hive.service.metrics.file.location} not being renamed as expected
                 Key: HIVE-17363
                 URL: https://issues.apache.org/jira/browse/HIVE-17363
             Project: Hive
          Issue Type: Bug
          Components: Configuration, Logging
    Affects Versions: 2.1.1
         Environment: CentOS 6.5/Hadoop 2.7.3/Java 7
            Reporter: Johannes Alberti


Due to a patch introduced with HIVE-13705, the target output JSON file 
(report.json) is not replaced properly; only report.json.tmp is continuously 
updated.

The local filesystem 
(https://github.com/apache/hive/blob/branch-2.1/common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/CodahaleMetrics.java#L428)
 at the time of output is an instanceof ProxyLocalFileSystem 
(https://github.com/apache/hive/blob/branch-2.1/ql/src/java/org/apache/hadoop/hive/ql/io/ProxyLocalFileSystem.java)
 which overrides the rename method of the Hadoop LocalFileSystem.

The Hadoop LocalFileSystem delegates rename() to the JVM, which delegates 
rename() to the OS ... 
http://pubs.opengroup.org/onlinepubs/9699919799/functions/rename.html.

The POSIX rename behavior is, I assume, what the JSON_FILE output handler 
really wants here, as it is supposed to ensure that a reader thread never 
finds the file missing, which could occur with the deprecated Hadoop 
FileSystem ... rename(src, dst, options) method.

No simple patch seems obvious, unless the JSON_FILE output handler used the 
JVM file system API whenever a local filesystem is configured for the output. 
Delegating to the original Hadoop LocalFileSystem does not seem safe if we 
assume that Hadoop will at some point align LocalFileSystem and DFS behavior, 
as originally requested in HDFS-10385.

Comments appreciated, I'm inclined to rip out the Hadoop LocalFileSystem here 
and replace it with the JVM original.
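
A minimal sketch of what "the JVM original" could look like, assuming the 
reporter writes to a sibling .tmp file and then renames it over the target 
with java.nio.file (available since Java 7). The class and method names are 
hypothetical and not part of Hive. Per the Files.move Javadoc, whether 
ATOMIC_MOVE replaces an existing target is implementation specific; on POSIX 
systems it is expected to map to rename(), which does replace it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Hive: publishes a report through
// java.nio.file instead of a Hadoop FileSystem implementation.
public class AtomicJsonFileWriter {

    // Writes json to "<target>.tmp", then renames that file over target.
    public static void write(Path target, String json) throws IOException {
        Path tmp = target.resolveSibling(target.getFileName() + ".tmp");
        Files.write(tmp, json.getBytes(StandardCharsets.UTF_8));
        // Assumption: on a POSIX platform this is a rename(), so a concurrent
        // reader sees either the old or the new report, never a missing file.
        Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("metrics");
        Path report = dir.resolve("report.json");
        write(report, "{\"run\":1}");
        write(report, "{\"run\":2}"); // second write goes over the existing file
        System.out.println(
            new String(Files.readAllBytes(report), StandardCharsets.UTF_8));
    }
}
```

This bypasses ProxyLocalFileSystem entirely, so it would only apply when the 
configured output location is on the local filesystem.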

Hive master seems to still have the same issue, at least no obvious code 
changes are observed, despite some metrics refactoring 
(https://github.com/apache/hive/blob/master/common/src/java/org/apache/hadoop/hive/common/metrics/metrics2/JsonFileMetricsReporter.java#L116)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
