Mithun Radhakrishnan created HIVE-11475:
-------------------------------------------

             Summary: Bad rename of directory during commit, when using HCat dynamic-partitioning.
                 Key: HIVE-11475
                 URL: https://issues.apache.org/jira/browse/HIVE-11475
             Project: Hive
          Issue Type: Bug
          Components: HCatalog
    Affects Versions: 1.2.0
            Reporter: Mithun Radhakrishnan
            Assignee: Mithun Radhakrishnan
            Priority: Critical


Here's one that [~knoguchi] found and root-caused. This one's a doozy. 

Under seemingly random conditions, the temporary output (under 
{{_SCRATCH1.234*}}) for HCat's dynamic partitioner isn't promoted correctly to 
the final table directory.

The namenode logs indicated a botched directory-rename:

{noformat}
2015-08-02 03:24:29,090 INFO FSNamesystem.audit: allowed=true ugi=myth (auth:TOKEN) via wrkf...@grid.myth.net (auth:TOKEN) ip=/10.192.100.117 cmd=rename src=/projects/hive/myth.db/myth_table_15m/_SCRATCH2.8772158158263395E-4/tc=1/utc_time=201508020145/part-r-00000 dst=/projects/hive/myth.db/myth_table_15mE-4/tc=1/utc_time=201508020145/part-r-00000 perm=myth:madcaps:rw-r-r- proto=rpc
{noformat}

Note that in the destination path, the table-directory name {{"myth_table_15m"}} 
has {{"E-4"}} appended to it. This'll break anything that uses HDFS-based polling.

[~knoguchi] points out the following code:

{code:title=HCatOutputFormat.java}
if ((idHash = conf.get(HCatConstants.HCAT_OUTPUT_ID_HASH)) == null) {
    idHash = String.valueOf(Math.random());
}
{code}

{code:title=FileOutputCommitterContainer.java}
String finalLocn = jobLocation.replaceAll(Path.SEPARATOR + SCRATCH_DIR_NAME + "\\d\\.?\\d+", "");
{code}

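The problem is that when {{Math.random()}} produces a number < 10 ^-3^, 
{{String.valueOf(double)}} uses exponential notation (e.g. 
{{2.8772158158263395E-4}}). The regex doesn't capture or handle this notation: 
it strips only the leading digits-and-decimal-point portion, so the trailing 
exponent ({{E-4}}) is left behind and ends up glued to the table-directory name.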
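
To make the failure concrete, here is a minimal standalone sketch that reproduces the bad path outside of Hadoop. The class name {{ScratchRenameRepro}} and the {{finalLocation}} helper are hypothetical, {{SEPARATOR}} stands in for {{Path.SEPARATOR}}, and {{SCRATCH_DIR_NAME}} is assumed to be {{"_SCRATCH"}} based on the namenode log above. Only the regex is taken from the snippet quoted in this issue.

```java
public class ScratchRenameRepro {
    // Assumption: stand-in for org.apache.hadoop.fs.Path.SEPARATOR.
    static final String SEPARATOR = "/";
    // Assumption: prefix inferred from the "_SCRATCH2.877...E-4" directory in the log.
    static final String SCRATCH_DIR_NAME = "_SCRATCH";

    // Applies the regex quoted from FileOutputCommitterContainer.java.
    public static String finalLocation(String jobLocation) {
        return jobLocation.replaceAll(
            SEPARATOR + SCRATCH_DIR_NAME + "\\d\\.?\\d+", "");
    }

    public static void main(String[] args) {
        String table = "/projects/hive/myth.db/myth_table_15m";

        // Values >= 10^-3 are rendered in plain decimal notation.
        String plain = String.valueOf(0.25);
        // Values < 10^-3 are rendered in exponential notation.
        String exponential = String.valueOf(2.5E-4);

        // The scratch suffix is stripped completely.
        System.out.println(finalLocation(table + SEPARATOR + SCRATCH_DIR_NAME + plain));
        // The exponent survives the replacement and sticks to the table name.
        System.out.println(finalLocation(table + SEPARATOR + SCRATCH_DIR_NAME + exponential));
    }
}
```

Since {{Math.random()}} is uniform over \[0, 1), roughly one job in a thousand would be expected to draw a value below 10 ^-3^, which is consistent with the "seemingly random" failures.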

The fix belies the debugging-effort.
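
This description does not spell out the fix itself. Purely as an illustration, and not necessarily what the committed patch does, one possible direction is to let the regex also consume an optional exponent. The class {{ScratchRegexSketch}}, its helper, and the constants below are hypothetical stand-ins rather than HCatalog code.

```java
public class ScratchRegexSketch {
    // Assumption: stand-in for org.apache.hadoop.fs.Path.SEPARATOR.
    static final String SEPARATOR = "/";
    // Assumption: prefix inferred from the scratch directory name in the log.
    static final String SCRATCH_DIR_NAME = "_SCRATCH";

    // Original pattern plus an optional "E<exponent>" tail, so that
    // exponential notation is removed along with the mantissa.
    static final String SCRATCH_REGEX =
        SEPARATOR + SCRATCH_DIR_NAME + "\\d\\.?\\d+(?:E-?\\d+)?";

    public static String finalLocation(String jobLocation) {
        return jobLocation.replaceAll(SCRATCH_REGEX, "");
    }

    public static void main(String[] args) {
        String table = "/projects/hive/myth.db/myth_table_15m";
        // Both notations now reduce to the bare table directory.
        System.out.println(finalLocation(table + "/_SCRATCH" + String.valueOf(0.25)));
        System.out.println(finalLocation(table + "/_SCRATCH" + String.valueOf(2.5E-4)));
    }
}
```

An alternative would be to avoid formatting a {{double}} at all when generating the id hash, which would sidestep the notation question entirely.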



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
