[ 
https://issues.apache.org/jira/browse/HIVE-22771?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043662#comment-17043662
 ] 

Shivam commented on HIVE-22771:
-------------------------------

Thanks Naveen.

The issue arises only when the generated idHash is alphanumeric, that is, when 
it is written in exponential notation. The regex used to build the final 
location, "\\d\\.?\\d+", matches only digits and an optional decimal point, so 
it does not remove the letters of the exponent. Because idHash is generated 
with Math.random(), the location is formed correctly most of the time even 
without the fix; the problem shows up only when Math.random() returns a number 
whose string form uses exponential notation.
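
To make the "most of the time" part concrete, here is a small sketch of when 
String.valueOf(double) switches to exponential notation. It relies on the 
documented behaviour of Double.toString, which uses the decimal form for 
magnitudes from 10^-3 up to (but not including) 10^7 and the scientific form 
otherwise, so for Math.random() only values below 0.001 are affected. The 
class and method names are made up for illustration and are not part of Hive.

```java
public class IdHashNotation {
    // Returns true when the string form of d contains an exponent marker,
    // which is the case the regex "\\d\\.?\\d+" cannot fully strip.
    public static boolean usesExponent(double d) {
        return String.valueOf(d).contains("E");
    }

    public static void main(String[] args) {
        // Typical Math.random() result: plain decimal form.
        System.out.println(String.valueOf(0.7145347157239135));
        // Value below 0.001: scientific form, as in the issue description.
        System.out.println(String.valueOf(7.145347157239135E-4));
        // The boundary sits at 10^-3.
        System.out.println(usesExponent(0.001));   // decimal form
        System.out.println(usesExponent(9.99E-4)); // scientific form
    }
}
```

Since Math.random() is roughly uniform on [0, 1), this suggests the bad path 
would be produced for about one job in a thousand.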

> Partition location incorrectly formed in FileOutputCommitterContainer
> ---------------------------------------------------------------------
>
>                 Key: HIVE-22771
>                 URL: https://issues.apache.org/jira/browse/HIVE-22771
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 1.2.1
>            Reporter: Shivam
>            Assignee: Shivam
>            Priority: Critical
>              Labels: pull-request-available
>         Attachments: HIVE-22771.2.patch, HIVE-22771.3.patch, 
> HIVE-22771.4.patch, HIVE-22771.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Class _HCatOutputFormat_ in package _org.apache.hive.hcatalog.mapreduce_ 
> generates _idHash_ in its _setOutput_ function with the statement below:
> *+In file org/apache/hive/hcatalog/mapreduce/HCatOutputFormat.java+*
>  *line 116: idHash = String.valueOf(Math.random());*
> The resulting idHash can take values such as 7.145347157239135E-4
>  
> Class _FileOutputCommitterContainer_ in package 
> _org.apache.hive.hcatalog.mapreduce_ uses the statement below to compute the 
> final partition path:
> +*In org/apache/hive/hcatalog/mapreduce/FileOutputCommitterContainer.java*+
> *line 366: String finalLocn = jobLocation.replaceAll(Path.SEPARATOR + 
> SCRATCH_DIR_NAME + "{color:#ff0000}\\d\\.?\\d+"{color},"");*
> *line 367: partPath = new Path(finalLocn);*
>  
> The regex used here is incorrect: it removes only the digits (and an optional 
> decimal point) that follow *SCRATCH_DIR_NAME*, so the exponent 'E-4' (for the 
> example above) is left behind in the final partition location. 
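
The effect described above can be reproduced with a short standalone sketch. 
Everything here is illustrative: the class name, the sample paths, the "/" 
separator standing in for Path.SEPARATOR, and the "_SCRATCH" value used for 
SCRATCH_DIR_NAME are assumptions. The second pattern, which also consumes an 
optional exponent, is just one possible adjustment and is not necessarily what 
the attached patches do.

```java
public class ScratchSuffixDemo {
    // Assumed values for illustration only.
    static final String SEPARATOR = "/";
    static final String SCRATCH_DIR_NAME = "_SCRATCH";

    // Regex quoted in the issue description.
    static final String ORIGINAL_REGEX = "\\d\\.?\\d+";
    // Hypothetical variant that also strips an exponent such as "E-4".
    static final String TOLERANT_REGEX = "\\d\\.?\\d+(E-?\\d+)?";

    // Mirrors the shape of the statement at line 366: remove the scratch
    // directory component (name plus idHash) from the job location.
    public static String strip(String jobLocation, String numberRegex) {
        return jobLocation.replaceAll(SEPARATOR + SCRATCH_DIR_NAME + numberRegex, "");
    }

    public static void main(String[] args) {
        String plain = "/warehouse/tbl/_SCRATCH0.7145347157239135/part=1";
        String exponent = "/warehouse/tbl/_SCRATCH7.145347157239135E-4/part=1";

        System.out.println(strip(plain, ORIGINAL_REGEX));    // scratch dir removed
        System.out.println(strip(exponent, ORIGINAL_REGEX)); // "E-4" left in the path
        System.out.println(strip(exponent, TOLERANT_REGEX)); // scratch dir removed
    }
}
```

With the original regex the second path comes out as /warehouse/tblE-4/part=1, 
which matches the malformed partition location reported in this issue.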



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
