[ 
https://issues.apache.org/jira/browse/HIVE-22002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16902039#comment-16902039
 ] 

bencao commented on HIVE-22002:
-------------------------------

I found the following.

When I execute this SQL:

 
{code:java}
Insert into test_double partition(dbtest) values (1,10);{code}
 

The partition stored in the metastore is dbtest=10.

When I execute:

 
{code:java}
Insert into test_double partition(dbtest) values (1, cast (10 as double));{code}
 

The partition stored in the metastore is dbtest=10.0.

When StatsTask runs and assembles the partition information, it converts the 
partition value to the column type, so 10 becomes 10.0.

As a result, the partition cannot be found in the metastore, because the 
partition stored there is dbtest=10.

I think this can be solved by performing the same partition type conversion 
when the partition information is saved, so that the partition value is 
consistent.
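
To illustrate the mismatch and the idea of the fix, here is a minimal sketch 
in plain Java. It is not Hive code: the class PartitionNameDemo, the helpers 
normalizeDouble and partitionName, and the HashMap standing in for the 
metastore are all assumptions made up for this example. It only relies on 
Double.parseDouble and Double.toString from the JDK.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a toy stand-in for the metastore, not Hive's actual code.
class PartitionNameDemo {

    // Hypothetical helper: render a double-typed partition value in its
    // canonical form, so "10" and "10.0" both become "10.0".
    static String normalizeDouble(String rawValue) {
        return Double.toString(Double.parseDouble(rawValue));
    }

    // Hypothetical helper: build a partition name such as "dbtest=10.0".
    static String partitionName(String column, String rawValue, boolean normalize) {
        String value = normalize ? normalizeDouble(rawValue) : rawValue;
        return column + "=" + value;
    }

    public static void main(String[] args) {
        Map<String, String> store = new HashMap<>();

        // Behaviour described above: the partition is saved with the raw value.
        store.put(partitionName("dbtest", "10", false), "partition row");
        // The stats step looks it up with the converted value and misses.
        System.out.println(store.containsKey(partitionName("dbtest", "10", true)));  // false

        // Proposed behaviour: convert on save as well, so both sides agree.
        store.clear();
        store.put(partitionName("dbtest", "10", true), "partition row");
        System.out.println(store.containsKey(partitionName("dbtest", "10", true)));  // true
    }
}
```

Values such as 9.9 are unchanged by the conversion, which matches the 
observation that only whole numbers trigger the failure.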

I have attached a patch with this change.

> Insert into table partition fails partially with stats.autogather is on.
> ------------------------------------------------------------------------
>
>                 Key: HIVE-22002
>                 URL: https://issues.apache.org/jira/browse/HIVE-22002
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 4.0.0
>            Reporter: Naveen Gangam
>            Assignee: bencao
>            Priority: Major
>         Attachments: HIVE-22002.patch, image-2019-07-31-20-02-38-069.png
>
>
> create table test_double(id int) partitioned by (dbtest double); 
> insert into test_double partition(dbtest) values (1,9.9); --> this works
> insert into test_double partition(dbtest) values (1,10); --> this fails 
> But if we change it to
> insert into test_double partition(dbtest) values (1, cast (10 as double)); it 
> succeeds 
> -> the problem is only seen when trying to insert a whole number, i.e. 10, 
> 10.0, 15, 14.0 etc. The issue is not seen when inserting a number with 
> decimal values other than 0. So an insert of 10.1 goes through. 
> The underlying exception from the HMS is 
> {code}
> 2019-07-11T07:58:16,670  [pool-6-thread-196]: server.TThreadPoolServer 
> (TThreadPoolServer.java:run(297)) -  occurred during processing of message. 
> java.lang.IndexOutOfBoundsException: Index: 0 at 
> java.util.Collections$EmptyList.get(Collections.java:4454) ~[?:1.8.0_112] at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartColumnStatsWithMerge(HiveMetaStore.java:7808)
>  ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:7769)
>  ~[hive-exec-3.1.0.3.1.0.0-78.jar:3.1.0.3.1.0.0-78] 
> {code}
> With {{hive.stats.column.autogather=false}}, this exception does not occur 
> with or without the explicit casting.
> The issue stems from the fact that HS2 created a partition with value 
> {{dbtest=10}} for the table and the stats processor is attempting to add 
> column statistics for partition with value {{dbtest=10.0}}. Thus HMS 
> {{getPartitionsByNames}} cannot find the partition with that value and thus 
> fails to insert the stats. So while the failure initiates on the HMS side, the 
> cause is in the HS2 query planning.
> It makes sense that turning off {{hive.stats.column.autogather}} resolves the 
> issue because there is no StatsTask in a query plan.
> But {{SHOW PARTITIONS}} shows the partition as created while the query 
> planner is not including it in any plan because of the absence of stats on the 
> partition.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
