[ 
https://issues.apache.org/jira/browse/HIVE-23576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lev Katzav updated HIVE-23576:
------------------------------
    Description: 
+given the following situation:+

there are 2 tables (in db "intpartitionbugtest"), each with a few rows:
 # *test_table_int_1* partitioned by *y* of type *int*
 # *test_table_string_1* partitioned by *x* of type *string*

here is the output of the following query on the metastore db:
{code:sql}
select
        "PARTITIONS"."PART_ID",
        "TBLS"."TBL_NAME",
        "FILTER0"."PART_KEY_VAL",
        "PART_NAME"
from
        "PARTITIONS"
inner join "TBLS" on
        "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
inner join "DBS" on
        "TBLS"."DB_ID" = "DBS"."DB_ID"
inner join "PARTITION_KEY_VALS" "FILTER0" on
        "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
{code}
 

!image-2020-05-29-14-16-29-356.png!

+the problem+

when running a hive query on the table *test_table_int_1* that filters on *y=1*
 sometimes the following exception will happen on the metastore

 
{code:java}
javax.jdo.JDODataStoreException: Error executing SQL query "select 
"PARTITIONS"."PART_ID" from "PARTITIONS"  inner join "TBLS" on 
"PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"     and "TBLS"."TBL_NAME" = ?   inner 
join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"      and "DBS"."NAME" = ? inner 
join "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = 
"PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 where "DBS"."CTLG_NAME" 
= ?  and (((case when "FILTER0"."PART_KEY_VAL" <> ? then 
cast("FILTER0"."PART_KEY_VAL" as decimal(21,0)) else null end) = ?))".
        at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
 ~[datanucleus-api-jdo-4.2.4.jar:?]
        at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) 
~[datanucleus-api-jdo-4.2.4.jar:?]
        at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267) 
~[datanucleus-api-jdo-4.2.4.jar:?]
        at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:2003)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:593)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:481)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:3853)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:3843)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3577)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:3861)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:3516)
 [hive-exec-3.1.2.jar:3.1.2]
        at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source) ~[?:?]
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_112]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
        at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
[hive-exec-3.1.2.jar:3.1.2]
        at com.sun.proxy.$Proxy28.getPartitionsByFilter(Unknown Source) [?:?]
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:5883)
 [hive-exec-3.1.2.jar:3.1.2]
        at sun.reflect.GeneratedMethodAccessor69.invoke(Unknown Source) ~[?:?]
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_112]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 [hive-exec-3.1.2.jar:3.1.2]
        at com.sun.proxy.$Proxy30.get_partitions_by_filter(Unknown Source) [?:?]
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult(ThriftHiveMetastore.java:16234)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult(ThriftHiveMetastore.java:16218)
 [hive-exec-3.1.2.jar:3.1.2]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
[hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
 [hive-exec-3.1.2.jar:3.1.2]
        at java.security.AccessController.doPrivileged(Native Method) 
[?:1.8.0_112]
        at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 [hadoop-common-3.2.0.jar:?]
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0_112]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0_112]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
Caused by: org.postgresql.util.PSQLException: ERROR: invalid input syntax for 
type numeric: "c"
        at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2433)
 ~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2178)
 ~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306) 
~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) 
~[postgresql-42.2.2.jar:42.2.2]
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) 
~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:155)
 ~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:118)
 ~[postgresql-42.2.2.jar:42.2.2]
        at 
com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
 ~[HikariCP-2.6.1.jar:?]
        at 
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
 ~[HikariCP-2.6.1.jar:?]
        at 
org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:375)
 ~[datanucleus-rdbms-4.1.19.jar:?]
        at 
org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:552)
 ~[datanucleus-rdbms-4.1.19.jar:?]
        at 
org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:645) 
~[datanucleus-rdbms-4.1.19.jar:?]
        at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) 
~[datanucleus-core-4.1.17.jar:?]
        at 
org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) 
~[datanucleus-rdbms-4.1.19.jar:?]
        at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:368) 
~[datanucleus-api-jdo-4.2.4.jar:?]
        ... 35 more
{code}
the query that is failing is generated by: 

org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal

here is the query with the parameters added:
{code:sql}
select
        "PARTITIONS"."PART_ID"
from
        "PARTITIONS"
inner join "TBLS" on
        "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
        and "TBLS"."TBL_NAME" = 'test_table_int_1'
inner join "DBS" on
        "TBLS"."DB_ID" = "DBS"."DB_ID"
        and "DBS"."NAME" = 'intpartitionbugtest'
inner join "PARTITION_KEY_VALS" "FILTER0" on
        "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
        and "FILTER0"."INTEGER_IDX" = 0
where
"DBS"."CTLG_NAME" = 'hive'
        and
        ((
        (
                case
                when "FILTER0"."PART_KEY_VAL" <> '__HIVE_DEFAULT_PARTITION__' 
then cast("FILTER0"."PART_KEY_VAL" as decimal(21, 0))
                else null end) = 1))
{code}
 the query is failing on the casting to decimal of the string partition values 
(a,b,c)

+Analysis of the problem+

this issue doesn't always happen.
 when I start with a clean db, the query runs successfully, but after running 
many automated tests that are doing inserts and queries from hive tables, the 
query breaks

I run *EXPLAIN VERBOSE* on the query when it works and when it breaks:

*working plan:*

!working plan_2.png|width=887,height=277!

looks like first there is an index filtering in "TBL_NAME" that limits the 
query only to the correct table,
 and later when there is a filter on "PART_KEY_VAL", all the values are numeric

*broken plan:*

!broken plan_2.png|width=914,height=283!

we can see that first there is filter on "PART_KEY_VAL", and it runs on all the 
tables, and fail on the casting of the string columns

my guess is that after some inserts and queries of the relevant tables, the 
statistics for those tables change, and postgres decides no to use the index on 
"TBL_NAME"

+Workaround+

I managed to workaround the issue by setting the following setting on the 
postgres:
{code:java}
enable_seqscan = false{code}
from here: [https://www.postgresql.org/docs/9.5/runtime-config-query.html]
{quote}_It is impossible to suppress sequential scans entirely, but turning 
this variable off discourages the planner from using one if there are other 
methods available._ 
{quote}
so it will force the use of the index on "TBL_NAME",
 but that is not a good solution for production, as it will hurt performance

 

I think the solution should be to perform a safe cast in the query

  was:
+given the following situation:+

there are 2 tables (in db "intpartitionbugtest"), each with a few rows:
 # *test_table_int_1* partitioned by *y* of type *int*
 # *test_table_string_1* partitioned by *x* of type *string*

here is the output of the following query on the metastore db:
{code:sql}
select
        "PARTITIONS"."PART_ID",
        "TBLS"."TBL_NAME",
        "FILTER0"."PART_KEY_VAL",
        "PART_NAME"
from
        "PARTITIONS"
inner join "TBLS" on
        "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
inner join "DBS" on
        "TBLS"."DB_ID" = "DBS"."DB_ID"
inner join "PARTITION_KEY_VALS" "FILTER0" on
        "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
{code}
 

!image-2020-05-29-14-16-29-356.png!

+the problem+

when running a hive query on the table *test_table_int_1* that filters on *y=1*
 sometimes the following exception will happen on the metastore

 
{code:java}
javax.jdo.JDODataStoreException: Error executing SQL query "select 
"PARTITIONS"."PART_ID" from "PARTITIONS"  inner join "TBLS" on 
"PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"     and "TBLS"."TBL_NAME" = ?   inner 
join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"      and "DBS"."NAME" = ? inner 
join "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = 
"PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 where "DBS"."CTLG_NAME" 
= ?  and (((case when "FILTER0"."PART_KEY_VAL" <> ? then 
cast("FILTER0"."PART_KEY_VAL" as decimal(21,0)) else null end) = ?))".
        at 
org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
 ~[datanucleus-api-jdo-4.2.4.jar:?]
        at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) 
~[datanucleus-api-jdo-4.2.4.jar:?]
        at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267) 
~[datanucleus-api-jdo-4.2.4.jar:?]
        at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:2003)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:593)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:481)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:3853)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:3843)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3577)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:3861)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:3516)
 [hive-exec-3.1.2.jar:3.1.2]
        at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source) ~[?:?]
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_112]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
        at 
org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
[hive-exec-3.1.2.jar:3.1.2]
        at com.sun.proxy.$Proxy28.getPartitionsByFilter(Unknown Source) [?:?]
        at 
org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:5883)
 [hive-exec-3.1.2.jar:3.1.2]
        at sun.reflect.GeneratedMethodAccessor69.invoke(Unknown Source) ~[?:?]
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 ~[?:1.8.0_112]
        at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
 [hive-exec-3.1.2.jar:3.1.2]
        at com.sun.proxy.$Proxy30.get_partitions_by_filter(Unknown Source) [?:?]
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult(ThriftHiveMetastore.java:16234)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult(ThriftHiveMetastore.java:16218)
 [hive-exec-3.1.2.jar:3.1.2]
        at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
[hive-exec-3.1.2.jar:3.1.2]
        at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
[hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
 [hive-exec-3.1.2.jar:3.1.2]
        at java.security.AccessController.doPrivileged(Native Method) 
[?:1.8.0_112]
        at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 [hadoop-common-3.2.0.jar:?]
        at 
org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
 [hive-exec-3.1.2.jar:3.1.2]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[?:1.8.0_112]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[?:1.8.0_112]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
Caused by: org.postgresql.util.PSQLException: ERROR: invalid input syntax for 
type numeric: "c"
        at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2433)
 ~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2178)
 ~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306) 
~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) 
~[postgresql-42.2.2.jar:42.2.2]
        at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) 
~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:155)
 ~[postgresql-42.2.2.jar:42.2.2]
        at 
org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:118)
 ~[postgresql-42.2.2.jar:42.2.2]
        at 
com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
 ~[HikariCP-2.6.1.jar:?]
        at 
com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
 ~[HikariCP-2.6.1.jar:?]
        at 
org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:375)
 ~[datanucleus-rdbms-4.1.19.jar:?]
        at 
org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:552)
 ~[datanucleus-rdbms-4.1.19.jar:?]
        at 
org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:645) 
~[datanucleus-rdbms-4.1.19.jar:?]
        at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) 
~[datanucleus-core-4.1.17.jar:?]
        at 
org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807) 
~[datanucleus-rdbms-4.1.19.jar:?]
        at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:368) 
~[datanucleus-api-jdo-4.2.4.jar:?]
        ... 35 more
{code}
the query that is failing is generated by: 

org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal

here is the query with the parameters added:
{code:sql}
select
        "PARTITIONS"."PART_ID"
from
        "PARTITIONS"
inner join "TBLS" on
        "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
        and "TBLS"."TBL_NAME" = 'test_table_int_1'
inner join "DBS" on
        "TBLS"."DB_ID" = "DBS"."DB_ID"
        and "DBS"."NAME" = 'intpartitionbugtest'
inner join "PARTITION_KEY_VALS" "FILTER0" on
        "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
        and "FILTER0"."INTEGER_IDX" = 0
where
"DBS"."CTLG_NAME" = 'hive'
        and
        ((
        (
                case
                when "FILTER0"."PART_KEY_VAL" <> '__HIVE_DEFAULT_PARTITION__' 
then cast("FILTER0"."PART_KEY_VAL" as decimal(21, 0))
                else null end) = 1))
{code}
 the query is failing on the casting to decimal of the string partition values 
(a,b,c)

+Analysis of the problem+

this issue doesn't always happen.
 when I start with a clean db, the query runs successfully, but after running 
many automated tests that are doing inserts and queries from hive tables, the 
query breaks

I run *EXPLAIN VERBOSE* on the query when it works and when it breaks:

*working plan:*

!working plan_2.png|width=887,height=277!

looks like first there is an index filtering in "TBL_NAME" that limits the 
query only to the correct table,
 and later when there is a filter on "PART_KEY_VAL", all the values are numeric

*broken plan:*

!broken plan_2.png|width=914,height=283! we can see that first there is filter 
on "PART_KEY_VAL", and it runs on all the tables, and fail on the casting of 
the string columns

my guess is that after some inserts and queries of the relevant tables, the 
statistics for those tables change, and postgres decides no to use the index on 
"TBL_NAME"

+Workaround+

I managed to workaround the issue by setting the following setting on the 
postgres:
{code:java}
enable_seqscan = false{code}
from here: [https://www.postgresql.org/docs/9.5/runtime-config-query.html]
{quote}_It is impossible to suppress sequential scans entirely, but turning 
this variable off discourages the planner from using one if there are other 
methods available._ 
{quote}
so it will force the use of the index on "TBL_NAME",
but that is not a good solution for production, as it will hurt performance

 

I think the solution should be to perform a safe cast in the query


> Getting partition of type int from metastore sometimes fail on cast error
> -------------------------------------------------------------------------
>
>                 Key: HIVE-23576
>                 URL: https://issues.apache.org/jira/browse/HIVE-23576
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, Standalone Metastore
>    Affects Versions: 3.1.2
>         Environment: metastore db - postgres (tried on 9.3 and 11.5)
>            Reporter: Lev Katzav
>            Priority: Major
>         Attachments: broken plan_2.png, image-2020-05-29-14-16-29-356.png, 
> working plan_2.png
>
>
> +given the following situation:+
> there are 2 tables (in db "intpartitionbugtest"), each with a few rows:
>  # *test_table_int_1* partitioned by *y* of type *int*
>  # *test_table_string_1* partitioned by *x* of type *string*
> here is the output of the following query on the metastore db:
> {code:sql}
> select
>       "PARTITIONS"."PART_ID",
>       "TBLS"."TBL_NAME",
>       "FILTER0"."PART_KEY_VAL",
>       "PART_NAME"
> from
>       "PARTITIONS"
> inner join "TBLS" on
>       "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
> inner join "DBS" on
>       "TBLS"."DB_ID" = "DBS"."DB_ID"
> inner join "PARTITION_KEY_VALS" "FILTER0" on
>       "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
> {code}
>  
> !image-2020-05-29-14-16-29-356.png!
> +the problem+
> when running a hive query on the table *test_table_int_1* that filters on 
> *y=1*
>  sometimes the following exception will happen on the metastore
>  
> {code:java}
> javax.jdo.JDODataStoreException: Error executing SQL query "select 
> "PARTITIONS"."PART_ID" from "PARTITIONS"  inner join "TBLS" on 
> "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"     and "TBLS"."TBL_NAME" = ?   inner 
> join "DBS" on "TBLS"."DB_ID" = "DBS"."DB_ID"      and "DBS"."NAME" = ? inner 
> join "PARTITION_KEY_VALS" "FILTER0" on "FILTER0"."PART_ID" = 
> "PARTITIONS"."PART_ID" and "FILTER0"."INTEGER_IDX" = 0 where 
> "DBS"."CTLG_NAME" = ?  and (((case when "FILTER0"."PART_KEY_VAL" <> ? then 
> cast("FILTER0"."PART_KEY_VAL" as decimal(21,0)) else null end) = ?))".
>       at 
> org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:543)
>  ~[datanucleus-api-jdo-4.2.4.jar:?]
>       at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391) 
> ~[datanucleus-api-jdo-4.2.4.jar:?]
>       at org.datanucleus.api.jdo.JDOQuery.executeWithArray(JDOQuery.java:267) 
> ~[datanucleus-api-jdo-4.2.4.jar:?]
>       at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.executeWithArray(MetaStoreDirectSql.java:2003)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal(MetaStoreDirectSql.java:593)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilter(MetaStoreDirectSql.java:481)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:3853)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.ObjectStore$11.getSqlResult(ObjectStore.java:3843)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.ObjectStore$GetHelper.run(ObjectStore.java:3577)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilterInternal(ObjectStore.java:3861)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.ObjectStore.getPartitionsByFilter(ObjectStore.java:3516)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at sun.reflect.GeneratedMethodAccessor70.invoke(Unknown Source) ~[?:?]
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
>       at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
>       at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97) 
> [hive-exec-3.1.2.jar:3.1.2]
>       at com.sun.proxy.$Proxy28.getPartitionsByFilter(Unknown Source) [?:?]
>       at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_partitions_by_filter(HiveMetaStore.java:5883)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at sun.reflect.GeneratedMethodAccessor69.invoke(Unknown Source) ~[?:?]
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[?:1.8.0_112]
>       at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_112]
>       at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at com.sun.proxy.$Proxy30.get_partitions_by_filter(Unknown Source) [?:?]
>       at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult(ThriftHiveMetastore.java:16234)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_partitions_by_filter.getResult(ThriftHiveMetastore.java:16218)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) 
> [hive-exec-3.1.2.jar:3.1.2]
>       at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) 
> [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:636)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor$1.run(HadoopThriftAuthBridge.java:631)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at java.security.AccessController.doPrivileged(Native Method) 
> [?:1.8.0_112]
>       at javax.security.auth.Subject.doAs(Subject.java:422) [?:1.8.0_112]
>       at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
>  [hadoop-common-3.2.0.jar:?]
>       at 
> org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge$Server$TUGIAssumingProcessor.process(HadoopThriftAuthBridge.java:631)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
>  [hive-exec-3.1.2.jar:3.1.2]
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  [?:1.8.0_112]
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  [?:1.8.0_112]
>       at java.lang.Thread.run(Thread.java:745) [?:1.8.0_112]
> Caused by: org.postgresql.util.PSQLException: ERROR: invalid input syntax for 
> type numeric: "c"
>       at 
> org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:2433)
>  ~[postgresql-42.2.2.jar:42.2.2]
>       at 
> org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:2178)
>  ~[postgresql-42.2.2.jar:42.2.2]
>       at 
> org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:306) 
> ~[postgresql-42.2.2.jar:42.2.2]
>       at 
> org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441) 
> ~[postgresql-42.2.2.jar:42.2.2]
>       at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365) 
> ~[postgresql-42.2.2.jar:42.2.2]
>       at 
> org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:155)
>  ~[postgresql-42.2.2.jar:42.2.2]
>       at 
> org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:118)
>  ~[postgresql-42.2.2.jar:42.2.2]
>       at 
> com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
>  ~[HikariCP-2.6.1.jar:?]
>       at 
> com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
>  ~[HikariCP-2.6.1.jar:?]
>       at 
> org.datanucleus.store.rdbms.ParamLoggingPreparedStatement.executeQuery(ParamLoggingPreparedStatement.java:375)
>  ~[datanucleus-rdbms-4.1.19.jar:?]
>       at 
> org.datanucleus.store.rdbms.SQLController.executeStatementQuery(SQLController.java:552)
>  ~[datanucleus-rdbms-4.1.19.jar:?]
>       at 
> org.datanucleus.store.rdbms.query.SQLQuery.performExecute(SQLQuery.java:645) 
> ~[datanucleus-rdbms-4.1.19.jar:?]
>       at org.datanucleus.store.query.Query.executeQuery(Query.java:1855) 
> ~[datanucleus-core-4.1.17.jar:?]
>       at 
> org.datanucleus.store.rdbms.query.SQLQuery.executeWithArray(SQLQuery.java:807)
>  ~[datanucleus-rdbms-4.1.19.jar:?]
>       at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:368) 
> ~[datanucleus-api-jdo-4.2.4.jar:?]
>       ... 35 more
> {code}
> the query that is failing is generated by: 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.getPartitionsViaSqlFilterInternal
> here is the query with the parameters added:
> {code:sql}
> select
>       "PARTITIONS"."PART_ID"
> from
>       "PARTITIONS"
> inner join "TBLS" on
>       "PARTITIONS"."TBL_ID" = "TBLS"."TBL_ID"
>       and "TBLS"."TBL_NAME" = 'test_table_int_1'
> inner join "DBS" on
>       "TBLS"."DB_ID" = "DBS"."DB_ID"
>       and "DBS"."NAME" = 'intpartitionbugtest'
> inner join "PARTITION_KEY_VALS" "FILTER0" on
>       "FILTER0"."PART_ID" = "PARTITIONS"."PART_ID"
>       and "FILTER0"."INTEGER_IDX" = 0
> where
> "DBS"."CTLG_NAME" = 'hive'
>       and
>       ((
>       (
>               case
>               when "FILTER0"."PART_KEY_VAL" <> '__HIVE_DEFAULT_PARTITION__' 
> then cast("FILTER0"."PART_KEY_VAL" as decimal(21, 0))
>               else null end) = 1))
> {code}
>  the query is failing on the casting to decimal of the string partition 
> values (a,b,c)
> +Analysis of the problem+
> this issue doesn't always happen.
>  when I start with a clean db, the query runs successfully, but after running 
> many automated tests that are doing inserts and queries from hive tables, the 
> query breaks
> I run *EXPLAIN VERBOSE* on the query when it works and when it breaks:
> *working plan:*
> !working plan_2.png|width=887,height=277!
> looks like first there is an index filtering in "TBL_NAME" that limits the 
> query only to the correct table,
>  and later when there is a filter on "PART_KEY_VAL", all the values are 
> numeric
> *broken plan:*
> !broken plan_2.png|width=914,height=283!
> we can see that first there is filter on "PART_KEY_VAL", and it runs on all 
> the tables, and fail on the casting of the string columns
> my guess is that after some inserts and queries of the relevant tables, the 
> statistics for those tables change, and postgres decides no to use the index 
> on "TBL_NAME"
> +Workaround+
> I managed to workaround the issue by setting the following setting on the 
> postgres:
> {code:java}
> enable_seqscan = false{code}
> from here: [https://www.postgresql.org/docs/9.5/runtime-config-query.html]
> {quote}_It is impossible to suppress sequential scans entirely, but turning 
> this variable off discourages the planner from using one if there are other 
> methods available._ 
> {quote}
> so it will force the use of the index on "TBL_NAME",
>  but that is not a good solution for production, as it will hurt performance
>  
> I think the solution should be to perform a safe cast in the query



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to