[ 
https://issues.apache.org/jira/browse/HIVE-24343?focusedWorklogId=510182&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-510182
 ]

ASF GitHub Bot logged work on HIVE-24343:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 11/Nov/20 10:49
            Start Date: 11/Nov/20 10:49
    Worklog Time Spent: 10m 
      Work Description: vnhive commented on a change in pull request #1640:
URL: https://github.com/apache/hive/pull/1640#discussion_r521272977



##########
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/MetaStoreDirectSql.java
##########
@@ -870,44 +870,88 @@ private boolean isViewTable(String catName, String 
dbName, String tblName) throw
       List<? extends Object> paramsForFilter, List<String> joinsForFilter, 
Integer max)
       throws MetaException {
     boolean doTrace = LOG.isDebugEnabled();
+
+    // The JDBC client driver implementations of Derby, PostgreSQL, MySQL, 
MariaDB and Oracle send the number of
+    // parameters being set in a executed statement as a 2 byte signed integer 
to the servers. This indirectly
+    // sets s limit on the number of parameters that can be set to 32767. This 
is also the number of parameters
+    // that can be passed to the JDO executeArray call.
+    final int JDBC_STMT_PARAM_LIMIT = 32767;

Review comment:
       1. I am OK with this. Will do this and resubmit the patch.
   
   > 2. But do we really know that it works? We dont have a test case for this 
that tests across all databases nor have we 
   > tested this, afaik.
   
   Agreed. I will look into creating a test. Insofar as the issue itself is 
concerned they seem to be documented limitations of the JDBC driver and the 
solution is to break the original query into multiple queries. Please do take a 
look at - https://www.postgresql.org/message-id/16832734.post%40talk.nabble.com
   
   > We dont have a test case for this that tests across all databases nor have 
we tested this, afaik.
   
   Agreed. I can work on this.
   
   > Also we are starting to support a pluggable model for DB support. So users 
can use HMS with other DBs as well for 
   > example DB2 (all they have to do is provide the schema scripts). So we 
don't know what their limits would be.
   > Would it make sense to use a smaller number? what is the downside of doing 
so?
   
   Thank you for explaining about the pluggable model for DB support. We can 
consider a smaller size for the parameters that are passed to the query.
   
   The downside of reducing the number of parameters that are passed would be 
that it can potentially result in increases the number of iterations required 
to get the result of the query for the same number of parameters.
   
   for example if there are 30000 parameters, with a size of 32767, it can be 
executed in a single query. But if we set the size to 1000, it will result in 
30000/1000 = 30 iterations.
   
   This will result in increased latency.
   
   It is important to note that this part of the code is executed for all DDL, 
DML and queries that involved partitioned tables. Hence it has potential to 
increase the latency across the spectrum.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 510182)
    Time Spent: 0.5h  (was: 20m)

> Table partition operations (create, drop, select) fail when the number of 
> partitions is greater than 32767 (signed int)
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-24343
>                 URL: https://issues.apache.org/jira/browse/HIVE-24343
>             Project: Hive
>          Issue Type: Bug
>          Components: Metastore
>            Reporter: Narayanan Venkateswaran
>            Assignee: Narayanan Venkateswaran
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The table partition operations - create, drop, select access the underlying 
> relation database using JDO, which internally routes the operations through 
> the JDBC driver. Most of the underlying JDBC driver implementations place a 
> limit on the number of parameters that can be passed through a statement 
> implementation. The limitations are as follows,
> postgreSQL - 32767
> (https://www.postgresql.org/message-id/16832734.post%40talk.nabble.com)
> MySQL - 32767 - 2 Byte Integer - num of params
> (https://dev.mysql.com/doc/internals/en/com-stmt-prepare-response.html#packet-COM_STMT_PREPARE_OK)
> Oracle - 32767 -
> https://www.jooq.org/doc/3.12/manual/sql-building/dsl-context/custom-settings/settings-inline-threshold/
> Derby - 32767 - stored in an unsinged integer - Note the Prepared
> Statement implementation here -
> [https://svn.apache.org/repos/asf/db/derby/code/branches/10.1/java/client/org/apache/derby/client/am/PreparedStatement.java]
>  
> These limits should be taken into account when querying the underlying 
> metastore.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to