VenuReddy2103 opened a new pull request, #3905: URL: https://github.com/apache/hive/pull/3905
### What changes were proposed in this pull request? Proposing to use direct SQL for add partitions. 1. Added a datanucleus property `datanucleus.query.sql.allowAll` to allow insert operation from JDO SQL(default value is true). 2. Added a property `metastore.direct.sql.max.parameters.in.insert` to control the maximum rows in multiple row insert query to database. 3. Implemented utility to get identity value from the DataNucleus value generator. 4. Implemented multiple row insert in batches for all the tables involved in add partitions. 5. Implemented fallback to JDO based in case of any failure in direct SQL. ### Why are the changes needed? When data is loaded/inserted to the dynamic partition table, many partitions are created based on the cardinality of the partition column(s). These partitions are updated to the hive metastore backing datastore using `pm.makePersistentAll()` datanucleus JDO API. Partition model class depends on(or holds reference to) several other model classes/structs such as Storage descriptor, Serdes, Serde params, Partition params, Partition key values etc. When a model class instance is persisted using JDO API, datanucleus internally does cascade persistence of all the other model instances it depends on recursively. During this process, it fires a lot of SQL queries to the backing datastore for the tables of the model classes/structs respectively. Many of the queries among them are redundant and takes longer time when partitions are more. [HIVE-26035](https://issues.apache.org/jira/browse/HIVE-26035) ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Tested the scenario manually. Ran all the unittest cases. Below is the performance measurement in the local system using MySQL database with metastore-benchmarks tool available at [https://github.com/akolb1/hclient/tree/master/metastore-benchmarks](https://github.com/akolb1/hclient/tree/master/metastore-benchmarks). <img width="616" alt="image" src="https://user-images.githubusercontent.com/35334869/210361898-6a18ed2b-5347-4001-bbea-fa0f2dddf5fc.png"> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
