[jira] [Created] (HIVE-20589) Collect metrics for HMS notifications
Alexander Kolbasov created HIVE-20589: - Summary: Collect metrics for HMS notifications Key: HIVE-20589 URL: https://issues.apache.org/jira/browse/HIVE-20589 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 4.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov Currently we do not have much visibility into HMS notifications - there are no metrics showing counts for different events or any timing stats. It would be good to collect these. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
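As a rough illustration of the kind of metrics meant here, the sketch below keeps a count and a cumulative time per event type. The class {{NotificationMetrics}} and its methods are hypothetical names made up for this example; they are not existing Hive classes, and a real patch would presumably plug into the existing metrics subsystem instead.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper (not Hive code): per-event-type counters and timing totals.
class NotificationMetrics {
  private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
  private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

  // Record one processed event of the given type and how long it took.
  void record(String eventType, long elapsedNanos) {
    counts.computeIfAbsent(eventType, k -> new LongAdder()).increment();
    totalNanos.computeIfAbsent(eventType, k -> new LongAdder()).add(elapsedNanos);
  }

  // Number of events recorded for this type (0 if none).
  long count(String eventType) {
    LongAdder adder = counts.get(eventType);
    return adder == null ? 0 : adder.sum();
  }

  // Mean processing time in nanoseconds, or 0 if nothing was recorded.
  long meanNanos(String eventType) {
    long n = count(eventType);
    LongAdder total = totalNanos.get(eventType);
    if (n == 0 || total == null) {
      return 0;
    }
    return total.sum() / n;
  }
}
```

A caller would wrap each notification write with {{System.nanoTime()}} and pass the difference to {{record()}}.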
[jira] [Created] (HIVE-20564) Remove Hive Server dependency on Metastore Server
Alexander Kolbasov created HIVE-20564: - Summary: Remove Hive Server dependency on Metastore Server Key: HIVE-20564 URL: https://issues.apache.org/jira/browse/HIVE-20564 Project: Hive Issue Type: Sub-task Reporter: Alexander Kolbasov Currently Hive Server2 still depends on some classes from Metastore Server - we should break this dependency. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20492) Failing test test_teradatabinaryfile
Alexander Kolbasov created HIVE-20492: - Summary: Failing test test_teradatabinaryfile Key: HIVE-20492 URL: https://issues.apache.org/jira/browse/HIVE-20492 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 4.0.0 Reporter: Alexander Kolbasov I see a failure of the org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[test_teradatabinaryfile] test on the master branch after the commit for HIVE-20225. The test fails for completely unrelated changes. [~luli] Can you check this? -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20483) Really move metastore common classes into metastore-common
Alexander Kolbasov created HIVE-20483: - Summary: Really move metastore common classes into metastore-common Key: HIVE-20483 URL: https://issues.apache.org/jira/browse/HIVE-20483 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Affects Versions: 3.0.1, 4.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov HIVE-20482 patch was supposed to move a bunch of files from metastore-server to metastore-common but for some reason it didn't happen, so now these files should be moved. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20482) Remove dependency on metastore-server
Alexander Kolbasov created HIVE-20482: - Summary: Remove dependency on metastore-server Key: HIVE-20482 URL: https://issues.apache.org/jira/browse/HIVE-20482 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Affects Versions: 3.0.1, 4.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov Now that we separated common and server classes we should remove dependency on the server module from poms. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20404) Split HMS security classes into client and server parts
Alexander Kolbasov created HIVE-20404: - Summary: Split HMS security classes into client and server parts Key: HIVE-20404 URL: https://issues.apache.org/jira/browse/HIVE-20404 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov Currently many security-related classes handle both client and server side together. We would like to separate these into server and client components so that clients do not need server-related code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20390) Split TxnUtils into common and server parts.
Alexander Kolbasov created HIVE-20390: - Summary: Split TxnUtils into common and server parts. Key: HIVE-20390 URL: https://issues.apache.org/jira/browse/HIVE-20390 Project: Hive Issue Type: Sub-task Reporter: Alexander Kolbasov HiveMetastoreClient uses some static methods from TxnUtils which should move to metastore-common package. Remaining server-specific methods should remain in metastore-server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20388) Move common classes out of metastore-server
Alexander Kolbasov created HIVE-20388: - Summary: Move common classes out of metastore-server Key: HIVE-20388 URL: https://issues.apache.org/jira/browse/HIVE-20388 Project: Hive Issue Type: Sub-task Reporter: Alexander Kolbasov There are many classes in metastore-server module that should be moved to metastore-common. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20387) Move non-server related methods from Warehouse to MetastoreUtils
Alexander Kolbasov created HIVE-20387: - Summary: Move non-server related methods from Warehouse to MetastoreUtils Key: HIVE-20387 URL: https://issues.apache.org/jira/browse/HIVE-20387 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Affects Versions: 3.0.1, 4.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov Most of the functions in Warehouse class are only relevant for the server. There are some utility methods that are there that we can move to MetastoreUtils - these are used outside of the server. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20198) Constant time table drops/renames
Alexander Kolbasov created HIVE-20198: - Summary: Constant time table drops/renames Key: HIVE-20198 URL: https://issues.apache.org/jira/browse/HIVE-20198 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 4.0.0 Reporter: Alexander Kolbasov Currently table drops and table renames have O(P) performance, where P is the number of partitions. When a managed table is deleted, the implementation deletes table metadata and then deletes all partitions in HDFS. HDFS operations are optimized and only do sequential deletes for partitions outside of the table prefix. This is an O(P) operation. Table rename goes through the list of partitions and modifies the table name (and potentially the db name) in each partition. It also modifies each partition location to match the new db/table name and renames directories (which is a non-atomic and slow operation on S3). This is also an O(P) operation. The basic idea is to do the following:
# Assign a unique ID to each table
# Create the directory name based on the unique ID rather than the name
# Table rename then becomes a metadata-only operation - there is no need to change any location information.
# Table drop can become an asynchronous operation where the table is marked as "deleted". Subsequent public metadata APIs should skip such tables. A background cleaner thread may then go and clean up directories.
Since the table location is unique for each table, new tables will not reuse existing locations. This change isn't compatible with the current behavior, where there is an assumption that the table location is based on the table name. We can get around this by providing an "opt-in" mechanism - a special table property that tells that the table can have such new behavior, so the improvement will initially work for new tables created with this feature enabled. We may later provide some tool to convert existing tables to the new scheme.
One complication arises when impersonation is enabled - the FS operations should be performed using the client UGI rather than the server's, so the cleaner thread should be able to use client UGIs. Initially we can punt on this and do standard table drops when impersonation is enabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
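To make the proposal a bit more concrete, here is a small in-memory sketch of ID-based locations. Everything in it ({{IdBasedCatalog}}, the {{/warehouse/tbl_<id>}} layout, the pending-cleanup list) is a hypothetical illustration and not existing Hive code; it only shows why rename stops touching locations and why drop can be deferred.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory catalog (not Hive code) with ID-based table locations.
class IdBasedCatalog {
  private long nextId = 1;
  private final Map<String, Long> nameToId = new HashMap<>();
  private final List<Long> pendingCleanup = new ArrayList<>();

  // Assign a fresh unique ID; the location is derived from the ID, not the name.
  synchronized long createTable(String name) {
    if (nameToId.containsKey(name)) {
      throw new IllegalArgumentException("table exists: " + name);
    }
    long id = nextId++;
    nameToId.put(name, id);
    return id;
  }

  synchronized String location(String name) {
    Long id = nameToId.get(name);
    if (id == null) {
      throw new IllegalArgumentException("no such table: " + name);
    }
    return "/warehouse/tbl_" + id; // assumed layout, for illustration only
  }

  // Metadata-only rename: the ID, and therefore the location, does not change.
  synchronized void rename(String oldName, String newName) {
    if (nameToId.containsKey(newName)) {
      throw new IllegalArgumentException("table exists: " + newName);
    }
    Long id = nameToId.remove(oldName);
    if (id == null) {
      throw new IllegalArgumentException("no such table: " + oldName);
    }
    nameToId.put(newName, id);
  }

  // Drop hides the table immediately; directories are left for a background cleaner.
  synchronized void drop(String name) {
    Long id = nameToId.remove(name);
    if (id == null) {
      throw new IllegalArgumentException("no such table: " + name);
    }
    pendingCleanup.add(id);
  }

  synchronized boolean exists(String name) {
    return nameToId.containsKey(name);
  }

  // IDs whose directories still have to be removed by the cleaner.
  synchronized List<Long> pendingCleanup() {
    return new ArrayList<>(pendingCleanup);
  }
}
```

Both {{rename()}} and {{drop()}} are O(1) here regardless of the number of partitions; the O(P) filesystem work moves to the cleaner.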
[jira] [Created] (HIVE-20196) Separate MetastoreConf into common and server parts
Alexander Kolbasov created HIVE-20196: - Summary: Separate MetastoreConf into common and server parts Key: HIVE-20196 URL: https://issues.apache.org/jira/browse/HIVE-20196 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Affects Versions: 4.0.0, 3.2.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov MetastoreConf has knowledge about some server-specific classes. We need to separate these into a separate server-specific class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20195) Split MetastoreUtils into common and server-specific parts
Alexander Kolbasov created HIVE-20195: - Summary: Split MetastoreUtils into common and server-specific parts Key: HIVE-20195 URL: https://issues.apache.org/jira/browse/HIVE-20195 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Affects Versions: 4.0.0, 3.2.0 Reporter: Alexander Kolbasov Parts of MetastoreUtils are used by clients and the server, parts are used by server only. We need to separate server-only parts in a separate class. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20194) HiveMetastoreClient should use reflection to instantiate embedded HMS instance
Alexander Kolbasov created HIVE-20194: - Summary: HiveMetastoreClient should use reflection to instantiate embedded HMS instance Key: HIVE-20194 URL: https://issues.apache.org/jira/browse/HIVE-20194 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Affects Versions: 4.0.0, 3.2.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov When HiveMetastoreClient is used in embedded mode, it instantiates metastore server. Since we want to separate client and server code we can no longer instantiate the class directly but need to use reflection for that. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
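A minimal sketch of the reflection approach, using only standard {{java.lang.reflect}} calls. The helper name {{EmbeddedLoader}} and the assumption that the target class has a public constructor taking a single {{String}} are made up for the example; the real server class and its constructor arguments may differ.

```java
import java.lang.reflect.Constructor;

// Hypothetical helper (not Hive code): load a class by name so the caller
// has no compile-time dependency on it.
class EmbeddedLoader {
  // Instantiate className through its public (String) constructor.
  static Object instantiate(String className, String name) {
    try {
      Class<?> cls = Class.forName(className);
      Constructor<?> ctor = cls.getConstructor(String.class);
      return ctor.newInstance(name);
    } catch (ReflectiveOperationException e) {
      // Covers missing class, missing constructor and constructor failures.
      throw new IllegalStateException("Cannot instantiate " + className, e);
    }
  }
}
```

The class name would typically come from configuration, so the client module only needs the server jar on the classpath at run time when embedded mode is actually used.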
[jira] [Created] (HIVE-20189) Separate metastore client code into its own module
Alexander Kolbasov created HIVE-20189: - Summary: Separate metastore client code into its own module Key: HIVE-20189 URL: https://issues.apache.org/jira/browse/HIVE-20189 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Affects Versions: 4.0.0, 3.2.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov The goal of this JIRA is to split HiveMetastoreClient code out of metastore-common. This is a pom-only change that does not require any changes in the code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20188) Split server-specific code outside of standalone metastore-common
Alexander Kolbasov created HIVE-20188: - Summary: Split server-specific code outside of standalone metastore-common Key: HIVE-20188 URL: https://issues.apache.org/jira/browse/HIVE-20188 Project: Hive Issue Type: Sub-task Components: Standalone Metastore Affects Versions: 4.0.0, 3.2.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov The goal of this JIRA is to split metastore-common and separate the server code into a separate module. This is still a pom-only change so all consumers will have access to both. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-20097) Convert standalone-metastore to a submodule
Alexander Kolbasov created HIVE-20097: - Summary: Convert standalone-metastore to a submodule Key: HIVE-20097 URL: https://issues.apache.org/jira/browse/HIVE-20097 Project: Hive Issue Type: Sub-task Components: Hive, Metastore, Standalone Metastore Affects Versions: 3.1.0, 4.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov This is a subtask to stage HIVE-17751 changes into several smaller phases. The first part is moving existing code in hive-standalone-metastore to a sub-module. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19902) Provide Metastore micro-benchmarks
Alexander Kolbasov created HIVE-19902: - Summary: Provide Metastore micro-benchmarks Key: HIVE-19902 URL: https://issues.apache.org/jira/browse/HIVE-19902 Project: Hive Issue Type: Improvement Components: Standalone Metastore Affects Versions: 3.1.0, 4.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov It would be very useful to have metastore benchmarks to be able to track perf issues. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19719) Adding metastore batch API for partitions
Alexander Kolbasov created HIVE-19719: - Summary: Adding metastore batch API for partitions Key: HIVE-19719 URL: https://issues.apache.org/jira/browse/HIVE-19719 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 3.1.0, 4.0.0 Reporter: Alexander Kolbasov Hive Metastore provides APIs for fetching a collection of objects (usually tables or partitions). These APIs provide a way to fetch all available objects, so the size of the response is O(N) where N is the number of objects. These calls have several problems:
* All objects (and there may be thousands or even millions) must be fetched from the database, serialized to a Java list of Thrift objects and then serialized into a byte array for sending over the network. This creates spikes of huge memory pressure, especially since in some cases multiple copies of the same data are present in memory (e.g. unserialized and serialized versions).
* Even though HMS tries to avoid string duplication by using string interning in Java, duplicated strings must be serialized in the output array.
* Java has a 2 GB limit on the maximum size of a byte array, and crashes with an Out Of Memory exception if this array size is exceeded.
* Fetching a huge number of objects blows up DB caches and memory caches in the system. Receiving such huge messages also creates memory pressure on the receiver side (usually HS2), which can cause it to crash with an Out Of Memory exception as well.
* Such requests have very high latencies since the server must collect all objects, serialize them and send them all to the network before the client can do anything with the result.
To prevent cases of Out Of Memory exceptions, the server now has a configurable limit on the maximum number of objects returned. This helps to avoid crashes, but doesn’t allow for correct query execution since the result will include a random and incomplete set of K objects.
Currently this is addressed on the client side by simulating batching: the client gets the list of table or partition names first and then requests table information for parts of this list. Still, the list of objects can be big as well, and this method requires locking to ensure that objects are not added or removed between the calls, especially if this is done outside of HS2. Instead we can make a simple modification of existing APIs which allows for batch iterator-style operations without keeping any server-side state. The main idea is to have a unique incrementing ID for each object. The IDs only need to be unique within their container (e.g. table IDs should be unique within a database and partition IDs should be unique within a table). Such an ID can be easily generated using the database auto-increment mechanism, or we can simply reuse the existing ID column that is already maintained by DataNucleus. The request is then modified to include
* Starting ID i0
* Batch size (B)
The server fetches up to B objects starting from i0, serializes them and sends them to the client. The client then requests the next batch by using the ID of the last received object plus one. It is possible to construct an SQL query (either by using DataNucleus JDOQL or in DirectSQL code) which only selects the needed objects, avoiding big reads from the database. The client then iterates until it fetches all the objects, and the memory size of each request is limited by the batch size. If we extend the API a little bit, providing a way to get the minimum and maximum ID values (either via a separate call or piggybacked on the normal reply), clients can request such batches concurrently, thus also reducing the latency. Clients can easily estimate the number of batches by knowing the total number of IDs. While this isn’t a precise method, it is good enough to divide the work. It is also possible to wrap this in a way similar to {{PartitionIterator}} and async-fetch the next batch while we are processing the current batch.
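The iteration scheme described above can be sketched as follows. {{BatchStore}} stands in for the server side (a sorted map plays the role of the database table with its ID column) and {{BatchClient}} for the caller; both names, and the choice of 0 as the smallest starting ID, are assumptions made for this example rather than a proposed API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-in for the server: objects keyed by an incrementing ID.
class BatchStore {
  private final NavigableMap<Long, String> objects = new TreeMap<>();

  void put(long id, String value) {
    objects.put(id, value);
  }

  // Return up to batchSize entries with ID >= startId, in ID order.
  // No state is kept between calls.
  List<Map.Entry<Long, String>> fetch(long startId, int batchSize) {
    List<Map.Entry<Long, String>> out = new ArrayList<>();
    for (Map.Entry<Long, String> e : objects.tailMap(startId, true).entrySet()) {
      if (out.size() == batchSize) {
        break;
      }
      out.add(new AbstractMap.SimpleImmutableEntry<>(e));
    }
    return out;
  }
}

// Hypothetical client loop: each request starts at the last received ID plus one.
class BatchClient {
  static List<String> fetchAll(BatchStore store, int batchSize) {
    List<String> all = new ArrayList<>();
    long next = 0; // assumes IDs are non-negative
    while (true) {
      List<Map.Entry<Long, String>> batch = store.fetch(next, batchSize);
      if (batch.isEmpty()) {
        break; // nothing left at or after 'next'
      }
      for (Map.Entry<Long, String> e : batch) {
        all.add(e.getValue());
      }
      next = batch.get(batch.size() - 1).getKey() + 1;
    }
    return all;
  }
}
```

Gaps in the ID space (for example after deletions) are harmless because the server selects by ID range rather than by offset.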
*Consistency considerations*
HMS only provides consistency guarantees for a single call. The set of objects that should be returned may change while we are iterating over it. In some cases this is not an issue since HS2 may use ZooKeeper locks on the table to prevent modifications, but in some cases this may be an issue (for example for calls that originate from external systems). We should consider additions and removals separately.
* New objects are added during iteration. All new objects are always added at the ‘end’ of the ID space, so they will always be picked up by the iterator. We assume that IDs are always incrementing.
* Some objects are removed during iteration. Removal of objects that are not already consumed is not a problem. It is possible that some objects which were already consumed are returned. Although this results in an inconsistent list of objects, this situation is
[jira] [Created] (HIVE-19718) Adding partitions in bulk also fetches table for each partition
Alexander Kolbasov created HIVE-19718: - Summary: Adding partitions in bulk also fetches table for each partition Key: HIVE-19718 URL: https://issues.apache.org/jira/browse/HIVE-19718 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 3.0.0 Environment: Looking at {{convertToMPart}}:
{code:Java}
private MPartition convertToMPart(Partition part, boolean useTableCD)
    throws InvalidObjectException, MetaException {
  MTable mt = getMTable(part.getCatName(), part.getDbName(), part.getTableName());
  ...
{code}
So what we have as a result is that we fetch the table for every partition when it should be done just once. Reporter: Alexander Kolbasov The ObjectStore.addPartitions() method does this:
{code:java}
for (Partition part : parts) {
  if (!part.getTableName().equals(tblName) || !part.getDbName().equals(dbName)) {
    throw new MetaException("Partition does not belong to target table "
        + dbName + "." + tblName + ": " + part);
  }
  MPartition mpart = convertToMPart(part, true); // <-- Here
  toPersist.add(mpart);
  ...
{code}
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
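The cost difference is easy to see in a toy model. In the sketch below {{lookupTable()}} is a hypothetical stand-in for an expensive call such as {{getMTable()}}, and the two conversion methods contrast the per-partition lookup with a lookup hoisted out of the loop. None of these names are Hive code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model (not Hive code) of per-partition vs. hoisted table lookup.
class TableLookupDemo {
  int lookups = 0; // counts simulated table fetches

  // Stand-in for an expensive metadata lookup.
  String lookupTable(String db, String tbl) {
    lookups++;
    return db + "." + tbl;
  }

  // Current shape: one table fetch for every partition.
  List<String> convertEach(String db, String tbl, List<String> parts) {
    List<String> out = new ArrayList<>();
    for (String part : parts) {
      out.add(lookupTable(db, tbl) + "/" + part);
    }
    return out;
  }

  // Hoisted shape: fetch the table once and reuse it for all partitions.
  List<String> convertHoisted(String db, String tbl, List<String> parts) {
    String table = lookupTable(db, tbl);
    List<String> out = new ArrayList<>();
    for (String part : parts) {
      out.add(table + "/" + part);
    }
    return out;
  }
}
```

Hoisting is safe in {{addPartitions()}} because the loop already rejects any partition that does not belong to the target table.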
[jira] [Created] (HIVE-19337) Partition whitelist regex doesn't work (and never did)
Alexander Kolbasov created HIVE-19337: - Summary: Partition whitelist regex doesn't work (and never did) Key: HIVE-19337 URL: https://issues.apache.org/jira/browse/HIVE-19337 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.3.3 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov {{ObjectStore.setConf()}} has the following code: {code:java} String partitionValidationRegex = hiveConf.get(HiveConf.ConfVars.METASTORE_PARTITION_NAME_WHITELIST_PATTERN.name()); {code} Note that it uses the {{name()}} method, which returns the enum name ({{METASTORE_PARTITION_NAME_WHITELIST_PATTERN}}), rather than {{varname}}. As a result the regex will always be null. The code was introduced as part of HIVE-7223 (Support generic PartitionSpecs in Metastore partition-functions), so it looks like this was broken since the original code drop. This is fixed in Hive 3 - probably when [~alangates] reworked access to configuration (HIVE-17733) - so it isn't a bug in Hive 3. [~stakiar_impala_496e] FYI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
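The {{name()}} versus {{varname}} mix-up can be reproduced in isolation. The enum below is a made-up miniature (the constant, the key {{demo.partition.name.whitelist.pattern}} and the helper class are all hypothetical); only the standard {{Enum.name()}} behavior is real.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical miniature of a configuration enum: the constant name and the
// configuration key are two different strings.
enum DemoConfVars {
  PARTITION_NAME_WHITELIST_PATTERN("demo.partition.name.whitelist.pattern");

  final String varname;

  DemoConfVars(String varname) {
    this.varname = varname;
  }
}

class ConfLookupDemo {
  // A configuration in which the pattern is set under its real key.
  static Map<String, String> conf() {
    Map<String, String> conf = new HashMap<>();
    conf.put(DemoConfVars.PARTITION_NAME_WHITELIST_PATTERN.varname, "[a-z]+");
    return conf;
  }

  // Buggy lookup: name() yields "PARTITION_NAME_WHITELIST_PATTERN", which is not a key.
  static String byEnumName() {
    return conf().get(DemoConfVars.PARTITION_NAME_WHITELIST_PATTERN.name());
  }

  // Correct lookup: use the configuration key stored in varname.
  static String byVarname() {
    return conf().get(DemoConfVars.PARTITION_NAME_WHITELIST_PATTERN.varname);
  }
}
```

{{byEnumName()}} returns null even though the value is configured, which matches the symptom described in this issue.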
[jira] [Created] (HIVE-19253) HMS ignores tableType property for external tables
Alexander Kolbasov created HIVE-19253: - Summary: HMS ignores tableType property for external tables Key: HIVE-19253 URL: https://issues.apache.org/jira/browse/HIVE-19253 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.0.2, 3.0.0, 3.1.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov When someone creates a table using the Thrift API they may think that setting tableType to {{EXTERNAL_TABLE}} creates an external table. And boom - their table is gone later because HMS will silently change it to a managed table. Here is the offending code:
{code:java}
private MTable convertToMTable(Table tbl) throws InvalidObjectException, MetaException {
  ...
  // If the table has property EXTERNAL set, update table type
  // accordingly
  String tableType = tbl.getTableType();
  boolean isExternal = Boolean.parseBoolean(tbl.getParameters().get("EXTERNAL"));
  if (TableType.MANAGED_TABLE.toString().equals(tableType)) {
    if (isExternal) {
      tableType = TableType.EXTERNAL_TABLE.toString();
    }
  }
  if (TableType.EXTERNAL_TABLE.toString().equals(tableType)) {
    if (!isExternal) { // Here!
      tableType = TableType.MANAGED_TABLE.toString();
    }
  }
{code}
So if the EXTERNAL parameter is not set, the table type is changed to managed even if it was external in the first place - which is wrong. Moreover, some places in the code look at the table type property to decide the table type, while others look at the parameter. HMS should really make up its mind which one to use. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
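One possible resolution, shown only as a sketch: treat the table as external if either the declared type or the EXTERNAL parameter says so, and never downgrade an explicitly external table. The class {{TableTypeResolver}} is hypothetical and uses plain strings instead of the real {{TableType}} enum; whether this is the right policy for HMS is exactly what this issue needs to decide.

```java
import java.util.Map;

// Hypothetical resolver (not Hive code): a declared EXTERNAL_TABLE is never
// silently turned into a managed table.
class TableTypeResolver {
  static String resolve(String declaredType, Map<String, String> params) {
    // parseBoolean(null) is false, so a missing parameter counts as "not set".
    boolean externalParam = params != null && Boolean.parseBoolean(params.get("EXTERNAL"));
    if ("EXTERNAL_TABLE".equals(declaredType)) {
      return "EXTERNAL_TABLE"; // honor the declared type
    }
    if ("MANAGED_TABLE".equals(declaredType) && externalParam) {
      return "EXTERNAL_TABLE"; // keep the existing upgrade path
    }
    return declaredType; // everything else is left untouched
  }
}
```

A fuller fix would probably also set the EXTERNAL parameter when the declared type is external, so that code paths reading the parameter agree with code paths reading the type.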
[jira] [Created] (HIVE-19241) HMSHandler initialization isn't thread-safe
Alexander Kolbasov created HIVE-19241: - Summary: HMSHandler initialization isn't thread-safe Key: HIVE-19241 URL: https://issues.apache.org/jira/browse/HIVE-19241 Project: Hive Issue Type: Bug Affects Versions: 2.0.2, 3.0.0, 3.1.0 Reporter: Alexander Kolbasov The code in HMSHandler uses the double-check anti-pattern:
{code:java}
public HMSHandler(String name, Configuration conf, boolean init) throws MetaException {
  super(name);
  this.conf = conf;
  isInTest = MetastoreConf.getBoolVar(this.conf, ConfVars.HIVE_IN_TEST);
  if (threadPool == null) { // No lock held!!
    synchronized (HMSHandler.class) {
      int numThreads = MetastoreConf.getIntVar(conf, ConfVars.FS_HANDLER_THREADS_COUNT);
      threadPool = Executors.newFixedThreadPool(numThreads,
          new ThreadFactoryBuilder().setDaemon(true)
              .setNameFormat("HMSHandler #%d").build());
    }
  }
{code}
Notice that the check for threadPool == null isn't protected. This means that users of threadPool may see a thread pool that isn't completely initialized. See https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html for a detailed explanation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
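For comparison, a sketch of lazy initialization that is safe under the Java memory model: the field is {{volatile}} and the null check is repeated while holding the lock. The class {{LazyPool}} and the thread name are made up for the example; a plain {{synchronized}} method would work just as well if the fast path does not matter.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical holder (not Hive code) for a lazily created shared thread pool.
class LazyPool {
  // volatile guarantees that a thread seeing a non-null reference also sees
  // a fully constructed pool.
  private static volatile ExecutorService threadPool;

  static ExecutorService get(int numThreads) {
    ExecutorService pool = threadPool; // one volatile read on the fast path
    if (pool == null) {
      synchronized (LazyPool.class) {
        pool = threadPool; // re-check under the lock
        if (pool == null) {
          pool = Executors.newFixedThreadPool(numThreads, r -> {
            Thread t = new Thread(r, "demo-handler");
            t.setDaemon(true);
            return t;
          });
          threadPool = pool;
        }
      }
    }
    return pool;
  }
}
```

Note that the snippet quoted in this issue also lacks the second check inside the {{synchronized}} block, so two racing threads could each create a pool.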
[jira] [Created] (HIVE-19177) ObjectStore.setConf() is doing dangerous work while holding global lock
Alexander Kolbasov created HIVE-19177: - Summary: ObjectStore.setConf() is doing dangerous work while holding global lock Key: HIVE-19177 URL: https://issues.apache.org/jira/browse/HIVE-19177 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.0.2, 3.0.0, 3.1.0 Reporter: Alexander Kolbasov The {{ObjectStore.setConf()}} function grabs static {{pmfPropLock}} and then calls {{initialize}} which goes through DataNucleus, accesses database, waits on DB thread pools, retries with sleep, etc, all while holding the lock. This is rather dangerous and expensive since no one else can call setConf at the same time on a different instance. All of these should be done without holding the lock. [~vihangk1] FYI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-19086) Write notifications in bulk only when commitTransaction actually commits
Alexander Kolbasov created HIVE-19086: - Summary: Write notifications in bulk only when commitTransaction actually commits Key: HIVE-19086 URL: https://issues.apache.org/jira/browse/HIVE-19086 Project: Hive Issue Type: Sub-task Components: Metastore Affects Versions: 3.0.0, 2.4.0 Reporter: Alexander Kolbasov This is an optimization that is targeting reducing the amount of time the global DB lock is held for notifications. The idea is to collect all notifications and only push them when commitTransaction() actually commits. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
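The idea can be sketched with a nesting counter: notifications are buffered in memory and written in one bulk operation only when the outermost {{commitTransaction()}} runs. The class {{BufferingTx}} and its in-memory "written" list are hypothetical stand-ins, not the ObjectStore implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model (not Hive code) of nested transactions with buffered notifications.
class BufferingTx {
  private int depth = 0;
  private final List<String> pending = new ArrayList<>();
  private final List<String> written = new ArrayList<>();

  void openTransaction() {
    depth++;
  }

  // Buffer the event instead of writing it right away.
  void addNotification(String event) {
    pending.add(event);
  }

  // Returns true only when the outermost transaction actually commits.
  boolean commitTransaction() {
    if (depth == 0) {
      throw new IllegalStateException("no open transaction");
    }
    depth--;
    if (depth > 0) {
      return false; // nested commit: nothing is written yet
    }
    written.addAll(pending); // single bulk write at the real commit
    pending.clear();
    return true;
  }

  // Rolling back discards everything that was buffered.
  void rollbackTransaction() {
    depth = 0;
    pending.clear();
  }

  List<String> written() {
    return new ArrayList<>(written);
  }
}
```

With this shape the notification lock only needs to be held for the duration of the final bulk write rather than for the whole transaction.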
[jira] [Created] (HIVE-18942) ALTER TABLE may generate huge event (with all partitions)
Alexander Kolbasov created HIVE-18942: - Summary: ALTER TABLE may generate huge event (with all partitions) Key: HIVE-18942 URL: https://issues.apache.org/jira/browse/HIVE-18942 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 3.0.0 Reporter: Alexander Kolbasov The ALTER TABLE handler in HiveAlterHandler has this code:
{code:java}
if (isPartitionedTable) {
  parts = msdb.getPartitions(newt.getDbName(), newt.getTableName(), -1);
  MetaStoreListenerNotifier.notifyEvent(transactionalListeners,
      EventMessage.EventType.ADD_PARTITION,
      new AddPartitionEvent(newt, parts, true, handler),
      environmentContext);
}
{code}
The problem is that the table may contain a huge number of partitions and the event will contain all of them. The Partition object itself isn't very small either, so we may end up with huge events which would be stored and then transmitted over the wire to consumers. [~spena] [~kkalyan] [~lina.li] [~vaidyand] FYI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18941) HMS non-transactional listener may be called in transactional context
Alexander Kolbasov created HIVE-18941: - Summary: HMS non-transactional listener may be called in transactional context Key: HIVE-18941 URL: https://issues.apache.org/jira/browse/HIVE-18941 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.0.2, 3.0.0 Reporter: Alexander Kolbasov When HMS code calls listeners it assumes that they are *not* called as part of the transaction. This isn't quite true because of the nested transaction - it is quite possible that these listeners are called as part of the bigger nested transaction. This causes several potential issues: 1) It changes the assumptions about the context in which these listeners run 2) It creates possibilities for deadlocks 3) Some of these listeners may do relative long operations which may delay transaction commits. [~spena] FYI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18888) Replace synchronizedMap with ConcurrentHashMap
Alexander Kolbasov created HIVE-18888: - Summary: Replace synchronizedMap with ConcurrentHashMap Key: HIVE-18888 URL: https://issues.apache.org/jira/browse/HIVE-18888 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.0.0, 2.3.3 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov There are a bunch of places that use Collections.synchronizedMap instead of ConcurrentHashMap, which is the better choice. We should search for and replace these uses. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
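A small sketch of what the replacement buys. With {{Collections.synchronizedMap}} every call takes the same lock and a compound read-modify-write still needs external synchronization, while {{ConcurrentHashMap}} offers atomic operations such as {{merge()}}. The class {{HitCounter}} is a made-up example. One thing to watch during a search/replace: {{ConcurrentHashMap}} does not accept null keys or values.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical counter (not Hive code) built on ConcurrentHashMap.
class HitCounter {
  private final Map<String, Integer> hits = new ConcurrentHashMap<>();

  // merge() performs the read-modify-write atomically for the given key.
  void hit(String key) {
    hits.merge(key, 1, Integer::sum);
  }

  int get(String key) {
    return hits.getOrDefault(key, 0);
  }

  // Run several threads against one counter and return the final count.
  static int hammer(int threads, int hitsPerThread) {
    HitCounter counter = new HitCounter();
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        for (int j = 0; j < hitsPerThread; j++) {
          counter.hit("k");
        }
      });
      workers[i].start();
    }
    for (Thread t : workers) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return counter.get("k");
  }
}
```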
[jira] [Created] (HIVE-18885) Cascaded alter table + notifications = disaster
Alexander Kolbasov created HIVE-18885: - Summary: Cascaded alter table + notifications = disaster Key: HIVE-18885 URL: https://issues.apache.org/jira/browse/HIVE-18885 Project: Hive Issue Type: Bug Components: Hive, Metastore Affects Versions: 3.0.0 Reporter: Alexander Kolbasov You can see the problem from looking at the code, but it actually created severe problems for a real-life Hive user. When {{alter table}} has the {{cascade}} option it does the following:
{code:java}
msdb.openTransaction()
...
List<Partition> parts = msdb.getPartitions(dbname, name, -1);
for (Partition part : parts) {
  List<FieldSchema> oldCols = part.getSd().getCols();
  part.getSd().setCols(newt.getSd().getCols());
  String oldPartName = Warehouse.makePartName(oldt.getPartitionKeys(), part.getValues());
  updatePartColumnStatsForAlterColumns(msdb, part, oldPartName, part.getValues(), oldCols, part);
  msdb.alterPartition(dbname, name, part.getValues(), part);
}
{code}
So it walks all partitions (and this may be a huge list) and does some non-trivial operations in one single uber-transaction. When DbNotificationListener is enabled, it adds an event for each partition, all while holding a row lock on the NOTIFICATION_SEQUENCE table. As a result, while this is happening no other write DDL can proceed. This can sometimes cause DB lock timeouts, which cause HMS-level operation retries, which make things even worse. In one particular case this pretty much made HMS unusable. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18768) Use Datanucleus to serialize notification updates
Alexander Kolbasov created HIVE-18768: - Summary: Use Datanucleus to serialize notification updates Key: HIVE-18768 URL: https://issues.apache.org/jira/browse/HIVE-18768 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 2.0.2, 3.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov HIVE-16886 added code to serialize notification updates using LOCK FOR UPDATE. It turns out that there is a simpler way - see HIVE-18526. The goal of this JIRA is to use the approach from HIVE-18526 - Datanucleus based solution. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18712) Design HMS Api v2
Alexander Kolbasov created HIVE-18712: - Summary: Design HMS Api v2 Key: HIVE-18712 URL: https://issues.apache.org/jira/browse/HIVE-18712 Project: Hive Issue Type: Improvement Components: Metastore Affects Versions: 3.0.0 Reporter: Alexander Kolbasov This is an umbrella Jira covering the design of Hive Metastore API v2. It is supposed to be a placeholder for discussion and design documents. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18707) Dropping database via HiveMetastoreClient involves useless work
Alexander Kolbasov created HIVE-18707: - Summary: Dropping database via HiveMetastoreClient involves useless work Key: HIVE-18707 URL: https://issues.apache.org/jira/browse/HIVE-18707 Project: Hive Issue Type: Bug Components: Metastore Affects Versions: 3.0.0, 2.3.3 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov HiveMetastoreClient has a dropDatabase() method which does this:
{code:java}
if (cascade) {
  List<String> tableList = getAllTables(name);
  for (String table : tableList) {
    try {
      // Subclasses can override this step (for example, for temporary tables)
      dropTable(name, table, deleteData, true);
    } catch (UnsupportedOperationException e) {
      // Ignore Index tables, those will be dropped with parent tables
    }
  }
}
{code}
This isn't needed since a similar thing is done on the server side; it just wastes time sending multiple server calls. [~pvary] [~alangates] FYI. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18526) Backport HIVE-16886 to Hive 2
Alexander Kolbasov created HIVE-18526: - Summary: Backport HIVE-16886 to Hive 2 Key: HIVE-18526 URL: https://issues.apache.org/jira/browse/HIVE-18526 Project: Hive Issue Type: Sub-task Components: Hive Affects Versions: 2.3.3 Reporter: Alexander Kolbasov The fix for HIVE-16886 isn't in Hive 2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (HIVE-18249) Remove Thrift dependency on fb303
Alexander Kolbasov created HIVE-18249: - Summary: Remove Thrift dependency on fb303 Key: HIVE-18249 URL: https://issues.apache.org/jira/browse/HIVE-18249 Project: Hive Issue Type: Bug Components: Hive, Metastore Affects Versions: 3.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov Looks like we are not really using fb303 and can remove fb303 dependency. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-18247) Use DB auto-increment for indexes
Alexander Kolbasov created HIVE-18247: - Summary: Use DB auto-increment for indexes Key: HIVE-18247 URL: https://issues.apache.org/jira/browse/HIVE-18247 Project: Hive Issue Type: Bug Components: Hive, Metastore Affects Versions: 3.0.0 Reporter: Alexander Kolbasov I initially noticed this problem in Apache Sentry - see SENTRY-1960. Hive has the same issue. DataNucleus uses the SEQUENCE table to allocate IDs, which requires row locks on multiple tables during transactions, and this creates scalability problems. Instead DN should rely on DB auto-increment mechanisms, which are much more scalable. See SENTRY-1960 for extra details. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-18039) Use ConcurrentHashMap for CachedStore
Alexander Kolbasov created HIVE-18039: - Summary: Use ConcurrentHashMap for CachedStore Key: HIVE-18039 URL: https://issues.apache.org/jira/browse/HIVE-18039 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.0.0 Reporter: Alexander Kolbasov SharedCache used by CachedStore uses single big lock to synchronize all access. This looks like an overkill - looks like it is possible to use ConcurrentHashMap instead. Also it makes sense to move deepCopy() operations outside the lock to reduce lock hold times. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17953) Metrics should move to destination atomically
Alexander Kolbasov created HIVE-17953: - Summary: Metrics should move to destination atomically Key: HIVE-17953 URL: https://issues.apache.org/jira/browse/HIVE-17953 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov HIVE-17563 reimplemented metrics using native nio interfaces. It assumed that {{Files.move()}} is an atomic operation. It turns out that by default it isn't, unless the {{ATOMIC_MOVE}} option is specified. Otherwise the destination file is unlinked and then the source file is copied. This may cause test failures since the file may be temporarily unavailable. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
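A minimal sketch of the atomic variant, not the actual Hive metrics reporter. The class name {{AtomicMetricsWriter}} is hypothetical. It assumes the temporary file is created in the same directory as the target, so the rename stays on one file system. Note that with {{ATOMIC_MOVE}} it is implementation specific whether an existing target is replaced or an {{IOException}} is thrown; on POSIX file systems the rename replaces it.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of Hive.
final class AtomicMetricsWriter {
  private AtomicMetricsWriter() {}

  /** Writes content to a temp file next to target, then renames it over target. */
  static void write(Path target, String content) throws IOException {
    Path dir = target.toAbsolutePath().getParent();
    Path tmp = Files.createTempFile(dir, "metrics", ".tmp");
    try {
      Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
      // Readers see either the old file or the new one, never a missing file.
      Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
    } finally {
      // No-op after a successful move; cleans up if the write or move failed.
      Files.deleteIfExists(tmp);
    }
  }
}
```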
[jira] [Created] (HIVE-17872) Ignoring schema autostart doesn't work (HIVE-14152 used the wrong setting)
Alexander Kolbasov created HIVE-17872: - Summary: Ignoring schema autostart doesn't work (HIVE-14152 used the wrong setting) Key: HIVE-17872 URL: https://issues.apache.org/jira/browse/HIVE-17872 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.0.0, 2.4.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov The fix for HIVE-14152 used the wrong DataNucleus property. The correct one is {{datanucleus.autoStartMechanism}}, but the patch uses {{datanucleus.autoStartMechanismMode}}, which is not supported by DataNucleus. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17849) alterPartition() may fail to rollback transaction
Alexander Kolbasov created HIVE-17849: - Summary: alterPartition() may fail to rollback transaction Key: HIVE-17849 URL: https://issues.apache.org/jira/browse/HIVE-17849 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.3.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov In HiveAlterHandler.alterPartition() there is this code:
{code}
try {
  msdb.openTransaction();
  msdb.alterPartition(dbname, name, new_part.getValues(), oldPart);
  if (transactionalListeners != null && !transactionalListeners.isEmpty()) {
    MetaStoreListenerNotifier.notifyEvent(transactionalListeners,
        EventMessage.EventType.ALTER_PARTITION,
        new AlterPartitionEvent(new_part, oldPart, tbl, success, handler));
  }
  revertMetaDataTransaction = msdb.commitTransaction();
} catch (Exception ex2) {
  LOG.error("Attempt to revert partition metadata change failed. The revert was attempted "
      + "because associated filesystem rename operation failed with exception "
      + ex.getMessage(), ex2);
  if (!revertMetaDataTransaction) {
    msdb.rollbackTransaction();
  }
}
{code}
Note that there is no {{finally}} clause, so it is possible for some unchecked exception to occur, in which case the transaction will remain active. Once this happens, all subsequent transactions on this thread will not behave correctly. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
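A minimal sketch of the {{finally}}-based shape this report asks for, not the actual patch. {{TxnStore}} and {{RevertSketch}} are hypothetical stand-ins that model only the three transaction calls used in the excerpt.

```java
// Hypothetical stand-in for the metastore store; only the calls used below.
interface TxnStore {
  void openTransaction();
  boolean commitTransaction();
  void rollbackTransaction();
}

final class RevertSketch {
  private RevertSketch() {}

  /** Runs body in a transaction; rolls back in finally unless the commit succeeded. */
  static boolean runInTransaction(TxnStore store, Runnable body) {
    boolean committed = false;
    try {
      store.openTransaction();
      body.run();
      committed = store.commitTransaction();
    } finally {
      // Runs for any exit path, including unchecked exceptions thrown by body.
      if (!committed) {
        store.rollbackTransaction();
      }
    }
    return committed;
  }
}
```

Any logging of the failure can stay in a {{catch}} block; the point is only that the rollback no longer depends on which exception type was thrown.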
[jira] [Created] (HIVE-17806) Create directory for metrics file if it doesn't exist
Alexander Kolbasov created HIVE-17806: - Summary: Create directory for metrics file if it doesn't exist Key: HIVE-17806 URL: https://issues.apache.org/jira/browse/HIVE-17806 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov HIVE-17563 changed metrics code to use local file system operations instead of Hadoop local file system operations. There is an unintended side effect - hadoop file systems create the directory if it doesn't exist and java nio interfaces don't. The purpose of this fix is to revert the behavior to the original one to avoid surprises. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
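A minimal sketch of restoring the old behavior with java nio, assuming the metrics file is written with {{Files.write()}}. The class and method names are hypothetical, not taken from the Hive patch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Hive.
final class MetricsDirSketch {
  private MetricsDirSketch() {}

  /** Creates any missing parent directories, then writes the file. */
  static void writeCreatingParents(Path file, String content) throws IOException {
    Path parent = file.toAbsolutePath().getParent();
    if (parent != null) {
      // Does nothing if the directory already exists.
      Files.createDirectories(parent);
    }
    Files.write(file, content.getBytes(StandardCharsets.UTF_8));
  }
}
```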
[jira] [Created] (HIVE-17738) CommitTransaction/rollbackTransaction may throw exceptions
Alexander Kolbasov created HIVE-17738: - Summary: CommitTransaction/rollbackTransaction may throw exceptions Key: HIVE-17738 URL: https://issues.apache.org/jira/browse/HIVE-17738 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.3.0, 3.0.0 Reporter: Alexander Kolbasov The code in ObjectStore assumes that commitTransaction/rollbackTransaction never throw exceptions when, in fact, they can. As a result, none of the callers down the chain handle these exceptions, which can cause problems. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17737) ObjectStore.getNotificationEventsCount may cause NPE
Alexander Kolbasov created HIVE-17737: - Summary: ObjectStore.getNotificationEventsCount may cause NPE Key: HIVE-17737 URL: https://issues.apache.org/jira/browse/HIVE-17737 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 2.3.0, 3.0.0 Reporter: Alexander Kolbasov In ObjectStore.getNotificationEventsCount():
{code}
public NotificationEventsCountResponse getNotificationEventsCount(NotificationEventsCountRequest rqst) {
  Long result = 0L;
  try {
    openTransaction();
    long fromEventId = rqst.getFromEventId();
    String inputDbName = rqst.getDbName();
    String queryStr = "select count(eventId) from " + MNotificationLog.class.getName()
        + " where eventId > fromEventId && dbName == inputDbName";
    query = pm.newQuery(queryStr);
    query.declareParameters("java.lang.Long fromEventId, java.lang.String inputDbName");
    result = (Long) query.execute(fromEventId, inputDbName); // <- Here
    commited = commitTransaction();
    return new NotificationEventsCountResponse(result.longValue());
  }
}
{code}
It is possible that query.execute will return null, in which case result.longValue() will throw an NPE. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
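One possible guard, as a minimal sketch. It assumes a null query result should be treated as a count of zero, which the report does not state. {{CountSketch.toCount()}} is a hypothetical helper, not an ObjectStore method.

```java
// Hypothetical helper, not part of ObjectStore.
final class CountSketch {
  private CountSketch() {}

  /** Converts a possibly-null query result into a primitive count. */
  static long toCount(Object queryResult) {
    // Assumption: no result means no matching events.
    return queryResult == null ? 0L : ((Long) queryResult).longValue();
  }
}
```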
[jira] [Created] (HIVE-17736) ObjectStore transaction handling can be simplified
Alexander Kolbasov created HIVE-17736: - Summary: ObjectStore transaction handling can be simplified Key: HIVE-17736 URL: https://issues.apache.org/jira/browse/HIVE-17736 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 3.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov There are many places in ObjectStore that do something like this:
{code}
boolean commited = false;
try {
  openTransaction();
  commited = commitTransaction();
} finally {
  if (!commited) {
    rollbackTransaction();
  }
}
{code}
We can simplify this in two ways:
1) Create a wrapper that calls a given piece of code inside the block of code above. This is similar to TransactionManager in Sentry.
2) Create a special auto-closeable object that does the check and rollback on close.
-- This message was sent by Atlassian JIRA (v6.4.14#64029)
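A minimal sketch of option 2, the auto-closeable object. {{Transactional}} and {{TxnScope}} are hypothetical names; the interface models only the three ObjectStore calls shown in the excerpt.

```java
// Hypothetical view of the store; only the three transaction calls.
interface Transactional {
  void openTransaction();
  boolean commitTransaction();
  void rollbackTransaction();
}

// Opens a transaction on construction; rolls back on close unless committed.
final class TxnScope implements AutoCloseable {
  private final Transactional store;
  private boolean committed;

  TxnScope(Transactional store) {
    this.store = store;
    store.openTransaction();
  }

  /** Call as the last statement of the try block. */
  void commit() {
    committed = store.commitTransaction();
  }

  @Override
  public void close() {
    if (!committed) {
      store.rollbackTransaction();
    }
  }
}
```

With this in place, a caller would write {{try (TxnScope txn = new TxnScope(store)) { ...; txn.commit(); }}} and the rollback check happens automatically on every exit path.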
[jira] [Created] (HIVE-17735) ObjectStore.addNotificationEvent is leaking queries
Alexander Kolbasov created HIVE-17735: - Summary: ObjectStore.addNotificationEvent is leaking queries Key: HIVE-17735 URL: https://issues.apache.org/jira/browse/HIVE-17735 Project: Hive Issue Type: Bug Components: Hive Affects Versions: 3.0.0 Reporter: Alexander Kolbasov Assignee: Alexander Kolbasov In ObjectStore.addNotificationEvent():
{code}
Query objectQuery = pm.newQuery(MNotificationNextId.class);
Collection ids = (Collection) objectQuery.execute();
{code}
The query is never closed. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-17730) Queries can be closed automatically
Alexander Kolbasov created HIVE-17730: - Summary: Queries can be closed automatically Key: HIVE-17730 URL: https://issues.apache.org/jira/browse/HIVE-17730 Project: Hive Issue Type: Bug Reporter: Alexander Kolbasov HIVE-16213 made QueryWrapper AutoCloseable, but queries are still closed manually rather than with try-with-resources. Query itself is now auto-closeable, so we don't need the wrapper at all. We should get rid of QueryWrapper and use try-with-resources to create queries. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
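A minimal sketch of the try-with-resources shape. {{SketchQuery}} is a hypothetical stand-in for an auto-closeable query so the example compiles without JDO on the classpath; it is not the real Query API.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an auto-closeable query.
final class SketchQuery implements AutoCloseable {
  static int openCount = 0; // tracks queries that were opened but not yet closed

  SketchQuery() {
    openCount++;
  }

  List<String> execute() {
    return Arrays.asList("a", "b");
  }

  @Override
  public void close() {
    openCount--;
  }
}

final class QueryUser {
  private QueryUser() {}

  static int countRows() {
    // The query is closed when the block exits, whether normally or by exception.
    try (SketchQuery query = new SketchQuery()) {
      return query.execute().size();
    }
  }
}
```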
[jira] [Created] (HIVE-17402) Provide more useful information in the HMS notification messages
Alexander Kolbasov created HIVE-17402: - Summary: Provide more useful information in the HMS notification messages Key: HIVE-17402 URL: https://issues.apache.org/jira/browse/HIVE-17402 Project: Hive Issue Type: Improvement Components: Hive Affects Versions: 2.2.0 Reporter: Alexander Kolbasov While working on the Apache Sentry project, which uses HMS notifications, we noticed that these notifications are missing some useful data - e.g. location information for the objects. To get around this, Apache Sentry implemented its own version of events (https://github.com/apache/sentry/tree/master/sentry-binding/sentry-binding-hive-follower/src/main/java/org/apache/sentry/binding/metastore/messaging/json). This information seems useful for Hive as well, so why not add it directly into the standard message factory? -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16994) Support connection pooling for HiveMetaStoreClient
Alexander Kolbasov created HIVE-16994: - Summary: Support connection pooling for HiveMetaStoreClient Key: HIVE-16994 URL: https://issues.apache.org/jira/browse/HIVE-16994 Project: Hive Issue Type: Improvement Components: Hive Reporter: Alexander Kolbasov The native {{HiveMetaStoreClient}} doesn't support connection pooling. I think it would be a very useful feature, especially in Kerberos environments where connection establishment may be especially expensive. A similar feature is now supported in Sentry - see SENTRY-1580. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (HIVE-16213) ObjectStore can leak Queries when rollbackTransaction
Alexander Kolbasov created HIVE-16213: - Summary: ObjectStore can leak Queries when rollbackTransaction Key: HIVE-16213 URL: https://issues.apache.org/jira/browse/HIVE-16213 Project: Hive Issue Type: Bug Components: Hive Reporter: Alexander Kolbasov In ObjectStore.java there are a few places with code similar to:
{code}
Query query = null;
try {
  openTransaction();
  query = pm.newQuery(Something.class);
  ...
  commited = commitTransaction();
} finally {
  if (!commited) {
    rollbackTransaction();
  }
  if (query != null) {
    query.closeAll();
  }
}
{code}
The problem is that rollbackTransaction() may throw an exception, in which case query.closeAll() wouldn't be executed. The fix would be to wrap rollbackTransaction in its own try-catch block. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
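A minimal sketch of the proposed fix for the {{finally}} block, with the rollback wrapped in its own try-catch. {{QueryCleanupSketch.finish()}} is a hypothetical helper; the rollback and the query close are passed in as {{Runnable}}s so the example does not depend on ObjectStore. It assumes logging the rollback failure is acceptable, which real code may not want.

```java
// Hypothetical helper, not part of ObjectStore.
final class QueryCleanupSketch {
  private QueryCleanupSketch() {}

  /** Rolls back if needed, then closes the query even if the rollback throws. */
  static void finish(boolean committed, Runnable rollback, Runnable closeQuery) {
    if (!committed) {
      try {
        rollback.run();
      } catch (RuntimeException e) {
        // Assumption: logging is enough here; real code may rethrow after cleanup.
        System.err.println("rollback failed: " + e.getMessage());
      }
    }
    closeQuery.run();
  }
}
```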
[jira] [Created] (HIVE-15373) Transaction management isn't thread-safe
Alexander Kolbasov created HIVE-15373: - Summary: Transaction management isn't thread-safe Key: HIVE-15373 URL: https://issues.apache.org/jira/browse/HIVE-15373 Project: Hive Issue Type: Bug Components: Hive Reporter: Alexander Kolbasov ObjectStore.java has several important calls which are not thread-safe:
* openTransaction()
* commitTransaction()
* rollbackTransaction()
These should be made thread-safe. -- This message was sent by Atlassian JIRA (v6.3.4#6332)