[ https://issues.apache.org/jira/browse/HIVE-18696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Peter Vary updated HIVE-18696:
------------------------------
Resolution: Fixed
Fix Version/s: 3.1.0
Status: Resolved (was: Patch Available)
Pushed to master.
Thanks for the patch [~kuczoram]!
> The partition folders might not get cleaned up properly in the
> HiveMetaStore.add_partitions_core method if an exception occurs
> ------------------------------------------------------------------------------------------------------------------------------
>
> Key: HIVE-18696
> URL: https://issues.apache.org/jira/browse/HIVE-18696
> Project: Hive
> Issue Type: Bug
> Components: Metastore
> Reporter: Marta Kuczora
> Assignee: Marta Kuczora
> Priority: Major
> Fix For: 3.1.0
>
> Attachments: HIVE-18696.1.patch, HIVE-18696.2.patch,
> HIVE-18696.3.patch, HIVE-18696.4.patch, HIVE-18696.5.patch, HIVE-18696.6.patch
>
>
> When adding multiple partitions, if one of them cannot be created
> successfully, none of the partitions are added, but the folders already
> created for them might not be cleaned up properly. See the test case
> "testAddPartitionsOneInvalid" in the TestAddPartitions test.
> This is the problematic code in the HiveMetaStore.add_partitions_core method:
> {code:java}
> for (final Partition part : parts) {
>   if (!part.getTableName().equals(tblName) || !part.getDbName().equals(dbName)) {
>     throw new MetaException("Partition does not belong to target table "
>         + dbName + "." + tblName + ": " + part);
>   }
>   boolean shouldAdd = startAddPartition(ms, part, ifNotExists);
>   if (!shouldAdd) {
>     existingParts.add(part);
>     LOG.info("Not adding partition " + part + " as it already exists");
>     continue;
>   }
>   final UserGroupInformation ugi;
>   try {
>     ugi = UserGroupInformation.getCurrentUser();
>   } catch (IOException e) {
>     throw new RuntimeException(e);
>   }
>   partFutures.add(threadPool.submit(new Callable<Partition>() {
>     @Override
>     public Partition call() throws Exception {
>       ugi.doAs(new PrivilegedExceptionAction<Object>() {
>         @Override
>         public Object run() throws Exception {
>           try {
>             boolean madeDir = createLocationForAddedPartition(table, part);
>             if (addedPartitions.put(new PartValEqWrapper(part), madeDir) != null) {
>               // Technically, for ifNotExists case, we could insert one and discard the other
>               // because the first one now "exists", but it seems better to report the problem
>               // upstream as such a command doesn't make sense.
>               throw new MetaException("Duplicate partitions in the list: " + part);
>             }
>             initializeAddedPartition(table, part, madeDir);
>           } catch (MetaException e) {
>             throw new IOException(e.getMessage(), e);
>           }
>           return null;
>         }
>       });
>       return part;
>     }
>   }));
> }
> {code}
> While iterating over the partitions, suppose the tasks that create the
> folders for the first two partitions are submitted successfully, but an
> exception occurs for the third partition before its task is submitted. (This
> can happen if the partition has a different table or db name than the others,
> or if it has an invalid value.)
> In this case execution jumps to the finally block, which cleans up the
> folders recorded in the "addedPartitions" map. However, the threads for the
> first two partitions may not have finished creating their folders yet, so the
> map can be empty or contain only one of the partitions, and any folder
> created afterwards is left behind.
> The same issue occurs in the HiveMetaStore.add_partitions_pspec_core
> method, which contains the same code as the add_partitions_core method.
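
The race described above can be reproduced in isolation. The following is a minimal, self-contained sketch, not the code from the HIVE-18696 patches: the class name PartitionCleanupSketch, the method foldersSeenByCleanup, and the use of plain strings in place of Partition objects are all assumptions made for illustration. A latch holds the folder-creation tasks back so that the timing is deterministic. One plausible remedy, shown in the second branch, is to wait for every submitted task to finish before the cleanup step reads the map.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the add_partitions_core loop; not Hive code.
class PartitionCleanupSketch {

    /**
     * Submits folder-creation tasks for two partitions, then fails on a third
     * partition before its task is submitted. Returns how many folders the
     * cleanup step finds in the addedPartitions map.
     */
    static int foldersSeenByCleanup(boolean waitForTasks) {
        ExecutorService threadPool = Executors.newFixedThreadPool(2);
        Map<String, Boolean> addedPartitions = new ConcurrentHashMap<>();
        List<Future<String>> partFutures = new ArrayList<>();
        // Holds the tasks back to simulate slow folder creation.
        CountDownLatch release = new CountDownLatch(1);
        try {
            for (String part : Arrays.asList("p=1", "p=2", "invalid")) {
                if (part.equals("invalid")) {
                    // Validation fails before the task for this partition is submitted.
                    throw new IllegalArgumentException("Invalid partition: " + part);
                }
                partFutures.add(threadPool.submit(() -> {
                    release.await();
                    addedPartitions.put(part, true); // record the "created" folder
                    return part;
                }));
            }
            return -1; // not reached in this sketch: the third partition always fails
        } catch (IllegalArgumentException e) {
            if (!waitForTasks) {
                // Problematic order: cleanup reads the map while tasks are still running.
                int seen = addedPartitions.size();
                release.countDown();
                return seen;
            }
            // Safer order: let every submitted task finish, then read the map.
            release.countDown();
            for (Future<String> future : partFutures) {
                try {
                    future.get();
                } catch (ExecutionException ignored) {
                    // A failed task created nothing that this sketch needs to track.
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException(ie);
                }
            }
            return addedPartitions.size();
        } finally {
            threadPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("without waiting: " + foldersSeenByCleanup(false));
        System.out.println("with waiting:    " + foldersSeenByCleanup(true));
    }
}
```

Without waiting, the cleanup step sees an empty map even though two tasks are in flight, so their folders would be left behind. With waiting, both entries are visible and can be removed.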
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)