szehon-ho commented on code in PR #6570:
URL: https://github.com/apache/iceberg/pull/6570#discussion_r1159171448
##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -572,8 +595,39 @@ private static boolean hiveEngineEnabled(TableMetadata
metadata, Configuration c
ConfigProperties.ENGINE_HIVE_ENABLED,
TableProperties.ENGINE_HIVE_ENABLED_DEFAULT);
}
+ /**
+ * Returns if the hive locking should be enabled on the table, or not.
+ *
+ * <p>The decision is made like this:
+ *
+ * <ol>
+ * <li>Table property value {@link TableProperties#HIVE_LOCK_ENABLED}
+ * <li>If the table property is not set then check the hive-site.xml
property value {@link
+ * ConfigProperties#LOCK_HIVE_ENABLED}
+ * <li>If none of the above is enabled then use the default value {@link
+ * TableProperties#HIVE_LOCK_ENABLED_DEFAULT}
+ * </ol>
+ *
+ * @param metadata Table metadata to use
+ * @param conf The hive configuration to use
+ * @return if the hive engine related values should be enabled or not
+ */
+ private static boolean hiveLockEnabled(TableMetadata metadata, Configuration
conf) {
Review Comment:
Why do we want to override the table property using the conf? I thought either
all writers should use the Hive lock, or none of them should.
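
To make the concern concrete, here is a minimal sketch of the lookup order the Javadoc describes. It is not the PR's implementation. Plain maps stand in for `TableMetadata` properties and the Hadoop `Configuration`, and `CONF_KEY` is a placeholder, since the actual value of `ConfigProperties#LOCK_HIVE_ENABLED` is not shown in this review.

```java
import java.util.Map;

public class HiveLockPrecedenceSketch {
  // Key and default as they appear in the TableProperties hunk of this PR.
  static final String TABLE_PROPERTY_KEY = "hive.lock.enabled";
  static final boolean DEFAULT_VALUE = true;

  // Placeholder (assumption): the real ConfigProperties.LOCK_HIVE_ENABLED
  // value is not visible in the diff under review.
  static final String CONF_KEY = "placeholder.conf.hive-lock-enabled";

  /** Resolves the flag: table property first, then conf, then the default. */
  public static boolean hiveLockEnabled(
      Map<String, String> tableProps, Map<String, String> conf) {
    String fromTable = tableProps.get(TABLE_PROPERTY_KEY);
    if (fromTable != null) {
      return Boolean.parseBoolean(fromTable);
    }

    String fromConf = conf.get(CONF_KEY);
    if (fromConf != null) {
      return Boolean.parseBoolean(fromConf);
    }

    return DEFAULT_VALUE;
  }
}
```

Under this order, when the table property is unset, two writers whose configurations differ resolve to different values for the same table. Only an explicit table property gives every writer the same answer.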
##########
hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreUtil.java:
##########
@@ -48,14 +50,28 @@ public class MetastoreUtil {
private MetastoreUtil() {}
/**
- * Calls alter_table method using the metastore client. If possible, an
environmental context will
- * be used that turns off stats updates to avoid recursive listing.
+ * Calls alter_table method using the metastore client. If the HMS supports
then, environmental
+ * context with will be set in a way that turns off stats updates to avoid
recursive file listing.
Review Comment:
extra 'with'
##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -263,6 +266,15 @@ protected void doCommit(TableMetadata base, TableMetadata
metadata) {
throw e;
} catch (Throwable e) {
+ if (e.getMessage()
Review Comment:
It is a bit unfortunate that we didn't go with a proper subclass of
MetaException.
##########
hive-metastore/src/main/java/org/apache/iceberg/hive/HiveTableOperations.java:
##########
@@ -572,8 +595,39 @@ private static boolean hiveEngineEnabled(TableMetadata
metadata, Configuration c
ConfigProperties.ENGINE_HIVE_ENABLED,
TableProperties.ENGINE_HIVE_ENABLED_DEFAULT);
}
+ /**
+ * Returns if the hive locking should be enabled on the table, or not.
+ *
+ * <p>The decision is made like this:
+ *
+ * <ol>
+ * <li>Table property value {@link TableProperties#HIVE_LOCK_ENABLED}
+ * <li>If the table property is not set then check the hive-site.xml
property value {@link
+ * ConfigProperties#LOCK_HIVE_ENABLED}
+ * <li>If none of the above is enabled then use the default value {@link
+ * TableProperties#HIVE_LOCK_ENABLED_DEFAULT}
+ * </ol>
+ *
+ * @param metadata Table metadata to use
+ * @param conf The hive configuration to use
+ * @return if the hive engine related values should be enabled or not
+ */
+ private static boolean hiveLockEnabled(TableMetadata metadata, Configuration
conf) {
Review Comment:
Also, is there any way to validate Hive compatibility, to prevent an old Hive
version from disabling Hive locks?
##########
hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreUtil.java:
##########
@@ -48,14 +50,28 @@ public class MetastoreUtil {
private MetastoreUtil() {}
/**
- * Calls alter_table method using the metastore client. If possible, an
environmental context will
- * be used that turns off stats updates to avoid recursive listing.
+ * Calls alter_table method using the metastore client. If the HMS supports
then, environmental
Review Comment:
How about "If the HMS supports then" -> "If the HMS supports it"
##########
hive-metastore/src/main/java/org/apache/iceberg/hive/MetastoreUtil.java:
##########
@@ -48,14 +50,28 @@ public class MetastoreUtil {
private MetastoreUtil() {}
/**
- * Calls alter_table method using the metastore client. If possible, an
environmental context will
- * be used that turns off stats updates to avoid recursive listing.
+ * Calls alter_table method using the metastore client. If the HMS supports
then, environmental
+ * context with will be set in a way that turns off stats updates to avoid
recursive file listing.
*/
public static void alterTable(
IMetaStoreClient client, String databaseName, String tblName, Table
table) {
- EnvironmentContext envContext =
- new EnvironmentContext(
- ImmutableMap.of(StatsSetupConst.DO_NOT_UPDATE_STATS,
StatsSetupConst.TRUE));
- ALTER_TABLE.invoke(client, databaseName, tblName, table, envContext);
+ alterTable(client, databaseName, tblName, table, ImmutableMap.of());
+ }
+
+ /**
+ * Calls alter_table method using the metastore client. If the HMS supports
then, environmental
+ * context with will be set in a way that turns off stats updates to avoid
recursive file listing.
+ */
+ public static void alterTable(
+ IMetaStoreClient client,
+ String databaseName,
+ String tblName,
+ Table table,
+ Map<String, String> extraEnv) {
+ Map<String, String> env = Maps.newHashMapWithExpectedSize(extraEnv.size()
+ 1);
+ env.putAll(extraEnv);
+ env.put(StatsSetupConst.DO_NOT_UPDATE_STATS, StatsSetupConst.TRUE);
+
Review Comment:
Maybe one way to validate the Hive version is to check here whether the
alter_table method has enough arguments for the env? If not, I feel the whole
scheme is not going to work.
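
A rough sketch of what such a check could look like, using plain reflection. The two nested interfaces are hypothetical stand-ins for an older and a newer metastore client, not the real `IMetaStoreClient`, and `Map` stands in for the environment context type. The actual code dispatches through the `ALTER_TABLE` handle, which is not shown here.

```java
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Map;

public class AlterTableEnvCheckSketch {
  /** Hypothetical older client: alter_table has no environment argument. */
  public interface LegacyClient {
    void alter_table(String db, String tbl, Object table);
  }

  /** Hypothetical newer client: an overload accepts an environment map. */
  public interface EnvAwareClient {
    void alter_table(String db, String tbl, Object table);

    void alter_table(String db, String tbl, Object table, Map<String, String> env);
  }

  /**
   * Returns true if the class exposes a public method with the given name
   * that takes a parameter of the given environment type.
   */
  public static boolean supportsEnv(
      Class<?> clientClass, String methodName, Class<?> envType) {
    for (Method method : clientClass.getMethods()) {
      if (method.getName().equals(methodName)
          && Arrays.asList(method.getParameterTypes()).contains(envType)) {
        return true;
      }
    }
    return false;
  }
}
```

If a check like `supportsEnv(clientClass, "alter_table", Map.class)` returned false, the caller could refuse to disable Hive locks instead of silently dropping the extra environment entries.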
##########
core/src/main/java/org/apache/iceberg/TableProperties.java:
##########
@@ -303,6 +303,9 @@ private TableProperties() {}
public static final String ENGINE_HIVE_ENABLED = "engine.hive.enabled";
public static final boolean ENGINE_HIVE_ENABLED_DEFAULT = false;
+ public static final String HIVE_LOCK_ENABLED = "hive.lock.enabled";
+ public static final boolean HIVE_LOCK_ENABLED_DEFAULT = true;
Review Comment:
Interesting, I always read 'engine' as referring to hive-mr, i.e. running
hive-on-iceberg, whereas this applies to all engines (spark/flink) that use the
hive catalog.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]