[
https://issues.apache.org/jira/browse/HIVE-24484?focusedWorklogId=769572&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-769572
]
ASF GitHub Bot logged work on HIVE-24484:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 12/May/22 12:07
Start Date: 12/May/22 12:07
Worklog Time Spent: 10m
Work Description: kgyrtkirk commented on code in PR #3279:
URL: https://github.com/apache/hive/pull/3279#discussion_r871272145
##########
common/pom.xml:
##########
@@ -195,6 +194,11 @@
<artifactId>tez-api</artifactId>
<version>${tez.version}</version>
</dependency>
+ <dependency>
+ <groupId>org.fusesource.jansi</groupId>
+ <artifactId>jansi</artifactId>
+ <version>2.3.4</version>
Review Comment:
move version to root pom
##########
itests/pom.xml:
##########
@@ -352,6 +352,12 @@
<groupId>org.apache.hadoop</groupId>
<artifactId>hadoop-yarn-client</artifactId>
<version>${hadoop.version}</version>
+ <exclusions>
+ <exclusion>
+ <groupId>org.jline</groupId>
+ <artifactId>jline</artifactId>
+ </exclusion>
Review Comment:
I'm not sure if this fix will work; it could work for the tests; but you've
just excluded the dependency; I think that will not prevent that dep from
appearing on the classpath during runtime...
have you tested a dist build as well?
##########
ql/src/java/org/apache/hadoop/hive/ql/io/RecordReaderWrapper.java:
##########
@@ -69,7 +70,14 @@ static RecordReader create(InputFormat inputFormat,
HiveInputFormat.HiveInputSpl
JobConf jobConf, Reporter reporter) throws IOException {
int headerCount = Utilities.getHeaderCount(tableDesc);
int footerCount = Utilities.getFooterCount(tableDesc, jobConf);
- RecordReader innerReader =
inputFormat.getRecordReader(split.getInputSplit(), jobConf, reporter);
+
+ RecordReader innerReader = null;
+ try {
+ innerReader = inputFormat.getRecordReader(split.getInputSplit(), jobConf,
reporter);
+ } catch (InterruptedIOException iioe) {
+ // If reading from the underlying record reader is interrupted, return a
no-op record reader
Review Comment:
why not simply propagate the `Exception` ?
This will hide away the exception
##########
ql/src/java/org/apache/hadoop/hive/ql/security/authorization/StorageBasedAuthorizationProvider.java:
##########
@@ -178,8 +178,7 @@ public void authorize(Database db, Privilege[]
readRequiredPriv, Privilege[] wri
private static boolean userHasProxyPrivilege(String user, Configuration
conf) {
try {
- if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf,
- HMSHandler.getIPAddress())) {
+ if (MetaStoreServerUtils.checkUserHasHostProxyPrivileges(user, conf,
HMSHandler.getIPAddress())) {
Review Comment:
I think max_linelength should be <=100 ; are you using the
`dev-support/eclipse-styles.xml` ?
##########
streaming/src/test/org/apache/hive/streaming/TestStreaming.java:
##########
@@ -1317,6 +1318,11 @@ public void testTransactionBatchEmptyCommit() throws
Exception {
connection.close();
}
+ /**
+ * Starting with HDFS 3.3.1, the underlying system NOW SUPPORTS hflush so
this
+ * test fails.
Review Comment:
ok; then I think this test could be probably converted into a test which
checks that it works
##########
hcatalog/core/src/test/java/org/apache/hive/hcatalog/mapreduce/TestHCatMultiOutputFormat.java:
##########
@@ -315,18 +320,19 @@ public void testOutputFormat() throws Throwable {
// Check permisssion on partition dirs and files created
for (int i = 0; i < tableNames.length; i++) {
- Path partitionFile = new Path(warehousedir + "/" + tableNames[i]
- + "/ds=1/cluster=ag/part-m-00000");
- FileSystem fs = partitionFile.getFileSystem(mrConf);
- Assert.assertEquals("File permissions of table " + tableNames[i] + " is
not correct",
- fs.getFileStatus(partitionFile).getPermission(),
- new FsPermission(tablePerms[i]));
- Assert.assertEquals("File permissions of table " + tableNames[i] + " is
not correct",
- fs.getFileStatus(partitionFile.getParent()).getPermission(),
- new FsPermission(tablePerms[i]));
- Assert.assertEquals("File permissions of table " + tableNames[i] + " is
not correct",
-
fs.getFileStatus(partitionFile.getParent().getParent()).getPermission(),
- new FsPermission(tablePerms[i]));
+ final Path partitionFile = new Path(warehousedir + "/" + tableNames[i] +
"/ds=1/cluster=ag/part-m-00000");
+ final Path grandParentOfPartitionFile = partitionFile.getParent();
Review Comment:
I would expect `grandParent` to be parent-of-parent;
I think this change could be revoked - it was more readable earlier; the
last assert now checks for the `parent` dir and not for `parent.parent`; the
second assert was also clobbered....
##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java:
##########
@@ -123,57 +122,24 @@ public void
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
}}, "test_key123");
- List<String> dumpWithClause = Arrays.asList(
- "'hive.repl.add.raw.reserved.namespace'='true'",
- "'" + HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname +
"'='"
- + replica.externalTableWarehouseRoot + "'",
- "'distcp.options.skipcrccheck'=''",
- "'" + HiveConf.ConfVars.HIVE_SERVER2_ENABLE_DOAS.varname +
"'='false'",
- "'" + HiveConf.ConfVars.HIVE_DISTCP_DOAS_USER.varname + "'='"
- + UserGroupInformation.getCurrentUser().getUserName()
+"'");
- WarehouseInstance.Tuple tuple =
- primary.run("use " + primaryDbName)
- .run("create table encrypted_table (id int, value string)")
- .run("insert into table encrypted_table values
(1,'value1')")
- .run("insert into table encrypted_table values
(2,'value2')")
- .dump(primaryDbName, dumpWithClause);
-
- replica
- .run("repl load " + primaryDbName + " into " + replicatedDbName
- + " with('hive.repl.add.raw.reserved.namespace'='true', "
- + "'hive.repl.replica.external.table.base.dir'='" +
replica.externalTableWarehouseRoot + "', "
- + "'hive.exec.copyfile.maxsize'='0',
'distcp.options.skipcrccheck'='')")
- .run("use " + replicatedDbName)
- .run("repl status " + replicatedDbName)
- .verifyResult(tuple.lastReplicationId);
-
- try {
- replica
- .run("select value from encrypted_table")
- .verifyResults(new String[] { "value1", "value2" });
- Assert.fail("Src EZKey shouldn't be present on target");
- } catch (IOException e) {
- Assert.assertTrue(e.getCause().getMessage().contains("KeyVersion name
'test_key@0' does not exist"));
- }
-
//read should pass without raw-byte distcp
- dumpWithClause = Arrays.asList( "'" +
HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
+ List<String> dumpWithClause = Arrays.asList( "'" +
HiveConf.ConfVars.REPL_EXTERNAL_TABLE_BASE_DIR.varname + "'='"
+ replica.externalTableWarehouseRoot + "'");
- tuple = primary.run("use " + primaryDbName)
+ WarehouseInstance.Tuple tuple =
+ primary.run("use " + primaryDbName)
.run("create external table encrypted_table2 (id int, value
string)")
.run("insert into table encrypted_table2 values (1,'value1')")
.run("insert into table encrypted_table2 values (2,'value2')")
.dump(primaryDbName, dumpWithClause);
replica
- .run("repl load " + primaryDbName + " into " + replicatedDbName
- + " with('hive.repl.replica.external.table.base.dir'='" +
replica.externalTableWarehouseRoot + "', "
- + "'hive.exec.copyfile.maxsize'='0',
'distcp.options.skipcrccheck'='')")
- .run("use " + replicatedDbName)
- .run("repl status " + replicatedDbName)
- .verifyResult(tuple.lastReplicationId)
Review Comment:
wasn't this the expected behaviour?
##########
storage-api/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java:
##########
@@ -18,10 +18,10 @@
package org.apache.hadoop.hive.common;
-import org.apache.commons.lang.StringUtils;
+import org.apache.commons.lang3.StringUtils;
Review Comment:
these lang/lang3 changes seem unrelated to me; I think they could be done in
a separate jira to reduce the amount of work.
if you are moving away from the usage of `org.apache.commons.lang` ; could
you please also ban it in thr root pom.xml?
##########
standalone-metastore/pom.xml:
##########
@@ -227,6 +227,10 @@
<artifactId>hadoop-mapreduce-client-core</artifactId>
<version>${hadoop.version}</version>
<exclusions>
+ <exclusion>
+ <groupId>org.jline</groupId>
+ <artifactId>jline</artifactId>
+ </exclusion>
Review Comment:
this jline dep just creeps in from multiple hadoop artifacts; the best would
be to upgrade jline and not risk our chances with exclusions
##########
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HMSHandler.java:
##########
@@ -9361,7 +9362,8 @@ public NotificationEventsCountResponse
get_notification_events_count(Notificatio
private void authorizeProxyPrivilege() throws TException {
// Skip the auth in embedded mode or if the auth is disabled
if (!HiveMetaStore.isMetaStoreRemote() ||
- !MetastoreConf.getBoolVar(conf,
ConfVars.EVENT_DB_NOTIFICATION_API_AUTH)) {
+ !MetastoreConf.getBoolVar(conf,
ConfVars.EVENT_DB_NOTIFICATION_API_AUTH) ||
conf.getBoolean(HIVE_IN_TEST.getVarname(),
+ false)) {
Review Comment:
you are turning a feature off based on this `HIVE_IN_TEST` boolean; which
means the feature will not be tested during regular hive test; please find
another way; and since it seems like this is being turned off multiple places -
can you cover it with a test?
##########
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/parse/TestReplicationOnHDFSEncryptedZones.java:
##########
@@ -123,57 +122,24 @@ public void
targetAndSourceHaveDifferentEncryptionZoneKeys() throws Throwable {
put(HiveConf.ConfVars.REPLDIR.varname, primary.repldDir);
}}, "test_key123");
- List<String> dumpWithClause = Arrays.asList(
Review Comment:
seems like a testcase was removed; I wonder if this it not supported anymore
? ...and why are we removing this case in the scope of a hadoop upgrade?
##########
ql/src/java/org/apache/hadoop/hive/ql/exec/FetchOperator.java:
##########
@@ -18,20 +18,8 @@
package org.apache.hadoop.hive.ql.exec;
-import java.io.FileNotFoundException;
Review Comment:
import order is different in your IDE than in existing code; can you
configure it to not reorder the imports in every file?
I wonder if we have some agreement what order we want to use....
Issue Time Tracking
-------------------
Worklog Id: (was: 769572)
Time Spent: 8h 43m (was: 8.55h)
> Upgrade Hadoop to 3.3.1
> -----------------------
>
> Key: HIVE-24484
> URL: https://issues.apache.org/jira/browse/HIVE-24484
> Project: Hive
> Issue Type: Improvement
> Reporter: David Mollitor
> Assignee: David Mollitor
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h 43m
> Remaining Estimate: 0h
>
--
This message was sent by Atlassian Jira
(v8.20.7#820007)