[jira] [Work logged] (HIVE-26105) Show columns shows extra values if column comments contains specific Chinese character

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26105?focusedWorklogId=751402&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751402
 ]

ASF GitHub Bot logged work on HIVE-26105:
-

Author: ASF GitHub Bot
Created on: 01/Apr/22 05:40
Start Date: 01/Apr/22 05:40
Worklog Time Spent: 10m 
  Work Description: maheshk114 opened a new pull request #3166:
URL: https://github.com/apache/hive/pull/3166


   …ins specific Chinese character
   
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 751402)
Remaining Estimate: 0h
Time Spent: 10m

> Show columns shows extra values if column comments contains specific Chinese 
> character 
> ---
>
> Key: HIVE-26105
> URL: https://issues.apache.org/jira/browse/HIVE-26105
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The issue occurs because the character code of one of the Chinese 
> characters contains the byte value of '\r' (CR). Because of this, the 
> Hadoop line reader (used by the fetch task in Hive) treats the bytes after 
> that character as a new value, and an extra value containing junk is 
> displayed. The problem character is 名 (0x540D): its low byte is 0x0D, 
> i.e. 13, which the Hadoop line reader interprets as CR ('\r') while 
> reading the result, so an extra junk value appears in the output. SHOW 
> COLUMNS does not need the comments, so only the column names should be 
> written to the file.
> [https://github.com/apache/hadoop/blob/0fbd96a2449ec49f840d93e1c7d290c5218ef4ea/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java#L238]
>  
> {code:java}
> create table tbl_test  (fld0 string COMMENT  '期 ' , fld string COMMENT 
> '期末日期', fld1 string COMMENT '班次名称', fld2  string COMMENT '排班人数');
> show columns from tbl_test;
> ++
> | field  |
> ++
> | fld    |
> | fld0   |
> | fld1   |
> | �      |
> | fld2   |
> ++
> 5 rows selected (171.809 seconds)
>  {code}
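The byte-level collision described above can be reproduced in isolation. The following is a standalone sketch (not Hive code): it shows that the low byte of the code unit of 名, 0x540D, is 13, the carriage return character, so any serialization that emits that byte verbatim looks like a line break to a CR-sensitive line reader such as Hadoop's LineReader.

```java
import java.nio.charset.StandardCharsets;

public class CrByteDemo {
    public static void main(String[] args) {
        char ming = '\u540D'; // 名 -- UTF-16 code unit 0x540D
        // The low byte of the code unit is 0x0D, i.e. 13, the CR character.
        int lowByte = ming & 0xFF;
        System.out.println(lowByte);          // 13
        System.out.println(lowByte == '\r');  // true
        // Serialized in UTF-16LE, the first byte on the wire is that CR byte:
        byte[] utf16le = String.valueOf(ming).getBytes(StandardCharsets.UTF_16LE);
        System.out.println(utf16le[0]);       // 13
    }
}
```

For comparison, UTF-8 encodes 名 as E5 90 8D with no CR byte, so whether the collision occurs depends on how the result bytes are produced; the description above only states that the value 13 surfaces when the fetch task's output is read back.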



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26105) Show columns shows extra values if column comments contains specific Chinese character

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26105:
--
Labels: pull-request-available  (was: )

> Show columns shows extra values if column comments contains specific Chinese 
> character 
> ---
>
> Key: HIVE-26105
> URL: https://issues.apache.org/jira/browse/HIVE-26105
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The issue occurs because the character code of one of the Chinese 
> characters contains the byte value of '\r' (CR). Because of this, the 
> Hadoop line reader (used by the fetch task in Hive) treats the bytes after 
> that character as a new value, and an extra value containing junk is 
> displayed. The problem character is 名 (0x540D): its low byte is 0x0D, 
> i.e. 13, which the Hadoop line reader interprets as CR ('\r') while 
> reading the result, so an extra junk value appears in the output. SHOW 
> COLUMNS does not need the comments, so only the column names should be 
> written to the file.
> [https://github.com/apache/hadoop/blob/0fbd96a2449ec49f840d93e1c7d290c5218ef4ea/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java#L238]
>  
> {code:java}
> create table tbl_test  (fld0 string COMMENT  '期 ' , fld string COMMENT 
> '期末日期', fld1 string COMMENT '班次名称', fld2  string COMMENT '排班人数');
> show columns from tbl_test;
> ++
> | field  |
> ++
> | fld    |
> | fld0   |
> | fld1   |
> | �      |
> | fld2   |
> ++
> 5 rows selected (171.809 seconds)
>  {code}





[jira] [Work logged] (HIVE-26095) Add queryid in QueryLifeTimeHookContext

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26095?focusedWorklogId=751397&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751397
 ]

ASF GitHub Bot logged work on HIVE-26095:
-

Author: ASF GitHub Bot
Created on: 01/Apr/22 05:13
Start Date: 01/Apr/22 05:13
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #3156:
URL: https://github.com/apache/hive/pull/3156#discussion_r840236630



##
File path: 
service/src/test/org/apache/hive/service/cli/operation/TestQueryLifeTimeHooksWithSQLOperation.java
##
@@ -110,6 +118,45 @@ public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
       assertNull(ctx.getHookContext().getException());
       assertNotNull(ctx.getHookContext().getQueryInfo());
       assertNotNull(ctx.getHookContext().getQueryInfo().getQueryDisplay());
+      assertQueryId(ctx.getQueryId());
     }
   }
+
+  /**
+   * Asserts that the specified query id exists and has the expected prefix and size.
+   *
+   * A query id looks like:
+   *   username_20220330093338_dab90f30-5e79-463d-8359-0d2fff57effa
+   * and we can accurately predict how the prefix should look.
+   *
+   * @param actualQueryId the query id to verify
+   */
+  private static void assertQueryId(String actualQueryId) {
+    assertNotNull(actualQueryId);
+    String expectedIdPrefix = makeQueryIdStablePrefix();
+    String actualIdPrefix = actualQueryId.substring(0, expectedIdPrefix.length());
+    assertEquals(expectedIdPrefix, actualIdPrefix);
+    assertEquals(expectedIdPrefix.length() + 41, actualQueryId.length());
+  }
+
+  /**
+   * Makes a query id prefix that is stable for an hour. The prefix changes every hour but this is enough to guarantee

Review comment:
   IIUC, `TestQueryLifeTimeHooksWithSQLOperation` tests whether data is 
passed to `QueryLifeTimeHookWithParseHooks`. I think the test is more isolated 
and focuses only on the data-broadcasting functionality if the data is 
generated by the test. This can be achieved by setting the queryId explicitly 
to some constant at the beginning of the test and checking in the assertion 
part for the exact same constant. I believe it can be set via the 
`hive.query.id` config setting.
   
   Alternatively, you can read the actual queryId via the `hive.query.id` 
config setting at the beginning of the test and use that value in the assert 
part instead of generating a new one.
   
   Or, as a last option, if none of the above works, a regex can be used to 
check the format.
   
   If you also want to test that the queryId format is correct, a UT for 
`QueryPlan.makeQueryId` would do that, but I think a regex is still needed for 
that, or at least for checking the date-time part.
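The regex option suggested in this comment can be sketched as follows. The pattern is illustrative only, assuming the `username_yyyyMMddHHmmss_UUID` shape shown in the quoted javadoc; it is not an existing Hive helper, and it assumes the username itself contains no underscore ambiguity beyond what greedy matching absorbs.

```java
import java.util.regex.Pattern;

public class QueryIdRegexDemo {
    // Shape: <user>_<14-digit timestamp>_<lowercase UUID>
    static final Pattern QUERY_ID = Pattern.compile(
        ".+_\\d{14}_[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}");

    static boolean isValidQueryId(String id) {
        return id != null && QUERY_ID.matcher(id).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidQueryId(
            "username_20220330093338_dab90f30-5e79-463d-8359-0d2fff57effa")); // true
        System.out.println(isValidQueryId("username_2022_not-a-uuid"));       // false
    }
}
```

Unlike the prefix comparison, this check does not depend on the test's clock, so it cannot fail across an hour (or day) boundary.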






Issue Time Tracking
---

Worklog Id: (was: 751397)
Time Spent: 1h 20m  (was: 1h 10m)

> Add queryid in QueryLifeTimeHookContext
> ---
>
> Key: HIVE-26095
> URL: https://issues.apache.org/jira/browse/HIVE-26095
> Project: Hive
>  Issue Type: New Feature
>  Components: Hooks
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> A 
> [QueryLifeTimeHook|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHook.java]
>  is executed at various points in the life-cycle of a query, but it is not 
> always possible to obtain the id of the query. The query id is inside the 
> {{HookContext}}, but the latter is not always available, notably during 
> compilation.
> The query id is useful for many purposes, as it is the only way to uniquely 
> identify the query/command that is currently running. It is also the only 
> way to match up events appearing in before and after methods.
> The goal of this jira is to add the query id to 
> [QueryLifeTimeHookContext|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHookContext.java]
>  and make it available during all life-cycle events.





[jira] [Work logged] (HIVE-26098) Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path causing IllegalArgumentException

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26098?focusedWorklogId=751388&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751388
 ]

ASF GitHub Bot logged work on HIVE-26098:
-

Author: ASF GitHub Bot
Created on: 01/Apr/22 04:31
Start Date: 01/Apr/22 04:31
Worklog Time Spent: 10m 
  Work Description: maheshk114 commented on a change in pull request #3160:
URL: https://github.com/apache/hive/pull/3160#discussion_r840221727



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java
##
@@ -1424,10 +1433,9 @@ public LocalResource localizeResource(
         return createLocalResource(destFS, dest, type, LocalResourceVisibility.PRIVATE);
       }
       try {
-        if (src.toUri().getScheme()!=null) {
-          FileUtil.copy(src.getFileSystem(conf), src, destFS, dest, false, false, conf);
-        }
-        else {
+        if (!isSrcLocal) {
+          FileUtil.copy(srcFs, src, destFS, dest, false, false, conf);
+        } else {
           destFS.copyFromLocalFile(false, false, src, dest);

Review comment:
   done






Issue Time Tracking
---

Worklog Id: (was: 751388)
Time Spent: 50m  (was: 40m)

> Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path 
> causing IllegalArgumentException
> --
>
> Key: HIVE-26098
> URL: https://issues.apache.org/jira/browse/HIVE-26098
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
>  hive.aux.jars.path and hive.reloadable.aux.jars.path are used to provide 
> auxiliary jars that are needed during query processing. These jars are 
> copied to the Tez temp path so that Tez jobs have access to them while 
> processing the job. There is a duplicate check to avoid copying the same 
> jar multiple times. This check assumes the jar is on the local file system, 
> but in reality the jar path can be anywhere, so the duplicate check fails 
> when the source path is not local.
> {code:java}
> ERROR : Failed to execute tez graph.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://localhost:53877/tmp/test_jar/identity_udf.jar, expected: file:///
>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781) 
> ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.checkPreExisting(DagUtils.java:1392)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1411)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:1295)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:1177)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:636)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:283)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:124)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:241)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezTask.java:448)
>  
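The failure above comes down to deciding which file system a path belongs to before handing it to a `FileSystem` instance. The following is an illustrative standalone helper, not the actual DagUtils logic (Hive resolves this through Hadoop's `Path`/`FileSystem` APIs): a path with no scheme, or the `file` scheme, is local; anything else, such as the `hdfs://` jar in the stack trace, must be resolved against its own file system.

```java
import java.net.URI;

public class SchemeCheckDemo {
    // A path without a scheme (or with the "file" scheme) is local; anything
    // else (hdfs://, s3a://, ...) belongs to a different file system.
    static boolean isLocal(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null || "file".equals(scheme);
    }

    public static void main(String[] args) {
        System.out.println(isLocal("hdfs://localhost:53877/tmp/test_jar/identity_udf.jar")); // false
        System.out.println(isLocal("/tmp/identity_udf.jar"));                                // true
        System.out.println(isLocal("file:///tmp/identity_udf.jar"));                         // true
    }
}
```

Passing the first path to the local `RawLocalFileSystem` is exactly what produces the "Wrong FS: hdfs://..., expected: file:///" IllegalArgumentException shown above.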

[jira] [Assigned] (HIVE-26105) Show columns shows extra values if column comments contains specific Chinese character

2022-03-31 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-26105:
--


> Show columns shows extra values if column comments contains specific Chinese 
> character 
> ---
>
> Key: HIVE-26105
> URL: https://issues.apache.org/jira/browse/HIVE-26105
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
> The issue occurs because the character code of one of the Chinese 
> characters contains the byte value of '\r' (CR). Because of this, the 
> Hadoop line reader (used by the fetch task in Hive) treats the bytes after 
> that character as a new value, and an extra value containing junk is 
> displayed. The problem character is 名 (0x540D): its low byte is 0x0D, 
> i.e. 13, which the Hadoop line reader interprets as CR ('\r') while 
> reading the result, so an extra junk value appears in the output. SHOW 
> COLUMNS does not need the comments, so only the column names should be 
> written to the file.
> [https://github.com/apache/hadoop/blob/0fbd96a2449ec49f840d93e1c7d290c5218ef4ea/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/LineReader.java#L238]
>  
> {code:java}
> create table tbl_test  (fld0 string COMMENT  '期 ' , fld string COMMENT 
> '期末日期', fld1 string COMMENT '班次名称', fld2  string COMMENT '排班人数');
> show columns from tbl_test;
> ++
> | field  |
> ++
> | fld    |
> | fld0   |
> | fld1   |
> | �      |
> | fld2   |
> ++
> 5 rows selected (171.809 seconds)
>  {code}





[jira] [Work logged] (HIVE-21456) Hive Metastore Thrift over HTTP

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-21456?focusedWorklogId=751311&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751311
 ]

ASF GitHub Bot logged work on HIVE-21456:
-

Author: ASF GitHub Bot
Created on: 01/Apr/22 00:32
Start Date: 01/Apr/22 00:32
Worklog Time Spent: 10m 
  Work Description: sourabh912 commented on a change in pull request #3105:
URL: https://github.com/apache/hive/pull/3105#discussion_r840095321



##
File path: standalone-metastore/pom.xml
##
@@ -103,6 +103,7 @@
 4.0.3
 2.8.4
 1.7.30
+4.4.10

Review comment:
   Sure. I will upgrade it to 4.4.13 which is what HS2 has as of today.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java
##
@@ -343,21 +366,162 @@ public static void startMetaStore(int port, HadoopThriftAuthBridge bridge,
     startMetaStore(port, bridge, conf, false, null);
   }
 
-  /**
-   * Start Metastore based on a passed {@link HadoopThriftAuthBridge}.
-   *
-   * @param port The port on which the Thrift server will start to serve
-   * @param bridge
-   * @param conf Configuration overrides
-   * @param startMetaStoreThreads Start the background threads (initiator, cleaner, statsupdater, etc.)
-   * @param startedBackgroundThreads If startMetaStoreThreads is true, this AtomicBoolean will be switched to true,
-   *  when all of the background threads are scheduled. Useful for testing purposes to wait
-   *  until the MetaStore is fully initialized.
-   * @throws Throwable
-   */
-  public static void startMetaStore(int port, HadoopThriftAuthBridge bridge,
-      Configuration conf, boolean startMetaStoreThreads, AtomicBoolean startedBackgroundThreads) throws Throwable {
-    isMetaStoreRemote = true;
+  public static boolean isThriftServerRunning() {
+    return thriftServer != null && thriftServer.isRunning();
+  }
+
+  // TODO: Is it worth trying to use a server that supports HTTP/2?
+  //  Does the Thrift http client support this?
+
+  public static ThriftServer startHttpMetastore(int port, Configuration conf)
+      throws Exception {
+    LOG.info("Attempting to start http metastore server on port: {}", port);

Review comment:
   I didn't get the context of the question but we don't disable TRACE for 
the server.

##
File path: 
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java
##
@@ -1496,6 +1503,23 @@ public static ConfVars getMetaConf(String name) {
 USERS_IN_ADMIN_ROLE("metastore.users.in.admin.role", 
"hive.users.in.admin.role", "", false,
 "Comma separated list of users who are in admin role for 
bootstrapping.\n" +
 "More users can be added in ADMIN role later."),
+// TODO: Should we have a separate config for the metastoreclient or 
THRIFT_TRANSPORT_MODE
+// would suffice ?
+
METASTORE_CLIENT_THRIFT_TRANSPORT_MODE("metastore.client.thrift.transport.mode",
+"hive.metastore.client.thrift.transport.mode", "binary",
+"Transport mode to be used by the metastore client. It should be the 
same as " + THRIFT_TRANSPORT_MODE),
+METASTORE_CLIENT_THRIFT_HTTP_PATH("metastore.client.thrift.http.path",

Review comment:
   The reason I kept it here is that there are other client configs, like 
METASTORE_CLIENT_AUTH_MODE and METASTORE_CLIENT_PLAIN_USERNAME, that are 
defined in this conf. If you think we should move all client-side confs to 
HiveConf, we can do it in a follow-up patch. Thoughts? 

##
File path: itests/hive-unit/src/test/java/org/apache/hive/jdbc/TestSSL.java
##
@@ -437,15 +439,36 @@ public void testConnectionWrongCertCN() throws Exception {
    * Test HMS server with SSL
    * @throws Exception
    */
+  @Ignore
   @Test
   public void testMetastoreWithSSL() throws Exception {
     testSSLHMS(true);
   }
 
+  /**
+   * Test HMS server with Http + SSL
+   * @throws Exception
+   */
+  @Test
+  public void testMetastoreWithHttps() throws Exception {
+    // MetastoreConf.setBoolVar(conf, MetastoreConf.ConfVars.EVENT_DB_NOTIFICATION_API_AUTH, false);
+    // MetastoreConf.setVar(conf, MetastoreConf.ConfVars.METASTORE_CLIENT_TRANSPORT_MODE, "http");

Review comment:
   Done.

##
File path: 
standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HmsThriftHttpServlet.java
##
@@ -0,0 +1,116 @@
+/* * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at

[jira] [Work logged] (HIVE-26095) Add queryid in QueryLifeTimeHookContext

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26095?focusedWorklogId=751302&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751302
 ]

ASF GitHub Bot logged work on HIVE-26095:
-

Author: ASF GitHub Bot
Created on: 01/Apr/22 00:01
Start Date: 01/Apr/22 00:01
Worklog Time Spent: 10m 
  Work Description: amansinha100 commented on a change in pull request 
#3156:
URL: https://github.com/apache/hive/pull/3156#discussion_r840116428



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHookContextImpl.java
##
@@ -78,11 +88,12 @@ public Builder withHookContext(HookContext hc) {
       return this;
     }
 
-    public QueryLifeTimeHookContextImpl build() {
+    public QueryLifeTimeHookContextImpl build(String queryId) {
       QueryLifeTimeHookContextImpl queryLifeTimeHookContext = new QueryLifeTimeHookContextImpl();
       queryLifeTimeHookContext.setHiveConf(this.conf);
       queryLifeTimeHookContext.setCommand(this.command);
       queryLifeTimeHookContext.setHookContext(this.hc);
+      queryLifeTimeHookContext.queryId = Objects.requireNonNull(queryId);

Review comment:
   Yes, sanity check was the only ask.  Changes LGTM.  






Issue Time Tracking
---

Worklog Id: (was: 751302)
Time Spent: 1h 10m  (was: 1h)

> Add queryid in QueryLifeTimeHookContext
> ---
>
> Key: HIVE-26095
> URL: https://issues.apache.org/jira/browse/HIVE-26095
> Project: Hive
>  Issue Type: New Feature
>  Components: Hooks
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> A 
> [QueryLifeTimeHook|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHook.java]
>  is executed at various points in the life-cycle of a query, but it is not 
> always possible to obtain the id of the query. The query id is inside the 
> {{HookContext}}, but the latter is not always available, notably during 
> compilation.
> The query id is useful for many purposes, as it is the only way to uniquely 
> identify the query/command that is currently running. It is also the only 
> way to match up events appearing in before and after methods.
> The goal of this jira is to add the query id to 
> [QueryLifeTimeHookContext|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHookContext.java]
>  and make it available during all life-cycle events.





[jira] [Work logged] (HIVE-26095) Add queryid in QueryLifeTimeHookContext

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26095?focusedWorklogId=751201&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751201
 ]

ASF GitHub Bot logged work on HIVE-26095:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 20:08
Start Date: 31/Mar/22 20:08
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #3156:
URL: https://github.com/apache/hive/pull/3156#discussion_r839977437



##
File path: 
service/src/test/org/apache/hive/service/cli/operation/TestQueryLifeTimeHooksWithSQLOperation.java
##
@@ -110,6 +118,45 @@ public void afterExecution(QueryLifeTimeHookContext ctx, boolean hasError) {
       assertNull(ctx.getHookContext().getException());
       assertNotNull(ctx.getHookContext().getQueryInfo());
       assertNotNull(ctx.getHookContext().getQueryInfo().getQueryDisplay());
+      assertQueryId(ctx.getQueryId());
     }
   }
+
+  /**
+   * Asserts that the specified query id exists and has the expected prefix and size.
+   *
+   * A query id looks like:
+   *   username_20220330093338_dab90f30-5e79-463d-8359-0d2fff57effa
+   * and we can accurately predict how the prefix should look.
+   *
+   * @param actualQueryId the query id to verify
+   */
+  private static void assertQueryId(String actualQueryId) {
+    assertNotNull(actualQueryId);
+    String expectedIdPrefix = makeQueryIdStablePrefix();
+    String actualIdPrefix = actualQueryId.substring(0, expectedIdPrefix.length());
+    assertEquals(expectedIdPrefix, actualIdPrefix);
+    assertEquals(expectedIdPrefix.length() + 41, actualQueryId.length());
+  }
+
+  /**
+   * Makes a query id prefix that is stable for an hour. The prefix changes every hour but this is enough to guarantee

Review comment:
   To avoid the risk, we can remove the prefix check and keep only the length 
check. Alternatively, we can simply `assertNotNull` the queryId and nothing 
more. What do you prefer?






Issue Time Tracking
---

Worklog Id: (was: 751201)
Time Spent: 1h  (was: 50m)

> Add queryid in QueryLifeTimeHookContext
> ---
>
> Key: HIVE-26095
> URL: https://issues.apache.org/jira/browse/HIVE-26095
> Project: Hive
>  Issue Type: New Feature
>  Components: Hooks
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> A 
> [QueryLifeTimeHook|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHook.java]
>  is executed at various points in the life-cycle of a query, but it is not 
> always possible to obtain the id of the query. The query id is inside the 
> {{HookContext}}, but the latter is not always available, notably during 
> compilation.
> The query id is useful for many purposes, as it is the only way to uniquely 
> identify the query/command that is currently running. It is also the only 
> way to match up events appearing in before and after methods.
> The goal of this jira is to add the query id to 
> [QueryLifeTimeHookContext|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHookContext.java]
>  and make it available during all life-cycle events.





[jira] [Work logged] (HIVE-26100) Preparing for 4.0.0-alpha-2 development

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26100?focusedWorklogId=751176&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751176
 ]

ASF GitHub Bot logged work on HIVE-26100:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 18:53
Start Date: 31/Mar/22 18:53
Worklog Time Spent: 10m 
  Work Description: pvary commented on pull request #3162:
URL: https://github.com/apache/hive/pull/3162#issuecomment-1084984386


   @zabetak: Fixed them. Thanks for the review!




Issue Time Tracking
---

Worklog Id: (was: 751176)
Time Spent: 50m  (was: 40m)

> Preparing for 4.0.0-alpha-2 development
> ---
>
> Key: HIVE-26100
> URL: https://issues.apache.org/jira/browse/HIVE-26100
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (HIVE-26099) Move patched-iceberg packages to org.apache.hive group

2022-03-31 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary resolved HIVE-26099.
---
Fix Version/s: 4.0.0
   Resolution: Fixed

Pushed to master.
Thanks for the review [~Marton Bod]!

> Move patched-iceberg packages to org.apache.hive group
> --
>
> Key: HIVE-26099
> URL: https://issues.apache.org/jira/browse/HIVE-26099
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When releasing stuff we should release everything under {{org.apache.hive}}





[jira] [Work logged] (HIVE-26099) Move patched-iceberg packages to org.apache.hive group

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26099?focusedWorklogId=751174&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751174
 ]

ASF GitHub Bot logged work on HIVE-26099:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 18:52
Start Date: 31/Mar/22 18:52
Worklog Time Spent: 10m 
  Work Description: pvary merged pull request #3161:
URL: https://github.com/apache/hive/pull/3161


   




Issue Time Tracking
---

Worklog Id: (was: 751174)
Time Spent: 20m  (was: 10m)

> Move patched-iceberg packages to org.apache.hive group
> --
>
> Key: HIVE-26099
> URL: https://issues.apache.org/jira/browse/HIVE-26099
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> When releasing stuff we should release everything under {{org.apache.hive}}





[jira] [Work logged] (HIVE-26095) Add queryid in QueryLifeTimeHookContext

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26095?focusedWorklogId=751152&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751152
 ]

ASF GitHub Bot logged work on HIVE-26095:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 18:10
Start Date: 31/Mar/22 18:10
Worklog Time Spent: 10m 
  Work Description: kasakrisz commented on a change in pull request #3156:
URL: https://github.com/apache/hive/pull/3156#discussion_r839891772



##
File path: 
service/src/test/org/apache/hive/service/cli/operation/TestQueryLifeTimeHooksWithSQLOperation.java
##
@@ -110,6 +118,45 @@ public void afterExecution(QueryLifeTimeHookContext ctx, 
boolean hasError) {
   assertNull(ctx.getHookContext().getException());
   assertNotNull(ctx.getHookContext().getQueryInfo());
   assertNotNull(ctx.getHookContext().getQueryInfo().getQueryDisplay());
+  assertQueryId(ctx.getQueryId());
 }
   }
+
+  /**
+   * Asserts that the specified query id exists and has the expected prefix 
and size.
+   *
+   * 
+   * A query id looks like below:
+   * 
+   *   username_20220330093338_dab90f30-5e79-463d-8359-0d2fff57effa
+   * 
+   * and we can accurately predict how the prefix should look.
+   * 
+   *
+   * @param actualQueryId the query id to verify
+   */
+  private static void assertQueryId(String actualQueryId) {
+assertNotNull(actualQueryId);
+String expectedIdPrefix = makeQueryIdStablePrefix();
+String actualIdPrefix = actualQueryId.substring(0, 
expectedIdPrefix.length());
+assertEquals(expectedIdPrefix, actualIdPrefix);
+assertEquals(expectedIdPrefix.length() + 41, actualQueryId.length());
+  }
+
+  /**
+   * Makes a query id prefix that is stable for an hour. The prefix changes 
every hour but this is enough to guarantee

Review comment:
   What happens if the test starts at, let's say, 2022.03.31 01:59:59.999 and 
this method is called at 2022.03.31 02:00:00.000?
   
   Should the query id format be checked here? How about adding a UT for 
the query id generator itself?
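One hedged way around the hour-rollover race raised in the review: compute the acceptable prefix both before and after the query id is produced and accept either. A minimal self-contained sketch, assuming the `username_yyyyMMddHHmmss_<uuid>` shape from the javadoc (names like `stablePrefix` are illustrative, not the test's actual helpers):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.UUID;

public class QueryIdCheck {
    // Prefix that is stable within one hour: user + "_" + yyyyMMddHH.
    static final DateTimeFormatter HOUR = DateTimeFormatter.ofPattern("yyyyMMddHH");
    static final DateTimeFormatter SECOND = DateTimeFormatter.ofPattern("yyyyMMddHHmmss");

    static String stablePrefix(String user, LocalDateTime t) {
        return user + "_" + t.format(HOUR);
    }

    public static void main(String[] args) {
        // Capture the prefix both before and after the id is generated, so
        // the assertion cannot fail when the clock crosses an hour boundary.
        LocalDateTime before = LocalDateTime.now();
        String queryId = "hiveuser_" + LocalDateTime.now().format(SECOND)
                + "_" + UUID.randomUUID();
        LocalDateTime after = LocalDateTime.now();

        boolean ok = queryId.startsWith(stablePrefix("hiveuser", before))
                || queryId.startsWith(stablePrefix("hiveuser", after));
        System.out.println(ok);
    }
}
```

The check accepts either hour's prefix, so a rollover between the two captures can no longer produce a spurious failure.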






Issue Time Tracking
---

Worklog Id: (was: 751152)
Time Spent: 50m  (was: 40m)

> Add queryid in QueryLifeTimeHookContext
> ---
>
> Key: HIVE-26095
> URL: https://issues.apache.org/jira/browse/HIVE-26095
> Project: Hive
>  Issue Type: New Feature
>  Components: Hooks
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> A 
> [QueryLifeTimeHook|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHook.java]
>  is executed various times in the life-cycle of a query but it is not always 
> possible to obtain the id of the query. The query id is inside the 
> {{HookContext}} but the latter is not always available notably during 
> compilation.
> The query id is useful for many purposes as it is the only way to uniquely 
> identify the query/command that is currently running. It is also the only way 
> to match together events appearing in before and after methods.
> The goal of this jira is to add the query id in 
> [QueryLifeTimeHookContext|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHookContext.java]
>  and make it available during all life-cycle events.
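The event-matching use case described above can be sketched with a minimal stand-in for the hook interface (an illustration, not the actual `QueryLifeTimeHook` API surface): before and after events for the same query are correlated purely by query id.

```java
import java.util.HashMap;
import java.util.Map;

public class HookSketch {
    // Stand-in for the hook interface; the real one carries a context object.
    interface QueryLifeTimeHook {
        void beforeCompile(String queryId);
        void afterCompile(String queryId);
    }

    public static void main(String[] args) {
        Map<String, Long> started = new HashMap<>();
        QueryLifeTimeHook hook = new QueryLifeTimeHook() {
            public void beforeCompile(String queryId) {
                started.put(queryId, System.nanoTime());
            }
            public void afterCompile(String queryId) {
                // The query id is the only key tying the two events together.
                System.out.println(started.containsKey(queryId));
            }
        };
        hook.beforeCompile("user_20220330093338_dab90f30");
        hook.afterCompile("user_20220330093338_dab90f30");
    }
}
```

Without the id in the context, a hook that sees many interleaved queries has no reliable way to pair its before/after callbacks.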





[jira] [Work logged] (HIVE-26098) Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path causing IllegalArgumentException

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26098?focusedWorklogId=751087&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751087
 ]

ASF GitHub Bot logged work on HIVE-26098:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 16:59
Start Date: 31/Mar/22 16:59
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on a change in pull request #3160:
URL: https://github.com/apache/hive/pull/3160#discussion_r839834104



##
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/tez/DagUtils.java
##
@@ -1424,10 +1433,9 @@ public LocalResource localizeResource(
 return createLocalResource(destFS, dest, type, 
LocalResourceVisibility.PRIVATE);
   }
   try {
-if (src.toUri().getScheme()!=null) {
-  FileUtil.copy(src.getFileSystem(conf), src, destFS, dest, false, 
false, conf);
-}
-else {
+if (!isSrcLocal) {
+  FileUtil.copy(srcFs, src, destFS, dest, false, false, conf);
+} else {
   destFS.copyFromLocalFile(false, false, src, dest);

Review comment:
   Guess we can ditch the if-else and the ``isSrcLocal`` part.
   ``destFS.copyFromLocalFile(false, false, src, dest);`` resolves to
   ``FileUtil.copy(getLocal(conf), src, this, dst, delSrc, overwrite, conf);``
   which is technically the same as the ``FileUtil.copy`` above.
   The FileSystem has in any case already been resolved to the local one above:
   ```
   } else {
 srcFs = FileSystem.getLocal(conf);
   }
   ```
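The reviewer's observation boils down to resolving the source filesystem from the path's scheme once (defaulting to the local filesystem when there is none), after which a single copy call serves both cases. A plain-Java sketch of that scheme resolution, with the hypothetical helper `resolveFs` standing in for the Hadoop API:

```java
import java.net.URI;

public class FsResolveSketch {
    // Hypothetical helper: pick the filesystem scheme for a path,
    // defaulting to the local filesystem when the path has no scheme.
    static String resolveFs(String path) {
        String scheme = URI.create(path).getScheme();
        return scheme == null ? "file" : scheme;
    }

    public static void main(String[] args) {
        // A remote jar keeps its own scheme; a bare path maps to "file",
        // so one copy routine can then serve both cases.
        System.out.println(resolveFs("hdfs://localhost:53877/tmp/test_jar/identity_udf.jar"));
        System.out.println(resolveFs("/tmp/identity_udf.jar"));
    }
}
```

This is also the crux of the bug being fixed: assuming the local filesystem for a path that actually carries an `hdfs://` scheme.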






Issue Time Tracking
---

Worklog Id: (was: 751087)
Time Spent: 40m  (was: 0.5h)

> Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path 
> causing IllegalArgumentException
> --
>
> Key: HIVE-26098
> URL: https://issues.apache.org/jira/browse/HIVE-26098
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
>  hive.aux.jars.path and hive.reloadable.aux.jars.path are used to provide 
> auxiliary jars needed during query processing. These jars are copied to the 
> Tez temp path so that Tez jobs can access them while processing the job. 
> There is a duplicate check to avoid copying the same jar multiple times, but 
> it assumes the jar is on the local file system. In reality the jar path can 
> point anywhere, so the duplicate check fails when the source path is not 
> local.
> {code:java}
> ERROR : Failed to execute tez graph.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://localhost:53877/tmp/test_jar/identity_udf.jar, expected: file:///
>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781) 
> ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.checkPreExisting(DagUtils.java:1392)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1411)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:1295)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:1177)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:636)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:283)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> 

[jira] [Work logged] (HIVE-26100) Preparing for 4.0.0-alpha-2 development

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26100?focusedWorklogId=751004&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751004
 ]

ASF GitHub Bot logged work on HIVE-26100:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 14:13
Start Date: 31/Mar/22 14:13
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #3162:
URL: https://github.com/apache/hive/pull/3162#discussion_r839658026



##
File path: 
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.mysql.sql
##
@@ -0,0 +1,240 @@
+SELECT 'Upgrading MetaStore schema from 4.0.0-alpha-1 to 4.0.0-alpha-2' AS 
MESSAGE;

Review comment:
   Thanks for catching this.
   Fixed

##
File path: 
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.postgres.sql
##
@@ -0,0 +1,374 @@
+SELECT 'Upgrading MetaStore schema from 4.0.0-alpha-1 to 4.0.0-alpha-2';

Review comment:
   Thanks for catching this.
   Fixed

##
File path: 
metastore/scripts/upgrade/hive/upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.hive.sql
##
@@ -0,0 +1,887 @@
+SELECT 'Upgrading MetaStore schema from 3.1.0 to 4.0.0-alpha-1';

Review comment:
   Thanks for catching this.
   Fixed






Issue Time Tracking
---

Worklog Id: (was: 751004)
Time Spent: 0.5h  (was: 20m)

> Preparing for 4.0.0-alpha-2 development
> ---
>
> Key: HIVE-26100
> URL: https://issues.apache.org/jira/browse/HIVE-26100
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26100) Preparing for 4.0.0-alpha-2 development

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26100?focusedWorklogId=751005&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751005
 ]

ASF GitHub Bot logged work on HIVE-26100:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 14:13
Start Date: 31/Mar/22 14:13
Worklog Time Spent: 10m 
  Work Description: pvary commented on a change in pull request #3162:
URL: https://github.com/apache/hive/pull/3162#discussion_r839658497



##
File path: 
metastore/scripts/upgrade/hive/upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.hive.sql
##
@@ -0,0 +1,887 @@
+SELECT 'Upgrading MetaStore schema from 3.1.0 to 4.0.0-alpha-1';
+
+USE SYS;
+
+-- HIVE-20793
+DROP TABLE IF EXISTS `WM_RESOURCEPLANS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_RESOURCEPLANS` (
+  `NAME` string,
+  `NS` string,
+  `STATUS` string,
+  `QUERY_PARALLELISM` int,
+  `DEFAULT_POOL_PATH` string
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\",
+  case when \"WM_RESOURCEPLAN\".\"NS\" is null then 'default' else 
\"WM_RESOURCEPLAN\".\"NS\" end AS NS,
+  \"STATUS\",
+  \"WM_RESOURCEPLAN\".\"QUERY_PARALLELISM\",
+  \"WM_POOL\".\"PATH\"
+FROM
+  \"WM_RESOURCEPLAN\" LEFT OUTER JOIN \"WM_POOL\" ON 
\"WM_RESOURCEPLAN\".\"DEFAULT_POOL_ID\" = \"WM_POOL\".\"POOL_ID\""
+);
+
+DROP TABLE IF EXISTS `WM_TRIGGERS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_TRIGGERS` (
+  `RP_NAME` string,
+  `NS` string,
+  `NAME` string,
+  `TRIGGER_EXPRESSION` string,
+  `ACTION_EXPRESSION` string
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  r.\"NAME\" AS RP_NAME,
+  case when r.\"NS\" is null then 'default' else r.\"NS\" end,
+  t.\"NAME\" AS NAME,
+  \"TRIGGER_EXPRESSION\",
+  \"ACTION_EXPRESSION\"
+FROM
+  \"WM_TRIGGER\" t
+JOIN
+  \"WM_RESOURCEPLAN\" r
+ON
+  t.\"RP_ID\" = r.\"RP_ID\""
+);
+
+DROP TABLE IF EXISTS `WM_POOLS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_POOLS` (
+  `RP_NAME` string,
+  `NS` string,
+  `PATH` string,
+  `ALLOC_FRACTION` double,
+  `QUERY_PARALLELISM` int,
+  `SCHEDULING_POLICY` string
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\",
+  case when \"WM_RESOURCEPLAN\".\"NS\" is null then 'default' else 
\"WM_RESOURCEPLAN\".\"NS\" end AS NS,
+  \"WM_POOL\".\"PATH\",
+  \"WM_POOL\".\"ALLOC_FRACTION\",
+  \"WM_POOL\".\"QUERY_PARALLELISM\",
+  \"WM_POOL\".\"SCHEDULING_POLICY\"
+FROM
+  \"WM_POOL\"
+JOIN
+  \"WM_RESOURCEPLAN\"
+ON
+  \"WM_POOL\".\"RP_ID\" = \"WM_RESOURCEPLAN\".\"RP_ID\""
+);
+
+DROP TABLE IF EXISTS `WM_POOLS_TO_TRIGGERS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_POOLS_TO_TRIGGERS` (
+  `RP_NAME` string,
+  `NS` string,
+  `POOL_PATH` string,
+  `TRIGGER_NAME` string
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\" AS RP_NAME,
+  case when \"WM_RESOURCEPLAN\".\"NS\" is null then 'default' else 
\"WM_RESOURCEPLAN\".\"NS\" end AS NS,
+  \"WM_POOL\".\"PATH\" AS POOL_PATH,
+  \"WM_TRIGGER\".\"NAME\" AS TRIGGER_NAME
+FROM \"WM_POOL_TO_TRIGGER\"
+  JOIN \"WM_POOL\" ON \"WM_POOL_TO_TRIGGER\".\"POOL_ID\" = 
\"WM_POOL\".\"POOL_ID\"
+  JOIN \"WM_TRIGGER\" ON \"WM_POOL_TO_TRIGGER\".\"TRIGGER_ID\" = 
\"WM_TRIGGER\".\"TRIGGER_ID\"
+  JOIN \"WM_RESOURCEPLAN\" ON \"WM_POOL\".\"RP_ID\" = 
\"WM_RESOURCEPLAN\".\"RP_ID\"
+UNION
+SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\" AS RP_NAME,
+  case when \"WM_RESOURCEPLAN\".\"NS\" is null then 'default' else 
\"WM_RESOURCEPLAN\".\"NS\" end AS NS,
+  '' AS POOL_PATH,
+  \"WM_TRIGGER\".\"NAME\" AS TRIGGER_NAME
+FROM \"WM_TRIGGER\"
+  JOIN \"WM_RESOURCEPLAN\" ON \"WM_TRIGGER\".\"RP_ID\" = 
\"WM_RESOURCEPLAN\".\"RP_ID\"
+WHERE CAST(\"WM_TRIGGER\".\"IS_IN_UNMANAGED\" AS CHAR) IN ('1', 't')
+"
+);
+
+DROP TABLE IF EXISTS `WM_MAPPINGS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_MAPPINGS` (
+  `RP_NAME` string,
+  `NS` string,
+  `ENTITY_TYPE` string,
+  `ENTITY_NAME` string,
+  `POOL_PATH` string,
+  `ORDERING` int
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\",
+  case when \"WM_RESOURCEPLAN\".\"NS\" is null then 'default' else 
\"WM_RESOURCEPLAN\".\"NS\" end AS NS,
+  \"ENTITY_TYPE\",
+  \"ENTITY_NAME\",
+  case when \"WM_POOL\".\"PATH\" is null then '' else 
\"WM_POOL\".\"PATH\" end,
+  \"ORDERING\"
+FROM \"WM_MAPPING\"
+JOIN \"WM_RESOURCEPLAN\" ON \"WM_MAPPING\".\"RP_ID\" = 
\"WM_RESOURCEPLAN\".\"RP_ID\"
+LEFT OUTER JOIN \"WM_POOL\" ON \"WM_POOL\".\"POOL_ID\" = 

[jira] [Updated] (HIVE-26103) Port Iceberg fixes to the iceberg module

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26103:
--
Labels: pull-request-available  (was: )

> Port Iceberg fixes to the iceberg module
> 
>
> Key: HIVE-26103
> URL: https://issues.apache.org/jira/browse/HIVE-26103
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should synchronise the Iceberg hive-metastore and mr modules with the Hive 
> codebase





[jira] [Work logged] (HIVE-26103) Port Iceberg fixes to the iceberg module

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26103?focusedWorklogId=751002&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-751002
 ]

ASF GitHub Bot logged work on HIVE-26103:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 14:04
Start Date: 31/Mar/22 14:04
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #3164:
URL: https://github.com/apache/hive/pull/3164


   ### What changes were proposed in this pull request?
   Cross-porting Iceberg fixes and changes
   
   ### Why are the changes needed?
   We need to be compatible with the Iceberg code
   
   ### Does this PR introduce _any_ user-facing change?
   Some Iceberg related changes
   
   ### How was this patch tested?
   Unit tests




Issue Time Tracking
---

Worklog Id: (was: 751002)
Remaining Estimate: 0h
Time Spent: 10m

> Port Iceberg fixes to the iceberg module
> 
>
> Key: HIVE-26103
> URL: https://issues.apache.org/jira/browse/HIVE-26103
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should synchronise the Iceberg hive-metastore and mr modules with the Hive 
> codebase





[jira] [Work logged] (HIVE-26100) Preparing for 4.0.0-alpha-2 development

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26100?focusedWorklogId=750989&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750989
 ]

ASF GitHub Bot logged work on HIVE-26100:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 13:27
Start Date: 31/Mar/22 13:27
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #3162:
URL: https://github.com/apache/hive/pull/3162#discussion_r839594772



##
File path: 
standalone-metastore/metastore-server/src/main/sql/postgres/upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.postgres.sql
##
@@ -0,0 +1,374 @@
+SELECT 'Upgrading MetaStore schema from 4.0.0-alpha-1 to 4.0.0-alpha-2';

Review comment:
   Shouldn't this Postgres upgrade script be mostly empty (except version 
update) at the moment?

##
File path: 
standalone-metastore/metastore-server/src/main/sql/mysql/upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.mysql.sql
##
@@ -0,0 +1,240 @@
+SELECT 'Upgrading MetaStore schema from 4.0.0-alpha-1 to 4.0.0-alpha-2' AS 
MESSAGE;

Review comment:
   Shouldn't this MySQL upgrade script be mostly empty (except version 
update) at the moment?

##
File path: 
metastore/scripts/upgrade/hive/upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.hive.sql
##
@@ -0,0 +1,887 @@
+SELECT 'Upgrading MetaStore schema from 3.1.0 to 4.0.0-alpha-1';

Review comment:
   Shouldn't this Hive upgrade script be mostly empty (except version 
update) at the moment?
   
   Also the query should be the following I think:
   `SELECT 'Upgrading MetaStore schema from 4.0.0-alpha-1 to 4.0.0-alpha-2';`
   

##
File path: 
metastore/scripts/upgrade/hive/upgrade-4.0.0-alpha-1-to-4.0.0-alpha-2.hive.sql
##
@@ -0,0 +1,887 @@
+SELECT 'Upgrading MetaStore schema from 3.1.0 to 4.0.0-alpha-1';
+
+USE SYS;
+
+-- HIVE-20793
+DROP TABLE IF EXISTS `WM_RESOURCEPLANS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_RESOURCEPLANS` (
+  `NAME` string,
+  `NS` string,
+  `STATUS` string,
+  `QUERY_PARALLELISM` int,
+  `DEFAULT_POOL_PATH` string
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\",
+  case when \"WM_RESOURCEPLAN\".\"NS\" is null then 'default' else 
\"WM_RESOURCEPLAN\".\"NS\" end AS NS,
+  \"STATUS\",
+  \"WM_RESOURCEPLAN\".\"QUERY_PARALLELISM\",
+  \"WM_POOL\".\"PATH\"
+FROM
+  \"WM_RESOURCEPLAN\" LEFT OUTER JOIN \"WM_POOL\" ON 
\"WM_RESOURCEPLAN\".\"DEFAULT_POOL_ID\" = \"WM_POOL\".\"POOL_ID\""
+);
+
+DROP TABLE IF EXISTS `WM_TRIGGERS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_TRIGGERS` (
+  `RP_NAME` string,
+  `NS` string,
+  `NAME` string,
+  `TRIGGER_EXPRESSION` string,
+  `ACTION_EXPRESSION` string
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  r.\"NAME\" AS RP_NAME,
+  case when r.\"NS\" is null then 'default' else r.\"NS\" end,
+  t.\"NAME\" AS NAME,
+  \"TRIGGER_EXPRESSION\",
+  \"ACTION_EXPRESSION\"
+FROM
+  \"WM_TRIGGER\" t
+JOIN
+  \"WM_RESOURCEPLAN\" r
+ON
+  t.\"RP_ID\" = r.\"RP_ID\""
+);
+
+DROP TABLE IF EXISTS `WM_POOLS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_POOLS` (
+  `RP_NAME` string,
+  `NS` string,
+  `PATH` string,
+  `ALLOC_FRACTION` double,
+  `QUERY_PARALLELISM` int,
+  `SCHEDULING_POLICY` string
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\",
+  case when \"WM_RESOURCEPLAN\".\"NS\" is null then 'default' else 
\"WM_RESOURCEPLAN\".\"NS\" end AS NS,
+  \"WM_POOL\".\"PATH\",
+  \"WM_POOL\".\"ALLOC_FRACTION\",
+  \"WM_POOL\".\"QUERY_PARALLELISM\",
+  \"WM_POOL\".\"SCHEDULING_POLICY\"
+FROM
+  \"WM_POOL\"
+JOIN
+  \"WM_RESOURCEPLAN\"
+ON
+  \"WM_POOL\".\"RP_ID\" = \"WM_RESOURCEPLAN\".\"RP_ID\""
+);
+
+DROP TABLE IF EXISTS `WM_POOLS_TO_TRIGGERS`;
+CREATE EXTERNAL TABLE IF NOT EXISTS `WM_POOLS_TO_TRIGGERS` (
+  `RP_NAME` string,
+  `NS` string,
+  `POOL_PATH` string,
+  `TRIGGER_NAME` string
+)
+STORED BY 'org.apache.hive.storage.jdbc.JdbcStorageHandler'
+TBLPROPERTIES (
+"hive.sql.database.type" = "METASTORE",
+"hive.sql.query" =
+"SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\" AS RP_NAME,
+  case when \"WM_RESOURCEPLAN\".\"NS\" is null then 'default' else 
\"WM_RESOURCEPLAN\".\"NS\" end AS NS,
+  \"WM_POOL\".\"PATH\" AS POOL_PATH,
+  \"WM_TRIGGER\".\"NAME\" AS TRIGGER_NAME
+FROM \"WM_POOL_TO_TRIGGER\"
+  JOIN \"WM_POOL\" ON \"WM_POOL_TO_TRIGGER\".\"POOL_ID\" = 
\"WM_POOL\".\"POOL_ID\"
+  JOIN \"WM_TRIGGER\" ON \"WM_POOL_TO_TRIGGER\".\"TRIGGER_ID\" = 
\"WM_TRIGGER\".\"TRIGGER_ID\"
+  JOIN \"WM_RESOURCEPLAN\" ON \"WM_POOL\".\"RP_ID\" = 
\"WM_RESOURCEPLAN\".\"RP_ID\"
+UNION
+SELECT
+  \"WM_RESOURCEPLAN\".\"NAME\" AS RP_NAME,
+  case when 

[jira] [Assigned] (HIVE-26103) Port Iceberg fixes to the iceberg module

2022-03-31 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-26103:
-


> Port Iceberg fixes to the iceberg module
> 
>
> Key: HIVE-26103
> URL: https://issues.apache.org/jira/browse/HIVE-26103
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> We should synchronise the Iceberg hive-metastore and mr modules with the Hive 
> codebase





[jira] [Assigned] (HIVE-26102) Implement DELETE statements for Iceberg tables

2022-03-31 Thread Marton Bod (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marton Bod reassigned HIVE-26102:
-


> Implement DELETE statements for Iceberg tables
> --
>
> Key: HIVE-26102
> URL: https://issues.apache.org/jira/browse/HIVE-26102
> Project: Hive
>  Issue Type: New Feature
>Reporter: Marton Bod
>Assignee: Marton Bod
>Priority: Major
>






[jira] [Work logged] (HIVE-26026) Use the new "REFUSED" compaction state where it makes sense

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26026?focusedWorklogId=750948&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750948
 ]

ASF GitHub Bot logged work on HIVE-26026:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 11:57
Start Date: 31/Mar/22 11:57
Worklog Time Spent: 10m 
  Work Description: deniskuzZ merged pull request #3126:
URL: https://github.com/apache/hive/pull/3126


   




Issue Time Tracking
---

Worklog Id: (was: 750948)
Time Spent: 6.5h  (was: 6h 20m)

> Use the new "REFUSED" compaction state where it makes sense
> ---
>
> Key: HIVE-26026
> URL: https://issues.apache.org/jira/browse/HIVE-26026
> Project: Hive
>  Issue Type: Improvement
>  Components: Hive
>Reporter: László Végh
>Assignee: László Végh
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> The 
> org.apache.hadoop.hive.ql.txn.compactor.Worker#findNextCompactionAndExecute 
> method does several checks (the table/partition exists, is not sorted, there 
> are enough files to compact, etc.) before it actually executes the compaction 
> request. If the compaction request fails any of these checks, it is put into 
> the "SUCCEEDED" state, which is often misleading for users. SHOW COMPACTIONS 
> will show these requests as succeeded without an error, while the table is not 
> compacted at all.
> For these cases, the state should be "REFUSED" instead of "SUCCEEDED", along 
> with the appropriate error message.
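The proposed behaviour can be sketched with a toy state machine (hypothetical names; the real pre-checks live in `Worker#findNextCompactionAndExecute`): a failed pre-check records REFUSED plus a reason instead of SUCCEEDED.

```java
public class CompactionStateSketch {
    enum State { INITIATED, SUCCEEDED, REFUSED }

    static final class Request {
        State state = State.INITIATED;
        String reason = "";
    }

    // Illustrative pre-check: refuse when there are not enough files
    // to make compaction worthwhile, recording why.
    static void runPreChecks(Request req, int fileCount) {
        if (fileCount < 2) {
            req.state = State.REFUSED;
            req.reason = "not enough files to compact";
        } else {
            req.state = State.SUCCEEDED;
        }
    }

    public static void main(String[] args) {
        Request req = new Request();
        runPreChecks(req, 1);
        // SHOW COMPACTIONS would now surface the refusal and its reason.
        System.out.println(req.state + ": " + req.reason);
    }
}
```

The key difference from the current behaviour is that the refusal reason is preserved, so users no longer see a misleading success.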





[jira] [Updated] (HIVE-26101) Port Iceberg Hive fix - Hive: Avoid recursive listing in HiveCatalog#renameTable

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26101:
--
Labels: pull-request-available  (was: )

> Port Iceberg Hive fix - Hive: Avoid recursive listing in 
> HiveCatalog#renameTable
> 
>
> Key: HIVE-26101
> URL: https://issues.apache.org/jira/browse/HIVE-26101
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should port this HiveTableOperations fix from the Iceberg code





[jira] [Work logged] (HIVE-26101) Port Iceberg Hive fix - Hive: Avoid recursive listing in HiveCatalog#renameTable

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26101?focusedWorklogId=750942&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750942
 ]

ASF GitHub Bot logged work on HIVE-26101:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 11:49
Start Date: 31/Mar/22 11:49
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #3163:
URL: https://github.com/apache/hive/pull/3163


   ### What changes were proposed in this pull request?
   Prevent recursive listing
   
   ### Why are the changes needed?
   Performance
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Unit tests




Issue Time Tracking
---

Worklog Id: (was: 750942)
Remaining Estimate: 0h
Time Spent: 10m

> Port Iceberg Hive fix - Hive: Avoid recursive listing in 
> HiveCatalog#renameTable
> 
>
> Key: HIVE-26101
> URL: https://issues.apache.org/jira/browse/HIVE-26101
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We should port this HiveTableOperations fix from the Iceberg code





[jira] [Updated] (HIVE-26101) Port Iceberg Hive fix - Hive: Avoid recursive listing in HiveCatalog#renameTable

2022-03-31 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary updated HIVE-26101:
--
Summary: Port Iceberg Hive fix - Hive: Avoid recursive listing in 
HiveCatalog#renameTable  (was: Port Iceberg Hive fix - Hive: Avoid recursive 
listing in HiveCatalog#renameTable (#4407))

> Port Iceberg Hive fix - Hive: Avoid recursive listing in 
> HiveCatalog#renameTable
> 
>
> Key: HIVE-26101
> URL: https://issues.apache.org/jira/browse/HIVE-26101
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> We should port this HiveTableOperations fix from the Iceberg code





[jira] [Assigned] (HIVE-26101) Port Iceberg Hive fix - Hive: Avoid recursive listing in HiveCatalog#renameTable (#4407)

2022-03-31 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-26101:
-


> Port Iceberg Hive fix - Hive: Avoid recursive listing in 
> HiveCatalog#renameTable (#4407)
> 
>
> Key: HIVE-26101
> URL: https://issues.apache.org/jira/browse/HIVE-26101
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> We should port this HiveTableOperations fix from the Iceberg code





[jira] [Updated] (HIVE-26100) Preparing for 4.0.0-alpha-2 development

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26100:
--
Labels: pull-request-available  (was: )

> Preparing for 4.0.0-alpha-2 development
> ---
>
> Key: HIVE-26100
> URL: https://issues.apache.org/jira/browse/HIVE-26100
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (HIVE-26100) Preparing for 4.0.0-alpha-2 development

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26100?focusedWorklogId=750927&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750927
 ]

ASF GitHub Bot logged work on HIVE-26100:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 11:28
Start Date: 31/Mar/22 11:28
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #3162:
URL: https://github.com/apache/hive/pull/3162


   ### What changes were proposed in this pull request?
   New pom.xml versions, and new db files
   
   ### Why are the changes needed?
   So the next release development could be started
   
   ### Does this PR introduce _any_ user-facing change?
   New db version
   
   ### How was this patch tested?
   Unit tests




Issue Time Tracking
---

Worklog Id: (was: 750927)
Remaining Estimate: 0h
Time Spent: 10m

> Preparing for 4.0.0-alpha-2 development
> ---
>
> Key: HIVE-26100
> URL: https://issues.apache.org/jira/browse/HIVE-26100
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-26068) Add README to the src tarball

2022-03-31 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-26068.

Fix Version/s: 4.0.0-alpha-2
   Resolution: Fixed

Fixed in 
https://github.com/apache/hive/commit/2ac9b5f09000cdf041043659ba5623a4bd653a85. 
Thanks for the review [~pvary]!

> Add README to the src tarball
> -
>
> Key: HIVE-26068
> URL: https://issues.apache.org/jira/browse/HIVE-26068
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We need to add the README to the src tarball.
> This should contain info about how to build the project from source



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26068) Add README with build instructions to the src tarball

2022-03-31 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis updated HIVE-26068:
---
Summary: Add README with build instructions to the src tarball   (was: Add 
README to the src tarball)

> Add README with build instructions to the src tarball 
> --
>
> Key: HIVE-26068
> URL: https://issues.apache.org/jira/browse/HIVE-26068
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We need to add the README to the src tarball.
> This should contain info about how to build the project from source



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-26067) Remove core directory from src

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26067?focusedWorklogId=750914&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750914
 ]

ASF GitHub Bot logged work on HIVE-26067:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 11:03
Start Date: 31/Mar/22 11:03
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #3135:
URL: https://github.com/apache/hive/pull/3135


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750914)
Time Spent: 20m  (was: 10m)

> Remove core directory from src
> --
>
> Key: HIVE-26067
> URL: https://issues.apache.org/jira/browse/HIVE-26067
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This is not used. For the only file there we have an exact copy in 
> {{org.apache.hive.hcatalog}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-26067) Remove core directory from src

2022-03-31 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved HIVE-26067.

Fix Version/s: 4.0.0-alpha-2
   Resolution: Fixed

Fixed in 
https://github.com/apache/hive/commit/d22864ff734699c65404c80d7b67b102dbe3e873. 
Thanks for the review [~pvary]!

> Remove core directory from src
> --
>
> Key: HIVE-26067
> URL: https://issues.apache.org/jira/browse/HIVE-26067
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> This is not used. For the only file there we have an exact copy in 
> {{org.apache.hive.hcatalog}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-26068) Add README to the src tarball

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26068?focusedWorklogId=750913&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750913
 ]

ASF GitHub Bot logged work on HIVE-26068:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 11:03
Start Date: 31/Mar/22 11:03
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #3136:
URL: https://github.com/apache/hive/pull/3136


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750913)
Time Spent: 20m  (was: 10m)

> Add README to the src tarball
> -
>
> Key: HIVE-26068
> URL: https://issues.apache.org/jira/browse/HIVE-26068
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We need to add the README to the src tarball.
> This should contain info about how to build the project from source



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-24649) Optimise Hive::addWriteNotificationLog for large data inserts

2022-03-31 Thread mahesh kumar behera (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515236#comment-17515236
 ] 

mahesh kumar behera commented on HIVE-24649:


[~rajesh.balamohan] 

I think this is taken care of by HIVE-25205. Can you please confirm?

> Optimise Hive::addWriteNotificationLog for large data inserts
> -
>
> Key: HIVE-24649
> URL: https://issues.apache.org/jira/browse/HIVE-24649
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: Rajesh Balamohan
>Priority: Major
>  Labels: performance
>
> When loading a dynamic partition with a large dataset, a lot of time is spent in 
> "Hive::loadDynamicPartitions --> addWriteNotificationLog".
> Although it is for the same table, it ends up loading table and partition 
> details for every partition and writing to the notification log.
> Also, "Partition" details may already be present in the {{PartitionDetails}} 
> object in {{Hive::loadDynamicPartitions}}. This is unnecessarily recomputed 
> in {{HiveMetaStore::add_write_notification_log}}
>  
> Lines of interest:
> https://github.com/apache/hive/blob/master/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java#L3028
> https://github.com/apache/hive/blob/89073a94354f0cc14ec4ae0a43e05aae29276b4d/standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java#L8500
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-26100) Preparing for 4.0.0-alpha-2 development

2022-03-31 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-26100:
-


> Preparing for 4.0.0-alpha-2 development
> ---
>
> Key: HIVE-26100
> URL: https://issues.apache.org/jira/browse/HIVE-26100
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-26095) Add queryid in QueryLifeTimeHookContext

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26095?focusedWorklogId=750893&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750893
 ]

ASF GitHub Bot logged work on HIVE-26095:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 10:33
Start Date: 31/Mar/22 10:33
Worklog Time Spent: 10m 
  Work Description: zabetak commented on a change in pull request #3156:
URL: https://github.com/apache/hive/pull/3156#discussion_r839446442



##
File path: 
ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHookContextImpl.java
##
@@ -78,11 +88,12 @@ public Builder withHookContext(HookContext hc) {
   return this;
 }
 
-public QueryLifeTimeHookContextImpl build() {
+public QueryLifeTimeHookContextImpl build(String queryId) {
   QueryLifeTimeHookContextImpl queryLifeTimeHookContext = new 
QueryLifeTimeHookContextImpl();
   queryLifeTimeHookContext.setHiveConf(this.conf);
   queryLifeTimeHookContext.setCommand(this.command);
   queryLifeTimeHookContext.setHookContext(this.hc);
+  queryLifeTimeHookContext.queryId = Objects.requireNonNull(queryId);

Review comment:
   We can enforce a minimum length check, but it's not easy to have a maximum, since 
the query id includes the user name. The username's maximum length is OS-dependent.
   
   I added a few more checks in 
https://github.com/apache/hive/pull/3156/commits/aae20a1fd3e9ae289daa3e6c95c20ae8a209211f.
 These are mostly sanity checks. I don't think we should put very involved 
checks here. Assertions about the shape of the query id should be done via 
unit tests.
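   A minimal sketch of the kind of sanity check discussed here; the class and 
method names below are illustrative assumptions, not Hive's actual API:

```java
import java.util.Objects;

public class QueryIdCheck {
    // Hypothetical guard in the spirit of the review: reject null and
    // implausibly short ids, but impose no maximum length, since the id
    // embeds the OS user name whose maximum length varies by platform.
    public static String requireValidQueryId(String queryId) {
        Objects.requireNonNull(queryId, "queryId must not be null");
        if (queryId.trim().length() < 5) {
            throw new IllegalArgumentException("Invalid queryId: " + queryId);
        }
        return queryId;
    }

    public static void main(String[] args) {
        // prints hive_20220331103300_1234
        System.out.println(requireValidQueryId("hive_20220331103300_1234"));
    }
}
```

   Anything stricter (a regex on the timestamp segment, for instance) would 
belong in unit tests rather than the hook path, as the comment above argues.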




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750893)
Time Spent: 40m  (was: 0.5h)

> Add queryid in QueryLifeTimeHookContext
> ---
>
> Key: HIVE-26095
> URL: https://issues.apache.org/jira/browse/HIVE-26095
> Project: Hive
>  Issue Type: New Feature
>  Components: Hooks
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0-alpha-2
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> A 
> [QueryLifeTimeHook|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHook.java]
>  is executed various times in the life-cycle of a query but it is not always 
> possible to obtain the id of the query. The query id is inside the 
> {{HookContext}} but the latter is not always available notably during 
> compilation.
> The query id is useful for many purposes as it is the only way to uniquely 
> identify the query/command that is currently running. It is also the only way 
> to match together events appearing in before and after methods.
> The goal of this jira is to add the query id in 
> [QueryLifeTimeHookContext|https://github.com/apache/hive/blob/6c0b86ef0cfc67c5acb3468408e1d46fa6ef8024/ql/src/java/org/apache/hadoop/hive/ql/hooks/QueryLifeTimeHookContext.java]
>  and make it available during all life-cycle events.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-26099) Move patched-iceberg packages to org.apache.hive group

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26099?focusedWorklogId=750865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750865
 ]

ASF GitHub Bot logged work on HIVE-26099:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 09:21
Start Date: 31/Mar/22 09:21
Worklog Time Spent: 10m 
  Work Description: pvary opened a new pull request #3161:
URL: https://github.com/apache/hive/pull/3161


   ### What changes were proposed in this pull request?
   Move the patched Iceberg modules from the org.apache.iceberg package to 
org.apache.hive
   
   ### Why are the changes needed?
   We do not want to accidentally put artifacts under another project's package
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Run the compilation


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750865)
Remaining Estimate: 0h
Time Spent: 10m

> Move patched-iceberg packages to org.apache.hive group
> --
>
> Key: HIVE-26099
> URL: https://issues.apache.org/jira/browse/HIVE-26099
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When releasing stuff we should release everything under {{org.apache.hive}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-26099) Move patched-iceberg packages to org.apache.hive group

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26099:
--
Labels: pull-request-available  (was: )

> Move patched-iceberg packages to org.apache.hive group
> --
>
> Key: HIVE-26099
> URL: https://issues.apache.org/jira/browse/HIVE-26099
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When releasing stuff we should release everything under {{org.apache.hive}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-26099) Move patched-iceberg packages to org.apache.hive group

2022-03-31 Thread Peter Vary (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Vary reassigned HIVE-26099:
-


> Move patched-iceberg packages to org.apache.hive group
> --
>
> Key: HIVE-26099
> URL: https://issues.apache.org/jira/browse/HIVE-26099
> Project: Hive
>  Issue Type: Task
>Reporter: Peter Vary
>Assignee: Peter Vary
>Priority: Major
>
> When releasing stuff we should release everything under {{org.apache.hive}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25492) Major query-based compaction is skipped if partition is empty

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25492?focusedWorklogId=750848&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750848
 ]

ASF GitHub Bot logged work on HIVE-25492:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 08:51
Start Date: 31/Mar/22 08:51
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #3157:
URL: https://github.com/apache/hive/pull/3157#discussion_r839344126



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -1480,6 +1471,40 @@ private static ValidTxnList 
getValidTxnList(Configuration conf) {
 return validTxnList;
   }
 
+
+  /**
+   * In case of the cleaner, we don't need to go into file level, it is enough 
to collect base/delta/deletedelta directories.
+   *
+   * @param fs the filesystem used for the directory lookup
+   * @param path the path of the table or partition needs to be cleaned
+   * @return The listed directory snapshot needs to be checked for cleaning
+   * @throws IOException on filesystem errors
+   */
+  public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshotsForCleaner(final FileSystem fs, final Path path)
+      throws IOException {
+    Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
+    // depth first search
+    Deque<RemoteIterator<FileStatus>> stack = new ArrayDeque<>();
+    stack.push(fs.listStatusIterator(path));
+    while (!stack.isEmpty()) {
+      RemoteIterator<FileStatus> itr = stack.pop();
+      while (itr.hasNext()) {
+        FileStatus fStatus = itr.next();
+        Path fPath = fStatus.getPath();
+        if (acidHiddenFileFilter.accept(fPath)) {
+          if (deltaFileFilter.accept(fPath) ||
+              baseFileFilter.accept(fPath) ||
+              deleteEventDeltaDirFilter.accept(fPath)) {
+            dirToSnapshots.put(fPath, new HdfsDirSnapshotImpl(fPath));
+          } else {

Review comment:
   we could check if fStatus.isDirectory() before listing again
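   The suggestion (check whether an entry is a directory before listing it 
again) can be sketched with java.nio standing in for the Hadoop FileSystem 
API; the class and method names below are illustrative, not Hive's code:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class DirSnapshotWalk {
    // Iterative depth-first walk that only descends into directories,
    // avoiding a wasted listing call on plain files (the review suggestion).
    public static List<Path> collectDirs(Path root) throws IOException {
        List<Path> dirs = new ArrayList<>();
        Deque<Path> stack = new ArrayDeque<>();
        stack.push(root);
        while (!stack.isEmpty()) {
            try (DirectoryStream<Path> listing = Files.newDirectoryStream(stack.pop())) {
                for (Path child : listing) {
                    if (Files.isDirectory(child)) {  // check before listing again
                        dirs.add(child);
                        stack.push(child);
                    }
                }
            }
        }
        return dirs;
    }
}
```

   On a remote filesystem the same reordering saves one round trip per plain 
file, which matters when a partition directory holds many files.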




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750848)
Time Spent: 50m  (was: 40m)

> Major query-based compaction is skipped if partition is empty
> -
>
> Key: HIVE-25492
> URL: https://issues.apache.org/jira/browse/HIVE-25492
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently if the result of query-based compaction is an empty base, delta, or 
> delete delta, the empty directory is deleted.
> This is because of minor compaction – if there are only deltas to compact, 
> then no compacted delete delta should be created (only a compacted delta). In 
> the same way, if there are only delete deltas to compact, then no compacted 
> delta should be created (only a compacted delete delta).
> There is an issue with major compaction. If all the data in the partition has 
> been deleted, then we should get an empty base directory after compaction. 
> Instead, the empty base directory is deleted because it's empty and 
> compaction claims to succeed but we end up with the same deltas/delete deltas 
> we started with – basically compaction does not run.
> Where to start? MajorQueryCompactor#commitCompaction
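
The distinction described above can be reduced to a small decision rule; this 
sketch is purely illustrative and does not reproduce Hive's actual 
MajorQueryCompactor code:

```java
public class CompactionCommitRule {
    enum CompactionType { MAJOR, MINOR }

    // Hypothetical rule: after major compaction, an empty base directory is
    // meaningful (all rows were deleted) and must be kept; after minor
    // compaction, an empty compacted delta or delete delta carries no
    // information and can be dropped.
    static boolean keepEmptyResultDir(CompactionType type) {
        return type == CompactionType.MAJOR;
    }
}
```

Under this rule the empty base survives commit, so the old deltas become 
obsolete and the cleaner can remove them.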



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25492) Major query-based compaction is skipped if partition is empty

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25492?focusedWorklogId=750849&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750849
 ]

ASF GitHub Bot logged work on HIVE-25492:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 08:51
Start Date: 31/Mar/22 08:51
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #3157:
URL: https://github.com/apache/hive/pull/3157#discussion_r839344900



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -1480,6 +1471,40 @@ private static ValidTxnList 
getValidTxnList(Configuration conf) {
 return validTxnList;
   }
 
+
+  /**
+   * In case of the cleaner, we don't need to go into file level, it is enough 
to collect base/delta/deletedelta directories.
+   *
+   * @param fs the filesystem used for the directory lookup
+   * @param path the path of the table or partition needs to be cleaned
+   * @return The listed directory snapshot needs to be checked for cleaning
+   * @throws IOException on filesystem errors
+   */
+  public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshotsForCleaner(final FileSystem fs, final Path path)
+      throws IOException {
+    Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
+    // depth first search
+    Deque<RemoteIterator<FileStatus>> stack = new ArrayDeque<>();
+    stack.push(fs.listStatusIterator(path));
+    while (!stack.isEmpty()) {
+      RemoteIterator<FileStatus> itr = stack.pop();
+      while (itr.hasNext()) {
+        FileStatus fStatus = itr.next();
+        Path fPath = fStatus.getPath();
+        if (acidHiddenFileFilter.accept(fPath)) {
+          if (deltaFileFilter.accept(fPath) ||

Review comment:
   please use a consistent order: base/delta/deleteDelta




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750849)
Time Spent: 1h  (was: 50m)

> Major query-based compaction is skipped if partition is empty
> -
>
> Key: HIVE-25492
> URL: https://issues.apache.org/jira/browse/HIVE-25492
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Currently if the result of query-based compaction is an empty base, delta, or 
> delete delta, the empty directory is deleted.
> This is because of minor compaction – if there are only deltas to compact, 
> then no compacted delete delta should be created (only a compacted delta). In 
> the same way, if there are only delete deltas to compact, then no compacted 
> delta should be created (only a compacted delete delta).
> There is an issue with major compaction. If all the data in the partition has 
> been deleted, then we should get an empty base directory after compaction. 
> Instead, the empty base directory is deleted because it's empty and 
> compaction claims to succeed but we end up with the same deltas/delete deltas 
> we started with – basically compaction does not run.
> Where to start? MajorQueryCompactor#commitCompaction



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25492) Major query-based compaction is skipped if partition is empty

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25492?focusedWorklogId=750846&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750846
 ]

ASF GitHub Bot logged work on HIVE-25492:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 08:49
Start Date: 31/Mar/22 08:49
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #3157:
URL: https://github.com/apache/hive/pull/3157#discussion_r839342668



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -1480,6 +1471,40 @@ private static ValidTxnList 
getValidTxnList(Configuration conf) {
 return validTxnList;
   }
 
+
+  /**
+   * In case of the cleaner, we don't need to go into file level, it is enough 
to collect base/delta/deletedelta directories.
+   *
+   * @param fs the filesystem used for the directory lookup
+   * @param path the path of the table or partition needs to be cleaned
+   * @return The listed directory snapshot needs to be checked for cleaning
+   * @throws IOException on filesystem errors
+   */
+  public static Map<Path, HdfsDirSnapshot> getHdfsDirSnapshotsForCleaner(final FileSystem fs, final Path path)
+      throws IOException {
+    Map<Path, HdfsDirSnapshot> dirToSnapshots = new HashMap<>();
+    // depth first search
+    Deque<RemoteIterator<FileStatus>> stack = new ArrayDeque<>();
+    stack.push(fs.listStatusIterator(path));
+    while (!stack.isEmpty()) {
+      RemoteIterator<FileStatus> itr = stack.pop();
+      while (itr.hasNext()) {
+        FileStatus fStatus = itr.next();
+        Path fPath = fStatus.getPath();
+        if (acidHiddenFileFilter.accept(fPath)) {
Review comment:
   Should we skip temp dirs as well - acidTempDirFilter?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750846)
Time Spent: 40m  (was: 0.5h)

> Major query-based compaction is skipped if partition is empty
> -
>
> Key: HIVE-25492
> URL: https://issues.apache.org/jira/browse/HIVE-25492
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently if the result of query-based compaction is an empty base, delta, or 
> delete delta, the empty directory is deleted.
> This is because of minor compaction – if there are only deltas to compact, 
> then no compacted delete delta should be created (only a compacted delta). In 
> the same way, if there are only delete deltas to compact, then no compacted 
> delta should be created (only a compacted delete delta).
> There is an issue with major compaction. If all the data in the partition has 
> been deleted, then we should get an empty base directory after compaction. 
> Instead, the empty base directory is deleted because it's empty and 
> compaction claims to succeed but we end up with the same deltas/delete deltas 
> we started with – basically compaction does not run.
> Where to start? MajorQueryCompactor#commitCompaction



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25492) Major query-based compaction is skipped if partition is empty

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25492?focusedWorklogId=750829&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750829
 ]

ASF GitHub Bot logged work on HIVE-25492:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 08:06
Start Date: 31/Mar/22 08:06
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #3157:
URL: https://github.com/apache/hive/pull/3157#discussion_r839301416



##
File path: ql/src/java/org/apache/hadoop/hive/ql/io/AcidUtils.java
##
@@ -28,16 +28,7 @@
 import java.io.Serializable;
 import java.net.URI;
 import java.net.URISyntaxException;
-import java.util.ArrayList;
-import java.util.Arrays;
-import java.util.Collections;
-import java.util.Comparator;
-import java.util.HashMap;
-import java.util.HashSet;
-import java.util.List;
-import java.util.Map;
-import java.util.Properties;
-import java.util.Set;
+import java.util.*;

Review comment:
   wildcard imports




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750829)
Time Spent: 0.5h  (was: 20m)

> Major query-based compaction is skipped if partition is empty
> -
>
> Key: HIVE-25492
> URL: https://issues.apache.org/jira/browse/HIVE-25492
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently if the result of query-based compaction is an empty base, delta, or 
> delete delta, the empty directory is deleted.
> This is because of minor compaction – if there are only deltas to compact, 
> then no compacted delete delta should be created (only a compacted delta). In 
> the same way, if there are only delete deltas to compact, then no compacted 
> delta should be created (only a compacted delete delta).
> There is an issue with major compaction. If all the data in the partition has 
> been deleted, then we should get an empty base directory after compaction. 
> Instead, the empty base directory is deleted because it's empty and 
> compaction claims to succeed but we end up with the same deltas/delete deltas 
> we started with – basically compaction does not run.
> Where to start? MajorQueryCompactor#commitCompaction



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25492) Major query-based compaction is skipped if partition is empty

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25492?focusedWorklogId=750828&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750828
 ]

ASF GitHub Bot logged work on HIVE-25492:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 08:05
Start Date: 31/Mar/22 08:05
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #3157:
URL: https://github.com/apache/hive/pull/3157#discussion_r839300569



##
File path: 
itests/hive-unit/src/test/java/org/apache/hadoop/hive/ql/txn/compactor/TestCrudCompactorOnTez.java
##
@@ -28,9 +28,7 @@
 
 import com.google.common.collect.Lists;
 import org.apache.commons.lang3.StringUtils;
-import org.apache.hadoop.fs.FileSystem;
-import org.apache.hadoop.fs.Path;
-import org.apache.hadoop.fs.PathFilter;
+import org.apache.hadoop.fs.*;

Review comment:
   wildcard imports




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750828)
Time Spent: 20m  (was: 10m)

> Major query-based compaction is skipped if partition is empty
> -
>
> Key: HIVE-25492
> URL: https://issues.apache.org/jira/browse/HIVE-25492
> Project: Hive
>  Issue Type: Bug
>Reporter: Karen Coppage
>Assignee: Antal Sinkovits
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently if the result of query-based compaction is an empty base, delta, or 
> delete delta, the empty directory is deleted.
> This is because of minor compaction – if there are only deltas to compact, 
> then no compacted delete delta should be created (only a compacted delta). In 
> the same way, if there are only delete deltas to compact, then no compacted 
> delta should be created (only a compacted delete delta).
> There is an issue with major compaction. If all the data in the partition has 
> been deleted, then we should get an empty base directory after compaction. 
> Instead, the empty base directory is deleted because it's empty and 
> compaction claims to succeed but we end up with the same deltas/delete deltas 
> we started with – basically compaction does not run.
> Where to start? MajorQueryCompactor#commitCompaction



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-26098) Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path causing IllegalArgumentException

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26098?focusedWorklogId=750827&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750827
 ]

ASF GitHub Bot logged work on HIVE-26098:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 08:02
Start Date: 31/Mar/22 08:02
Worklog Time Spent: 10m 
  Work Description: maheshk114 closed pull request #3159:
URL: https://github.com/apache/hive/pull/3159


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 750827)
Time Spent: 0.5h  (was: 20m)

> Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path 
> causing IllegalArgumentException
> --
>
> Key: HIVE-26098
> URL: https://issues.apache.org/jira/browse/HIVE-26098
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> hive.aux.jars.path and hive.reloadable.aux.jars.path are used for providing 
> auxiliary jars which are used during query processing. These jars are copied 
> to the Tez temp path so that the Tez jobs have access to them while 
> processing the job. There is a duplicate check to avoid copying the same jar 
> multiple times. This check assumes the jar is in the local file system. But in 
> reality, the jar path can be anywhere. So this duplicate check fails when the 
> source path is not a local path.
> {code:java}
> ERROR : Failed to execute tez graph.
> java.lang.IllegalArgumentException: Wrong FS: 
> hdfs://localhost:53877/tmp/test_jar/identity_udf.jar, expected: file:///
>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781) 
> ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.pathToFile(RawLocalFileSystem.java:86)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:636)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:930)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:631)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:454)
>  ~[hadoop-common-3.1.0.jar:?]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.checkPreExisting(DagUtils.java:1392)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeResource(DagUtils.java:1411)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.addTempResources(DagUtils.java:1295)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.DagUtils.localizeTempFilesFromConf(DagUtils.java:1177)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.ensureLocalResources(TezSessionState.java:636)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.openInternal(TezSessionState.java:283)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionPoolSession.openInternal(TezSessionPoolSession.java:124)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezSessionState.open(TezSessionState.java:241)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.tez.TezTask.ensureSessionHasResources(TezTask.java:448)
>  ~[hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:215) 
> [hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:212) 
> [hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:105) 
> [hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at org.apache.hadoop.hive.ql.Executor.launchTask(Executor.java:361) 
> [hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at org.apache.hadoop.hive.ql.Executor.launchTasks(Executor.java:334) 
> [hive-exec-4.0.0-alpha-1.jar:4.0.0-alpha-1]
>     at 
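The fix described above needs a scheme-aware duplicate check rather than one that resolves every path against the local file system. As a minimal, hedged sketch (the class `DedupJars` and its `dedup` helper are illustrative, not Hive's actual `DagUtils` code), jar locations can be deduplicated by their full URI, so `hdfs://` and `file://` entries are compared correctly:

```java
import java.net.URI;
import java.util.LinkedHashSet;
import java.util.Set;

public class DedupJars {

    // Illustrative helper: deduplicate jar locations by their full URI
    // (scheme + authority + path), so the check works whether the jar
    // lives on the local file system, HDFS, or anywhere else.
    public static Set<URI> dedup(String... paths) {
        Set<URI> seen = new LinkedHashSet<>();
        for (String p : paths) {
            // normalize() collapses "." and ".." segments so equivalent
            // spellings of the same path compare equal
            seen.add(URI.create(p).normalize());
        }
        return seen;
    }

    public static void main(String[] args) {
        Set<URI> jars = dedup(
            "hdfs://localhost:53877/tmp/test_jar/identity_udf.jar",
            "hdfs://localhost:53877/tmp/test_jar/identity_udf.jar", // duplicate
            "file:///tmp/test_jar/identity_udf.jar");               // same path, different FS
        System.out.println(jars.size()); // prints 2
    }
}
```

In Hadoop itself, the analogous move is to obtain the file system from the path (`path.getFileSystem(conf)`) instead of assuming the local one, which is what the "Wrong FS ... expected: file:///" exception above points at.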

[jira] [Work logged] (HIVE-26098) Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path causing IllegalArgumentException

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26098?focusedWorklogId=750825&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750825
 ]

ASF GitHub Bot logged work on HIVE-26098:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 08:01
Start Date: 31/Mar/22 08:01
Worklog Time Spent: 10m 
  Work Description: maheshk114 opened a new pull request #3160:
URL: https://github.com/apache/hive/pull/3160


   …
   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 750825)
Time Spent: 20m  (was: 10m)

> Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path 
> causing IllegalArgumentException
> --
>
> Key: HIVE-26098
> URL: https://issues.apache.org/jira/browse/HIVE-26098
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
>  hive.aux.jars.path and hive.reloadable.aux.jars.path are used to provide 
> auxiliary jars needed during query processing. These jars are copied to the 
> Tez temp path so that Tez jobs can access them while processing the job. 
> There is a duplicate check to avoid copying the same jar multiple times, but 
> it assumes the jar resides on the local file system. In reality, the jar path 
> can be on any file system, so the duplicate check fails when the source path 
> is not local.

[jira] [Commented] (HIVE-24907) Wrong results with LEFT JOIN and subqueries with UNION and GROUP BY

2022-03-31 Thread Linleicheng (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24907?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515151#comment-17515151
 ] 

Linleicheng commented on HIVE-24907:


We can reproduce the issue if we generate an operator tree like the one below:

 
{noformat}
   
Reducer 2 
Reduce Operator Tree:
  Group By Operator
keys: KEY._col0 (type: int)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
Column stats: COMPLETE
Execution mode: llap
Reduce Operator Tree:
  Group By Operator
keys: KEY._col0 (type: int)
mode: mergepartial
outputColumnNames: _col0
Statistics: Num rows: 1 Data size: 4 Basic stats: COMPLETE 
Column stats: COMPLETE
Merge Join Operator
  condition map:
   Left Outer Join 0 to 1
  keys:
0 _col0 (type: int)
1 _col0 (type: int)
...
...

{noformat}
We should avoid generating a map-join operator, so explicitly setting 
hive.auto.convert.join to false helps.
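Applied to the reproduction query from the issue description, the suggested workaround reads as follows (a sketch of the setting described above, using the `tbl` table from the bug report):

```sql
-- Workaround: stop Hive from converting the outer join into a map-join,
-- which is the conversion behind the wrong-results plan shown above.
SET hive.auto.convert.join=false;

SELECT sub1.key, sub2.key
FROM
  (SELECT a.key FROM tbl a GROUP BY a.key) sub1
LEFT OUTER JOIN (
  SELECT b.key FROM tbl b WHERE b.value = 2001 GROUP BY b.key
  UNION
  SELECT c.key FROM tbl c WHERE c.value = 2005 GROUP BY c.key) sub2
ON sub1.key = sub2.key;
```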

 

> Wrong results with LEFT JOIN and subqueries with UNION and GROUP BY
> ---
>
> Key: HIVE-24907
> URL: https://issues.apache.org/jira/browse/HIVE-24907
> Project: Hive
>  Issue Type: Bug
>  Components: Query Processor
>Affects Versions: 2.4.0, 3.2.0, 4.0.0
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>
> The following SQL query returns wrong results when run in TEZ/LLAP:
> {code:sql}
> SET hive.auto.convert.sortmerge.join=true;
> CREATE TABLE tbl (key int,value int);
> INSERT INTO tbl VALUES (1, 2000);
> INSERT INTO tbl VALUES (2, 2001);
> INSERT INTO tbl VALUES (3, 2005);
> SELECT sub1.key, sub2.key
> FROM
>   (SELECT a.key FROM tbl a GROUP BY a.key) sub1
> LEFT OUTER JOIN (
>   SELECT b.key FROM tbl b WHERE b.value = 2001 GROUP BY b.key
>   UNION
>   SELECT c.key FROM tbl c WHERE c.value = 2005 GROUP BY c.key) sub2 
> ON sub1.key = sub2.key;
> {code}
> Actual results:
> ||SUB1.KEY||SUB2.KEY||
> |1|NULL|
> |2|NULL|
> |3|NULL|
> Expected results:
> ||SUB1.KEY||SUB2.KEY||
> |1|NULL|
> |2|2|
> |3|3|
> The issue can be reproduced with {{TestMiniLlapLocalCliDriver}} or 
> {{TestMiniTezCliDriver}} in older versions of Hive.





[jira] [Work logged] (HIVE-26098) Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path causing IllegalArgumentException

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26098?focusedWorklogId=750804&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-750804
 ]

ASF GitHub Bot logged work on HIVE-26098:
-

Author: ASF GitHub Bot
Created on: 31/Mar/22 07:28
Start Date: 31/Mar/22 07:28
Worklog Time Spent: 10m 
  Work Description: maheshk114 opened a new pull request #3159:
URL: https://github.com/apache/hive/pull/3159


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 750804)
Remaining Estimate: 0h
Time Spent: 10m

> Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path 
> causing IllegalArgumentException
> --
>
> Key: HIVE-26098
> URL: https://issues.apache.org/jira/browse/HIVE-26098
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  hive.aux.jars.path and hive.reloadable.aux.jars.path are used to provide 
> auxiliary jars needed during query processing. These jars are copied to the 
> Tez temp path so that Tez jobs can access them while processing the job. 
> There is a duplicate check to avoid copying the same jar multiple times, but 
> it assumes the jar resides on the local file system. In reality, the jar path 
> can be on any file system, so the duplicate check fails when the source path 
> is not local.

[jira] [Updated] (HIVE-26098) Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path causing IllegalArgumentException

2022-03-31 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-26098:
--
Labels: pull-request-available  (was: )

> Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path 
> causing IllegalArgumentException
> --
>
> Key: HIVE-26098
> URL: https://issues.apache.org/jira/browse/HIVE-26098
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
>  hive.aux.jars.path and hive.reloadable.aux.jars.path are used to provide 
> auxiliary jars needed during query processing. These jars are copied to the 
> Tez temp path so that Tez jobs can access them while processing the job. 
> There is a duplicate check to avoid copying the same jar multiple times, but 
> it assumes the jar resides on the local file system. In reality, the jar path 
> can be on any file system, so the duplicate check fails when the source path 
> is not local.

[jira] [Assigned] (HIVE-26098) Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path causing IllegalArgumentException

2022-03-31 Thread mahesh kumar behera (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-26098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mahesh kumar behera reassigned HIVE-26098:
--


> Duplicate path/Jar in hive.aux.jars.path or hive.reloadable.aux.jars.path 
> causing IllegalArgumentException
> --
>
> Key: HIVE-26098
> URL: https://issues.apache.org/jira/browse/HIVE-26098
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, HiveServer2
>Reporter: mahesh kumar behera
>Assignee: mahesh kumar behera
>Priority: Major
>
>  hive.aux.jars.path and hive.reloadable.aux.jars.path are used to provide 
> auxiliary jars needed during query processing. These jars are copied to the 
> Tez temp path so that Tez jobs can access them while processing the job. 
> There is a duplicate check to avoid copying the same jar multiple times, but 
> it assumes the jar resides on the local file system. In reality, the jar path 
> can be on any file system, so the duplicate check fails when the source path 
> is not local.

[jira] [Commented] (HIVE-25096) beeline can't get the correct hiveserver2 using the zoopkeeper with serviceDiscoveryMode=zooKeeper.

2022-03-31 Thread hezhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515107#comment-17515107
 ] 

hezhang commented on HIVE-25096:


The error occurs when starting Hive with Ranger.

> beeline can't get the correct HiveServer2 using ZooKeeper with 
> serviceDiscoveryMode=zooKeeper.
> ---
>
> Key: HIVE-25096
> URL: https://issues.apache.org/jira/browse/HIVE-25096
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.1.2
> Environment: centos7.4
> x86_64
>Reporter: xiaozhongcheng
>Assignee: hezhang
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-25096.patch
>
>
> beeline can't get the correct HiveServer2 using ZooKeeper with 
> serviceDiscoveryMode=zooKeeper.
>  
> {code:java}
> // code placeholder
> [root@vhost-120-28 hive]# beeline -u 
> "jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2"
>  --verbose=true
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in 
> [jar:file:/usr/wdp/1.0/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in 
> [jar:file:/usr/wdp/1.0/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an 
> explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> !connect 
> jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
>  '' [passwd stripped] 
> Connecting to 
> jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
> Error: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read 
> HiveServer2 configs from ZooKeeper (state=,code=0)
> java.sql.SQLException: org.apache.hive.jdbc.ZooKeeperHiveClientException: 
> Unable to read HiveServer2 configs from ZooKeeper
> at org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:170)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:208)
> at 
> org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145)
> at 
> org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:209)
> at org.apache.hive.beeline.Commands.connect(Commands.java:1641)
> at org.apache.hive.beeline.Commands.connect(Commands.java:1536)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at 
> org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:56)
> at 
> org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1384)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1423)
> at org.apache.hive.beeline.BeeLine.connectUsingArgs(BeeLine.java:900)
> at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:795)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1048)
> at 
> org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:538)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:520)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read 
> HiveServer2 configs from ZooKeeper
> at 
> org.apache.hive.jdbc.ZooKeeperHiveClientHelper.configureConnParams(ZooKeeperHiveClientHelper.java:147)
> at 
> org.apache.hive.jdbc.Utils.configureConnParamsFromZooKeeper(Utils.java:511)
> at org.apache.hive.jdbc.Utils.parseURL(Utils.java:334)
> at org.apache.hive.jdbc.HiveConnection.(HiveConnection.java:168)
> ... 25 more
> {code}
>   
>  You know, HiveServer2#startPrivilegeSynchronizer will create the namespace 
> of 

[jira] [Comment Edited] (HIVE-25096) beeline can't get the correct hiveserver2 using the zoopkeeper with serviceDiscoveryMode=zooKeeper.

2022-03-31 Thread hezhang (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-25096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17515107#comment-17515107
 ] 

hezhang edited comment on HIVE-25096 at 3/31/22, 6:50 AM:
--

The error occurs when starting Hive with Ranger. HIVE-25096.patch is attached to fix it.


was (Author: heiheizhang):
when starting hive with ranger, the err will occur.

> beeline can't get the correct HiveServer2 using ZooKeeper with 
> serviceDiscoveryMode=zooKeeper.
> ---
>
> Key: HIVE-25096
> URL: https://issues.apache.org/jira/browse/HIVE-25096
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.1.2
> Environment: centos7.4
> x86_64
>Reporter: xiaozhongcheng
>Assignee: hezhang
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-25096.patch
>
>
> beeline can't get the correct HiveServer2 using ZooKeeper with 
> serviceDiscoveryMode=zooKeeper.
>  
> at org.apache.hive.jdbc.Utils.parseURL(Utils.java:334)
> at 

[jira] [Updated] (HIVE-25096) Beeline can't get the correct HiveServer2 via ZooKeeper with serviceDiscoveryMode=zooKeeper.

2022-03-31 Thread hezhang (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

hezhang updated HIVE-25096:
---
Attachment: HIVE-25096.patch
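The failure quoted below arises because the JDBC driver treats every child znode under the discovery namespace as a HiveServer2 instance, including the leader-election znode. A minimal, purely illustrative sketch of the filtering a fix would need (class and method names here are hypothetical and not the actual HIVE-25096 patch; it assumes instance znodes use the usual `serverUri=...;version=...;sequence=NNN` name format):

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class Hs2InstanceFilter {
    // HiveServer2 registers discovery znodes whose names start with
    // "serverUri="; other children under the namespace (e.g. the
    // "leader" node used for leader election) carry no connection
    // config and must be skipped by the client.
    static boolean looksLikeHs2Instance(String znodeName) {
        return znodeName.startsWith("serverUri=");
    }

    static List<String> pickInstances(List<String> children) {
        return children.stream()
                .filter(Hs2InstanceFilter::looksLikeHs2Instance)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Children as a client might see them under /hiveserver2:
        List<String> children = Arrays.asList(
                "serverUri=vhost-120-26:10000;version=3.1.2;sequence=0000000000",
                "leader");  // created by the privilege synchronizer
        System.out.println(pickInstances(children));
        // → [serverUri=vhost-120-26:10000;version=3.1.2;sequence=0000000000]
    }
}
```

Without such a filter, a client that picks a random child can land on `leader`, fail to parse any `serverUri`, and surface exactly the "Unable to read HiveServer2 configs from ZooKeeper" error shown in the trace.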

> Beeline can't get the correct HiveServer2 via ZooKeeper with 
> serviceDiscoveryMode=zooKeeper.
> ---
>
> Key: HIVE-25096
> URL: https://issues.apache.org/jira/browse/HIVE-25096
> Project: Hive
>  Issue Type: Bug
>  Components: Beeline
>Affects Versions: 3.1.2
> Environment: centos7.4
> x86_64
>Reporter: xiaozhongcheng
>Assignee: hezhang
>Priority: Major
> Fix For: 4.0.0
>
> Attachments: HIVE-25096.patch
>
>
> Beeline can't get the correct HiveServer2 via ZooKeeper with 
> serviceDiscoveryMode=zooKeeper.
>  
> {code:java}
> [root@vhost-120-28 hive]# beeline -u "jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2" --verbose=true
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/usr/wdp/1.0/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/usr/wdp/1.0/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
> !connect jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2 '' [passwd stripped]
> Connecting to jdbc:hive2://vhost-120-26:2181,vhost-120-27:2181,vhost-120-28:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2
> Error: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper (state=,code=0)
> java.sql.SQLException: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:170)
> at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
> at java.sql.DriverManager.getConnection(DriverManager.java:664)
> at java.sql.DriverManager.getConnection(DriverManager.java:208)
> at org.apache.hive.beeline.DatabaseConnection.connect(DatabaseConnection.java:145)
> at org.apache.hive.beeline.DatabaseConnection.getConnection(DatabaseConnection.java:209)
> at org.apache.hive.beeline.Commands.connect(Commands.java:1641)
> at org.apache.hive.beeline.Commands.connect(Commands.java:1536)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hive.beeline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:56)
> at org.apache.hive.beeline.BeeLine.execCommandWithPrefix(BeeLine.java:1384)
> at org.apache.hive.beeline.BeeLine.dispatch(BeeLine.java:1423)
> at org.apache.hive.beeline.BeeLine.connectUsingArgs(BeeLine.java:900)
> at org.apache.hive.beeline.BeeLine.initArgs(BeeLine.java:795)
> at org.apache.hive.beeline.BeeLine.begin(BeeLine.java:1048)
> at org.apache.hive.beeline.BeeLine.mainWithInputRedirection(BeeLine.java:538)
> at org.apache.hive.beeline.BeeLine.main(BeeLine.java:520)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
> Caused by: org.apache.hive.jdbc.ZooKeeperHiveClientException: Unable to read HiveServer2 configs from ZooKeeper
> at org.apache.hive.jdbc.ZooKeeperHiveClientHelper.configureConnParams(ZooKeeperHiveClientHelper.java:147)
> at org.apache.hive.jdbc.Utils.configureConnParamsFromZooKeeper(Utils.java:511)
> at org.apache.hive.jdbc.Utils.parseURL(Utils.java:334)
> at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:168)
> ... 25 more
> {code}
>   
>  As you know, HiveServer2#startPrivilegeSynchronizer creates the znode 
> /hiveserver2/leader in ZooKeeper; however, if you want to