[jira] [Work logged] (HIVE-25397) Snapshot support for controlled failover

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25397?focusedWorklogId=693763&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693763
 ]

ASF GitHub Bot logged work on HIVE-25397:
-

Author: ASF GitHub Bot
Created on: 10/Dec/21 07:41
Start Date: 10/Dec/21 07:41
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on a change in pull request #2539:
URL: https://github.com/apache/hive/pull/2539#discussion_r766370403



##
File path: common/src/java/org/apache/hadoop/hive/conf/HiveConf.java
##
@@ -695,6 +695,10 @@ private static void populateLlapDaemonVarsSet(Set llapDaemonVarsSetLocal
 + "data copy, the target data is overwritten and the modifications are removed and the copy is again "
 + "attempted using the snapshot based approach. If disabled, the replication will fail in case the target is "
 + "modified."),
+REPL_REUSE_SNAPSHOTS("hive.repl.reuse.snapshots", false,
+"If enabled,reusing snapshots is attempted in case of controlled failover(B->A) when same paths are"
++ "used for external table replication on src and target. Also in cases of failed incremental where re-bootstrap is required."
++ "If set to true and snapshots exist in some paths, it creates/reuses new snapshots in those paths using the same name as exisiting snapshots."),

Review comment:
   Typo: "exisiting" should be "existing".

##
File path: 
shims/0.23/src/main/java/org/apache/hadoop/hive/shims/Hadoop23Shims.java
##
@@ -1237,25 +1238,42 @@ public boolean runDistCpWithSnapshots(String oldSnapshot, String newSnapshot, Li
 LOG.warn("Copy failed with INVALID_ARGUMENT for source: {} to target: {} snapshot1: {} snapshot2: {} "
 + "params: {}", srcPaths, dst, oldSnapshot, newSnapshot, params);
 return true;
-  } else if (returnCode == DistCpConstants.UNKNOWN_ERROR && overwriteTarget) {
+  } else if (returnCode == DistCpConstants.UNKNOWN_ERROR) {
 // Check if this error is due to target modified.
-if (shouldRdiff(dst, conf, oldSnapshot, overwriteTarget)) {
-  LOG.warn("Copy failed due to target modified. Attempting to restore back the target. source: {} target: {} "
-  + "snapshot: {}", srcPaths, dst, oldSnapshot);
-  List rParams = constructDistCpWithSnapshotParams(srcPaths, dst, ".", oldSnapshot, conf, "-rdiff");
-  DistCp rDistcp = new DistCp(conf, null);
-  returnCode = rDistcp.run(rParams.toArray(new String[0]));
-  if (returnCode == 0) {
-LOG.info("Target restored to previous state.  source: {} target: {} snapshot: {}. Reattempting to copy.",
-srcPaths, dst, oldSnapshot);
-dst.getFileSystem(conf).deleteSnapshot(dst, oldSnapshot);
-dst.getFileSystem(conf).createSnapshot(dst, oldSnapshot);
-returnCode = distcp.run(params.toArray(new String[0]));
+if (targetModified(dst, conf, oldSnapshot)) {
+  if (overwriteTarget) {
+LOG.warn("Copy failed due to target modified. Attempting to restore back the target. source: {} target: {} "
++ "snapshot: {}", srcPaths, dst, oldSnapshot);
+List rParams = constructDistCpWithSnapshotParams(srcPaths, dst, ".", oldSnapshot, conf, "-rdiff");
+DistCp rDistcp = new DistCp(conf, null);
+returnCode = rDistcp.run(rParams.toArray(new String[0]));
 if (returnCode == 0) {
+  LOG.info("Target restored to previous state.  source: {} target: {} snapshot: {}. Reattempting to copy.",
+  srcPaths, dst, oldSnapshot);
+  dst.getFileSystem(conf).deleteSnapshot(dst, oldSnapshot);
+  dst.getFileSystem(conf).createSnapshot(dst, oldSnapshot);
+  returnCode = distcp.run(params.toArray(new String[0]));
+  if (returnCode == 0) {
+return true;
+  } else {
+LOG.error("Copy failed with after target restore for source: {} to target: {} snapshot1: {} snapshot2: "
++ "{} params: {}. Return code: {}", srcPaths, dst, oldSnapshot, newSnapshot, params, returnCode);
+return false;
+  }
+}
+  } else {
+//in case overwriteTarget is false, and we encounter an exception due to targetFs getting modified
+// since last snapshot, then fall back to full distcp
+LOG.warn("Copy failed due to target modified and overwrite is false. Attempting full distcp." +
+"Source:{}, target: {}",srcPaths, dst);
+// Get the path relative to the initial snapshot for copy.
+Path snapRelPath = new Path(srcPaths.get(0), HdfsConstants.DOT_SNAPSHOT_DIR + "/" + newSnapshot);
+// Copy from the initial snapshot path.
+   

[jira] [Work logged] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?focusedWorklogId=693679&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693679
 ]

ASF GitHub Bot logged work on HIVE-25793:
-

Author: ASF GitHub Bot
Created on: 10/Dec/21 02:23
Start Date: 10/Dec/21 02:23
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 edited a comment on pull request #2860:
URL: https://github.com/apache/hive/pull/2860#issuecomment-990517308


   PR [#2441](https://github.com/apache/hive/pull/2441/files) also tries to 
solve this problem, but it introduces some backward-compatibility problems. I am 
not sure whether it is appropriate to introduce more metrics about the APIs.
   Thanks,
   Zhihua Deng


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: gitbox-unsubscr...@hive.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 693679)
Time Spent: 40m  (was: 0.5h)

> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.8, 3.1.2, 4.0.0
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> We can see that a metastore api count increments twice for one RPC, because a 
> metastore server makes two timers with same name for one RPC. one for a 
> retrying proxy handler and another for a hms handler. It will make api 
> metrics more inaccurate, especially during retrying.
> I think new metrics for retrying proxy will be helpful to make each metric 
> accurate.
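
To illustrate the double counting described above, here is a minimal sketch. It does not use Hive's metrics classes: the `counts` map, `startTimer`, `baseHandler`, `retryingProxy`, and the `api_retrying_` prefix are all hypothetical names chosen for this example. It only shows that when a wrapping proxy and the wrapped handler record a timer under the same name, one RPC is counted twice, and that a separate name for the proxy keeps each count at one.

```java
import java.util.HashMap;
import java.util.Map;

public class MetricsIsolationSketch {
    // Hypothetical stand-in for a metrics registry: counts timer starts by name.
    static final Map<String, Integer> counts = new HashMap<>();

    static void startTimer(String name) {
        counts.merge(name, 1, Integer::sum);
    }

    // Base handler: records one timer per API call.
    static void baseHandler(String api) {
        startTimer("api_" + api);
    }

    // Retrying proxy: records its own timer under proxyPrefix, then delegates.
    static void retryingProxy(String api, String proxyPrefix) {
        startTimer(proxyPrefix + api);
        baseHandler(api);
    }

    public static void main(String[] args) {
        // Same name in both layers: a single RPC is counted twice.
        retryingProxy("get_table", "api_");
        System.out.println(counts); // {api_get_table=2}

        counts.clear();
        // Separate (assumed) prefix for the proxy: each metric counts the RPC once.
        retryingProxy("get_table", "api_retrying_");
        System.out.println(counts);
    }
}
```

With separate names, retries would only inflate the proxy-level or handler-level metric that actually ran again, which is the isolation the issue title asks for.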



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?focusedWorklogId=693665&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693665
 ]

ASF GitHub Bot logged work on HIVE-25793:
-

Author: ASF GitHub Bot
Created on: 10/Dec/21 01:48
Start Date: 10/Dec/21 01:48
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 edited a comment on pull request #2860:
URL: https://github.com/apache/hive/pull/2860#issuecomment-990517308


   PR [#2441](https://github.com/apache/hive/pull/2441/files) also tries to 
solve this problem, but it introduces some backward-compatibility problems. I am 
not sure whether it is appropriate to introduce more metrics about the APIs.
   Thanks,
   Zhihua Deng




Issue Time Tracking
---

Worklog Id: (was: 693665)
Time Spent: 0.5h  (was: 20m)

> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.8, 3.1.2, 4.0.0
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We can see that a metastore api count increments twice for one RPC, because a 
> metastore server makes two timers with same name for one RPC. one for a 
> retrying proxy handler and another for a hms handler. It will make api 
> metrics more inaccurate, especially during retrying.
> I think new metrics for retrying proxy will be helpful to make each metric 
> accurate.





[jira] [Work logged] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?focusedWorklogId=693664&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693664
 ]

ASF GitHub Bot logged work on HIVE-25793:
-

Author: ASF GitHub Bot
Created on: 10/Dec/21 01:46
Start Date: 10/Dec/21 01:46
Worklog Time Spent: 10m 
  Work Description: dengzhhu653 commented on pull request #2860:
URL: https://github.com/apache/hive/pull/2860#issuecomment-990517308


   PR [#2441](https://github.com/apache/hive/pull/2441/files) also tries to 
solve this problem, but it introduces some backward-compatibility problems. I am 
not sure whether it is appropriate to introduce more metrics about the APIs.
   Thanks,
   Zhihua Deng




Issue Time Tracking
---

Worklog Id: (was: 693664)
Time Spent: 20m  (was: 10m)

> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.8, 3.1.2, 4.0.0
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We can see that a metastore api count increments twice for one RPC, because a 
> metastore server makes two timers with same name for one RPC. one for a 
> retrying proxy handler and another for a hms handler. It will make api 
> metrics more inaccurate, especially during retrying.
> I think new metrics for retrying proxy will be helpful to make each metric 
> accurate.





[jira] [Updated] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread Jeongdae Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HIVE-25793:

Description: 
We can see that a metastore api count increments twice for one RPC, because a 
metastore server makes two timers with same name for one RPC. one for a 
retrying proxy handler and another for a hms handler. It will make api metrics 
more inaccurate, especially during retrying.

I think new metrics for retrying proxy will be helpful to make each metric 
accurate.

  was:
We can see that metastore api count increments twice for one RPC, because a 
metastore server makes two timers with same name for one RPC. one for retrying 
proxy handler and another for hms handler. It will make api metrics more 
inaccurate, especially during retrying.

I think new metrics for retrying proxy will be helpful to make each metric 
accurate.


> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.8, 3.1.2, 4.0.0
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We can see that a metastore api count increments twice for one RPC, because a 
> metastore server makes two timers with same name for one RPC. one for a 
> retrying proxy handler and another for a hms handler. It will make api 
> metrics more inaccurate, especially during retrying.
> I think new metrics for retrying proxy will be helpful to make each metric 
> accurate.





[jira] [Updated] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread Jeongdae Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HIVE-25793:

Description: 
We can see that metastore api count increments twice for one RPC, because a 
metastore server makes two timers with same name for one RPC. one for retrying 
proxy handler and another for hms handler. It will make api metrics more 
inaccurate, especially during retrying.

I think new metrics for retrying proxy will be helpful to make each metric 
accurate.

  was:A metastore proxy for retrying (RetryingHMSHandler) and a base handler 
(HMSHandler) uses the same name for metrics. it makes metastore metrics 
inaccurate. ex) rpc counts increment twice for a rpc. 


> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.8, 3.1.2, 4.0.0
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> We can see that metastore api count increments twice for one RPC, because a 
> metastore server makes two timers with same name for one RPC. one for 
> retrying proxy handler and another for hms handler. It will make api metrics 
> more inaccurate, especially during retrying.
> I think new metrics for retrying proxy will be helpful to make each metric 
> accurate.





[jira] [Updated] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread Jeongdae Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim updated HIVE-25793:

Affects Version/s: 3.1.2
   2.3.8
   4.0.0

> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 2.3.8, 3.1.2, 4.0.0
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A metastore proxy for retrying (RetryingHMSHandler) and a base handler 
> (HMSHandler) uses the same name for metrics. it makes metastore metrics 
> inaccurate. ex) rpc counts increment twice for a rpc. 





[jira] [Work started] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread Jeongdae Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HIVE-25793 started by Jeongdae Kim.
---
> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A metastore proxy for retrying (RetryingHMSHandler) and a base handler 
> (HMSHandler) uses the same name for metrics. it makes metastore metrics 
> inaccurate. ex) rpc counts increment twice for a rpc. 





[jira] [Updated] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25793:
--
Labels: pull-request-available  (was: )

> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A metastore proxy for retrying (RetryingHMSHandler) and a base handler 
> (HMSHandler) uses the same name for metrics. it makes metastore metrics 
> inaccurate. ex) rpc counts increment twice for a rpc. 





[jira] [Work logged] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?focusedWorklogId=693658&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693658
 ]

ASF GitHub Bot logged work on HIVE-25793:
-

Author: ASF GitHub Bot
Created on: 10/Dec/21 01:09
Start Date: 10/Dec/21 01:09
Worklog Time Spent: 10m 
  Work Description: JeongDaeKim opened a new pull request #2860:
URL: https://github.com/apache/hive/pull/2860


   
   
   ### What changes were proposed in this pull request?
   
   This change will make new metrics for a metastore retrying proxy handler.
   
   ### Why are the changes needed?
   
   We can see that the metastore API count increments twice for one RPC. A 
metastore server makes two timers with the same name for one RPC: one for the 
retrying proxy handler and another for the HMS handler. This makes API metrics 
inaccurate, especially during retries.
   
   ### Does this PR introduce _any_ user-facing change?
   
   This change will make new metrics for a metastore retrying proxy handler.
   
   ### How was this patch tested?
   
   A test was added
   




Issue Time Tracking
---

Worklog Id: (was: 693658)
Remaining Estimate: 0h
Time Spent: 10m

> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> A metastore proxy for retrying (RetryingHMSHandler) and a base handler 
> (HMSHandler) uses the same name for metrics. it makes metastore metrics 
> inaccurate. ex) rpc counts increment twice for a rpc. 





[jira] [Work logged] (HIVE-24289) RetryingMetaStoreClient should not retry connecting to HMS on genuine errors

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24289?focusedWorklogId=693638&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693638
 ]

ASF GitHub Bot logged work on HIVE-24289:
-

Author: ASF GitHub Bot
Created on: 10/Dec/21 00:11
Start Date: 10/Dec/21 00:11
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2675:
URL: https://github.com/apache/hive/pull/2675#issuecomment-990428791


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 693638)
Remaining Estimate: 0h
Time Spent: 10m

> RetryingMetaStoreClient should not retry connecting to HMS on genuine errors
> 
>
> Key: HIVE-24289
> URL: https://issues.apache.org/jira/browse/HIVE-24289
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Harshit Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When there is genuine error from HMS, it should not be retried in 
> RetryingMetaStoreClient. 
> For e.g, following query would be retried multiple times (~20+ times) in HMS 
> causing huge delay in processing, even though this constraint is available in 
> HMS. 
> It should just throw exception to client and stop retrying in such cases.
> {noformat}
> alter table web_sales add constraint tpcds_bin_partitioned_orc_1_ws_s_hd 
> foreign key  (ws_ship_hdemo_sk) references household_demographics 
> (hd_demo_sk) disable novalidate rely;
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>   at org.apache.hadoop.hive.ql.metadata.Hive.addForeignKey(Hive.java:5914)
> ..
> ...
> Caused by: org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
>at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_foreign_key(ThriftHiveMetastore.java:1872)
> {noformat}
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L256
> For e.g, if exception contains "Internal error processing ", it could stop 
> retrying all over again.
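
The suggestion in the description above can be sketched as a retry loop that gives up as soon as the failure looks like a genuine server-side error. This is a standalone illustration, not the code in RetryingMetaStoreClient: `callWithRetries`, `isRetriable`, and `GENUINE_ERROR_MARKER` are hypothetical names, and matching on the message text is simply the heuristic proposed in the issue.

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    // Marker text quoted in the issue; treating it as non-retriable is the issue's proposal.
    static final String GENUINE_ERROR_MARKER = "Internal error processing ";

    // An exception is considered retriable unless its message carries the marker.
    static boolean isRetriable(Exception e) {
        String msg = e.getMessage();
        return msg == null || !msg.contains(GENUINE_ERROR_MARKER);
    }

    // Runs the call up to maxAttempts times, stopping early on a non-retriable error.
    static <T> T callWithRetries(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                last = e;
                if (!isRetriable(e)) {
                    throw e; // genuine error: surface it to the caller immediately
                }
            }
        }
        throw last; // all attempts failed with retriable errors
    }

    public static void main(String[] args) {
        int[] attempts = {0};
        try {
            callWithRetries(() -> {
                attempts[0]++;
                throw new RuntimeException("Internal error processing add_foreign_key");
            }, 20);
        } catch (Exception e) {
            System.out.println("attempts = " + attempts[0]); // attempts = 1
        }
    }
}
```

A real implementation would more likely classify errors by exception type than by message text, but the early exit is the point here.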





[jira] [Work logged] (HIVE-24776) Reduce HMS DB calls during stats updates

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24776?focusedWorklogId=693637&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693637
 ]

ASF GitHub Bot logged work on HIVE-24776:
-

Author: ASF GitHub Bot
Created on: 10/Dec/21 00:11
Start Date: 10/Dec/21 00:11
Worklog Time Spent: 10m 
  Work Description: github-actions[bot] commented on pull request #2636:
URL: https://github.com/apache/hive/pull/2636#issuecomment-990428814


   This pull request has been automatically marked as stale because it has not 
had recent activity. It will be closed if no further activity occurs.
   Feel free to reach out on the d...@hive.apache.org list if the patch is in 
need of reviews.




Issue Time Tracking
---

Worklog Id: (was: 693637)
Time Spent: 0.5h  (was: 20m)

> Reduce HMS DB calls during stats updates
> 
>
> Key: HIVE-24776
> URL: https://issues.apache.org/jira/browse/HIVE-24776
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Harshit Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
>  When adding large number of partitions (100s/1000s) in a table, it ends up 
> making lots of getTable calls which are not needed.
> Lines mentioned below may vary slightly in apache-master. 
> {noformat}
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManager.jdoRetrieve(JDOPersistenceManager.java:620)
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManager.retrieve(JDOPersistenceManager.java:637)
>   at 
> org.datanucleus.api.jdo.JDOPersistenceManager.retrieve(JDOPersistenceManager.java:646)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:2112)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.getMTable(ObjectStore.java:2150)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.ensureGetMTable(ObjectStore.java:4578)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.ensureGetTable(ObjectStore.java:4588)
>   at 
> org.apache.hadoop.hive.metastore.ObjectStore.updatePartitionColumnStatistics(ObjectStore.java:9264)
>   at sun.reflect.GeneratedMethodAccessor92.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:97)
>   at com.sun.proxy.$Proxy27.updatePartitionColumnStatistics(Unknown 
> Source)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartitonColStatsInternal(HiveMetaStore.java:6679)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.updatePartColumnStatsWithMerge(HiveMetaStore.java:8655)
>   at 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.set_aggr_stats_for(HiveMetaStore.java:8592)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147)
>   at 
> org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108)
>   at com.sun.proxy.$Proxy28.set_aggr_stats_for(Unknown Source)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:19060)
>   at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$set_aggr_stats_for.getResult(ThriftHiveMetastore.java:19044)
>   at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
>   at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
>  {noformat}





[jira] [Updated] (HIVE-24289) RetryingMetaStoreClient should not retry connecting to HMS on genuine errors

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24289?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-24289:
--
Labels: pull-request-available  (was: )

> RetryingMetaStoreClient should not retry connecting to HMS on genuine errors
> 
>
> Key: HIVE-24289
> URL: https://issues.apache.org/jira/browse/HIVE-24289
> Project: Hive
>  Issue Type: Improvement
>Reporter: Rajesh Balamohan
>Assignee: Harshit Gupta
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> When there is genuine error from HMS, it should not be retried in 
> RetryingMetaStoreClient. 
> For e.g, following query would be retried multiple times (~20+ times) in HMS 
> causing huge delay in processing, even though this constraint is available in 
> HMS. 
> It should just throw exception to client and stop retrying in such cases.
> {noformat}
> alter table web_sales add constraint tpcds_bin_partitioned_orc_1_ws_s_hd 
> foreign key  (ws_ship_hdemo_sk) references household_demographics 
> (hd_demo_sk) disable novalidate rely;
> org.apache.hadoop.hive.ql.metadata.HiveException: 
> org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>   at org.apache.hadoop.hive.ql.metadata.Hive.addForeignKey(Hive.java:5914)
> ..
> ...
> Caused by: org.apache.thrift.TApplicationException: Internal error processing 
> add_foreign_key
>at 
> org.apache.thrift.TApplicationException.read(TApplicationException.java:111)
>at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
>at 
> org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_add_foreign_key(ThriftHiveMetastore.java:1872)
> {noformat}
> https://github.com/apache/hive/blob/master/standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/RetryingMetaStoreClient.java#L256
> For e.g, if exception contains "Internal error processing ", it could stop 
> retrying all over again.





[jira] [Work logged] (HIVE-25773) Column descriptors might not deleted via direct sql

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25773?focusedWorklogId=693602&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693602
 ]

ASF GitHub Bot logged work on HIVE-25773:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 22:55
Start Date: 09/Dec/21 22:55
Worklog Time Spent: 10m 
  Work Description: hsnusonic commented on pull request #2843:
URL: https://github.com/apache/hive/pull/2843#issuecomment-990376713


   @pvary The tests have passed. Could you help merge this? Thanks.




Issue Time Tracking
---

Worklog Id: (was: 693602)
Time Spent: 50m  (was: 40m)

> Column descriptors might not deleted via direct sql
> ---
>
> Key: HIVE-25773
> URL: https://issues.apache.org/jira/browse/HIVE-25773
> Project: Hive
>  Issue Type: Bug
>  Components: Metastore
>Reporter: Yu-Wen Lai
>Assignee: Yu-Wen Lai
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
> 1. create a partitioned table
> 2. add a partition _p_
> 3. add column to the partition _p_ (a column descriptor will be created)
> 4. drop partition _p_
> The new column descriptor still existed even though there's no relation left. 
> We are currently using below SQL and extract the results that count = 0 as 
> dangling column descriptors. However, it is impossible to get count = 0 from 
> groupby query so they will never be deleted if it is not a table's default 
> column descriptor.
>  
> {code:java}
> SELECT SDS.CD_ID, count(1)
>   FROM SDS WHERE SDS.CD_ID in (cdIds)
>   GROUP BY SDS.CD_ID;{code}
>  
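
The reason the query above can never report count = 0 can be shown with a small in-memory sketch. This is not the metastore code or the fix in the pull request: `groupCounts` and `dangling` are hypothetical helpers, and the list of `Long` values stands in for the CD_ID column of SDS. Grouping only produces groups for values that occur in at least one row, so an unreferenced descriptor is absent from the result instead of appearing with a zero count. One possible way to find it, assumed here, is a set difference against the candidate ids.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;

public class DanglingCdSketch {
    // Mimics: SELECT CD_ID, count(1) FROM SDS WHERE CD_ID IN (cdIds) GROUP BY CD_ID
    static Map<Long, Long> groupCounts(List<Long> sdsCdIds, Set<Long> cdIds) {
        return sdsCdIds.stream()
            .filter(cdIds::contains)
            .collect(Collectors.groupingBy(id -> id, Collectors.counting()));
    }

    // Candidates with no referencing row are the ones missing from the grouped result.
    static Set<Long> dangling(List<Long> sdsCdIds, Set<Long> cdIds) {
        Set<Long> result = new TreeSet<>(cdIds);
        result.removeAll(groupCounts(sdsCdIds, cdIds).keySet());
        return result;
    }

    public static void main(String[] args) {
        List<Long> sds = Arrays.asList(1L, 1L, 2L); // CD_ID values of hypothetical SDS rows
        Set<Long> candidates = new TreeSet<>(Arrays.asList(1L, 2L, 3L));
        // No entry has a count of 0; id 3 is simply missing from the map.
        System.out.println(groupCounts(sds, candidates));
        System.out.println(dangling(sds, candidates)); // [3]
    }
}
```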





[jira] [Work logged] (HIVE-25768) Extend query-level HMS cache lifetime beyond analysis stage

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25768?focusedWorklogId=693483&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693483
 ]

ASF GitHub Bot logged work on HIVE-25768:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 18:54
Start Date: 09/Dec/21 18:54
Worklog Time Spent: 10m 
  Work Description: jfsii commented on a change in pull request #2841:
URL: https://github.com/apache/hive/pull/2841#discussion_r766036837



##
File path: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
##
@@ -278,11 +278,6 @@
*/
   private Map hdfsEncryptionShims = Maps.newHashMap();
 
-  /**
-   * Cache for Erasure Coding shims.
-   */
-  private Map erasureCodingShims;

Review comment:
   I did not remove this on purpose; I think it happened while I was moving 
things around. Removing it does seem valid, though, as I cannot find any usage 
of it anywhere.






Issue Time Tracking
---

Worklog Id: (was: 693483)
Time Spent: 40m  (was: 0.5h)

> Extend query-level HMS cache lifetime beyond analysis stage
> ---
>
> Key: HIVE-25768
> URL: https://issues.apache.org/jira/browse/HIVE-25768
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> HIVE-24176 added a HMS response cache to improve compilation times when 
> metadata is requested multiple times. The cache gets destroyed after 
> analysis. If we extend the lifetime of the cache beyond analysis it could be 
> used to reduce HMS communication for other areas such as various exec hooks 
> that inspect the metadata for a query.
> The impetus that motivated this change is specifically the Atlas hook which 
> generates lineage information and makes HMS calls. These calls to HMS can add 
> latency to query results.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25768) Extend query-level HMS cache lifetime beyond analysis stage

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25768?focusedWorklogId=693446=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693446
 ]

ASF GitHub Bot logged work on HIVE-25768:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 18:09
Start Date: 09/Dec/21 18:09
Worklog Time Spent: 10m 
  Work Description: jfsii commented on a change in pull request #2841:
URL: https://github.com/apache/hive/pull/2841#discussion_r766036837



##
File path: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
##
@@ -278,11 +278,6 @@
*/
   private Map hdfsEncryptionShims = 
Maps.newHashMap();
 
-  /**
-   * Cache for Erasure Coding shims.
-   */
-  private Map erasureCodingShims;

Review comment:
   I did not purposely remove this, I think it was due to me moving around 
things. But it does seem valid to remove it as I cannot find any usage of 
this anywhere.






Issue Time Tracking
---

Worklog Id: (was: 693446)
Time Spent: 0.5h  (was: 20m)

> Extend query-level HMS cache lifetime beyond analysis stage
> ---
>
> Key: HIVE-25768
> URL: https://issues.apache.org/jira/browse/HIVE-25768
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> HIVE-24176 added a HMS response cache to improve compilation times when 
> metadata is requested multiple times. The cache gets destroyed after 
> analysis. If we extend the lifetime of the cache beyond analysis it could be 
> used to reduce HMS communication for other areas such as various exec hooks 
> that inspect the metadata for a query.
> The impetus that motivated this change is specifically the Atlas hook which 
> generates lineage information and makes HMS calls. These calls to HMS can add 
> latency to query results.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25768) Extend query-level HMS cache lifetime beyond analysis stage

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25768?focusedWorklogId=693443=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693443
 ]

ASF GitHub Bot logged work on HIVE-25768:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 17:53
Start Date: 09/Dec/21 17:53
Worklog Time Spent: 10m 
  Work Description: scarlin-cloudera commented on a change in pull request 
#2841:
URL: https://github.com/apache/hive/pull/2841#discussion_r766025515



##
File path: ql/src/java/org/apache/hadoop/hive/ql/session/SessionState.java
##
@@ -2206,31 +2201,21 @@ public String getNewSparkSessionId() {
 return getSessionId() + "_" + 
Long.toString(this.sparkSessionId.getAndIncrement());
   }
 
-  /**
-   * Can be called when we start compilation of a query.
-   * @param queryId the unique identifier of the query
-   */
-  public void startScope(String queryId) {
-Map existingVal = cache.put(queryId, new HashMap<>());
-Preconditions.checkState(existingVal == null);
-  }
-
-  /**
-   * Can be called when we end compilation of a query.
-   * @param queryId the unique identifier of the query
-   */
-  public void endScope(String queryId) {
-Map existingVal = cache.remove(queryId);
-Preconditions.checkState(existingVal != null);
-  }
-
   /**
* Retrieves the query cache for the given query.
* @param queryId the unique identifier of the query
* @return the cache for the query
*/
   public Map getQueryCache(String queryId) {
-return cache.get(queryId);
+QueryState qs = getQueryState(queryId);
+if (qs == null) {
+  return null;
+}
+
+if (qs == null) {

Review comment:
   Duplicated code






Issue Time Tracking
---

Worklog Id: (was: 693443)
Time Spent: 20m  (was: 10m)

> Extend query-level HMS cache lifetime beyond analysis stage
> ---
>
> Key: HIVE-25768
> URL: https://issues.apache.org/jira/browse/HIVE-25768
> Project: Hive
>  Issue Type: Improvement
>  Components: HiveServer2
>Reporter: John Sherman
>Assignee: John Sherman
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> HIVE-24176 added a HMS response cache to improve compilation times when 
> metadata is requested multiple times. The cache gets destroyed after 
> analysis. If we extend the lifetime of the cache beyond analysis it could be 
> used to reduce HMS communication for other areas such as various exec hooks 
> that inspect the metadata for a query.
> The impetus that motivated this change is specifically the Atlas hook which 
> generates lineage information and makes HMS calls. These calls to HMS can add 
> latency to query results.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25793) Isolate metastore metrics related to a retrying handler

2021-12-09 Thread Jeongdae Kim (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeongdae Kim reassigned HIVE-25793:
---


> Isolate metastore metrics related to a retrying handler
> ---
>
> Key: HIVE-25793
> URL: https://issues.apache.org/jira/browse/HIVE-25793
> Project: Hive
>  Issue Type: Bug
>Reporter: Jeongdae Kim
>Assignee: Jeongdae Kim
>Priority: Minor
>
> A metastore proxy for retrying (RetryingHMSHandler) and a base handler 
> (HMSHandler) use the same names for metrics, which makes metastore metrics 
> inaccurate. For example, RPC counts are incremented twice for a single RPC. 
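
The double counting described above can be sketched in Java as follows. This is an illustration under assumptions, not the actual Hive implementation: the `Registry`, `BaseHandler`, and `RetryingProxy` classes and the metric names `api_get_table` and `retrying_api_get_table` are all hypothetical stand-ins.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical, simplified stand-ins; not the real RetryingHMSHandler/HMSHandler.
class SharedMetricNameSketch {

  // Tiny counter registry keyed by metric name.
  static class Registry {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    void increment(String name) {
      counters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    long count(String name) {
      LongAdder adder = counters.get(name);
      return adder == null ? 0L : adder.sum();
    }
  }

  interface Handler {
    void getTable();
  }

  // Base handler: counts every call under its own metric name.
  static class BaseHandler implements Handler {
    private final Registry registry;
    private final String metric;

    BaseHandler(Registry registry, String metric) {
      this.registry = registry;
      this.metric = metric;
    }

    @Override
    public void getTable() {
      registry.increment(metric);
    }
  }

  // Proxy: counts the call too, then delegates to the base handler.
  static class RetryingProxy implements Handler {
    private final Registry registry;
    private final String metric;
    private final Handler delegate;

    RetryingProxy(Registry registry, String metric, Handler delegate) {
      this.registry = registry;
      this.metric = metric;
      this.delegate = delegate;
    }

    @Override
    public void getTable() {
      registry.increment(metric);
      delegate.getTable();
    }
  }

  // Issues one call through the proxy and returns the count recorded under baseMetric.
  static long callsRecorded(String proxyMetric, String baseMetric) {
    Registry registry = new Registry();
    Handler handler =
        new RetryingProxy(registry, proxyMetric, new BaseHandler(registry, baseMetric));
    handler.getTable();
    return registry.count(baseMetric);
  }

  public static void main(String[] args) {
    // Same name in both layers: one RPC is counted twice.
    System.out.println(callsRecorded("api_get_table", "api_get_table")); // 2
    // Distinct names: the base metric reflects the single RPC.
    System.out.println(callsRecorded("retrying_api_get_table", "api_get_table")); // 1
  }
}
```

Giving the proxy layer its own metric name, as in the second call, is one way the metrics could be isolated; the issue does not say which approach the fix takes.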



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Reopened] (HIVE-25514) Alter table with partitions should honor {OWNER} policies from Apache Ranger in the HMS

2021-12-09 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reopened HIVE-25514:
-

[~hemanth619] this doesn't look like it was committed; if it was, could you 
point me to the commit hash for it?

> Alter table with partitions should honor {OWNER} policies from Apache Ranger 
> in the HMS
> ---
>
> Key: HIVE-25514
> URL: https://issues.apache.org/jira/browse/HIVE-25514
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Sai Hemanth Gantasala
>Assignee: Sai Hemanth Gantasala
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> The following commands should honor \{OWNER} policies from Apache Ranger in 
> the HMS.
> {code:java}
> Show partitions table_name;
> alter table foo.table_name partition (country='us') rename to partition 
> (country='canada');
> alter table foo.table_name drop partition (id='canada');{code}
> The examples above are tables with partitions. So the partition APIs in HMS 
> should be modified to honor \{OWNER} policies from Apache Ranger. 



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25599) Addendum HIVE-25570 Hive should send full URL path for authorization for the command insert overwrite location

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25599?focusedWorklogId=693267=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693267
 ]

ASF GitHub Bot logged work on HIVE-25599:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 14:09
Start Date: 09/Dec/21 14:09
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk closed pull request #2703:
URL: https://github.com/apache/hive/pull/2703


   




Issue Time Tracking
---

Worklog Id: (was: 693267)
Time Spent: 40m  (was: 0.5h)

> Addendum HIVE-25570 Hive should send full URL path for authorization for the 
> command insert overwrite location
> --
>
> Key: HIVE-25599
> URL: https://issues.apache.org/jira/browse/HIVE-25599
> Project: Hive
>  Issue Type: Bug
>Reporter: Panagiotis Garefalakis
>Assignee: Panagiotis Garefalakis
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25791) Improve SFS exception messages

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25791?focusedWorklogId=693261=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693261
 ]

ASF GitHub Bot logged work on HIVE-25791:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 14:03
Start Date: 09/Dec/21 14:03
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk opened a new pull request #2859:
URL: https://github.com/apache/hive/pull/2859


   
   
   ### What changes were proposed in this pull request?
   
   
   
   ### Why are the changes needed?
   
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   
   
   ### How was this patch tested?
   
   




Issue Time Tracking
---

Worklog Id: (was: 693261)
Remaining Estimate: 0h
Time Spent: 10m

> Improve SFS exception messages
> --
>
> Key: HIVE-25791
> URL: https://issues.apache.org/jira/browse/HIVE-25791
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Especially for cases when the path is already known to be invalid; like: 
> {code}sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25791) Improve SFS exception messages

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated HIVE-25791:
--
Labels: pull-request-available  (was: )

> Improve SFS exception messages
> --
>
> Key: HIVE-25791
> URL: https://issues.apache.org/jira/browse/HIVE-25791
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Especially for cases when the path is already known to be invalid; like: 
> {code}sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-24975) Bug in ValidWriteIdList comparison in TxnIdUtils

2021-12-09 Thread Sourabh Goyal (Jira)


[ 
https://issues.apache.org/jira/browse/HIVE-24975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17456382#comment-17456382
 ] 

Sourabh Goyal commented on HIVE-24975:
--

Thank you [~kgyrtkirk]  for reviewing and merging the PR. 

> Bug in ValidWriteIdList comparison in TxnIdUtils
> 
>
> Key: HIVE-24975
> URL: https://issues.apache.org/jira/browse/HIVE-24975
> Project: Hive
>  Issue Type: Bug
>Reporter: Sourabh Goyal
>Assignee: Sourabh Goyal
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> TxnIdUtils's 
> [compare|https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hive/common/util/TxnIdUtils.java#L38]
>  method returns an incorrect result for the following validWriteIdLists:
> ValidWriteIdList a = new ValidReaderWriteIdList("default.test:1:1:1:");
>  ValidWriteIdList b = new 
> ValidReaderWriteIdList("default.test:1:9223372036854775807::");
> TxnIdUtils.compare(a, b) returns +1 whereas the expected response is -1 since 
> b is more recent.
> cc - [~kishendas] [~vihangk1]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25791) Improve SFS exception messages

2021-12-09 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-25791:

Issue Type: Improvement  (was: Bug)

> Improve SFS exception messages
> --
>
> Key: HIVE-25791
> URL: https://issues.apache.org/jira/browse/HIVE-25791
> Project: Hive
>  Issue Type: Improvement
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> Especially for cases when the path is already known to be invalid; like: 
> {code}sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25791) Improve SFS exception messages

2021-12-09 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-25791:

Description: Especially for cases when the path is already known to be 
invalid; like: {{sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#}}  (was: 
Especially for cases when the path is already known to be invalid; like: 
`sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#`)

> Improve SFS exception messages
> --
>
> Key: HIVE-25791
> URL: https://issues.apache.org/jira/browse/HIVE-25791
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> Especially for cases when the path is already known to be invalid; like: 
> {{sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#}}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25791) Improve SFS exception messages

2021-12-09 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated HIVE-25791:

Description: Especially for cases when the path is already known to be 
invalid; like: {code}sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#{code} 
 (was: Especially for cases when the path is already known to be invalid; like: 
{{sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#}})

> Improve SFS exception messages
> --
>
> Key: HIVE-25791
> URL: https://issues.apache.org/jira/browse/HIVE-25791
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> Especially for cases when the path is already known to be invalid; like: 
> {code}sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#{code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Assigned] (HIVE-25791) Improve SFS exception messages

2021-12-09 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich reassigned HIVE-25791:
---


> Improve SFS exception messages
> --
>
> Key: HIVE-25791
> URL: https://issues.apache.org/jira/browse/HIVE-25791
> Project: Hive
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> Especially for cases when the path is already known to be invalid; like: 
> `sfs+file:///nonexistent/nonexistent.txt/#SINGLEFILE#`



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25495) Upgrade to JLine3

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25495?focusedWorklogId=693127=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693127
 ]

ASF GitHub Bot logged work on HIVE-25495:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 10:46
Start Date: 09/Dec/21 10:46
Worklog Time Spent: 10m 
  Work Description: ayushtkn commented on pull request #2617:
URL: https://github.com/apache/hive/pull/2617#issuecomment-989733853


   @kgyrtkirk I think that was intentional; at least the JLine one is:
   
https://issues.apache.org/jira/browse/YARN-8778?focusedCommentId=16650922=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16650922




Issue Time Tracking
---

Worklog Id: (was: 693127)
Time Spent: 1h 40m  (was: 1.5h)

> Upgrade to JLine3
> -
>
> Key: HIVE-25495
> URL: https://issues.apache.org/jira/browse/HIVE-25495
> Project: Hive
>  Issue Type: Improvement
>Reporter: David Mollitor
>Assignee: David Mollitor
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> JLine 2 was discontinued a long while ago.  Hadoop uses JLine3, so Hive 
> should match.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-25716) Fix flaky test TestCompactionMetrics#testOldestReadyForCleaningAge

2021-12-09 Thread Antal Sinkovits (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Antal Sinkovits resolved HIVE-25716.

Fix Version/s: 4.0.0
   Resolution: Fixed

Committed to master branch. Thanks for your contribution [~vcsomor]

> Fix flaky test TestCompactionMetrics#testOldestReadyForCleaningAge
> --
>
> Key: HIVE-25716
> URL: https://issues.apache.org/jira/browse/HIVE-25716
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Viktor Csomor
>Priority: Major
>  Labels: flaky-test, pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Flaky check failed on run #59:
> [http://ci.hive.apache.org/job/hive-flaky-check/467/|http://ci.hive.apache.org/job/hive-flaky-check/467/]
> {code:java}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at 
> org.apache.hadoop.hive.ql.txn.compactor.TestCompactionMetrics.testOldestReadyForCleaningAge(TestCompactionMetrics.java:214)
> {code}
> (!) After turning off the test the problematic line is actually 215 in the 
> codebase
> {code}
> Assert.assertTrue(Metrics.getOrCreateGauge(MetricsConstants.OLDEST_READY_FOR_CLEANING_AGE).intValue()
>  >= youngDiff);
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-25716) Fix flaky test TestCompactionMetrics#testOldestReadyForCleaningAge

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25716?focusedWorklogId=693104=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693104
 ]

ASF GitHub Bot logged work on HIVE-25716:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 10:15
Start Date: 09/Dec/21 10:15
Worklog Time Spent: 10m 
  Work Description: asinkovits merged pull request #2837:
URL: https://github.com/apache/hive/pull/2837


   




Issue Time Tracking
---

Worklog Id: (was: 693104)
Time Spent: 1h 10m  (was: 1h)

> Fix flaky test TestCompactionMetrics#testOldestReadyForCleaningAge
> --
>
> Key: HIVE-25716
> URL: https://issues.apache.org/jira/browse/HIVE-25716
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 4.0.0
>Reporter: Karen Coppage
>Assignee: Viktor Csomor
>Priority: Major
>  Labels: flaky-test, pull-request-available
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Flaky check failed on run #59:
> [http://ci.hive.apache.org/job/hive-flaky-check/467/|http://ci.hive.apache.org/job/hive-flaky-check/467/]
> {code:java}
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at 
> org.apache.hadoop.hive.ql.txn.compactor.TestCompactionMetrics.testOldestReadyForCleaningAge(TestCompactionMetrics.java:214)
> {code}
> (!) After turning off the test the problematic line is actually 215 in the 
> codebase
> {code}
> Assert.assertTrue(Metrics.getOrCreateGauge(MetricsConstants.OLDEST_READY_FOR_CLEANING_AGE).intValue()
>  >= youngDiff);
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-24390) Spelling fixes

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24390?focusedWorklogId=693074=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693074
 ]

ASF GitHub Bot logged work on HIVE-24390:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 09:29
Start Date: 09/Dec/21 09:29
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk commented on pull request #2810:
URL: https://github.com/apache/hive/pull/2810#issuecomment-989668675


   I've kicked off the tests once more, but the test failures seem to be 
related to the changes, because the same set of tests failed.
   
   
http://ci.hive.apache.org/job/hive-precommit/job/PR-2810/2/testReport/org.apache.hadoop.hive.cli/TestNegativeLlapLocalCliDriver/Testing___split_11___PostProcess___testCliDriver_stats_publisher_error_1_/




Issue Time Tracking
---

Worklog Id: (was: 693074)
Time Spent: 5h 10m  (was: 5h)

> Spelling fixes
> --
>
> Key: HIVE-24390
> URL: https://issues.apache.org/jira/browse/HIVE-24390
> Project: Hive
>  Issue Type: Bug
>Reporter: Josh Soref
>Assignee: Josh Soref
>Priority: Minor
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-24975) Bug in ValidWriteIdList comparison in TxnIdUtils

2021-12-09 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich resolved HIVE-24975.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

merged into master. Thank you [~sourabh912] for fixing this!

> Bug in ValidWriteIdList comparison in TxnIdUtils
> 
>
> Key: HIVE-24975
> URL: https://issues.apache.org/jira/browse/HIVE-24975
> Project: Hive
>  Issue Type: Bug
>Reporter: Sourabh Goyal
>Assignee: Sourabh Goyal
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> TxnIdUtils's 
> [compare|https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hive/common/util/TxnIdUtils.java#L38]
>  method returns an incorrect result for the following validWriteIdLists:
> ValidWriteIdList a = new ValidReaderWriteIdList("default.test:1:1:1:");
>  ValidWriteIdList b = new 
> ValidReaderWriteIdList("default.test:1:9223372036854775807::");
> TxnIdUtils.compare(a, b) returns +1 whereas the expected response is -1 since 
> b is more recent.
> cc - [~kishendas] [~vihangk1]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Work logged] (HIVE-24975) Bug in ValidWriteIdList comparison in TxnIdUtils

2021-12-09 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24975?focusedWorklogId=693066=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-693066
 ]

ASF GitHub Bot logged work on HIVE-24975:
-

Author: ASF GitHub Bot
Created on: 09/Dec/21 09:21
Start Date: 09/Dec/21 09:21
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #2641:
URL: https://github.com/apache/hive/pull/2641


   




Issue Time Tracking
---

Worklog Id: (was: 693066)
Time Spent: 1h 10m  (was: 1h)

> Bug in ValidWriteIdList comparison in TxnIdUtils
> 
>
> Key: HIVE-24975
> URL: https://issues.apache.org/jira/browse/HIVE-24975
> Project: Hive
>  Issue Type: Bug
>Reporter: Sourabh Goyal
>Assignee: Sourabh Goyal
>Priority: Major
>  Labels: pull-request-available
> Fix For: 4.0.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> TxnIdUtils's 
> [compare|https://github.com/apache/hive/blob/master/storage-api/src/java/org/apache/hive/common/util/TxnIdUtils.java#L38]
>  method returns an incorrect result for the following validWriteIdLists:
> ValidWriteIdList a = new ValidReaderWriteIdList("default.test:1:1:1:");
>  ValidWriteIdList b = new 
> ValidReaderWriteIdList("default.test:1:9223372036854775807::");
> TxnIdUtils.compare(a, b) returns +1 whereas the expected response is -1 since 
> b is more recent.
> cc - [~kishendas] [~vihangk1]



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Resolved] (HIVE-25737) Compaction Observability: Initiator/Worker/Cleaner cycle measurement improvements

2021-12-09 Thread Karen Coppage (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karen Coppage resolved HIVE-25737.
--
Resolution: Fixed

Committed to master branch. Thanks for your contribution [~vcsomor]!

> Compaction Observability: Initiator/Worker/Cleaner cycle measurement 
> improvements
> -
>
> Key: HIVE-25737
> URL: https://issues.apache.org/jira/browse/HIVE-25737
> Project: Hive
>  Issue Type: Improvement
>  Components: Standalone Metastore
>Affects Versions: 4.0.0
>Reporter: Viktor Csomor
>Assignee: Viktor Csomor
>Priority: Major
>
> In Compaction Observability, the Initiator/Worker/Cleaner cycle is 
> measured with a [Dropwizard 
> Timer|https://metrics.dropwizard.io/4.2.0/getting-started.html] metric.
> {noformat}
> Timers
> A timer measures both the rate that a particular piece of code is called and 
> the distribution of its duration.
> {noformat}
> However, a Timer is not well suited to measuring just a duration. Furthermore, one HMS 
> can run multiple Worker threads, and the duration of the last finished worker 
> is not really informative if a Worker thread got stuck.
> Timers do not carry enough information because they only bump the counter when 
> a Worker has finished a loop.
> If the Initiator/Worker/Cleaner gets stuck, no metric is reported, 
> because the counter is never bumped.
> It would be better to implement the following:
> - Time passed since Initiator start (single threaded) -> Gauge metric
> - Oldest working compaction -> Gauge metric
> - Oldest working Cleaner -> Gauge metric
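
The gauge idea in the list above can be sketched in plain Java. This is a minimal illustration under assumptions, not the committed change: the class name, the worker IDs, and the injectable millisecond clock are hypothetical, and no Dropwizard types are used. The value is computed on demand from the start times of cycles still in progress, so a stuck worker keeps raising the reported age instead of going silent.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

// Hypothetical sketch: tracks in-flight cycles and exposes the age of the oldest one.
class OldestWorkerGaugeSketch {
  private final Map<String, Long> startTimes = new ConcurrentHashMap<>();
  private final LongSupplier clockMillis; // injectable so the sketch can be tested

  OldestWorkerGaugeSketch(LongSupplier clockMillis) {
    this.clockMillis = clockMillis;
  }

  // Called when a worker begins a cycle.
  void cycleStarted(String workerId) {
    startTimes.put(workerId, clockMillis.getAsLong());
  }

  // Called when a worker completes a cycle; a stuck worker never gets here.
  void cycleFinished(String workerId) {
    startTimes.remove(workerId);
  }

  // Gauge value: age in milliseconds of the longest-running cycle, 0 if none is running.
  long oldestCycleAgeMillis() {
    long now = clockMillis.getAsLong();
    return startTimes.values().stream().mapToLong(start -> now - start).max().orElse(0L);
  }

  public static void main(String[] args) {
    long[] fakeNow = {1_000L};
    OldestWorkerGaugeSketch gauge = new OldestWorkerGaugeSketch(() -> fakeNow[0]);

    gauge.cycleStarted("worker-1"); // this one will get stuck
    fakeNow[0] = 4_000L;
    gauge.cycleStarted("worker-2");
    fakeNow[0] = 6_000L;
    gauge.cycleFinished("worker-2");

    // worker-1 is still running, so the gauge keeps growing: 6000 - 1000.
    System.out.println(gauge.oldestCycleAgeMillis()); // 5000
  }
}
```

A timer updated only in `cycleFinished` would have recorded just the 2-second cycle of `worker-2` here and said nothing about `worker-1`.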



--
This message was sent by Atlassian Jira
(v8.20.1#820001)