[jira] [Work logged] (HIVE-25022) Metric about incomplete compactions

2021-04-21 Thread ASF GitHub Bot (Jira)


 [ https://issues.apache.org/jira/browse/HIVE-25022?focusedWorklogId=586525&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-586525 ]

ASF GitHub Bot logged work on HIVE-25022:
-

Author: ASF GitHub Bot
Created on: 21/Apr/21 12:28
Start Date: 21/Apr/21 12:28
Worklog Time Spent: 10m 
  Work Description: klcopp closed pull request #2184:
URL: https://github.com/apache/hive/pull/2184


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 586525)
Time Spent: 50m  (was: 40m)

> Metric about incomplete compactions
> ---
>
> Key: HIVE-25022
> URL: https://issues.apache.org/jira/browse/HIVE-25022
> Project: Hive
> Issue Type: Sub-task
> Reporter: Karen Coppage
> Assignee: Karen Coppage
> Priority: Major
> Labels: pull-request-available
> Time Spent: 50m
> Remaining Estimate: 0h
>
> "Compactions in a state" metrics (for example compaction_num_working) count 
> the number of tables/partitions where the last compaction is in that state.
> I propose introducing a new metric about incomplete compactions, i.e. the 
> number of tables/partitions where the last finished compaction* was 
> unsuccessful (failed or "did not initiate"), or where a major compaction was 
> unsuccessful and a later minor compaction succeeded (compaction is not 
> "complete" because major compaction has not succeeded since the point at 
> which it should have run).
> Example:
> {code:java}
> These compactions ran on a partition:
> major succeeded
> major working
> major failed
> major initiated
> major working
> major failed
> major initiated
> major working
> The "compactions in a state" metrics will consider the state of this table: 
> working.
> The "incomplete compactions" metric will consider this: incomplete, since 
> there have been failed compactions since the last succeeded compaction.
> {code}
> Another example:
> {code:java}
> These compactions ran on a partition:
> major succeeded
> major failed
> minor failed
> minor succeeded
> The "compactions in a state" metrics will consider the state of this table: 
> succeeded.
> The "incomplete compactions" metric will consider this: incomplete, since 
> there hasn't been a major succeeded since major failed.{code}
> Last example:
> {code:java}
> These compactions ran on a partition:
> major succeeded
> minor did not initiate
> The "compactions in a state" metrics will consider the state of this table: 
> did not initiate.
> The "incomplete compactions" metric will consider this: incomplete, since the 
> last compaction was "did not initiate"{code}
> *finished compaction: state in (succeeded, failed, attempted/did not initiate)
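
The rule in the description above can be sketched in Java roughly as follows. This is an illustration only, under stated assumptions: the class, enum, and method names are hypothetical and are not taken from the Hive code base, and the history passed in is assumed to belong to a single table/partition and to be ordered oldest first.

```java
/** Hypothetical sketch of the "incomplete compactions" rule; not Hive's actual classes. */
final class IncompleteCompactionSketch {
  enum Type { MAJOR, MINOR }
  enum State { INITIATED, WORKING, SUCCEEDED, FAILED, DID_NOT_INITIATE }

  static final class Compaction {
    final Type type;
    final State state;
    Compaction(Type type, State state) {
      this.type = type;
      this.state = state;
    }
  }

  // Small factories so that a history reads like the examples in the description.
  static Compaction major(State s) { return new Compaction(Type.MAJOR, s); }
  static Compaction minor(State s) { return new Compaction(Type.MINOR, s); }

  /** history: compactions of ONE table/partition, oldest first (assumption). */
  static boolean isIncomplete(Compaction... history) {
    boolean lastFinishedUnsuccessful = false; // state of the most recent finished compaction
    boolean majorOutstanding = false;         // a major was unsuccessful and no major succeeded since
    for (Compaction c : history) {
      boolean unsuccessful = c.state == State.FAILED || c.state == State.DID_NOT_INITIATE;
      if (c.state == State.SUCCEEDED) {
        lastFinishedUnsuccessful = false;
        if (c.type == Type.MAJOR) majorOutstanding = false;
      } else if (unsuccessful) {
        lastFinishedUnsuccessful = true;
        if (c.type == Type.MAJOR) majorOutstanding = true;
      }
      // INITIATED and WORKING are not finished states, so they are ignored.
    }
    return lastFinishedUnsuccessful || majorOutstanding;
  }

  public static void main(String[] args) {
    // Second example from the description: the last compaction succeeded,
    // but no major compaction has succeeded since "major failed".
    System.out.println(isIncomplete(
        major(State.SUCCEEDED), major(State.FAILED),
        minor(State.FAILED), minor(State.SUCCEEDED))); // prints true
    System.out.println(isIncomplete(major(State.SUCCEEDED))); // prints false
  }
}
```

Under these assumptions all three examples in the description are classified as incomplete, while a history that ends with a successful major compaction is not.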



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-25022) Metric about incomplete compactions

2021-04-20 Thread ASF GitHub Bot (Jira)


 [ https://issues.apache.org/jira/browse/HIVE-25022?focusedWorklogId=585708&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-585708 ]

ASF GitHub Bot logged work on HIVE-25022:
-

Author: ASF GitHub Bot
Created on: 20/Apr/21 11:05
Start Date: 20/Apr/21 11:05
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on pull request #2184:
URL: https://github.com/apache/hive/pull/2184#issuecomment-823185836


   > > LGTM, however, I don't see real benefit from this metric.
   > > ```
   > > major succeeded
   > > major failed
   > > minor failed
   > > minor succeeded
   > > ```
   > > 
   > > 
   > > This would be reported as incomplete compaction. What action do you expect from the end-users in this case?
   > 
   > In this case end users should re-run major compaction. Major compaction should have run (at 2. major failed) but hasn't since that failure.
   
   How would they know which tables/partitions need major compaction to be re-run? 




Issue Time Tracking
---

Worklog Id: (was: 585708)
Time Spent: 40m  (was: 0.5h)






[jira] [Work logged] (HIVE-25022) Metric about incomplete compactions

2021-04-20 Thread ASF GitHub Bot (Jira)


 [ https://issues.apache.org/jira/browse/HIVE-25022?focusedWorklogId=585692&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-585692 ]

ASF GitHub Bot logged work on HIVE-25022:
-

Author: ASF GitHub Bot
Created on: 20/Apr/21 10:30
Start Date: 20/Apr/21 10:30
Worklog Time Spent: 10m 
  Work Description: klcopp commented on pull request #2184:
URL: https://github.com/apache/hive/pull/2184#issuecomment-823166632


   > LGTM, however, I don't see real benefit from this metric.
   > 
   > ```
   > major succeeded
   > major failed
   > minor failed
   > minor succeeded
   > ```
   > 
   > This would be reported as incomplete compaction. What action do you expect from the end-users in this case?
   
   In this case end users should re-run major compaction. Major compaction should have run (at 2. major failed) but hasn't since that failure.




Issue Time Tracking
---

Worklog Id: (was: 585692)
Time Spent: 0.5h  (was: 20m)






[jira] [Work logged] (HIVE-25022) Metric about incomplete compactions

2021-04-20 Thread ASF GitHub Bot (Jira)


 [ https://issues.apache.org/jira/browse/HIVE-25022?focusedWorklogId=585689&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-585689 ]

ASF GitHub Bot logged work on HIVE-25022:
-

Author: ASF GitHub Bot
Created on: 20/Apr/21 10:26
Start Date: 20/Apr/21 10:26
Worklog Time Spent: 10m 
  Work Description: deniskuzZ commented on a change in pull request #2184:
URL: https://github.com/apache/hive/pull/2184#discussion_r616544766



##
File path: standalone-metastore/metastore-server/src/main/java/org/apache/hadoop/hive/metastore/metrics/AcidMetricService.java
##
@@ -108,19 +112,56 @@ private void updateDBMetrics() throws MetaException {
   @VisibleForTesting
   public static void updateMetricsFromShowCompact(ShowCompactResponse showCompactResponse) {
 Map<String, ShowCompactResponseElement> lastElements = new HashMap<>();
+Map lastUnsuccessfulMajor = new HashMap<>();
+Map lastUnsuccessfulMinor = new HashMap<>();
 long oldestEnqueueTime = Long.MAX_VALUE;
 
-// Get the last compaction for each db/table/partition
-for(ShowCompactResponseElement element : showCompactResponse.getCompacts()) {
+// sort compactions by ID. This is not done in TxnHandler.
+List<ShowCompactResponseElement> compactions = showCompactResponse.getCompacts().stream()
+.sorted((o1, o2) -> (int) (o1.getId() - o2.getId())).collect(Collectors.toList());
+for (ShowCompactResponseElement element : compactions) {
   String key = element.getDbname() + "/" + element.getTablename() +
   (element.getPartitionname() != null ? "/" + element.getPartitionname() : "");
+
+  // Get the last compaction for each db/table/partition
   // If new key, add the element, if there is an existing one, change to the element if the element.id is greater than old.id
   lastElements.compute(key, (k, old) -> (old == null) ? element : (element.getId() > old.getId() ? element : old));
   if (TxnStore.INITIATED_RESPONSE.equals(element.getState()) && oldestEnqueueTime > element.getEnqueueTime()) {
 oldestEnqueueTime = element.getEnqueueTime();
   }
+
+  // Count incomplete compactions
+  CompactionType type = element.getType();
+  lastUnsuccessfulMajor.compute(key, (k, old) -> {
+// Add newest unsuccessful compaction to the map
+if (wasUnsuccessful(element) && type == MAJOR) {

Review comment:
   Could it be simplified?
   ```
   if (type == MAJOR) {
     if (wasUnsuccessful(element)) return element.getId();
     if (wasSuccessful(element)) return null;
   }
   ```
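
For context, the simplification suggested in this review comment could look roughly like the self-contained sketch below. Everything in it is an assumption made for illustration: `Element` is a hypothetical stand-in for `ShowCompactResponseElement`, the bodies of `wasSuccessful` and `wasUnsuccessful` are not shown in this thread and are guessed here, and the state strings are placeholders. The sketch also sorts with `Comparator.comparingLong`, since a comparator that casts a `long` difference to `int` can return the wrong sign when the IDs are far apart.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; Element stands in for ShowCompactResponseElement. */
final class LastUnsuccessfulMajorSketch {
  enum Type { MAJOR, MINOR }

  static final class Element {
    final long id;
    final String key;   // e.g. "db/table/partition"
    final Type type;
    final String state; // state strings below are assumptions
    Element(long id, String key, Type type, String state) {
      this.id = id;
      this.key = key;
      this.type = type;
      this.state = state;
    }
  }

  // Assumed definitions; the real helper bodies are not part of this thread.
  static boolean wasSuccessful(Element e) {
    return "succeeded".equals(e.state);
  }

  static boolean wasUnsuccessful(Element e) {
    return "failed".equals(e.state) || "did not initiate".equals(e.state);
  }

  /** Maps each key to the ID of its newest unsuccessful major compaction, if still outstanding. */
  static Map<String, Long> lastUnsuccessfulMajor(List<Element> elements) {
    List<Element> sorted = new ArrayList<>(elements);
    // comparingLong compares the long IDs directly, with no narrowing cast.
    sorted.sort(Comparator.comparingLong((Element e) -> e.id));

    Map<String, Long> result = new HashMap<>();
    for (Element e : sorted) {
      result.compute(e.key, (k, old) -> {
        if (e.type == Type.MAJOR) {
          if (wasUnsuccessful(e)) return e.id; // remember the newest unsuccessful major
          if (wasSuccessful(e)) return null;   // returning null removes the mapping
        }
        return old; // minor compactions and unfinished states leave the entry unchanged
      });
    }
    return result;
  }
}
```

The sketch relies on `Map.compute` removing the entry when the remapping function returns null, so a later successful major compaction clears the earlier failure for that key.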






Issue Time Tracking
---

Worklog Id: (was: 585689)
Time Spent: 20m  (was: 10m)


[jira] [Work logged] (HIVE-25022) Metric about incomplete compactions

2021-04-15 Thread ASF GitHub Bot (Jira)


 [ https://issues.apache.org/jira/browse/HIVE-25022?focusedWorklogId=583551&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-583551 ]

ASF GitHub Bot logged work on HIVE-25022:
-

Author: ASF GitHub Bot
Created on: 15/Apr/21 16:04
Start Date: 15/Apr/21 16:04
Worklog Time Spent: 10m 
  Work Description: klcopp opened a new pull request #2184:
URL: https://github.com/apache/hive/pull/2184


   ### What changes were proposed in this pull request?
   See https://issues.apache.org/jira/browse/HIVE-25022
   
   ### How was this patch tested?
   Unit tests
   




Issue Time Tracking
---

Worklog Id: (was: 583551)
Remaining Estimate: 0h
Time Spent: 10m



