Re: Review Request 51498: Multiple Aggregate Alerts For JournalNode Exist With Some Being Incorrect

2016-08-29 Thread Robert Nettleton

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51498/#review147205
---


Ship it!




Ship It!

- Robert Nettleton


On Aug. 29, 2016, 7:43 p.m., Jonathan Hurley wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51498/
> ---
> 
> (Updated Aug. 29, 2016, 7:43 p.m.)
> 
> 
> Review request for Ambari, Nate Cole, Robert Levas, and Robert Nettleton.
> 
> 
> Bugs: AMBARI-18271
> https://issues.apache.org/jira/browse/AMBARI-18271
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> When a cluster is being created initially, there are many new alert instances 
> being generated concurrently. This can lead to a race condition where certain 
> alert types, such as aggregate alerts, fire several events and end up 
> creating duplicate alert instances.
> 
> Although aggregate alerts are the most common, the flaw is in the alert 
> listener since it never guarantees uniqueness even though it supports 
> concurrent events.
> 
> This problem is mainly encountered when the cluster is initially created, and 
> therefore a solution shouldn't impact the rest of the cluster's lifetime. In 
> other words, we shouldn't cause a degradation the other 99.9% of the time in 
> order to fix this problem.
> 
> 
> Diffs
> -
> 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertReceivedListener.java
>  2dcf1d6 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertCurrentEntity.java
>  b30b100 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertDefinitionEntity.java
>  638646c 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertHistoryEntity.java
>  8bc4b99 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertCurrentEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertDefinitionEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertHistoryEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/alerts/AlertReceivedListenerTest.java
>  7bf11e3 
> 
> Diff: https://reviews.apache.org/r/51498/diff/
> 
> 
> Testing
> ---
> 
> A new unit test was written which reproduced the problem, passing once the 
> patch was complete.
> 
> Tests run: 4619, Failures: 0, Errors: 0, Skipped: 34
> 
> [INFO] 
> 
> [INFO] BUILD SUCCESS
> [INFO] 
> 
> [INFO] Total time: 34:07 min
> [INFO] Finished at: 2016-08-29T14:23:54-04:00
> [INFO] Final Memory: 38M/684M
> [INFO] 
> 
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>



Re: Review Request 51498: Multiple Aggregate Alerts For JournalNode Exist With Some Being Incorrect

2016-08-29 Thread Jonathan Hurley

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51498/
---

(Updated Aug. 29, 2016, 3:43 p.m.)


Review request for Ambari, Nate Cole, Robert Levas, and Robert Nettleton.


Bugs: AMBARI-18271
https://issues.apache.org/jira/browse/AMBARI-18271


Repository: ambari


Description
---

When a cluster is being created initially, there are many new alert instances 
being generated concurrently. This can lead to a race condition where certain 
alert types, such as aggregate alerts, fire several events and end up creating 
duplicate alert instances.

Although aggregate alerts are the most common, the flaw is in the alert 
listener since it never guarantees uniqueness even though it supports 
concurrent events.

This problem is mainly encountered when the cluster is initially created, and 
therefore a solution shouldn't impact the rest of the cluster's lifetime. In 
other words, we shouldn't cause a degradation the other 99.9% of the time in 
order to fix this problem.


Diffs
-

  
ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertReceivedListener.java
 2dcf1d6 
  
ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertCurrentEntity.java
 b30b100 
  
ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertDefinitionEntity.java
 638646c 
  
ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertHistoryEntity.java
 8bc4b99 
  
ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertCurrentEntityTest.java
 PRE-CREATION 
  
ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertDefinitionEntityTest.java
 PRE-CREATION 
  
ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertHistoryEntityTest.java
 PRE-CREATION 
  
ambari-server/src/test/java/org/apache/ambari/server/state/alerts/AlertReceivedListenerTest.java
 7bf11e3 

Diff: https://reviews.apache.org/r/51498/diff/


Testing (updated)
---

A new unit test was written which reproduced the problem, passing once the 
patch was complete.

Tests run: 4619, Failures: 0, Errors: 0, Skipped: 34

[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 34:07 min
[INFO] Finished at: 2016-08-29T14:23:54-04:00
[INFO] Final Memory: 38M/684M
[INFO] 


Thanks,

Jonathan Hurley



Re: Review Request 51498: Multiple Aggregate Alerts For JournalNode Exist With Some Being Incorrect

2016-08-29 Thread Nate Cole

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51498/#review147169
---


Ship it!




Ship It!

- Nate Cole


On Aug. 29, 2016, 12:54 p.m., Jonathan Hurley wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51498/
> ---
> 
> (Updated Aug. 29, 2016, 12:54 p.m.)
> 
> 
> Review request for Ambari, Nate Cole, Robert Levas, and Robert Nettleton.
> 
> 
> Bugs: AMBARI-18271
> https://issues.apache.org/jira/browse/AMBARI-18271
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> When a cluster is being created initially, there are many new alert instances 
> being generated concurrently. This can lead to a race condition where certain 
> alert types, such as aggregate alerts, fire several events and end up 
> creating duplicate alert instances.
> 
> Although aggregate alerts are the most common, the flaw is in the alert 
> listener since it never guarantees uniqueness even though it supports 
> concurrent events.
> 
> This problem is mainly encountered when the cluster is initially created, and 
> therefore a solution shouldn't impact the rest of the cluster's lifetime. In 
> other words, we shouldn't cause a degradation the other 99.9% of the time in 
> order to fix this problem.
> 
> 
> Diffs
> -
> 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertReceivedListener.java
>  2dcf1d6 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertCurrentEntity.java
>  b30b100 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertDefinitionEntity.java
>  638646c 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertHistoryEntity.java
>  8bc4b99 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertCurrentEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertDefinitionEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertHistoryEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/alerts/AlertReceivedListenerTest.java
>  7bf11e3 
> 
> Diff: https://reviews.apache.org/r/51498/diff/
> 
> 
> Testing
> ---
> 
> PENDING...
> 
> A new unit test was written which reproduced the problem, passing once the 
> patch was complete.
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>



Re: Review Request 51498: Multiple Aggregate Alerts For JournalNode Exist With Some Being Incorrect

2016-08-29 Thread Jonathan Hurley

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/51498/#review147147
---




ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertReceivedListener.java
 (lines 107 - 110)


The size of the striped locks isn't terribly important since too few would 
only cause contention on the initial creation of alerts



ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertReceivedListener.java
 (lines 186 - 210)


Hitting the DB once, and then a second time is not the best, but it allows 
this patch to focus primarily on the case of new alerts without impacting 
performance of alerts running in an existing cluster. 

The tables queried use indexes as well, so it shouldn't be too bad of a hit.



ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertHistoryEntity.java
 (lines 388 - 425)


equals/hashCode rewrite was from an initial attempt at fixing this problem, 
but it was better than what was there before, so I kept it.

It would have been nice to have a non-entity related ID which could be used 
for equality comparison. We don't have one, so ID is OK to use when it exists.

Alerts get compared to frequently though, so I wanted to ensure that if we 
didn't need to compare other stuff, we don't.


- Jonathan Hurley


On Aug. 29, 2016, 12:54 p.m., Jonathan Hurley wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/51498/
> ---
> 
> (Updated Aug. 29, 2016, 12:54 p.m.)
> 
> 
> Review request for Ambari, Nate Cole, Robert Levas, and Robert Nettleton.
> 
> 
> Bugs: AMBARI-18271
> https://issues.apache.org/jira/browse/AMBARI-18271
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> When a cluster is being created initially, there are many new alert instances 
> being generated concurrently. This can lead to a race condition where certain 
> alert types, such as aggregate alerts, fire several events and end up 
> creating duplicate alert instances.
> 
> Although aggregate alerts are the most common, the flaw is in the alert 
> listener since it never guarantees uniqueness even though it supports 
> concurrent events.
> 
> This problem is mainly encountered when the cluster is initially created, and 
> therefore a solution shouldn't impact the rest of the cluster's lifetime. In 
> other words, we shouldn't cause a degradation the other 99.9% of the time in 
> order to fix this problem.
> 
> 
> Diffs
> -
> 
>   
> ambari-server/src/main/java/org/apache/ambari/server/events/listeners/alerts/AlertReceivedListener.java
>  2dcf1d6 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertCurrentEntity.java
>  b30b100 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertDefinitionEntity.java
>  638646c 
>   
> ambari-server/src/main/java/org/apache/ambari/server/orm/entities/AlertHistoryEntity.java
>  8bc4b99 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertCurrentEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertDefinitionEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/orm/entities/AlertHistoryEntityTest.java
>  PRE-CREATION 
>   
> ambari-server/src/test/java/org/apache/ambari/server/state/alerts/AlertReceivedListenerTest.java
>  7bf11e3 
> 
> Diff: https://reviews.apache.org/r/51498/diff/
> 
> 
> Testing
> ---
> 
> PENDING...
> 
> A new unit test was written which reproduced the problem, passing once the 
> patch was complete.
> 
> 
> Thanks,
> 
> Jonathan Hurley
> 
>