Gordon,
Thanks for your comments.
Please check my answers and the flow chart below.

On Wed, 21 Sep 2016, gordon chung wrote:


=========== event-alarm timeout implementation =============
Since this is for event-alarm, we need to keep it event-driven. Furthermore,
for quick response, we need to use an event for timeout handling. A periodic
worker can't meet the real-time requirement.

A separate queue for 'alarm.timeout.end' (which indicates that the timeout
has expired) leads to a tricky race condition, e.g. 'XYZ.done' comes in on
queue1 and 'alarm.timeout.end' comes in on queue2, so that they are handled
in parallel:

1. In queue1, 'XYZ.done' is checked against the alarm (currently UNKNOWN),
and the alarm will be set to ALARM in the next step.
2. In queue2, 'alarm.timeout.end' is checked against the same alarm
(currently UNKNOWN), and the alarm will be set to OK (UNALARM) in the next
step.
3. In queue1, the alarm transition happens: UNKNOWN => ALARM
4. In queue2, another alarm transition happens: ALARM => OK (UNALARM)



can you clarify how this works? after user creates event timeout alarm
definition through API (i assume the alarm definition specifies that we
should see event x within y seconds).
- how does the evaluator get this alarm definition? is there an
alarm.timeout.start message?

Yes.

- what is this UNALARM state? to be honest, that isn't a real word so i
don't know what it's supposed to represent here.

It's OK, meaning we have enough data to say that this alarm should not be triggered. Somebody mistook it for ALARM, so I marked it as UN-ALARM.


biggest problem for me is the only thing i know is there's an
alarm.timeout.end event that needs to be handled by evaluator. i don't
know where it's coming from or what it's needed for.

I attached a flow chart at the bottom; please check it. 'timeout.end' and event 'X' arrive by different paths, so it's good if the evaluator does not touch the next one until the previous one has been handled.



So this alarm has a bogus transition: UNKNOWN => ALARM => UNALARM, which
tells the user: the required event came, then the required event did not come.
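
To make the interleaving in steps 1-4 concrete, here is a minimal single-threaded sketch that replays it deterministically. It is only a model of the check-then-act problem, not aodh code: the `Alarm` class, `check` and `set_state` are hypothetical names invented for illustration.

```python
# Hypothetical model of the interleaving in steps 1-4; this is not aodh code.
UNKNOWN, ALARM, OK = "UNKNOWN", "ALARM", "OK"


class Alarm:
    def __init__(self):
        self.state = UNKNOWN
        self.history = [UNKNOWN]

    def set_state(self, new_state):
        # Unconditional write: nothing re-checks the state at this point.
        self.state = new_state
        self.history.append(new_state)


def check(alarm):
    """Return True if the handler believes it may transition the alarm."""
    return alarm.state == UNKNOWN


alarm = Alarm()

# Steps 1 and 2: both handlers check while the alarm is still UNKNOWN.
queue1_may_fire = check(alarm)  # queue1 handling 'XYZ.done'
queue2_may_fire = check(alarm)  # queue2 handling 'alarm.timeout.end'

# Steps 3 and 4: both handlers act on their (now stale) check result.
if queue1_may_fire:
    alarm.set_state(ALARM)
if queue2_may_fire:
    alarm.set_state(OK)

print(" => ".join(alarm.history))
```

The second write goes through because its check happened before the first write, which is exactly the bogus double transition described above.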

If we put all events in one queue, the evaluator handles them one by one
(the low-level oslo.messaging layer should be multi-threaded), so that the
second event would see the alarm state as not UNKNOWN and give up its
transition.  As Gordc said, it's slow. But only a very small part of
event-alarms need timeout handling, as it's only for the telco usage model.
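
The "second event sees the state is no longer UNKNOWN and gives up" behaviour can also be sketched independently of how many queues or threads deliver the events, by making the check and the write one atomic step. The following is a rough illustration under that assumption; `GuardedAlarm` and `transition_if_unknown` are hypothetical names, and aodh's real alarm storage is not modelled here.

```python
import threading

UNKNOWN, ALARM, OK = "UNKNOWN", "ALARM", "OK"


class GuardedAlarm:
    """Hypothetical alarm whose check and state change happen under one lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self.state = UNKNOWN
        self.history = [UNKNOWN]

    def transition_if_unknown(self, new_state):
        # Check and write are atomic, so only the first caller can win.
        with self._lock:
            if self.state != UNKNOWN:
                return False  # second event gives up its transition
            self.state = new_state
            self.history.append(new_state)
            return True


alarm = GuardedAlarm()
results = {}


def handle(event_type, new_state):
    results[event_type] = alarm.transition_if_unknown(new_state)


# Two handlers running concurrently, as with two queues or a thread pool.
threads = [
    threading.Thread(target=handle, args=("XYZ.done", ALARM)),
    threading.Thread(target=handle, args=("alarm.timeout.end", OK)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(alarm.history, results)
```

Which event wins depends on scheduling, but there is always exactly one transition out of UNKNOWN, so the user never sees UNKNOWN => ALARM => OK.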

so the multithreaded part is what i was talking about. it's not handling
them one by one. it's handling 64 (or whatever the default is) at any
given time. whether it's one queue or two, you have a race to handle.

See https://github.com/openstack/aodh/blob/master/aodh/evaluator/event.py#L158

evaluate_events is the handler of the endpoint for 'alarm.all'. It iterates over the event list and evaluates the events one by one against the project alarms. If both 'timeout.end' and 'X' are in the event list, I assume they are handled in sequence, in different iterations of the for loop. Am I right?

If we have evaluate_timeout_events as the handler of another endpoint for 'alarm.timeout', then the 2 handlers can run concurrently, leading to a race condition. I'm not familiar with the underlying oslo notifications, and I think a separate queue is a different story. Please correct me if I'm wrong.

for e in events:
    try:
        event = Event(e)
    ......

    for id, alarm in six.iteritems(
            self._get_project_alarms(event.project)):
        try:
            self._evaluate_alarm(alarm, event)
        ...

================================================================================

+----------+   +------------+     +------------------+     +------------+      +-----------+          +------------+
|  User    |   | API server |     | Notification bus |     | Evaluator  |      | Threads   |          | Alarm state|
+--+-------+   +-----+------+     +--------+---------+     +-----+------+      +--------+--+          +------+-----+
   |                 |                     |                     |                      |                    |
   +---------------> |                     |                     |                      |                    |
   | +-------------+ |                     |                     |                      |                    |
   | |Alarm create | |                     |                     |                      |                    +-----------+
   | |event: X     | |                     |                     |                      |                    | UNKNOWN   |
   | |timeout: 5s  | |                     |                     |                      |                    +-----------+
   | +-------------+ |                     |                     |                      |                    |
   |                 +-----------------------------------------> |                      |                    |
   |                 | +-----------------+ |                     |                      |                    |
   |                 | |Event sent:      | |                     |                      |                    |
   |                 | |timeout.start    | |                     |                      |                    |
   |                 | +-----------------+ |                     |                      |                    |
   |                 |                     |                     +--------------------> |                    |
   |                 |                     |                     |    +----------+      |                    |
   |                 |                     |                     |    | create   |      |                    |
   |                 |                     |                     |    +----------+      |                    |
   |                 |                     |                     |                      +-----------+        |
   |                 |                     |                     |                      |Sleep 10s  |        |
   |                 |                     |                     |                      +-----------+        |
   |                 |                     |                     |                      |                    |
   |                 |                     |                     | <--------------------+                    |
   |                 |                     |                     | +-----------------+  |                    |
   |                 |                     |                     | |1 - Event sent:  |  |                    |
   |                 |                     |                     | |timeout.end      |  |                    |
   |                 |                     |                     | +-----------------+  |                    |
   |                 |                     |                     |                      |                    |
   |                 |                     |                     +-----------------------------------------> |
   |                 |                     |                     | +------------------+ |                    +--------+
   |                 |                     |                     | |Transition:       | |                    | OK     |
   |                 |                     |                     | |==>> OK           | |                    +--------+
   |                 |                     |                     | +------------------+ |                    |
   |                 |                     +-------------------> |                      |                    |
   |                 |                     | +---------------+   |                      |                    |
   |                 |                     | |2 - Event come:|   |                      |                    |
   |                 |                     | |X              |   |                      |                    |
   |                 |                     | +---------------+   | +------------------+ |                    |
   |                 |                     |                     | |No transition:    | |                    |
   +                 +                     +                     + |Already OK        | +                    +
                                                                   +------------------+
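
As a rough companion to the flow chart, the sequence "timeout.start creates a timer thread, the timer emits timeout.end, the evaluator handles events in arrival order" can be sketched as below. This is only an illustration under assumptions: the `events` queue stands in for the notification path, `on_timeout_start` and `evaluate` are hypothetical helpers rather than aodh functions, and the timeout is shortened to 0.05 s so the example finishes quickly.

```python
import queue
import threading

UNKNOWN, ALARM, OK = "UNKNOWN", "ALARM", "OK"

events = queue.Queue()  # stands in for the single queue the evaluator reads
alarm = {"event": "X", "state": UNKNOWN}


def on_timeout_start(timeout_seconds):
    """Create a timer thread that emits 'timeout.end' when it expires."""
    timer = threading.Timer(timeout_seconds, events.put, args=("timeout.end",))
    timer.daemon = True
    timer.start()
    return timer


def evaluate(event_type):
    """Handle one event; only the first event changes the alarm state."""
    if alarm["state"] != UNKNOWN:
        return "no transition: already " + alarm["state"]
    alarm["state"] = OK if event_type == "timeout.end" else ALARM
    return "transition: UNKNOWN => " + alarm["state"]


timer = on_timeout_start(0.05)      # 'timeout.start' received by the evaluator
first = events.get(timeout=5)       # 1 - 'timeout.end' is delivered first
log = [evaluate(first)]
log.append(evaluate("X"))           # 2 - event X comes after the timeout
timer.join()

print(log)
```

Because both events pass through the same `evaluate` call one after another, the late event X finds the alarm already OK and causes no second transition, matching the last two boxes of the chart.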



__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev