Hi all,

>> 17. I am concerned that the current way of defining the
>> notification switches is too coarse. Hypervisors may have many VMs
>> in charge, and if each generates one notification per state change,
>> the numbers can become big even in normal operation. Maybe some
>> throttling mechanism would be useful. Or maybe a couple more
>> switches that allow enabling only 'critical' notifications
>> (e.g. vmCrashed).
>
> The only situation I can think of where the number of notifications
> will be significant is during the restart of a whole rack with many
> hypervisors inside. But even then, things usually take time. (I think
> our Xen hypervisors actually create virtual machines sequentially,
> and hence the notifications are actually spaced out over time.) It is
> not unlikely that the operating systems inside the hypervisors
> generate significantly more network traffic during startup than a few
> hypervisor notifications do. That said, during the development of the
> MIB module we moved from generic state change notifications to a set
> of specific notifications, and this allows SNMPv3 notification
> filtering to be used to filter out notifications people find not
> useful.
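The SNMPv3 notification filtering Juergen mentions can be illustrated with a small sketch. The following is a simplified model of the subtree-matching idea behind snmpNotifyFilterTable (RFC 3413): filter masks are omitted, and the notification OIDs under the VM-MIB shown here are illustrative placeholders, not registered values.

```python
# Simplified sketch of SNMPv3 notification filtering (snmpNotifyFilterTable,
# RFC 3413): each filter entry names an OID subtree and marks it included or
# excluded; the longest matching subtree decides whether a notification
# passes. Filter masks are omitted and the OIDs below are illustrative.

def passes_filter(notif_oid, entries):
    """entries: iterable of (subtree, included) pairs; OIDs are int tuples."""
    best = None
    for subtree, included in entries:
        if notif_oid[:len(subtree)] == subtree:  # subtree is a prefix
            if best is None or len(subtree) > len(best[0]):
                best = (subtree, included)       # longest match wins
    return best is not None and best[1]

vm_mib = (1, 3, 6, 1, 2, 1, 236)   # VM-MIB base OID (illustrative here)
vm_crashed = vm_mib + (0, 1)       # placeholder notification OIDs
vm_blocked = vm_mib + (0, 7)

# Receive all VM-MIB notifications except vmBlocked.
profile = [
    (vm_mib + (0,), True),
    (vm_blocked, False),
]

print(passes_filter(vm_crashed, profile))  # True
print(passes_filter(vm_blocked, profile))  # False
```

This mirrors only the prefix-match/longest-match behavior; a real implementation also handles the family mask bits and applies the filter per notification target.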
Asai and I talked locally, and we agree with Juergen that the number of
notification events is manageable. However, we noticed that the
'vmBlocked' notification may be a problem. The 'blocked' state is
defined as 'The operational state of the virtual machine indicating the
execution of the virtual machine is currently blocked, e.g., waiting
for some action of the hypervisor to finish. This is a transient state
from/to other states.' This state transition event may appear more
frequently than other events, since the state is caused by the I/O
scheduler of the hypervisor implementation.

We think we can simply remove the 'vmBlocked' notification. The
'blocked' state is transient and typically returns to the previous
state immediately (once the pending I/O requests complete).

---
Keiichi SHIMA (島 慶一)
WIDE project <[email protected]>
Research Laboratory, IIJ Innovation Institute, Inc. <[email protected]>

> On 2015/05/23, at 15:37, Juergen Schoenwaelder
> <[email protected]> wrote:
>
> On Thu, May 14, 2015 at 04:45:19PM +0000, Romascanu, Dan (Dan) wrote:
>
> [...]
>
> Dan,
>
> many thanks for the review. I am skipping over the smaller nits.
>
>> 16. Page 34-35: I have a big question mark about how the
>> vmStorageReadLatency and vmStorageWriteLatency objects need to be
>> used. First, the usage of the Counter64 syntax seems odd, as these
>> objects do not count anything and their increase is not unitary.
>> Second, I do not know how to interpret them. What does a reading of
>> 'a million' or 'a billion' mean? It really makes no sense if not
>> interpreted in conjunction with the number of respective I/O
>> operations, media type, etc. I guess that the implementation of
>> these objects is not trivial and probably requires HW support,
>> taking into consideration the microsecond resolution. At a minimum,
>> I guess that some text recommending to operators how they are
>> supposed to be used is needed.
>
> The only SMIv2 requirement for a Counter64 is that it increases
> monotonically, which is the case here. There is no requirement that a
> counter increase in unit steps. By making this a counter, it is clear
> that one has to look at the first derivative, that is, the change of
> the counter over time. Obviously, on a lightly loaded hypervisor, the
> latency should be small, but in situations where it increases
> significantly, you are likely hitting a slowdown of your VMs due to
> I/O contention rather than CPU limits. These objects are there to
> identify such situations (and the virtual storage devices involved).
>
>> 17. I am concerned that the current way of defining the
>> notification switches is too coarse. Hypervisors may have many VMs
>> in charge, and if each generates one notification per state change,
>> the numbers can become big even in normal operation. Maybe some
>> throttling mechanism would be useful. Or maybe a couple more
>> switches that allow enabling only 'critical' notifications
>> (e.g. vmCrashed).
>
> The only situation I can think of where the number of notifications
> will be significant is during the restart of a whole rack with many
> hypervisors inside. But even then, things usually take time. (I think
> our Xen hypervisors actually create virtual machines sequentially,
> and hence the notifications are actually spaced out over time.) It is
> not unlikely that the operating systems inside the hypervisors
> generate significantly more network traffic during startup than a few
> hypervisor notifications do. That said, during the development of the
> MIB module we moved from generic state change notifications to a set
> of specific notifications, and this allows SNMPv3 notification
> filtering to be used to filter out notifications people find not
> useful.
>
>> 18.
>> The Security Considerations section does not include a description
>> of the security hazards of misconfiguration of the writable objects
>> (danger of flooding the network with unwanted notifications).
>
> Perhaps text can be added, but I am not sure I see that a data center
> network will experience any significant load due to these
> notifications.
>
> /js
>
> --
> Juergen Schoenwaelder           Jacobs University Bremen gGmbH
> Phone: +49 421 200 3587         Campus Ring 1 | 28759 Bremen | Germany
> Fax:   +49 421 200 3103         <http://www.jacobs-university.de/>
>
> _______________________________________________
> OPSAWG mailing list
> [email protected]
> https://www.ietf.org/mailman/listinfo/opsawg
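Juergen's "first derivative" reading of the latency counters above can be made concrete. A minimal sketch, assuming a management application polls both the cumulative latency counter and a corresponding completed-operations counter (the exact pairing of objects is my assumption here, not text from the MIB module): the delta of accumulated latency divided by the delta of completed operations gives the average per-operation latency over the polling interval.

```python
# Hedged sketch: interpreting a cumulative latency counter such as
# vmStorageReadLatency. A single reading is not meaningful on its own;
# the change over a polling interval, divided by the change in the
# number of completed operations, yields the average per-operation
# latency in that interval. Counter wrap handling is omitted.

def avg_latency_us(lat_prev, lat_now, ops_prev, ops_now):
    """Average per-operation latency (microseconds) between two polls."""
    d_ops = ops_now - ops_prev
    if d_ops <= 0:
        return None  # no operations completed in this interval
    return (lat_now - lat_prev) / d_ops

# Two polls 60 s apart (values invented for illustration):
#   poll 1: cumulative latency 1,000,000 us after 4,000 reads
#   poll 2: cumulative latency 1,900,000 us after 7,000 reads
print(avg_latency_us(1_000_000, 1_900_000, 4_000, 7_000))  # 300.0 (us/read)
```

A rising per-operation latency across the VMs sharing a hypervisor is the I/O-contention signal Juergen describes, without the raw counter value itself having to "mean" anything.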
