Hi Amin,
These are all good points.
I'm a little hesitant to advocate using events as a scratch pad for
distribution. While that is one approach, there are many others. For
example, the approach we've toyed with is to put all distributable data
in a replicated database or DHT. In this model, state distribution is
done via the database, and events remain purely local. We've found this
to be a simpler model for reasoning about the scalability limits (which
converge on the read/modify/write throughput of the db backends). That
said, event tracing clearly has high value for debugging.
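For illustration, here's a minimal sketch of that model in Python
(ReplicatedStore and the handler are hypothetical stand-ins, not NOX
APIs):

class ReplicatedStore(object):
    """Toy stand-in for a replicated database / DHT client."""

    def __init__(self):
        self._data = {}  # in reality, a replicated backend

    def read(self, key):
        return self._data.get(key)

    def write(self, key, value):
        # Scalability limits converge on the read/modify/write
        # throughput of this call against the real backend.
        self._data[key] = value

store = ReplicatedStore()

def handle_host_join(mac, location):
    # The event itself stays local to this controller; only the
    # resulting state goes through the replicated store.
    store.write(('host', mac), location)

handle_host_join('00:11:22:33:44:55', ('dpid1', 3))
print(store.read(('host', '00:11:22:33:44:55')))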
In response to your last question, there isn't any support for this in
the core code base. We know of some deployments which use Linux HA and
a replicated cdb, but all the soft state must be reconstructed on
failover. Certainly for production deployments this is a high-priority
item, but not on the immediate-term roadmap.
.martin
Hi Martin,
I meant requiring the originators to add themselves (I can think of
workarounds to make this work implicitly, but they are
compiler/architecture specific).
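For example, in Python you could infer the poster from the call stack
(purely illustrative -- this is the kind of implementation-specific
trick I mean, not anything NOX provides):

import inspect

def post(event):
    # Infer the originator from the caller's stack frame. This works
    # on CPython, but it's exactly the sort of trick that doesn't
    # generalize across languages/implementations.
    caller = inspect.stack()[1]
    print("event %r posted from %s" % (event, caller.function))

def some_app_handler():
    post("host_join_event")

some_app_handler()  # -> event 'host_join_event' posted from some_app_handler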
This feature is not only useful for debugging; it can also be used to
deploy multiple NOX controllers to control a single network: on each
controller, *capture a set of ofp_msg_events* (e.g., only a small
portion of packet_in events change the controller state) and
*replay/dispatch* them on the others. We need to discard any outgoing
ofp packets caused by the replayed events, and for that we need to keep
track of which events triggered other events/messages.
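A rough sketch of what I have in mind (Python; every name here is
hypothetical, and a single flag only suffices if events are dispatched
one at a time):

replaying = False  # set while dispatching events received from a peer
handlers = []      # local event handlers

def dispatch(event):
    for handler in handlers:
        handler(event)

def send_openflow(msg):
    # ofp packets produced while replaying a peer's event are
    # discarded; the originating controller already sent them.
    if replaying:
        print("suppressed:", msg)
    else:
        print("sent:", msg)

def replay_from_peer(event):
    global replaying
    replaying = True
    try:
        dispatch(event)
    finally:
        replaying = False

# Example handler: reacts to a packet_in by sending a flow_mod.
handlers.append(lambda ev: send_openflow("flow_mod for " + ev))

dispatch("local packet_in")         # sent
replay_from_peer("peer packet_in")  # suppressed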
Also, to find out which events are *important* (i.e., alter the
controller state), the controller and the running applications need to
mark events explicitly. In other words, it is the
controller/application developer's job to specify which events should
be propagated to other controllers. This part also requires the
feature mentioned above: if a non-ofp_msg_event is marked, we should be
able to trace back to the original ofp_msg_event and mark it. Am I
right that ofp_msg_events are the driving force of NOX's operation?
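To make the trace-back concrete, here's a toy sketch assuming each
event carries a pointer to the event that triggered it (hypothetical,
not the actual NOX event types):

class Event(object):
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # the event that triggered this one
        self.marked = False   # "important": must reach other controllers

def root_event(event):
    # Walk the trigger chain back to the originating ofp_msg_event.
    while event.parent is not None:
        event = event.parent
    return event

def mark(event):
    # Marking a derived event marks the ofp_msg_event behind it, so
    # peers can reproduce the state change by replaying the original.
    root_event(event).marked = True

ofp = Event("ofp_msg_event")
host = Event("host_join_event", parent=ofp)
mark(host)
print(ofp.marked)  # True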
And my last question: Is there currently any way for two NOX instances
to synchronize their states for failover? If not, are there any plans
to provide such a feature? Is there any way for a
controller/application to store its transient state on disk? What
happens in a production network with hundreds of switches when the
controller crashes and comes back up in a few seconds? Should it
rediscover the topology, host-IP-MAC bindings, etc. from scratch?
Cheers,
Amin
Regarding tracing the event call stack: this would certainly be a useful
debugging tool. However, the nature of events is that the infrastructure is
decoupled from the senders and receivers, so it isn't clear to me how we'd
mark the originator in a general way without requiring the originators to
add themselves. I'm certainly open to ideas ...
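To make "add themselves" concrete, a toy sketch (Python, all names
hypothetical):

class Event(object):
    def __init__(self, name):
        self.name = name
        self.originator = None

def post(event, originator):
    # Callers must identify themselves; the dispatcher stays
    # decoupled and can't infer the sender on its own.
    event.originator = originator
    print("%s posted by %s" % (event.name, type(originator).__name__))

class Discovery(object):
    def send_probe(self):
        post(Event("link_event"), self)

Discovery().send_probe()  # -> link_event posted by Discovery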
Thanks,
Amin