scalable-agent-comms)

Sumit Naiksatam (snaiksat) Sat, 12 May 2012 14:36:46 -0700

Hi Gary, All,


Some comments inline.

 

Thanks,

~Sumit.

 

From: netstack-bounces+snaiksat=cisco....@lists.launchpad.net
[mailto:netstack-bounces+snaiksat=cisco....@lists.launchpad.net] On
Behalf Of Gary Kotton
Sent: Friday, May 11, 2012 1:29 AM
To: Maru Newby
Cc: Christopher Wright; netstack@lists.launchpad.net
Subject: Re: [Netstack]
ScalableAgents(https://blueprints.launchpad.net/quantum/+spec/scalable-agent-comms)

 

Hi,
Thanks for the input and comments. This is really great.

I would like to propose the following staged development (to enable us
to stabilize, test, and then optimize):
1. Stage 1 - have the agent detect a change, initially by polling. When
the agent detects an update, it will contact the plugin for a
detailed update about the network.

 

<Sumit> Currently the agents work by first polling the Quantum DB to
detect changes in the network/port state, and then accordingly react
locally (to check for the presence of a tap device, etc.). As I
understand, you are proposing to switch the sequence of this logic,
i.e., the agent first detects a change in the local state (e.g., a new
tap device has been created) and then communicates with the Quantum
plugin to obtain more context for this change. If this is the thought, I
believe it is reasonable and will eliminate the overhead from polling
the DB. The premise here of course is that the agent is able to locally
detect all the changes that it needs to react to, and in the basic case
of the Linux Bridge plugin, I don't think there is anything beyond
creation of tap devices, so this should work. If there are other state
changes introduced by the Quantum plugin that the agent needs to react
to, then the agent would not know about these in the absence of a
notification mechanism.</Sumit>
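
As a rough illustration of the stage 1 flow discussed above (detect a local
tap device change first, then ask the plugin for context), here is a minimal
Python sketch. It assumes tap devices show up under /sys/class/net with a
"tap" name prefix; fetch_details is a hypothetical stand-in for whatever
agent-to-plugin request is eventually chosen, not an existing Quantum call.

```python
import os

TAP_PREFIX = "tap"  # assumption: tap devices are named tap<something>


def list_tap_devices(sysfs_net="/sys/class/net"):
    """Return the set of tap device names currently visible locally."""
    return {name for name in os.listdir(sysfs_net)
            if name.startswith(TAP_PREFIX)}


def diff_devices(previous, current):
    """Return (added, removed) between two snapshots of device names."""
    return current - previous, previous - current


def poll_once(previous, fetch_details, sysfs_net="/sys/class/net"):
    """Run one cycle: detect local changes, then query the plugin.

    fetch_details is a hypothetical callable standing in for the
    agent-to-plugin request; it is only invoked when devices were added.
    """
    current = list_tap_devices(sysfs_net)
    added, removed = diff_devices(previous, current)
    details = fetch_details(sorted(added)) if added else {}
    return current, details, removed
```

The agent would call poll_once in a loop, feeding the returned snapshot back
in as previous. The plugin is only contacted when the local state changes,
which is what removes the periodic DB polling.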


2. Stage 2 - Event driven support. One option is to have the operating
system notify the agent (as suggested by Darragh); another is to have the
VIF driver notify the agent. I am in favour of the latter. The VIF
driver is essentially creating the new tap device or deleting the
existing tap device. It seems logical that this would drive the update
on the agent.

 

<Sumit> I would actually prefer the former approach. It's better to
decouple the components to the extent possible so as to be able to
update/reuse them independently. I doubt that we will get any
significant performance advantage out of making the VIF driver aware of
the agents and having them communicate with those explicitly. However,
it does introduce a stronger coupling and we should probably avoid
it.</Sumit>



What I would like to do is a quick POC of the above and then write a
detailed design of the flow so that we can all review. If it "compiles
and runs" on paper then it will speed up the development, testing and
deployment. It will also enable us to document for future reference.
This will also save time with review and the ping/pong with the -1's.

 

<Sumit> Great, thanks for doing this! </Sumit>



Have a good weekend and thanks for the inputs and comments. Hopefully
next week I'll have an update on the progress.
Thanks
Gary


On 05/11/2012 12:30 AM, Maru Newby wrote: 

Thanks Darragh!  That should cover kvm.  And apparently it's possible to
be notified of vif changes from xen/xcp too; a more xen-savvy co-worker
is tracking down details.  

 

Gary, it sounds like it will be possible to have the agent notified
directly of device changes.  What are your thoughts as to modifying your
proposal to take this into account?

 

Cheers,

 

 

Maru

 

 

On 2012-05-10, at 2:06 PM, Darragh OReilly wrote:





 

Maybe udev event rules/actions could be installed for add/remove tap
device events:

http://www.reactivated.net/writing_udev_rules.html#external-run
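
A minimal sketch of what the udev suggestion could look like, assuming rules
along the lines shown in the comment below and a hypothetical handler script
path. The handler relies on udev exporting ACTION and INTERFACE in the
environment of programs started via RUN for network devices; the "tap" name
prefix and the way the agent is notified are assumptions, not anything
settled in this thread.

```python
import os

# Hypothetical rules file, e.g. /etc/udev/rules.d/99-quantum-tap.rules:
#   SUBSYSTEM=="net", KERNEL=="tap*", ACTION=="add", RUN+="/usr/local/bin/quantum-tap-event"
#   SUBSYSTEM=="net", KERNEL=="tap*", ACTION=="remove", RUN+="/usr/local/bin/quantum-tap-event"


def parse_udev_event(env):
    """Return (action, device) for tap add/remove events, else None."""
    action = env.get("ACTION")
    device = env.get("INTERFACE", "")
    if action in ("add", "remove") and device.startswith("tap"):
        return action, device
    return None


if __name__ == "__main__":
    event = parse_udev_event(os.environ)
    if event is not None:
        # Placeholder: a real handler would notify the agent here.
        print("%s %s" % event)
```

This keeps the VIF driver unaware of the agent: the operating system reports
the device change, and the handler only filters and forwards it.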





________________________________

From: Maru Newby <mne...@internap.com>
To: gkot...@redhat.com 
Cc: Christopher Wright <chr...@redhat.com>; netstack@lists.launchpad.net

Sent: Thursday, 10 May 2012, 18:38
Subject: Re: [Netstack] Scalable
Agents(https://blueprints.launchpad.net/quantum/+spec/scalable-agent-comms)

 

Hi Gary,

 

I appreciate the effort you've put into condensing the options.

 

I agree with your suggestion that option 1 is a good starting point.
How will the agent discover changes to tap devices?  Can an agent
register for events from linux/kvm or xen, or would the agent just poll?
For all I know agents may do this already, so I apologize if this is a
silly question.

 

Regarding option 2, I still see no reason to have the vif driver talk to
the agent directly.  Ensuring a single point of contact between quantum
clients (of which the vif driver is one) and quantum, namely the rest
interface, limits complexity and will be easier to maintain and test.
If and when performance or other concerns require direct vif driver to
agent communication, we can go down that road, but as of now it's
answering a question that hasn't been asked.  YAGNI.

 

I would also argue that even RPC communication between the plugin and
agent is gold-plating.  The problem at hand is that database polling
doesn't scale well.  The simple answer is for the plugin and agent to
communicate directly rather than through a database intermediary.
Adding RPC to the mix is an implementation detail, pure and simple, and
is not cost-free.  RPC introduces queue dependency that can be
problematic to debug and as we've seen in nova can cause performance
issues all its own.

 

I'm all for leaving the door open to introducing an RPC dependency
later, but I think the most important thing is to create a clean
communication interface between plugin and agent.  The initial
implementation can be something simple (and relatively dependency free)
like secured http.  The semantics for implementing and debugging http
communication are well-known to all of us.  If and when RPC becomes
necessary, it will be straightforward to plug in a new transport driver.
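
To make the "clean interface with a pluggable transport" idea concrete, here
is a small Python sketch. Every name in it (Transport, HttpTransport,
LoopbackTransport, AgentClient, the device_details method and the URL layout)
is hypothetical and only illustrates the shape of such an interface; it is
not an existing Quantum API.

```python
import abc
import json
import urllib.request


class Transport(abc.ABC):
    """Hypothetical agent-to-plugin transport interface."""

    @abc.abstractmethod
    def call(self, method, payload):
        """Send a request to the plugin and return the decoded reply."""


class HttpTransport(Transport):
    """Simple JSON-over-HTTP(S) transport; the URL layout is an assumption."""

    def __init__(self, base_url):
        self.base_url = base_url.rstrip("/")

    def call(self, method, payload):
        request = urllib.request.Request(
            "%s/%s" % (self.base_url, method),
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.loads(response.read().decode("utf-8"))


class LoopbackTransport(Transport):
    """In-process transport for tests; an RPC driver added later would
    implement the same call() method."""

    def __init__(self, handlers):
        self.handlers = handlers

    def call(self, method, payload):
        return self.handlers[method](payload)


class AgentClient:
    """Agent-side code depends only on the Transport interface."""

    def __init__(self, transport):
        self.transport = transport

    def get_device_details(self, devices):
        return self.transport.call("device_details", {"devices": devices})
```

Because AgentClient never touches the transport internals, swapping HTTP for
RPC later would mean adding one more Transport subclass rather than changing
agent logic.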

 

Let's keep it simple - distributed computing is complicated enough!

 

Cheers,

 

 

Maru

 

On 2012-05-10, at 8:22 AM, Gary Kotton wrote:





Hi,
Below is a table that lists a number of options, a short description,
their advantages and disadvantages. Hopefully this can give an idea of
the scope and complexity.

Option 1: Agent driving data retrieval from plugin

  Description:
    The agent maintains a list of tap devices. If there is a new tap
    device, the agent requests the network information for it from the
    quantum plugin. In the case of the open source ovs and lb
    (linuxbridge) plugins, the tap device name is "tap" + the first 11
    characters of the attachment id. The agent sends an RPC update
    about the delta to the plugin, and the plugin answers accordingly.
    For example, if one or more new tap devices are detected, they are
    sent to the plugin. For each new tap device the plugin sends the
    network information (tags etc.) and sets the database attachment
    as up. Deleted devices are removed (or set as down).

  Advantages:
    - Simple
    - Self contained in Quantum

  Disadvantages:
    - If there is more than 1 attachment ID with the same prefix of 11
      characters then this will not work (this is currently a bug)
    - The agent will still have to poll the network interfaces

Option 2: VIF driver driving retrieval from plugin

  Description:
    The VIF driver updates the plugin about a change, which in turn
    updates the relevant agent. This was described in the link
    https://docs.google.com/document/d/1MbcBA2Os4b98ybdgAw2qe_68R1NG6KMh8zd

  Advantages:
    - Event driven
    - No polling

  Disadvantages:
    - VIF driver and agents will need to share communication channels

Option 3: Plugin broadcasting

  Description:
    When the plugin receives a change it broadcasts the change to all
    of the registered agents.

  Advantages:
    - Relatively simple

  Disadvantages:
    - Lots of unnecessary messages to agents that do not need to deal
      with the traffic
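
The prefix problem noted for option 1 can be shown with a short Python
sketch. It assumes the naming scheme described above ("tap" plus the first 11
characters of the attachment id); the function names and the sample ids are
made up for illustration.

```python
def tap_device_name(attachment_id):
    """Derive the tap device name from an attachment id (assumed scheme)."""
    return "tap" + attachment_id[:11]


def find_collisions(attachment_ids):
    """Group attachment ids that map to the same tap device name."""
    by_name = {}
    for attachment_id in attachment_ids:
        name = tap_device_name(attachment_id)
        by_name.setdefault(name, []).append(attachment_id)
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


# Two made-up ids that share their first 11 characters.
ids = [
    "0a1b2c3d-4e5f-1111-8888-000000000001",
    "0a1b2c3d-4e5f-2222-8888-000000000002",
    "ffffffff-0000-3333-8888-000000000003",
]
print(find_collisions(ids))
```

With these ids the first two both map to tap0a1b2c3d-4e, so an agent that
looks up the attachment by tap name alone cannot tell them apart.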


I think that option #1 is a good start. This can later be optimized to
option #2.

Thanks
Gary

On 05/10/2012 10:05 AM, Gary Kotton wrote: 

On 05/10/2012 12:55 AM, Sumit Naiksatam (snaiksat) wrote: 



Hi Gary, 

Thanks for initiating this. A couple of comments/questions - 

1. Do we really need the VIF driver to communicate the agent's identity?
I am referring to the agent ID being sent by the VIF driver in the 
message. In general, I am not sure if there is a need to have the VIF 
driver send messages/notifications in the first place, but perhaps 
it's being included as a capability in the framework? 

At the moment the open source plugins are not aware of the agents. The
agents poll the database for updates. The agent ID enables an agent to
register with the plugin; this in turn enables the plugin to send an
update to the specific agent. The update is initiated by the VIF driver.
In my opinion this does the following: 
1. updates the agents as soon as possible regarding a network change 
2. limits traffic on the network 
3. removes the database interface from the agents 



2. One model I was thinking of (which is kind of in line with the 
existing agent implementations) is where the agents are smart, and they
know what to do in response to changes in the state of the logical 
Quantum resources. In such cases, the Quantum plugin need not 
keep track of sending a message to a particular agent. Instead, can we 
have broadcast messages from the plugin to all the agents? If the plugin
has to unicast messages to specific agents, then it needs to maintain a 
lot more state/topology information, which should not be mandated for 
this sole reason. 

I too thought about this option. In a sense the above proposal is an
optimization of what you mention. This comes at the cost of complexity.
The broadcast option is nice when the number of agents is small. When
it is large, NUMBER_OF_AGENT messages will be sent for each network
update. The advantage of what you mention is that the code is self
contained in Quantum. 

It may be better to start with the broadcast and then deal with the
optimizations afterwards. 
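
The message-count trade-off can be put into numbers with a tiny Python
sketch. The function names are made up, and the unicast figure assumes each
update concerns a single agent unless stated otherwise.

```python
def broadcast_messages(updates, number_of_agents):
    """Every update is sent to every registered agent."""
    return updates * number_of_agents


def unicast_messages(updates, agents_per_update=1):
    """Each update is sent only to the agents it concerns (assumed: 1)."""
    return updates * agents_per_update


# Example: 100 network updates with 500 registered agents.
print(broadcast_messages(100, 500))
print(unicast_messages(100))
```

Broadcast traffic grows with the number of agents while unicast traffic does
not, which is the cost being traded against the extra state the plugin must
keep for unicast.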

Thanks 
Gary 




Thanks, 
~Sumit. 




-----Original Message----- 
From: netstack-bounces+snaiksat=cisco....@lists.launchpad.net 
[mailto:netstack-bounces+snaiksat=cisco....@lists.launchpad.net] On 
Behalf Of Gary Kotton 
Sent: Wednesday, May 09, 2012 4:27 AM 
To:<netstack@lists.launchpad.net> <mailto:netstack@lists.launchpad.net>

Subject: [Netstack] Scalable 
Agents(https://blueprints.launchpad.net/quantum/+spec/scalable-agent-comms) 

Hi, 
I have added a very high level description on how to address the 
issue. This can be seen at: 
https://docs.google.com/document/d/1MbcBA2Os4b98ybdgAw2qe_68R1NG6KMh8zdZKgOlpvg/edit 
Comments will be greatly appreciated. 
Questions: 
1. Do we want agents to be backward compatible (that is, still 
maintain the polling code)? 
2. The generation of the Agent ID 
3. Any other ideas or thoughts about the matter? 
I'd like to go ahead with a POC and implement this. 
Thanks 
Gary 

-- 
Mailing list: https://launchpad.net/~netstack
<https://launchpad.net/%7Enetstack>  
Post to     : netstack@lists.launchpad.net 
Unsubscribe : https://launchpad.net/~netstack
<https://launchpad.net/%7Enetstack>  
More help   : https://help.launchpad.net/ListHelp 

 

 

