Re: [openstack-dev] [nova] Manage multiple clusters using a single nova service

2014-08-06 Thread Gary Kotton
Hi,
Sorry for taking such a long time to chime in, but these mails were sadly
missed. Please see my inline comments below. My original concerns about
the revert of this support were as follows:

1. What do we do about existing installations? This support was added at
the end of Havana and it is in production.
2. I had concerns regarding the way in which the image cache would be
maintained - that is, each compute node has its own cache directory, which
may lead to datastore issues.

Over the last few weeks I have encountered some serious problems with the
multi VC support. It is causing production setups to break
(https://review.openstack.org/108225 is an example, caused by
https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L3368).
The root cause is that the node may be updated at random places in the
nova manager code (these may be bugs, but they do not play well with the
multi cluster support). There are too many edge cases here and
the code is not robust enough.
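
To illustrate the failure mode, the pattern is roughly the following
(a hypothetical paraphrase, not the actual Nova source - with one service
per cluster the first available node is always the right one, but with
multiple clusters behind one service it may not be):

    # Hypothetical paraphrase of the kind of node update in question;
    # driver and instance stand in for the real objects.
    def claim_node(driver, instance):
        # get_available_nodes() returns one nodename per cluster, so
        # index [0] silently picks the wrong cluster when one service
        # manages several.
        instance.node = driver.get_available_nodes()[0]
        instance.save()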

If we do decide to go ahead with dropping the support, then we need to do
the following:
1. Upgrade path: we need to have a well defined upgrade path that will
enable an existing setup to upgrade from I to J (I do not think that we
should leave this until K, as there are too many pain points with the node
management).
2. We need to make a few tweaks to the image cache path. My original
concern was that each compute node has its own cache directory. After
giving it some thought, this will be OK as long as we have each compute
host using the same cache directory. The reason for this is that the
locking for image handling is done externally, on the file system
(https://github.com/openstack/nova/blob/master/nova/virt/vmwareapi/vmops.py#L319).
So if we have multiple compute processes running on the same host then we
are good. In addition to this we can make use of a shared file system and
then we can have all compute nodes use the shared file system for the
locking - win win :). If anyone gets to this stage in the thread then
please see a fix for object support and aging
(https://review.openstack.org/111996 - the object updates made earlier in
the cycle caused a few problems - but I guess that the gate does not wait
24 hours to purge instances).
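
To make the external-locking point concrete, the pattern looks roughly
like this (a minimal sketch using the oslo-style lockutils API - at the
time the module lived in-tree under nova.openstack.common - and the lock
name and lock_path below are assumptions):

    from oslo_concurrency import lockutils

    # Assumed lock directory; it must be visible to every compute
    # process (same host, or a shared file system across hosts).
    LOCK_PATH = '/var/lib/nova/locks'

    def fetch_image_if_missing(image_id, cache_dir):
        # external=True makes this a file-based lock under LOCK_PATH,
        # so any process on the same file system blocks until the
        # current holder releases it.
        with lockutils.lock('nova-vmware-image-%s' % image_id,
                            external=True, lock_path=LOCK_PATH):
            # check cache_dir and download from glance only if absent
            pass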

In short, I am in favor of removing the multi cluster support, but we need
to do the following:
1. Upgrade path
2. Investigate memory issues with nova compute
3. Tweak image cache path


Thanks
Gary

On 7/15/14, 11:36 AM, "Matthew Booth"  wrote:

>On 14/07/14 09:34, Vaddi, Kiran Kumar wrote:
>> Hi,
>> 
>>  
>> 
>> In the Juno summit, it was discussed that the existing approach of
>> managing multiple VMware Clusters using a single nova compute service is
>> not preferred and the approach of one nova compute service representing
>> one cluster should be looked into.
>> 
>>  
>> 
>> We would like to retain the existing approach (till we have resolved the
>> issues) for the following reasons:
>> 
>>  
>> 
>> 1.   Even though a single service is managing all the clusters,
>> logically it is still one compute per cluster. To the scheduler each
>> cluster is represented as individual computes. Even in the driver each
>> cluster is represented separately.

This is something that would not change with dropping the multi cluster
support.
The only change here is that additional processes will be running (please
see below).
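
For readers joining here, the way one service represents several clusters
to the scheduler is via the driver's node list; roughly (a sketch with an
invented class name, using the driver's real get_available_nodes() hook
and a node-name format modeled on the VMware driver's
domain-cX(ClusterName) convention):

    class VMwareVCDriverSketch(object):
        # Illustrative only; not the actual driver class.

        def __init__(self, cluster_refs):
            # e.g. [('domain-c7', 'ClusterA'), ('domain-c9', 'ClusterB')]
            self._cluster_refs = cluster_refs

        def get_available_nodes(self, refresh=False):
            # Nova creates one compute_node record per nodename returned
            # here, so the scheduler sees each cluster as its own compute.
            return ['%s(%s)' % (moref, name)
                    for moref, name in self._cluster_refs]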

>> 
>>  
>> 
>> 2.   Since ESXi does not allow running the nova-compute service on the
>> hypervisor, unlike KVM, the service has to be run externally on a
>> different server. It's easier from an administration perspective to
>> manage a single service than multiple.

Yes, you have a good point here, but I think that at the end of the day we
need a robust service, and that service will be managed by external tools,
for example Chef or Puppet, unless it is a very small cloud.

>>  
>> 
>> 3.   Every connection to vCenter uses up ~140MB in the driver. If we
>> were to manage each cluster with an individual service, the memory
>> consumed for 32 clusters would be high (~4GB). The newer versions
>> support 64 clusters!

I think that this is a bug and it needs to be fixed. I understand that
this may affect a decision in the short term, but it is not an
architectural issue and can be resolved (and really should be resolved
ASAP). I think that we need to open a bug for this and we should start to
investigate - fixing this will let whoever is running a service use those
resources elsewhere :)
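
As a starting point for that investigation, one could measure the
per-session overhead directly; a rough sketch (the session construction is
commented out because the exact class name and arguments depend on the
driver version):

    import resource

    def rss_mb():
        # ru_maxrss is reported in kilobytes on Linux
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0

    before = rss_mb()
    # session = VMwareAPISession(host_ip, user, password)  # assumed name
    after = rss_mb()
    print('approximate session overhead: %.1f MB' % (after - before))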

>> 
>>  
>> 
>> 4.   There are existing customer installations that use the existing
>> approach, so the new approach should not be enforced until it is simple
>> to manage and not resource intensive.
>> 
>>  
>> 
>> If the admin wants to use one service per cluster, it can be done with
>> the existing driver. In the conf the admin has to specify a single
>> cluster instead of a list of clusters. Therefore it's better to give the
>> admins the choice rather than enforcing one type of deployment.
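
For reference, the conf change described above is small; something like
the following (option names as in the Icehouse-era [vmware] section, from
memory, so treat this as a sketch):

    # Multi-cluster deployment (one service, many clusters):
    #   [vmware]
    #   host_ip = vcenter.example.org
    #   cluster_name = ClusterA
    #   cluster_name = ClusterB
    #
    # One-service-per-cluster deployment (a single entry per service):
    [vmware]
    host_ip = vcenter.example.org
    cluster_name = ClusterA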

Re: [openstack-dev] [nova] Manage multiple clusters using a single nova service

2014-07-23 Thread Dan Smith
>> I just do not support the idea that Nova needs to change its
>> fundamental design in order to support the *design* of other host
>> management platforms.
> 
> The current implementation doesn't make nova change its design; the
> scheduling decisions are still done by nova.

Nova's design is not just "making the scheduling decisions" but also
includes the deployment model, which is intended to be a single compute
service tied to a single hypervisor. I think that's important for scale
and failure isolation at least.

> It's only the deployment that has been changed. Agreed that there are
> no separate topic-exchange queues for each cluster.

I'm definitely with Jay here: I want to get away from hiding larger
systems behind a single compute host/service.

--Dan





Re: [openstack-dev] [nova] Manage multiple clusters using a single nova service

2014-07-23 Thread Vaddi, Kiran Kumar
Answers to some of your concerns

> Why can't ESXi hosts run the nova-compute service? Is it like the
> XenServer driver that has a pitifully old version of Python (2.4) that
> constrains the code that is possible to run on it? If so, then I don't
> really think the poor constraints of the hypervisor dom0 should mean
> that Nova should change its design principles to accommodate. The
> XenServer driver uses custom agents to get around this issue, IIRC. Why
> can't the VCenter driver?

ESXi hosts are generally operated in a lock-down mode where installation of 
agents is not allowed.
All communication and tasks on the ESXi hosts must be done using vCenter.

> The fact that each connection to vCenter uses 140MB of memory is
> completely ridiculous. You can thank crappy SOAP for that, I believe.

Yes, and the problem becomes bigger if we create multiple services.

> I just do not support the idea that Nova needs to
> change its fundamental design in order to support the *design* of other
> host management platforms.

The current implementation doesn't make nova change its design; the
scheduling decisions are still done by nova.
It's only the deployment that has been changed. Agreed that there are no
separate topic-exchange queues for each cluster.

Thanks
Kiran

> -Original Message-
> From: Jay Pipes [mailto:jaypi...@gmail.com]
> Sent: Tuesday, July 22, 2014 9:30 AM
> To: openstack-dev@lists.openstack.org
> Subject: Re: [openstack-dev] [nova] Manage multiple clusters using a single
> nova service
> 
> On 07/14/2014 04:34 AM, Vaddi, Kiran Kumar wrote:
> > Hi,
> >
> > In the Juno summit, it was discussed that the existing approach of
> > managing multiple VMware Clusters using a single nova compute service
> > is not preferred and the approach of one nova compute service
> > representing one cluster should be looked into.
> 
> Even this is outside what I consider to be best practice for Nova,
> frankly. The model of scale-out inside Nova is to have a nova-compute
> worker responsible for only the distinct set of compute resources that
> are provided by a single bare metal node.
> 
> Unfortunately, with the introduction of the bare-metal driver in Nova,
> as well as the "clustered hypervisors" like VCenter and Hyper-V, this
> architectural design point was shot in the head, and now it is only
> possible to scale the nova-compute <-> hypervisor communication layer
> using a scale-up model instead of a scale-out model. This is a big deal,
> and unfortunately, not enough discussion has been had around this, IMO.
> 
> The proposed blueprint(s) around this and the code patches I've seen are
> moving Nova in the opposite direction it needs to go, IMHO.
> 
> > We would like to retain the existing approach (till we have resolved
> >  the issues) for the following reasons:
> >
> > 1. Even though a single service is managing all the clusters,
> > logically it is still one compute per cluster. To the scheduler each
> >  cluster is represented as individual computes. Even in the driver
> > each cluster is represented separately.
> 
> How is this so? In Kanagaraj Manickam's proposed blueprint about this
> [1], the proposed implementation would fork one process for each
> hypervisor or cluster. However, the problem with this is that the
> scheduler uses the single service record for the nova-compute worker to
> determine whether or not the node is available to place resources on.
> The servicegroup API would need to be refactored (rewritten, really) to
> change its definition of a service from a single daemon to a single
> process running within that daemon. Since the daemon only responds on a
> single RPC target endpoint, with direct and topic exchanges for rpc.call,
> all of that code would then need to be rewritten, or code would need to
> be added to nova.manager to dispatch events sent to the nova-compute's
> single RPC topic-exchange to the specific process responsible for a
> particular cluster.
> 
> In short, a huge chunk of code would need to be refactored in order to
> make Nova's worldview amenable to the design choices of certain
> clustered hypervisors. That, IMHO, is not something to be taken lightly,
> and not something we should even consider without a REALLY good reason.
> And the use case of "Openstack is an platform and its good to provide
> flexibility in it to accommodate different needs." is not a really good
> reason, IMO.
> 
> > 2. Since ESXi does not allow running the nova-compute service on the
> > hypervisor, unlike KVM, the service has to be run externally on a
> > different server. It's easier from an administration perspective to
> > manage a single service than multiple.

Re: [openstack-dev] [nova] Manage multiple clusters using a single nova service

2014-07-21 Thread Jay Pipes

On 07/14/2014 04:34 AM, Vaddi, Kiran Kumar wrote:

Hi,

In the Juno summit, it was discussed that the existing approach of
managing multiple VMware Clusters using a single nova compute service
is not preferred and the approach of one nova compute service
representing one cluster should be looked into.


Even this is outside what I consider to be best practice for Nova,
frankly. The model of scale-out inside Nova is to have a nova-compute
worker responsible for only the distinct set of compute resources that
are provided by a single bare metal node.

Unfortunately, with the introduction of the bare-metal driver in Nova,
as well as the "clustered hypervisors" like VCenter and Hyper-V, this
architectural design point was shot in the head, and now it is only
possible to scale the nova-compute <-> hypervisor communication layer
using a scale-up model instead of a scale-out model. This is a big deal,
and unfortunately, not enough discussion has been had around this, IMO.

The proposed blueprint(s) around this and the code patches I've seen are
moving Nova in the opposite direction it needs to go, IMHO.


We would like to retain the existing approach (till we have resolved
 the issues) for the following reasons:

1. Even though a single service is managing all the clusters,
logically it is still one compute per cluster. To the scheduler each
 cluster is represented as individual computes. Even in the driver
each cluster is represented separately.


How is this so? In Kanagaraj Manickam's proposed blueprint about this
[1], the proposed implementation would fork one process for each
hypervisor or cluster. However, the problem with this is that the
scheduler uses the single service record for the nova-compute worker to
determine whether or not the node is available to place resources on.
The servicegroup API would need to be refactored (rewritten, really) to
change its definition of a service from a single daemon to a single
process running within that daemon. Since the daemon only responds on a
single RPC target endpoint, with direct and topic exchanges for rpc.call,
all of that code would then need to be rewritten, or code would need to be
added to nova.manager to dispatch events sent to the nova-compute's single
RPC topic-exchange to the specific process responsible for a particular
cluster.
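
To make the single-endpoint constraint concrete, here is roughly how a
compute service's RPC server is wired up (a sketch in terms of the public
oslo.messaging API rather than the actual Nova plumbing; the endpoint
class and hostname are invented):

    from oslo_config import cfg
    import oslo_messaging as messaging

    class ComputeEndpoint(object):
        # Stand-in for the compute manager's RPC-exposed methods.
        def build_and_run_instance(self, ctxt, **kwargs):
            pass

    transport = messaging.get_transport(cfg.CONF)

    # One Target per daemon: the scheduler addresses a compute node as
    # (topic='compute', server=<host>). A daemon fronting N clusters
    # still has exactly one such endpoint, so per-cluster dispatch has
    # to be bolted on inside the process.
    target = messaging.Target(topic='compute', server='compute-host-1')
    server = messaging.get_rpc_server(transport, target,
                                      endpoints=[ComputeEndpoint()],
                                      executor='blocking')
    server.start()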

In short, a huge chunk of code would need to be refactored in order to
make Nova's worldview amenable to the design choices of certain
clustered hypervisors. That, IMHO, is not something to be taken lightly,
and not something we should even consider without a REALLY good reason.
And the use case of "Openstack is an platform and its good to provide
flexibility in it to accommodate different needs." is not a really good
reason, IMO.


2. Since ESXi does not allow running the nova-compute service on the
hypervisor, unlike KVM, the service has to be run externally on a
different server. It's easier from an administration perspective to
manage a single service than multiple.


Why can't ESXi hosts not run the nova-compute service? Is it like the
XenServer driver that has a pitifully old version of Python (2.4) that
constrains the code that is possible to run on it? If so, then I don't
really think the poor constraints of the hypervisor dom0 should mean
that Nova should change its design principles to accomodate. The
XenServer driver uses custom agents to get around this issue, IIRC. Why
can't the VCenter driver?


3. Every connection to vCenter uses up ~140MB in the driver. If we
were to manage each cluster with an individual service, the memory
consumed for 32 clusters would be high (~4GB). The newer versions
support 64 clusters!


The fact that each connection to vCenter uses 140MB of memory is
completely ridiculous. You can thank crappy SOAP for that, I believe.

That said, Nova should not be changing its design principles to
accommodate poor software of a driver.

It raises questions on why exactly folks are even using OpenStack at all
if they want to continue to use VCenter for host management, DRS, DPM,
and the like.

What advantage are they getting from OpenStack?

If the idea is to move off of expensive VCenter-licensed clusters and on
to a pure OpenStack infrastructure then, I don't see a point in
supporting *more* clustered hypervisor features in the driver code at
all. If the idea is to just "use what we know, don't rock the enterprise
IT boat", then why use OpenStack at all?

Look, I'm all for compatibility and transferability of different image
formats, different underlying hypervisors, and the dream of
interoperable clouds. I'm happy to see Nova support a wide variety of
disk image formats and hypervisor features (note: VCenter isn't a
hypervisor). I just do not support the idea that Nova needs to
change its fundamental design in order to support the *design* of other
host management platforms.

Best,
-jay

[1] https://review.openstack.org/#/c/103054/


Re: [openstack-dev] [nova] Manage multiple clusters using a single nova service

2014-07-15 Thread Matthew Booth
On 14/07/14 09:34, Vaddi, Kiran Kumar wrote:
> Hi,
> 
>  
> 
> In the Juno summit, it was discussed that the existing approach of
> managing multiple VMware Clusters using a single nova compute service is
> not preferred and the approach of one nova compute service representing
> one cluster should be looked into.
> 
>  
> 
> We would like to retain the existing approach (till we have resolved the
> issues) for the following reasons:
> 
>  
> 
> 1.   Even though a single service is managing all the clusters,
> logically it is still one compute per cluster. To the scheduler each
> cluster is represented as individual computes. Even in the driver each
> cluster is represented separately.
> 
>  
> 
> 2.   Since ESXi does not allow running the nova-compute service on the
> hypervisor, unlike KVM, the service has to be run externally on a
> different server. It's easier from an administration perspective to manage
> a single service than multiple.
> 
>  
> 
> 3.   Every connection to vCenter uses up ~140MB in the driver. If we
> were to manage each cluster with an individual service, the memory
> consumed for 32 clusters would be high (~4GB). The newer versions support
> 64 clusters!
> 
>  
> 
> 4.   There are existing customer installations that use the existing
> approach, so the new approach should not be enforced until it is simple
> to manage and not resource intensive.
> 
>  
> 
> If the admin wants to use one service per cluster, it can be done with
> the existing driver. In the conf the admin has to specify a single
> cluster instead of a list of clusters. Therefore it's better to give the
> admins the choice rather than enforcing one type of deployment.

Does anybody recall the details of why we wanted to remove this? There
was unease over the use of the instance's node field in the db, but I
don't recall why.

Matt

-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
