Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization Questions

2015-03-17 Thread Kekane, Abhishek
Hi John,

Thanks for your opinion.

Fundamentally we cannot assume infinite storage space.

To improve shelve/unshelve performance, I have proposed a nova spec [1], which 
faces two challenges.

A. The design is libvirt-specific. I am currently using the KVM hypervisor, but 
I am open to making changes for other hypervisors.
  I don't have the know-how for other hypervisors (how to configure them, 
etc.), so any help from the community is appreciated.

B. HostAggregateGroupFilter [2] (rescheduling the instance): a filter to 
schedule the instance on a different node if shared storage is full or 
resources are not available.
 Please let me know your opinion about this HostAggregateGroupFilter.
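
As a rough illustration of the rescheduling idea in B, the sketch below drops
hosts that lack cpu, memory or shared-storage space and prefers hosts in the
aggregate where the instance was shelved. It is a standalone sketch, not the
code under review in [2]: HostState, InstanceRequest, host_passes, pick_host
and all of their fields are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class HostState:
    # Hypothetical view of one compute host as seen by the scheduler.
    host: str
    aggregate: str
    free_ram_mb: int
    free_vcpus: int
    shared_storage_free_gb: int


@dataclass
class InstanceRequest:
    # Hypothetical resource request for the instance being unshelved.
    ram_mb: int
    vcpus: int
    root_gb: int
    preferred_aggregate: str


def host_passes(host, request):
    """Return True if the host can hold the instance."""
    return (host.free_ram_mb >= request.ram_mb
            and host.free_vcpus >= request.vcpus
            and host.shared_storage_free_gb >= request.root_gb)


def pick_host(hosts, request):
    """Pick a passing host, preferring the instance's original aggregate."""
    candidates = [h for h in hosts if host_passes(h, request)]
    # False sorts before True, so hosts in the preferred aggregate come first.
    candidates.sort(key=lambda h: h.aggregate != request.preferred_aggregate)
    return candidates[0] if candidates else None
```

If no host in the preferred aggregate passes, the instance falls through to
any other host with enough resources, and to None when nothing fits.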

I request community members to go through the nova spec [1] and the patches 
submitted [3] for it, and to give us your feedback.

[1] https://review.openstack.org/135387
[2] https://review.openstack.org/150330
[3] https://review.openstack.org/150315, https://review.openstack.org/150337, 
https://review.openstack.org/150344

Thank You,

Abhishek Kekane

-----Original Message-----
From: John Garbutt [mailto:j...@johngarbutt.com] 
Sent: 12 March 2015 17:41
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization 
Questions

Hi,

On 11 March 2015 at 06:35, Kekane, Abhishek abhishek.kek...@nttdata.com wrote:
 With the start/stop APIs, cpu/memory are not released/reassigned. We 
 can modify these APIs to release the cpu and memory while stopping 
 the instance and reassign them while starting the instance. In 
 this case the rescheduling logic also needs to be modified to reschedule 
 the instance on a different host if the required resources are not 
 available while starting the instance. This is similar to what I have 
 implemented in [2], Improving the performance of unshelve API.

I am against start releasing the resources, as you can't guarantee start will 
work quickly. Similar to suspend I suppose.

The idea of shelve/unshelve is to avoid that problem, by ensuring you can 
resume the VM anywhere, should someone else use the resources you have freed 
up. But the idea was to optimize for a quick unshelve, where possible. The 
feature is not really complete; we need a scheduling weigher to deal with 
avoiding that capacity till you need it, etc. When you have shared storage, it 
maybe makes sense to add the option of skipping the snapshot (boot from volume 
clearly doesn't need a snapshot), if you are happy to assume there will always 
be space on some host that can see that shared storage.

 Please let me know your opinion on whether we can modify the start/stop 
 APIs as an alternative to the shelve/unshelve APIs.

I would rather we enhance shelve/unshelve, rather than fundamentally change the 
semantics of start/stop.

Thanks,
John


 From: Kekane, Abhishek [mailto:abhishek.kek...@nttdata.com]
 Sent: 24 February 2015 12:47

 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance 
 Optimization Questions



 Hi Duncan,



 Thank you for the inputs.



 @Community-Members

 I want to know if there are any other alternatives to improve the 
 performance of the unshelve API (booted from image only).

 Please give me your opinion on the same.



 Thank You,



 Abhishek Kekane







 From: Duncan Thomas [mailto:duncan.tho...@gmail.com]
 Sent: 16 February 2015 16:46
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance 
 Optimization Questions



 There has been some talk in cinder meetings about making 
 cinder-glance interactions more efficient. They are already 
 optimised in some deployments, e.g. ceph glance and ceph cinder, and 
 some backends cache glance images so that creating many volumes from the 
 same image becomes very efficient.
 (search the meeting logs or channel logs for 'public snapshot' to get 
 some entry points into the discussions)

 I'd like to see more work done on this, and perhaps re-examine a 
 cinder backend to glance. This would give some of what you're 
 suggesting (particularly fast, low traffic un-shelve), and there is 
 more that can be done to improve that performance, particularly if we 
 can find a better performing generic CoW technology than QCOW2.

 As suggested in the review, in the short term you might be better 
 experimenting with moving to boot-from-volume instances if you have a 
 suitable cinder deployed, since that gives you some of the performance 
 improvements already.



 On 16 February 2015 at 12:10, Kekane, Abhishek 
 abhishek.kek...@nttdata.com
 wrote:

 Hi Devs,



 Problem Statement: Performance and storage efficiency of 
 shelving/unshelving instance booted from image is far worse than instance 
 booted from volume.



 When you unshelve hundreds of instances at the same time, instance 
 spawning time varies and it mainly

Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization Questions

2015-03-12 Thread John Garbutt
Hi,

On 11 March 2015 at 06:35, Kekane, Abhishek abhishek.kek...@nttdata.com wrote:
 With the start/stop APIs, cpu/memory are not released/reassigned. We can
 modify these APIs to release the cpu and memory while stopping the instance
 and reassign them while starting the instance. In this case the rescheduling
 logic also needs to be modified to reschedule the instance on a different
 host if the required resources are not available while starting the
 instance. This is similar to what I have implemented in [2], Improving the
 performance of unshelve API.

I am against start releasing the resources, as you can't guarantee
start will work quickly. Similar to suspend I suppose.

The idea of shelve/unshelve is to avoid that problem, by ensuring you
can resume the VM anywhere, should someone else use the resources you
have freed up. But the idea was to optimize for a quick unshelve,
where possible. The feature is not really complete; we need a
scheduling weigher to deal with avoiding that capacity till you need
it, etc. When you have shared storage, it maybe makes sense to add the
option of skipping the snapshot (boot from volume clearly doesn't need
a snapshot), if you are happy to assume there will always be space on
some host that can see that shared storage.

 Please let me know your opinion on whether we can modify the start/stop
 APIs as an alternative to the shelve/unshelve APIs.

I would rather we enhance shelve/unshelve, rather than fundamentally
change the semantics of start/stop.

Thanks,
John


 From: Kekane, Abhishek [mailto:abhishek.kek...@nttdata.com]
 Sent: 24 February 2015 12:47

 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance
 Optimization Questions



 Hi Duncan,



 Thank you for the inputs.



 @Community-Members

 I want to know if there are any other alternatives to improve the
 performance of the unshelve API (booted from image only).

 Please give me your opinion on the same.



 Thank You,



 Abhishek Kekane







 From: Duncan Thomas [mailto:duncan.tho...@gmail.com]
 Sent: 16 February 2015 16:46
 To: OpenStack Development Mailing List (not for usage questions)
 Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance
 Optimization Questions



 There has been some talk in cinder meetings about making cinder-glance
 interactions more efficient. They are already optimised in some deployments,
 e.g. ceph glance and ceph cinder, and some backends cache glance images so
 that creating many volumes from the same image becomes very efficient.
 (search the meeting logs or channel logs for 'public snapshot' to get some
 entry points into the discussions)

 I'd like to see more work done on this, and perhaps re-examine a cinder
 backend to glance. This would give some of what you're suggesting
 (particularly fast, low traffic un-shelve), and there is more that can be
 done to improve that performance, particularly if we can find a better
 performing generic CoW technology than QCOW2.

 As suggested in the review, in the short term you might be better
 experimenting with moving to boot-from-volume instances if you have a
 suitable cinder deployed, since that gives you some of the performance
 improvements already.



 On 16 February 2015 at 12:10, Kekane, Abhishek abhishek.kek...@nttdata.com
 wrote:

 Hi Devs,



 Problem Statement: Performance and storage efficiency of shelving/unshelving
 an instance booted from an image are far worse than for an instance booted
 from a volume.

 When you unshelve hundreds of instances at the same time, instance spawning
 time varies and it mainly depends on the size of the instance snapshot and
 the network speed between glance and nova servers.

 If you have configured a file store (shared storage) as a backend in Glance
 for storing images/snapshots, then it's possible to improve the performance
 of unshelving an instance dramatically by configuring
 nova.image.download.FileTransfer in nova. In this case, it simply copies the
 instance snapshot as if it were stored on the local filesystem of the compute
 node. But then again, in this case we observed that the network traffic
 between the shared storage servers and nova increases enormously, resulting
 in slow spawning of the instances.

 I would like to gather some thoughts about how we can improve the
 performance of the unshelve API (booted from image only) in terms of
 downloading large instance snapshots from glance.

 I have proposed a nova spec [1] to address this performance issue. Please
 take a look at it.

 During the last nova mid-cycle summit, Michael Still suggested
 alternative solutions to tackle this issue.

 Storage solutions like ceph (software-based) and NetApp (hardware-based)
 support exposing images from glance to nova-compute and cinder-volume with a
 copy-on-write feature. This way there is no need to download the
 instance snapshot, and the unshelve API will be much faster than getting it
 from glance.



 Do

Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization Questions

2015-03-11 Thread Kekane, Abhishek
Hi Devs,

As another alternative, we can use the start/stop APIs instead of 
shelving/unshelving the instance.
The following compares start/stop and shelve/unshelve in terms of 
cpu/memory/disk released and fast respawning.

APIs: start/stop
cpu/memory released: No
Disk released: No
Fast respawning: Yes

APIs: shelve/unshelve
cpu/memory released: Yes
Disk released: Yes (Not released if shelved_offload_time = -1)
Fast respawning: No (if instance is booted from image)
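
The effect of shelved_offload_time on "Disk released" can be sketched as
below. Only the -1 case is stated in this thread; treating 0 as "offload
immediately" and a positive value as a delay in seconds is my understanding of
the nova option, and should_offload is a hypothetical helper name, not a nova
function.

```python
def should_offload(shelved_offload_time, seconds_since_shelved):
    """Decide whether a shelved instance's disk would be released.

    shelved_offload_time: value of the nova option of the same name.
    seconds_since_shelved: how long the instance has been shelved.
    """
    if shelved_offload_time == -1:
        # Never offload: the root disk stays on the compute node.
        return False
    # 0 means offload right away; a positive value is a delay in seconds
    # (assumption about the option's semantics, see the note above).
    return seconds_since_shelved >= shelved_offload_time
```

With -1 the disk is never released, which is the case where unshelve can
reuse the local root disk instead of downloading a snapshot.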


To make unshelve fast enough, we need to preserve the instance root disk on 
the compute node, which I have proposed in the shelve-partial-offload spec [1].

With the start/stop APIs, cpu/memory are not released/reassigned. We can 
modify these APIs to release the cpu and memory while stopping the instance 
and reassign them while starting the instance. In this case the rescheduling 
logic also needs to be modified to reschedule the instance on a different 
host if the required resources are not available while starting the instance. 
This is similar to what I have implemented in [2], Improving the performance 
of unshelve API.

[1] https://review.openstack.org/#/c/135387/
[2] https://review.openstack.org/#/c/150344/

Please let me know your opinion on whether we can modify the start/stop APIs 
as an alternative to the shelve/unshelve APIs.

Thank You,

Abhishek Kekane


From: Kekane, Abhishek [mailto:abhishek.kek...@nttdata.com]
Sent: 24 February 2015 12:47
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization 
Questions

Hi Duncan,

Thank you for the inputs.

@Community-Members
I want to know if there are any other alternatives to improve the performance 
of the unshelve API (booted from image only).
Please give me your opinion on the same.

Thank You,

Abhishek Kekane



From: Duncan Thomas [mailto:duncan.tho...@gmail.com]
Sent: 16 February 2015 16:46
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization 
Questions

There has been some talk in cinder meetings about making cinder-glance 
interactions more efficient. They are already optimised in some deployments, 
e.g. ceph glance and ceph cinder, and some backends cache glance images so that 
creating many volumes from the same image becomes very efficient. (search the 
meeting logs or channel logs for 'public snapshot' to get some entry points 
into the discussions)

I'd like to see more work done on this, and perhaps re-examine a cinder backend 
to glance. This would give some of what you're suggesting (particularly fast, 
low traffic un-shelve), and there is more that can be done to improve that 
performance, particularly if we can find a better performing generic CoW 
technology than QCOW2.
As suggested in the review, in the short term you might be better experimenting 
with moving to boot-from-volume instances if you have a suitable cinder 
deployed, since that gives you some of the performance improvements already.

On 16 February 2015 at 12:10, Kekane, Abhishek 
abhishek.kek...@nttdata.com wrote:
Hi Devs,

Problem Statement: Performance and storage efficiency of shelving/unshelving 
instance booted from image is far worse than instance booted from volume.

When you unshelve hundreds of instances at the same time, instance spawning 
time varies and it mainly depends on the size of the instance snapshot and
the network speed between glance and nova servers.

If you have configured file store (shared storage) as a backend in Glance for 
storing images/snapshots, then it's possible to improve the performance of
unshelve instance dramatically by configuring nova.image.download.FileTransfer 
in nova. In this case, it simply copies the instance snapshot as if it is
stored on the local filesystem of the compute node. But then again in this 
case, it is observed the network traffic between shared storage servers and
nova increases enormously resulting in slow spawning of the instances.

I would like to gather some thoughts about how we can improve the performance 
of unshelve api (booted from image only) in terms of downloading large
size instance snapshots from glance.

I have proposed a nova-specs [1] to address this performance issue. Please take 
a look at it.

During the last nova mid-cycle summit, Michael Still 
<https://review.openstack.org/#/q/owner:mikal%2540stillhq.com+status:open,n,z>
suggested alternative solutions to tackle this issue.

Storage solutions like ceph (software-based) and NetApp (hardware-based) support 
exposing images from glance to nova-compute and cinder-volume with a
copy-on-write feature. This way there is no need to download the instance 
snapshot, and the unshelve API will be much faster than getting it
from glance.

Do you think the above performance issue should be handled in the OpenStack 
software as described in nova-specs [1] or storage solutions like

Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization Questions

2015-03-10 Thread Kekane, Abhishek
Hi Devs,

As another alternative, we can use the start/stop APIs instead of 
shelving/unshelving the instance.

APIs: start/stop
cpu/memory released: No
Disk released: No
Fast respawning: Yes

APIs: shelve/unshelve
cpu/memory released: Yes
Disk released: Yes (Not released if shelved_offload_time = -1)
Fast respawning: No
Notes: Instance does not respawn faster if the instance is booted from image


To make unshelve fast enough, we need to preserve the instance root disk on 
the compute node, which I have proposed in the shelve-partial-offload spec [1].

With the start/stop APIs, cpu/memory are not released/reassigned. We can 
modify these APIs to release the cpu and memory while stopping the instance 
and reassign them while starting the instance. In this case the rescheduling 
logic also needs to be modified to reschedule the instance on a different 
host if the required resources are not available while starting the instance. 
This is similar to what I have implemented in [2], Improving the performance 
of unshelve API.

[1] https://review.openstack.org/#/c/135387/
[2] https://review.openstack.org/#/c/150344/

Please let me know your opinion on whether we can modify the start/stop APIs 
as an alternative to the shelve/unshelve APIs.

Thank You,

Abhishek Kekane

From: Kekane, Abhishek [mailto:abhishek.kek...@nttdata.com]
Sent: 24 February 2015 12:47
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization 
Questions

Hi Duncan,

Thank you for the inputs.

@Community-Members
I want to know if there are any other alternatives to improve the performance 
of the unshelve API (booted from image only).
Please give me your opinion on the same.

Thank You,

Abhishek Kekane



From: Duncan Thomas [mailto:duncan.tho...@gmail.com]
Sent: 16 February 2015 16:46
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization 
Questions

There has been some talk in cinder meetings about making cinder-glance 
interactions more efficient. They are already optimised in some deployments, 
e.g. ceph glance and ceph cinder, and some backends cache glance images so that 
creating many volumes from the same image becomes very efficient. (search the 
meeting logs or channel logs for 'public snapshot' to get some entry points 
into the discussions)

I'd like to see more work done on this, and perhaps re-examine a cinder backend 
to glance. This would give some of what you're suggesting (particularly fast, 
low traffic un-shelve), and there is more that can be done to improve that 
performance, particularly if we can find a better performing generic CoW 
technology than QCOW2.
As suggested in the review, in the short term you might be better experimenting 
with moving to boot-from-volume instances if you have a suitable cinder 
deployed, since that gives you some of the performance improvements already.

On 16 February 2015 at 12:10, Kekane, Abhishek 
abhishek.kek...@nttdata.com wrote:
Hi Devs,

Problem Statement: Performance and storage efficiency of shelving/unshelving 
instance booted from image is far worse than instance booted from volume.

When you unshelve hundreds of instances at the same time, instance spawning 
time varies and it mainly depends on the size of the instance snapshot and
the network speed between glance and nova servers.

If you have configured file store (shared storage) as a backend in Glance for 
storing images/snapshots, then it's possible to improve the performance of
unshelve instance dramatically by configuring nova.image.download.FileTransfer 
in nova. In this case, it simply copies the instance snapshot as if it is
stored on the local filesystem of the compute node. But then again in this 
case, it is observed the network traffic between shared storage servers and
nova increases enormously resulting in slow spawning of the instances.

I would like to gather some thoughts about how we can improve the performance 
of unshelve api (booted from image only) in terms of downloading large
size instance snapshots from glance.

I have proposed a nova-specs [1] to address this performance issue. Please take 
a look at it.

During the last nova mid-cycle summit, Michael Still 
<https://review.openstack.org/#/q/owner:mikal%2540stillhq.com+status:open,n,z>
suggested alternative solutions to tackle this issue.

Storage solutions like ceph (software-based) and NetApp (hardware-based) support 
exposing images from glance to nova-compute and cinder-volume with a
copy-on-write feature. This way there is no need to download the instance 
snapshot, and the unshelve API will be much faster than getting it
from glance.

Do you think the above performance issue should be handled in the OpenStack 
software as described in nova-specs [1] or storage solutions like
ceph/NetApp should be used in production environment? Apart from ceph/NetApp 
solutions, are there any other options available

[openstack-dev] [nova] Unshelve Instance Performance Optimization Questions

2015-02-16 Thread Kekane, Abhishek
Hi Devs,

Problem Statement: Performance and storage efficiency of shelving/unshelving 
an instance booted from an image are far worse than for an instance booted 
from a volume.

When you unshelve hundreds of instances at the same time, instance spawning 
time varies and it mainly depends on the size of the instance snapshot and
the network speed between glance and nova servers.
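
To make the dependence on snapshot size and network speed concrete, here is a
back-of-the-envelope estimate. It assumes the glance-to-nova link is shared
equally by all concurrent downloads and ignores disk and API overhead; the
function name and the example figures (20 GB snapshot, 10 Gbit/s link) are
illustrative assumptions, not measurements from this thread.

```python
def estimated_download_seconds(snapshot_gb, link_gbps, concurrent):
    """Estimate seconds to download one snapshot over a shared link.

    snapshot_gb: snapshot size in gigabytes (10**9 bytes).
    link_gbps: total link capacity in gigabits per second.
    concurrent: number of instances downloading at the same time.
    """
    snapshot_bits = snapshot_gb * 8 * 10**9
    # Fair-share assumption: each download gets an equal slice of the link.
    per_instance_bps = link_gbps * 10**9 / concurrent
    return snapshot_bits / per_instance_bps


print(estimated_download_seconds(20, 10, 1))    # 16.0
print(estimated_download_seconds(20, 10, 200))  # 3200.0
```

Under these assumptions a single unshelve takes seconds, while 200
simultaneous unshelves of the same size take close to an hour each.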

If you have configured a file store (shared storage) as a backend in Glance for 
storing images/snapshots, then it's possible to improve the performance of
unshelving an instance dramatically by configuring 
nova.image.download.FileTransfer in nova. In this case, it simply copies the 
instance snapshot as if it were stored on the local filesystem of the compute 
node. But then again, in this case we observed that the network traffic between 
the shared storage servers and nova increases enormously, resulting in slow 
spawning of the instances.

I would like to gather some thoughts about how we can improve the performance 
of the unshelve API (booted from image only) in terms of downloading large 
instance snapshots from glance.

I have proposed a nova spec [1] to address this performance issue. Please take 
a look at it.

During the last nova mid-cycle summit, Michael Still 
<https://review.openstack.org/#/q/owner:mikal%2540stillhq.com+status:open,n,z>
suggested alternative solutions to tackle this issue.

Storage solutions like ceph (software-based) and NetApp (hardware-based) support 
exposing images from glance to nova-compute and cinder-volume with a
copy-on-write feature. This way there is no need to download the instance 
snapshot, and the unshelve API will be much faster than getting it
from glance.
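
The reason copy-on-write avoids the download can be shown with a toy model:
each instance disk reads from a shared base image and stores only the blocks
it has written. This is a conceptual sketch only, not how ceph or NetApp
implement it; CowDisk and its methods are hypothetical names.

```python
class CowDisk:
    """Toy copy-on-write disk layered over a shared, read-only base image."""

    def __init__(self, base_blocks):
        self._base = base_blocks   # shared between disks, never modified
        self._overlay = {}         # block index -> privately written data

    def read(self, index):
        # Prefer the private copy; fall back to the shared base block.
        return self._overlay.get(index, self._base[index])

    def write(self, index, data):
        # Writes land in the overlay, leaving the base image untouched.
        self._overlay[index] = data

    def private_blocks(self):
        return len(self._overlay)
```

Creating a new disk copies nothing up front; storage and traffic grow only
with the blocks an instance actually changes.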

Do you think the above performance issue should be handled in the OpenStack 
software as described in the nova spec [1], or should storage solutions like
ceph/NetApp be used in production environments? Apart from the ceph/NetApp 
solutions, are there any other options available in the market?

[1] https://review.openstack.org/#/c/135387/

Thank You,

Abhishek Kekane

__
Disclaimer: This email and any attachments are sent in strictest confidence
for the sole use of the addressee and may contain legally privileged,
confidential, and proprietary data. If you are not the intended recipient,
please advise the sender by replying promptly to this email and then delete
and destroy this email and any attachments without any further use, copying
or forwarding.__
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


Re: [openstack-dev] [nova] Unshelve Instance Performance Optimization Questions

2015-02-16 Thread Duncan Thomas
There has been some talk in cinder meetings about making cinder-glance
interactions more efficient. They are already optimised in some
deployments, e.g. ceph glance and ceph cinder, and some backends cache
glance images so that creating many volumes from the same image becomes very
efficient. (Search the meeting logs or channel logs for 'public snapshot'
to get some entry points into the discussions.)

I'd like to see more work done on this, and perhaps re-examine a cinder
backend to glance. This would give some of what you're suggesting
(particularly fast, low traffic un-shelve), and there is more that can be
done to improve that performance, particularly if we can find a better
performing generic CoW technology than QCOW2.

As suggested in the review, in the short term you might be better
experimenting with moving to boot-from-volume instances if you have a
suitable cinder deployed, since that gives you some of the performance
improvements already.

On 16 February 2015 at 12:10, Kekane, Abhishek abhishek.kek...@nttdata.com
wrote:

  Hi Devs,



 Problem Statement: Performance and storage efficiency of
 shelving/unshelving an instance booted from an image are far worse than for
 an instance booted from a volume.

 When you unshelve hundreds of instances at the same time, instance
 spawning time varies and it mainly depends on the size of the instance
 snapshot and the network speed between glance and nova servers.

 If you have configured a file store (shared storage) as a backend in Glance
 for storing images/snapshots, then it's possible to improve the performance
 of unshelving an instance dramatically by configuring
 nova.image.download.FileTransfer in nova. In this case, it simply copies
 the instance snapshot as if it were stored on the local filesystem of the
 compute node. But then again, in this case we observed that the network
 traffic between the shared storage servers and nova increases enormously,
 resulting in slow spawning of the instances.

 I would like to gather some thoughts about how we can improve the
 performance of the unshelve API (booted from image only) in terms of
 downloading large instance snapshots from glance.

 I have proposed a nova spec [1] to address this performance issue. Please
 take a look at it.

 During the last nova mid-cycle summit, Michael Still
 <https://review.openstack.org/#/q/owner:mikal%2540stillhq.com+status:open,n,z>
 suggested alternative solutions to tackle this issue.

 Storage solutions like ceph (software-based) and NetApp (hardware-based)
 support exposing images from glance to nova-compute and cinder-volume with a
 copy-on-write feature. This way there is no need to download the
 instance snapshot, and the unshelve API will be much faster than getting it
 from glance.

 Do you think the above performance issue should be handled in the
 OpenStack software as described in the nova spec [1], or should storage
 solutions like ceph/NetApp be used in production environments? Apart from
 the ceph/NetApp solutions, are there any other options available in the
 market?



 [1] https://review.openstack.org/#/c/135387/



 Thank You,



 Abhishek Kekane





-- 
Duncan Thomas