Re: ACS and KVM uses /tmp for volumes migration and templates

2014-03-24 Thread Wido den Hollander

On 03/23/2014 08:01 PM, Andrei Mikhailovsky wrote:

Wido,

Could you please let me know when you've done this so I could try it out. Would 
it be a part of the 4.3 branch or 4.4?



I'll do that. It will go into master, which is 4.4. I'm not sure whether
it will be backported to 4.3.1.


Wido


Re: ACS and KVM uses /tmp for volumes migration and templates

2014-03-24 Thread Andrei Mikhailovsky

Do you think I can apply the patch manually to the 4.3 branch? I would love to 
try it with 4.3, but I'm not adventurous enough to upgrade my setup to 4.4 yet )) 


Andrei 

Re: ACS and KVM uses /tmp for volumes migration and templates

2014-03-24 Thread Wido den Hollander

On 03/24/2014 03:22 PM, Andrei Mikhailovsky wrote:


Do you think I can apply the patch manually to the 4.3 branch? I would love to 
try it with 4.3, but I'm not adventurous enough to upgrade my setup to 4.4 yet ))



Yes! I just pushed a commit to the master branch: 
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=commit;h=9763faf85e3f54ac84d5ca1d5ad6e89c7fcc87ee


To build 4.3 with this commit applied:

$ git checkout 4.3
$ git cherry-pick 9763faf85e3f54ac84d5ca1d5ad6e89c7fcc87ee
$ dpkg-buildpackage

Now you only have to update the cloudstack-agent package on the hypervisors.

Wido
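
The backport steps above could be scripted roughly as follows. This is only a sketch: the backport_steps and run_backport helpers and the dry-run behaviour are illustrative and not part of CloudStack. The branch name and commit hash are the ones quoted in the message, and the script assumes it is run from a CloudStack git checkout on a Debian/Ubuntu build host.

```python
import subprocess

# Commit on master that removes the /tmp step (hash taken from the message above).
FIX_COMMIT = "9763faf85e3f54ac84d5ca1d5ad6e89c7fcc87ee"


def backport_steps(branch="4.3", commit=FIX_COMMIT):
    """Return the commands from the message as argv lists (hypothetical helper)."""
    return [
        ["git", "checkout", branch],
        ["git", "cherry-pick", commit],
        ["dpkg-buildpackage"],
    ]


def run_backport(dry_run=True):
    """Print each step; only execute it when dry_run is False."""
    for argv in backport_steps():
        print("$ " + " ".join(argv))
        if not dry_run:
            # check=True stops at the first failing step, e.g. a cherry-pick conflict.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_backport(dry_run=True)
```

Running it with dry_run=True only prints the three commands, which makes it easy to review them before building packages for the hypervisors.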



Re: ACS and KVM uses /tmp for volumes migration and templates

2014-03-24 Thread Andrei Mikhailovsky
Wido, 


Thanks, I will give it a try when I have a moment. 


Andrei 

Re: ACS and KVM uses /tmp for volumes migration and templates

2014-03-23 Thread Wido den Hollander



On 03/21/2014 02:23 PM, Andrei Mikhailovsky wrote:


Wido,


I would be happy to try the custom ACS build unless 4.3 comes out soon. It has 
been overdue for some time now )). Has this feature been addressed in the 4.3 
release?



No, it hasn't been fixed yet. I have to admit, I forgot about this until 
you sent this e-mail to the list.


I'll fix this in master later this week.



I can live with this feature for the time being, but I do see a longer-term 
issue when my volumes become large, as I've only got about 100 GB of free space on 
my host servers.




I fully agree. While writing this code I was aware of this. See my 
comments in the code: 
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=blob;f=plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java;h=5de8bd26ae201187f5db5fd16b7e3ca157cab53a;hb=master#l1087



From what I can tell from the rbd ls -l output, all of my volumes are 
in format 2.



Correct, because I bypass libvirt and QEMU in some places right now.



Cheers,


Andrei





Re: ACS and KVM uses /tmp for volumes migration and templates

2014-03-23 Thread Andrei Mikhailovsky
Wido, 

Could you please let me know when you've done this so I could try it out. Would 
it be a part of the 4.3 branch or 4.4? 

Thanks 

Re: ACS and KVM uses /tmp for volumes migration and templates

2014-03-21 Thread Andrei Mikhailovsky

Wido, 


I would be happy to try the custom ACS build unless 4.3 comes out soon. It has 
been overdue for some time now )). Has this feature been addressed in the 4.3 
release? 


I can live with this feature for the time being, but I do see a longer-term 
issue when my volumes become large, as I've only got about 100 GB of free space on 
my host servers. 


From what I can tell from the rbd ls -l output, all of my volumes are 
in format 2. 
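
One way to check the format of a single image is to look for the "format:" line printed by rbd info for that image. The sketch below parses such output. The rbd_format helper and the sample text are made up for illustration, and the exact layout of rbd info output may differ between Ceph releases.

```python
import re


def rbd_format(info_text):
    """Return the image format (1 or 2) found in `rbd info` output, or None.

    Hypothetical helper: it only looks for a line of the form "format: N".
    """
    match = re.search(r"^\s*format:\s*(\d+)\s*$", info_text, re.MULTILINE)
    return int(match.group(1)) if match else None


# Illustrative sample only, not output captured from a real cluster.
SAMPLE = """rbd image 'myimage':
\tsize 10240 MB in 2560 objects
\tformat: 2
\tfeatures: layering
"""

if __name__ == "__main__":
    print(rbd_format(SAMPLE))
```

Returning None when no format line is found keeps "unknown" separate from "format 1", so a caller does not mistake unparsable output for a legacy image.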


Cheers, 


Andrei 





Re: ACS and KVM uses /tmp for volumes migration and templates

2014-03-20 Thread Wido den Hollander

On 03/20/2014 12:59 AM, Andrei Mikhailovsky wrote:

Hi guys,

I was wondering if this is a bug?



No, it's a feature.


I've noticed that during volume migration from NFS to RBD primary storage the 
volume image is first copied to /tmp and only then to the RBD storage. This 
seems silly to me as one would expect a typical volume to be larger than the 
host's hard disk. Also, it is common practice to use tmpfs as /tmp for 
performance reasons. Thus, a typical host server will have a far smaller /tmp 
than the size of an average volume. As a result, volume migration would 
break after filling /tmp and could probably cause a bunch of issues for the 
KVM host itself as well as any VMs running on the server.



Correct. The problem was that RBD images come in two formats: format 1 
(old/legacy) and format 2.


To perform cloning, images must be in RBD format 2.

When running qemu-img convert with an RBD image as the destination, qemu-img 
creates an RBD image in format 1.


That's due to this piece of code in block/rbd.c in QEMU:

ret = rbd_create(io_ctx, name, bytes, obj_order);

rbd_create() creates images in format 1. To use format 2 you should use 
rbd_create2() or rbd_create3().


With RBD format 1 we can't clone images from snapshots, which we require 
in ACS.


So I had to add an intermediate step where I first wrote the RAW image 
somewhere and afterwards wrote it to RBD.


After some discussion, a config option was added to Ceph:

OPTION(rbd_default_format, OPT_INT, 1)

This allows me to do this:

qemu-img convert .. -O raw .. rbd:rbd/myimage:rbd_default_format=2

This causes librbd/RBD to create a format 2 image and we can skip the 
convert step to /tmp.
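
A small sketch of how the direct conversion could be assembled so that nothing is staged in /tmp. The helper name, the source path and the pool/image names are placeholders. Only the rbd:<pool>/<image>:rbd_default_format=2 target syntax and the -O raw flag come from the command above; -f names the source format.

```python
def qemu_img_convert_to_rbd(src_path, src_format, pool, image):
    """Build the qemu-img argv for writing straight into a format 2 RBD image.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    # The key=value pair after the image name asks librbd for a format 2 image.
    target = "rbd:{}/{}:rbd_default_format=2".format(pool, image)
    return ["qemu-img", "convert", "-f", src_format, "-O", "raw", src_path, target]


if __name__ == "__main__":
    # Placeholder path and names, for illustration only.
    argv = qemu_img_convert_to_rbd("/mnt/nfs/volume.qcow2", "qcow2", "rbd", "myimage")
    print(" ".join(argv))
```

Building the command as an argv list instead of a shell string avoids quoting problems if a path ever contains spaces.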


This option is available since Ceph Dumpling 0.67.5 and was not 
available when ACS 4.2 was written.
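
Since the direct path depends on the Ceph release, it could be guarded with a version check roughly like this. The helper names are invented for the example and the 0.67.5 threshold is the one given above. The sketch assumes every later release also has the option, which this thread does not confirm, and it does not handle version strings with suffixes.

```python
# First release with rbd_default_format, according to the message above.
MIN_VERSION = (0, 67, 5)


def parse_version(text):
    """Turn a plain dotted version such as "0.67.5" into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def supports_rbd_default_format(version_text):
    """True if the given Ceph version is at least MIN_VERSION (hypothetical check)."""
    return parse_version(version_text) >= MIN_VERSION


if __name__ == "__main__":
    for version in ("0.67.4", "0.67.5"):
        print(version, supports_rbd_default_format(version))
```

Comparing tuples of integers avoids the string-comparison trap where "0.67.10" would sort before "0.67.5".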


I'm going to make changes in master which skip the step with /tmp.

Technically this can be backported to 4.2, but then you would have to 
run your own homebrew version of 4.2



It also seems that /tmp is temporarily used during template creation.



Same story as above.


My setup:

ACS 4.2.1
Ubuntu 12.04 with KVM
RBD + NFS for Primary storage
NFS for Staging and Secondary storage


Thanks

Andrei





ACS and KVM uses /tmp for volumes migration and templates

2014-03-19 Thread Andrei Mikhailovsky
Hi guys, 

I was wondering if this is a bug? 

I've noticed that during volume migration from NFS to RBD primary storage the 
volume image is first copied to /tmp and only then to the RBD storage. This 
seems silly to me as one would expect a typical volume to be larger than the 
host's hard disk. Also, it is common practice to use tmpfs as /tmp for 
performance reasons. Thus, a typical host server will have a far smaller /tmp 
than the size of an average volume. As a result, volume migration would 
break after filling /tmp and could probably cause a bunch of issues for the 
KVM host itself as well as any VMs running on the server. 
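
The failure mode described here can be caught up front by comparing the volume size with the free space in the staging directory. A minimal sketch, assuming the staging directory is /tmp as observed above. The has_room_for and tmp_has_room_for helpers and the 10% safety margin are arbitrary choices for illustration, not something ACS is known to do.

```python
import shutil


def has_room_for(volume_bytes, free_bytes, margin=0.10):
    """True if free_bytes covers the volume plus a safety margin (hypothetical rule)."""
    return free_bytes >= volume_bytes * (1 + margin)


def tmp_has_room_for(volume_bytes, staging_dir="/tmp"):
    """Check the real free space of staging_dir with shutil.disk_usage."""
    return has_room_for(volume_bytes, shutil.disk_usage(staging_dir).free)


if __name__ == "__main__":
    gib = 1024 ** 3
    # Would a 200 GiB volume fit into /tmp on this host?
    print(tmp_has_room_for(200 * gib))
```

Keeping the comparison in has_room_for, separate from the disk_usage call, lets the rule be checked with made-up numbers before it is pointed at a real host.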

It also seems that /tmp is temporarily used during template creation. 

My setup: 

ACS 4.2.1 
Ubuntu 12.04 with KVM 
RBD + NFS for Primary storage 
NFS for Staging and Secondary storage 


Thanks 

Andrei 


RE: ACS and KVM uses /tmp for volumes migration and templates

2014-03-19 Thread Edison Su
Which version of CloudStack are you using? It seems that in 4.2 Wido enhanced RBD a 
lot, and qemu-img itself can copy a volume from NFS to RBD without temporarily copying 
it to the /tmp folder. 
