Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-26 Thread Nir Soffer
On Mon, Jan 25, 2016 at 9:20 PM, Pavel Gashev  wrote:
> Nir,
>
> On Fri, 2016-01-22 at 20:47 +, Nir Soffer wrote:
>> On Fri, Jan 22, 2016 at 5:15 PM, Pavel Gashev 
>> wrote:
>> > I've tried to reproduce the mirroring of active layer:
>> >
>> > 1. Create two thin template provisioned VMs from the same template
>> > on different storages.
>> > 2. Start VM1
>> > 3. virsh blockcopy VM1 vda /rhev/data-center/...path.to.disk.of.VM2..
>> > --wait --verbose --reuse-external --shallow
>> > 4. virsh blockjob VM1 vda --abort --pivot
>> > 5. Shutdown VM1
>> > 6. Start VM2. Boot in recovery mode and check filesystem.
>> >
>> > I did try this a dozen times. Everything works fine. No data
>> > corruption.
>>
>> If you take the same VM and do a live storage migration in oVirt, is the
>> file system corrupted after the migration?
>
> Yes. And I've reproduced the issue:
>
> 1. Create a VM on MS NFS
> 2. Start VM
> 3. Create a disk-only snapshot
> 4. virsh blockcopy VM1 vda /some/file --wait --verbose --reuse-external
> --shallow
> 5. virsh blockjob VM1 vda --abort --pivot

At this point, /some/file should be the top layer of the vm, instead of
the latest snapshot of the vm.

Can you add the output of qemu-img info /some/file?
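For reference, qemu-img info derives the backing-chain information from the
qcow2 header. A minimal stdlib sketch of where that pointer lives, using a
synthetic in-memory header and a hypothetical path (illustration only, not
oVirt code):

```python
import struct

# qcow2 header layout (big-endian): magic "QFI\xfb" at offset 0, version at 4,
# backing_file_offset (u64) at offset 8, backing_file_size (u32) at offset 16.
backing = b"/rhev/data-center/example-base-volume"  # hypothetical path
header = struct.pack(">4sIQI", b"QFI\xfb", 3, 72, len(backing))
image = header.ljust(72, b"\x00") + backing  # name stored at offset 72

def backing_file(buf):
    """Return the backing file path recorded in a qcow2 header, or None."""
    magic, _version, off, size = struct.unpack_from(">4sIQI", buf, 0)
    assert magic == b"QFI\xfb", "not a qcow2 image"
    return buf[off:off + size].decode() if off else None

print(backing_file(image))  # -> /rhev/data-center/example-base-volume
```

If /some/file reports no backing file (or the wrong one), the shallow copy was
not layered on the chain it was supposed to reuse.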

> 6. Shutdown VM
> 7. Copy the /some/file back to
> /rhev/data-center/..the.latest.snapshot.of.VM..

Can you add the output of qemu-img info on the latest snapshot of the vm?

> 8. Start VM and check filesystem

What are the results of this check?

Can you answer this on the bug, so we have more information for the libvirt
and qemu developers?

>
> In other words, creating a snapshot is an important step for reproducing
> the issue.

So this happens only when mirroring a volume that is a result of a live
snapshot, right?
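That pattern would fit a problem in mirroring just the top layer: with
--shallow, reads of pre-snapshot clusters fall through to the shared backing
file, so a botched mirror can only damage data written after the snapshot. A
toy model in plain Python, with dicts standing in for qcow2 cluster maps
(purely illustrative, hypothetical failure mode):

```python
# Each "layer" maps cluster -> data; reads fall through the backing chain.
base = {0: "pre-snapshot A", 1: "pre-snapshot B"}    # state before the snapshot
top = {1: "post-snapshot B2", 2: "post-snapshot C"}  # active layer after it

def read(cluster, chain):
    """Read a cluster through a backing chain, top layer first."""
    for layer in chain:
        if cluster in layer:
            return layer[cluster]
    return None  # unallocated

# A shallow mirror copies only the top layer. Suppose one write never reaches
# the destination (the hypothetical bug):
bad_mirror = dict(top)
del bad_mirror[2]

print(read(0, [bad_mirror, base]))  # pre-snapshot data: still consistent
print(read(2, [bad_mirror, base]))  # post-snapshot data: silently lost
```

This would match the reported symptom that rolling back to the snapshot shows
a clean filesystem.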

>
>> What is the guest OS? Did you try with more than one?
>
> Guest OS is W2K12. I was unable to reproduce the issue with Linux.

W2K - Windows 2000?

>
>> The next step is to open a bug with the logs I requested in my last
>> message. Please mark the bug
>> as urgent.
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1301713
>
>> I'm adding Kevin (from qemu) and Eric (from libvirt), hopefully they
>> can tell if the virsh flow is
>> indeed identical to what ovirt does, and what should be the next step
>> for debugging this.
>>
>> oVirt uses blockCopy if available (it should be available everywhere
>> by now), or falls back to blockRebase. Do you see this warning?
>>
>> blockCopy not supported, using blockRebase
>
> No such warning. There is a 'Replicating drive vda to ...' log entry.
> Please find vdsm.log attached to the bug report.
>
> Thanks
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-25 Thread Pavel Gashev
Nir,

On Fri, 2016-01-22 at 20:47 +, Nir Soffer wrote:
> On Fri, Jan 22, 2016 at 5:15 PM, Pavel Gashev 
> wrote:
> > I've tried to reproduce the mirroring of active layer:
> > 
> > 1. Create two thin template provisioned VMs from the same template
> > on different storages.
> > 2. Start VM1
> > 3. virsh blockcopy VM1 vda /rhev/data-center/...path.to.disk.of.VM2..
> > --wait --verbose --reuse-external --shallow
> > 4. virsh blockjob VM1 vda --abort --pivot
> > 5. Shutdown VM1
> > 6. Start VM2. Boot in recovery mode and check filesystem.
> > 
> > I did try this a dozen times. Everything works fine. No data
> > corruption.
> 
> If you take the same VM and do a live storage migration in oVirt, is the
> file system corrupted after the migration?

Yes. And I've reproduced the issue:

1. Create a VM on MS NFS
2. Start VM
3. Create a disk-only snapshot
4. virsh blockcopy VM1 vda /some/file --wait --verbose --reuse-external
--shallow
5. virsh blockjob VM1 vda --abort --pivot
6. Shutdown VM
7. Copy the /some/file back to
   /rhev/data-center/..the.latest.snapshot.of.VM..
8. Start VM and check filesystem

In other words, creating a snapshot is an important step for reproducing
the issue.

> What is the guest OS? Did you try with more than one?

Guest OS is W2K12. I was unable to reproduce the issue with Linux.

> The next step is to open a bug with the logs I requested in my last
> message. Please mark the bug
> as urgent.

https://bugzilla.redhat.com/show_bug.cgi?id=1301713

> I'm adding Kevin (from qemu) and Eric (from libvirt), hopefully they
> can tell if the virsh flow is
> indeed identical to what ovirt does, and what should be the next step
> for debugging this.
> 
> oVirt uses blockCopy if available (it should be available everywhere
> by now), or falls back to blockRebase. Do you see this warning?
> 
> blockCopy not supported, using blockRebase

No such warning. There is a 'Replicating drive vda to ...' log entry.

Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-22 Thread Pavel Gashev
Nir,


On 21/01/16 23:55, "Nir Soffer"  wrote:
>live migration starts by creating a snapshot, then copying the disks to the new
>storage, and then mirroring the active layer so both the old and the
>new disks are
>the same. Finally we switch to the new disk, and delete the old disk.
>
>So probably the issue is in the mirroring step. This is most likely a
>qemu issue.

Thank you for the clarification. This gave me the idea to check the
consistency of the old disk.

I performed the following testing:
1. Create a VM on MS NFS
2. Initiate live disk migration to another storage
3. Catch the source files before oVirt removes them by creating hard links
in another directory
4. Shutdown VM
5. Create another VM and move the caught files to the place where the new
disk files are located
6. Check consistency of filesystem in both VMs


The source disk is consistent. The destination disk is corrupted.

>
>I'll try to get instructions for this from libvirt developers. If this
>happen with
>libvirt alone, this is a libvirt or qemu bug, and there is little we (ovirt) 
>can
>do about it.


I've tried to reproduce the mirroring of active layer:

1. Create two thin-provisioned VMs from the same template on different
storage domains.
2. Start VM1
3. virsh blockcopy VM1 vda /rhev/data-center/...path.to.disk.of.VM2.. --wait 
--verbose --reuse-external --shallow
4. virsh blockjob VM1 vda --abort --pivot
5. Shutdown VM1
6. Start VM2. Boot in recovery mode and check filesystem.

I did try this a dozen times. Everything works fine. No data corruption.


Ideas?



Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-22 Thread Nir Soffer
On Fri, Jan 22, 2016 at 5:15 PM, Pavel Gashev  wrote:
> Nir,
>
>
> On 21/01/16 23:55, "Nir Soffer"  wrote:
>>live migration starts by creating a snapshot, then copying the disks to the 
>>new
>>storage, and then mirroring the active layer so both the old and the
>>new disks are
>>the same. Finally we switch to the new disk, and delete the old disk.
>>
>>So probably the issue is in the mirroring step. This is most likely a
>>qemu issue.
>
> Thank you for the clarification. This gave me the idea to check the
> consistency of the old disk.
>
> I performed the following testing:
> 1. Create a VM on MS NFS
> 2. Initiate live disk migration to another storage
> 3. Catch the source files before oVirt removes them by creating hard
> links in another directory
> 4. Shutdown VM
> 5. Create another VM and move the caught files to the place where the new
> disk files are located
> 6. Check consistency of filesystem in both VMs
>
>
> The source disk is consistent. The destination disk is corrupted.
>
>>
>>I'll try to get instructions for this from libvirt developers. If this
>>happen with
>>libvirt alone, this is a libvirt or qemu bug, and there is little we (ovirt) 
>>can
>>do about it.
>
>
> I've tried to reproduce the mirroring of active layer:
>
> 1. Create two thin-provisioned VMs from the same template on
> different storage domains.
> 2. Start VM1
> 3. virsh blockcopy VM1 vda /rhev/data-center/...path.to.disk.of.VM2.. --wait 
> --verbose --reuse-external --shallow
> 4. virsh blockjob VM1 vda --abort --pivot
> 5. Shutdown VM1
> 6. Start VM2. Boot in recovery mode and check filesystem.
>
> I did try this a dozen times. Everything works fine. No data corruption.

If you take the same VM and do a live storage migration in oVirt, is the
file system corrupted after the migration?

What is the guest OS? Did you try with more than one?

>
>
> Ideas?

Thanks for this research!

The next step is to open a bug with the logs I requested in my last
message. Please mark the bug
as urgent.

I'm adding Kevin (from qemu) and Eric (from libvirt), hopefully they
can tell if the virsh flow is
indeed identical to what ovirt does, and what should be the next step
for debugging this.

oVirt uses blockCopy if available (it should be available everywhere
by now), or falls back to blockRebase. Do you see this warning?

blockCopy not supported, using blockRebase

For reference, this is the relevant code in oVirt for the mirroring
part. The mirroring starts with diskReplicateStart() and ends with
diskReplicateFinish(). I removed the parts about managing vdsm state
and kept the calls to libvirt.

def diskReplicateFinish(self, srcDisk, dstDisk):
    ...
    blkJobInfo = self._dom.blockJobInfo(drive.name, 0)
    ...
    if srcDisk != dstDisk:
        self.log.debug("Stopping the disk replication switching to the "
                       "destination drive: %s", dstDisk)
        blockJobFlags = libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT
    ...
    else:
        self.log.debug("Stopping the disk replication remaining on the "
                       "source drive: %s", dstDisk)
        blockJobFlags = 0
    ...
    try:
        # Stopping the replication
        self._dom.blockJobAbort(drive.name, blockJobFlags)
    except Exception:
        self.log.exception("Unable to stop the replication for"
                           " the drive: %s", drive.name)
    ...

def _startDriveReplication(self, drive):
    destxml = drive.getReplicaXML().toprettyxml()
    self.log.debug("Replicating drive %s to %s", drive.name, destxml)

    flags = (libvirt.VIR_DOMAIN_BLOCK_COPY_SHALLOW |
             libvirt.VIR_DOMAIN_BLOCK_COPY_REUSE_EXT)

    # TODO: Remove fallback when using libvirt >= 1.2.9.
    try:
        self._dom.blockCopy(drive.name, destxml, flags=flags)
    except libvirt.libvirtError as e:
        if e.get_error_code() != libvirt.VIR_ERR_NO_SUPPORT:
            raise

        self.log.warning("blockCopy not supported, using blockRebase")

        base = drive.diskReplicate["path"]
        self.log.debug("Replicating drive %s to %s", drive.name, base)

        flags = (libvirt.VIR_DOMAIN_BLOCK_REBASE_COPY |
                 libvirt.VIR_DOMAIN_BLOCK_REBASE_REUSE_EXT |
                 libvirt.VIR_DOMAIN_BLOCK_REBASE_SHALLOW)

        if drive.diskReplicate["diskType"] == DISK_TYPE.BLOCK:
            flags |= libvirt.VIR_DOMAIN_BLOCK_REBASE_COPY_DEV

        self._dom.blockRebase(drive.name, base, flags=flags)


Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-21 Thread Dan Yasny
inline

On Thu, Jan 21, 2016 at 7:54 AM, Pavel Gashev  wrote:

> Hello,
>
> First of all, I would like to ask if anybody has experience with using
> Microsoft NFS server as a storage domain.
>
>
I have used one as an ISO domain for years. It wasn't great, but it was
good enough. Never a data domain though


> The main issue with MS NFS is NTFS :) NTFS doesn't support sparse files.
> Technically it's possible by enabling NTFS compression, but it has bad
> performance on huge files, which is our case. Also, there is no option in
> the oVirt web interface to use COW format on NFS storage domains.
>
> Since it looks like oVirt doesn't support MS NFS, I decided to migrate all
> my VMs out of MS NFS to another storage. And I hit a bug. Live storage
> migration *silently* *corrupts* *data* if you migrate a disk from MS NFS
> storage domain. So if you shutdown just migrated VM and check filesystem
> you find that it has a lot of unrecoverable errors.
>
> There are the following symptoms:
> 1. It corrupts data if you migrate a disk from MS NFS to Linux NFS
> 2. It corrupts data if you migrate a disk from MS NFS to iSCSI
> 3. There is no corruption if you migrate from Linux NFS to iSCSI and vice
> versa.
> 4. There is no corruption if you migrate from anywhere to MS NFS.
> 5. Data corruption happens after 'Auto-generated for Live Storage
> Migration' snapshot. So if you rollback the snapshot, you could see
> absolutely clean filesystem.
> 6. It doesn't depend on SPM. So it corrupts data if SPM is on the same
> host, or another.
> 7. There are no error messages in vdsm/qemu/system logs.
>
> Yes, of course I could migrate from MS NFS with downtime – it's not an
> issue. The issue is that oVirt does silently corrupt data under some
> circumstances.
>
> Could you please help me to understand the reason of data corruption?
>
> vdsm-4.17.13-1.el7.noarch
> qemu-img-ev-2.3.0-31.el7_2.4.1.x86_64
> libvirt-daemon-1.2.17-13.el7_2.2.x86_64
> ovirt-engine-backend-3.6.1.3-1.el7.centos.noarch
>
> Thank you
>
>
>


[ovirt-users] Using Microsoft NFS server as storage domain

2016-01-21 Thread Pavel Gashev
Hello,

First of all, I would like to ask if anybody has experience with using
Microsoft NFS server as a storage domain.

The main issue with MS NFS is NTFS :) NTFS doesn't support sparse files.
Technically it's possible by enabling NTFS compression, but it has bad
performance on huge files, which is our case. Also, there is no option in the
oVirt web interface to use COW format on NFS storage domains.
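The sparse-file point can be demonstrated with a few lines of stdlib Python on
a Linux filesystem (temporary file; sizes are illustrative). A "thin" raw
image relies on exactly this kind of hole:

```python
import os
import tempfile

# A thin raw image is mostly a hole: seek far ahead and write a single byte.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.seek(1024 * 1024 * 1024 - 1)  # 1 GiB apparent size
    f.write(b"\0")
    path = f.name

st = os.stat(path)
apparent = st.st_size           # logical size: 1 GiB
allocated = st.st_blocks * 512  # bytes actually backed by storage
os.unlink(path)

# On ext4/XFS the hole costs (almost) nothing; on a filesystem without
# sparse-file support, the full 1 GiB would have to be allocated.
print(apparent, allocated)
```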

Since it looks like oVirt doesn't support MS NFS, I decided to migrate all my
VMs out of MS NFS to another storage. And I hit a bug. Live storage migration
silently corrupts data if you migrate a disk from an MS NFS storage domain. So
if you shut down a just-migrated VM and check its filesystem, you find that it
has a lot of unrecoverable errors.

There are the following symptoms:
1. It corrupts data if you migrate a disk from MS NFS to Linux NFS
2. It corrupts data if you migrate a disk from MS NFS to iSCSI
3. There is no corruption if you migrate from Linux NFS to iSCSI and vice versa.
4. There is no corruption if you migrate from anywhere to MS NFS.
5. Data corruption happens after the 'Auto-generated for Live Storage
Migration' snapshot. So if you roll back the snapshot, you see an absolutely
clean filesystem.
6. It doesn't depend on the SPM. Data is corrupted whether the SPM is on the
same host or another.
7. There are no error messages in vdsm/qemu/system logs.

Yes, of course I could migrate from MS NFS with downtime – it's not an issue.
The issue is that oVirt silently corrupts data under some circumstances.

Could you please help me to understand the reason of data corruption?

vdsm-4.17.13-1.el7.noarch
qemu-img-ev-2.3.0-31.el7_2.4.1.x86_64
libvirt-daemon-1.2.17-13.el7_2.2.x86_64
ovirt-engine-backend-3.6.1.3-1.el7.centos.noarch

Thank you




Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-21 Thread Nir Soffer
On Thu, Jan 21, 2016 at 2:54 PM, Pavel Gashev  wrote:
> Hello,
>
> First of all I would like to ask if anybody has an experience with using
> Microsoft NFS server as a storage domain.
>
> The main issue with MS NFS is NTFS :) NTFS doesn't support sparse files.
> Technically it's possible by enabling NTFS compression but  it has bad
> performance on huge files which is our case. Also there is no option in
> oVirt web interface to use COW format on NFS storage domains.

You can:
1. create a small disk (1G)
2. create a snapshot
3. extend the disk to the final size

And you have NFS with COW format. The performance difference with one snapshot
should be small.

> Since it looks like oVirt doesn't support MS NFS, I decided to migrate all
> my VMs out of MS NFS to another storage. And I hit a bug. Live storage
> migration silently corrupts data if you migrate a disk from MS NFS storage
> domain. So if you shutdown just migrated VM and check filesystem you find
> that it has a lot of unrecoverable errors.
>
> There are the following symptoms:
> 1. It corrupts data if you migrate a disk from MS NFS to Linux NFS
> 2. It corrupts data if you migrate a disk from MS NFS to iSCSI
> 3. There is no corruption if you migrate from Linux NFS to iSCSI and vice
> versa.
> 4. There is no corruption if you migrate from anywhere to MS NFS.
> 5. Data corruption happens after 'Auto-generated for Live Storage Migration'
> snapshot. So if you rollback the snapshot, you could see absolutely clean
> filesystem.

Can you try to create a live-snapshot on MS NFS? It seems that this is the
issue, not live storage migration.

Do you have qemu-guest-agent on the VM? Without qemu-guest-agent, file
systems on the guest will not be frozen during the snapshot, which may cause
an inconsistent snapshot.

Can you reproduce this with virt-manager, or by creating a vm and taking
a snapshot using virsh?

> 6. It doesn't depend on SPM. So it corrupts data if SPM is on the same host,
> or another.
> 7. There are no error messages in vdsm/qemu/system logs.
>
> Yes, of course I could migrate from MS NFS with downtime – it's not an
> issue. The issue is that oVirt does silently corrupt data under some
> circumstances.
>
> Could you please help me to understand the reason of data corruption?

Please file a bug and attach:

- /var/log/vdsm/vdsm.log
- /var/log/messages
- /var/log/sanlock.log
- output of nfsstat during the test; maybe run it every minute?

> vdsm-4.17.13-1.el7.noarch
> qemu-img-ev-2.3.0-31.el7_2.4.1.x86_64
> libvirt-daemon-1.2.17-13.el7_2.2.x86_64
> ovirt-engine-backend-3.6.1.3-1.el7.centos.noarch
>
> Thank you
>
>
>


Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-21 Thread Pavel Gashev
On Thu, 2016-01-21 at 18:42 +, Nir Soffer wrote:

On Thu, Jan 21, 2016 at 2:54 PM, Pavel Gashev 
> wrote:

Also there is no option in
oVirt web interface to use COW format on NFS storage domains.



You can:
1. create a small disk (1G)
2. create a snapshot
3. extend the disk to the final size

And you have NFS with COW format. The performance difference with one snapshot
should be small.


Yes. And there are other workarounds:
1. Use some block storage (e.g. iSCSI) for creating a thin-provisioned disk
(which is COW) and then move it to the required storage.
2. Keep an empty 1G COW disk and copy+resize it when required.
3. Use ovirt-shell for creating disks.

Unfortunately, these are not native ways. These are ways for a hacker. A plain
user clicks "New" in the "Disks" tab and selects the "Thin Provision"
allocation policy. It's hard to explain to users that the simplest and most
obvious way is wrong. I hope it's wrong only for MS NFS.


5. Data corruption happens after 'Auto-generated for Live Storage Migration'
snapshot. So if you rollback the snapshot, you could see absolutely clean
filesystem.



Can you try to create a live-snapshot on MS NFS? It seems that this is the
issue, not live storage migration.


Live snapshots work very well on MS NFS. Creating and deleting works live
without any issues; I did it many times. Please note that everything before the
snapshot remains consistent. Data corruption occurs after the snapshot, so only
non-snapshotted data is corrupted.



Do you have qemu-guest-agent on the VM? Without qemu-guest-agent, file
systems on the guest will not be frozen during the snapshot, which may cause
an inconsistent snapshot.


I tried it with and without qemu-guest-agent. It makes no difference.



Can you reproduce this with virt-manager, or by creating a vm and taking
a snapshot using virsh?


Sorry, I'm not sure how I can reproduce the issue using virsh.




Please file a bug and attach:

- /var/log/vdsm/vdsm.log
- /var/log/messages
- /var/log/sanlock.log
- output of nfsstat during the test; maybe run it every minute?

Ok, I will collect the logs and file a bug.

Thanks




Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-21 Thread Nir Soffer
On Thu, Jan 21, 2016 at 10:13 PM, Pavel Gashev  wrote:
> On Thu, 2016-01-21 at 18:42 +, Nir Soffer wrote:
>
> On Thu, Jan 21, 2016 at 2:54 PM, Pavel Gashev  wrote:
>
> Also there is no option in
> oVirt web interface to use COW format on NFS storage domains.
>
>
> You can
> 1. create a small disk (1G)
> 2. create a snapshot
> 3. extend the disk go the final size
>
> And you have nfs with cow format. The performance difference with one
> snapshot
> should be small.
>
>
> Yes. And there are other workarounds:
> 1. Use some block (i.e. iSCSI) storage for creating a thin provisioned disk
> (which is COW) and then move it to required storage.
> 2. Keep an empty 1G COW disk and copy+resize it when required.
> 3. Use ovirt-shell for creating disks.
>
> Unfortunately, these are not native ways. These are ways for a hacker. Plain
> user clicks "New" in "Disks" tab and selects "Thin Provision" allocation
> policy. It's hard to explain to users that the simplest and obvious way is
> wrong. I hope it's wrong only for MS NFS.

Sure, I agree.

I think we do not use qcow format on file storage since there is no need
for it; the file system is always sparse. I guess we did not plan for MS NFS.

I would open a bug for supporting qcow format on file storage. If this works
for some users, I think this is an option that should be possible in the UI.
Hopefully there are not too many assumptions in the code about this.

Allon, do you see any reason not to support this for users that need this option?

>
> 5. Data corruption happens after 'Auto-generated for Live Storage Migration'
> snapshot. So if you rollback the snapshot, you could see absolutely clean
> filesystem.
>
>
> Can you try to create a live-snapshot on MS NFS? It seems that this is the
> issue, not live storage migration.
>
>
> Live snapshots work very well on MS NFS. Creating and deleting works live
> without any issues. I did it many times. Please note that everything before
> the snapshot remains consistent. Data corruption occurs after the snapshot.
> So only non-snapshotted data is corrupted.

Live migration starts by creating a snapshot, then copying the disks to the
new storage, and then mirroring the active layer so both the old and the new
disks are the same. Finally we switch to the new disk and delete the old disk.

So probably the issue is in the mirroring step. This is most likely a
qemu issue.

>
> Do you have qemu-guest-agent on the VM? Without qemu-guest-agent, file
> systems on the guest will not be frozen during the snapshot, which may cause
> an inconsistent snapshot.
>
>
> I tried it with and without qemu-guest-agent. It makes no difference.
>
> Can you reproduce this with virt-manager, or by creating a vm and taking
> a snapshot using virsh?
>
>
> Sorry, I'm not sure how I can reproduce the issue using virsh.

I'll try to get instructions for this from libvirt developers. If this
happens with libvirt alone, this is a libvirt or qemu bug, and there is
little we (oVirt) can do about it.

>
>
> Please file a bug and attach:
>
> - /var/log/vdsm/vdsm.log
> - /var/log/messages
> - /var/log/sanlock.log
> - output of  nfsstat during the test, maybe run it every minute?
>
>
> Ok, I will collect the logs and file a bug.
>
> Thanks
>
>


Re: [ovirt-users] Using Microsoft NFS server as storage domain

2016-01-21 Thread Nir Soffer
Adding Allon

On Thu, Jan 21, 2016 at 10:55 PM, Nir Soffer  wrote:
> On Thu, Jan 21, 2016 at 10:13 PM, Pavel Gashev  wrote:
>> On Thu, 2016-01-21 at 18:42 +, Nir Soffer wrote:
>>
>> On Thu, Jan 21, 2016 at 2:54 PM, Pavel Gashev  wrote:
>>
>> Also there is no option in
>> oVirt web interface to use COW format on NFS storage domains.
>>
>>
>> You can
>> 1. create a small disk (1G)
>> 2. create a snapshot
>> 3. extend the disk go the final size
>>
>> And you have nfs with cow format. The performance difference with one
>> snapshot
>> should be small.
>>
>>
>> Yes. And there are other workarounds:
>> 1. Use some block (i.e. iSCSI) storage for creating a thin provisioned disk
>> (which is COW) and then move it to required storage.
>> 2. Keep an empty 1G COW disk and copy+resize it when required.
>> 3. Use ovirt-shell for creating disks.
>>
>> Unfortunately, these are not native ways. These are ways for a hacker. Plain
>> user clicks "New" in "Disks" tab and selects "Thin Provision" allocation
>> policy. It's hard to explain to users that the simplest and obvious way is
>> wrong. I hope it's wrong only for MS NFS.
>
> Sure I agree.
>
> I think we do not use qcow format on file storage since there is no
> need for this,
> the file system is always sparse. I guess we did not plan to use MS NFS.
>
> I would open bug for supporting qcow format on file storage. If this works for
> some users, I think this is an option that should be possible in the
> ui. Hopefully
> there are no too many assumptions in the code about this.
>
> Allon, do you see any reason not to support this for user that need this 
> option?
>
>>
>> 5. Data corruption happens after 'Auto-generated for Live Storage Migration'
>> snapshot. So if you rollback the snapshot, you could see absolutely clean
>> filesystem.
>>
>>
>> Can you try to create a live-snapshot on MS NFS? It seems that this is the
>> issue, not live storage migration.
>>
>>
>> Live snapshots work very well on MS NFS. Creating and deleting works live
>> without any issues. I did it many times. Please note that everything before
>> the snapshot remains consistent. Data corruption occurs after the snapshot.
>> So only non-snapshotted data is corrupted.
>
> live migration starts by creating a snapshot, then copying the disks to the 
> new
> storage, and then mirroring the active layer so both the old and the
> new disks are
> the same. Finally we switch to the new disk, and delete the old disk.
>
> So probably the issue is in the mirroring step. This is most likely a
> qemu issue.
>
>>
>> Do you have qemu-guest-agent on the vm? Without qemu-guest-agent, file
>> systems on the guest will no be freezed during the snapshot, which may cause
>> inconsistent snapshot.
>>
>>
>> I tried it with and without qemu-guest-agent. It doesn't depend.
>>
>> Can you reproduce this with virt-manager, or by creating a vm and taking
>> a snapshot using virsh?
>>
>>
>> Sorry, I'm not sure how I can reproduce the issue using virsh.
>
> I'll try to get instructions for this from libvirt developers. If this
> happen with
> libvirt alone, this is a libvirt or qemu bug, and there is little we (ovirt) 
> can
> do about it.
>
>>
>>
>> Please file a bug and attach:
>>
>> - /var/log/vdsm/vdsm.log
>> - /var/log/messages
>> - /var/log/sanlock.log
>> - output of  nfsstat during the test, maybe run it every minute?
>>
>>
>> Ok, I will collect the logs and fill a bug.
>>
>> Thanks
>>
>>