[Users] Extremely poor disk access speeds in Windows guest

2014-01-23 Thread Steve Dainard
Backing Storage: Gluster Replica
Storage Domain: NFS
Ovirt Hosts: CentOS 6.5
Ovirt version: 3.3.2
Network: GigE
# of VM's: 3 - two Linux guests are idle, one Windows guest is installing
updates.

I've installed a Windows 2008 R2 guest with virtio disk, and all the
drivers from the latest virtio iso. I've also installed the spice agent
drivers.

Guest disk access is horribly slow. Resource Monitor during Windows updates
shows disk throughput peaking at 1 MB/s (the scale never increases) and Disk
Queue Length peaking at 5, where it sits 99% of the time. A batch of 113
updates has been running solidly for about 2.5 hours and is at 89/113 updates
complete.

I can't say my Linux guests are blisteringly fast, but updating a guest
from RHEL 6.3 fresh install to 6.5 took about 25 minutes.

If anyone has any ideas, please let me know - I haven't found any tuning
docs for Windows guests that could explain this issue.

Thanks,


*Steve Dainard *
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Extremely poor disk access speeds in Windows guest

2014-01-23 Thread Steve Dainard
I have two options, virtio and virtio-scsi.

I was using virtio, and have also attempted virtio-scsi on another Windows
guest with the same results.

Using the newest drivers, virtio-win-0.1-74.iso.

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Thu, Jan 23, 2014 at 4:24 PM, Itamar Heim  wrote:

> On 01/23/2014 07:46 PM, Steve Dainard wrote:
>
>> Backing Storage: Gluster Replica
>> Storage Domain: NFS
>> Ovirt Hosts: CentOS 6.5
>> Ovirt version: 3.3.2
>> Network: GigE
>> # of VM's: 3 - two Linux guests are idle, one Windows guest is
>> installing updates.
>>
>> I've installed a Windows 2008 R2 guest with virtio disk, and all the
>> drivers from the latest virtio iso. I've also installed the spice agent
>> drivers.
>>
>> Guest disk access is horribly slow, Resource monitor during Windows
>> updates shows Disk peaking at 1MB/sec (scale never increases) and Disk
>> Queue Length Peaking at 5 and looks to be sitting at that level 99% of
>> the time. 113 updates in Windows has been running solidly for about 2.5
>> hours and is at 89/113 updates complete.
>>
>
> virtio-block or virtio-scsi?
> which windows guest driver version for that?
>
>
>> I can't say my Linux guests are blisteringly fast, but updating a guest
>> from RHEL 6.3 fresh install to 6.5 took about 25 minutes.
>>
>> If anyone has any ideas, please let me know - I haven't found any tuning
>> docs for Windows guests that could explain this issue.
>>
>> Thanks,
>>
>>
>> *Steve Dainard *
>>
>>
>>
>>
>>
>


Re: [Users] Extremely poor disk access speeds in Windows guest

2014-01-24 Thread Steve Dainard
Not sure what a good method to bench this would be, but:

An NFS mount point on virt host:
[root@ovirt001 iso-store]# dd if=/dev/zero of=test1 bs=4k count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 3.95399 s, 104 MB/s

Raw brick performance on gluster server (yes, I know I shouldn't write
directly to the brick):
[root@gluster1 iso-store]# dd if=/dev/zero of=test bs=4k count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 3.06743 s, 134 MB/s

Gluster mount point on gluster server:
[root@gluster1 iso-store]# dd if=/dev/zero of=test bs=4k count=100000
100000+0 records in
100000+0 records out
409600000 bytes (410 MB) copied, 19.5766 s, 20.9 MB/s

The storage servers are a bit older, but are both dual socket quad core
opterons with 4x 7200rpm drives.

I'm in the process of setting up a share from my desktop and I'll see if I
can bench between the two systems. Not sure if my SSD will impact the tests;
I've heard there isn't an advantage to using SSD storage for GlusterFS.

Does anyone have a hardware reference design for glusterfs as a backend for
virt? Or is there a benchmark utility?
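
For what it's worth, the dd runs above go through the page cache and use a
small block size. A rough sketch of a test that bypasses the cache and uses a
larger block (the mount point below is a placeholder for the NFS/gluster
mount) would be something like:

dd if=/dev/zero of=/mnt/iso-store/ddtest bs=1M count=1024 oflag=direct conv=fsync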

*Steve Dainard *


On Thu, Jan 23, 2014 at 7:18 PM, Andrew Cathrow  wrote:

> Are we sure that the issue is the guest I/O - what's the raw performance
> on the host accessing the gluster storage?
>
> --
>
> *From: *"Steve Dainard" 
> *To: *"Itamar Heim" 
> *Cc: *"Ronen Hod" , "users" , "Sanjay
> Rao" 
> *Sent: *Thursday, January 23, 2014 4:56:58 PM
> *Subject: *Re: [Users] Extremely poor disk access speeds in Windows guest
>
>
> I have two options, virtio and virtio-scsi.
>
> I was using virtio, and have also attempted virtio-scsi on another Windows
> guest with the same results.
>
> Using the newest drivers, virtio-win-0.1-74.iso.
>
> *Steve Dainard *
>
>
> On Thu, Jan 23, 2014 at 4:24 PM, Itamar Heim  wrote:
>
>> On 01/23/2014 07:46 PM, Steve Dainard wrote:
>>
>>> Backing Storage: Gluster Replica
>>> Storage Domain: NFS
>>> Ovirt Hosts: CentOS 6.5
>>> Ovirt version: 3.3.2
>>> Network: GigE
>>> # of VM's: 3 - two Linux guests are idle, one Windows guest is
>>> installing updates.
>>>
>>> I've installed a Windows 2008 R2 guest with virtio disk, and all the
>>> drivers from the latest virtio iso. I've also installed the spice agent
>>> drivers.
>>>
>>> Guest disk access is horribly slow, Resource monitor during Windows
>>> updates shows Disk peaking at 1MB/sec (scale never increases) and Disk
>>> Queue Length Peaking at 5 and looks to be sitting at that level 99% of
>>> the time. 113 updates in Windows has been running solidly for about 2.5
>>> hours and is at 89/113 updates complete.
>>>
>>
>> virtio-block or virtio-scsi?
>> which windows guest driver version for that?
>>
>>
>>> I can't say my Linux guests are blisteringly fast, but updating a guest
>>> from RHEL 6.3 fresh install to 6.5 took about 25 minutes.
>>>
>>> If anyone has any ideas, please let me know - I haven't found any tuning
>>> docs for Windows guests that could explain this issue.
>>>
>>> Thanks,
>>>
>>>
>>> *Steve Dainard *
>>>
>>>
>>>
>>>
>>>
>>
>
>
>
>


Re: [Users] Extremely poor disk access speeds in Windows guest

2014-01-25 Thread Steve Dainard
pport is assumed and would break
otherwise. I'm assuming this is still valid as I cannot get a storage lock
when I attempt a gluster storage domain.




I've set up an NFS storage domain on my desktop's SSD. I've re-installed Win
2008 R2 and initially it was running smoother.

Disk performance peaks at 100MB/s.

If I copy a 250MB file from a share into the Windows VM, it writes out
quickly, less than 5 seconds.

If I copy 20 files ranging in size from 4 KB to 200 MB, totaling 650 MB, from
the share, Windows becomes unresponsive; in top, the desktop's NFS daemon is
barely being touched at all, and eventually isn't hit at all. I can still
interact with the VM's windows through the SPICE console. Eventually the file
transfer starts and rockets through.

I've opened a 271 MB zip file containing 4454 files and started the extract
process, but the progress window sits on 'calculating...'. After a significant
period of time the decompression starts and runs at <200 KB/s; Windows
estimates about 1 hour to complete. Eventually even this freezes up and my
SPICE console mouse won't grab. I can still see Resource Monitor in the
Windows VM doing its thing, but I have to power off the VM as it's no longer
usable.

The Windows update process behaves the same way. It seems like when the guest
needs quick large writes it's fine, but lots of I/O causes serious hanging,
unresponsiveness, and SPICE mouse cursor freezes; eventually a power-off/reboot
is the only way to get it back.

Also, during the Windows 2008 R2 install the 'Expanding Windows files' task is
quite slow, roughly 1% progress every 20 seconds (~30 mins to complete).
The GLUSTER host shows these stats pretty consistently:

PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND

 8139 root  20   0 1380m  28m 2476 R 83.1  0.4   8:35.78 glusterfsd

 8295 root  20   0  550m 186m 2980 S  4.3  2.4   1:52.56 glusterfs

bwm-ng v0.6 (probing every 2.000s), press 'h' for help
input: /proc/net/dev type: rate
  iface          Rx              Tx              Total
  lo        3719.31 KB/s    3719.31 KB/s    7438.62 KB/s
  eth0      3405.12 KB/s    3903.28 KB/s    7308.40 KB/s


I've copied the same zip file to an NFS mount point on the OVIRT host
(gluster backend) and get about 25 - 600 KB/s during unzip. The same test
on an NFS mount point (desktop SSD ext4 backend) averaged a network transfer
speed of 5 MB/s and completed in about 40 seconds.

I have a RHEL 6.5 guest running on the NFS/gluster backend storage domain,
and just did the same test. Extracting the file took 22.3 seconds (faster
than the fuse mount point on the host !?!?).
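
For the Linux-side numbers I'm timing the extraction roughly like this (the
archive name and target directory are placeholders):

time unzip -q archive.zip -d /tmp/unzip-test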

GLUSTER host top reported this while the RHEL guest was decompressing the
zip file:
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

 2141 root  20   0  555m 187m 2844 S  4.0  2.4  18:17.00 glusterfs

 2122 root  20   0 1380m  31m 2396 S  2.3  0.4  83:19.40 glusterfsd





*Steve Dainard *



Re: [Users] Extremely poor disk access speeds in Windows guest

2014-01-28 Thread Steve Dainard
I've had a bit of luck here.

Overall I/O performance is very poor during Windows updates, but a
contributing factor seems to be the "SCSI Controller" device in the guest. For
this last install I didn't install a driver for that device, and my
performance is much better. Updates still chug along quite slowly, but I'm
seeing more than the < 100 KB/s write speeds I was seeing previously.

Does anyone know what this device is for? I have the "Red Hat VirtIO SCSI
Controller" listed under storage controllers.

*Steve Dainard *


On Sun, Jan 26, 2014 at 2:33 AM, Itamar Heim  wrote:

> On 01/26/2014 02:37 AM, Steve Dainard wrote:
>
>> Thanks for the responses everyone, really appreciate it.
>>
>> I've condensed the other questions into this reply.
>>
>>
>> Steve,
>> What is the CPU load of the GlusterFS host when comparing the raw
>> brick test to the gluster mount point test? Give it 30 seconds and
>> see what top reports. You'll probably have to significantly increase
>> the count on the test so that it runs that long.
>>
>> - Nick
>>
>>
>>
>> Gluster mount point:
>>
>> *4K* on GLUSTER host
>> [root@gluster1 rep2]# dd if=/dev/zero of=/mnt/rep2/test1 bs=4k count=500000
>> 500000+0 records in
>> 500000+0 records out
>> 2048000000 bytes (2.0 GB) copied, 100.076 s, 20.5 MB/s
>>
>>
>> Top reported this right away:
>> PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>>   1826 root  20   0  294m  33m 2540 S 27.2  0.4   0:04.31 glusterfs
>>   2126 root  20   0 1391m  31m 2336 S 22.6  0.4  11:25.48 glusterfsd
>>
>> Then at about 20+ seconds top reports this:
>>PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>>   1826 root  20   0  294m  35m 2660 R 141.7  0.5   1:14.94 glusterfs
>>   2126 root  20   0 1392m  31m 2344 S 33.7  0.4  11:46.56 glusterfsd
>>
>> *4K* Directly on the brick:
>> dd if=/dev/zero of=test1 bs=4k count=500000
>> 500000+0 records in
>> 500000+0 records out
>> 2048000000 bytes (2.0 GB) copied, 4.99367 s, 410 MB/s
>>
>>
>>   7750 root  20   0  102m  648  544 R 50.3  0.0   0:01.52 dd
>>   7719 root  20   0 000 D  1.0  0.0   0:01.50 flush-253:2
>>
>> Same test, gluster mount point on OVIRT host:
>> dd if=/dev/zero of=/mnt/rep2/test1 bs=4k count=500000
>> 500000+0 records in
>> 500000+0 records out
>> 2048000000 bytes (2.0 GB) copied, 42.4518 s, 48.2 MB/s
>>
>>
>>PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>>   2126 root  20   0 1396m  31m 2360 S 40.5  0.4  13:28.89 glusterfsd
>>
>>
>> Same test, on OVIRT host but against NFS mount point:
>> dd if=/dev/zero of=/mnt/rep2-nfs/test1 bs=4k count=500000
>> 500000+0 records in
>> 500000+0 records out
>> 2048000000 bytes (2.0 GB) copied, 18.8911 s, 108 MB/s
>>
>>
>> PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
>>   2141 root  20   0  550m 184m 2840 R 84.6  2.3  16:43.10 glusterfs
>>   2126 root  20   0 1407m  30m 2368 S 49.8  0.4  13:49.07 glusterfsd
>>
>> Interesting - It looks like if I use a NFS mount point, I incur a cpu
>> hit on two processes instead of just the daemon. I also get much better
>> performance if I'm not running dd (fuse) on the GLUSTER host.
>>
>>
>> The storage servers are a bit older, but are both dual socket
>> quad core
>>
>> opterons with 4x 7200rpm drives.
>>
>>
>> A block size of 4k is quite small so that the context switch
>> overhead involved with fuse would be more perceivable.
>>
>> Would it be possible to increase the block size for dd and test?
>>
>>
>>
>> I'm in the process of setting up a share from my desktop and
>> I'll see if
>>
>> I can bench between the two

Re: [Users] Ovirt Gluster problems

2014-01-28 Thread Steve Dainard
Not sure if this is exactly your issue, but this post here:
http://comments.gmane.org/gmane.comp.emulators.ovirt.user/12200 might lead
you in the right direction.

"one note - if you back it up while its attached to an engine, you will
need to edit its meta data file to remove the association to allow the
other engine to connect it to the new pool for restore."


*Steve Dainard *



On Tue, Jan 28, 2014 at 12:41 PM, Juan Pablo Lorier wrote:

> Hi,
>
> I had some issues with a gluster cluster and after some time trying to
> get the storage domain up or delete it (I opened a BZ about a deadlock
> in the process of removing the domain) I gave up and destroyed the DC.
> The thing is that I want to add the hosts that were part of the DC and
> now I get that I can't, as they have the volume. I try to stop the volume
> but I can't, as no host is running in the deleted cluster and for some
> reason ovirt needs that.
> I can't delete the hosts either, as they have the volume... so I'm back
> in another chicken-and-egg problem.
> Any hints??
>
> PS: I can't nuke the whole ovirt platform as I have another DC in
> production, otherwise I would :-)
>
> Regards,
>
>
>
>


Re: [Users] Extremely poor disk access speeds in Windows guest

2014-01-29 Thread Steve Dainard
On Wed, Jan 29, 2014 at 5:11 AM, Vadim Rozenfeld wrote:

> On Wed, 2014-01-29 at 11:30 +0200, Ronen Hod wrote:
> > Adding the virtio-scsi developers.
> > Anyhow, virtio-scsi is newer and less established than viostor (the
> > block device), so you might want to try it out.
>
> [VR]
> Was it "SCSI Controller" or "SCSI pass-through controller"?
> If it's "SCSI Controller" then it will be viostor (virtio-blk) device
> driver.
>
>
"SCSI Controller" is listed in device manager.

Hardware ID's:
PCI\VEN_1AF4&DEV_1004&SUBSYS_00081AF4&REV_00
PCI\VEN_1AF4&DEV_1004&SUBSYS_00081AF4
PCI\VEN_1AF4&DEV_1004&CC_01
PCI\VEN_1AF4&DEV_1004&CC_0100
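
Vendor 1af4 is the Red Hat/virtio PCI vendor ID. On a Linux guest the same
devices can be listed by ID with lspci (a quick sketch, run inside any Linux
VM on the cluster), which makes it easy to see which virtio functions a VM
actually has:

lspci -nn -d 1af4: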



>
> > A disclaimer: There are time and patches gaps between RHEL and other
> > versions.
> >
> > Ronen.
> >
> > On 01/28/2014 10:39 PM, Steve Dainard wrote:
> >
> > > I've had a bit of luck here.
> > >
> > >
> > > Overall IO performance is very poor during Windows updates, but a
> > > contributing factor seems to be the "SCSI Controller" device in the
> > > guest. This last install I didn't install a driver for that device,
>
> [VR]
> Does it mean that your system disk is IDE and the data disk (virtio-blk)
> is not accessible?
>

In Ovirt 3.3.2-1.el6 I do not have an option to add a virtio-blk device:
Screenshot here:
https://dl.dropboxusercontent.com/u/21916057/Screenshot%20from%202014-01-29%2010%3A04%3A57.png

VM disk drive is "Red Hat VirtIO SCSI Disk Device", storage controller is
listed as "Red Hat VirtIO SCSI Controller" as shown in device manager.
Screenshot here:
https://dl.dropboxusercontent.com/u/21916057/Screenshot%20from%202014-01-29%2009%3A57%3A24.png

In Ovirt manager the disk interface is listed as "VirtIO".
Screenshot here:
https://dl.dropboxusercontent.com/u/21916057/Screenshot%20from%202014-01-29%2009%3A58%3A35.png


>
> > >  and my performance is much better. Updates still chug along quite
> > > slowly, but I seem to have more than the < 100KB/s write speeds I
> > > was seeing previously.
> > >
> > >
> > > Does anyone know what this device is for? I have the "Red Hat VirtIO
> > > SCSI Controller" listed under storage controllers.
>
> [VR]
> It's a virtio-blk device. OS cannot see this volume unless you have
> viostor.sys driver installed on it.
>

Interesting that my VMs can see the controller, but I can't add a disk for
that controller in Ovirt. Is there a package I have missed on install?

rpm -qa | grep ovirt
ovirt-host-deploy-java-1.1.3-1.el6.noarch
ovirt-engine-backend-3.3.2-1.el6.noarch
ovirt-engine-lib-3.3.2-1.el6.noarch
ovirt-engine-restapi-3.3.2-1.el6.noarch
ovirt-engine-sdk-python-3.3.0.8-1.el6.noarch
ovirt-log-collector-3.3.2-2.el6.noarch
ovirt-engine-dbscripts-3.3.2-1.el6.noarch
ovirt-engine-webadmin-portal-3.3.2-1.el6.noarch
ovirt-host-deploy-1.1.3-1.el6.noarch
ovirt-image-uploader-3.3.2-1.el6.noarch
ovirt-engine-websocket-proxy-3.3.2-1.el6.noarch
ovirt-engine-userportal-3.3.2-1.el6.noarch
ovirt-engine-setup-3.3.2-1.el6.noarch
ovirt-iso-uploader-3.3.2-1.el6.noarch
ovirt-engine-cli-3.3.0.6-1.el6.noarch
ovirt-engine-3.3.2-1.el6.noarch
ovirt-engine-tools-3.3.2-1.el6.noarch


> > >
> > > I've setup a NFS storage domain on my desktops SSD.
> > > I've re-installed
> > > win 2008 r2 and initially it was running smoother.
> > >
> > > Disk performance peaks at 100MB/s.
> > >
> > > If I copy a 250MB file from a share into the Windows
> > > VM, it writes out
> [VR]
> Do you copy it with Explorer or any other copy program?
>

Windows Explorer only.


> Do you have HPET enabled?
>

I can't find it in the guest 'system devices'. On the hosts the current
clock source is 'tsc', although 'hpet' is an available option.
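
I'm checking that on the hosts via sysfs, roughly:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource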


> How does it work with if you copy from/to local (non-NFS) storage?
>

Not sure; this is a royal pain to set up. Can I use my ISO domain in two
different data centers at the same time? I don't have an option to create
an ISO/NFS domain in the local storage DC.

When I use the import option with the default DC's ISO domain, I get an
error "There is no storage domain under the specified path. Check event log
for more details." VDMS logs show "Resource namespace
0e90e574-b003-4a62-867d-cf274b17e6b1_imageNS already registered" so I'm
guessing the answer is no.

I tried to deploy with WDS, but the 64bit drivers apparently aren't signed,
and on x86 I get an error about the NIC not being supported even with the
drivers added to WDS.



> What is your virtio-win drivers package origin and version?
>

virtio-win-0.1-74.iso ->
http://alt.fedoraproject.org/pub/alt/virtio-win/latest/images/


>
> Thanks,
> Vadim.
>
>
>
Appreciate it,
Steve


[Users] engine-backup restore how to

2014-01-29 Thread Steve Dainard
There doesn't seem to be any solid documentation on how to use the
engine-backup restore function, and I'm not able to restore a backup.

The best I've come up with is:
1. Install engine on new host
2. Stop engine
3. run engine-backup --mode=restore --file=filename --log=logfile

Fail.

Log shows:
psql: FATAL:  password authentication failed for user "engine"
2014-01-29 18:20:30 10285: FATAL: Can't connect to the database

4. engine-backup --mode=restore --file=engine.bak --log=logfile
--change-db-credentials --db-host=localhost --db-user=engine
--db-name=engine --db-password='newpassword'

Fails with same error.

5. switch to the postgres user, drop the old db, create a new db named engine,
and set the engine user's password to 'newpassword' (rough psql sketch below)
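
(Assuming a local PostgreSQL install and that the engine role already exists,
this is roughly what I ran as the postgres user:)

su - postgres -c psql
postgres=# DROP DATABASE engine;
postgres=# CREATE DATABASE engine OWNER engine;
postgres=# ALTER USER engine WITH PASSWORD 'newpassword';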

6. engine-backup --mode=restore --file=engine.bak --log=logfile
--change-db-credentials --db-host=localhost --db-user=engine
--db-name=engine --db-password='newpassword'

Restoring...
Rewriting /etc/ovirt-engine/engine.conf.d/10-setup-database.conf
Note: you might need to manually fix:
- iptables/firewalld configuration
- autostart of ovirt-engine service
You can now start the engine service and then restart httpd
Done.

7. start ovirt-engine, restart httpd, browse to web ui

Blank page, no content.

8. stop firewall, browse to web ui

Blank page, no content

9. Engine log contains:

2014-01-29 18:35:56,973 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service
thread 1-40) Value of property "SENSITIVE_KEYS" is
",ENGINE_DB_PASSWORD,ENGINE_PKI_TR
UST_STORE_PASSWORD,ENGINE_PKI_ENGINE_STORE_PASSWORD".
2014-01-29 18:35:57,330 ERROR [org.ovirt.engine.core.bll.Backend] (MSC
service thread
 1-25) Error in getting DB connection. The database is inaccessible.
Original exception is: BadSqlGrammarException: CallableStatementCallback;
bad SQL grammar [{call checkdbconnection()}]; nested exception is
org.postgresql.util.PSQLException: ERROR: function checkdbconnection() does
not exist
  Hint: No function matches the given name and argument types. You might
need to add explicit type casts.
  Position: 15
2014-01-29 18:35:58,336 ERROR [org.ovirt.engine.core.bll.Backend] (MSC
service thread 1-25) Error in getting DB connection. The database is
inaccessible. Original exception is: UncategorizedSQLException:
CallableStatementCallback; uncategorized SQLException for SQL [{call
checkdbconnection()}]; SQL state [25P02]; error code [0]; ERROR: current
transaction is aborted, commands ignored until end of transaction block;
nested exception is org.postgresql.util.PSQLException: ERROR: current
transaction is aborted, commands ignored until end of transaction block


*Steve Dainard *


Re: [Users] engine-backup restore how to

2014-01-29 Thread Steve Dainard
I also see this error in engine.log which repeats every second if I am
trying to access the web ui.

2014-01-29 18:59:47,531 ERROR [org.ovirt.engine.core.bll.Backend]
(ajp--127.0.0.1-8702-4) Error in getting DB connection. The database is
inaccessible. Original exception is: UncategorizedSQLException:
CallableStatementCallback; uncategorized SQLException for SQL [{call
checkdbconnection()}]; SQL state [25P02]; error code [0]; ERROR: current
transaction is aborted, commands ignored until end of transaction block;
nested exception is org.postgresql.util.PSQLException: ERROR: current
transaction is aborted, commands ignored until end of transaction block

It looks like the db restored correctly; I took a quick look through some
tables and can see the valid admin user and snapshots, but I can't say for
certain.

The IP address of the new server does not match the IP of the old (backed-up)
server; would this have any impact? I would think not, as it's a local db.

When I changed the password for the psql engine user, is there any config
file this is referenced in that may not have been updated?

Thanks,

*Steve Dainard *


On Wed, Jan 29, 2014 at 7:06 PM, Alon Bar-Lev  wrote:

>
>
> ----- Original Message -
> > From: "Steve Dainard" 
> > To: "users" 
> > Sent: Thursday, January 30, 2014 1:59:08 AM
> > Subject: [Users] engine-backup restore how to
> >
> > There doesn't seem to be any solid documentation on how to use the
> > engine-backup restore function, and I'm not able to restore a backup.
> >
> > The best I've come up with is:
> > 1. Install engine on new host
> > 2. Stop engine
> > 3. run engine-backup --mode=restore --file=filename --log=logfile
> >
> > Fail.
> >
> > Log shows:
> > psql: FATAL: password authentication failed for user "engine"
> > 2014-01-29 18:20:30 10285: FATAL: Can't connect to the database
> >
> > 4. engine-backup --mode=restore --file=engine.bak --log=logfile
> > --change-db-credentials --db-host=localhost --db-user=engine
> > --db-name=engine --db-password='newpassword'
> >
> > Fails with same error.
>
> the --db-password must match the user's actual password within the database;
> --change-db-credentials does not change the password in the database, only the
> host/port/user/password that are used by the engine.
>
> >
> > 5. change user to postgres, drop the old db, create a new db named
> engine,
> > set password for engine user same as 'newpassword'
> >
> > 6. engine-backup --mode=restore --file=engine.bak --log=logfile
> > --change-db-credentials --db-host=localhost --db-user=engine
> > --db-name=engine --db-password='newpassword'
>
> Ok, this is correct now.
>
> > Restoring...
> > Rewriting /etc/ovirt-engine/engine.conf.d/10-setup-database.conf
> > Note: you might need to manually fix:
> > - iptables/firewalld configuration
> > - autostart of ovirt-engine service
> > You can now start the engine service and then restart httpd
> > Done.
> >
> > 7. start ovirt-engine, restart httpd, browse to web ui
> >
> > Blank page, no content.
> >
> > 8. stop firewall, browse to web ui
> >
> > Blank page, no content
> >
> > 9. Engine log contains:
> >
> > 2014-01-29 18:35:56,973 INFO [org.ovirt.engine.core.utils.LocalConfig]
> (MSC
> > service
> > thread 1-40) Value of property "SENSITIVE_KEYS" is
> > ",ENGINE_DB_PASSWORD,ENGINE_PKI_TR
> > UST_STORE_PASSWORD,ENGINE_PKI_ENGINE_STORE_PASSWORD".
> > 2014-01-29 18:35:57,330 ERROR [org.ovirt.engine.core.bll.Backend] (MSC
> > service thread
> > 1-25) Error in getting DB connection. The database is inaccessible.
> Original
> > exception is: BadSqlGrammarException: CallableStatementCallback; bad SQL
> > grammar [{call checkdbconnection()}]; nested exception is
> > org.postgresql.util.PSQLException: ERROR: function checkdbconnection()
> does
> > not exist
> > Hint: No function match

Re: [Users] engine-backup restore how to

2014-01-30 Thread Steve Dainard
 is "
jsse.enableSNIExtension=false".
2014-01-30 10:24:19,022 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_PROXY_ENABLED" is
"true".
2014-01-30 10:24:19,022 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_PROXY_HTTPS_PORT" is
"443".
2014-01-30 10:24:19,023 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_PROXY_HTTP_PORT" is
"80".
2014-01-30 10:24:19,023 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_STOP_INTERVAL" is "1".
2014-01-30 10:24:19,024 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_STOP_TIME" is "10".
2014-01-30 10:24:19,024 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_TMP" is
"/var/tmp/ovirt-engine".
2014-01-30 10:24:19,024 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_UP_MARK" is
"/var/lib/ovirt-engine/engine.up".
2014-01-30 10:24:19,025 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_USER" is "ovirt".
2014-01-30 10:24:19,025 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_USR" is
"/usr/share/ovirt-engine".
2014-01-30 10:24:19,026 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_VAR" is
"/var/lib/ovirt-engine".
2014-01-30 10:24:19,026 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "ENGINE_VERBOSE_GC" is "false".
2014-01-30 10:24:19,027 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "JBOSS_HOME" is
"/usr/share/jboss-as".
2014-01-30 10:24:19,027 INFO  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) Value of property "SENSITIVE_KEYS" is
",ENGINE_DB_PASSWORD,ENGINE_PKI_TRUST_STORE_PASSWORD,ENGINE_PKI_ENGINE_STORE_PASSWORD".
2014-01-30 10:24:19,391 ERROR [org.ovirt.engine.core.bll.Backend] (MSC
service thread 1-2) Error in getting DB connection. The database is
inaccessible. Original exception is: BadSqlGrammarException:
CallableStatementCallback; bad SQL grammar [{call checkdbconnection()}];
nested exception is org.postgresql.util.PSQLException: ERROR: function
checkdbconnection() does not exist
  Hint: No function matches the given name and argument types. You might
need to add explicit type casts.
  Position: 15
2014-01-30 10:24:20,398 ERROR [org.ovirt.engine.core.bll.Backend] (MSC
service thread 1-2) Error in getting DB connection. The database is
inaccessible. Original exception is: UncategorizedSQLException:
CallableStatementCallback; uncategorized SQLException for SQL [{call
checkdbconnection()}]; SQL state [25P02]; error code [0]; ERROR: current
transaction is aborted, commands ignored until end of transaction block;
nested exception is org.postgresql.util.PSQLException: ERROR: current
transaction is aborted, commands ignored until end of transaction block
[last error repeats]

Thanks,

*Steve Dainard *


On Thu, Jan 30, 2014 at 2:18 AM, Yedidyah Bar David  wrote:

> *From: *"Steve Dainard" 
> *To: *"Alon Bar-Lev" 
> *Cc: *"users" , "Yedidyah Bar David" ,
> "Eli Mesika" 
> *Sent: *Thursday, January 30, 2014 7:44:01 AM
> *Subject: *Re: [Users] engine-backup restore how to
>
>
> I also see this error in engine.log which repeats every second if I am
> trying to access the web ui.
>
> 2014-01-29 18:59:47,531 ERROR [org.ovirt.engine.core.bll.Backend]
> (ajp--127.0.0.1-8702-4) Error in getting DB connection. The database is
> inaccessible. Original exception is: UncategorizedSQLException:
> CallableStatementCallback; uncategorized SQLException for SQL [{call
> checkdbconnection()}]; SQL state [25P02]; error code [0]; ERROR: current
> transaction is aborted, commands ignored until end of transaction block;
> nested exception is org.postgresql.util.PSQLException: ERROR: current
> transaction is aborted, commands ignored until end of transaction block
>
> It looks like the db inserted correctly, I took a quick look through some
> tables and can see the valid admin user, and snapshots. But I can't say for
> certain.
>
> The IP address of the new server does not match the IP of the old (backup
> file) server, would this have any impact? I would think not as its a local
> db.
>
> When I changed the password for the psql engine user, is there any config
> file this is referenced in that may not have been updated?
>
>
> In principle, the only needed file is
> /etc/ovirt-engine/engine.conf.d/10-setup-database.conf
> which was updated by restore. Can you please verify that you can connect
> to the database
> using the credentials in this file? What are its permissions/owner?
>
> Thanks,
> --
> Didi
>
>


Re: [Users] engine-backup restore how to

2014-01-30 Thread Steve Dainard
Is this file supposed to exist:
2014-01-30 10:24:18,990 WARN  [org.ovirt.engine.core.utils.LocalConfig]
(MSC service thread 1-23) The file "/etc/ovirt-engine/engine.conf" doesn't
exist or isn't readable. Will return an empty set of properties.

I can't find it anywhere on the system.
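
For what it's worth, I'm verifying the credentials that the restore wrote out
roughly like this (host/user/db names below are from my setup; the conf file
is shell-sourceable):

source /etc/ovirt-engine/engine.conf.d/10-setup-database.conf
PGPASSWORD="${ENGINE_DB_PASSWORD}" psql -h localhost -U engine engine -c 'SELECT 1;'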

*Steve Dainard *


On Thu, Jan 30, 2014 at 10:32 AM, Steve Dainard wrote:

> I can connect to the db with the 'engine' user.
>
> Initially 'engine' wasn't a member of any roles, I added it to 'postgres'.
> engine=> \du
>                List of roles
>  Role name |  Attributes  | Member of
> -----------+--------------+-----------
>  engine    |              | {postgres}
>  postgres  | Superuser    | {}
>            : Create role
>            : Create DB
>
> If you meant file permissions they are:
> # ll
> total 24
> -rw---. 1 ovirt ovirt 380 Jan 29 18:35 10-setup-database.conf
> -rw---. 1 ovirt ovirt 378 Jan 15 15:58
> 10-setup-database.conf.20140129183539
> -rw-r--r--. 1 root  root   33 Jan 15 15:58 10-setup-jboss.conf
> -rw---. 1 ovirt ovirt 384 Jan 15 15:59 10-setup-pki.conf
> -rw-r--r--. 1 root  root  259 Jan 15 15:58 10-setup-protocols.conf
> -rw-r--r--. 1 root  root  204 Dec 13 03:22 README
>
>
> On ovirt-engine restart (engine.log):
> i
> 2014-01-30 10:24:18,988 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Loaded file
> "/usr/share/ovirt-engine/services/ovirt-engine/ovirt-engine.conf".
> 2014-01-30 10:24:18,990 WARN  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) The file "/etc/ovirt-engine/engine.conf" doesn't
> exist or isn't readable. Will return an empty set of properties.
> 2014-01-30 10:24:18,991 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Loaded file
> "/etc/ovirt-engine/engine.conf.d/10-setup-database.conf".
> 2014-01-30 10:24:18,992 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Loaded file
> "/etc/ovirt-engine/engine.conf.d/10-setup-jboss.conf".
> 2014-01-30 10:24:18,994 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Loaded file
> "/etc/ovirt-engine/engine.conf.d/10-setup-pki.conf".
> 2014-01-30 10:24:18,994 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Loaded file
> "/etc/ovirt-engine/engine.conf.d/10-setup-protocols.conf".
> 2014-01-30 10:24:18,995 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_AJP_ENABLED" is "true".
> 2014-01-30 10:24:18,996 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_AJP_PORT" is "8702".
> 2014-01-30 10:24:18,997 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_APPS" is "engine.ear".
> 2014-01-30 10:24:18,997 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_CACHE" is
> "/var/cache/ovirt-engine".
> 2014-01-30 10:24:18,998 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_DB_CHECK_INTERVAL" is
> "1000".
> 2014-01-30 10:24:18,998 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_DB_CONNECTION_TIMEOUT"
> is "30".
> 2014-01-30 10:24:18,999 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_DB_DATABASE" is
> "engine".
> 2014-01-30 10:24:19,000 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_DB_DRIVER" is
> "org.postgresql.Driver".
> 2014-01-30 10:24:19,000 INFO  [org.ovirt.engine.core.utils.LocalConfig]
> (MSC service thread 1-23) Value of property "ENGINE_DB_HOST" is "localhost".
> 2014-01-30 10:24:19,001 INFO  [org.ovir

Re: [Users] Extremely poor disk access speeds in Windows guest

2014-01-31 Thread Steve Dainard
I've reconfigured my setup (good success below, but I need clarity on a
gluster option):

Two nodes total, both running virt and glusterfs storage (2-node replica,
quorum).

I've created an NFS storage domain pointed at the first node's IP address.
I've launched a 2008 R2 SP1 install with a virtio-scsi disk and the SCSI
pass-through driver, on the same node the NFS domain points at.

Windows guest install has been running for roughly 1.5 hours, still
"Expanding Windows files (55%) ..."

top is showing:
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND

 3609 root  20   0 1380m  33m 2604 S 35.4  0.1 231:39.75 glusterfsd

21444 qemu  20   0 6362m 4.1g 6592 S 10.3  8.7  10:11.53 qemu-kvm


This is a 2-socket, 6-core Xeon machine with 48 GB of RAM and 6x 7200 rpm
enterprise SATA disks in RAID 5, so I don't think we're hitting hardware
limitations.

dd on xfs (no gluster)

time dd if=/dev/zero of=test bs=1M count=2048
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB) copied, 4.15787 s, 516 MB/s

real 0m4.351s
user 0m0.000s
sys 0m1.661s


time dd if=/dev/zero of=test bs=1k count=2000000
2000000+0 records in
2000000+0 records out
2048000000 bytes (2.0 GB) copied, 4.06949 s, 503 MB/s

real 0m4.260s
user 0m0.176s
sys 0m3.991s


I've enabled nfs.trusted-sync (
http://gluster.org/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options#nfs.trusted-sync
) on the gluster volume, and the speed difference is immeasurable. Can
anyone explain what this option does, and what the risks are with a 2-node
gluster replica volume with quorum enabled?
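
For reference, I set it with something along these lines (volume name from my
setup; gluster volume info shows the change under Options Reconfigured
afterwards):

gluster volume set rep1 nfs.trusted-sync on
gluster volume info rep1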

Thanks,


Re: [Users] Extremely poor disk access speeds in Windows guest

2014-01-31 Thread Steve Dainard
>
>
> I've enabled nfs.trusted-sync (
> http://gluster.org/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options#nfs.trusted-sync
> ) on the gluster volume, and the speed difference is immeasurable . Can
> anyone explain what this option does, and what the risks are with a 2
> node gluster replica volume with quorum enabled?
>
>
Sorry, I understand async; I meant the nfs.trusted-write option, and whether it
would help in this situation.


Re: [Users] oVirt 3.3.3 RC EL6 Live Snapshot

2014-01-31 Thread Steve Dainard
>
>
> How would you developers, speaking for the oVirt-community, propose to
> solve this for CentOS _now_ ?
>
> I would imagine that the easiest way is that you build and host this one
> package(qemu-kvm-rhev), since you´ve basically already have the source
> and recipe (since you´re already providing it for RHEV anyway). Then,
> once that´s in place, it´s more a question of where to host the
> packages, in what repository. Be it your own, or some other repo set up
> for the SIG.
>
> This is my view, how I as a user view this issue.
>
>
>
I think this is a pretty valid view.

What would it take to get the correct qemu package hosted in the ovirt repo?



> --
>
> Med Vänliga Hälsningar
>
>
> ---
> Karli Sjöberg
> Swedish University of Agricultural Sciences Box 7079 (Visiting Address
> Kronåsvägen 8)
> S-750 07 Uppsala, Sweden
> Phone:  +46-(0)18-67 15 66
> karli.sjob...@slu.se
>


Re: [Users] Extremely poor disk access speeds in Windows guest

2014-01-31 Thread Steve Dainard
IDE is just as slow: just over 2 hours for the 2008 R2 install.

Is this what you mean by kvm?
lsmod | grep kvm
kvm_intel  54285  3
kvm   332980  1 kvm_intel
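
I'm also sanity-checking hardware virtualization support and the kvm device on
the host, roughly:

egrep -c '(vmx|svm)' /proc/cpuinfo
ls -l /dev/kvm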


*Steve Dainard *


On Fri, Jan 31, 2014 at 8:20 PM, Vadim Rozenfeld wrote:

> On Fri, 2014-01-31 at 11:37 -0500, Steve Dainard wrote:
> > I've reconfigured my setup (good succes below, but need clarity on
> > gluster option):
> >
> >
> > Two nodes total, both running virt and glusterfs storage (2 node
> > replica, quorum).
> >
> >
> > I've created an NFS storage domain, pointed at the first nodes IP
> > address. I've launched a 2008 R2 SP1 install with a virtio-scsi disk,
> > and the SCSI pass-through driver on the same node as the NFS domain is
> > pointing at.
> >
> >
> > Windows guest install has been running for roughly 1.5 hours, still
> > "Expanding Windows files (55%) ..."
>
> [VR]
> Does it work faster with IDE?
> Do you have kvm enabled?
> Thanks,
> Vadim.
>
> >
> >
> > top is showing:
> >   PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND
> >
> >  3609 root  20   0 1380m  33m 2604 S 35.4  0.1 231:39.75
> > glusterfsd
> > 21444 qemu  20   0 6362m 4.1g 6592 S 10.3  8.7  10:11.53 qemu-kvm
> >
> >
> >
> > This is a 2 socket, 6 core xeon machine with 48GB of RAM, and 6x
> > 7200rpm enterprise sata disks in RAID5 so I don't think we're hitting
> > hardware limitations.
> >
> >
> > dd on xfs (no gluster)
> >
> >
> > time dd if=/dev/zero of=test bs=1M count=2048
> > 2048+0 records in
> > 2048+0 records out
> > 2147483648 bytes (2.1 GB) copied, 4.15787 s, 516 MB/s
> >
> >
> > real 0m4.351s
> > user 0m0.000s
> > sys 0m1.661s
> >
> >
> >
> >
> > time dd if=/dev/zero of=test bs=1k count=2000000
> > 2000000+0 records in
> > 2000000+0 records out
> > 2048000000 bytes (2.0 GB) copied, 4.06949 s, 503 MB/s
> >
> >
> > real 0m4.260s
> > user 0m0.176s
> > sys 0m3.991s
> >
> >
> >
> >
> > I've enabled nfs.trusted-sync
> > (
> http://gluster.org/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options#nfs.trusted-sync)
> on the gluster volume, and the speed difference is immeasurable . Can
> anyone explain what this option does, and what the risks are with a 2 node
> gluster replica volume with quorum enabled?
> >
> >
> > Thanks,
>
>
>


Re: [Users] oVirt 3.3.3 RC EL6 Live Snapshot

2014-02-01 Thread Steve Dainard
I have two CentOS 6.5 Ovirt hosts (ovirt001, ovirt002)

I've installed the applicable qemu-kvm-rhev packages from this site:
http://www.dreyou.org/ovirt/vdsm32/Packages/ on ovirt002.

On ovirt001 if I take a live snapshot:

Snapshot 'test qemu-kvm' creation for VM 'snapshot-test' was initiated by
admin@internal.
The VM is paused
Failed to create live snapshot 'test qemu-kvm' for VM 'snapshot-test'. VM
restart is recommended.
Failed to complete snapshot 'test qemu-kvm' creation for VM 'snapshot-test'.
The VM is then started, and the status for the snapshot changes to OK.

On ovirt002 (with the packages from dreyou) I don't get any messages about
a snapshot failing, but my VM is still paused to complete the snapshot. Is
there something else other than the qemu-kvm-rhev packages that would
enable this functionality?

I've looked for some information on when the packages would be built as
required in the CentOS repos, but I don't see anything definitive.

http://lists.ovirt.org/pipermail/users/2013-December/019126.html Looks like
one of the maintainers is waiting for someone to tell him what flags need
to be set.

Also, another thread here:
http://comments.gmane.org/gmane.comp.emulators.ovirt.arch/1618 same
maintainer, mentioning that he hasn't seen anything in the bug tracker.

There is a bug here: https://bugzilla.redhat.com/show_bug.cgi?id=1009100 that
seems to have ended in finding a way for qemu to expose whether it supports
live snapshots, rather than figuring out how to get the CentOS team the
info they need to build the packages with the proper flags set.
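
If vdsm on the host is new enough to report it, the capability should be
visible in getVdsCaps; a rough check (key name assumed):

vdsClient -s 0 getVdsCaps | grep -i livesnapshot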

I have bcc'd both dreyou (who packaged the qemu-kvm-rhev packages listed above)
and Russ (the CentOS maintainer mentioned in the other threads) in case they
wish to chime in and perhaps collaborate on which flags, if any, should be set
for the qemu-kvm builds, so we can get a CentOS bug report going and hammer
this out.

Thanks everyone.

**crosses fingers and hopes for live snapshots soon**



*Steve Dainard *


On Fri, Jan 31, 2014 at 1:26 PM, Steve Dainard wrote:

>
>> How would you developers, speaking for the oVirt-community, propose to
>> solve this for CentOS _now_ ?
>>
>> I would imagine that the easiest way is that you build and host this one
>> package(qemu-kvm-rhev), since you´ve basically already have the source
>> and recipe (since you´re already providing it for RHEV anyway). Then,
>> once that´s in place, it´s more a question of where to host the
>> packages, in what repository. Be it your own, or some other repo set up
>> for the SIG.
>>
>> This is my view, how I as a user view this issue.
>>
>>
>>
> I think this is a pretty valid view.
>
> What would it take to get the correct qemu package hosted in the ovirt
> repo?
>
>
>
>>  --
>>
>> Med Vänliga Hälsningar
>>
>>
>> ---
>> Karli Sjöberg
>> Swedish University of Agricultural Sciences Box 7079 (Visiting Address
>> Kronåsvägen 8)
>> S-750 07 Uppsala, Sweden
>> Phone:  +46-(0)18-67 15 66
>> karli.sjob...@slu.se
>>
>
>


Re: [Users] oVirt 3.3.3 RC EL6 Live Snapshot

2014-02-03 Thread Steve Dainard
[root@ovirt002 ~]# vdsClient -s 0 getStorageDomainInfo
a52938f7-2cf4-4771-acb2-0c78d14999e5
uuid = a52938f7-2cf4-4771-acb2-0c78d14999e5
pool = ['fcb89071-6cdb-4972-94d1-c9324cebf814']
lver = 5
version = 3
role = Master
remotePath = gluster-store-vip:/rep1
spm_id = 2
type = NFS
class = Data
master_ver = 1
name = gluster-store-rep1


*Steve Dainard *


On Sun, Feb 2, 2014 at 2:55 PM, Dafna Ron  wrote:

> please run vdsClient -s 0 getStorageDomainInfo a52938f7-2cf4-4771-acb2-
> 0c78d14999e5
>
> Thanks,
>
> Dafna
>
>
>
> On 02/02/2014 03:02 PM, Steve Dainard wrote:
>
>> Logs attached with VM running on qemu-kvm-rhev packages installed.
>>
>> *Steve Dainard *
>>
>>
>> On Sun, Feb 2, 2014 at 5:05 AM, Dafna Ron > d...@redhat.com>> wrote:
>>
>> can you please upload full engine, vdsm, libvirt and vm's qemu logs?
>>
>>
>> On 02/02/2014 02:08 AM, Steve Dainard wrote:
>>
>> I have two CentOS 6.5 Ovirt hosts (ovirt001, ovirt002)
>>
>> I've installed the applicable qemu-kvm-rhev packages from this
>> site: http://www.dreyou.org/ovirt/vdsm32/Packages/ on ovirt002.
>>
>> On ovirt001 if I take a live snapshot:
>>
>> Snapshot 'test qemu-kvm' creation for VM 'snapshot-test' was
>> initiated by admin@internal.
>> The VM is paused
>> Failed to create live snapshot 'test qemu-kvm' for VM
>> 'snapshot-test'. VM restart is recommended.
>> Failed to complete snapshot 'test qemu-kvm' creation for VM
>> 'snapshot-test'.
>> The VM is then started, and the status for the snapshot
>> changes to OK.
>>
>> On ovirt002 (with the packages from dreyou) I don't get any
>> messages about a snapshot failing, but my VM is still paused
>> to complete the snapshot. Is there something else other than
>> the qemu-kvm-rhev packages that would enable this functionality?
>>
>> I've looked for some information on when the packages would be
>> built as required in the CentOS repos, but I don't see
>> anything definitive.
>>
>> http://lists.ovirt.org/pipermail/users/2013-December/019126.html
>> Looks like one of the maintainers is waiting for someone to
>> tell him what flags need to be set.
>>
>> Also, another thread here:
>> http://comments.gmane.org/gmane.comp.emulators.ovirt.arch/1618
>> same maintainer, mentioning that he hasn't seen anything in
>> the bug tracker.
>>
>> There is a bug here:
>> https://bugzilla.redhat.com/show_bug.cgi?id=1009100 that seems
>> to have ended in finding a way for qemu to expose whether it
>> supports live snapshots, rather than figuring out how to get
>>     the CentOS team the info they need to build the packages with
>> the proper flags set.
>>
>> I have bcc'd both dreyou (packaged the qemu-kvm-rhev packages
>> listed above) and Russ (CentOS maintainer mentioned in the
>>  

Re: [Users] oVirt 3.3.3 RC EL6 Live Snapshot

2014-02-03 Thread Steve Dainard
FYI I'm running version 3.3.2, not the 3.3.3 beta.

*Steve Dainard *


On Mon, Feb 3, 2014 at 11:24 AM, Dafna Ron  wrote:

> Thanks Steve.
>
> from the logs I can see that the create snapshot succeeds and that the vm
> is resumed.
> the vm moves to pause as part of libvirt flows:
>
> 2014-02-02 14:41:20.872+: 5843: debug : qemuProcessHandleStop:728 :
> Transitioned guest snapshot-test to paused state
> 2014-02-02 14:41:30.031+: 5843: debug : qemuProcessHandleResume:776 :
> Transitioned guest snapshot-test out of paused into resumed state
>
> There are bugs here but I am not sure yet if this is libvirt regression or
> engine.
>
> I'm adding Elad and Maor since in engine logs I can't see anything calling
> for live snapshot (only for snapshot) - Maor, shouldn't live snapshot
> command be logged somewhere in the logs?
> Is it possible that engine is calling to create snapshot and not create
> live snapshot which is why the vm pauses?
>
> Elad, if engine is not logging live snapshot anywhere I would open a bug
> for engine (to print that in the logs).
> Also, there is a bug in vdsm log for sdc where the below is logged as
> ERROR and not INFO:
>
> Thread-23::ERROR::2014-02-02 09:51:19,497::sdc::137::
> Storage.StorageDomainCache::(_findDomain) looking for unfetched domain
> a52938f7-2cf4-4771-acb2-0c78d14999e5
> Thread-23::ERROR::2014-02-02 09:51:19,497::sdc::154::
> Storage.StorageDomainCache::(_findUnfetchedDomain) looking for domain
> a52938f7-2cf4-4771-acb2-0c78d14999e5
>
> If the engine was sending live snapshot or if there is no difference in
> the two commands in engine side than I would open a bug for libvirt for
> pausing the vm during live snapshot.
>
> Dafna
>
>
> On 02/03/2014 02:41 PM, Steve Dainard wrote:
>
>> [root@ovirt002 ~]# vdsClient -s 0 getStorageDomainInfo
>> a52938f7-2cf4-4771-acb2-0c78d14999e5
>> uuid = a52938f7-2cf4-4771-acb2-0c78d14999e5
>> pool = ['fcb89071-6cdb-4972-94d1-c9324cebf814']
>> lver = 5
>> version = 3
>> role = Master
>> remotePath = gluster-store-vip:/rep1
>> spm_id = 2
>> type = NFS
>> class = Data
>> master_ver = 1
>> name = gluster-store-rep1
>>
>>
>> *Steve Dainard *
>> IT Infrastructure Manager
>> Miovision <http://miovision.com/> | /Rethink Traffic/
>> 519-513-2407 ex.250
>> 877-646-8476 (toll-free)
>>
>> *Blog <http://miovision.com/blog> | **LinkedIn <https://www.linkedin.com/
>> company/miovision-technologies>  | Twitter <https://twitter.com/miovision>
>>  | Facebook <https://www.facebook.com/miovision>*
>> 
>> Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener,
>> ON, Canada | N2C 1L3
>> This e-mail may contain information that is privileged or confidential.
>> If you are not the intended recipient, please delete the e-mail and any
>> attachments and notify us immediately.
>>
>>
>> On Sun, Feb 2, 2014 at 2:55 PM, Dafna Ron > d...@redhat.com>> wrote:
>>
>> please run vdsClient -s 0 getStorageDomainInfo
>> a52938f7-2cf4-4771-acb2-0c78d14999e5
>>
>> Thanks,
>>
>> Dafna
>>
>>
>>
>> On 02/02/2014 03:02 PM, Steve Dainard wrote:
>>
>> Logs attached with VM running on qemu-kvm-rhev packages installed.
>>
>> *Steve Dainard *
>> IT Infrastructure Manager
>> Miovision <http://miovision.com/> | /Rethink Traffic/
>> 519-513-2407  ex.250
>>
>> 877-646-8476  (toll-free)
>>
>>
>> *Blog <http://miovision.com/blog> | **LinkedIn
>> <https://www.linkedin.com/company/miovision-technologies>  |
>> Twitter <https://twitter.com/miovision>  | Facebook
>> <https://www.facebook.com/miovision>*
>> -

Re: [Users] oVirt 3.3.3 RC EL6 Live Snapshot

2014-02-03 Thread Steve Dainard
[root@ovirt002 ~]# rpm -qa | egrep 'qemu|vdsm|libvirt' | sort
gpxe-roms-qemu-0.9.7-6.10.el6.noarch
libvirt-0.10.2-29.el6_5.3.x86_64
libvirt-client-0.10.2-29.el6_5.3.x86_64
libvirt-lock-sanlock-0.10.2-29.el6_5.3.x86_64
libvirt-python-0.10.2-29.el6_5.3.x86_64
qemu-img-rhev-0.12.1.2-2.355.el6.5.x86_64
qemu-kvm-rhev-0.12.1.2-2.355.el6.5.x86_64
qemu-kvm-rhev-tools-0.12.1.2-2.355.el6.5.x86_64
vdsm-4.13.3-2.el6.x86_64
vdsm-cli-4.13.3-2.el6.noarch
vdsm-python-4.13.3-2.el6.x86_64
vdsm-xmlrpc-4.13.3-2.el6.noarch


*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Mon, Feb 3, 2014 at 11:54 AM, Dafna Ron  wrote:

> Can you also put the vdsm, libvirt and qemu packages?
>
> Thanks,
> Dafna
>
>
>
> On 02/03/2014 04:49 PM, Steve Dainard wrote:
>
>> FYI I'm running version 3.3.2, not the 3.3.3 beta.
>>
>> *Steve Dainard *
>> IT Infrastructure Manager
>> Miovision <http://miovision.com/> | /Rethink Traffic/
>> 519-513-2407 ex.250
>> 877-646-8476 (toll-free)
>>
>> *Blog <http://miovision.com/blog> | **LinkedIn <https://www.linkedin.com/
>> company/miovision-technologies>  | Twitter <https://twitter.com/miovision>
>>  | Facebook <https://www.facebook.com/miovision>*
>> 
>> Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener,
>> ON, Canada | N2C 1L3
>> This e-mail may contain information that is privileged or confidential.
>> If you are not the intended recipient, please delete the e-mail and any
>> attachments and notify us immediately.
>>
>>
>> On Mon, Feb 3, 2014 at 11:24 AM, Dafna Ron > d...@redhat.com>> wrote:
>>
>> Thanks Steve.
>>
>> from the logs I can see that the create snapshot succeeds and that
>> the vm is resumed.
>> the vm moves to pause as part of libvirt flows:
>>
>> 2014-02-02 14:41:20.872+: 5843: debug :
>> qemuProcessHandleStop:728 : Transitioned guest snapshot-test to
>> paused state
>> 2014-02-02 14:41:30.031+: 5843: debug :
>> qemuProcessHandleResume:776 : Transitioned guest snapshot-test out
>> of paused into resumed state
>>
>> There are bugs here but I am not sure yet if this is libvirt
>> regression or engine.
>>
>> I'm adding Elad and Maor since in engine logs I can't see anything
>> calling for live snapshot (only for snapshot) - Maor, shouldn't
>> live snapshot command be logged somewhere in the logs?
>> Is it possible that engine is calling to create snapshot and not
>> create live snapshot which is why the vm pauses?
>>
>> Elad, if engine is not logging live snapshot anywhere I would open
>> a bug for engine (to print that in the logs).
>> Also, there is a bug in vdsm log for sdc where the below is logged
>> as ERROR and not INFO:
>>
>> Thread-23::ERROR::2014-02-02
>> 09:51:19,497::sdc::137::Storage.StorageDomainCache::(_findDomain)
>> looking for unfetched domain a52938f7-2cf4-4771-acb2-0c78d14999e5
>>     Thread-23::ERROR::2014-02-02
>> 09:51:19,497::sdc::154::Storage.StorageDomainCache::(_
>> findUnfetchedDomain)
>> looking for domain a52938f7-2cf4-4771-acb2-0c78d14999e5
>>
>> If the engine was sending live snapshot or if there is no
>> difference in the two commands in engine side than I would open a
>> bug for libvirt for pausing the vm during live snapshot.
>>
>> Dafna
>>
>>
>> On 02/03/2014 02:41 PM, Steve Dainard wrote:
>>
>> [root@ovirt002 ~]# vdsClient -s 0 getStorageDomainInfo
>> a52938f7-2cf4-4771-acb2-0c78d14999e5
>> uuid = a52938f7-2cf4-4771-acb2-0c78d14999e5
>> pool = ['fcb89071-6cdb-4972-94d1-c9324cebf814']
>> lver = 5
>> version = 3
>> role = Master
>> remotePath = gluster-store-vip:/rep1
>

Re: [Users] oVirt 3.3.3 RC EL6 Live Snapshot

2014-02-03 Thread Steve Dainard
When I trigger the live snapshot while pinging the guest I get the
following high latency but no packet loss:

...
64 bytes from 10.0.6.228: icmp_seq=32 ttl=63 time=0.267 ms
64 bytes from 10.0.6.228: icmp_seq=33 ttl=63 time=0.319 ms
64 bytes from 10.0.6.228: icmp_seq=34 ttl=63 time=0.231 ms
64 bytes from 10.0.6.228: icmp_seq=35 ttl=63 time=0.294 ms
64 bytes from 10.0.6.228: icmp_seq=36 ttl=63 time=0.357 ms
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=37 ttl=63
time=10375 ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=38 ttl=63 time=9375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=39 ttl=63 time=8375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=40 ttl=63 time=7375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=41 ttl=63 time=6375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=42 ttl=63 time=5375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=43 ttl=63 time=4375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=44 ttl=63 time=3375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=45 ttl=63 time=2375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=46 ttl=63 time=1375
ms*
*64 bytes from 10.0.6.228 <http://10.0.6.228>: icmp_seq=47 ttl=63 time=375
ms*
64 bytes from 10.0.6.228: icmp_seq=48 ttl=63 time=0.324 ms
64 bytes from 10.0.6.228: icmp_seq=49 ttl=63 time=0.232 ms
64 bytes from 10.0.6.228: icmp_seq=50 ttl=63 time=0.318 ms
64 bytes from 10.0.6.228: icmp_seq=51 ttl=63 time=0.297 ms
64 bytes from 10.0.6.228: icmp_seq=52 ttl=63 time=0.343 ms
64 bytes from 10.0.6.228: icmp_seq=53 ttl=63 time=0.293 ms
64 bytes from 10.0.6.228: icmp_seq=54 ttl=63 time=0.286 ms
64 bytes from 10.0.6.228: icmp_seq=55 ttl=63 time=0.302 ms
64 bytes from 10.0.6.228: icmp_seq=56 ttl=63 time=0.304 ms
64 bytes from 10.0.6.228: icmp_seq=57 ttl=63 time=0.305 ms
^C
--- 10.0.6.228 ping statistics ---
57 packets transmitted, 57 received, 0% packet loss, time 56000ms
rtt min/avg/max/mdev = 0.228/1037.547/10375.035/2535.522 ms, pipe 11

So the guest is in some sort of paused mode, but it's interesting that the
pings seem to be queued rather than dropped.
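
For what it's worth, a crude way to time how long the guest actually sits
paused during the snapshot - a sketch only, assuming read-only virsh access
on the host and using the VM name from the logs:

while true; do
    # virsh -r is read-only, so it doesn't need vdsm's SASL credentials
    echo "$(date +%T.%N) $(virsh -r domstate snapshot-test)"
    sleep 0.5
done

The window reported as paused should line up with the gap in the pings.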

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Mon, Feb 3, 2014 at 12:08 PM, Maor Lipchuk  wrote:

> From the engine logs it seems that indeed live snapshot is called (The
> command is snapshotVDSCommand see [1]).
> This is done right after the snapshot has been created in the VM and it
> signals the qemu process to start using the new volume created.
>
> When live snapshot does not succeed we should see in the log something
> like "Wasn't able to live snapshot due to error:...", but it does not
> appear so it seems that this worked out fine.
>
> At some point I can see in the logs that VDSM reports to the engine that
> the VM is paused.
>
>
> [1]
> 2014-02-02 09:41:20,564 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
> (pool-6-thread-49) START, SnapshotVDSCommand(HostName = ovirt002, HostId
> = 3080fb61-2d03-4008-b47f-9b66276a4257,
> vmId=e261e707-a21f-4ae8-9cff-f535f4430446), log id: 7e0d7872
> 2014-02-02 09:41:21,119 INFO
> [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
> (DefaultQuartzScheduler_Worker-93) VM snapshot-test
> e261e707-a21f-4ae8-9cff-f535f4430446 moved from Up --> Paused
> 2014-02-02 09:41:30,234 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.SnapshotVDSCommand]
> (pool-6-thread-49) FINISH, SnapshotVDSCommand, log id: 7e0d7872
> 2014-02-02 09:41:30,238 INFO
> [org.ovirt.engine.core.bll.CreateSnapshotCommand] (pool-6-thread-49)
> [67ea047a] Ending command successfully:
> org.ovirt.engine.core.bll.CreateSnapshotCommand
> ...
>
> Regards,
> Maor
>
> On 02/03/2014 06:24 PM, Dafna Ron wrote:
> > Thanks Steve.
> >
> > from the logs I can see that the create snapshot succeeds and that the
> > vm is resumed.
> > the vm moves to pause as part of libvirt flows:
> >
> > 2014-02-02 14:41:20.872+: 5843: debug : qemuProcessHandleStop:728 :
> > Transitioned guest snapshot-test to paused state
> > 2014-02-02 14:41:30.031+: 5843: debug : qemuProcessHandleResume

[Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain

2014-02-04 Thread Steve Dainard
ostId, async)
  File "/usr/share/vdsm/storage/clusterlock.py", line 189, in acquireHostId
raise se.AcquireHostIdFailure(self._sdUUID, e)
AcquireHostIdFailure: Cannot acquire host id:
('471487ed-2946-4dfc-8ec3-96546006be12', SanlockException(22, 'Sanlock
lockspace add failure', 'Invalid argument'))
Thread-31::DEBUG::2014-02-04
09:54:08,826::task::869::TaskManager.Task::(_run)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::Task._run:
66924dbf-5a1c-473e-a158-d038aae38dc3 (None,
'8c4e8898-c91a-4d49-98e8-b6467791a9cc', 'IT',
'471487ed-2946-4dfc-8ec3-96546006be12',
['471487ed-2946-4dfc-8ec3-96546006be12'], 3, None, 5, 60, 10, 3) {} failed
- stopping task
Thread-31::DEBUG::2014-02-04
09:54:08,826::task::1194::TaskManager.Task::(stop)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::stopping in state preparing
(force False)
Thread-31::DEBUG::2014-02-04
09:54:08,826::task::974::TaskManager.Task::(_decref)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::ref 1 aborting True
Thread-31::INFO::2014-02-04
09:54:08,826::task::1151::TaskManager.Task::(prepare)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::aborting: Task is aborted:
'Cannot acquire host id' - code 661
Thread-31::DEBUG::2014-02-04
09:54:08,826::task::1156::TaskManager.Task::(prepare)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::Prepare: aborted: Cannot
acquire host id
Thread-31::DEBUG::2014-02-04
09:54:08,827::task::974::TaskManager.Task::(_decref)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::ref 0 aborting True
Thread-31::DEBUG::2014-02-04
09:54:08,827::task::909::TaskManager.Task::(_doAbort)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::Task._doAbort: force False
Thread-31::DEBUG::2014-02-04
09:54:08,827::resourceManager::976::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-31::DEBUG::2014-02-04
09:54:08,827::task::579::TaskManager.Task::(_updateState)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::moving from state preparing ->
state aborting
Thread-31::DEBUG::2014-02-04
09:54:08,827::task::534::TaskManager.Task::(__state_aborting)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::_aborting: recover policy none
Thread-31::DEBUG::2014-02-04
09:54:08,827::task::579::TaskManager.Task::(_updateState)
Task=`66924dbf-5a1c-473e-a158-d038aae38dc3`::moving from state aborting ->
state failed
Thread-31::DEBUG::2014-02-04
09:54:08,827::resourceManager::939::ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources
{'Storage.471487ed-2946-4dfc-8ec3-96546006be12': < ResourceRef
'Storage.471487ed-2946-4dfc-8ec3-96546006be12', isValid: 'True' obj:
'None'>, 'Storage.8c4e8898-c91a-4d49-98e8-b6467791a9cc': < ResourceRef
'Storage.8c4e8898-c91a-4d49-98e8-b6467791a9cc', isValid: 'True' obj:
'None'>}
Thread-31::DEBUG::2014-02-04
09:54:08,828::resourceManager::976::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-31::DEBUG::2014-02-04
09:54:08,828::resourceManager::615::ResourceManager::(releaseResource)
Trying to release resource 'Storage.471487ed-2946-4dfc-8ec3-96546006be12'
Thread-31::DEBUG::2014-02-04
09:54:08,828::resourceManager::634::ResourceManager::(releaseResource)
Released resource 'Storage.471487ed-2946-4dfc-8ec3-96546006be12' (0 active
users)
Thread-31::DEBUG::2014-02-04
09:54:08,828::resourceManager::640::ResourceManager::(releaseResource)
Resource 'Storage.471487ed-2946-4dfc-8ec3-96546006be12' is free, finding
out if anyone is waiting for it.
Thread-31::DEBUG::2014-02-04
09:54:08,828::resourceManager::648::ResourceManager::(releaseResource) No
one is waiting for resource 'Storage.471487ed-2946-4dfc-8ec3-96546006be12',
Clearing records.
Thread-31::DEBUG::2014-02-04
09:54:08,828::resourceManager::615::ResourceManager::(releaseResource)
Trying to release resource 'Storage.8c4e8898-c91a-4d49-98e8-b6467791a9cc'
Thread-31::DEBUG::2014-02-04
09:54:08,829::resourceManager::634::ResourceManager::(releaseResource)
Released resource 'Storage.8c4e8898-c91a-4d49-98e8-b6467791a9cc' (0 active
users)
Thread-31::DEBUG::2014-02-04
09:54:08,829::resourceManager::640::ResourceManager::(releaseResource)
Resource 'Storage.8c4e8898-c91a-4d49-98e8-b6467791a9cc' is free, finding
out if anyone is waiting for it.
Thread-31::DEBUG::2014-02-04
09:54:08,829::resourceManager::648::ResourceManager::(releaseResource) No
one is waiting for resource 'Storage.8c4e8898-c91a-4d49-98e8-b6467791a9cc',
Clearing records.
Thread-31::ERROR::2014-02-04
09:54:08,829::dispatcher::67::Storage.Dispatcher.Protect::(run) {'status':
{'message': "Cannot acquire host id:
('471487ed-2946-4dfc-8ec3-96546006be12', SanlockException(22, 'Sanlock
lockspace add failure', 'Invalid argument'))", 'code': 661}}


*Storage domain metadata file:*

CLASS=Data
DESCRIPTION=gluster-store-rep2
IOOPTIMEOUTSEC=10
LEASERETRIES=3
LEASETIMESEC=60
LOCKPOLICY=
LOCKRENEWALINTERVALSEC=5
POOL_UUID=
REMOTE_PATH=10.0.10.3:/rep2
ROLE=Regular
SDUUID=471487ed-2946-4dfc-8ec3-96546006be12
TYPE=POSIXFS
VERSION=3
_SHA_CKSUM=469191aac3fb8ef504b6a4d301b6d8be6fffece1
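
For reference, the sanlock lockspace that lives in this domain's ids file can
be dumped directly, which may help show whether it was ever initialized. The
path below is illustrative (the domain's dom_md/ids under the vdsm mount
point):

# dump the delta-lease area sanlock keeps in the domain's ids file
sanlock direct dump /rhev/data-center/mnt/<mountpoint>/471487ed-2946-4dfc-8ec3-96546006be12/dom_md/ids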



*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain

2014-02-04 Thread Steve Dainard
I should be able to provide any logs required; I've reverted to my NFS
storage domain but can move a host over to POSIX whenever necessary.

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Tue, Feb 4, 2014 at 10:20 AM, Elad Ben Aharon wrote:

> Nir,
> Can you take a look? the user gets the same Sanlock exception as reported
> here:  https://bugzilla.redhat.com/show_bug.cgi?id=1046430
>
> - Original Message -
> From: "Steve Dainard" 
> To: "users" 
> Sent: Tuesday, February 4, 2014 5:09:43 PM
> Subject: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain
>
> I can successfully create a POSIX storage domain backed by gluster, but at
> the end of creation I get an error message "failed to acquire host id".
>
> Note that I have successfully created/activated NFS DC/SD on the same
> ovirt/hosts.
>
> I have some logs when I tried to attach to the DC after failure:
>
> engine.log
>
> 2014-02-04 09:54:04,324 INFO
> [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand]
> (ajp--127.0.0.1-8702-3) [1dd40406] Lock Acquired to object EngineLock [ex
> clusiveLocks= key: 8c4e8898-c91a-4d49-98e8-b6467791a9cc value: POOL
> , sharedLocks= ]
> 2014-02-04 09:54:04,473 INFO
> [org.ovirt.engine.core.bll.storage.AddStoragePoolWithStoragesCommand]
> (pool-6-thread-42) [1dd40406] Running command: AddStoragePoolWithStorages
> Command internal: false. Entities affected : ID:
> 8c4e8898-c91a-4d49-98e8-b6467791a9cc Type: StoragePool
> 2014-02-04 09:54:04,673 INFO
> [org.ovirt.engine.core.bll.storage.ConnectStorageToVdsCommand]
> (pool-6-thread-42) [3f86c31b] Running command: ConnectStorageToVdsCommand
> intern
> al: true. Entities affected : ID: aaa0----123456789aaa
> Type: System
> 2014-02-04 09:54:04,682 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand]
> (pool-6-thread-42) [3f86c31b] START, ConnectStorageServerVDSCommand(
> HostName = ovirt001, HostId = 48f13d47-8346-4ff6-81ca-4f4324069db3,
> storagePoolId = ----, storageType =
> POSIXFS, connectionList = [{ id: 87f9
> ff74-93c4-4fe5-9a56-ed5338290af9, connection: 10.0.10.3:/rep2, iqn: null,
> vfsType: glusterfs, mountOptions: null, nfsVersion: null, nfsRetrans: null,
> nfsTimeo: null };]), lo
> g id: 332ff091
> 2014-02-04 09:54:05,089 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.ConnectStorageServerVDSCommand]
> (pool-6-thread-42) [3f86c31b] FINISH, ConnectStorageServerVDSCommand
> , return: {87f9ff74-93c4-4fe5-9a56-ed5338290af9=0}, log id: 332ff091
> 2014-02-04 09:54:05,093 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand]
> (pool-6-thread-42) [3f86c31b] START, CreateStoragePoolVDSCommand(HostNa
> me = ovirt001, HostId = 48f13d47-8346-4ff6-81ca-4f4324069db3,
> storagePoolId=8c4e8898-c91a-4d49-98e8-b6467791a9cc, storageType=POSIXFS,
> storagePoolName=IT, masterDomainId=471
> 487ed-2946-4dfc-8ec3-96546006be12,
> domainsIdList=[471487ed-2946-4dfc-8ec3-96546006be12], masterVersion=3), log
> id: 1be84579
> 2014-02-04 09:54:08,833 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStorag
> ePoolVDSCommand] (pool-6-thread-42) [3f86c31b] Failed in
> CreateStoragePoolVDS method
> 2014-02-04 09:54:08,834 ERROR
> [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand]
> (pool-6-thread-42) [3f86c31b] Error code AcquireHostIdFailure and error
> message VDSGenericException: VDSErrorException: Failed to
> CreateStoragePoolVDS, error = Cannot acquire host id:
> ('471487ed-2946-4dfc-8ec3-96546006be12', SanlockException(22, 'Sanlock
> lockspace add failure', 'Invalid argument'))
> 2014-02-04 09:54:08,835 INFO
> [org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand]
> (pool-6-thread-42) [3f86c31b] Command
> org.ovirt.engine.core.vdsbroker.vdsbroker.CreateStoragePoolVDSCommand
> return value
> StatusOnlyReturnForXmlRpc [mStatus=StatusForXmlRpc [mCode=661,
> mMessage=Cannot acquire host id: ('471487ed-2946-4dfc-8ec3-96546006be12',
> SanlockException(22, 'Sanlock locks

Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain

2014-02-07 Thread Steve Dainard
Hi Nir,

Do you need any more info from me? I missed the 'sanlock client host_status
-D' request, info below:

[root@ovirt002 ~]# sanlock client host_status -D
lockspace a52938f7-2cf4-4771-acb2-0c78d14999e5
1 timestamp 0
last_check=176740
last_live=205
last_req=0
owner_id=1
owner_generation=5
timestamp=0
io_timeout=10
2 timestamp 176719
last_check=176740
last_live=176740
last_req=0
owner_id=2
owner_generation=7
timestamp=176719
io_timeout=10
250 timestamp 0
last_check=176740
last_live=205
last_req=0
owner_id=250
owner_generation=1
timestamp=0
io_timeout=10

[root@ovirt001 ~]# sanlock client host_status -D
[root@ovirt001 ~]#
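
The empty output on ovirt001 presumably just means that host has no lockspace
joined at the moment. If more daemon-side detail would help, these can be
grabbed as well (both are stock sanlock client commands):

# lockspaces and resources the local sanlock daemon currently holds
sanlock client status
# sanlock's internal debug buffer, often more verbose than sanlock.log
sanlock client log_dump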


*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Wed, Feb 5, 2014 at 1:39 PM, Steve Dainard wrote:

>
>
> *Steve Dainard *
> IT Infrastructure Manager
> Miovision <http://miovision.com/> | *Rethink Traffic*
> 519-513-2407 ex.250
> 877-646-8476 (toll-free)
>
> *Blog <http://miovision.com/blog>  |  **LinkedIn
> <https://www.linkedin.com/company/miovision-technologies>  |  Twitter
> <https://twitter.com/miovision>  |  Facebook
> <https://www.facebook.com/miovision>*
> --
>  Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener,
> ON, Canada | N2C 1L3
> This e-mail may contain information that is privileged or confidential. If
> you are not the intended recipient, please delete the e-mail and any
> attachments and notify us immediately.
>
>
> On Wed, Feb 5, 2014 at 10:50 AM, Steve Dainard wrote:
>
>> On Tue, Feb 4, 2014 at 6:23 PM, Nir Soffer  wrote:
>>
>>> - Original Message -
>>> > From: "Steve Dainard" 
>>> > To: "Nir Soffer" 
>>> > Cc: "Elad Ben Aharon" , "users" ,
>>> "Aharon Canan" 
>>> > Sent: Tuesday, February 4, 2014 10:50:02 PM
>>> > Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage
>>> domain
>>> >
>>> > Happens every time I try to add a POSIX SD type glusterfs.
>>> >
>>> > Logs attached.
>>>
>>> Hi Steve,
>>>
>>> I'm afraid we need more history in the logs. I want to see the logs from
>>> the time the machine started,
>>> until the time of the first error.
>>>
>>
>> No problem. I've rebooted both hosts (manager is installed on host
>> ovirt001). And I've attached all the logs. Note that I put the wrong DNS
>> name 'gluster-rr:/rep2' the first time I created the domain, hence the
>> errors. The POSIX domain created is against 'gluster-store-vip:/rep2'
>>
>> Note only host ovirt002 is in the POSIX SD cluster.
>>
>
> Sorry, this is wrong - it should be: ovirt001 is the only host in the POSIX
> SD cluster.
>
>
>>
>> I've also included the glusterfs log for rep2, with these errors:
>>
>> [2014-02-05 15:36:28.246203] W
>> [client-rpc-fops.c:873:client3_3_writev_cbk] 0-rep2-client-0: remote
>> operation failed: Invalid argument
>> [2014-02-05 15:36:28.246418] W
>> [client-rpc-fops.c:873:client3_3_writev_cbk] 0-rep2-client-1: remote
>> operation failed: Invalid argument
>> [2014-02-05 15:36:28.246450] W [fuse-bridge.c:2167:fuse_writev_cbk]
>> 0-glusterfs-fuse: 163: WRITE => -1 (Invalid argument)
>>
>>
>>>
>>> We suspect that acquireHostId fails because someone else has aquired the
>>> same id. We may see evidence
>>> in the logs.
>>>
>>> Also can you send the output of this command on the host that fails, and
>>> if you have other hosts using the
>>> same storage, also on some of these hosts.
>>>
>>> sanlock client host_status -D
>>>
>>> And finally, can you attach also /var/log/sanlock.log?
>>>
>>> Thanks,
>>> Nir
>>>
>>>
>> Thanks,
>> Steve
>>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain

2014-02-08 Thread Steve Dainard
Hi Nir,

[root@ovirt001 storage]# mount -t glusterfs 10.0.10.2:/rep2 rep2-mount/
[root@ovirt001 storage]# ls -lh rep2-mount/
total 0
-rwxr-xr-x. 1 vdsm kvm  0 Feb  5 10:36 __DIRECT_IO_TEST__
drwxr-xr-x. 4 vdsm kvm 32 Feb  5 10:36 ff0e0521-a8fa-4c10-8372-7b67ac3fca31
[root@ovirt001 storage]# ls -lh
total 0
drwxr-xr-x. 4 vdsm kvm  91 Jan 30 17:34 iso-mount
drwxr-xr-x. 3 root root 23 Jan 30 17:31 lv-iso-domain
drwxr-xr-x. 3 vdsm kvm  35 Jan 29 17:43 lv-storage-domain
drwxr-xr-x. 3 vdsm kvm  17 Feb  4 15:43 lv-vm-domain
drwxr-xr-x. 4 vdsm kvm  91 Feb  5 10:36 rep2-mount


*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Sat, Feb 8, 2014 at 5:28 PM, Nir Soffer  wrote:

> - Original Message -
> > From: "Steve Dainard" 
> > To: "Nir Soffer" , "users" 
> > Sent: Friday, February 7, 2014 6:27:04 PM
> > Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage
> domain
> >
> > Hi Nir,
> >
> > Do you need any more info from me? I missed the 'sanlock client
> host_status
> > -D' request, info below:
>
> Hi Steve,
>
> It looks like your glusterfs mount is not writable by vdsm or sanlock.
> This can
> happen when permissions or owner of the directory is not correct.
>
> Can you try to mount the glusterfs volume manually, and share the output
> of ls -lh?
>
> sudo mkdir /tmp/gluster
> sudo mount -t glusterfs 10.0.10.2:/rep2 /tmp/gluster
> ls -lh /tmp/gluster/
> sudo umount /tmp/gluster
>
> Thanks,
> Nir
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain

2014-02-09 Thread Steve Dainard
vdsm can write to it:

[root@ovirt001 rep2-mount]# sudo -u vdsm dd if=/dev/zero of=__test__ bs=1M
count=1 oflag=direct
1+0 records in
1+0 records out
1048576 bytes (1.0 MB) copied, 0.12691 s, 8.3 MB/s

[root@ovirt001 rep2-mount]# pwd
/mnt/storage/rep2-mount

[root@ovirt001 rep2-mount]# mount
/dev/md125p2 on / type ext4 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
tmpfs on /dev/shm type tmpfs (rw,rootcontext="system_u:object_r:tmpfs_t:s0")
/dev/md125p1 on /boot type ext4 (rw)
/dev/mapper/gluster-storage--domain on /mnt/storage/lv-storage-domain type
xfs (rw)
/dev/mapper/gluster-iso--domain on /mnt/storage/lv-iso-domain type xfs (rw)
/dev/mapper/gluster-vm--domain on /mnt/storage/lv-vm-domain type xfs (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
10.0.10.2:/iso-store on /mnt/storage/iso-mount type fuse.glusterfs
(rw,default_permissions,allow_other,max_read=131072)
10.0.10.2:/rep2 on /mnt/storage/rep2-mount type fuse.glusterfs
(rw,default_permissions,allow_other,max_read=131072)

[root@ovirt001 rep2-mount]# gluster volume info rep2
Volume Name: rep2
Type: Replicate
Volume ID: b89a21bb-5ad1-493f-b197-8f990ab3ba77
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.10.2:/mnt/storage/lv-vm-domain/rep2
Brick2: 10.0.10.3:/mnt/storage/lv-vm-domain/rep2
Options Reconfigured:
storage.owner-gid: 36
storage.owner-uid: 36
server.allow-insecure: on
cluster.quorum-type: auto
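
Since the vdsm write works, one more check that might matter is whether the
sanlock daemon's own user can read the lease file. A sketch, assuming the
standard dom_md layout under this mount and that the sanlock user has the kvm
group membership it needs:

id sanlock
# try a direct read of the domain's ids file as the sanlock user
sudo -u sanlock dd if=/mnt/storage/rep2-mount/ff0e0521-a8fa-4c10-8372-7b67ac3fca31/dom_md/ids \
    of=/dev/null bs=512 count=1 iflag=direct
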

Thanks,

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Sun, Feb 9, 2014 at 1:30 AM, Nir Soffer  wrote:

> ----- Original Message -
> > From: "Steve Dainard" 
> > To: "Nir Soffer" 
> > Cc: "users" 
> > Sent: Sunday, February 9, 2014 3:51:03 AM
> > Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage
> domain
> >
> > Hi Nir,
> >
> > [root@ovirt001 storage]# mount -t glusterfs 10.0.10.2:/rep2 rep2-mount/
> > [root@ovirt001 storage]# ls -lh rep2-mount/
> > total 0
> > -rwxr-xr-x. 1 vdsm kvm  0 Feb  5 10:36 __DIRECT_IO_TEST__
> > drwxr-xr-x. 4 vdsm kvm 32 Feb  5 10:36
> ff0e0521-a8fa-4c10-8372-7b67ac3fca31
> > [root@ovirt001 storage]# ls -lh
> > total 0
> > drwxr-xr-x. 4 vdsm kvm  91 Jan 30 17:34 iso-mount
> > drwxr-xr-x. 3 root root 23 Jan 30 17:31 lv-iso-domain
> > drwxr-xr-x. 3 vdsm kvm  35 Jan 29 17:43 lv-storage-domain
> > drwxr-xr-x. 3 vdsm kvm  17 Feb  4 15:43 lv-vm-domain
> > drwxr-xr-x. 4 vdsm kvm  91 Feb  5 10:36 rep2-mount
>
> Looks good.
>
> Can you write into rep2-mount?
>
> Please try:
>
> sudo -u vdsm dd if=/dev/zero of=rep2-mount/__test__ bs=1M count=1
> oflag=direct
> rm rep2-mount/__test__
>
> Thanks,
> Nir
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Live migration of VM's occasionally fails

2014-02-17 Thread Steve Dainard
Hi Dafna,

No snapshots of either of those VM's have been taken, and there are no
updates for any of those packages on EL 6.5.

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Sun, Feb 16, 2014 at 7:05 AM, Dafna Ron  wrote:

> does the vm that fails migration have a live snapshot?
> if so how many snapshots does the vm have.
> I think that there are newer packages of vdsm, libvirt and qemu - can you
> try to update
>
>
>
> On 02/16/2014 12:33 AM, Steve Dainard wrote:
>
>> Versions are the same:
>>
>> [root@ovirt001 ~]# rpm -qa | egrep 'libvirt|vdsm|qemu' | sort
>> gpxe-roms-qemu-0.9.7-6.10.el6.noarch
>> libvirt-0.10.2-29.el6_5.3.x86_64
>> libvirt-client-0.10.2-29.el6_5.3.x86_64
>> libvirt-lock-sanlock-0.10.2-29.el6_5.3.x86_64
>> libvirt-python-0.10.2-29.el6_5.3.x86_64
>> qemu-img-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-tools-0.12.1.2-2.355.el6.5.x86_64
>> vdsm-4.13.3-3.el6.x86_64
>> vdsm-cli-4.13.3-3.el6.noarch
>> vdsm-gluster-4.13.3-3.el6.noarch
>> vdsm-python-4.13.3-3.el6.x86_64
>> vdsm-xmlrpc-4.13.3-3.el6.noarch
>>
>> [root@ovirt002 ~]# rpm -qa | egrep 'libvirt|vdsm|qemu' | sort
>> gpxe-roms-qemu-0.9.7-6.10.el6.noarch
>> libvirt-0.10.2-29.el6_5.3.x86_64
>> libvirt-client-0.10.2-29.el6_5.3.x86_64
>> libvirt-lock-sanlock-0.10.2-29.el6_5.3.x86_64
>> libvirt-python-0.10.2-29.el6_5.3.x86_64
>> qemu-img-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-tools-0.12.1.2-2.355.el6.5.x86_64
>> vdsm-4.13.3-3.el6.x86_64
>> vdsm-cli-4.13.3-3.el6.noarch
>> vdsm-gluster-4.13.3-3.el6.noarch
>> vdsm-python-4.13.3-3.el6.x86_64
>> vdsm-xmlrpc-4.13.3-3.el6.noarch
>>
>> Logs attached, thanks.
>>
>> *Steve Dainard *
>> IT Infrastructure Manager
>> Miovision <http://miovision.com/> | /Rethink Traffic/
>>
>> *Blog <http://miovision.com/blog> | **LinkedIn <https://www.linkedin.com/
>> company/miovision-technologies>  | Twitter <https://twitter.com/miovision>
>>  | Facebook <https://www.facebook.com/miovision>*
>> 
>> Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener,
>> ON, Canada | N2C 1L3
>> This e-mail may contain information that is privileged or confidential.
>> If you are not the intended recipient, please delete the e-mail and any
>> attachments and notify us immediately.
>>
>>
>> On Sat, Feb 15, 2014 at 6:24 AM, Dafna Ron > d...@redhat.com>> wrote:
>>
>> the migration fails in libvirt:
>>
>>
>> Thread-153709::ERROR::2014-02-14
>> 11:17:40,420::vm::337::vm.Vm::(run)
>> vmId=`08434c90-ffa3-4b63-aa8e-5613f7b0e0cd`::Failed to migrate
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/vm.py", line 323, in run
>> self._startUnderlyingMigration()
>>   File "/usr/share/vdsm/vm.py", line 403, in _startUnderlyingMigration
>> None, maxBandwidth)
>>   File "/usr/share/vdsm/vm.py", line 841, in f
>> ret = attr(*args, **kwargs)
>>   File
>> "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py",
>> line 76, in wrapper
>> ret = f(*args, **kwargs)
>>   File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1178,
>> in migrateToURI2
>> if ret == -1: raise libvirtError ('virDomainMigrateToURI2()
>> failed', dom=self)
>> libvirtError: Unable to read from monitor: Connection reset by peer
>> Thread-54041::DEBUG::2014-02-14
>> 11:17:41,752::task::579::TaskManager.Task::(_updateState)
>> Task=`094c412a-43dc-4c29-a601-d759486469a8`::moving from state
>> init -> state preparing
>> Thread-54041::INFO::2014-02-14
>> 11:17:41,753::logUtils::44::dispatcher::(wrapper) Run and protect:
>> getVolume

Re: [Users] Live migration of VM's occasionally fails

2014-02-17 Thread Steve Dainard
Failed live migration is more widespread than these two VMs, but they are a
good example because they were both built from the same template and have
had no modifications since they were created. They were also migrated one after
the other, with one successfully migrating and the other not.

Are there any increased logging levels that might help determine what the
issue is?
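
One knob that comes to mind is libvirt's own log level, set in
/etc/libvirt/libvirtd.conf on both hosts and picked up after a libvirtd
restart. The values below are only an example, and vdsm may already manage
parts of this file:

# /etc/libvirt/libvirtd.conf - example debug settings
log_level = 1
log_outputs = "1:file:/var/log/libvirt/libvirtd.log"
log_filters = "3:event 3:json 3:object"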

Thanks,

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Mon, Feb 17, 2014 at 11:47 AM, Dafna Ron  wrote:

> did you install these vm's from a cd? run it as run-once with a special
> monitor?
> try to think if there is anything different in the configuration of these
> vm's from the other vm's that succeed to migrate?
>
>
> On 02/17/2014 04:36 PM, Steve Dainard wrote:
>
>> Hi Dafna,
>>
>> No snapshots of either of those VM's have been taken, and there are no
>> updates for any of those packages on EL 6.5.
>>
>> *Steve Dainard *
>> IT Infrastructure Manager
>> Miovision <http://miovision.com/> | /Rethink Traffic/
>>
>> *Blog <http://miovision.com/blog> | **LinkedIn <https://www.linkedin.com/
>> company/miovision-technologies>  | Twitter <https://twitter.com/miovision>
>>  | Facebook <https://www.facebook.com/miovision>*
>> 
>> Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener,
>> ON, Canada | N2C 1L3
>> This e-mail may contain information that is privileged or confidential.
>> If you are not the intended recipient, please delete the e-mail and any
>> attachments and notify us immediately.
>>
>>
>> On Sun, Feb 16, 2014 at 7:05 AM, Dafna Ron > d...@redhat.com>> wrote:
>>
>>     does the vm that fails migration have a live snapshot?
>> if so how many snapshots does the vm have.
>> I think that there are newer packages of vdsm, libvirt and qemu -
>> can you try to update
>>
>>
>>
>> On 02/16/2014 12:33 AM, Steve Dainard wrote:
>>
>> Versions are the same:
>>
>> [root@ovirt001 ~]# rpm -qa | egrep 'libvirt|vdsm|qemu' | sort
>> gpxe-roms-qemu-0.9.7-6.10.el6.noarch
>> libvirt-0.10.2-29.el6_5.3.x86_64
>> libvirt-client-0.10.2-29.el6_5.3.x86_64
>> libvirt-lock-sanlock-0.10.2-29.el6_5.3.x86_64
>> libvirt-python-0.10.2-29.el6_5.3.x86_64
>> qemu-img-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-tools-0.12.1.2-2.355.el6.5.x86_64
>> vdsm-4.13.3-3.el6.x86_64
>> vdsm-cli-4.13.3-3.el6.noarch
>> vdsm-gluster-4.13.3-3.el6.noarch
>> vdsm-python-4.13.3-3.el6.x86_64
>> vdsm-xmlrpc-4.13.3-3.el6.noarch
>>
>> [root@ovirt002 ~]# rpm -qa | egrep 'libvirt|vdsm|qemu' | sort
>> gpxe-roms-qemu-0.9.7-6.10.el6.noarch
>> libvirt-0.10.2-29.el6_5.3.x86_64
>> libvirt-client-0.10.2-29.el6_5.3.x86_64
>> libvirt-lock-sanlock-0.10.2-29.el6_5.3.x86_64
>> libvirt-python-0.10.2-29.el6_5.3.x86_64
>> qemu-img-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-0.12.1.2-2.355.el6.5.x86_64
>> qemu-kvm-rhev-tools-0.12.1.2-2.355.el6.5.x86_64
>> vdsm-4.13.3-3.el6.x86_64
>> vdsm-cli-4.13.3-3.el6.noarch
>> vdsm-gluster-4.13.3-3.el6.noarch
>> vdsm-python-4.13.3-3.el6.x86_64
>> vdsm-xmlrpc-4.13.3-3.el6.noarch
>>
>> Logs attached, thanks.
>>
>> *Steve Dainard *
>> IT Infrastructure Manager
>> Miovision <http://miovision.com/> | /Rethink Traffic/
>>
>> *Blog <http://miovision.com/blog> | **LinkedIn
>> <https://www.linkedin.com/company/miovision-technologies>  |
>> Twitter <https://twitter.com/miovision>  | Facebook
>> <https://www.facebook.com/miovision>*
>> --

Re: [Users] Live migration of VM's occasionally fails

2014-02-17 Thread Steve Dainard
VM's are identical, same template, same cpu/mem/nic. Server type, thin
provisioned on NFS (backend is glusterfs 3.4).

Does monitor = spice console? I don't believe either of them had a spice
connection.

I don't see anything in the ovirt001 sanlock.log:

2014-02-14 11:16:05-0500 255246 [5111]: cmd_inq_lockspace 4,14
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:16:05-0500 255246 [5111]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:15-0500 255256 [5110]: cmd_inq_lockspace 4,14
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:16:15-0500 255256 [5110]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:25-0500 255266 [5111]: cmd_inq_lockspace 4,14
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:16:25-0500 255266 [5111]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:36-0500 255276 [5110]: cmd_inq_lockspace 4,14
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:16:36-0500 255276 [5110]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:46-0500 255286 [5111]: cmd_inq_lockspace 4,14
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:16:46-0500 255286 [5111]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:16:56-0500 255296 [5110]: cmd_inq_lockspace 4,14
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:16:56-0500 255296 [5110]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:17:06-0500 255306 [5111]: cmd_inq_lockspace 4,14
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:17:06-0500 255306 [5111]: cmd_inq_lockspace 4,14 done 0
2014-02-14 11:17:06-0500 255307 [5105]: cmd_register ci 4 fd 14 pid 31132
2014-02-14 11:17:06-0500 255307 [5105]: cmd_restrict ci 4 fd 14 pid 31132
flags 1
2014-02-14 11:17:16-0500 255316 [5110]: cmd_inq_lockspace 5,15
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:17:16-0500 255316 [5110]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:17:26-0500 255326 [5111]: cmd_inq_lockspace 5,15
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:17:26-0500 255326 [5111]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:17:26-0500 255326 [5110]: cmd_acquire 4,14,31132 ci_in 5 fd
15 count 0
2014-02-14 11:17:26-0500 255326 [5110]: cmd_acquire 4,14,31132 result 0
pid_dead 0
2014-02-14 11:17:26-0500 255326 [5111]: cmd_acquire 4,14,31132 ci_in 6 fd
16 count 0
2014-02-14 11:17:26-0500 255326 [5111]: cmd_acquire 4,14,31132 result 0
pid_dead 0
2014-02-14 11:17:36-0500 255336 [5110]: cmd_inq_lockspace 5,15
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:17:36-0500 255336 [5110]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:17:39-0500 255340 [5105]: cmd_register ci 5 fd 15 pid 31319
2014-02-14 11:17:39-0500 255340 [5105]: cmd_restrict ci 5 fd 15 pid 31319
flags 1
2014-02-14 11:17:39-0500 255340 [5105]: client_pid_dead 5,15,31319
cmd_active 0 suspend 0
2014-02-14 11:17:46-0500 255346 [5111]: cmd_inq_lockspace 5,15
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:17:46-0500 255346 [5111]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:17:56-0500 255356 [5110]: cmd_inq_lockspace 5,15
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0
2014-02-14 11:17:56-0500 255356 [5110]: cmd_inq_lockspace 5,15 done 0
2014-02-14 11:18:06-0500 255366 [5111]: cmd_inq_lockspace 5,15
a52938f7-2cf4-4771-acb2-0c78d14999e5:1:/rhev/data-center/mnt/gluster-store-vip:_rep1/a52938f7-2cf4-4771-acb2-0c78d14999e5/dom_md/ids:0
flags 0

ovirt002 sanlock.log has no entries during that time frame.

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Driv

Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage domain

2014-02-19 Thread Steve Dainard
Hi Nir,

I have a thread open on the gluster side about heal-failed operations, so
I'll wait for a response on that side.

Agreed on two-node quorum; I'm waiting for a 3rd node right now :) But in
the meantime, or for anyone who reads this thread, if you only have 2
storage nodes you have to weigh the risks of 2 nodes in quorum ensuring
storage consistency against 2 nodes with no quorum and an extra shot at uptime.

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Wed, Feb 19, 2014 at 4:13 AM, Nir Soffer  wrote:

> ----- Original Message -
> > From: "Steve Dainard" 
> > To: "Nir Soffer" 
> > Cc: "users" 
> > Sent: Tuesday, February 11, 2014 7:42:37 PM
> > Subject: Re: [Users] Ovirt 3.3.2 Cannot attach POSIX (gluster) storage
> domain
> >
> > Enabled logging, logs attached.
>
> According to sanlock and gluster log:
>
> 1. on the host, sanlock is failing write to the ids volume
> 2. on the gluster side we see failure to heal the ids file.
>
> This looks like glusterfs issue, and should be handled by glusterfs folks.
>
> You probably should configure sanlock log level back to the default by
> commenting
> out the configuration I suggested in the previous mail.
>
> According to gluster configuration in this log, this looks like 2 replicas
> with auto quorum.
> This setup is not recommended because both machines must be up all the
> time.
> When one machine is down, your entire storage is down.
>
> Check this post explaining this issue:
> http://lists.ovirt.org/pipermail/users/2014-February/021541.html
>
> Thanks,
> Nir
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Why choose glusterfs over iscsi for oVirt storage?

2014-02-20 Thread Steve Dainard
On Mon, Feb 17, 2014 at 3:50 AM, Justin Clacherty wrote:

> Hi,
>
>
>
> I'm just setting up some storage for use with oVirt and am wondering why I
> might choose glusterfs rather than just exporting a raid array as iscsi.
> The oVirt team seem to be pushing gluster (though that could just be
> because it's a Red Hat product).  Can anyone answer this one?
>

If you're contemplating software iscsi then you probably aren't working on
a production setup that requires high availability or data consistency. If
you are, you should take a deep dive on what your single points of failure
are. If you're actually talking about a SAN LUN, then I would imagine you
aren't the storage admin, and can leave the particulars in their hands.

It's funny, in RH's current supported version of RHEV, native gluster
storage isn't even an option. This is a pretty new project, and if you
follow the bread crumbs on distributed file systems like glusterfs or ceph
you'll start to see the future advantages. You can also figure that seeing
as both projects are spearheaded by RH devs that there is likely a lot of
cross-talk and feature requests making both projects better, and who
wouldn't promote a better solution?


>
>
> What I have come up with is as follows.
>
>
>
> For:
>

There is too much to cover in any of these advantages; you're going to need
to do a lot of research on each of these 'features' if you want to use them
successfully.

> -  easy to expand
>
> -  redundancy across storage devices/easy replication
>
> -  high availablility
>
> -  performance
>
> -  it's kind of cool J
>
> -  maintenance?
>
>
>
> Against (these are guesses):
>
> -  performance? (multiple layers of filesystems everywhere - fs
> in vm + image on gluster + gluster + block filesystem)
>
It's worse than just a ton of software layers. NAS protocols seem to be
focused on data consistency, which means they don't write async to storage.
iscsi is typically async and has much better throughput, but also a greater
chance for data loss or corruption. Technically you can achieve the same
level of performance as iscsi using NFS (backed by glusterfs if you like)
but you need to set options on the volume to allow async writes.
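
For gluster's built-in NFS server that is a single volume option (example
only; the volume name is a placeholder, and it trades safety on power loss
for speed):

gluster volume set myvol nfs.trusted-sync on
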


> -  complexity
>
If you're doing storage properly, the underpinnings are always complex
unless you're paying someone else to do it (read: SAN / managed service
provider). Research multipath and HA on software iscsi and you'll see what
I mean.


> -  maintenance?
>
>
>
> Any help here is appreciated.  Also, does the underlying block level
> filesystem matter here?  VMs running under ovirt would be typical business
> applications - file serving (samba4), email, databases, web servers, etc.
>

There is a lot to answer here and I don't have all the answers. Take a look
at the gluster docs for underlying file system requirements. Any block
device should work. Specifically I'll mention that the glusterfs team
doesn't suggest hosting DBs on glusterfs - many small reads/writes are not
one of gluster's strong points.


>
>
> Cheers,
>
> Justin.
>
>
>
>
>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[Users] Opinions needed: 3 node gluster replica 3 | NFS async | snapshots for consistency

2014-02-20 Thread Steve Dainard
I'm looking for some opinions on this configuration in an effort to
increase write performance:

3 storage nodes using glusterfs in replica 3, quorum.
Ovirt storage domain via NFS
Volume set nfs.trusted-sync on
On Ovirt, taking snapshots often enough to recover from a storage crash
Using CTDB to manage NFS storage domain IP, moving it to another storage
node when necessary
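
Roughly, the volume side of that would look like the sketch below - names are
placeholders, the owner uid/gid options are the usual vdsm:kvm (36:36), and
the CTDB setup for the floating NFS IP is separate:

gluster volume create vmstore replica 3 \
    node1:/bricks/vmstore node2:/bricks/vmstore node3:/bricks/vmstore
gluster volume set vmstore cluster.quorum-type auto
gluster volume set vmstore storage.owner-uid 36
gluster volume set vmstore storage.owner-gid 36
gluster volume set vmstore nfs.trusted-sync on
gluster volume start vmstore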

Something along the lines of EC2's data consistency model, where only
snapshots can be considered reliable. The added advantage with oVirt would be
memory consistency at the time of snapshot as well.

Feedback appreciated, including 'you are insane for thinking this is a good
idea' (and some supported reasoning would be great).

Thanks,



*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*

*Blog <http://miovision.com/blog>  |  **LinkedIn
<https://www.linkedin.com/company/miovision-technologies>  |  Twitter
<https://twitter.com/miovision>  |  Facebook
<https://www.facebook.com/miovision>*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Opinions needed: 3 node gluster replica 3 | NFS async | snapshots for consistency

2014-02-23 Thread Steve Dainard
On Sun, Feb 23, 2014 at 4:27 AM, Ayal Baron  wrote:

>
>
> - Original Message -
> > I'm looking for some opinions on this configuration in an effort to
> increase
> > write performance:
> >
> > 3 storage nodes using glusterfs in replica 3, quorum.
>
> gluster doesn't support replica 3 yet, so I'm not sure how heavily I'd
> rely on this.
>

Glusterfs or RHSS doesn't support rep 3? How could I create a quorum
without 3+ hosts?


>
> > Ovirt storage domain via NFS
>
> why NFS and not gluster?
>

Gluster via posix SD doesn't have any performance gains over NFS, maybe the
opposite.

Gluster 'native' SD's are broken on EL6.5 so I have been unable to test
performance. I have heard performance can be upwards of 3x NFS for raw
write.

Gluster doesn't have an async write option, so it's doubtful it will ever be
close to NFS async speeds.


>
> > Volume set nfs.trusted-sync on
> > On Ovirt, taking snapshots often enough to recover from a storage crash
>
> Note that this would have negative write performance impact
>

The difference between NFS sync (<50MB/s) and async (>300MB/s on 10g) write
speeds should more than compensate for the performance hit of taking
snapshots more often. And that's just raw speed. If we take into
consideration IOPS (guest small writes) async is leaps and bounds ahead.
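
A quick way to reproduce that kind of raw comparison against the mounted
domain (paths illustrative; conv=fdatasync forces a flush at the end so the
async ack shows up clearly, oflag=direct bypasses the client page cache):

dd if=/dev/zero of=/mnt/nfs-test/ddtest bs=1M count=1024 conv=fdatasync
dd if=/dev/zero of=/mnt/nfs-test/ddtest bs=1M count=1024 oflag=direct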


If we assume the site has backup UPS and generator power and we can build a
highly available storage system with 3 nodes in quorum, are there any
potential issues other than a write performance hit?

The issue I thought might be most prevalent is that if an oVirt host goes down
and the VMs are automatically brought back up on another host, they could
incur disk corruption and need to be brought back down and restored to the
last snapshot state. This basically means the HA feature should be disabled.

Even worse, if the gluster node with CTDB NFS IP goes down, it may not have
written out and replicated to its peers.  <-- I think I may have just
answered my own question.


Thanks,
Steve
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Opinions needed: 3 node gluster replica 3 | NFS async | snapshots for consistency

2014-02-23 Thread Steve Dainard
On Sun, Feb 23, 2014 at 3:20 PM, Ayal Baron  wrote:

>
>
> - Original Message -
> > On Sun, Feb 23, 2014 at 4:27 AM, Ayal Baron  wrote:
> >
> > >
> > >
> > > - Original Message -
> > > > I'm looking for some opinions on this configuration in an effort to
> > > increase
> > > > write performance:
> > > >
> > > > 3 storage nodes using glusterfs in replica 3, quorum.
> > >
> > > gluster doesn't support replica 3 yet, so I'm not sure how heavily I'd
> > > rely on this.
> > >
> >
> > Glusterfs or RHSS doesn't support rep 3? How could I create a quorum
> > without 3+ hosts?
>
> glusterfs has the capability but it hasn't been widely tested with oVirt
> yet and we've already found a couple of issues there.
> afaiu gluster has the ability to define a tie breaker (a third node which
> is part of the quorum but does not provide a third replica of the data).
>

Good to know, I'll dig into this.


>
> >
> >
> > >
> > > > Ovirt storage domain via NFS
> > >
> > > why NFS and not gluster?
> > >
> >
> > Gluster via posix SD doesn't have any performance gains over NFS, maybe
> the
> > opposite.
>
> gluster via posix is mounting it using the gluster fuse client which
> should provide better performance + availability than NFS.
>

Availability for sure, but performance is seriously questionable. I've run
both scenarios and haven't seen a performance improvement; the general
consensus seems to be that fuse adds overhead and therefore decreases
performance vs. NFS.


>
> >
> > Gluster 'native' SD's are broken on EL6.5 so I have been unable to test
> > performance. I have heard performance can be upwards of 3x NFS for raw
> > write.
>
> Broken how?
>

Ongoing issues: libgfapi support wasn't available, and then it was disabled
because snapshot support wasn't built into the kvm packages, which was a
dependency. There are a few threads in reference to this, and some effort
to get CentOS builds to enable snapshot support in kvm.

I have installed rebuilt qemu packages with the RHEV snapshot flag enabled,
and was just able to create a native gluster SD; maybe I missed something
during a previous attempt. I'll test performance and see if it's close to
what I'm looking for.


>
> >
> > Gluster doesn't have an async write option, so it's doubtful it will ever
> > be close to NFS async speeds.
> >
> >
> > >
> > > > Volume set nfs.trusted-sync on
> > > > On Ovirt, taking snapshots often enough to recover from a storage
> crash
> > >
> > > Note that this would have negative write performance impact
> > >
> >
> > The difference between NFS sync (<50MB/s) and async (>300MB/s on 10g)
> write
> > speeds should more than compensate for the performance hit of taking
> > snapshots more often. And that's just raw speed. If we take into
> > consideration IOPS (guest small writes) async is leaps and bounds ahead.
>
> I would test this, since qemu is already doing async I/O (using threads
> when native AIO is not supported) and oVirt runs it with cache=none (direct
> I/O) so sync ops should not happen that often (depends on guest).  You may
> be still enjoying performance boost, but I've seen UPS systems fail before
> bringing down multiple nodes at once.
> In addition, if you do not guarantee your data is safe when you create a
> snapshot (and it doesn't seem like you are) then I see no reason to think
> your snapshots are any better off than latest state on disk.
>

My logic here was that if a snapshot is taken, the disk and system state
should be consistent as of the time of the snapshot, once it has been written
to storage. If the host failed during the snapshot then the snapshot would be
incomplete, and the last complete snapshot would need to be used for recovery.


>
> >
> >
> > If we assume the site has backup UPS and generator power and we can
> build a
> > highly available storage system with 3 nodes in quorum, are there any
> > potential issues other than a write performance hit?
> >
> > The issue I thought might be most prevalent is if an ovirt host goes down
> > and the VM's are automatically brought back up on another host, they
> could
> > incur disk corruption and need to be brought back down and restored to
> the
> > last snapshot state. This basically means the HA feature should be
> disabled.
>
> I'm not sure I understand what your concern is here, what would cause the
> data corruption? if your node crashed then there is no I/O in flight.  So
> starting up the VM should be perfectly safe.
>

Good point, that makes sense.


>
> >
> > Even worse, if the gluster node with CTDB NFS IP goes down, it may not
> have
> > written out and replicated to its peers.  <-- I think I may have just
> > answered my own question.
>
> If 'trusted-sync' means that the CTDB NFS node acks the I/O before it
> reached quorum then I'd say that's a gluster bug.


http://gluster.org/community/documentation/index.php/Gluster_3.2:_Setting_Volume_Options#nfs.trusted-sync
specifically mentions data won't be guaranteed to be on disk, but doesn't
mention if data would 

[ovirt-users] New VM from template, vm disk creation time very high

2014-04-17 Thread Steve Dainard
When I create a new vm from a template (centos 6, 20gig thin disk) it takes
close to 40 minutes to clone the disk image.

Are there any settings that control how much bandwidth the cloning process
can use, or how it's prioritized? This is brutally slow, and although a
quick clone isn't needed often, occasionally when someone is asking for a
resource it would be nice to expedite the process.

Storage: Gluster replica 2 volume
Network: 10gig
Ovirt: 3.3.4, Centos 6.5
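
While the clone runs, the copy on the SPM host appears to be a single qemu-img
convert process; I've been keeping an eye on it with something along these
lines (the pid being whatever that process happens to be):

  ps aux | grep qemu-img
  iotop -o -p <pid>

so I assume any throttling or prioritization would have to apply to that
process.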

Thanks,


*Steve*
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] New VM from template, vm disk creation time very high

2014-04-21 Thread Steve Dainard
This may have been lost over the weekend...


*Steve*


On Thu, Apr 17, 2014 at 1:21 PM, Steve  wrote:

> When I create a new vm from a template (centos 6, 20gig thin disk) it
> takes close to 40 minutes to clone the disk image.
>
> Are there any settings that control how much bandwidth the cloning process
> can use, or how its prioritized? This is brutally slow, and although a
> quick clone isn't needed often, occasionally when someone is asking for a
> resource it would be nice to expedite the process.
>
> Storage: Gluster replica 2 volume
> Network: 10gig
> Ovirt: 3.3.4, Centos 6.5
>
> Thanks,
>
>
> *Steve *
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] New VM from template, vm disk creation time very high

2014-04-21 Thread Steve Dainard
Hi

Yes, I'm using glusterfs as the storage domain, and glusterd is running on the
ovirt nodes.



*Steve *


On Mon, Apr 21, 2014 at 11:46 AM, Ovirt User  wrote:

> hi steve,
>
> are you using gluster fs as the storage domain? If yes, is it inside the
> compute nodes, or did you dedicate 2 or more nodes to GLUSTER?
>
> Il giorno 21/apr/2014, alle ore 16:00, Steve Dainard <
> sdain...@miovision.com> ha scritto:
>
> This may have been lost over the weekend...
>
>
> *Steve*
>
>
> On Thu, Apr 17, 2014 at 1:21 PM, Steve  wrote:
>
>> When I create a new vm from a template (centos 6, 20gig thin disk) it
>> takes close to 40 minutes to clone the disk image.
>>
>> Are there any settings that control how much bandwidth the cloning
>> process can use, or how its prioritized? This is brutally slow, and
>> although a quick clone isn't needed often, occasionally when someone is
>> asking for a resource it would be nice to expedite the process.
>>
>> Storage: Gluster replica 2 volume
>> Network: 10gig
>> Ovirt: 3.3.4, Centos 6.5
>>
>> Thanks,
>>
>>
>> *Steve *
>>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Ovirt snapshot failing on one VM

2014-04-22 Thread Steve Dainard
No alert in the web UI. I restarted the VM yesterday just in case; no change. I
also restored an earlier snapshot and tried to re-snapshot, same result.


*Steve *


On Tue, Apr 22, 2014 at 10:57 AM, Dafna Ron  wrote:

> This is the actual problem:
>
> bf025a73-eeeb-4ac5-b8a9-32afa4ae482e::DEBUG::2014-04-22
> 10:21:49,374::volume::1058::Storage.Misc.excCmd::(createVolume) FAILED:
>  = '/rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/
> 95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/4
> 66d9ae9-e46a-46f8-9f4b-964d8af0675b/87efa937-b31f-4bb1-aee1-0ee14a0dc6fb:
> error while creating qcow2: No such file or directory\n';  = 1
>
> from that you see the actual failure:
>
> bf025a73-eeeb-4ac5-b8a9-32afa4ae482e::ERROR::2014-04-22
> 10:21:49,392::volume::286::Storage.Volume::(clone) Volume.clone: can't
> clone: /rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/
> 95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d
> 9ae9-e46a-46f8-9f4b-964d8af0675b/1a67de4b-aa1c-4436-baca-ca55726d54d7 to
> /rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/
> 95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/466d9ae9-
> e46a-46f8-9f4b-964d8af0675b/87efa937-b31f-4bb1-aee1-0ee1
> 4a0dc6fb
> bf025a73-eeeb-4ac5-b8a9-32afa4ae482e::ERROR::2014-04-22
> 10:21:49,392::volume::508::Storage.Volume::(create) Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/volume.py", line 466, in create
> srcVolUUID, imgPath, volPath)
>   File "/usr/share/vdsm/storage/fileVolume.py", line 160, in _create
> volParent.clone(imgPath, volUUID, volFormat, preallocate)
>   File "/usr/share/vdsm/storage/volume.py", line 287, in clone
> raise se.CannotCloneVolume(self.volumePath, dst_path, str(e))
> CannotCloneVolume: Cannot clone volume: 'src=/rhev/data-center/
> 9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-
> 4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-
> 964d8af0675b/1a67de4b-aa1c-4436-baca-ca55726d54d7, dst=/rhev/data-cen
> ter/9497ef2c-8368-4c92-8d61-7f318a90748f/95b9d922-4df7-
> 4d3b-9bca-467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-
> 964d8af0675b/87efa937-b31f-4bb1-aee1-0ee14a0dc6fb: Error creating a new
> volume: (["Formatting \'/rhev/data-center/9497ef2c-8368-
> 4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-
> 467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/
> 87efa937-b31f-4bb1-aee1-0ee14a0dc6fb\', fmt=qcow2 size=21474836480
> backing_file=\'../466d9ae9-e46a-46f8-9f4b-964d8af0675b/1a67de4b-aa
> 1c-4436-baca-ca55726d54d7\' backing_fmt=\'qcow2\' encryption=off
> cluster_size=65536 "],)'
>
>
> do you have any alert in the webadmin to restart the vm?
>
> Dafna
>
>
> On 04/22/2014 03:31 PM, Steve Dainard wrote:
>
>> Sorry for the confusion.
>>
>> I attempted to take a live snapshot of a running VM. After that failed, I
>> migrated the VM to another host, and attempted the live snapshot again
>> without success, eliminating a single host as the cause of failure.
>>
>> Ovirt is 3.3.4, storage domain is gluster 3.4.2.1, OS is CentOS 6.5.
>>
>> Package versions:
>> libvirt-0.10.2-29.el6_5.5.x86_64
>> libvirt-lock-sanlock-0.10.2-29.el6_5.5.x86_64
>> qemu-img-rhev-0.12.1.2-2.415.el6.nux.3.x86_64
>> qemu-kvm-rhev-0.12.1.2-2.415.el6.nux.3.x86_64
>> qemu-kvm-rhev-tools-0.12.1.2-2.415.el6.nux.3.x86_64
>> vdsm-4.13.3-4.el6.x86_64
>> vdsm-gluster-4.13.3-4.el6.noarch
>>
>>
>> I made another live snapshot attempt at 10:21 EST today, full vdsm.log
>> attached, and a truncated engine.log.
>>
>> Thanks,
>>
>> *Steve
>> *
>>
>>
>>
>> On Tue, Apr 22, 2014 at 9:48 AM, Dafna Ron > d...@redhat.com>> wrote:
>>
>> please explain the flow of what you are trying to do, are you
>> trying to live migrate the disk (from one storage to another), are
>> you trying to migrate the vm and after vm migration is finished
>> you try to take a live snapshot of the vm? or are you trying to
>> take a live snapshot of the vm during a vm migration from host1 to
>> host2?
>>
>> Please attach full vdsm logs from any host you are using (if you
>> are trying to migrate the vm from host1 to host2) + please attach
>> engine log.
>>
>> Also, what is the vdsm, libvirt and qemu versions, what ovirt
>> version are you using and what is the storage you are using?
>>
>> Thanks,
>>
>> Dafna
>>
>>
>>
>>
>> On 04/22/2014 02:12 PM, Steve Dainard wrote:
>>
>> I've attempted migrating the vm 

Re: [ovirt-users] Ovirt snapshot failing on one VM

2014-04-22 Thread Steve Dainard
se
-rw-r--r--. 1 vdsm kvm 272 Apr 10 14:34
3ece1489-9bff-4223-ab97-e45135106222.meta
-rw-rw. 1 vdsm kvm 22413312 Apr 10 14:29
dcee2e8a-8803-44e2-80e8-82c882af83ef
-rw-rw. 1 vdsm kvm 1048576 Apr 10 14:28
3ece1489-9bff-4223-ab97-e45135106222.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 14:28
dcee2e8a-8803-44e2-80e8-82c882af83ef.meta
-rw-rw. 1 vdsm kvm 54460416 Apr 10 14:26
57066786-613a-46ff-b2f9-06d84678975b
-rw-rw. 1 vdsm kvm 1048576 Apr 10 14:26
dcee2e8a-8803-44e2-80e8-82c882af83ef.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 14:26
57066786-613a-46ff-b2f9-06d84678975b.meta
-rw-rw. 1 vdsm kvm 15728640 Apr 10 13:31
121ae509-d2b2-4df2-a56f-dfdba4b8d21c
-rw-rw. 1 vdsm kvm 1048576 Apr 10 13:30
57066786-613a-46ff-b2f9-06d84678975b.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 13:30
121ae509-d2b2-4df2-a56f-dfdba4b8d21c.meta
-rw-rw. 1 vdsm kvm 5767168 Apr 10 13:18
1d95a9d2-e4ba-4bcc-ba71-5d493a838dcc
-rw-rw. 1 vdsm kvm 1048576 Apr 10 13:17
121ae509-d2b2-4df2-a56f-dfdba4b8d21c.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 13:17
1d95a9d2-e4ba-4bcc-ba71-5d493a838dcc.meta
-rw-rw. 1 vdsm kvm 5373952 Apr 10 13:13
3ce8936a-38f5-43a9-a4e0-820094fbeb04
-rw-rw. 1 vdsm kvm 1048576 Apr 10 13:13
1d95a9d2-e4ba-4bcc-ba71-5d493a838dcc.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 13:12
3ce8936a-38f5-43a9-a4e0-820094fbeb04.meta
-rw-rw. 1 vdsm kvm  3815243776 Apr 10 13:11
7211d323-c398-4c1c-8524-a1047f9d5ec9
-rw-rw. 1 vdsm kvm 1048576 Apr 10 13:11
3ce8936a-38f5-43a9-a4e0-820094fbeb04.lease
-rw-r--r--. 1 vdsm kvm 272 Apr 10 13:11
7211d323-c398-4c1c-8524-a1047f9d5ec9.meta
-rw-r--r--. 1 vdsm kvm 272 Mar 19 10:35
af94adc4-fad4-42f5-a004-689670311d66.meta
-rw-rw. 1 vdsm kvm 21474836480 Mar 19 10:22
af94adc4-fad4-42f5-a004-689670311d66
-rw-rw. 1 vdsm kvm 1048576 Mar 19 09:39
7211d323-c398-4c1c-8524-a1047f9d5ec9.lease
-rw-rw. 1 vdsm kvm 1048576 Mar 19 09:39
af94adc4-fad4-42f5-a004-689670311d66.lease

It's just very odd that I can snapshot any other VM except this one.

I just cloned a new VM from the last snapshot on this VM and it created
without issue. I was also able to snapshot the new VM without a problem.
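
If it helps, I can also walk the backing chain of the problem disk by running
qemu-img info against each volume in that image directory, along the lines of
(path shortened):

  qemu-img info /rhev/data-center/.../images/466d9ae9-e46a-46f8-9f4b-964d8af0675b/1a67de4b-aa1c-4436-baca-ca55726d54d7

and paste the reported backing file for each one.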


*Steve *


On Tue, Apr 22, 2014 at 12:51 PM, Dafna Ron  wrote:

> it's the same error:
>
> c1d7c4e-392b-4a62-9836-3add1360a46d::DEBUG::2014-04-22
> 12:13:44,340::volume::1058::Storage.Misc.excCmd::(createVolume) FAILED:
>  = '/rhev/data-center/9497ef2c-8368-4c92-8d61-7f318a90748f/
> 95b9d922-4df7-4d3b-9bca-467e2fd9d573/images/4
> 66d9ae9-e46a-46f8-9f4b-964d8af0675b/0b2d15e5-bf4f-4eaf-90e2-f1bd51a3a936:
> error while creating qcow2: No such file or directory\n';  = 1
>
>
> were these 23 snapshots created any way each time we fail to create the
> snapshot or are these older snapshots which you actually created before the
> failure?
>
> at this point my main theory is that somewhere along the line you had some
> sort of failure in your storage and from that time each snapshot you create
> will fail.
> if the snapshots are created during the failure can you please delete the
> snapshots you do not need and try again?
>
> There should not be a limit on how many snapshots you can have since it's
> only a link changing the image the vm should boot from.
> Having said that, it's not ideal to have that many snapshots and can
> probably lead to unexpected results so I would not recommend having that
> many snapshots on a single vm :)
>
> for example, my second theory would be that because we have so many
> snapshots we have some sort of race where part of the createVolume command
> expects some result from a query run before the create itself and because
> there are so many snapshots there is "no such file" on the volume because
> it's too far up the list.
>
> can you also run: ls -l /rhev/data-center/9497ef2c-
> 8368-4c92-8d61-7f318a90748f/95b9d922-4df7-4d3b-9bca-
> 467e2fd9d573/images/466d9ae9-e46a-46f8-9f4b-964d8af0675b
>
> lets see what images are listed under that vm.
>
> btw, you know that your export domain is getting StorageDomainDoesNotExist
> in the vdsm log? is that domain in up state? can you try to deactivate the
> export domain?
>
> Thanks,
>
> Dafna
>
>
>
>
>
> On 04/22/2014 05:20 PM, Steve Dainard wrote:
>
>> Ominous..
>>
>> 23 snapshots. Is there an upper limit?
>>
>> Offline snapshot fails as well. Both logs attached again (snapshot
>> attempted at 12:13 EST).
>>
>> *Steve *
>>
>>
>> On Tue, Apr 22, 2014 at 11:20 AM, Dafna Ron > d...@redhat.com>> wrote:
>>
>> are you able to take an offline snapshot? (while the vm is down)
>> how many snapshots do you have on this vm?
>>

[ovirt-users] Anyone using gluster storage domain with WAN geo-rep?

2014-04-23 Thread Steve Dainard
I'm currently using a two node combined virt/storage setup with Ovirt 3.3.4
and Gluster 3.4.2 (replica 2, glusterfs storage domain). I'll call this
pair PROD.

I'm then geo-replicating to another gluster replica pair on the local net,
with btrfs underlying storage and volume snapshots, so I can recover my storage
domain from different points in time if necessary. It's also local, so restore
time is much better than off-site. I'll call this pair BACKUP.

I'm planning on setting up geo-replication from BACKUP to an EC2 gluster
target. I'll call this host EC2HOST.

PROD ---geo-rep-lan---> BACKUP ---geo-rep-wan---> EC2HOST

I'd like to avoid saturating my WAN link during office hours. I have some
ideas (or combination of):

1. limit bandwidth during certain hours to the offsite hosts. But
realistically the bandwidth I would allocate is so low I don't see the
purpose of this. Also with 8 guests running, I'm noticing quite a bit of
data transfer to the local backup nodes (avg 6-8MB/s), and I'm thinking
there is a lot of thrashing going on which isn't useful to back up offsite
anyway.

2. stop WAN geo-replication during office hours, and restart for
overnight/weekend hours.

3. Not use geo-rep between BACKUP ---> EC2HOST, use rsync on one of the
btrfs volume snapshots so we avoid the thrashing. In this case I could
limit WAN speed to 1MB/s which should be fine for most differences
throughout the day.
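
Rough sketches of options 2 and 3, with placeholder names for the slave
host/volume and paths:

  # cron on a BACKUP node: stop WAN geo-rep for office hours, restart after
  0 7  * * 1-5  gluster volume geo-replication backupvol ec2host::ec2vol stop
  0 19 * * 1-5  gluster volume geo-replication backupvol ec2host::ec2vol start

  # or rsync a read-only btrfs snapshot with a bandwidth cap (KB/s)
  rsync -aHAX --delete --bwlimit=1024 /backup/.snapshots/latest/ ec2host:/data/

I haven't settled on either yet.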

So my question is: how do you off-site your storage domains, what
constraints have you identified, and how have you dealt with them? And of
course, how would you deal with the scenario I've outlined above?

Thanks,




*Steve*
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Ovirt snapshot failing on one VM

2014-04-25 Thread Steve Dainard
Restarting vdsm and hosts didn't do anything helpful.

I was able to clone from latest snapshot, then live snapshot the new cloned
VM.

After upgrading engine to 3.4 and upgrading my hosts, I can now live
snapshot this VM.


*Steve *


On Thu, Apr 24, 2014 at 1:48 AM, Itamar Heim  wrote:

> On 04/23/2014 09:57 PM, R P Herrold wrote:
>
>> On Wed, 23 Apr 2014, Steve Dainard wrote:
>>
>>  I have other VM's with the same amount of snapshots without this problem.
>>> No conclusion jumping going on. More interested in what the best practice
>>> is for VM's that accumulate snapshots over time.
>>>
>>
>> For some real world context, we seem to accumulate snapshots
>> using our local approach, and are not that focused on, or
>> attentive about removing them.  The 'highwater mark' of 39, on
>> a machine that has been around since it was provisioned:
>> 2010-01-05
>>
>> [root@xxx backups]# ./count-snapshots.sh | sort -n | tail -3
>> 38 vm_64099
>> 38 vm_98036
>> 39 vm_06359
>>
>> Accumulating large numbers of snapshots seems more the
>> function of pets, than ephemeral 'cattle'
>>
>> I wrote the first paragraph without looking up the 'owners' of
>> the images. As I dereference the VM id's, all of the top ten
>> in that list turn out to be mailservers, radius servers, name
>> servers, and such, where the business unit owners chose not
>> (or neglect) to 'winnow' their herd.  There are no ephemeral
>> use units in the top ten
>>
>> -- Russ herrold
>> ___
>> Users mailing list
>> Users@ovirt.org
>> http://lists.ovirt.org/mailman/listinfo/users
>>
>>
> please note there is a recommended limit of having no more than 500
> snapshots per block storage domain due to some LVM performance issues with
> high number of LVs. each disk/snapshot is an LV.
> NFS doesn't have this limitation.
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Online backup options

2014-06-20 Thread Steve Dainard
Hello Ovirt team,

Reading this bulletin: https://access.redhat.com/site/solutions/117763
there is a reference to 'private Red Hat Bug # 523354' covering online
backups of VM's.

Can someone comment on this feature, and rough timeline? Is this a native
backup solution that will be included with Ovirt/RHEV?

Is this Ovirt feature where the work is being done?
http://www.ovirt.org/Features/Backup-Restore_API_Integration It seems like
this may be a different feature specifically for 3rd party backup options.

Thanks,
Steve
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Master storage domain no longer exists, how to recover?

2014-07-14 Thread Steve Dainard
I'm running a trial of RHEV 3.4 and decided to run through our DR plan.

I have two storage domains.
vm-store
aws_storage_gateway

vm-store is the domain which contains all local VM's which would be
considered important. I must have put vm-store into maintenance mode before
aws_storage_gateway, which caused aws_storage_gateway to become the master
domain. It also happens to be non-existent now after recovery, as the DR
process is to re-create this SD with new disks since all the data is on aws
anyway.

Is there a method of changing the master storage domain when all domains
are in maintenance mode?

If I attempt to destroy it I get a UI error:

Error while executing action: Cannot destroy the master Storage Domain from
the Data Center without another active Storage Domain to take its place.
-Either activate another Storage Domain in the Data Center, or remove the
Data Center.
-If you have problems with the master Data Domain, consider following the
recovery process described in the documentation, or contact your system
administrator.

I tried to fake the storage domain by creating another one, copying in the
file structure, and renaming the folder uuid and updating the meta file to
match the missing SD, without luck.

Can I update the db storage pool reference with the vm-store SD uuid?

engine=# select * from storage_pool;
 id                       | 5849b030-626e-47cb-ad90-3ce782d831b3
 name                     | Default
 description              | The default Data Center
 storage_pool_type        | 1
 storage_pool_format_type | 3
 status                   | 2
 master_domain_version    | 19
 spm_vds_id               |
 compatibility_version    | 3.3
 _create_date             | 2014-05-29 19:46:50.199772-04
 _update_date             | 2014-07-10 09:24:22.790738-04
 quota_enforcement_type   | 0
 free_text_comment        |
 is_local                 | f
(1 row)




Thanks,
Steve
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Master storage domain no longer exists, how to recover?

2014-07-14 Thread Steve Dainard
And 5 minutes later I found...

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Virtualization/3.4/html-single/Administration_Guide/index.html
3.5.6. Re-Initializing a Data Center: Recovery Procedure

Which worked quite well.

Steve


On Mon, Jul 14, 2014 at 10:27 AM, Steve Dainard 
wrote:

> I'm running a trial of RHEV 3.4 and decided to run through our DR plan.
>
> I have two storage domains.
> vm-store
> aws_storage_gateway
>
> vm-store is the domain which contains all local VM's which would be
> considered important. I must have put vm-store into maintenance mode before
> aws_storage_gateway which caused aws_storage_gateway to become the master
> domain. It also happens to be non-existant now after recovery, as the DR
> process is to re-create this SD with new disks as all the data is on aws
> anyways.
>
> Is there a method of changing the master storage domain when all domains
> are in maintenance mode?
>
> If I attempt to destroy it I get a UI error:
>
> Error while executing action: Cannot destroy the master Storage Domain
> from the Data Center without another active Storage Domain to take its
> place.
> -Either activate another Storage Domain in the Data Center, or remove the
> Data Center.
> -If you have problems with the master Data Domain, consider following the
> recovery process described in the documentation, or contact your system
> administrator.
>
> I tried to fake the storage domain by creating another one, copy in the
> file structure, and renaming the folder uuid and updating the meta file to
> match the missing SD without luck.
>
> Can I update the db storage pool reference with the vm-store SD uuid?
>
> engine=# select * from storage_pool;
>   id  |  name   |   description
> | storage_p
> ool_type | storage_pool_format_type | status | master_domain_version |
> spm_vds_id | c
> ompatibility_version | _create_date  |
> _update_date
> | quota_enforcement_type | free_text_comment | is_local
>
> --+-+-+--
>
> -+--++---++--
>
> -+---+---
> ++---+--
>  *5849b030-626e-47cb-ad90-3ce782d831b3* | Default | The default Data
> Center |
>1 | 3|  2 |19 |
>| 3
> .3   | 2014-05-29 19:46:50.199772-04 | 2014-07-10
> 09:24:22.790738-04
> |  0 |   | f
> (1 row)
>
>
>
>
> Thanks,
> Steve
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Relationship bw storage domain uuid/images/children and VM's

2014-07-17 Thread Steve Dainard
Hello,

I'd like to get an understanding of the relationship between VM's using a
storage domain, and the child directories listed under ...///images/.

Running through some backup scenarios I'm noticing a significant difference
between the number of provisioned VM's using a storage domain (21) +
templates (6) versus the number of child directories under images/ (107).

Running RHEV 3.4 trial.
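
(For reference, I'm getting the 107 number with a simple count on the mounted
domain; the path below is illustrative, with placeholders for the mount and
domain uuid:

  ls /rhev/data-center/mnt/<server:_export>/<sd-uuid>/images | wc -l

versus 21 VMs + 6 templates visible in the UI.)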

Thanks,
Steve
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Relationship bw storage domain uuid/images/children and VM's

2014-07-18 Thread Steve Dainard
another SD):

Thanks,
Steve


On Fri, Jul 18, 2014 at 8:12 AM, Yair Zaslavsky  wrote:

>
>
> - Original Message -----
> > From: "Steve Dainard" 
> > To: "users" 
> > Sent: Thursday, July 17, 2014 7:51:31 PM
> > Subject: [ovirt-users] Relationship bw storage domain
> > uuid/images/children and VM's
> >
> > Hello,
> >
> > I'd like to get an understanding of the relationship between VM's using a
> > storage domain, and the child directories listed under .../ domain
> > name>//images/.
> >
> > Running through some backup scenarios I'm noticing a significant
> difference
> > between the number of provisioned VM's using a storage domain (21) +
> > templates (6) versus the number of child directories under images/ (107).
>
> Can you please elaborate (if possible) on the number of images per VM that
> you're having in your setup?
>
> >
> > Running RHEV 3.4 trial.
> >
> > Thanks,
> > Steve
> >
> > ___
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> >
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Online backup options

2014-07-18 Thread Steve Dainard
Anyone?


On Fri, Jun 20, 2014 at 9:35 AM, Steve Dainard 
wrote:

> Hello Ovirt team,
>
> Reading this bulletin: https://access.redhat.com/site/solutions/117763
> there is a reference to 'private Red Hat Bug # 523354' covering online
> backups of VM's.
>
> Can someone comment on this feature, and rough timeline? Is this a native
> backup solution that will be included with Ovirt/RHEV?
>
> Is this Ovirt feature where the work is being done?
> http://www.ovirt.org/Features/Backup-Restore_API_Integration It seems
> like this may be a different feature specifically for 3rd party backup
> options.
>
> Thanks,
> Steve
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] RHEV 3.4 trial hosted-engine either host wants to take ownership

2014-07-21 Thread Steve Dainard
I added a hook to rhevm and then restarted the engine service, which
triggered a hosted-engine VM shutdown (likely because of the failed
liveliness check).

Once the hosted-engine VM shutdown it did not restart on the other host.

On both hosts configured for hosted-engine I'm seeing logs from ha-agent
where each host thinks the other host has a better score. Is there supposed
to be a mechanism for a tie breaker here? I do notice that the log mentions
best REMOTE host, so perhaps I'm interpreting this message incorrectly.

ha-agent logs:

Host 001:

MainThread::INFO::2014-07-21
11:51:57,396::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405957917.4 type=state_transition
detail=EngineDown-EngineDown hostname='rhev001.miovision.corp'
MainThread::INFO::2014-07-21
11:51:57,397::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineDown-EngineDown) sent?
ignored
MainThread::INFO::2014-07-21
11:51:57,924::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-21
11:51:57,924::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host rhev002.miovision.corp (id: 2, score: 2400)
MainThread::INFO::2014-07-21
11:52:07,961::states::454::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine down, local host does not have best score
MainThread::INFO::2014-07-21
11:52:07,975::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405957927.98 type=state_transition
detail=EngineDown-EngineDown hostname='rhev001.miovision.corp'

Host 002:

MainThread::INFO::2014-07-21
11:51:47,405::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405957907.41 type=state_transition
detail=EngineDown-EngineDown hostname='rhev002.miovision.corp'
MainThread::INFO::2014-07-21
11:51:47,406::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineDown-EngineDown) sent?
ignored
MainThread::INFO::2014-07-21
11:51:47,834::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineDown (score: 2400)
MainThread::INFO::2014-07-21
11:51:47,835::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host rhev001.miovision.corp (id: 1, score: 2400)
MainThread::INFO::2014-07-21
11:51:57,870::states::454::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(consume)
Engine down, local host does not have best score
MainThread::INFO::2014-07-21
11:51:57,883::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405957917.88 type=state_transition
detail=EngineDown-EngineDown hostname='rhev002.miovision.corp'

This went on for 20 minutes about an hour ago, and I decided to run
hosted-engine --vm-start on one of the hosts. The manager VM runs for a few
minutes with the engine UI accessible before shutting itself down again.

I then put host 002 into local maintenance mode, and host 001 auto started
the hosted-engine VM. The logging still references host 002 as the 'best
remote host' even though the calculated score is now 0:

MainThread::INFO::2014-07-21
12:03:24,011::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405958604.01 type=state_transition
detail=EngineUp-EngineUp hostname='rhev001.miovision.corp'
MainThread::INFO::2014-07-21
12:03:24,013::brokerlink::117::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Success, was notification of state_transition (EngineUp-EngineUp) sent?
ignored
MainThread::INFO::2014-07-21
12:03:24,515::hosted_engine::323::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Current state EngineUp (score: 2400)
MainThread::INFO::2014-07-21
12:03:24,516::hosted_engine::328::ovirt_hosted_engine_ha.agent.hosted_engine.HostedEngine::(start_monitoring)
Best remote host rhev002.miovision.corp (id: 2, score: 0)
MainThread::INFO::2014-07-21
12:03:34,567::brokerlink::108::ovirt_hosted_engine_ha.lib.brokerlink.BrokerLink::(notify)
Trying: notify time=1405958614.57 type=state_transition
detail=EngineUp-EngineUp hostname='rhev001.miovision.corp'

Once the hosted-engine VM was up for about 5 minutes I took host 002 out of
local maintenance mode and the VM has not since shutdown.

Is this expected behaviour? Is this the normal recovery process when two
hosts both hosting hosted-engine are started at the same time? I would have
expected that once the hosted-engine VM was detected as bad (the liveliness
check failed after I restarted the engine service) and the VM was shut down,
it would spin back up on the next available host.
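
In case it matters: I've been checking state with hosted-engine --vm-status on
both hosts, and the maintenance flips above were done with
hosted-engine --set-maintenance --mode=local and then --mode=none.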

Thanks,
Steve

[ovirt-users] hostusb hook - VM device errors in Windows VM

2014-07-21 Thread Steve Dainard
I'm using the hostusb hook on RHEV 3.4 trial.

The usb device is passed through to the VM, but I'm getting errors in a
Windows VM when the device driver is loaded.

I started with a simple usb drive, on the host it is listed as:

Bus 002 Device 010: ID 05dc:c75c Lexar Media, Inc.

Which I added as 0x05dc:0xc75c to the Windows 7 x64 VM.

In Windows I get an error in device manager:
USB Mass Storage Device "This device cannot start. (Code 10)"
Properties/General Tab: Device type: Universal Serial Bus Controllers,
Manufacturer: Compatible USB storage device, Location: Port_#0001.Hub_#0001

Under hardware Ids:
USB\VID_05DC&PID_C75C&REV_0102
USB\VID_05DC&PID_C75C

So it looks like the proper USB device ID is passed to the VM.

I don't see any error messages in event viewer, and I don't see anything in
VDSM logs either.
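
On the host side I haven't spotted anything unusual either; for reference I'm
checking the device with:

  lsusb -v -d 05dc:c75c
  dmesg | tail

before and after starting the VM.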

Any help is appreciated.

Steve
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hostusb hook - VM device errors in Windows VM

2014-07-21 Thread Steve Dainard
I should mention I can mount this usb drive in a CentOS 6.5 VM without any
problems.


On Mon, Jul 21, 2014 at 2:11 PM, Steve Dainard 
wrote:

> I'm using the hostusb hook on RHEV 3.4 trial.
>
> The usb device is passed through to the VM, but I'm getting errors in a
> Windows VM when the device driver is loaded.
>
> I started with a simple usb drive, on the host it is listed as:
>
> Bus 002 Device 010: ID 05dc:c75c Lexar Media, Inc.
>
> Which I added as 0x05dc:0xc75c to the Windows 7 x64 VM.
>
> In Windows I get an error in device manager:
> USB Mass Storage Device "This device cannot start. (Code 10)"
> Properties/General Tab: Device type: Universal Serial Bus Controllers,
> Manufacturer: Compatible USB storage device, Location: Port_#0001.Hub_#0001
>
> Under hardware Ids:
> USB\VID_05DC&PID_C75C&REV_0102
> USB\VID_05DC&PID_C75C
>
> So it looks like the proper USB device ID is passed to the VM.
>
> I don't see any error messages in event viewer, and I don't see anything
> in VDSM logs either.
>
> Any help is appreciated.
>
> Steve
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hostusb hook - VM device errors in Windows VM

2014-07-22 Thread Steve Dainard
Hi Michal,

How can I generate libvirt xml from rhevm?

Thanks,
Steve


On Tue, Jul 22, 2014 at 4:12 AM, Michal Skrivanek <
michal.skriva...@redhat.com> wrote:

>
> On 21 Jul 2014, at 20:54, Steve Dainard wrote:
>
> I should mention I can mount this usb drive in a CentOS 6.5 VM without any
> problems.
>
>
> Hi,
> there should be no difference configuration-wise. well, please compare
> libvirt's xml to be sure and confirm
> If it's the case then it might be a problem of qemu/kvm and/or windows
>
> Thanks,
> michal
>
>
>
> On Mon, Jul 21, 2014 at 2:11 PM, Steve Dainard 
> wrote:
>
>> I'm using the hostusb hook on RHEV 3.4 trial.
>>
>> The usb device is passed through to the VM, but I'm getting errors in a
>> Windows VM when the device driver is loaded.
>>
>> I started with a simple usb drive, on the host it is listed as:
>>
>> Bus 002 Device 010: ID 05dc:c75c Lexar Media, Inc.
>>
>> Which I added as 0x05dc:0xc75c to the Windows 7 x64 VM.
>>
>> In Windows I get an error in device manager:
>> USB Mass Storage Device "This device cannot start. (Code 10)"
>> Properties/General Tab: Device type: Universal Serial Bus Controllers,
>> Manufacturer: Compatible USB storage device, Location: Port_#0001.Hub_#0001
>>
>> Under hardware Ids:
>> USB\VID_05DC&PID_C75C&REV_0102
>> USB\VID_05DC&PID_C75C
>>
>> So it looks like the proper USB device ID is passed to the VM.
>>
>> I don't see any error messages in event viewer, and I don't see anything
>> in VDSM logs either.
>>
>> Any help is appreciated.
>>
>> Steve
>>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hostusb hook - VM device errors in Windows VM

2014-07-22 Thread Steve Dainard
The USB section is the same on both guests:


  
  



  



  
  


I thought it might be a Windows issue; I've tried on both Win7 64 and
2008R2 64 with the same result. At one point there was a notification
bubble that the USB device can run faster, which I thought might mean the
Windows guest was provided a USB 1.x controller.

I've attached both the xml's if you think there might be something I'm not
seeing.

Thanks,
Steve


On Tue, Jul 22, 2014 at 9:50 AM, Michal Skrivanek <
michal.skriva...@redhat.com> wrote:

>
> On Jul 22, 2014, at 15:49 , Steve Dainard  wrote:
>
> > Hi Michal,
> >
> > How can I generate libvirt xml from rhevm?
>
> "virsh -r dumpxml " on the host
>
> Thanks,
> michal
>
> >
> > Thanks,
> > Steve
> >
> >
> > On Tue, Jul 22, 2014 at 4:12 AM, Michal Skrivanek <
> michal.skriva...@redhat.com> wrote:
> >
> > On 21 Jul 2014, at 20:54, Steve Dainard wrote:
> >
> >> I should mention I can mount this usb drive in a CentOS 6.5 VM without
> any problems.
> >
> > Hi,
> > there should be no difference configuration-wise. well, please compare
> libvirt's xml to be sure and confirm
> > If it's the case then it might be a problem of qemu/kvm and/or windows
> >
> > Thanks,
> > michal
> >
> >>
> >>
> >> On Mon, Jul 21, 2014 at 2:11 PM, Steve Dainard 
> wrote:
> >> I'm using the hostusb hook on RHEV 3.4 trial.
> >>
> >> The usb device is passed through to the VM, but I'm getting errors in a
> Windows VM when the device driver is loaded.
> >>
> >> I started with a simple usb drive, on the host it is listed as:
> >>
> >> Bus 002 Device 010: ID 05dc:c75c Lexar Media, Inc.
> >>
> >> Which I added as 0x05dc:0xc75c to the Windows 7 x64 VM.
> >>
> >> In Windows I get an error in device manager:
> >> USB Mass Storage Device "This device cannot start. (Code 10)"
> >> Properties/General Tab: Device type: Universal Serial Bus Controllers,
> Manufacturer: Compatible USB storage device, Location: Port_#0001.Hub_#0001
> >>
> >> Under hardware Ids:
> >> USB\VID_05DC&PID_C75C&REV_0102
> >> USB\VID_05DC&PID_C75C
> >>
> >> So it looks like the proper USB device ID is passed to the VM.
> >>
> >> I don't see any error messages in event viewer, and I don't see
> anything in VDSM logs either.
> >>
> >> Any help is appreciated.
> >>
> >> Steve
> >>
> >> ___
> >> Users mailing list
> >> Users@ovirt.org
> >> http://lists.ovirt.org/mailman/listinfo/users
> >
> >
>
>
[Attached libvirt domain XML for both guests; the element markup did not
survive the list archive. What remains readable: the CentOS 6.5 guest
"steve-test" (uuid deabe801-bc66-43c9-9d99-ecd852c3ac6e, 2097152 KiB memory)
and the Windows guest "usb-test" (uuid 44856291-3729-485d-8426-e0b05410202f,
4194304 KiB memory), both with SMBIOS strings Red Hat / RHEV Hypervisor /
6Server-6.5.0.1.el6, CPU model SandyBridge, and emulator
/usr/libexec/qemu-kvm.]

Re: [ovirt-users] hostusb hook - VM device errors in Windows VM

2014-07-22 Thread Steve Dainard
I just saw the "your device can perform faster" warning again in Windows
and decided to check it out.

Should the USB device be showing under an Intel controller? Is the RH
controller only for spice?




On Tue, Jul 22, 2014 at 10:12 AM, Dan Kenigsberg  wrote:

> On Tue, Jul 22, 2014 at 03:50:59PM +0200, Michal Skrivanek wrote:
> >
> > On Jul 22, 2014, at 15:49 , Steve Dainard 
> wrote:
> >
> > > Hi Michal,
> > >
> > > How can I generate libvirt xml from rhevm?
> >
> > "virsh -r dumpxml " on the host
>
> Or dig into vdsm.log (in case the VM is no longer there)
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] hostusb hook - VM device errors in Windows VM

2014-07-25 Thread Steve Dainard
Any other ideas here? Is there a specific driver I should load instead of
the Windows default one?

Thanks,
Steve


On Tue, Jul 22, 2014 at 10:23 AM, Steve Dainard 
wrote:

> I just saw the "your device can perform faster" warning again in Windows
> and decided to check it out.
>
> Should the USB device be showing under an Intel controller? Is the RH
> controller only for spice?
>
>
> ​
>
>
> On Tue, Jul 22, 2014 at 10:12 AM, Dan Kenigsberg 
> wrote:
>
>> On Tue, Jul 22, 2014 at 03:50:59PM +0200, Michal Skrivanek wrote:
>> >
>> > On Jul 22, 2014, at 15:49 , Steve Dainard 
>> wrote:
>> >
>> > > Hi Michal,
>> > >
>> > > How can I generate libvirt xml from rhevm?
>> >
>> > "virsh -r dumpxml " on the host
>>
>> Or dig into vdsm.log (in case the VM is no longer there)
>>
>
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Ovirt 3.5 host gluster storage connection failure

2015-12-22 Thread Steve Dainard
I have two hosts; only one of them was running VM's at the time of
this crash, so I can't tell if this is a node-specific problem.
rpm -qa | egrep -i 'gluster|vdsm|libvirt' |sort
glusterfs-3.6.7-1.el7.x86_64
glusterfs-api-3.6.7-1.el7.x86_64
glusterfs-cli-3.6.7-1.el7.x86_64
glusterfs-fuse-3.6.7-1.el7.x86_64
glusterfs-libs-3.6.7-1.el7.x86_64
glusterfs-rdma-3.6.7-1.el7.x86_64
libvirt-client-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-driver-interface-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-driver-network-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-driver-secret-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-driver-storage-1.2.8-16.el7_1.5.x86_64
libvirt-daemon-kvm-1.2.8-16.el7_1.5.x86_64
libvirt-lock-sanlock-1.2.8-16.el7_1.5.x86_64
libvirt-python-1.2.8-7.el7_1.1.x86_64
vdsm-4.16.30-0.el7.centos.x86_64
vdsm-cli-4.16.30-0.el7.centos.noarch
vdsm-jsonrpc-4.16.30-0.el7.centos.noarch
vdsm-python-4.16.30-0.el7.centos.noarch
vdsm-python-zombiereaper-4.16.30-0.el7.centos.noarch
vdsm-xmlrpc-4.16.30-0.el7.centos.noarch
vdsm-yajsonrpc-4.16.30-0.el7.centos.noarch
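
The storage domain is the usual glusterfs fuse mount under
/rhev/data-center/mnt/glusterSD/ on each host; the "Transport endpoint is not
connected" errors below suggest the fuse client for that mount went away. For
reference, when it happens I check along these lines:

  mount | grep glusterfs
  ls /rhev/data-center/mnt/glusterSD/
  less /var/log/glusterfs/rhev-data-center-mnt-glusterSD-*.log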

VM's were in a paused state, with errors in UI:

2015-Dec-22, 15:06
VM pcic-apps has paused due to unknown storage error.
2015-Dec-22, 15:06
Host compute2 is not responding. It will stay in Connecting state for
a grace period of 82 seconds and after that an attempt to fence the
host will be issued.
2015-Dec-22, 15:03
Invalid status on Data Center EDC2. Setting Data Center status to Non
Responsive (On host compute2, Error: General Exception).
2015-Dec-22, 15:03
VM pcic-storage has paused due to unknown storage error.
2015-Dec-22, 15:03
VM docker1 has paused due to unknown storage error.

VDSM logs look normal until:
Dummy-99::DEBUG::2015-12-22
23:03:58,949::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail)
dd 
if=/rhev/data-center/f72ec125-69a1-4c1b-a5e1-313fcb70b6ff/mastersd/dom_md/inbox
iflag=direct,fullblock count=1 bs=1024000 (cwd None)
Dummy-99::DEBUG::2015-12-22
23:03:58,963::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail)
SUCCESS:  = '1+0 records in\n1+0 records out\n1024000 bytes (1.0
MB) copied, 0.00350501 s, 292 MB/s\n';  = 0
VM Channels Listener::INFO::2015-12-22
23:03:59,527::guestagent::180::vm.Vm::(_handleAPIVersion)
vmId=`7067679e-43aa-43c0-b263-b0a711ade2e2`::Guest API version changed
from 2 to 1
Thread-245428::DEBUG::2015-12-22
23:03:59,718::libvirtconnection::151::root::(wrapper) Unknown
libvirterror: ecode: 80 edom: 20 level: 2 message: metadata not found:
Requested metadata element is not present
libvirtEventLoop::INFO::2015-12-22
23:04:00,447::vm::4982::vm.Vm::(_onIOError)
vmId=`376e98b7-7798-46e8-be03-5dddf6cfb54f`::abnormal vm stop device
virtio-disk0 error eother
libvirtEventLoop::DEBUG::2015-12-22
23:04:00,447::vm::5666::vm.Vm::(_onLibvirtLifecycleEvent)
vmId=`376e98b7-7798-46e8-be03-5dddf6cfb54f`::event Suspended detail 2
opaque None
libvirtEventLoop::INFO::2015-12-22
23:04:00,447::vm::4982::vm.Vm::(_onIOError)
vmId=`376e98b7-7798-46e8-be03-5dddf6cfb54f`::abnormal vm stop device
virtio-disk0 error eother
...
libvirtEventLoop::INFO::2015-12-22
23:04:00,843::vm::4982::vm.Vm::(_onIOError)
vmId=`97fbbf97-944b-4b77-b0bf-6a831f9090d8`::abnormal vm stop device
virtio-disk1 error eother
libvirtEventLoop::DEBUG::2015-12-22
23:04:00,844::vm::5666::vm.Vm::(_onLibvirtLifecycleEvent)
vmId=`97fbbf97-944b-4b77-b0bf-6a831f9090d8`::event Suspended detail 2
opaque None
Dummy-99::DEBUG::2015-12-22
23:04:00,973::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail)
dd 
if=/rhev/data-center/f72ec125-69a1-4c1b-a5e1-313fcb70b6ff/mastersd/dom_md/inbox
iflag=direct,fullblock count=1 bs=1024000 (cwd None)
Dummy-99::DEBUG::2015-12-22
23:04:00,983::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail)
FAILED:  = "dd: failed to open
'/rhev/data-center/f72ec125-69a1-4c1b-a5e1-313fcb70b6ff/mastersd/dom_md/inbox':
Transport endpoint is not connected\n";  = 1
Dummy-99::ERROR::2015-12-22
23:04:00,983::storage_mailbox::787::Storage.MailBox.SpmMailMonitor::(run)
Error checking for mail
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/storage_mailbox.py", line 785, in run
self._checkForMail()
  File "/usr/share/vdsm/storage/storage_mailbox.py", line 734, in _checkForMail
"Could not read mailbox: %s" % self._inbox)
IOError: [Errno 5] _handleRequests._checkForMail - Could not read
mailbox: 
/rhev/data-center/f72ec125-69a1-4c1b-a5e1-313fcb70b6ff/mastersd/dom_md/inbox
Dummy-99::DEBUG::2015-12-22
23:04:02,987::storage_mailbox::731::Storage.Misc.excCmd::(_checkForMail)
dd 
if=/rhev/data-center/f72ec125-69a1-4c1b-a5e1-313fcb70b6ff/mastersd/dom_md/inbox
iflag=direct,fullblock count=1 bs=1024000 (cwd None)
Dummy-99::DEBUG::2015-12-22
23:04:02,994::storage_mailbox::731::Storage.Misc.excCmd::(_check

Re: [Users] attaching glusterfs storage domain --> method "glusterServicesGet" is not supported

2013-07-01 Thread Steve Dainard
Creating /var/lib/glusterd/groups/virt on each node and adding the parameters
found here:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Storage/2.0/html/Quick_Start_Guide/chap-Quick_Start_Guide-Virtual_Preparation.html
solved this issue.
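
For reference, the group file is just a flat list of volume options, one per
line; per that doc it's roughly the following (double-check against your
gluster version):

  quick-read=off
  read-ahead=off
  io-cache=off
  stat-prefetch=off
  eager-lock=enable
  remote-dio=enable

and then 'gluster volume set vol1 group virt' (or the engine's "Optimize for
Virt Store" option) applies the whole group to the volume.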




Steve Dainard
Infrastructure Manager
Miovision Technologies Inc.
Phone: 519-513-2407 x250


On Mon, Jul 1, 2013 at 4:54 PM, Steve Dainard wrote:

> I'm using Ovirt nightly on Fedora 18 to determine if ovirt + gluster is
> something that will work for my organization (or at least when RHEV is
> released with this functionality). I'm attempting to use the nodes for both
> virt and gluster storage.
>
> I've successfully created gluster bricks on two hosts, ovirt001 &
> ovirt002, and started volume vol1 through engine web ui. I've created a
> gluster storage domain, but cannot attach to data center.
>
> *Engine UI error:*
> Failed to attach Storage Domains to Data Center Default. (User:
> admin@internal)
>
> *Engine /var/log/ovirt-engine/engine.log errors:*
> 2013-07-01 16:29:02,650 ERROR
> [org.ovirt.engine.core.vdsbroker.gluster.GlusterServicesListVDSCommand]
> (pool-6-thread-49) Command GlusterServicesListVDS execution failed.
> Exception: VDSNetworkException: org.apache.xmlrpc.XmlRpcException:  'exceptions.Exception'>:method "glusterServicesGet" is not supported
> I've attached the much larger log file
>
> *Host
> /var/log/glusterfs/rhev-data-center-mnt-glusterSD-ovirt001\:vol1.log when
> attaching:*
> [2013-07-01 20:40:33.718871] I [afr-self-heal-data.c:655:afr_sh_data_fix]
> 0-vol1-replicate-0: no active sinks for performing self-heal on file
> /b2076340-84de-45ff-9d4b-d0d48b935fca/dom_md/ids
> [2013-07-01 20:40:33.721059] W
> [client-rpc-fops.c:873:client3_3_writev_cbk] 0-vol1-client-0: remote
> operation failed: Invalid argument
> [2013-07-01 20:40:33.721104] W
> [client-rpc-fops.c:873:client3_3_writev_cbk] 0-vol1-client-1: remote
> operation failed: Invalid argument
> [2013-07-01 20:40:33.721130] W [fuse-bridge.c:2127:fuse_writev_cbk]
> 0-glusterfs-fuse: 304: WRITE => -1 (Invalid argument)
>
> *Engine repos:*
> [root@ovirt-manager2 yum.repos.d]# ll
> total 16
> -rw-r--r--. 1 root root 1145 Dec 20  2012 fedora.repo
> -rw-r--r--. 1 root root 1105 Dec 20  2012 fedora-updates.repo
> -rw-r--r--. 1 root root 1163 Dec 20  2012 fedora-updates-testing.repo
> -rw-r--r--. 1 root root  144 Jun 21 18:34 ovirt-nightly.repo
>
> [root@ovirt-manager2 yum.repos.d]# cat *
> [fedora]
> name=Fedora $releasever - $basearch
> failovermethod=priority
> #baseurl=
> http://download.fedoraproject.org/pub/fedora/linux/releases/$releasever/Everything/$basearch/os/
> mirrorlist=
> https://mirrors.fedoraproject.org/metalink?repo=fedora-$releasever&arch=$basearch
> enabled=1
> #metadata_expire=7d
> gpgcheck=1
> gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$basearch
>
> [fedora-debuginfo]
> name=Fedora $releasever - $basearch - Debug
> failovermethod=priority
> #baseurl=
> http://download.fedoraproject.org/pub/fedora/linux/releases/$releasever/Everything/$basearch/debug/
> mirrorlist=
> https://mirrors.fedoraproject.org/metalink?repo=fedora-debug-$releasever&arch=$basearch
> enabled=0
> metadata_expire=7d
> gpgcheck=1
> gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$basearch
>
> [fedora-source]
> name=Fedora $releasever - Source
> failovermethod=priority
> #baseurl=
> http://download.fedoraproject.org/pub/fedora/linux/releases/$releasever/Everything/source/SRPMS/
> mirrorlist=
> https://mirrors.fedoraproject.org/metalink?repo=fedora-source-$releasever&arch=$basearch
> enabled=0
> metadata_expire=7d
> gpgcheck=1
> gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$basearch
> [updates]
> name=Fedora $releasever - $basearch - Updates
> failovermethod=priority
> #baseurl=
> http://download.fedoraproject.org/pub/fedora/linux/updates/$releasever/$basearch/
> mirrorlist=
> https://mirrors.fedoraproject.org/metalink?repo=updates-released-f$releasever&arch=$basearch
> enabled=1
> gpgcheck=1
> gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$basearch
>
> [updates-debuginfo]
> name=Fedora $releasever - $basearch - Updates - Debug
> failovermethod=priority
> #baseurl=
> http://download.fedoraproject.org/pub/fedora/linux/updates/$releasever/$basearch/debug/
> mirrorlist=
> https://mirrors.fedoraproject.org/metalink?repo=updates-released-debug-f$releasever&arch=$basearch
> enabled=0
> gpgcheck=1
> gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-fedora-$basearch
>
> [updates-source]
> name=Fedora $releasever - Updates Source
> failovermethod=priority
> #baseurl=
> http://download.fedoraproject.org/pub/

[Users] Ovirt 3.3 nightly, Gluster 3.4 stable, cannot launch VM with gluster storage domain backed disk

2013-07-17 Thread Steve Dainard
nd]
(DefaultQuartzScheduler_Worker-2) START, FullListVdsCommand(HostName =
ovirt001, HostId = d07967ab-3764-47ff-8755-bc539a7feb3b,
vds=Host[ovirt001], vmIds=[8e2c9057-deee-48a6-8314-a34530fc53cb]), log id:
119758a
2013-07-17 11:12:39,453 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.FullListVdsCommand]
(DefaultQuartzScheduler_Worker-2) FINISH, FullListVdsCommand, return:
[Ljava.util.HashMap;@2e73b796, log id: 119758a
2013-07-17 11:12:39,478 ERROR
[org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
(DefaultQuartzScheduler_Worker-2) Rerun vm
8e2c9057-deee-48a6-8314-a34530fc53cb. Called from vds ovirt001
2013-07-17 11:12:39,574 INFO  [org.ovirt.engine.core.bll.RunVmCommand]
(pool-6-thread-49) Lock Acquired to object EngineLock [exclusiveLocks= key:
8e2c9057-deee-48a6-8314-a34530fc53cb value: VM
, sharedLocks= ]
2013-07-17 11:12:39,603 INFO
 [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand]
(pool-6-thread-49) START, IsVmDuringInitiatingVDSCommand( vmId =
8e2c9057-deee-48a6-8314-a34530fc53cb), log id: 497e83ec
2013-07-17 11:12:39,606 INFO
 [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand]
(pool-6-thread-49) FINISH, IsVmDuringInitiatingVDSCommand, return: false,
log id: 497e83ec
2013-07-17 11:12:39,661 INFO
 [org.ovirt.engine.core.bll.scheduling.VdsSelector] (pool-6-thread-49)  VDS
ovirt001 d07967ab-3764-47ff-8755-bc539a7feb3b have failed running this VM
in the current selection cycle
2013-07-17 11:12:39,663 WARN  [org.ovirt.engine.core.bll.RunVmCommand]
(pool-6-thread-49) CanDoAction of action RunVm failed.
Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,ACTION_TYPE_FAILED_VDS_VM_CLUSTER
2013-07-17 11:12:39,664 INFO  [org.ovirt.engine.core.bll.RunVmCommand]
(pool-6-thread-49) Lock freed to object EngineLock [exclusiveLocks= key:
8e2c9057-deee-48a6-8314-a34530fc53cb value: VM
, sharedLocks= ]
2013-07-17 11:13:49,097 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager]
(DefaultQuartzScheduler_Worker-42) Setting new tasks map. The map contains
now 0 tasks
2013-07-17 11:13:49,099 INFO  [org.ovirt.engine.core.bll.AsyncTaskManager]
(DefaultQuartzScheduler_Worker-42) Cleared all tasks of pool
5849b030-626e-47cb-ad90-3ce782d831b3.



*Steve Dainard *
Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |
**LinkedIn<https://www.linkedin.com/company/miovision-technologies>  |
 Twitter <https://twitter.com/miovision>  |
Facebook<https://www.facebook.com/miovision>
*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Ovirt 3.3 nightly, Gluster 3.4 stable, cannot launch VM with gluster storage domain backed disk

2013-07-17 Thread Steve Dainard
lusiveLocks= key: 8e2c9057-deee-48a6-
8314-a34530fc53cb value: VM
, sharedLocks= ]
2013-07-17 12:39:51,459 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
(DefaultQuartzScheduler_Worker-85) START, DestroyVDSCommand(HostName =
ovirt001,
HostId = d07967ab-3764-47ff-8755-bc539a7feb3b,
vmId=8e2c9057-deee-48a6-8314-a34530fc53cb, force=false, secondsToWait=0,
gracefully=false), log id: 60626686
2013-07-17 12:39:51,548 INFO
 [org.ovirt.engine.core.vdsbroker.vdsbroker.DestroyVDSCommand]
(DefaultQuartzScheduler_Worker-85) FINISH, DestroyVDSCommand, log id:
60626686
2013-07-17 12:39:51,635 INFO
 [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
(DefaultQuartzScheduler_Worker-85) Running on vds during rerun failed vm:
null
2013-07-17 12:39:51,641 INFO
 [org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
(DefaultQuartzScheduler_Worker-85) vm VM1 running in db and not running in
vds - add to
rerun treatment. vds ovirt001
2013-07-17 12:39:51,660 ERROR
[org.ovirt.engine.core.vdsbroker.VdsUpdateRunTimeInfo]
(DefaultQuartzScheduler_Worker-85) Rerun vm
8e2c9057-deee-48a6-8314-a34530fc53cb. Called
 from vds ovirt001
2013-07-17 12:39:51,729 INFO  [org.ovirt.engine.core.bll.RunVmCommand]
(pool-6-thread-50) Lock Acquired to object EngineLock [exclusiveLocks= key:
8e2c9057-deee-48a6-8314-a3
4530fc53cb value: VM
, sharedLocks= ]
2013-07-17 12:39:51,753 INFO
 [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand]
(pool-6-thread-50) START, IsVmDuringInitiatingVDSCommand( vmId =
8e2c9057-deee
-48a6-8314-a34530fc53cb), log id: 7647c7d4
2013-07-17 12:39:51,753 INFO
 [org.ovirt.engine.core.vdsbroker.IsVmDuringInitiatingVDSCommand]
(pool-6-thread-50) FINISH, IsVmDuringInitiatingVDSCommand, return: false,
log
id: 7647c7d4
2013-07-17 12:39:51,794 INFO
 [org.ovirt.engine.core.bll.scheduling.VdsSelector] (pool-6-thread-50)  VDS
ovirt001 d07967ab-3764-47ff-8755-bc539a7feb3b have failed running th
is VM in the current selection cycle
2013-07-17 12:39:51,794 WARN  [org.ovirt.engine.core.bll.RunVmCommand]
(pool-6-thread-50) CanDoAction of action RunVm failed.
Reasons:VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,VAR__ACTION__RUN,VAR__TYPE__VM,ACTION_TYPE_FAILED_VDS_VM_CLUSTER
2013-07-17 12:39:51,795 INFO  [org.ovirt.engine.core.bll.RunVmCommand]
(pool-6-thread-50) Lock freed to object EngineLock [exclusiveLocks= key:
8e2c9057-deee-48a6-8314-a34530fc53cb value: VM
, sharedLocks= ]



*Steve Dainard *
Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)

*Blog <http://miovision.com/blog>  |
**LinkedIn<https://www.linkedin.com/company/miovision-technologies>  |
 Twitter <https://twitter.com/miovision>  |
Facebook<https://www.facebook.com/miovision>
*
--
 Miovision Technologies Inc. | 148 Manitou Drive, Suite 101, Kitchener, ON,
Canada | N2C 1L3
This e-mail may contain information that is privileged or confidential. If
you are not the intended recipient, please delete the e-mail and any
attachments and notify us immediately.


On Wed, Jul 17, 2013 at 12:21 PM, Vijay Bellur  wrote:

> On 07/17/2013 09:04 PM, Steve Dainard wrote:
>
>
>>
>>
>> *Web-UI displays:*
>>
>> VM VM1 is down. Exit message: internal error process exited while
>> connecting to monitor: qemu-system-x86_64: -drive
>> file=gluster://ovirt001/vol1/a87a7ef6-2c74-4d8e-a6e0-a392d0f791cf/images/238cc6cf-070c-4483-b686-c0de7ddf0dfa/ff2bca2d-4ed1-46c6-93c8-22a39bb1626a,if=none,id=drive-virtio-disk0,format=raw,serial=238cc6cf-070c-4483-b686-c0de7ddf0dfa,cache=none,werror=stop,rerror=stop,aio=threads:
>> could not open disk image
>> gluster://ovirt001/vol1/a87a7ef6-2c74-4d8e-a6e0-a392d0f791cf/images/238cc6cf-070c-4483-b686-c0de7ddf0dfa/ff2bca2d-4ed1-46c6-93c8-22a39bb1626a:
>> No such file or directory.
>> VM VM1 was started by admin@internal (Host: ovirt001).
>> The disk VM1_Disk1 was successfully added to VM VM1.
>>
>> *I can see the image on the gluster machine, and it looks to have the
>> correct permissions:*
>>
>> [root@ovirt001 238cc6cf-070c-4483-b686-**c0de7ddf0dfa]# pwd
>> /mnt/storage1/vol1/a87a7ef6-**2c74-4d8e-a6e0-a392d0f791cf/**
>> images/238cc6cf-070c-4483-**b686-c0de7ddf0dfa
>> [root@ovirt001 238cc6cf-070c-4483-b686-**c0de7ddf0dfa]# ll
>> total 1028
>> -rw-rw. 2 vdsm kvm 32212254720 Jul 17 11:11
>> ff2bca2d-4ed1-46c6-93c8-**22a39bb1626a
>> -rw-rw. 2 vdsm kvm 1048576 Jul 17 11:11
>> ff2bca2d-4ed1-46c6-93c8-**22a39bb1626a.lease
>> -rw-r--r--. 2 vdsm kvm 268 Jul 17 11:11
>> ff2bca2d-4ed1-46c6-93c8-**22a39bb1626a.meta
>>
>
> Can you please try after doing these changes:
>
> 1) gluster volume set  server.allo
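For reference, the volume option usually involved when qemu opens images over gluster:// (libgfapi) is the insecure-port setting below. This is a hedged sketch only; the volume name vol1 is taken from the error above and none of it is confirmed by the reply:

gluster volume set vol1 server.allow-insecure on
# glusterd itself also needs "option rpc-auth-allow-insecure on" in
# /etc/glusterfs/glusterd.vol on each brick host, followed by a glusterd restart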

Re: [Users] so, what do you want next in oVirt?

2013-08-29 Thread Steve Dainard
>
>> importing any data storage domain i assume?
>>
> Yes.
>
>
>>  Possibility of direct use of HW by VM.
>>>
>>
>> such as?
>>
> Telephone modem. PCI-express cards. Graphic cards


+ USB devices: we have hardware license keys for some software, and in
kvm/qemu I can expose the license key directly to a VM.
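As a rough illustration of how that works outside oVirt today, a hedged sketch with libvirt (the vendor/product IDs and the domain name MyVM are hypothetical; check yours with lsusb):

lsusb                                   # note the dongle's vendor:product ID
cat > usb-key.xml <<'EOF'
<hostdev mode='subsystem' type='usb' managed='yes'>
  <source>
    <vendor id='0x0529'/>
    <product id='0x0001'/>
  </source>
</hostdev>
EOF
virsh attach-device MyVM usb-key.xml --live   # hot-plug the key into the guest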


>
>
>>
>>  and the absolutely fantastic feature would be to create clusters from
>>> Intel and AMD processors together!
>>>
>>
>> well, you can do that today if you want to via a config change. the only
>> thing is live migration won't work (you should probably use -cpu host to
>> get best performance, since live migration won't be used anyway)
>> (well, in theory we could live migrate only between hosts of same cpu
>> vendor, but not sure interesting enough use case to make cluster and
>> scheduling more complex). though you can do that part on your own with the
>> new pluggable scheduler, or use -cpu host to get max performance if you
>> don't care about live migration
>>
>>  From my point of view it is better to have slower cpu performance and
> possibility to use all of our servers in cluster. I would like to have live
> migration available from intel to amd. The problem is only in cpu
> instruction sets? If so, I can use only common sets.
>
>
> Another feature which I forgot is network between VMs and mirroring
> traffic. Both configurable from WUI.
>
>
>>> Thank you ;-)
>>>
>>>
>>> ___
>>> Users mailing list
>>> Users@ovirt.org
>>> http://lists.ovirt.org/mailman/listinfo/users
>>>
>>>
>>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Short delay in 3.3 release-- was [Re: oVirt 3.3 Release Go/No-Go Meeting Minutes]

2013-09-04 Thread Steve Dainard
Will the EL6 build include native (non-FUSE) gluster storage support? I
believe the qemu version in EL6 is too old for that at this point.

Someone suggested building a newer qemu and its dependencies and including
them in the ovirt repo for 3.3. Any plans/interest in this?
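For anyone who wants to check their own hosts, a hedged way to see whether the installed qemu was built with gluster support (package names assume EL6/Fedora builds):

rpm -q qemu-kvm qemu-img
qemu-img --version
qemu-img --help | grep -i gluster   # gluster should appear in the supported
                                    # formats/protocols list if it was built in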

*Steve*

On Wed, Sep 4, 2013 at 11:14 AM, Mike Burns  wrote:

> On 09/04/2013 07:46 AM, Mike Burns wrote:
>
>> On 09/04/2013 04:22 AM, Dave Neary wrote:
>>
>>> Hi Mike, Ofer,
>>>
>>> On 09/03/2013 05:00 PM, Mike Burns wrote:
>>>
 Just to summarize the meeting:

 There is an issue with the EL6 repo missing spice-html5 (fixed)
 There is an issue with current sdk (build coming today)
 There is a bug with gluster-only installs (build coming today)

 Due to these issues, we're delaying until Monday 09-September for the
 release.

>>>
>>> Once we have an EL6 repo from which people can install oVirt 3.3 and an
>>> all in one build people can test, can you post links here, please? I'd
>>> like to try it out this week.
>>>
>>
>> Yes, I will reply.  The new SDK is posted.  The engine build failed, so
>> waiting on Ofer to get me a new build.
>>
>
> The EL6 and F19 builds are uploaded along with the new sdk.
> spice-html package has been uploaded to el6 repos
> new tools (image-uploader,log-collector, etc) uploaded.
>
> Mike
>
>
>
>>
>>> Also, do we have an All in One Live image for the release? Either F19 or
>>> EL6 based.
>>>
>>
>> I haven't had one to post yet, but I'll follow up and see if there is
>> one available (though I expect this will come after we get the new
>> engine rpms posted)
>>
>> Mike
>>
>>
>>> Thanks,
>>> Dave.
>>>
>>>
>>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Announcement: oVirt 3.3.0 release!

2013-09-18 Thread Steve Dainard
Awesome.

Dave, can you have someone edit the installation page and specifically
mention that EL hypervisors don't have native Gluster support (FUSE only)?
It would be very frustrating to think you can have all the features, plus OS
stability, only to find out EL doesn't support a key feature.

I've seen lots of requests on the board for a back-port, so this might
preempt more questions/problems.

*Steve*
On Tue, Sep 17, 2013 at 8:31 AM, Dave Neary  wrote:

>
> The oVirt development team is very happy to announce the general
> availability of oVirt 3.3.0 as of September 16th 2013. This release
> solidifies oVirt as a leading KVM management application, and open
> source alternative to VMware vSphere.
>
> oVirt is available now for Fedora 19 and Red Hat Enterprise Linux 6.4
> (or similar).
>
> Get started with oVirt now! http://www.ovirt.org/Download
>
> Chief among the many new features in the release are:
>
> * Tight integration with Gluster - take advantage of native GlusterFS
> support, or use oVirt to manage your Gluster bricks
> * Integration with OpenStack - share images stored in Glance, and take
> advantage of Neutron integration for network topology definition
> * Extensibility and control - with improvements in VM scheduling, an
> increased array of hooks and APIs, oVirt gives you absolute control over
> your virtual datacenter
>
> Read more about the oVirt 3.3 release on the Red Hat community blog:
>
> http://community.redhat.com/ovirt-3-3-spices-up-the-software-defined-datacenter-with-openstack-and-gluster-integration/
>
> The full release announcement:
> http://www.ovirt.org/OVirt_3.3_release_announcement
>
> oVirt 3.3 release notes: http://www.ovirt.org/OVirt_3.3_release_notes
>
>
> Regards,
> Dave Neary.
>
> --
> Dave Neary - Community Action and Impact
> Open Source and Standards, Red Hat - http://community.redhat.com
> Ph: +33 9 50 71 55 62 / Cell: +33 6 77 01 92 13
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[Users] Ovirt 3.3 Fedora 19 add gluster storage permissions error

2013-09-19 Thread Steve Dainard
Hello,

New Ovirt 3.3 install on Fedora 19.

When I try to add a gluster storage domain I get the following:

*UI error:*
*Error while executing action Add Storage Connection: Permission settings
on the specified path do not allow access to the storage.*
*Verify permission settings on the specified storage path.*

*VDSM logs contain:*
Thread-393::DEBUG::2013-09-19
11:59:42,399::BindingXMLRPC::177::vds::(wrapper) client [10.0.0.34]
Thread-393::DEBUG::2013-09-19
11:59:42,399::task::579::TaskManager.Task::(_updateState)
Task=`12c38fec-0072-4974-a8e3-9125b3908246`::moving from state init ->
state preparing
Thread-393::INFO::2013-09-19
11:59:42,400::logUtils::44::dispatcher::(wrapper) Run and protect:
connectStorageServer(domType=7,
spUUID='----', conList=[{'port': '',
'connection': '192.168.1.1:/rep2-virt', 'iqn': '', 'portal': '', 'user':
'', 'vfs_type': 'glusterfs', 'password': '**', 'id':
'----'}], options=None)
Thread-393::DEBUG::2013-09-19
11:59:42,405::mount::226::Storage.Misc.excCmd::(_runcmd) '/usr/bin/sudo -n
/usr/bin/mount -t glusterfs 192.168.1.1:/rep2-virt
/rhev/data-center/mnt/glusterSD/192.168.1.1:_rep2-virt' (cwd None)
Thread-393::DEBUG::2013-09-19
11:59:42,490::mount::226::Storage.Misc.excCmd::(_runcmd) '/usr/bin/sudo -n
/usr/bin/umount -f -l /rhev/data-center/mnt/glusterSD/192.168.1.1:_rep2-virt'
(cwd None)
Thread-393::ERROR::2013-09-19
11:59:42,505::hsm::2382::Storage.HSM::(connectStorageServer) Could not
connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2379, in connectStorageServer
conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 227, in connect
raise e
StorageServerAccessPermissionError: Permission settings on the specified
path do not allow access to the storage. Verify permission settings on the
specified storage path.: 'path =
/rhev/data-center/mnt/glusterSD/192.168.1.1:_rep2-virt'
Thread-393::DEBUG::2013-09-19
11:59:42,506::hsm::2396::Storage.HSM::(connectStorageServer) knownSDs: {}
Thread-393::INFO::2013-09-19
11:59:42,506::logUtils::47::dispatcher::(wrapper) Run and protect:
connectStorageServer, Return response: {'statuslist': [{'status': 469,
'id': '----'}]}
Thread-393::DEBUG::2013-09-19
11:59:42,506::task::1168::TaskManager.Task::(prepare)
Task=`12c38fec-0072-4974-a8e3-9125b3908246`::finished: {'statuslist':
[{'status': 469, 'id': '----'}]}
Thread-393::DEBUG::2013-09-19
11:59:42,506::task::579::TaskManager.Task::(_updateState)
Task=`12c38fec-0072-4974-a8e3-9125b3908246`::moving from state preparing ->
state finished
Thread-393::DEBUG::2013-09-19
11:59:42,506::resourceManager::939::ResourceManager.Owner::(releaseAll)
Owner.releaseAll requests {} resources {}
Thread-393::DEBUG::2013-09-19
11:59:42,507::resourceManager::976::ResourceManager.Owner::(cancelAll)
Owner.cancelAll requests {}
Thread-393::DEBUG::2013-09-19
11:59:42,507::task::974::TaskManager.Task::(_decref)
Task=`12c38fec-0072-4974-a8e3-9125b3908246`::ref 0 aborting False

*Other info:*
- I have two nodes, ovirt001 and ovirt002; both are Fedora 19.
- The gluster bricks are replicated and located on the nodes
(ovirt001:rep2-virt, ovirt002:rep2-virt).
- Local directory for the mount: I changed permissions on glusterSD to 777
(it was 755), and there is nothing in that directory:
[root@ovirt001 mnt]# pwd
/rhev/data-center/mnt
[root@ovirt001 mnt]# ll
total 4
drwxrwxrwx. 2 vdsm kvm 4096 Sep 19 12:18 glusterSD
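For what it's worth, the volume settings I've seen suggested elsewhere so the vdsm user (uid/gid 36, vdsm:kvm) can write to a gluster-backed domain are below; this is an assumption on my part rather than something confirmed here:

gluster volume set rep2-virt storage.owner-uid 36
gluster volume set rep2-virt storage.owner-gid 36
gluster volume set rep2-virt server.allow-insecure on
gluster volume info rep2-virt        # confirm the options are applied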

I find it odd that the UUIDs listed in the vdsm logs are all zeros.

Appreciate any help,


*Steve*
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [Users] Bottleneck writing to a VM w/ mounted GlusterFS

2013-09-28 Thread Steve Dainard
Are you duplicating the traffic over the same physical network by relaying
through the VM, rather than writing directly to network storage, thereby
halving the write performance?

Assuming you're on a GigE network, are all the network devices running in
full duplex?

Just some guesses based on the fact that the throughput is almost exactly
half.
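A couple of quick checks along those lines (a hedged sketch; the interface name eth0 and the host name storage-server are placeholders):

ethtool eth0 | grep -E 'Speed|Duplex'   # expect 1000Mb/s and Full on every hop
iperf -s                                # run on the storage server
iperf -c storage-server -t 30           # run from the client / VM host;
                                        # GigE tops out around 940 Mbit/s,
                                        # i.e. roughly 115 MB/s of payload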

*Steve Dainard *
IT Infrastructure Manager
Miovision <http://miovision.com/> | *Rethink Traffic*
519-513-2407 ex.250
877-646-8476 (toll-free)



On Thu, Sep 26, 2013 at 5:23 AM, Stefano Stagnaro
wrote:

> Hello,
>
> I'm testing oVirt 3.3 with GlusterFS libgfapi back-end. I'm using a node
> for engine and one for VDSM. From the VMs I'm mounting a second GlusterFS
> volume on a third storage server.
>
> I'm experiencing very bad transfer rates (38MB/s) writing from a client to
> a VM on the mounted GlusterFS. On the other hand, from the VM itself I can
> move a big file from the root vda (libgfapi) to the mounted GlusterFS at
> 70MB/s.
>
> I can't really figure out where the bottleneck could be. I'm using only
> the default ovirtmgmt network.
>
> Thank you for your help, any hint will be appreciated.
>
> Regards,
> --
> Stefano Stagnaro
> IT Manager
>
> Prisma Engineering S.r.l.
> Via Petrocchi, 4
> 20127 Milano – Italy
>
> Tel. 02 26113507 int 339
> e-mail: stefa...@prisma-eng.com
> skype: stefano.stagnaro
>
>
>
>
> ___
> Users mailing list
> Users@ovirt.org
> http://lists.ovirt.org/mailman/listinfo/users
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Ovirt Hypervisor vdsm.Scheduler logs fill partition

2016-10-23 Thread Steve Dainard
Do you know when .34 will be released?

The latest version at http://mirror.centos.org/centos/7/virt/x86_64/ovirt-3.6/ is:
vdsm-cli-4.17.32-1.el7.noarch.rpm 08-Aug-2016 17:36
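In the meantime, a hedged way to check what the configured repos offer versus what is installed (assumes the CentOS oVirt 3.6 repo above is enabled on the host):

yum --showduplicates list available vdsm   # every vdsm build the repos provide
rpm -q vdsm                                # what the host is running now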

On Fri, Oct 14, 2016 at 1:11 AM, Francesco Romani 
wrote:

>
> - Original Message -
> > From: "Simone Tiraboschi" 
> > To: "Steve Dainard" , "Francesco Romani" <
> from...@redhat.com>
> > Cc: "users" 
> > Sent: Friday, October 14, 2016 9:59:49 AM
> > Subject: Re: [ovirt-users] Ovirt Hypervisor vdsm.Scheduler logs fill
> partition
> >
> > On Fri, Oct 14, 2016 at 1:12 AM, Steve Dainard 
> wrote:
> >
> > > Hello,
> > >
> > > I had a hypervisor semi-crash this week, 4 of ~10 VM's continued to
> run,
> > > but the others were killed off somehow and all VM's running on this
> host
> > > had '?' status in the ovirt UI.
> > >
> > > This appears to have been caused by vdsm logs filling up disk space on
> the
> > > logging partition.
> > >
> > > I've attached the log file vdsm.log.27.xz which shows this error:
> > >
> > > vdsm.Scheduler::DEBUG::2016-10-11
> > > 16:42:09,318::executor::216::Executor::(_discard)
> > > Worker discarded:  > > action= > > 'virt.periodic.DriveWatermarkMonitor'>
> > > at 0x7f8e90021210> at 0x7f8e90021250> discarded at 0x7f8dd123e850>
> > >
> > > which happens more and more frequently throughout the log.
> > >
> > > It was a bit difficult to understand what caused the failure, but the
> logs
> > > were getting really large, then being xz'd which compressed 11G+ into
> a few
> > > MB. Once this happened the disk space would be freed, and nagios
> wouldn't
> > > hit the 3rd check to throw a warning, until pretty much right at the
> crash.
> > >
> > > I was able to restart vdsmd to resolve the issue, but I still need to
> know
> > > why these logs started to stack up so I can avoid this issue in the
> future.
> > >
> >
> > We had this one: https://bugzilla.redhat.com/show_bug.cgi?id=1383259
> > but in your case the logs are rotating.
> > Francesco?
>
> Hi,
>
> yes, it is a different issue. Here the log messages are caused by the
> Worker threads
> of the periodic subsystem, which are leaking[1].
> This was a bug in Vdsm (insufficient protection against rogue domains),
> but the
> real problem is that some of your domain are being unresponsive at
> hypervisor level.
> The most likely cause is in turn unresponsive storages.
>
> Fixes are been committed and shipped with Vdsm 4.17.34.
>
> See: https://bugzilla.redhat.com/1364925
>
> HTH,
>
> +++
>
> [1] actually, they are replaced too quickly, leading to unbound growth.
> So those aren't actually "leaking", Vdsm is just overzealous handling one
> error condition,
> making things worse than before.
> Still serious issue, no doubt, but quite different cause.
>
> --
> Francesco Romani
> Red Hat Engineering Virtualization R & D
> Phone: 8261328
> IRC: fromani
>
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] Feature request: add stateless option in VM pool configuration

2015-08-21 Thread Steve Dainard
I'd like to request a setting in VM pools which forces the VMs to be stateless.

I see from the docs that this happens if it is selected under Run Once, or
when a VM is started from the user portal, but it would be nice to have it
apply to a normal admin start of the VM(s) as well.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


Re: [ovirt-users] Feature request: add stateless option in VM pool configuration

2015-08-21 Thread Steve Dainard
I should also mention that I can't start multiple pool VMs at the same
time with Run Once from the admin portal, so I'd have to Run Once each
VM in the pool individually.
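In case it helps others, the REST API does accept run-once style overrides on a per-start request, so a loop over the pool members can at least be scripted. A hedged sketch only; the engine URL, credentials and VM id are placeholders, and I haven't confirmed this behaviour against a pool-owned VM:

curl -k -u admin@internal:password \
  -H 'Content-Type: application/xml' \
  -d '<action><vm><stateless>true</stateless></vm></action>' \
  https://engine.example.com/ovirt-engine/api/vms/VM-ID/start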

On Fri, Aug 21, 2015 at 3:57 PM, Steve Dainard  wrote:
> I'd like to request a setting in VM pools which forces the VM to be stateless.
>
> I see from the docs this will occur if selected under run once, or
> started from the user portal, but it would be nice to set this for a
> normal admin start of the VM(s) as well.
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users


[ovirt-users] LDAP authentication with TLS

2015-10-06 Thread Steve Dainard
Hello,

I'm trying to configure oVirt 3.5.3.1-1.el7.centos for LDAP authentication.

I've configured the appropriate aaa profile, but I'm getting TLS errors
when I search for users to add via oVirt:

The connection reader was unable to successfully complete TLS
negotiation: javax.net.ssl.SSLHandshakeException:
sun.security.validator.ValidatorException: No trusted certificate
found caused by sun.security.validator.ValidatorException: No trusted
certificate found

I added the external CA certificate using keytool as per
https://github.com/oVirt/ovirt-engine-extension-aaa-ldap with
appropriate adjustments of course:

keytool -importcert -noprompt -trustcacerts -alias myrootca \
   -file myrootca.pem -keystore myrootca.jks -storepass changeit

I know this certificate works and that I can connect to LDAP with TLS,
as I'm using the same LDAP configuration/certificate with SSSD.

Can anyone clarify whether I should be adding the external CA
certificate or the LDAP host certificate with keytool, or offer any
other suggestions?
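Two checks I'd try (a hedged sketch; host names, paths and the profile file name are placeholders, and the property keys are my assumption based on the aaa-ldap examples):

# 1) confirm the CA in the pem validates what the server presents on StartTLS
LDAPTLS_CACERT=/path/to/myrootca.pem ldapsearch -ZZ \
    -H ldap://ldap.example.com -x -b '' -s base '(objectclass=*)'

# 2) confirm the jks really contains the CA, and that the profile points at it,
#    e.g. in /etc/ovirt-engine/aaa/profile1.properties:
keytool -list -keystore myrootca.jks -storepass changeit
#    pool.default.ssl.startTLS = true
#    pool.default.ssl.truststore.file = /etc/ovirt-engine/aaa/myrootca.jks
#    pool.default.ssl.truststore.password = changeit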

Thanks,
Steve
___
Users mailing list
Users@ovirt.org
http://lists.ovirt.org/mailman/listinfo/users