Re: [ovirt-users] Move VM disks from gluster storage to local storage

2016-08-14 Thread Siavash Safi
On Sun, Aug 14, 2016 at 10:04 PM Nir Soffer <nsof...@redhat.com> wrote:

> On Sun, Aug 14, 2016 at 8:10 PM, Siavash Safi <siavash.s...@gmail.com>
> wrote:
> >
> >
> > On Sun, Aug 14, 2016 at 8:07 PM Nir Soffer <nsof...@redhat.com> wrote:
> >>
> >> On Sun, Aug 14, 2016 at 5:55 PM, Siavash Safi <siavash.s...@gmail.com>
> >> wrote:
> >> > Hi,
> >> >
> >> > An unknown bug broke our gluster storage (dom_md/ids is corrupted) and
> >> > oVirt no longer activates the storage (I tried to recover it based on
> >> > similar issues reported on the mailing list, but it didn't work).
> >>
> >> Can you explain what you did?
> >
> > cd /mnt/4697fbde-45fb-4f91-ac4c-5516bc59f683/dom_md/
> > rm ids
> > touch ids
> > sanlock direct init -s 4697fbde-45fb-4f91-ac4c-5516bc59f683:0:ids:1048576
>
> The offset parameter should be 0, not 1048576:
>
> sanlock direct init -s 4697fbde-45fb-4f91-ac4c-5516bc59f683:0:ids:0
>
> See
> http://lists.ovirt.org/pipermail/users/2016-February/038046.html
>
> Please retry this.
>
> I didn't know what the number at the end of the string does ;)


> Also, are you using replica 3? These issues typically happened when people
> used replica 2 gluster volumes.
>
Actually we removed one of the broken nodes from gluster and tried to set up
local storage.
I wiped the storage and added the bricks back to gluster.

Thanks Nir, recreating the ids file with the correct offset and resizing the
gluster volume back to replica 3 fixed the issue :)
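
For the record, the working recovery sequence looks roughly like this (the domain
UUID is ours; the chown is an assumption, the ids file should end up owned by
vdsm:kvm, i.e. uid/gid 36, and the sanlock lockspace offset for a file-based
domain is 0):

cd /mnt/4697fbde-45fb-4f91-ac4c-5516bc59f683/dom_md/
rm -f ids
touch ids
chown vdsm:kvm ids   # assumption: vdsm/sanlock must be able to read and write the lease file
sanlock direct init -s 4697fbde-45fb-4f91-ac4c-5516bc59f683:0:ids:0   # offset 0, not 1048576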

>
> >> The best way to fix this is to initialize the corrupt id file and
> >> activate the domain.
> >
> > This would be great!
> >>
> >>
> >>
> >> > As I checked VM disk images are still accessible when I mount the
> >> > gluster
> >> > storage manually.
> >> > How can we manually move the VM disk images to local storage? (oVirt
> >> > complains about gluster storage being inactive when using the web
> >> > interface
> >> > for move/copy)
> >>
> >> You can easily copy the images to another file based storage (nfs,
> >> gluster) like this:
> >>
> >> 1. activate other storage domain using engine
> >> 2. mount gluster domain manually
> >> 3. copy the image from gluster domain to the other domain:
> >>
> >> cp -r gluster-domain-mountpoint/images/image-uuid
> >> /rhev/data-center/mnt/server:_path/other-domain-uuid/images/
> >>
> >> But the images will not be available since engine does not know about them.
> >> Maybe this can be fixed by modifying the engine database.
> >>
> > How complicated is it?
>
> I never tried this, let's try the simple way first.
>
> >> Another solution (if you are using oVirt 4.0) is to upload the images to a
> >> new disk, and attach the disk to the VM instead of the missing disk.
> >
> > We are running 3.6
>
> Maybe consider an upgrade?
>
> Nir
>


Re: [ovirt-users] Move VM disks from gluster storage to local storage

2016-08-14 Thread Siavash Safi
On Sun, Aug 14, 2016 at 8:07 PM Nir Soffer <nsof...@redhat.com> wrote:

> On Sun, Aug 14, 2016 at 5:55 PM, Siavash Safi <siavash.s...@gmail.com>
> wrote:
> > Hi,
> >
> > An unknown bug broke our gluster storage (dom_md/ids is corrupted) and
> > oVirt no longer activates the storage (I tried to recover it based on
> > similar issues reported on the mailing list, but it didn't work).
>
> Can you explain what you did?
>
cd /mnt/4697fbde-45fb-4f91-ac4c-5516bc59f683/dom_md/
rm ids
touch ids
sanlock direct init -s 4697fbde-45fb-4f91-ac4c-5516bc59f683:0:ids:1048576

>
> The best way to fix this is to initialize the corrupt id file and
> activate the domain.

This would be great!

>
>
> > As I checked, VM disk images are still accessible when I mount the gluster
> > storage manually.
> > How can we manually move the VM disk images to local storage? (oVirt
> > complains about gluster storage being inactive when using the web interface
> > for move/copy)
>
> You can easily copy the images to another file based storage (nfs,
> gluster) like this:
>
> 1. activate other storage domain using engine
> 2. mount gluster domain manually
> 3. copy the image from gluster domain to the other domain:
>
> cp -r gluster-domain-mountpoint/images/image-uuid
> /rhev/data-center/mnt/server:_path/other-domain-uuid/images/
>
> But the images will not be available since engine does not know about them.
> Maybe this can be fixed by modifying the engine database.
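
(For reference, a fuller version of that copy, with placeholder names in angle
brackets, might look roughly like this; run as root, and note that ownership
has to stay vdsm:kvm, i.e. uid/gid 36:)

# mount the inactive gluster domain somewhere temporary
/usr/bin/mount -t glusterfs 172.16.0.11:/ovirt /mnt

# copy one image directory into an already-active file domain, preserving
# ownership, timestamps and sparseness
cp -a --sparse=always \
    /mnt/4697fbde-45fb-4f91-ac4c-5516bc59f683/images/<image-uuid> \
    /rhev/data-center/mnt/<server>:_<path>/<other-domain-uuid>/images/

# double-check the copy is still owned by vdsm:kvm
chown -R 36:36 /rhev/data-center/mnt/<server>:_<path>/<other-domain-uuid>/images/<image-uuid>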
>
> How complicated is it?


> Another solution (if you are using oVirt 4.0) is to upload the images to a
> new disk, and attach the disk to the VM instead of the missing disk.

We are running 3.6


>
>
Nir
>


[ovirt-users] Move VM disks from gluster storage to local storage

2016-08-14 Thread Siavash Safi
Hi,

An unknown bug broke our gluster storage (dom_md/ids is corrupted) and
oVirt no longer activates the storage (I tried to recover it based on
similar issues reported on the mailing list, but it didn't work).
As I checked, VM disk images are still accessible when I mount the gluster
storage manually.
How can we manually move the VM disk images to local storage? (oVirt
complains about gluster storage being inactive when using the web interface
for move/copy)

Thanks,
Siavash


Re: [ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
It seems that dir modes are wrong!?
[root@node1 ~]# ls -ld /data/brick*/brick*
drw---. 5 vdsm kvm 107 Jul 28 20:13 /data/brick1/brick1
drw---. 5 vdsm kvm  82 Jul 27 23:08 /data/brick2/brick2
[root@node2 ~]# ls -ld /data/brick*/brick*
drwxr-xr-x. 5 vdsm kvm 107 Apr 26 19:33 /data/brick1/brick1
drw---. 5 vdsm kvm  82 Jul 27 23:08 /data/brick2/brick2
drw---. 5 vdsm kvm 107 Jul 28 20:13 /data/brick3/brick3
[root@node3 ~]# ls -ld /data/brick*/brick*
drw---. 5 vdsm kvm 107 Jul 28 20:10 /data/brick1/brick1
drw---. 5 vdsm kvm  82 Jul 27 23:08 /data/brick2/brick2
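
If those modes are the problem, something like this on each affected node should
put them back in line with the healthy brick (assuming vdsm:kvm map to uid/gid
36, which is what the volume's storage.owner-uid/storage.owner-gid options
already enforce for new files):

# run on each node, adjusting the brick paths to what that node actually has
chmod 0755 /data/brick*/brick*
chown 36:36 /data/brick*/brick*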

On Thu, Jul 28, 2016 at 9:06 PM Sahina Bose <sab...@redhat.com> wrote:

>
>
> ----- Original Message -----
> > From: "Siavash Safi" <siavash.s...@gmail.com>
> > To: "Sahina Bose" <sab...@redhat.com>
> > Cc: "David Gossage" <dgoss...@carouselchecks.com>, "users" <
> users@ovirt.org>, "Nir Soffer" <nsof...@redhat.com>,
> > "Allon Mureinik" <amure...@redhat.com>
> > Sent: Thursday, July 28, 2016 9:04:32 PM
> > Subject: Re: [ovirt-users] Cannot find master domain
> >
> > Please check the attachment.
>
> Nothing out of place in the mount logs.
>
> Can you ensure the brick dir permissions are vdsm:kvm - even for the brick
> that was replaced?
>
> >
> > On Thu, Jul 28, 2016 at 7:46 PM Sahina Bose <sab...@redhat.com> wrote:
> >
> > >
> > >
> > > ----- Original Message -----
> > > > From: "Siavash Safi" <siavash.s...@gmail.com>
> > > > To: "Sahina Bose" <sab...@redhat.com>
> > > > Cc: "David Gossage" <dgoss...@carouselchecks.com>, "users" <
> > > users@ovirt.org>
> > > > Sent: Thursday, July 28, 2016 8:35:18 PM
> > > > Subject: Re: [ovirt-users] Cannot find master domain
> > > >
> > > > [root@node1 ~]# ls -ld /rhev/data-center/mnt/glusterSD/
> > > > drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:28
> /rhev/data-center/mnt/glusterSD/
> > > > [root@node1 ~]# getfacl /rhev/data-center/mnt/glusterSD/
> > > > getfacl: Removing leading '/' from absolute path names
> > > > # file: rhev/data-center/mnt/glusterSD/
> > > > # owner: vdsm
> > > > # group: kvm
> > > > user::rwx
> > > > group::r-x
> > > > other::r-x
> > > >
> > >
> > >
> > > The ACLs look correct to me. Adding Nir/Allon for insights.
> > >
> > > Can you attach the gluster mount logs from this host?
> > >
> > >
> > > > And as I mentioned in another message, the directory is empty.
> > > >
> > > > On Thu, Jul 28, 2016 at 7:24 PM Sahina Bose <sab...@redhat.com>
> wrote:
> > > >
> > > > > Error from vdsm log: Permission settings on the specified path do
> not
> > > > > allow access to the storage. Verify permission settings on the
> > > specified
> > > > > storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11
> :
> > > _ovirt'
> > > > >
> > > > > I remember another thread about a similar issue - can you check
> the ACL
> > > > > settings on the storage path?
> > > > >
> > > > > ----- Original Message -----
> > > > > > From: "Siavash Safi" <siavash.s...@gmail.com>
> > > > > > To: "David Gossage" <dgoss...@carouselchecks.com>
> > > > > > Cc: "users" <users@ovirt.org>
> > > > > > Sent: Thursday, July 28, 2016 7:58:29 PM
> > > > > > Subject: Re: [ovirt-users] Cannot find master domain
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jul 28, 2016 at 6:29 PM David Gossage <
> > > > > dgoss...@carouselchecks.com >
> > > > > > wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <
> > > siavash.s...@gmail.com >
> > > > > > wrote:
> > > > > >
> > > > > >
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Issue: Cannot find master domain
> > > > > > Changes applied before issue started to happen: replaced
> > > > > > 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:
> > > /data/bri

Re: [ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
Yes, the dir is missing on all nodes. I only created it on node1 (node2 &
node3 were put in maintenance mode manually)

Yes, manual mount works fine:

[root@node1 ~]# /usr/bin/mount -t glusterfs -o
backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm 4096 Apr 26 19:34
4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm0 Jul 27 23:05 __DIRECT_IO_TEST__
[root@node1 ~]# touch /mnt/test
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm  4096 Apr 26 19:34
4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm 0 Jul 27 23:05 __DIRECT_IO_TEST__
-rw-r--r--. 1 root root0 Jul 28 20:10 test
[root@node1 ~]# chown vdsm:kvm /mnt/test
[root@node1 ~]# ls -l /mnt/
total 4
drwxr-xr-x. 5 vdsm kvm 4096 Apr 26 19:34
4697fbde-45fb-4f91-ac4c-5516bc59f683
-rwxr-xr-x. 1 vdsm kvm0 Jul 27 23:05 __DIRECT_IO_TEST__
-rw-r--r--. 1 vdsm kvm0 Jul 28 20:10 test
[root@node1 ~]# echo foo > /mnt/test
[root@node1 ~]# cat /mnt/test
foo
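
Since vdsm accesses the storage as vdsm:kvm rather than root, a rough
approximation of its checks is to repeat the test as that user (the test file
name below is just a scratch name):

sudo -u vdsm touch /mnt/vdsm-write-test
sudo -u vdsm dd if=/mnt/__DIRECT_IO_TEST__ of=/dev/null bs=4096 count=1 iflag=direct
rm -f /mnt/vdsm-write-test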


On Thu, Jul 28, 2016 at 8:06 PM David Gossage <dgoss...@carouselchecks.com>
wrote:

> On Thu, Jul 28, 2016 at 10:28 AM, Siavash Safi <siavash.s...@gmail.com>
> wrote:
>
>> I created the directory with correct permissions:
>> drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:51
>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/
>>
>> It was removed after I tried to activate the storage from web.
>>
>> Is dir missing on all 3 oVirt nodes? Did you create it on all 3?
>
> When you did the test mount with oVirt's mount options, did permissions on
> files after the mount look proper? Can you read/write to the mount?
>
>
>> Engine displays the master storage as inactive:
>> [image: oVirt_Engine_Web_Administration.png]
>>
>>
>> On Thu, Jul 28, 2016 at 7:40 PM David Gossage <
>> dgoss...@carouselchecks.com> wrote:
>>
>>> On Thu, Jul 28, 2016 at 10:00 AM, Siavash Safi <siavash.s...@gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Jul 28, 2016 at 7:19 PM David Gossage <
>>>> dgoss...@carouselchecks.com> wrote:
>>>>
>>>>> On Thu, Jul 28, 2016 at 9:38 AM, Siavash Safi <siavash.s...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> file system: xfs
>>>>>> features.shard: off
>>>>>>
>>>>>
>>>>> OK, I was just seeing if it matched up to the issues the latest 3.7.x
>>>>> releases have with ZFS and sharding, but that doesn't look like your issue.
>>>>>
>>>>> In your logs I see it mounts with these commands. What happens if you
>>>>> use the same to mount a test dir?
>>>>>
>>>>>  /usr/bin/mount -t glusterfs -o 
>>>>> backup-volfile-servers=172.16.0.12:172.16.0.13
>>>>> 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
>>>>>
>>>>
>>>> It mounts successfully:
>>>> [root@node1 ~]# /usr/bin/mount -t glusterfs -o
>>>> backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
>>>> [root@node1 ~]# ls /mnt/
>>>> 4697fbde-45fb-4f91-ac4c-5516bc59f683  __DIRECT_IO_TEST__
>>>>
>>>>
>>>>> It then unmounts it and complains a short while later about permissions.
>>>>>
>>>>> StorageServerAccessPermissionError: Permission settings on the
>>>>> specified path do not allow access to the storage. Verify permission
>>>>> settings on the specified storage path.: 'path =
>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>>>>>
>>>>> Are the permissions of dirs to
>>>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt as expected?
>>>>>
>>>>
>>>> /rhev/data-center/mnt/glusterSD/ is empty. Maybe it removed the
>>>> directory after the failure, to clean up?
>>>>
>>>
>>> Maybe, though I don't recall it ever being deleted unless you destroy or
>>> detach the storage. What if you create that directory with the appropriate
>>> permissions on any node where it's missing, then try to activate the storage?
>>>
>>> In engine is it still displaying the master storage domain?
>>>
>>>
>>>> How about on the bricks anything out of place?
>>>>>
>>>>
>>>> I didn't notice anything.
>>>>
>>>>
>>>>> Is gluster still using the same options as before? Could it have reset
>>>>> the user and group to something other than 36?
>>>>>
>>>>
>>>&

Re: [ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
Please check the attachment.

On Thu, Jul 28, 2016 at 7:46 PM Sahina Bose <sab...@redhat.com> wrote:

>
>
> ----- Original Message -----
> > From: "Siavash Safi" <siavash.s...@gmail.com>
> > To: "Sahina Bose" <sab...@redhat.com>
> > Cc: "David Gossage" <dgoss...@carouselchecks.com>, "users" <
> users@ovirt.org>
> > Sent: Thursday, July 28, 2016 8:35:18 PM
> > Subject: Re: [ovirt-users] Cannot find master domain
> >
> > [root@node1 ~]# ls -ld /rhev/data-center/mnt/glusterSD/
> > drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:28 /rhev/data-center/mnt/glusterSD/
> > [root@node1 ~]# getfacl /rhev/data-center/mnt/glusterSD/
> > getfacl: Removing leading '/' from absolute path names
> > # file: rhev/data-center/mnt/glusterSD/
> > # owner: vdsm
> > # group: kvm
> > user::rwx
> > group::r-x
> > other::r-x
> >
>
>
> The ACLs look correct to me. Adding Nir/Allon for insights.
>
> Can you attach the gluster mount logs from this host?
>
>
> > And as I mentioned in another message, the directory is empty.
> >
> > On Thu, Jul 28, 2016 at 7:24 PM Sahina Bose <sab...@redhat.com> wrote:
> >
> > > Error from vdsm log: Permission settings on the specified path do not
> > > allow access to the storage. Verify permission settings on the
> specified
> > > storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:
> _ovirt'
> > >
> > > I remember another thread about a similar issue - can you check the ACL
> > > settings on the storage path?
> > >
> > > ----- Original Message -----
> > > > From: "Siavash Safi" <siavash.s...@gmail.com>
> > > > To: "David Gossage" <dgoss...@carouselchecks.com>
> > > > Cc: "users" <users@ovirt.org>
> > > > Sent: Thursday, July 28, 2016 7:58:29 PM
> > > > Subject: Re: [ovirt-users] Cannot find master domain
> > > >
> > > >
> > > >
> > > > On Thu, Jul 28, 2016 at 6:29 PM David Gossage <
> > > dgoss...@carouselchecks.com >
> > > > wrote:
> > > >
> > > >
> > > >
> > > > On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <
> siavash.s...@gmail.com >
> > > > wrote:
> > > >
> > > >
> > > >
> > > > Hi,
> > > >
> > > > Issue: Cannot find master domain
> > > > Changes applied before issue started to happen: replaced
> > > > 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:
> /data/brick3/brick3,
> > > did
> > > > minor package upgrades for vdsm and glusterfs
> > > >
> > > > vdsm log: https://paste.fedoraproject.org/396842/
> > > >
> > > >
> > > > Any errors in gluster's brick or server logs? The client gluster logs
> > > > from oVirt?
> > > > Brick errors:
> > > > [2016-07-28 14:03:25.002396] E [MSGID: 113091]
> [posix.c:178:posix_lookup]
> > > > 0-ovirt-posix: null gfid for path (null)
> > > > [2016-07-28 14:03:25.002430] E [MSGID: 113018]
> [posix.c:196:posix_lookup]
> > > > 0-ovirt-posix: lstat on null failed [Invalid argument]
> > > > (Both repeated many times)
> > > >
> > > > Server errors:
> > > > None
> > > >
> > > > Client errors:
> > > > None
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > yum log: https://paste.fedoraproject.org/396854/
> > > >
> > > > What version of gluster was running prior to update to 3.7.13?
> > > > 3.7.11-1 from the gluster.org repository (after the update oVirt switched
> > > > to the CentOS repository)
> > > >
> > > >
> > > >
> > > >
> > > > Did it create gluster mounts on server when attempting to start?
> > > > As I checked the master domain is not mounted on any nodes.
> > > > Restarting vdsmd generated following errors:
> > > >
> > > > jsonrpc.Executor/5::DEBUG::2016-07-28
> > > > 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating
> > > > directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode:
> None
> > > > jsonrpc.Executor/5::DEBUG::2016-07-28
> > > >
> > >
> 18:50:57,661::storageServer::364::Storage.S

Re: [ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
I created the directory with the correct permissions:
drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:51
/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt/

It was removed after I tried to activate the storage from the web UI.

Engine displays the master storage as inactive:
[image: oVirt_Engine_Web_Administration.png]


On Thu, Jul 28, 2016 at 7:40 PM David Gossage <dgoss...@carouselchecks.com>
wrote:

> On Thu, Jul 28, 2016 at 10:00 AM, Siavash Safi <siavash.s...@gmail.com>
> wrote:
>
>>
>>
>> On Thu, Jul 28, 2016 at 7:19 PM David Gossage <
>> dgoss...@carouselchecks.com> wrote:
>>
>>> On Thu, Jul 28, 2016 at 9:38 AM, Siavash Safi <siavash.s...@gmail.com>
>>> wrote:
>>>
>>>> file system: xfs
>>>> features.shard: off
>>>>
>>>
>>> OK, I was just seeing if it matched up to the issues the latest 3.7.x
>>> releases have with ZFS and sharding, but that doesn't look like your issue.
>>>
>>> In your logs I see it mounts with these commands. What happens if you
>>> use the same to mount a test dir?
>>>
>>>  /usr/bin/mount -t glusterfs -o 
>>> backup-volfile-servers=172.16.0.12:172.16.0.13
>>> 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
>>>
>>
>> It mounts successfully:
>> [root@node1 ~]# /usr/bin/mount -t glusterfs -o
>> backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
>> [root@node1 ~]# ls /mnt/
>> 4697fbde-45fb-4f91-ac4c-5516bc59f683  __DIRECT_IO_TEST__
>>
>>
>>> It then unmounts it and complains a short while later about permissions.
>>>
>>> StorageServerAccessPermissionError: Permission settings on the specified
>>> path do not allow access to the storage. Verify permission settings on the
>>> specified storage path.: 'path =
>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>>>
>>> Are the permissions of dirs to
>>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt as expected?
>>>
>>
>> /rhev/data-center/mnt/glusterSD/ is empty. Maybe it removed the directory
>> after the failure, to clean up?
>>
>
> Maybe, though I don't recall it ever being deleted unless you destroy or
> detach the storage. What if you create that directory with the appropriate
> permissions on any node where it's missing, then try to activate the storage?
>
> In engine is it still displaying the master storage domain?
>
>
>> How about on the bricks anything out of place?
>>>
>>
>> I didn't notice anything.
>>
>>
>>> Is gluster still using the same options as before? Could it have reset the
>>> user and group to something other than 36?
>>>
>>
>> All options seem to be correct; to make sure, I ran "Optimize for Virt
>> Store" from the web UI.
>>
>> Volume Name: ovirt
>> Type: Distributed-Replicate
>> Volume ID: b224d9bc-d120-4fe1-b233-09089e5ca0b2
>> Status: Started
>> Number of Bricks: 2 x 3 = 6
>> Transport-type: tcp
>> Bricks:
>> Brick1: 172.16.0.11:/data/brick1/brick1
>> Brick2: 172.16.0.12:/data/brick3/brick3
>> Brick3: 172.16.0.13:/data/brick1/brick1
>> Brick4: 172.16.0.11:/data/brick2/brick2
>> Brick5: 172.16.0.12:/data/brick2/brick2
>> Brick6: 172.16.0.13:/data/brick2/brick2
>> Options Reconfigured:
>> performance.readdir-ahead: on
>> nfs.disable: off
>> user.cifs: enable
>> auth.allow: *
>> performance.quick-read: off
>> performance.read-ahead: off
>> performance.io-cache: off
>> performance.stat-prefetch: off
>> cluster.eager-lock: enable
>> network.remote-dio: enable
>> cluster.quorum-type: auto
>> cluster.server-quorum-type: server
>> storage.owner-uid: 36
>> storage.owner-gid: 36
>> server.allow-insecure: on
>> network.ping-timeout: 10
>>
>>
>>>> On Thu, Jul 28, 2016 at 7:03 PM David Gossage <
>>>> dgoss...@carouselchecks.com> wrote:
>>>>
>>>>> On Thu, Jul 28, 2016 at 9:28 AM, Siavash Safi <siavash.s...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Jul 28, 2016 at 6:29 PM David Gossage <
>>>>>> dgoss...@carouselchecks.com> wrote:
>>>>>>
>>>>>>> On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <
>>>>>>> siavash.s...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Issue: Cannot find master domain
>>>>>>>> C

Re: [ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
[root@node1 ~]# ls -ld /rhev/data-center/mnt/glusterSD/
drwxr-xr-x. 2 vdsm kvm 6 Jul 28 19:28 /rhev/data-center/mnt/glusterSD/
[root@node1 ~]# getfacl /rhev/data-center/mnt/glusterSD/
getfacl: Removing leading '/' from absolute path names
# file: rhev/data-center/mnt/glusterSD/
# owner: vdsm
# group: kvm
user::rwx
group::r-x
other::r-x

And as I mentioned in another message, the directory is empty.

On Thu, Jul 28, 2016 at 7:24 PM Sahina Bose <sab...@redhat.com> wrote:

> Error from vdsm log: Permission settings on the specified path do not
> allow access to the storage. Verify permission settings on the specified
> storage path.: 'path = /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>
> I remember another thread about a similar issue - can you check the ACL
> settings on the storage path?
>
> ----- Original Message -----
> > From: "Siavash Safi" <siavash.s...@gmail.com>
> > To: "David Gossage" <dgoss...@carouselchecks.com>
> > Cc: "users" <users@ovirt.org>
> > Sent: Thursday, July 28, 2016 7:58:29 PM
> > Subject: Re: [ovirt-users] Cannot find master domain
> >
> >
> >
> > On Thu, Jul 28, 2016 at 6:29 PM David Gossage <
> dgoss...@carouselchecks.com >
> > wrote:
> >
> >
> >
> > On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi < siavash.s...@gmail.com >
> > wrote:
> >
> >
> >
> > Hi,
> >
> > Issue: Cannot find master domain
> > Changes applied before issue started to happen: replaced
> > 172.16.0.12:/data/brick1/brick1 with 172.16.0.12:/data/brick3/brick3,
> did
> > minor package upgrades for vdsm and glusterfs
> >
> > vdsm log: https://paste.fedoraproject.org/396842/
> >
> >
> > Any errors in gluster's brick or server logs? The client gluster logs
> > from oVirt?
> > Brick errors:
> > [2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup]
> > 0-ovirt-posix: null gfid for path (null)
> > [2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup]
> > 0-ovirt-posix: lstat on null failed [Invalid argument]
> > (Both repeated many times)
> >
> > Server errors:
> > None
> >
> > Client errors:
> > None
> >
> >
> >
> >
> >
> >
> >
> > yum log: https://paste.fedoraproject.org/396854/
> >
> > What version of gluster was running prior to update to 3.7.13?
> > 3.7.11-1 from the gluster.org repository (after the update oVirt switched to
> > the CentOS repository)
> >
> >
> >
> >
> > Did it create gluster mounts on server when attempting to start?
> > As I checked the master domain is not mounted on any nodes.
> > Restarting vdsmd generated following errors:
> >
> > jsonrpc.Executor/5::DEBUG::2016-07-28
> > 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating
> > directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode: None
> > jsonrpc.Executor/5::DEBUG::2016-07-28
> >
> 18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option)
> > Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13']
> > jsonrpc.Executor/5::DEBUG::2016-07-28
> > 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
> > --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope
> > --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o
> > backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt
> > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
> > jsonrpc.Executor/5::DEBUG::2016-07-28
> > 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting
> IOProcess...
> > jsonrpc.Executor/5::DEBUG::2016-07-28
> > 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
> > --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l
> > /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
> > jsonrpc.Executor/5::ERROR::2016-07-28
> > 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not
> > connect to storageServer
> > Traceback (most recent call last):
> > File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer
> > conObj.connect()
> > File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect
> > six.reraise(t, v, tb)
> > File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect
> > self.getMountObj().getRecord().fs_file)
> > File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess
> > raise se.StorageServerAccessPermissi

Re: [ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
On Thu, Jul 28, 2016 at 7:19 PM David Gossage <dgoss...@carouselchecks.com>
wrote:

> On Thu, Jul 28, 2016 at 9:38 AM, Siavash Safi <siavash.s...@gmail.com>
> wrote:
>
>> file system: xfs
>> features.shard: off
>>
>
> OK, I was just seeing if it matched up to the issues the latest 3.7.x releases
> have with ZFS and sharding, but that doesn't look like your issue.
>
> In your logs I see it mounts with these commands. What happens if you use
> the same to mount a test dir?
>
>  /usr/bin/mount -t glusterfs -o backup-volfile-servers=172.16.0.12:172.16.0.13
> 172.16.0.11:/ovirt /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
>

It mounts successfully:
[root@node1 ~]# /usr/bin/mount -t glusterfs -o
backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt /mnt
[root@node1 ~]# ls /mnt/
4697fbde-45fb-4f91-ac4c-5516bc59f683  __DIRECT_IO_TEST__


> It then unmounts it and complains a short while later about permissions.
>
> StorageServerAccessPermissionError: Permission settings on the specified
> path do not allow access to the storage. Verify permission settings on the
> specified storage path.: 'path =
> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>
> Are the permissions of dirs to 
> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt
> as expected?
>

/rhev/data-center/mnt/glusterSD/ is empty. Maybe it removed the directory
after the failure, to clean up?

> How about on the bricks, anything out of place?
>

I didn't notice anything.


> Is gluster still using the same options as before? Could it have reset the
> user and group to something other than 36?
>

All options seem to be correct; to make sure, I ran "Optimize for Virt
Store" from the web UI.

Volume Name: ovirt
Type: Distributed-Replicate
Volume ID: b224d9bc-d120-4fe1-b233-09089e5ca0b2
Status: Started
Number of Bricks: 2 x 3 = 6
Transport-type: tcp
Bricks:
Brick1: 172.16.0.11:/data/brick1/brick1
Brick2: 172.16.0.12:/data/brick3/brick3
Brick3: 172.16.0.13:/data/brick1/brick1
Brick4: 172.16.0.11:/data/brick2/brick2
Brick5: 172.16.0.12:/data/brick2/brick2
Brick6: 172.16.0.13:/data/brick2/brick2
Options Reconfigured:
performance.readdir-ahead: on
nfs.disable: off
user.cifs: enable
auth.allow: *
performance.quick-read: off
performance.read-ahead: off
performance.io-cache: off
performance.stat-prefetch: off
cluster.eager-lock: enable
network.remote-dio: enable
cluster.quorum-type: auto
cluster.server-quorum-type: server
storage.owner-uid: 36
storage.owner-gid: 36
server.allow-insecure: on
network.ping-timeout: 10
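
To compare the brick directories themselves across the nodes, a quick loop like
this works (assuming root ssh access between the nodes):

for h in 172.16.0.11 172.16.0.12 172.16.0.13; do
    echo "== $h =="
    ssh root@"$h" 'ls -ld /data/brick*/brick*'
done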


>> On Thu, Jul 28, 2016 at 7:03 PM David Gossage <
>> dgoss...@carouselchecks.com> wrote:
>>
>>> On Thu, Jul 28, 2016 at 9:28 AM, Siavash Safi <siavash.s...@gmail.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Thu, Jul 28, 2016 at 6:29 PM David Gossage <
>>>> dgoss...@carouselchecks.com> wrote:
>>>>
>>>>> On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <siavash.s...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Issue: Cannot find master domain
>>>>>> Changes applied before issue started to happen: replaced 
>>>>>> 172.16.0.12:/data/brick1/brick1
>>>>>> with 172.16.0.12:/data/brick3/brick3, did minor package upgrades for
>>>>>> vdsm and glusterfs
>>>>>>
>>>>>> vdsm log: https://paste.fedoraproject.org/396842/
>>>>>>
>>>>>
>>>>>
>>>>> Any errors in gluster's brick or server logs? The client gluster logs
>>>>> from oVirt?
>>>>>
>>>> Brick errors:
>>>> [2016-07-28 14:03:25.002396] E [MSGID: 113091]
>>>> [posix.c:178:posix_lookup] 0-ovirt-posix: null gfid for path (null)
>>>> [2016-07-28 14:03:25.002430] E [MSGID: 113018]
>>>> [posix.c:196:posix_lookup] 0-ovirt-posix: lstat on null failed [Invalid
>>>> argument]
>>>> (Both repeated many times)
>>>>
>>>> Server errors:
>>>> None
>>>>
>>>> Client errors:
>>>> None
>>>>
>>>>
>>>>>
>>>>>> yum log: https://paste.fedoraproject.org/396854/
>>>>>>
>>>>>
>>>>> What version of gluster was running prior to update to 3.7.13?
>>>>>
>>>> 3.7.11-1 from the gluster.org repository (after the update oVirt switched
>>>> to the CentOS repository)
>>>>
>>>
>>> What file system do your bricks reside on and do you have sharding
>>> enabled?
>>>
>>>
>>>>> Did it create gl

Re: [ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
file system: xfs
features.shard: off

On Thu, Jul 28, 2016 at 7:03 PM David Gossage <dgoss...@carouselchecks.com>
wrote:

> On Thu, Jul 28, 2016 at 9:28 AM, Siavash Safi <siavash.s...@gmail.com>
> wrote:
>
>>
>>
>> On Thu, Jul 28, 2016 at 6:29 PM David Gossage <
>> dgoss...@carouselchecks.com> wrote:
>>
>>> On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <siavash.s...@gmail.com>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> Issue: Cannot find master domain
>>>> Changes applied before issue started to happen: replaced 
>>>> 172.16.0.12:/data/brick1/brick1
>>>> with 172.16.0.12:/data/brick3/brick3, did minor package upgrades for
>>>> vdsm and glusterfs
>>>>
>>>> vdsm log: https://paste.fedoraproject.org/396842/
>>>>
>>>
>>>
>>> Any errors in gluster's brick or server logs? The client gluster logs
>>> from oVirt?
>>>
>> Brick errors:
>> [2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup]
>> 0-ovirt-posix: null gfid for path (null)
>> [2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup]
>> 0-ovirt-posix: lstat on null failed [Invalid argument]
>> (Both repeated many times)
>>
>> Server errors:
>> None
>>
>> Client errors:
>> None
>>
>>
>>>
>>>> yum log: https://paste.fedoraproject.org/396854/
>>>>
>>>
>>> What version of gluster was running prior to update to 3.7.13?
>>>
>> 3.7.11-1 from the gluster.org repository (after the update oVirt switched
>> to the CentOS repository)
>>
>
> What file system do your bricks reside on and do you have sharding
> enabled?
>
>
>>> Did it create gluster mounts on server when attempting to start?
>>>
>> As I checked the master domain is not mounted on any nodes.
>> Restarting vdsmd generated following errors:
>>
>> jsonrpc.Executor/5::DEBUG::2016-07-28
>> 18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating
>> directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode: None
>> jsonrpc.Executor/5::DEBUG::2016-07-28
>> 18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option)
>> Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13']
>> jsonrpc.Executor/5::DEBUG::2016-07-28
>> 18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
>> --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope
>> --slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o
>> backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt
>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
>> jsonrpc.Executor/5::DEBUG::2016-07-28
>> 18:50:57,789::__init__::318::IOProcessClient::(_run) Starting IOProcess...
>> jsonrpc.Executor/5::DEBUG::2016-07-28
>> 18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
>> --cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l
>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
>> jsonrpc.Executor/5::ERROR::2016-07-28
>> 18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not
>> connect to storageServer
>> Traceback (most recent call last):
>>   File "/usr/share/vdsm/storage/hsm.py", line 2470, in
>> connectStorageServer
>> conObj.connect()
>>   File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect
>> six.reraise(t, v, tb)
>>   File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect
>> self.getMountObj().getRecord().fs_file)
>>   File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess
>> raise se.StorageServerAccessPermissionError(dirPath)
>> StorageServerAccessPermissionError: Permission settings on the specified
>> path do not allow access to the storage. Verify permission settings on the
>> specified storage path.: 'path =
>> /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
>> jsonrpc.Executor/5::DEBUG::2016-07-28
>> 18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {}
>> jsonrpc.Executor/5::INFO::2016-07-28
>> 18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and protect:
>> connectStorageServer, Return response: {'statuslist': [{'status': 469,
>> 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
>> jsonrpc.Executor/5::DEBUG::2016-07-28
>> 18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare)
>> Task=`21487eb4-de9b-47a3-aa37-7dce0653

Re: [ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
On Thu, Jul 28, 2016 at 6:29 PM David Gossage <dgoss...@carouselchecks.com>
wrote:

> On Thu, Jul 28, 2016 at 8:52 AM, Siavash Safi <siavash.s...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Issue: Cannot find master domain
>> Changes applied before issue started to happen: replaced 
>> 172.16.0.12:/data/brick1/brick1
>> with 172.16.0.12:/data/brick3/brick3, did minor package upgrades for
>> vdsm and glusterfs
>>
>> vdsm log: https://paste.fedoraproject.org/396842/
>>
>
>
> Any errors in gluster's brick or server logs? The client gluster logs
> from oVirt?
>
Brick errors:
[2016-07-28 14:03:25.002396] E [MSGID: 113091] [posix.c:178:posix_lookup]
0-ovirt-posix: null gfid for path (null)
[2016-07-28 14:03:25.002430] E [MSGID: 113018] [posix.c:196:posix_lookup]
0-ovirt-posix: lstat on null failed [Invalid argument]
(Both repeated many times)

Server errors:
None

Client errors:
None
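
Since a brick was just replaced and rebalanced, it may also be worth confirming
that self-heal has nothing pending, e.g.:

gluster volume heal ovirt info
gluster volume heal ovirt info split-brain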


>
>> yum log: https://paste.fedoraproject.org/396854/
>>
>
> What version of gluster was running prior to update to 3.7.13?
>
3.7.11-1 from the gluster.org repository (after the update oVirt switched to the
CentOS repository)

>
> Did it create gluster mounts on server when attempting to start?
>
As I checked, the master domain is not mounted on any of the nodes.
Restarting vdsmd generated the following errors:

jsonrpc.Executor/5::DEBUG::2016-07-28
18:50:57,661::fileUtils::143::Storage.fileUtils::(createdir) Creating
directory: /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt mode: None
jsonrpc.Executor/5::DEBUG::2016-07-28
18:50:57,661::storageServer::364::Storage.StorageServer.MountConnection::(_get_backup_servers_option)
Using bricks: ['172.16.0.11', '172.16.0.12', '172.16.0.13']
jsonrpc.Executor/5::DEBUG::2016-07-28
18:50:57,662::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
--cpu-list 0-31 /usr/bin/sudo -n /usr/bin/systemd-run --scope
--slice=vdsm-glusterfs /usr/bin/mount -t glusterfs -o
backup-volfile-servers=172.16.0.12:172.16.0.13 172.16.0.11:/ovirt
/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
jsonrpc.Executor/5::DEBUG::2016-07-28
18:50:57,789::__init__::318::IOProcessClient::(_run) Starting IOProcess...
jsonrpc.Executor/5::DEBUG::2016-07-28
18:50:57,802::mount::229::Storage.Misc.excCmd::(_runcmd) /usr/bin/taskset
--cpu-list 0-31 /usr/bin/sudo -n /usr/bin/umount -f -l
/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt (cwd None)
jsonrpc.Executor/5::ERROR::2016-07-28
18:50:57,813::hsm::2473::Storage.HSM::(connectStorageServer) Could not
connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2470, in connectStorageServer
conObj.connect()
  File "/usr/share/vdsm/storage/storageServer.py", line 248, in connect
six.reraise(t, v, tb)
  File "/usr/share/vdsm/storage/storageServer.py", line 241, in connect
self.getMountObj().getRecord().fs_file)
  File "/usr/share/vdsm/storage/fileSD.py", line 79, in validateDirAccess
raise se.StorageServerAccessPermissionError(dirPath)
StorageServerAccessPermissionError: Permission settings on the specified
path do not allow access to the storage. Verify permission settings on the
specified storage path.: 'path =
/rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt'
jsonrpc.Executor/5::DEBUG::2016-07-28
18:50:57,817::hsm::2497::Storage.HSM::(connectStorageServer) knownSDs: {}
jsonrpc.Executor/5::INFO::2016-07-28
18:50:57,817::logUtils::51::dispatcher::(wrapper) Run and protect:
connectStorageServer, Return response: {'statuslist': [{'status': 469,
'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
jsonrpc.Executor/5::DEBUG::2016-07-28
18:50:57,817::task::1191::Storage.TaskManager.Task::(prepare)
Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::finished: {'statuslist':
[{'status': 469, 'id': u'2d285de3-eede-42aa-b7d6-7b8c6e0667bc'}]}
jsonrpc.Executor/5::DEBUG::2016-07-28
18:50:57,817::task::595::Storage.TaskManager.Task::(_updateState)
Task=`21487eb4-de9b-47a3-aa37-7dce06533cc9`::moving from state preparing ->
state finished

I can manually mount the gluster volume on the same server.
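
One way to narrow down the permission error is to look at the owner and mode of
every component of the path vdsm uses, and to retry the access as the vdsm user,
for example:

# show owner, group and mode of each path component
namei -l /rhev/data-center/mnt/glusterSD/

# while the domain mountpoint exists, repeat the listing as vdsm
sudo -u vdsm ls -ld /rhev/data-center/mnt/glusterSD/172.16.0.11:_ovirt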


>
>
>> Setup:
>> engine running on a separate node
>> 3 x kvm/glusterd nodes
>>
>> Status of volume: ovirt
>> Gluster process TCP Port  RDMA Port  Online
>>  Pid
>>
>> --
>> Brick 172.16.0.11:/data/brick1/brick1   49152 0  Y
>> 17304
>> Brick 172.16.0.12:/data/brick3/brick3   49155 0  Y
>> 9363
>> Brick 172.16.0.13:/data/brick1/brick1   49152 0  Y
>> 23684
>> Brick 172.16.0.11:/data/brick2/brick2   49153 0  Y
>> 17323
>> Brick 172.16.0.12:/data/brick2/brick2   49153 0  Y
>> 9382
>

[ovirt-users] Cannot find master domain

2016-07-28 Thread Siavash Safi
Hi,

Issue: Cannot find master domain
Changes applied before the issue started: replaced
172.16.0.12:/data/brick1/brick1 with 172.16.0.12:/data/brick3/brick3, and did
minor package upgrades for vdsm and glusterfs.

vdsm log: https://paste.fedoraproject.org/396842/
yum log: https://paste.fedoraproject.org/396854/

Setup:
engine running on a separate node
3 x kvm/glusterd nodes

Status of volume: ovirt
Gluster process TCP Port  RDMA Port  Online  Pid
--
Brick 172.16.0.11:/data/brick1/brick1   49152 0  Y
17304
Brick 172.16.0.12:/data/brick3/brick3   49155 0  Y
9363
Brick 172.16.0.13:/data/brick1/brick1   49152 0  Y
23684
Brick 172.16.0.11:/data/brick2/brick2   49153 0  Y
17323
Brick 172.16.0.12:/data/brick2/brick2   49153 0  Y
9382
Brick 172.16.0.13:/data/brick2/brick2   49153 0  Y
23703
NFS Server on localhost 2049  0  Y
30508
Self-heal Daemon on localhost   N/A   N/AY
30521
NFS Server on 172.16.0.11   2049  0  Y
24999
Self-heal Daemon on 172.16.0.11 N/A   N/AY
25016
NFS Server on 172.16.0.13   2049  0  Y
25379
Self-heal Daemon on 172.16.0.13 N/A   N/AY
25509

Task Status of Volume ovirt
--
Task : Rebalance
ID   : 84d5ab2a-275e-421d-842b-928a9326c19a
Status   : completed

Thanks,
Siavash