Hi Liron,

I've reproduced the issue with a fresh deployment of oVirt 3.5.2rc. I've 
provided you with new screencasts and relevant logs for both cases (see inline 
comments):

screencast for case 1: 
https://www.dropbox.com/s/fdrcwmpy03v5xri/Screencast%20from%2004-03-2015%2010%3A53%3A32.webm?dl=1
screencast for case 2: 
https://www.dropbox.com/s/w72bf86n9v2pvdw/Screencast%20from%2004-03-2015%2015%3A18%3A45.webm?dl=1
logs for case 1: 
https://www.dropbox.com/sh/bl24umw0w1anclb/AAC0Oq7c6oXWetw-tp-55c37a?dl=0
logs for case 2: 
https://www.dropbox.com/sh/rp3pdda68nox099/AABtZGKDfFCH3sD6FZPvxRmEa?dl=0

Please note that I'm using different networks for Management (192.168.48.0/24) 
and GlusterFS replica (192.168.50.0/24):

                management FQDN         GlusterFS FQDN
node 1:         s20.ovirt.prisma        s20gfs.ovirt.prisma
node 2:         s21.ovirt.prisma        s21gfs.ovirt.prisma

On Sun, 2015-03-01 at 04:55 -0500, Liron Aravot wrote:
> Hi Stefano,
> thanks for the great input!
> 
> I went over the logs (does the screencast use the same domains? I don't have 
> the logs from that run). The master domain deactivation (and the migration of 
> the master role to the new domain) fails while copying the master fs content 
> to the new domain with tar (see the error at [1]).
> 
> 1. Is there a chance of an inconsistent storage access problem with any of 
> the domains?
The storage domains rely on GlusterFS volumes created for this purpose. The VMs 
run correctly.

> 2. Does the issue always reproduce, or only in some of the runs?
The issue always reproduces, but with a difference:
case 1) if DATA and DATA_NEW are both created pointing to s20gfs, the issue 
reproduces and the Master role changes (Screencast 1).
case 2) if DATA is pointing to s20 and DATA_NEW to s20gfs, the issue reproduces 
and the Master role flips but does not change (Screencast 2).

> 3. Have you tried to run an operation that creates a task? The creation of a 
> disk, for example.
All operations, such as creating or moving a disk, work correctly.

> 
> thanks,
> Liron.
> 
> 
> 
> [1]:
> Thread-9875::DEBUG::2015-02-25 
> 15:06:57,969::clusterlock::349::Storage.SANLock::(release) Cluster lock for 
> domain 08298f60-4919-4f86-9233-827c1089779a successfully released
> Thread-9875::ERROR::2015-02-25 
> 15:06:57,969::task::866::Storage.TaskManager.Task::(_setError) 
> Task=`2a434209-3e96-4d1e-8d1b-8c7463889f6a`::Unexpected error
> Traceback (most recent call last):
>   File "/usr/share/vdsm/storage/task.py", line 873, in _run
>     return fn(*args, **kargs)
>   File "/usr/share/vdsm/logUtils.py", line 45, in wrapper
>     res = f(*args, **kwargs)
>   File "/usr/share/vdsm/storage/hsm.py", line 1246, in deactivateStorageDomain
>     pool.deactivateSD(sdUUID, msdUUID, masterVersion)
>   File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
>     return method(self, *args, **kwargs)
>   File "/usr/share/vdsm/storage/sp.py", line 1097, in deactivateSD
>     self.masterMigrate(sdUUID, newMsdUUID, masterVersion)
>   File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
>     return method(self, *args, **kwargs)
>   File "/usr/share/vdsm/storage/sp.py", line 816, in masterMigrate
>     exclude=('./lost+found',))
>   File "/usr/share/vdsm/storage/fileUtils.py", line 68, in tarCopy
>     raise TarCopyFailed(tsrc.returncode, tdst.returncode, out, err)
> TarCopyFailed: (1, 0, '', '')
> Thread-9875::DEBUG::2015-02-25 
> 15:06:57,969::task::885::Storage.TaskManager.Task::(_run) 
> Task=`2a434209-3e96-4d1e-8d1b-8c7463889f6a`::Task._run: 
> 2a434209-3e96-4d1e-8d1b-8c7463889f6a ('62a034ca-63df-44f2-9a87-735ddd257a6b', 
> '00000002-0002-0002-0002-00000000022f', 
> '08298f60-4919-4f86-9233-827c1089779a', 34) {} failed
>  - stopping task
> 
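For reference, the last line of the traceback is raised as
TarCopyFailed(tsrc.returncode, tdst.returncode, out, err), so (1, 0, '', '')
would mean that the source-side tar exited with status 1 while the
destination side exited with 0. GNU tar documents exit status 1 on create as
"some files differ", i.e. files changed while they were being archived. The
following is only a minimal sketch of such a tar-to-tar copy, assuming that
tarCopy is a plain pipe between two tar processes (as the names tsrc and tdst
suggest). It is not the vdsm code; the function name tar_copy and the exact
command lines are assumptions made for illustration.

```python
import subprocess
import tempfile


class TarCopyFailed(RuntimeError):
    """Raised when either side of the tar pipe exits non-zero."""


def tar_copy(src, dst, exclude=()):
    """Copy the tree under src into dst through 'tar -c | tar -x'.

    Hypothetical helper: only mirrors the shape of the error seen in
    the traceback (source return code, destination return code).
    """
    excludes = ["--exclude=" + path for path in exclude]
    # Source stderr goes to a temp file so an unread pipe cannot block tar.
    with tempfile.TemporaryFile() as src_err:
        tsrc = subprocess.Popen(
            ["tar", "-C", src, "-cf", "-"] + excludes + ["."],
            stdout=subprocess.PIPE, stderr=src_err)
        tdst = subprocess.Popen(
            ["tar", "-C", dst, "-xf", "-"],
            stdin=tsrc.stdout, stderr=subprocess.PIPE)
        # Close our copy of the pipe so only the extracting tar reads it.
        tsrc.stdout.close()
        _, dst_err = tdst.communicate()
        tsrc.wait()
        src_err.seek(0)
        err = src_err.read() + dst_err
    if tsrc.returncode != 0 or tdst.returncode != 0:
        raise TarCopyFailed(tsrc.returncode, tdst.returncode, err)
```

Running something like tar_copy(old_master_dir, new_master_dir,
exclude=('./lost+found',)) by hand on the host while the domain is in use
might show whether the reading side also exits with 1 outside of vdsm; the
two directory names here are placeholders, not real paths.
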
> ----- Original Message -----
> > From: "Stefano Stagnaro" <stefa...@prisma-eng.com>
> > To: "Vered Volansky" <ve...@redhat.com>
> > Cc: users@ovirt.org
> > Sent: Friday, February 27, 2015 4:54:31 PM
> > Subject: Re: [ovirt-users] Sync Error on Master Domain after adding a 
> > second one
> > 
> > I think I finally managed to replicate the problem:
> > 
> > 1. deploy a datacenter with a virt only cluster and a gluster only cluster
> > 2. create a first GlusterFS Storage Domain (e.g. DATA) and activate it
> > (should become Master)
> > 3. create a second GlusterFS Storage Domain (e.g. DATA_NEW) and activate it
> > 4. put DATA in maintenance
> > 
> > Both Storage Domains flow between the following states:
> > https://www.dropbox.com/s/x542q1epf40ar5p/Screencast%20from%2027-02-2015%2015%3A09%3A29.webm?dl=1
> > 
> > Webadmin Events show: "Sync Error on Master Domain between Host v10 and
> > oVirt Engine. Domain: DATA is marked as Master in oVirt Engine database but
> > not on the Storage side. Please consult with Support on how to fix this
> > issue."
> > 
> > It seems DATA can be deactivated on the second attempt.
> > 
> > --
> > Stefano Stagnaro
> > 
> > Prisma Engineering S.r.l.
> > Via Petrocchi, 4
> > 20127 Milano – Italy
> > 
> > Tel. 02 26113507 int 339
> > e-mail: stefa...@prisma-eng.com
> > skype: stefano.stagnaro
> > 
> > On Wed, 2015-02-25 at 15:41 +0100, Stefano Stagnaro wrote:
> > > This is what I've done basically:
> > > 
> > > 1. added a new data domain (DATA_R3);
> > > 2. activated the new data domain - both domains in "active" state;
> > > 3. moved Disks from DATA to DATA_R3;
> > > 4. tried to put the old data domain in maintenance (from webadmin or
> > > shell);
> > > 5. both domains became inactive;
> > > 6. DATA_R3 came back in "active";
> > > 7. DATA domain went in "being initialized";
> > > 8. Webadmin shows the error "Sync Error on Master Domain between...";
> > > 9. DATA domain completed the reconstruction and came back in "active".
> > > 
> > > Please find engine and vdsm logs here:
> > > https://www.dropbox.com/sh/uuwwo8sxcg4ffqp/AAAx6UrwI3jbsN4oraJuDx9Fa?dl=0
> > > 
> > 
> > 
> > 
> > _______________________________________________
> > Users mailing list
> > Users@ovirt.org
> > http://lists.ovirt.org/mailman/listinfo/users
> > 
