Hi, I have two DCs (both initialized) with two nodes each. On the first one I have a replica 2 gluster storage domain that geo-replicates to a replica 2 slave volume on the second DC (both managed by the same engine). When I stop the replication (volumes are synced) and try to import the gluster storage domain that resides on the slave, the import storage domain dialog throws a general exception.

The exception is raised when vdsm loads the list of backup servers in order to populate the backup-volfile-servers mount option. If I override that in storageServer.py so that it always returns blank, or if I manually enter this option in the import storage domain dialog, everything works as expected.

This problem is addressed in Ala's patch (https://gerrit.ovirt.org/#/c/48308/). Are there multiple interfaces configured for gluster on the slave cluster?

I've created a separate network and marked it as a gluster network on both datacenters, but I haven't used the transport.socket.bind-address option to bind gluster to those particular interfaces. The peers, however, were probed via hostnames that resolve to the addresses of the interfaces that belong to the gluster network, without any aliases.

Nir mentioned this bug (the one Ala's patch addresses) a few days back, but somehow I failed to connect the dots. I am using a host in the cluster to connect to the volume, but that host is not part of the volume info, even though it is the same host: it is just a different hostname pointing to a different network interface on that host.
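
To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the actual vdsm code and it is not Ala's patch; the function names and the example.com hostnames are hypothetical and chosen only for illustration. It assumes the backup server list is built from the brick hosts in the volume info by removing the server used for the mount, which would fail in the way the traceback shows when that server is known to gluster under a different hostname.

```python
def backup_servers_option_strict(volfile_server, brick_hosts):
    """Hypothetical helper: assumes the mount server is one of the brick hosts."""
    servers = list(dict.fromkeys(brick_hosts))  # de-duplicate, keep order
    # list.remove() raises ValueError when volfile_server is not in the list,
    # e.g. when the mount uses the node hostname but the bricks were
    # registered under the gluster-network hostname.
    servers.remove(volfile_server)
    if not servers:
        return ""
    return "backup-volfile-servers=" + ":".join(servers)


def backup_servers_option_tolerant(volfile_server, brick_hosts):
    """Hypothetical helper: skips the mount server only if it is present."""
    servers = [h for h in dict.fromkeys(brick_hosts) if h != volfile_server]
    if not servers:
        return ""
    return "backup-volfile-servers=" + ":".join(servers)


if __name__ == "__main__":
    # Bricks are known by their gluster-network hostnames (made-up names).
    bricks = ["gluster1.example.com", "gluster2.example.com"]

    # Mounting through a brick hostname works with both variants.
    print(backup_servers_option_strict("gluster1.example.com", bricks))

    # Mounting through the node hostname breaks the strict variant.
    try:
        backup_servers_option_strict("node1.example.com", bricks)
    except ValueError as err:
        print("strict variant failed:", err)

    # The tolerant variant keeps every brick host as a backup server.
    print(backup_servers_option_tolerant("node1.example.com", bricks))
```

The tolerant variant only avoids the exception. Whether listing every brick host as a backup server is the right behaviour when the mount server is not a brick host is a separate design question, and the sketch does not try to answer it.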

It may be worth mentioning that I have a dedicated gluster network and hostnames for all nodes in both DCs (the node hostname and the hostname I use for gluster on that node are different), and that all attempts to import the storage domain were made on the second DC.

Btw, setting up gluster geo-replication from oVirt was a breeze, easy and straightforward. Importing a domain based on the slave gluster volume works once the gluster storage domain that resides on the master volume is removed from the first DC. This is something we could improve: if I don't detach and remove the original gluster SD, the import storage dialog just shows up again after a short "running circle". Instead, it should warn that another storage domain with the same ID/name is already active/registered in the engine and that it should be removed (or the engine could do it for us). I get this warning only when I've already removed the storage domain on the master volume from the first DC (which doesn't make sense to me).

2015-11-19 07:33:15,245 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-23) [34886be8] Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: The error message for connection hostname:/volname returned by VDSM was: General Exception
2015-11-19 07:33:15,245 ERROR [org.ovirt.engine.core.bll.storage.BaseFsStorageHelper] (default task-23) [34886be8] The connection with details 'hostname:/volname' failed because of error code '100' and error message is: general exception


Thread-38::ERROR::2015-11-19 07:33:15,237::hsm::2465::Storage.HSM::(connectStorageServer) Could not connect to storageServer
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/hsm.py", line 2462, in connectStorageServer
  File "/usr/share/vdsm/storage/storageServer.py", line 224, in connect
    self._mount.mount(self.options, self._vfsType, cgroup=self.CGROUP)
  File "/usr/share/vdsm/storage/storageServer.py", line 324, in options
    backup_servers_option = self._get_backup_servers_option()
  File "/usr/share/vdsm/storage/storageServer.py", line 341, in _get_backup_servers_option
ValueError: list.remove(x): x not in list
