Public bug reported:

I understand there is code to clean up the instance directory on the
target host if a live migration fails, but the directory is not
cleaned up if the libvirt connection times out.

I haven't had a chance to root-cause the issue, but I feel the code
could be improved a little to avoid it.

Here are some trace logs from my side.

- Libvirt connection timed out
2017-03-07 02:34:37.540 ERROR nova.virt.libvirt.driver [req-35bc9ca8-c77b-481c-ae3e-93bbfed6187f admin admin] [instance: 6714b056-4950-4e63-83d3-fc383e977a53] Live Migration failure: unable to connect to server at 'ceph-dev:49152': Connection timed out
2017-03-07 02:34:37.541 DEBUG nova.virt.libvirt.driver [req-35bc9ca8-c77b-481c-ae3e-93bbfed6187f admin admin] [instance: 6714b056-4950-4e63-83d3-fc383e977a53] Migration operation thread notification from (pid=18073) thread_finished /opt/stack/nova/nova/virt/libvirt/driver.py:6361
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 457, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
    waiter.switch(result)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 214, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/utils.py", line 1066, in context_wrapper
    return func(*args, **kwargs)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5962, in _live_migration_operation
    instance=instance)
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5958, in _live_migration_operation
    bandwidth=CONF.libvirt.live_migration_bandwidth)
  File "/opt/stack/nova/nova/virt/libvirt/guest.py", line 605, in migrate
    flags=flags, bandwidth=bandwidth)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 186, in doit
    result = proxy_call(self._autowrap, f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 144, in proxy_call
    rv = execute(f, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 125, in execute
    six.reraise(c, e, tb)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/tpool.py", line 83, in tworker
    rv = meth(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/libvirt.py", line 1586, in migrateToURI2
    if ret == -1: raise libvirtError ('virDomainMigrateToURI2() failed', dom=self)
libvirtError: unable to connect to server at 'ceph-dev:49152': Connection timed out


- The instance's directory has not been cleaned up, so the next migration fails.

Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 133, in _process_incoming
    res = self.dispatcher.dispatch(message)

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 150, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)

  File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 121, in _do_dispatch
    result = func(ctxt, **new_args)

  File "/opt/stack/nova/nova/exception_wrapper.py", line 75, in wrapped
    function_name, call_dict, binary)

  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()

  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)

  File "/opt/stack/nova/nova/exception_wrapper.py", line 66, in wrapped
    return f(self, context, *args, **kw)

  File "/opt/stack/nova/nova/compute/utils.py", line 613, in decorated_function
    return function(self, context, *args, **kwargs)

  File "/opt/stack/nova/nova/compute/manager.py", line 216, in decorated_function
    kwargs['instance'], e, sys.exc_info())

  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()

  File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)

  File "/opt/stack/nova/nova/compute/manager.py", line 204, in decorated_function
    return function(self, context, *args, **kwargs)

  File "/opt/stack/nova/nova/compute/manager.py", line 5192, in pre_live_migration
    migrate_data)

  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6474, in pre_live_migration
    raise exception.DestinationDiskExists(path=instance_dir)

DestinationDiskExists: The supplied disk path (/opt/stack/data/nova/instances/6714b056-4950-4e63-83d3-fc383e977a53) already exists, it is expected not to exist.
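
To illustrate the kind of improvement I have in mind, here is a minimal
sketch of the failure sequence and one possible guard. It is not nova's
actual code: every name in it (prepare_destination, cleanup_on_failure,
live_migrate, and the two stand-in exception classes) is hypothetical,
and it collapses source and destination into one process, whereas in
nova the two steps run on different hosts and cleanup would have to be
triggered remotely.

```python
import os
import shutil
import tempfile
from contextlib import contextmanager


class DestinationDiskExists(Exception):
    """Hypothetical stand-in for nova's DestinationDiskExists."""


class MigrationConnectError(Exception):
    """Hypothetical stand-in for the libvirtError seen on timeout."""


def prepare_destination(instance_dir):
    # Same kind of check as in the second traceback: the directory
    # must not already exist before the migration starts.
    if os.path.exists(instance_dir):
        raise DestinationDiskExists(instance_dir)
    os.makedirs(instance_dir)


@contextmanager
def cleanup_on_failure(instance_dir):
    # Remove the destination directory on any failure, including a
    # connection timeout, then re-raise the original error.
    try:
        yield
    except Exception:
        shutil.rmtree(instance_dir, ignore_errors=True)
        raise


def live_migrate(instance_dir, do_migrate):
    prepare_destination(instance_dir)
    with cleanup_on_failure(instance_dir):
        do_migrate()


def failing_migrate():
    # Simulates the connection timeout from the first traceback.
    raise MigrationConnectError("Connection timed out")


def demo():
    base = tempfile.mkdtemp()
    instance_dir = os.path.join(base, "instance-uuid")
    results = []
    for _ in range(2):
        try:
            live_migrate(instance_dir, failing_migrate)
        except Exception as exc:
            results.append(type(exc).__name__)
    shutil.rmtree(base, ignore_errors=True)
    return results
```

With the guard in place, both attempts in demo() fail with the
simulated timeout; without it, the second attempt would instead hit
the "already exists" check, which is the behaviour reported above.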

** Affects: nova
     Importance: Undecided
         Status: New

-- 
You received this bug notification because you are a member of Yahoo!
Engineering Team, which is subscribed to OpenStack Compute (nova).
https://bugs.launchpad.net/bugs/1672597

Title:
  [live migration] The instance directory on the destination host is
  not cleaned up

Status in OpenStack Compute (nova):
  New


To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1672597/+subscriptions
