Right, after doing a bit more debugging, this looks like a bug in python-rbd only. Running the following Python code, I see the same errors about 50% of the time:

import rados
import rbd
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')  # read the cluster config
cluster.connect()
ioctx = cluster.open_ioctx('images')  # pool that holds the Glance images
image = rbd.Image(ioctx, '41a6aadf-c655-49c8-b77d-eedd356f5349')
stat = image.stat()
print stat['size'] == image.size()  # Python 2 print statement

$ python test.py
True
$ python test.py
Traceback (most recent call last):
  File "test.py", line 7, in <module>
    stat = image.stat()
  File "rbd.pyx", line 1124, in rbd.Image.stat (/build/ceph-XmVvyr/ceph-10.2.2/src/build/rbd.c:10803)
  File "rbd.pyx", line 433, in rbd.decode_cstr (/build/ceph-XmVvyr/ceph-10.2.2/src/build/rbd.c:2440)
  File "/usr/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf4 in position 24: unexpected end of data

The "position" in the error is always 24, with varying byte values.

** Also affects: ceph (Ubuntu)
   Importance: Undecided
       Status: New

** Summary changed:

- Glance fails to retrieve image from rbd backend
+ image.stat() call sometimes fails

** Description changed:

- We have a deployment with three controller nodes running glance version
- 2:12.0.0-0ubuntu2 with a Ceph backend. About 50% of the image download
- requests from compute nodes are failing with an error like this:
+ The error looks like

- 2016-09-20 07:37:32.014 29606 WARNING glance.location [req-78e5191c-
- e31b-428e-82c5-30ef87feef4e a73f3a58c6ad478a838c3ca9353704f7
- 7a17e2b80f274591aec96baec3f6bf68 - - -] Get image 41a6aadf-c655-49c8
- -b77d-eedd356f5349 data failed: 'utf8' codec can't decode byte 0x82 in
- position 24: invalid start byte.
+ 'utf8' codec can't decode byte 0x82 in position 24: invalid start byte.

- Looking at the code, it looks like the error is occurring in
- glance_store/_drivers/rbd.py while reading the image, but it seems very
- strange that glance is trying to utf8-decode a raw image, which pretty
- likely is bound to fail. But then it is unclear why only every other
- request is failing.
-
- The error can be reproduced by just doing a "glance image-download", so
- it's nothing special happening from the Nova side.
+ where the position always is 24, which seems to be the end of the "block_name_prefix" string, and the byte itself is varying.
+ About 50% of the time the test script gives an error.

** Tags removed: rbd

-- 
https://bugs.launchpad.net/bugs/1625489

Title:
  image.stat() call sometimes fails
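
To illustrate why the position would always be 24 while the byte value varies, here is a minimal sketch that needs no Ceph cluster. It assumes that "block_name_prefix" sits in a fixed 24-byte buffer with no guaranteed NUL terminator, and that the decoder reads up to the first NUL it finds. The field size, the sample prefix and both helper functions are hypothetical and only model the suspected behaviour; this is not the python-rbd code.

```python
# Assumption: the prefix is stored in a fixed 24-byte C buffer that is not
# guaranteed to be NUL-terminated. Size and sample values are illustrative.
FIELD_SIZE = 24


def decode_until_nul(raw):
    """Decode everything up to the first NUL, like an unbounded C-string read."""
    return raw.split(b"\0", 1)[0].decode("utf-8")


def decode_bounded(raw, size=FIELD_SIZE):
    """Look at no more than `size` bytes, then stop at the first NUL, if any."""
    return raw[:size].split(b"\0", 1)[0].decode("utf-8")


# Hypothetical prefix that fills all 24 bytes, leaving no room for a NUL.
prefix = b"rbd_data.0123456789abcde"
assert len(prefix) == FIELD_SIZE

# Whatever follows the field in memory is arbitrary. 0xf4 and 0x82 are the
# two byte values quoted in this report.
for stray in (b"\xf4", b"\x82"):
    memory = prefix + stray + b"\0"
    try:
        decode_until_nul(memory)
    except UnicodeDecodeError as exc:
        print("unbounded read failed at position %d: %s" % (exc.start, exc.reason))
    print("bounded read: %s" % decode_bounded(memory))
```

If that assumption holds, the unbounded read succeeds only when the byte after the field happens to be NUL or valid UTF-8, which would fit the intermittent failures. Limiting the read to the field size avoids the problem in this sketch.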