On Fri, May 8, 2020 at 12:10 PM Lomayani S. Laizer <[email protected]> wrote:
>
> Hello,
> On my side, these are the logs below at the point of the VM crash. At the
> moment my debug level is 10; I will raise it to 20 for full debug. These
> crashes are random and so far happen on very busy VMs. After downgrading
> the clients on the host to Nautilus the crashes disappear.
You could try adding debug_rados as well, but you may get a very large
log, so keep an eye on things.
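For example, something like this in the [client] section of ceph.conf on
the hypervisor should turn up the relevant logging (the log path here is
just an example, adjust it for your environment):

    [client]
        debug rbd = 20
        debug rados = 20
        log file = /var/log/ceph/qemu-guest.$pid.log

At those levels the log grows quickly, so make sure there's plenty of
space wherever it's written.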
>
> QEMU in general is not shutting down, because other VMs on the same host
> continue working.
A process cannot reliably continue after encountering a segfault, so the
qemu-kvm process must be exiting, and therefore it should be possible to
capture a coredump with the right configuration.
In the following example, if you were to search for pid 6060 you would
find it is no longer running.
>> > [ 7682.233684] fn-radosclient[6060]: segfault at 2b19 ip 00007f8165cc0a50
>> > sp 00007f81397f6490 error 4 in librbd.so.1.12.0[7f8165ab4000+537000]
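For example, on a systemd-based host with systemd-coredump installed,
something like the following should show whether a core was captured and
let you open it directly in gdb (6060 here is just the pid from the dmesg
output above):

    coredumpctl list qemu-kvm
    coredumpctl gdb 6060

Without systemd-coredump you'd need to make sure the qemu-kvm process runs
with an unlimited core size (ulimit -c unlimited, or LimitCORE=infinity in
its systemd unit) and that kernel.core_pattern points somewhere writable.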
Without at least a backtrace it may be very difficult to work out
what's going on with any certainty. If you open a tracker for the issue,
though, one of the devs specialising in rbd may have some feedback.
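Once you have a core, something like this should get a full backtrace
(you'll want the matching debuginfo/dbgsym packages for qemu and librbd
installed first, and the path to the qemu binary varies by distribution):

    gdb /usr/bin/qemu-kvm /path/to/core
    (gdb) thread apply all bt full

The tracker is at https://tracker.ceph.com/.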
>
> 2020-05-07T13:02:12.121+0300 7f88d57fa700 10 librbd::io::ReadResult:
> 0x7f88c80bfbf0 finish: got {} for [0,24576] bl 24576
> 2020-05-07T13:02:12.193+0300 7f88d57fa700 10 librbd::io::ReadResult:
> 0x7f88c80f9330 finish: C_ObjectReadRequest: r=0
> 2020-05-07T13:02:12.193+0300 7f88d57fa700 10 librbd::io::ReadResult:
> 0x7f88c80f9330 finish: got {} for [0,16384] bl 16384
> 2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::ImageState:
> 0x5569b5da9bb0 0x5569b5da9bb0 send_close_unlock
> 2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::ImageState:
> 0x5569b5da9bb0 0x5569b5da9bb0 send_close_unlock
> 2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 send_block_image_watcher
> 2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::ImageWatcher:
> 0x7f88c400dfe0 block_notifies
> 2020-05-07T13:02:28.694+0300 7f890ba90500 5 librbd::Watcher: 0x7f88c400dfe0
> block_notifies: blocked_count=1
> 2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_block_image_watcher: r=0
> 2020-05-07T13:02:28.694+0300 7f890ba90500 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 send_shut_down_update_watchers
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_shut_down_update_watchers: r=0
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 send_shut_down_io_queue
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 5 librbd::io::ImageRequestWQ:
> 0x7f88e8001570 shut_down: shut_down: in_flight=0
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_shut_down_io_queue: r=0
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 send_shut_down_exclusive_lock
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ExclusiveLock:
> 0x7f88c4011ba0 shut_down
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ManagedLock:
> 0x7f88c4011bb8 shut_down:
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ManagedLock:
> 0x7f88c4011bb8 send_shutdown:
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ManagedLock:
> 0x7f88c4011bb8 send_shutdown_release:
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10 librbd::ExclusiveLock:
> 0x7f88c4011ba0 pre_release_lock_handler
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020
> send_cancel_op_requests:
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020
> handle_cancel_op_requests: r=0
> 2020-05-07T13:02:28.694+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_block_writes:
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 5 librbd::io::ImageRequestWQ:
> 0x7f88e8001570 block_writes: 0x5569b5e1ffd0, num=1
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020
> handle_block_writes: r=0
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_wait_for_ops:
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 handle_wait_for_ops:
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020
> send_invalidate_cache:
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 5 librbd::io::ObjectDispatcher:
> 0x5569b5dab700 invalidate_cache:
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020
> handle_invalidate_cache: r=0
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_flush_notifies:
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020
> handle_flush_notifies:
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020
> send_close_object_map:
> 2020-05-07T13:02:28.698+0300 7f88d4ff9700 10
> librbd::object_map::UnlockRequest: 0x7f88c807a450 send_unlock:
> oid=rbd_object_map.2f18f2a67fad72
> 2020-05-07T13:02:28.702+0300 7f88d57fa700 10
> librbd::object_map::UnlockRequest: 0x7f88c807a450 handle_unlock: r=0
> 2020-05-07T13:02:28.702+0300 7f88d57fa700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020
> handle_close_object_map: r=0
> 2020-05-07T13:02:28.702+0300 7f88d57fa700 10
> librbd::exclusive_lock::PreReleaseRequest: 0x7f88c80b6020 send_unlock:
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ManagedLock:
> 0x7f88c4011bb8 handle_shutdown_pre_release: r=0
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10
> librbd::managed_lock::ReleaseRequest: 0x7f88c80b68a0 send_unlock:
> entity=client.58292796, cookie=auto 140225447738256
> 2020-05-07T13:02:28.702+0300 7f88d57fa700 10
> librbd::managed_lock::ReleaseRequest: 0x7f88c80b68a0 handle_unlock: r=0
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ExclusiveLock:
> 0x7f88c4011ba0 post_release_lock_handler: r=0 shutting_down=1
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 5 librbd::io::ImageRequestWQ:
> 0x7f88e8001570 unblock_writes: 0x5569b5e1ffd0, num=0
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ImageWatcher:
> 0x7f88c400dfe0 notify released lock
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ImageWatcher:
> 0x7f88c400dfe0 current lock owner: [0,0]
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ManagedLock:
> 0x7f88c4011bb8 handle_shutdown_post_release: r=0
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ManagedLock:
> 0x7f88c4011bb8 wait_for_tracked_ops: r=0
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ManagedLock:
> 0x7f88c4011bb8 complete_shutdown: r=0
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_shut_down_exclusive_lock: r=0
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 send_unregister_image_watcher
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::ImageWatcher:
> 0x7f88c400dfe0 unregistering image watcher
> 2020-05-07T13:02:28.702+0300 7f88d4ff9700 10 librbd::Watcher: 0x7f88c400dfe0
> unregister_watch:
> 2020-05-07T13:02:28.702+0300 7f88d57fa700 5 librbd::Watcher: 0x7f88c400dfe0
> notifications_blocked: blocked=1
> 2020-05-07T13:02:28.706+0300 7f88ceffd700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_unregister_image_watcher: r=0
> 2020-05-07T13:02:28.706+0300 7f88ceffd700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 send_flush_readahead
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_flush_readahead: r=0
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 send_shut_down_object_dispatcher
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 5 librbd::io::ObjectDispatcher:
> 0x5569b5dab700 shut_down:
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 5 librbd::io::ObjectDispatch:
> 0x5569b5ee8360 shut_down:
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 5
> librbd::io::SimpleSchedulerObjectDispatch: 0x7f88c4013ce0 shut_down:
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 5
> librbd::cache::WriteAroundObjectDispatch: 0x7f88c8003780 shut_down:
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_shut_down_object_dispatcher: r=0
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 send_flush_op_work_queue
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_flush_op_work_queue: r=0
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::image::CloseRequest:
> 0x7f88c8175fd0 handle_flush_image_watcher: r=0
> 2020-05-07T13:02:28.706+0300 7f88d4ff9700 10 librbd::ImageState:
> 0x5569b5da9bb0 0x5569b5da9bb0 handle_close: r=0
>
> On Fri, May 8, 2020 at 12:40 AM Brad Hubbard <[email protected]> wrote:
>>
>> On Fri, May 8, 2020 at 3:42 AM Erwin Lubbers <[email protected]> wrote:
>> >
>> > Hi,
>> >
>> > Did anyone find a way to resolve the problem? I'm seeing the same on a
>> > clean Octopus Ceph installation on Ubuntu 18, with a KVM server compiled
>> > against Octopus running on CentOS 7.8. The KVM machine shows:
>> >
>> > [ 7682.233684] fn-radosclient[6060]: segfault at 2b19 ip 00007f8165cc0a50
>> > sp 00007f81397f6490 error 4 in librbd.so.1.12.0[7f8165ab4000+537000]
>>
>> Are you able to either capture a backtrace from a coredump or set up
>> logging and hopefully capture a backtrace that way?
>>
>> >
>> > Ceph has been healthy and stable for a few weeks, and I did not see
>> > these messages while running KVM compiled with the Luminous libraries.
>> >
>> > Regards,
>> > Erwin
>>
>>
>> --
>> Cheers,
>> Brad
--
Cheers,
Brad
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]