On Fri, Dec 6, 2019 at 12:12 AM Dongsheng Yang
<[email protected]> wrote:
>
> On 12/06/2019 12:50 PM, [email protected] wrote:
>
> Hi Jason, Dongsheng,
>
> I found a problem using rbd_open_by_id when the connection times out
> (errno = 110; ceph version 12.2.8; rbd_open_by_id is unchanged in the
> master branch):
>
>     int r = ictx->state->open(false);
>     if (r < 0) { // r = -110
>       delete ictx; // crash, the stack trace is shown below
>     } else {
>       *image = (rbd_image_t)ictx;
>     }
>
> The stack trace is:
> (gdb) bt
> Program terminated with signal 11, Segmentation fault.
> #0 Mutex::Lock (this=this@entry=0x8, no_lockdep=no_lockdep@entry=false)
> at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/Mutex.cc:92
> #1 0x0000ffff765eb814 in Locker (m=..., this=<synthetic pointer>)
> at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/Mutex.h:115
> #2 PerfCountersCollection::remove (this=0x0, l=0xffff85bf0588
> <main_arena+88>)
> at /var/lib/jenkins/workspace/ceph_L/ceph/src/common/perf_counters.cc:62
> #3 0x0000ffff757f26c4 in librbd::ImageCtx::perf_stop (this=0x2065d2f0)
> at /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/ImageCtx.cc:426
> #4 0x0000ffff757f6e48 in librbd::ImageCtx::~ImageCtx (this=0x2065d2f0,
> __in_chrg=<optimized out>)
> at /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/ImageCtx.cc:239
> #5 0x0000ffff757e8e70 in rbd_open_by_id (p=p@entry=0x2065e780,
> id=id@entry=0xffff85928bb4 "20376aa706c0",
> image=image@entry=0xffff75b9b0b8,
> snap_name=snap_name@entry=0x0) at
> /var/lib/jenkins/workspace/ceph_L/ceph/src/librbd/librbd.cc:2692
> #6 0x0000ffff75aed524 in __pyx_pf_3rbd_5Image___init__ (
> __pyx_v_read_only=0xffff85f02bb8 <_Py_ZeroStruct>,
> __pyx_v_snapshot=0xffff85f15938 <_Py_NoneStruct>,
> __pyx_v_by_id=0xffff85f02ba0 <_Py_TrueStruct>,
> __pyx_v_name=0xffff85928b90,
> __pyx_v_ioctx=0xffff7f0283d0, __pyx_v_self=0xffff75b9b0a8)
> at
> /var/lib/jenkins/workspace/ceph_L/ceph/build/src/pybind/rbd/pyrex/rbd.c:12662
> #7 __pyx_pw_3rbd_5Image_1__init__ (__pyx_v_self=0xffff75b9b0a8,
> __pyx_args=<optimized out>,
> __pyx_kwds=<optimized out>)
> at
> /var/lib/jenkins/workspace/ceph_L/ceph/build/src/pybind/rbd/pyrex/rbd.c:12378
> #8 0x0000ffff85e02924 in type_call () from /lib64/libpython2.7.so.1.0
> #9 0x0000ffff85dac254 in PyObject_Call () from /lib64/libpython2.7.so.1.0
I suspect you will hit tons of errors, corruption, and leaked RADOS
objects when you start injecting RADOS timeouts. It's not something
librbd really supports or tests against -- so you need to expect the
unexpected. Definitely feel free to open a PR to fix it -- but
otherwise it would be a really *low* priority for us to fix.
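For context, these op timeouts are opt-in client configuration (off by
default), so a setup roughly like the following sketch is presumably what
produced the -110 (ETIMEDOUT) above -- the option names are the real
librados ones, the values here are arbitrary:

    /* Sketch: enable client-side RADOS op timeouts, which surface as
     * -ETIMEDOUT (-110) from librados/librbd calls that take too long. */
    rados_t cluster;
    rados_create(&cluster, NULL);
    rados_conf_read_file(cluster, NULL);                   /* ceph.conf */
    rados_conf_set(cluster, "rados_osd_op_timeout", "30"); /* seconds */
    rados_conf_set(cluster, "rados_mon_op_timeout", "30");
    rados_connect(cluster);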
> Meanwhile, I have some questions about rbd_open: if state->open fails
> (return value r < 0), is there a risk of a memory leak for ictx?
> extern "C" int rbd_open(rados_ioctx_t p, const char *name,
>                         rbd_image_t *image, const char *snap_name)
> {
>   librados::IoCtx io_ctx;
>   librados::IoCtx::from_rados_ioctx_t(p, io_ctx);
>   TracepointProvider::initialize<tracepoint_traits>(get_cct(io_ctx));
>   librbd::ImageCtx *ictx = new librbd::ImageCtx(name, "", snap_name,
>                                                 io_ctx, false);
>   tracepoint(librbd, open_image_enter, ictx, ictx->name.c_str(),
>              ictx->id.c_str(), ictx->snap_name.c_str(), ictx->read_only);
>
>   int r = ictx->state->open(false);
>   if (r >= 0) { // if r < 0, is there a risk of a memory leak?
>
> Good catch. I think there is, and the same applies to
> rbd_open_read_only. It seems we need a patch to fix them.
No, that's not a memory leak. The blocking version of
"ImageState::open" will free the "ImageCtx" memory upon failure.
> Thanx
> Yang
>
>     *image = (rbd_image_t)ictx;
>   }
>   tracepoint(librbd, open_image_exit, r);
>   return r;
> }
>
> Can you give me some advice? Many thanks.
>
>
> ________________________________
> [email protected]
>
>
--
Jason