Re: [Ceph-community] Getting WARN in __kick_osd_requests doing stress testing

2015-09-18 Thread Abhishek L
Redirecting to ceph-devel, where such a question might have a better
chance of a reply.

On Fri, Sep 18, 2015 at 4:03 AM,   wrote:
> I'm running a 3-node cluster, doing osd/rbd creation and deletion, and ran
> across the WARN below. Note that it only happened once (on one rbd add) after
> approximately 500 cycles of the test, but I was wondering if someone could
> explain why this warning happens and how I can prevent it.
>
> Here is what my test script is doing:
>
> while(1):
>     create 5 ceph pools   - sleep 2 between each pool create
>     sleep 5
>     create 5 ceph volumes - sleep 2 between each volume create
>     sleep 5
>     delete 5 ceph volumes - sleep 2 between each volume delete
>     sleep 5
>     delete 5 ceph pools   - sleep 2 between each pool delete
>     sleep 5
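
(For illustration, a minimal runnable sketch of a loop like the one above,
assuming the stock ceph/rbd command-line tools; the pool names, image name,
size, and pg count are placeholders, not the poster's actual script:)

    import subprocess
    import time

    def run(*cmd):
        # Shell out to the ceph/rbd CLI and fail loudly on errors.
        subprocess.check_call(cmd)

    def one_cycle(cycle):
        pools = ["stress_pool_%d_%d" % (cycle, i) for i in range(5)]

        for pool in pools:          # create 5 ceph pools
            run("ceph", "osd", "pool", "create", pool, "64")
            time.sleep(2)
        time.sleep(5)

        for pool in pools:          # create 5 rbd volumes, one per pool
            run("rbd", "create", "%s/vol0" % pool, "--size", "1024")
            time.sleep(2)
        time.sleep(5)

        for pool in pools:          # delete the volumes
            run("rbd", "rm", "%s/vol0" % pool)
            time.sleep(2)
        time.sleep(5)

        for pool in pools:          # delete the pools
            run("ceph", "osd", "pool", "delete", pool, pool,
                "--yes-i-really-really-mean-it")
            time.sleep(2)
        time.sleep(5)

    cycle = 0
    while True:
        one_cycle(cycle)
        cycle += 1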
>
>
> 333940 Sep 17 00:31:54 10.0.41.9 [18372.272771] Call Trace:
> 333941 Sep 17 00:31:54 10.0.41.9 [18372.273489]  [] dump_stack+0x45/0x57
> 333942 Sep 17 00:31:54 10.0.41.9 [18372.274226]  [] warn_slowpath_common+0x97/0xe0
> 333943 Sep 17 00:31:54 10.0.41.9 [18372.274923]  [] warn_slowpath_null+0x1a/0x20
> 333944 Sep 17 00:31:54 10.0.41.9 [18372.275635]  [] __kick_osd_requests+0x1dc/0x240 [libceph]
> 333945 Sep 17 00:31:54 10.0.41.9 [18372.276305]  [] osd_reset+0x57/0xa0 [libceph]
> 333946 Sep 17 00:31:54 10.0.41.9 [18372.276962]  [] con_work+0x112/0x290 [libceph]
> 333947 Sep 17 00:31:54 10.0.41.9 [18372.277608]  [] process_one_work+0x144/0x470
> 333948 Sep 17 00:31:54 10.0.41.9 [18372.278247]  [] worker_thread+0x11e/0x450
> 333949 Sep 17 00:31:54 10.0.41.9 [18372.278880]  [] ? create_worker+0x1f0/0x1f0
> 333950 Sep 17 00:31:54 10.0.41.9 [18372.279543]  [] kthread+0xc9/0xe0
> 333951 Sep 17 00:31:54 10.0.41.9 [18372.280174]  [] ? flush_kthread_worker+0x90/0x90
> 333952 Sep 17 00:31:54 10.0.41.9 [18372.280803]  [] ret_from_fork+0x58/0x90
> 333953 Sep 17 00:31:54 10.0.41.9 [18372.281430]  [] ? flush_kthread_worker+0x90/0x90
>
> static void __kick_osd_requests(struct ceph_osd_client *osdc,
>                                 struct ceph_osd *osd)
> {
>         :
>         list_for_each_entry_safe(req, nreq, &osd->o_linger_requests,
>                                  r_linger_osd_item) {
>                 WARN_ON(!list_empty(&req->r_req_lru_item));
>                 __kick_linger_request(req);
>         }
>         :
> }
>
> - Bart
>
>


Re: [Ceph-community] Getting WARN in __kick_osd_requests doing stress testing

2015-09-18 Thread Ilya Dryomov
On Fri, Sep 18, 2015 at 9:48 AM, Abhishek L wrote:
> Redirecting to ceph-devel, where such a question might have a better
> chance of a reply.
>
> On Fri, Sep 18, 2015 at 4:03 AM,   wrote:
>> I'm running a 3-node cluster, doing osd/rbd creation and deletion, and ran
>> across the WARN below. Note that it only happened once (on one rbd add) after
>> approximately 500 cycles of the test, but I was wondering if someone could
>> explain why this warning happens and how I can prevent it.
>>
>> Here is what my test script is doing:
>>
>> while(1):
>>     create 5 ceph pools   - sleep 2 between each pool create
>>     sleep 5
>>     create 5 ceph volumes - sleep 2 between each volume create
>>     sleep 5
>>     delete 5 ceph volumes - sleep 2 between each volume delete
>>     sleep 5
>>     delete 5 ceph pools   - sleep 2 between each pool delete
>>     sleep 5
>>
>>
>> 333940 Sep 17 00:31:54 10.0.41.9 [18372.272771] Call Trace:
>> 333941 Sep 17 00:31:54 10.0.41.9 [18372.273489]  [] dump_stack+0x45/0x57
>> 333942 Sep 17 00:31:54 10.0.41.9 [18372.274226]  [] warn_slowpath_common+0x97/0xe0
>> 333943 Sep 17 00:31:54 10.0.41.9 [18372.274923]  [] warn_slowpath_null+0x1a/0x20
>> 333944 Sep 17 00:31:54 10.0.41.9 [18372.275635]  [] __kick_osd_requests+0x1dc/0x240 [libceph]
>> 333945 Sep 17 00:31:54 10.0.41.9 [18372.276305]  [] osd_reset+0x57/0xa0 [libceph]
>> 333946 Sep 17 00:31:54 10.0.41.9 [18372.276962]  [] con_work+0x112/0x290 [libceph]
>> 333947 Sep 17 00:31:54 10.0.41.9 [18372.277608]  [] process_one_work+0x144/0x470
>> 333948 Sep 17 00:31:54 10.0.41.9 [18372.278247]  [] worker_thread+0x11e/0x450
>> 333949 Sep 17 00:31:54 10.0.41.9 [18372.278880]  [] ? create_worker+0x1f0/0x1f0
>> 333950 Sep 17 00:31:54 10.0.41.9 [18372.279543]  [] kthread+0xc9/0xe0
>> 333951 Sep 17 00:31:54 10.0.41.9 [18372.280174]  [] ? flush_kthread_worker+0x90/0x90
>> 333952 Sep 17 00:31:54 10.0.41.9 [18372.280803]  [] ret_from_fork+0x58/0x90
>> 333953 Sep 17 00:31:54 10.0.41.9 [18372.281430]  [] ? flush_kthread_worker+0x90/0x90
>>
>> static void __kick_osd_requests(struct ceph_osd_client *osdc,
>>                                 struct ceph_osd *osd)
>> {
>>         :
>>         list_for_each_entry_safe(req, nreq, &osd->o_linger_requests,
>>                                  r_linger_osd_item) {
>>                 WARN_ON(!list_empty(&req->r_req_lru_item));
>>                 __kick_linger_request(req);
>>         }
>>         :
>> }

What is your kernel version?

There is no mention of rbd map/unmap in the pseudo code you provided.
How are you mapping/unmapping those rbd images?  More details or the
script itself would be nice to see.
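
(For context, the kernel rbd client is normally exercised with "rbd map" and
"rbd unmap"; a minimal sketch of that step, using a placeholder pool/image
name rather than anything from the original post, might look like:)

    import subprocess

    # Placeholder names; substitute whatever the test actually creates.
    pool, image = "stress_pool_0_0", "vol0"

    # "rbd map" prints the block device it attached, e.g. /dev/rbd0.
    dev = subprocess.check_output(
        ["rbd", "map", "%s/%s" % (pool, image)]).decode().strip()

    # ... exercise the device here ...

    subprocess.check_call(["rbd", "unmap", dev])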

Thanks,

Ilya