Ilya,
I was able to gather the following syslog entries; the full syslog is
attached. Please take a look and let me know whether it helps.

This is the trace I see:

Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.283268] Workqueue: ceph-msgr con_work [libceph]
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.291641] task: ffff880fb6868000 ti: ffff880ffaa2a000 task.ti: ffff880ffaa2a000
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.304503] RIP: 0010:[<ffffffffa035a40e>]  [<ffffffffa035a40e>] osd_reset+0x22e/0x2c0 [libceph]
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.319808] RSP: 0018:ffff880ffaa2bd80  EFLAGS: 00010206
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.328659] RAX: ffff881012fb4ca8 RBX: ffff8810114a9750 RCX: ffff881012790050
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.599331] RDX: ffff881012fb4ca8 RSI: 0000000086588656 RDI: 0000000000000286
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.703539] RBP: ffff880ffaa2bdd8 R08: 0000000000000000 R09: 0000000000000000
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.810053] R10: ffffffff81600edf R11: ffffea003fef7a00 R12: ffff881012fb4c58
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371757.918811] R13: ffff8810114a9810 R14: ffff881012790000 R15: ffff881012790020
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029661] libceph: osd32 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029662] libceph: osd33 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029662] libceph: osd38 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029662] libceph: osd39 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029663] libceph: osd40 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029663] libceph: osd47 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029663] libceph: osd48 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029663] libceph: osd49 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029664] libceph: osd50 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029664] libceph: osd51 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029664] libceph: osd52 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029665] libceph: osd53 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.029665] libceph: osd57 down
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.631655] FS:  0000000000000000(0000) GS:ffff88101f300000(0000) knlGS:0000000000000000
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.700074] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.734306] CR2: 00007f0bbad49000 CR3: 0000000001c0e000 CR4: 00000000001407e0
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.800693] Stack:
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.832457]  ffff8810114a97a8 ffff8810114a9760 ffff881012fb4800 ffff881012fb4ca8
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.897340]  ffff880ffaa2bda0 ffff880ffaa2bda0 ffff881012fb4c10 ffff881012fb4830
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371758.962318]  ffff881012fb49b0 ffff881012fb4860 0000000000000011 ffff880ffaa2be20
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.027390] Call Trace:
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.058230]  [<ffffffffa03549e8>] con_work+0x298/0x640 [libceph]
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.089619]  [<ffffffff810838a2>] process_one_work+0x182/0x450
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.120139]  [<ffffffff81084641>] worker_thread+0x121/0x410
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.149533]  [<ffffffff81084520>] ? rescuer_thread+0x3e0/0x3e0
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.179041]  [<ffffffff8108b312>] kthread+0xd2/0xf0
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.209159]  [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.240921]  [<ffffffff8172637c>] ret_from_fork+0x7c/0xb0
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.273511]  [<ffffffff8108b240>] ? kthread_create_on_node+0x1d0/0x1d0
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.307636] Code: ff ff 48 89 df e8 e3 f1 ff ff 48 8b 7d a8 e8 7a 1c 3c e1 48 8b 7d b0 e8 41 68 d5 e0 48 83 c4 30 5b 41 5c 41 5d 41 5e 41 5f 5d c3 <0f> 0b 48 8b 45 b8 49 8b 0e 4c 89 f2 48 c7 c6 d0 e6 36 a0 48 c7
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.421674] RIP  [<ffffffffa035a40e>] osd_reset+0x22e/0x2c0 [libceph]
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.462127]  RSP <ffff880ffaa2bd80>
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.567952] ---[ end trace 37d00d439ac66995 ]---
Dec  9 01:38:17 rack1-ramp-5 kernel: [1371759.614230] BUG: unable to handle kernel paging request at ffffffffffffffd8
Dec  9 01:38:17 rack1-ramp-5 kernel: [1371759.659349] IP: [<ffffffff8108b9b0>] kthread_data+0x10/0x20
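
For anyone who wants to pull the call trace out of the attached syslog, here is
a minimal sketch. It is not part of libceph or any Ceph tool; the function
names (unwrap, call_trace) and the regular expressions are my own assumptions
about the line format shown above. It also tolerates records that were
hard-wrapped in transit by joining continuation lines back onto the record
that carries the syslog timestamp.

```python
import re

# A syslog record from the kernel starts like:
#   "Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.027390] ..."
STAMP_RE = re.compile(r"^\w{3}\s+\d+\s+\d\d:\d\d:\d\d\s+\S+\s+kernel:")

# A stack frame looks like "[<ffffffffa03549e8>] con_work+0x298/0x640 [libceph]".
# A "?" before the symbol marks a frame the unwinder is not sure about.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]{16})>\]\s+(?P<unsure>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(?P<mod>\w+)\])?"
)


def unwrap(lines):
    """Join wrapped continuation lines onto the preceding syslog record."""
    records = []
    for line in lines:
        if STAMP_RE.match(line) or not records:
            records.append(line.rstrip())
        else:
            records[-1] = records[-1].rstrip() + " " + line.strip()
    return records


def call_trace(records):
    """Return (symbol, module, unsure) tuples for the first call trace found."""
    frames = []
    in_trace = False
    for rec in records:
        if "Call Trace:" in rec:
            in_trace = True
            continue
        if in_trace:
            m = FRAME_RE.search(rec)
            if m is None:
                break  # first non-frame record ends the trace
            frames.append((m.group("sym"), m.group("mod"), bool(m.group("unsure"))))
    return frames


# A few records copied from the trace above, in wrapped form.
SAMPLE = """\
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.027390] Call Trace:
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.058230]  [<ffffffffa03549e8>]
con_work+0x298/0x640 [libceph]
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.089619]  [<ffffffff810838a2>]
process_one_work+0x182/0x450
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.149533]  [<ffffffff81084520>] ?
rescuer_thread+0x3e0/0x3e0
Dec  9 01:38:01 rack1-ramp-5 kernel: [1371759.307636] Code: ff ff 48 89 df
"""

if __name__ == "__main__":
    for sym, mod, unsure in call_trace(unwrap(SAMPLE.splitlines())):
        print(("? " if unsure else "  ") + sym + (" [%s]" % mod if mod else ""))
```

To run it against the real file, replace SAMPLE.splitlines() with the lines
read from the extracted syslog.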

Thanks & Regards
Somnath

-----Original Message-----
From: Somnath Roy
Sent: Monday, January 05, 2015 1:08 PM
To: 'Ilya Dryomov'
Cc: Chaitanya Huilgol; [email protected]
Subject: RE: Ceph-client branch for Ubuntu 14.04.1 LTS (3.13.0-x kernels)

It happens both when the client is idle and when it is under load.
I don't have the trace right now but will get you one soon.

Thanks & Regards
Somnath

-----Original Message-----
From: Ilya Dryomov [mailto:[email protected]]
Sent: Monday, January 05, 2015 12:34 PM
To: Somnath Roy
Cc: Chaitanya Huilgol; [email protected]
Subject: Re: Ceph-client branch for Ubuntu 14.04.1 LTS (3.13.0-x kernels)

On Mon, Jan 5, 2015 at 11:01 PM, Somnath Roy <[email protected]> wrote:
> Ilya,
> Here are the steps:
>
> 1. Set up a 3-node cluster with replication set to 3.
>
> 2. Map a krbd image on a client.
>
> 3. Reboot, or stop the ceph services on, one or more nodes.
>
> 4. The client with the krbd image mapped crashes.
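
The steps above could be scripted roughly as follows. This is only a sketch
under assumptions that are not in the report: the image spec and node names are
placeholders, the nodes are assumed reachable over ssh, and the services are
assumed to be managed by upstart (hence "stop ceph-all"); on a sysvinit setup
the stop command would differ. The helper name repro_plan is hypothetical. It
only builds the command list and does not execute anything.

```python
import shlex


def repro_plan(image_spec, nodes_to_stop):
    """Return the reproduction commands, in order, without running them.

    image_spec     -- "pool/image" to map with krbd (placeholder value)
    nodes_to_stop  -- hostnames of cluster nodes whose ceph services get stopped
    """
    # Step 2: map the image on the client through the kernel rbd driver.
    plan = [["sudo", "rbd", "map", image_spec]]
    # Step 3: stop the ceph services on one or more nodes (assumes upstart).
    for node in nodes_to_stop:
        plan.append(["ssh", node, "sudo", "stop", "ceph-all"])
    # Step 4: look at the client's kernel log for the oops.
    plan.append(["dmesg"])
    return plan


if __name__ == "__main__":
    # "rbd/test-image" and "node1" are made-up example values.
    for cmd in repro_plan("rbd/test-image", ["node1"]):
        print(" ".join(shlex.quote(part) for part in cmd))
```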

Is it idle or under load?

Do you have a trace of the crash?

Thanks,

                Ilya

Attachment: syslog.tar.gz