I hit the following crash on the server side while testing grmpp with the latest bits. The parameters to the test are: "rmpp=1" "message_size=1000" "responses=1". With "responses=0", I don't see the issue.
Has anyone else seen this? I haven't run grmpp in a while, so I'm not sure if this crash is related to the latest check-in or not. I'll look into this more later this afternoon.

- Sean

Feb 27 13:12:50 mshefty-linux1 kernel: grmpp: starting server
Feb 27 13:14:44 mshefty-linux1 kernel: Madeye:recv GMP
Feb 27 13:14:44 mshefty-linux1 kernel: MAD version....0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Class..........0x4a (Unknown vendor/application)
Feb 27 13:14:44 mshefty-linux1 kernel: Class version..0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Method.........0x3 (Send)
Feb 27 13:14:44 mshefty-linux1 kernel: Status.........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Class specific.0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Trans ID.......0x10000000f000000
Feb 27 13:14:44 mshefty-linux1 kernel: Attr ID........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Attr modifier..0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP version...0x1
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP type......0x1 (Data)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP RRespTime.0x0
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP flags.....0x3 (Active - First)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP status....0x0
Feb 27 13:14:44 mshefty-linux1 kernel: Seg number.....0x0001
Feb 27 13:14:44 mshefty-linux1 kernel: Payload len....0x03fc
Feb 27 13:14:44 mshefty-linux1 kernel: Madeye:sent GMP
Feb 27 13:14:44 mshefty-linux1 kernel: MAD version....0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Class..........0x4a (Unknown vendor/application)
Feb 27 13:14:44 mshefty-linux1 kernel: Class version..0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Method.........0x83 (Send response)
Feb 27 13:14:44 mshefty-linux1 kernel: Status.........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Class specific.0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Trans ID.......0x10000000f000000
Feb 27 13:14:44 mshefty-linux1 kernel: Attr ID........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Attr modifier..0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP version...0x1
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP type......0x2 (Ack)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP RRespTime.0x0
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP flags.....0x1 (Active)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP status....0x0
Feb 27 13:14:44 mshefty-linux1 kernel: Seg number.....0x0001
Feb 27 13:14:44 mshefty-linux1 kernel: New window.....0x0041
Feb 27 13:14:44 mshefty-linux1 kernel: Madeye:recv GMP
Feb 27 13:14:44 mshefty-linux1 kernel: MAD version....0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Class..........0x4a (Unknown vendor/application)
Feb 27 13:14:44 mshefty-linux1 kernel: Class version..0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Method.........0x3 (Send)
Feb 27 13:14:44 mshefty-linux1 kernel: Status.........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Class specific.0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Trans ID.......0x10000000f000000
Feb 27 13:14:44 mshefty-linux1 kernel: Attr ID........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Attr modifier..0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP version...0x1
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP type......0x1 (Data)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP RRespTime.0x0
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP flags.....0x1 (Active)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP status....0x0
Feb 27 13:14:44 mshefty-linux1 kernel: Seg number.....0x0002
Feb 27 13:14:44 mshefty-linux1 kernel: Payload len....0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: Madeye:recv GMP
Feb 27 13:14:44 mshefty-linux1 kernel: MAD version....0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Class..........0x4a (Unknown vendor/application)
Feb 27 13:14:44 mshefty-linux1 kernel: Class version..0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Method.........0x3 (Send)
Feb 27 13:14:44 mshefty-linux1 kernel: Status.........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Class specific.0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Trans ID.......0x10000000f000000
Feb 27 13:14:44 mshefty-linux1 kernel: Attr ID........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Attr modifier..0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP version...0x1
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP type......0x1 (Data)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP RRespTime.0x0
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP flags.....0x1 (Active)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP status....0x0
Feb 27 13:14:44 mshefty-linux1 kernel: Seg number.....0x0003
Feb 27 13:14:44 mshefty-linux1 kernel: Payload len....0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: Madeye:recv GMP
Feb 27 13:14:44 mshefty-linux1 kernel: MAD version....0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Class..........0x4a (Unknown vendor/application)
Feb 27 13:14:44 mshefty-linux1 kernel: Class version..0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Method.........0x3 (Send)
Feb 27 13:14:44 mshefty-linux1 kernel: Status.........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Class specific.0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Trans ID.......0x10000000f000000
Feb 27 13:14:44 mshefty-linux1 kernel: Attr ID........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Attr modifier..0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP version...0x1
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP type......0x1 (Data)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP RRespTime.0x0
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP flags.....0x1 (Active)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP status....0x0
Feb 27 13:14:44 mshefty-linux1 kernel: Seg number.....0x0004
Feb 27 13:14:44 mshefty-linux1 kernel: Payload len....0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: Madeye:recv GMP
Feb 27 13:14:44 mshefty-linux1 kernel: MAD version....0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Class..........0x4a (Unknown vendor/application)
Feb 27 13:14:44 mshefty-linux1 kernel: Class version..0x1
Feb 27 13:14:44 mshefty-linux1 kernel: Method.........0x3 (Send)
Feb 27 13:14:44 mshefty-linux1 kernel: Status.........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Class specific.0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Trans ID.......0x10000000f000000
Feb 27 13:14:44 mshefty-linux1 kernel: Attr ID........0x00
Feb 27 13:14:44 mshefty-linux1 kernel: Attr modifier..0x0000
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP version...0x1
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP type......0x1 (Data)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP RRespTime.0x0
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP flags.....0x5 (Active - Last)
Feb 27 13:14:44 mshefty-linux1 kernel: RMPP status....0x0
Feb 27 13:14:44 mshefty-linux1 kernel: Seg number.....0x0005
Feb 27 13:14:44 mshefty-linux1 kernel: Payload len....0x008c
Feb 27 13:14:44 mshefty-linux1 kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000014
Feb 27 13:14:44 mshefty-linux1 kernel: printing eip:
Feb 27 13:14:44 mshefty-linux1 kernel: f8db2d89
Feb 27 13:14:44 mshefty-linux1 kernel: *pde = 3c448067
Feb 27 13:14:44 mshefty-linux1 kernel: Oops: 0000 [#1]
Feb 27 13:14:44 mshefty-linux1 kernel: SMP
Feb 27 13:14:44 mshefty-linux1 kernel: Modules linked in: ib_grmpp ib_sa ib_addr ib_madeye ib_mthca ib_mad ib_core edd evdev joydev st sr_mod ide_cd cdrom nvram usbserial parport_pc lp parport ipv6 thermal processor fan button battery ac af_packet e1000 i2c_i801 i2c_core hw_random uhci_hcd usbcore reiserfs aic7xxx scsi_transport_spi sd_mod scsi_mod
Feb 27 13:14:44 mshefty-linux1 kernel: CPU: 0
Feb 27 13:14:44 mshefty-linux1 kernel: EIP: 0060:[pg0+949489033/1069335552] Not tainted VLI
Feb 27 13:14:44 mshefty-linux1 kernel: EIP: 0060:[<f8db2d89>] Not tainted VLI
Feb 27 13:14:44 mshefty-linux1 kernel: EFLAGS: 00010092 (2.6.16-rc1)
Feb 27 13:14:44 mshefty-linux1 kernel: EIP is at mthca_ah_grh_present+0x1/0xf [ib_mthca]
Feb 27 13:14:44 mshefty-linux1 kernel: eax: 00000000 ebx: f6d9796c ecx: 00000002 edx: f60b8440
Feb 27 13:14:44 mshefty-linux1 kernel: esi: f6d9786c edi: dd017100 ebp: f1db9d94 esp: f1db9d74
Feb 27 13:14:44 mshefty-linux1 kernel: ds: 007b es: 007b ss: 0068
Feb 27 13:14:44 mshefty-linux1 kernel: Process ib_mad1 (pid: 12158, threadinfo=f1db8000 task=dff7b570)
Feb 27 13:14:44 mshefty-linux1 kernel: Stack: <0>f1db9d94 f8db1702 00000002 e2710000 00000286 f60b8440 dd017110 f60b8440
Feb 27 13:14:44 mshefty-linux1 kernel: f1db9df8 f8db1bc4 f60b8440 dd017100 dd017110 00000002 00000001 00000000
Feb 27 13:14:44 mshefty-linux1 kernel: 00000002 00000000 00000001 c01522b4 00000000 00000000 00000092 dd017080
Feb 27 13:14:44 mshefty-linux1 kernel: Call Trace:
Feb 27 13:14:44 mshefty-linux1 kernel: [show_stack_log_lvl+174/182] show_stack_log_lvl+0xae/0xb6
Feb 27 13:14:44 mshefty-linux1 kernel: [<c01049cf>] show_stack_log_lvl+0xae/0xb6
Feb 27 13:14:44 mshefty-linux1 kernel: [show_registers+244/348] show_registers+0xf4/0x15c
Feb 27 13:14:44 mshefty-linux1 kernel: [<c0104af1>] show_registers+0xf4/0x15c
Feb 27 13:14:44 mshefty-linux1 kernel: [die+249/365] die+0xf9/0x16d
Feb 27 13:14:44 mshefty-linux1 kernel: [<c0104cc6>] die+0xf9/0x16d
Feb 27 13:14:44 mshefty-linux1 kernel: [do_page_fault+900/1220] do_page_fault+0x384/0x4c4
Feb 27 13:14:44 mshefty-linux1 kernel: [<c0115126>] do_page_fault+0x384/0x4c4
Feb 27 13:14:44 mshefty-linux1 kernel: [error_code+79/96] error_code+0x4f/0x60
Feb 27 13:14:44 mshefty-linux1 kernel: [<c010463f>] error_code+0x4f/0x60
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949484484/1069335552] mthca_tavor_post_send+0x2f3/0x56b [ib_mthca]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8db1bc4>] mthca_tavor_post_send+0x2f3/0x56b [ib_mthca]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949924455/1069335552] ib_send_mad+0xdb/0x110 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1d267>] ib_send_mad+0xdb/0x110 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949937804/1069335552] send_next_seg+0xcb/0xd2 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e2068c>] send_next_seg+0xcb/0xd2 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949939122/1069335552] ib_send_rmpp_mad+0xa3/0xb5 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e20bb2>] ib_send_rmpp_mad+0xa3/0xb5 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949924790/1069335552] ib_post_send_mad+0x11a/0x1a8 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1d3b6>] ib_post_send_mad+0x11a/0x1a8 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949597095/1069335552] send_response+0xbb/0xe3 [ib_grmpp]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8dcd3a7>] send_response+0xbb/0xe3 [ib_grmpp]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949597306/1069335552] recv_handler+0x3d/0x51 [ib_grmpp]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8dcd47a>] recv_handler+0x3d/0x51 [ib_grmpp]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949927683/1069335552] ib_mad_complete_recv+0xf3/0x121 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1df03>] ib_mad_complete_recv+0xf3/0x121 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949928209/1069335552] ib_mad_recv_done_handler+0x1e0/0x215 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1e111>] ib_mad_recv_done_handler+0x1e0/0x215 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949929484/1069335552] ib_mad_completion_handler+0x45/0x7a [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1e60c>] ib_mad_completion_handler+0x45/0x7a [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [run_workqueue+130/195] run_workqueue+0x82/0xc3
Feb 27 13:14:44 mshefty-linux1 kernel: [<c012a304>] run_workqueue+0x82/0xc3
Feb 27 13:14:44 mshefty-linux1 kernel: [worker_thread+248/298] worker_thread+0xf8/0x12a
Feb 27 13:14:44 mshefty-linux1 kernel: [<c012a43d>] worker_thread+0xf8/0x12a
Feb 27 13:14:44 mshefty-linux1 kernel: [kthread+120/160] kthread+0x78/0xa0
Feb 27 13:14:44 mshefty-linux1 kernel: [<c012cfb2>] kthread+0x78/0xa0
Feb 27 13:14:44 mshefty-linux1 kernel: [kernel_thread_helper+5/11] kernel_thread_helper+0x5/0xb
Feb 27 13:14:44 mshefty-linux1 kernel: [<c0101be5>] kernel_thread_helper+0x5/0xb
Feb 27 13:14:44 mshefty-linux1 kernel: Code: 0f ac ca 05 e8 43 90 ff ff eb 1b 8b 4a 18 8b 80 34 07 00 00 8b 52 14 e8 71 04 49 c7 eb 08 8b 42 14 e8 8e 10 3a c7 31 c0 5d c3 55 <8b> 40 14 89 e5 5d 0f be 40 05 c1 e8 1f c3 55 89 e5 57 56 53 53
Feb 27 13:14:44 mshefty-linux1 kernel: <3>Debug: sleeping function called from invalid context at include/linux/rwsem.h:43
Feb 27 13:14:44 mshefty-linux1 kernel: in_atomic():0, irqs_disabled():1
Feb 27 13:14:44 mshefty-linux1 kernel: [show_trace+13/15] show_trace+0xd/0xf
Feb 27 13:14:44 mshefty-linux1 kernel: [<c010491f>] show_trace+0xd/0xf
Feb 27 13:14:44 mshefty-linux1 kernel: [dump_stack+21/23] dump_stack+0x15/0x17
Feb 27 13:14:44 mshefty-linux1 kernel: [<c01049fb>] dump_stack+0x15/0x17
Feb 27 13:14:44 mshefty-linux1 kernel: [__might_sleep+143/153] __might_sleep+0x8f/0x99
Feb 27 13:14:44 mshefty-linux1 kernel: [<c011a1f0>] __might_sleep+0x8f/0x99
Feb 27 13:14:44 mshefty-linux1 kernel: [profile_task_exit+27/71] profile_task_exit+0x1b/0x47
Feb 27 13:14:44 mshefty-linux1 kernel: [<c011d36f>] profile_task_exit+0x1b/0x47
Feb 27 13:14:44 mshefty-linux1 kernel: [do_exit+27/853] do_exit+0x1b/0x355
Feb 27 13:14:44 mshefty-linux1 kernel: [<c011eb75>] do_exit+0x1b/0x355
Feb 27 13:14:44 mshefty-linux1 kernel: [do_trap+0/150] do_trap+0x0/0x96
Feb 27 13:14:44 mshefty-linux1 kernel: [<c0104d3a>] do_trap+0x0/0x96
Feb 27 13:14:44 mshefty-linux1 kernel: [do_page_fault+900/1220] do_page_fault+0x384/0x4c4
Feb 27 13:14:44 mshefty-linux1 kernel: [<c0115126>] do_page_fault+0x384/0x4c4
Feb 27 13:14:44 mshefty-linux1 kernel: [error_code+79/96] error_code+0x4f/0x60
Feb 27 13:14:44 mshefty-linux1 kernel: [<c010463f>] error_code+0x4f/0x60
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949484484/1069335552] mthca_tavor_post_send+0x2f3/0x56b [ib_mthca]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8db1bc4>] mthca_tavor_post_send+0x2f3/0x56b [ib_mthca]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949924455/1069335552] ib_send_mad+0xdb/0x110 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1d267>] ib_send_mad+0xdb/0x110 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949937804/1069335552] send_next_seg+0xcb/0xd2 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e2068c>] send_next_seg+0xcb/0xd2 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949939122/1069335552] ib_send_rmpp_mad+0xa3/0xb5 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e20bb2>] ib_send_rmpp_mad+0xa3/0xb5 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949924790/1069335552] ib_post_send_mad+0x11a/0x1a8 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1d3b6>] ib_post_send_mad+0x11a/0x1a8 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949597095/1069335552] send_response+0xbb/0xe3 [ib_grmpp]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8dcd3a7>] send_response+0xbb/0xe3 [ib_grmpp]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949597306/1069335552] recv_handler+0x3d/0x51 [ib_grmpp]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8dcd47a>] recv_handler+0x3d/0x51 [ib_grmpp]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949927683/1069335552] ib_mad_complete_recv+0xf3/0x121 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1df03>] ib_mad_complete_recv+0xf3/0x121 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949928209/1069335552] ib_mad_recv_done_handler+0x1e0/0x215 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1e111>] ib_mad_recv_done_handler+0x1e0/0x215 [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [pg0+949929484/1069335552] ib_mad_completion_handler+0x45/0x7a [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [<f8e1e60c>] ib_mad_completion_handler+0x45/0x7a [ib_mad]
Feb 27 13:14:44 mshefty-linux1 kernel: [run_workqueue+130/195] run_workqueue+0x82/0xc3
Feb 27 13:14:44 mshefty-linux1 kernel: [<c012a304>] run_workqueue+0x82/0xc3
Feb 27 13:14:44 mshefty-linux1 kernel: [worker_thread+248/298] worker_thread+0xf8/0x12a
Feb 27 13:14:44 mshefty-linux1 kernel: [<c012a43d>] worker_thread+0xf8/0x12a
Feb 27 13:14:44 mshefty-linux1 kernel: [kthread+120/160] kthread+0x78/0xa0
Feb 27 13:14:44 mshefty-linux1 kernel: [<c012cfb2>] kthread+0x78/0xa0
Feb 27 13:14:44 mshefty-linux1 kernel: [kernel_thread_helper+5/11] kernel_thread_helper+0x5/0xb
Feb 27 13:14:44 mshefty-linux1 kernel: [<c0101be5>] kernel_thread_helper+0x5/0xb
