John,
> Here's a representative one. I got this kind of thing every time an older
> client tried to connect, and the problem went away as soon as I shut down or
> updated the older clients. I didn't do much further debugging than that.
>
> Feb 22 08:28:36 localhost Lustre: 11093:0:(lib-move.c:1644:lnet_parse_put())
> Dropping PUT from [EMAIL PROTECTED] portal 6 match
> 15660 offset 0 length 240: 2
> Feb 22 08:28:36 localhost LustreError:
> 11215:0:(pack_generic.c:809:lustre_unpack_msg()) bad lustre msg magic:
> 0XBD00BD2
> Feb 22 08:28:36 localhost LustreError:
> 11215:0:(service.c:557:ptlrpc_server_handle_request()) error unpacking
> request: ptl 12 from
> [EMAIL PROTECTED] xid 15659
> Feb 22 08:28:36 localhost LustreError:
> 11215:0:(pack_generic.c:1298:lustre_msg_get_opc()) ASSERTION(0)
> failed:incorrect message
> magic: 0bd00bd2
> Feb 22 08:28:36 localhost LustreError:
> 11215:0:(pack_generic.c:1298:lustre_msg_get_opc()) LBUG
> Feb 22 08:28:36 localhost Lustre:
> 11215:0:(linux-debug.c:166:libcfs_debug_dumpstack()) showing stack for
> process 11215
> Feb 22 08:28:36 localhost ll_mdt_01 R running task 0 11215 1
> 11216 11178 (L-TLB)
> Feb 22 08:28:36 localhost ffff8100f39f1d18 0000000000000003 ffff810005d3f980
> 0000000000000286
> Feb 22 08:28:36 localhost 0000000000000003 ffff8100f9b24080 ffffffff88120210
> 0000000000000512
> Feb 22 08:28:36 localhost 0000000000000000 0000000000000000
> Feb 22 08:28:36 localhost Call Trace:<ffffffff8010f53f>{show_trace+527}
> <ffffffff8010f6b5>{show_stack+229}
> Feb 22 08:28:36 localhost <ffffffff88000d0a>{:libcfs:lbug_with_loc+122}
> <ffffffff880feb2d>{:ptlrpc:lustre_msg_get_opc+285}
> Feb 22 08:28:36 localhost <ffffffff8810a208>{:ptlrpc:ptlrpc_main+5784}
> <ffffffff80131440>{default_wake_function+0}
> Feb 22 08:28:36 localhost <ffffffff8010ebc2>{child_rip+8}
> <ffffffff88108b70>{:ptlrpc:ptlrpc_main+0}
> Feb 22 08:28:36 localhost <ffffffff8010ebba>{child_rip+0}

Phew! The socklnd is working OK, but these two betas aren't interoperable at
the Lustre protocol level. Our bad for the assertion failure, though: a node
shouldn't fall over just because someone spoke garbage to it!
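
To illustrate the point about not falling over, here is a rough sketch (in Python, not the actual ptlrpc code) of a handler that treats an unrecognized magic as a bad request and drops it instead of asserting. Everything in it is an assumption made for illustration: the names `SUPPORTED_MAGICS`, `BadMessage`, `unpack_magic` and `handle_request` are hypothetical, the supported magic value is a placeholder, and the layout (a little-endian 32-bit magic at offset 0) is assumed rather than taken from the Lustre source.

```python
import struct

# Hypothetical: the set of magics this node understands. The value is a
# placeholder for illustration, not taken from the Lustre source.
SUPPORTED_MAGICS = frozenset({0x0BD00BD0})


class BadMessage(Exception):
    """Raised for a request that cannot be unpacked; the caller drops it."""


def unpack_magic(buf: bytes) -> int:
    """Return the leading 32-bit magic, or raise BadMessage if unusable.

    Assumes a little-endian magic at offset 0 (an illustrative layout).
    """
    if len(buf) < 4:
        raise BadMessage("short message: %d bytes" % len(buf))
    (magic,) = struct.unpack_from("<I", buf, 0)
    if magic not in SUPPORTED_MAGICS:
        raise BadMessage("bad msg magic: 0x%08x" % magic)
    return magic


def handle_request(buf: bytes) -> bool:
    """Process one request; return False (drop it) instead of asserting."""
    try:
        unpack_magic(buf)
    except BadMessage:
        # Garbage from a peer is the peer's problem: reject and carry on.
        return False
    # Only look at the opcode and other fields once the magic has checked out.
    return True
```

With this shape, a request carrying the magic from the log above (0x0bd00bd2) would simply be rejected, and the service thread would keep running.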
Cheers,
Eric