John,

> Here's a representative one.  I got this kind of thing every time an older
> client tried to connect, and the problem went away as soon as I shut down or
> updated the older clients.  I didn't do much further debugging than that.
> 
> Feb 22 08:28:36 localhost Lustre: 11093:0:(lib-move.c:1644:lnet_parse_put()) 
> Dropping PUT from [EMAIL PROTECTED] portal 6 match 15660 offset 0 length 240: 2
> Feb 22 08:28:36 localhost LustreError: 
> 11215:0:(pack_generic.c:809:lustre_unpack_msg()) bad lustre msg magic: 
> 0XBD00BD2
> Feb 22 08:28:36 localhost LustreError: 
> 11215:0:(service.c:557:ptlrpc_server_handle_request()) error unpacking 
> request: ptl 12 from [EMAIL PROTECTED] xid 15659
> Feb 22 08:28:36 localhost LustreError: 
> 11215:0:(pack_generic.c:1298:lustre_msg_get_opc()) ASSERTION(0) 
> failed:incorrect message magic: 0bd00bd2
> Feb 22 08:28:36 localhost LustreError: 
> 11215:0:(pack_generic.c:1298:lustre_msg_get_opc()) LBUG
> Feb 22 08:28:36 localhost Lustre: 
> 11215:0:(linux-debug.c:166:libcfs_debug_dumpstack()) showing stack for 
> process 11215
> Feb 22 08:28:36 localhost ll_mdt_01     R  running task       0 11215      1  
>        11216 11178 (L-TLB)
> Feb 22 08:28:36 localhost ffff8100f39f1d18 0000000000000003 ffff810005d3f980 
> 0000000000000286 
> Feb 22 08:28:36 localhost 0000000000000003 ffff8100f9b24080 ffffffff88120210 
> 0000000000000512 
> Feb 22 08:28:36 localhost 0000000000000000 0000000000000000 
> Feb 22 08:28:36 localhost Call Trace:<ffffffff8010f53f>{show_trace+527} 
> <ffffffff8010f6b5>{show_stack+229}
> Feb 22 08:28:36 localhost <ffffffff88000d0a>{:libcfs:lbug_with_loc+122} 
> <ffffffff880feb2d>{:ptlrpc:lustre_msg_get_opc+285}
> Feb 22 08:28:36 localhost <ffffffff8810a208>{:ptlrpc:ptlrpc_main+5784} 
> <ffffffff80131440>{default_wake_function+0}
> Feb 22 08:28:36 localhost <ffffffff8010ebc2>{child_rip+8} 
> <ffffffff88108b70>{:ptlrpc:ptlrpc_main+0}
> Feb 22 08:28:36 localhost <ffffffff8010ebba>{child_rip+0} 

Phew!  The socklnd is working OK, but these two betas aren't interoperable at 
the Lustre protocol level.  Our bad for causing an assertion failure, though: 
a node shouldn't fall over just because someone spoke garbage to it!
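
For what it's worth, the defensive version of that check is conceptually 
simple.  Here is a rough sketch of the idea in C (simplified and hypothetical; 
the struct, constant and function names below are made up for illustration 
and this is not the actual Lustre code): validate the magic up front and 
return a protocol error instead of asserting, so a garbled or incompatible 
request just gets dropped and the service thread carries on.

    #include <errno.h>
    #include <inttypes.h>
    #include <stdint.h>
    #include <stdio.h>

    /* Hypothetical, simplified wire header -- illustration only, not the
     * real struct lustre_msg layout. */
    struct wire_msg {
            uint32_t magic;
            uint32_t opc;
    };

    #define WIRE_MSG_MAGIC 0x0BD00BD0u  /* placeholder magic value */

    /*
     * Return the opcode if the magic is recognized; otherwise log it and
     * return -EPROTO so the caller can drop the request.  No assertion,
     * so one bad or old client can't take the whole server down.
     */
    static int wire_msg_get_opc(const struct wire_msg *msg, uint32_t *opc)
    {
            if (msg->magic != WIRE_MSG_MAGIC) {
                    fprintf(stderr, "bad msg magic: 0x%08" PRIX32
                            ", dropping request\n", msg->magic);
                    return -EPROTO;
            }
            *opc = msg->opc;
            return 0;
    }

    int main(void)
    {
            struct wire_msg good = { WIRE_MSG_MAGIC, 101 };
            struct wire_msg bad  = { 0x0BD00BD2u, 0 };  /* the magic from the log */
            uint32_t opc = 0;

            printf("good request: rc=%d\n", wire_msg_get_opc(&good, &opc));
            printf("bad request:  rc=%d\n", wire_msg_get_opc(&bad, &opc));
            return 0;
    }

The real change obviously has to go through the ptlrpc unpack path, but the 
principle is the same: reject, log, and keep serving.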

                Cheers,
                        Eric

