Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11548
Created an attachment (id=9543) --> (https://bugzilla.lustre.org/attachment.cgi?id=9543&action=view)
patch to print router NID on checksum failures

This patch adds a 'sender' field to the LNET completion event and the lustre bulk descriptor. This records the NID of the peer that the node received a message from, as well as the message initiator. If they are different, then 'sender' is the NID of the last router that forwarded the message.

If 'sender' != initiator.nid when a bulk checksum fails (WRITEs on the server, READs on the client), this patch includes "via <router NID>" in the error message so that the last router that forwarded the bulk data can be identified.

Please note that this patch has had minimal testing: I've checked that 'sender' is being set correctly, but I've not forced checksum errors etc.
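
To illustrate the 'sender' vs. initiator comparison described above, here is a minimal standalone sketch. It is not code from the attached patch: the type example_nid_t, the struct example_bulk_desc, both function names and the exact message wording are hypothetical, chosen only to show the "via <router NID>" logic.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an LNET NID; the real type is not shown in this post. */
typedef uint64_t example_nid_t;

/* Hypothetical subset of a bulk descriptor: only the two NIDs discussed above. */
struct example_bulk_desc {
    example_nid_t initiator; /* NID of the node that originated the message */
    example_nid_t sender;    /* NID of the peer the message was received from */
};

/* A message was routed if the peer it arrived from is not its initiator. */
static int example_was_routed(const struct example_bulk_desc *desc)
{
    return desc->sender != desc->initiator;
}

/*
 * Format an (assumed) checksum error message into buf.
 * " via <router NID>" is appended only when the message was forwarded,
 * in which case 'sender' is the last router on the path.
 * Returns the value returned by snprintf().
 */
static int example_format_csum_error(char *buf, size_t len,
                                     const struct example_bulk_desc *desc)
{
    assert(buf != NULL && desc != NULL);

    if (example_was_routed(desc))
        return snprintf(buf, len,
                        "bulk checksum error from 0x%" PRIx64 " via 0x%" PRIx64,
                        desc->initiator, desc->sender);

    return snprintf(buf, len, "bulk checksum error from 0x%" PRIx64,
                    desc->initiator);
}
```

Given a descriptor whose 'sender' equals its initiator, the formatted message names only the initiator. When the two differ, the message also names the last router, which is the information the patch aims to expose.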
