Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=11327
Created an attachment (id=9514)
 --> (https://bugzilla.lustre.org/attachment.cgi?id=9514&action=view)
proposed patch against 1.4.8-5chaos

Proposed fix to resolve connection / eviction race. Here's the basic race; full details can be seen in the attached kernel dk log.

ping_evictor                      ll_ost_io_77
---------------------------------------------------------------------
- ping_evictor_main()
- timeout evict client
- exp->exp_fail = 1
                                  - target_handle_connect()
                                  - valid (but failed) export found
                                    in target->obd_exports
                                  - exp->exp_connecting = 1
- class_disconnect destroys
  cookie
                                  - export = class_conn2export(&conn)
                                    lookup now fails: no cookie
                                  - LASSERT(export != NULL)

The proposed fix checks whether exp_fail is set in the export when target->obd_exports is scanned. If it is set, we bail because this export will soon be destroyed. This patch also ensures that if an export is successfully found and exp_fail is not set, exp_connecting is set under the exp_lock spin_lock. This can then be used to prevent class_fail_export() from setting exp_fail and destroying the export while the connect is in progress.
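For readers without the attachment, here is a minimal user-space sketch of the check-and-set idea described above. It is not the patch itself. The struct and the model_* functions are hypothetical stand-ins; only the field names exp_fail, exp_connecting and exp_lock come from the description, and a C11 atomic_flag is assumed as a substitute for the kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an export; only the fields named in the post. */
struct export_model {
    atomic_flag exp_lock;    /* models the exp_lock spinlock */
    bool exp_fail;           /* set once the export is being evicted */
    bool exp_connecting;     /* set while a connect is in progress */
};

static void model_lock(struct export_model *exp)
{
    /* Spin until the flag was previously clear, i.e. we own the lock. */
    while (atomic_flag_test_and_set_explicit(&exp->exp_lock,
                                             memory_order_acquire))
        ;
}

static void model_unlock(struct export_model *exp)
{
    atomic_flag_clear_explicit(&exp->exp_lock, memory_order_release);
}

static void model_init(struct export_model *exp)
{
    atomic_flag_clear(&exp->exp_lock);
    exp->exp_fail = false;
    exp->exp_connecting = false;
}

/* Connect side: claim the export only if it has not been failed.
 * The test and the set happen under the same lock. */
static bool model_try_connect(struct export_model *exp)
{
    bool ok;

    assert(exp != NULL);
    model_lock(exp);
    ok = !exp->exp_fail;
    if (ok)
        exp->exp_connecting = true;
    model_unlock(exp);
    return ok;   /* false: bail, the export will soon be destroyed */
}

/* Eviction side: fail the export only if no connect is in progress. */
static bool model_try_fail(struct export_model *exp)
{
    bool ok;

    assert(exp != NULL);
    model_lock(exp);
    ok = !exp->exp_connecting;
    if (ok)
        exp->exp_fail = true;
    model_unlock(exp);
    return ok;   /* false: leave the export alone for now */
}
```

Because both flags are tested and set under one lock, at most one of the two paths can claim the export, which closes the window shown in the race diagram. Whichever side loses sees the other's flag and backs off.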
