I think I found the race that causes this NULL dereference:

1) There is a connection error.
2) srp_completion gets a bad status and schedules a call to srp_reconnect_work.
3) srp_reconnect_work runs and calls srp_reconnect_target.
4) srp_reconnect_target starts to run and changes the target state to SRP_TARGET_CONNECTING, but there is a context switch before it gets to execute srp_reset_req.
5) The SCSI error handling calls srp_reset_host.
6) srp_reset_host calls srp_reconnect_target, which returns -EAGAIN (because the target state is not SRP_TARGET_LIVE).
7) srp_reset_host returns FAILED and therefore the device goes offline.
8) Because the device goes offline, the commands are freed (in the SCSI mid-layer).
9) The first execution of srp_reconnect_target resumes and calls srp_reset_req, which tries to access the commands that were freed.
10) NULL dereference.

Ishai
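To make the interleaving in steps 4 to 9 easier to follow, here is a rough user-space sketch that replays it deterministically in a single thread. It is not the driver code: every name prefixed with model_ or MODEL_ is a hypothetical stand-in for the corresponding srp_* function or state, and "freed" is only a flag, so nothing is actually dereferenced after free. It only shows that the resumed reconnect ends up walking requests whose commands the mid-layer has already released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real ib_srp definitions. */
enum model_state { MODEL_LIVE, MODEL_CONNECTING };
enum model_eh_result { MODEL_SUCCESS, MODEL_FAILED };

#define MODEL_NREQ 4

struct model_cmd {
    bool freed;                        /* set when the mid-layer releases it */
};

struct model_target {
    enum model_state state;
    struct model_cmd *req[MODEL_NREQ]; /* commands still referenced by requests */
};

/* Step 4: the reconnect claims the target, then is assumed to be preempted.
 * Returns 0 on success, -1 as a stand-in for -EAGAIN. */
static int model_reconnect_begin(struct model_target *t)
{
    if (t->state != MODEL_LIVE)
        return -1;
    t->state = MODEL_CONNECTING;
    return 0;
}

/* Steps 5-7: the error handler's host reset fails because the target
 * is not live, so the reconnect attempt is refused. */
static enum model_eh_result model_reset_host(struct model_target *t)
{
    return model_reconnect_begin(t) == 0 ? MODEL_SUCCESS : MODEL_FAILED;
}

/* Step 9: the resumed reconnect walks the requests; count how many
 * point at commands that were already released. */
static int model_reset_reqs(const struct model_target *t)
{
    int stale = 0;

    for (size_t i = 0; i < MODEL_NREQ; i++)
        if (t->req[i] && t->req[i]->freed)
            stale++;
    return stale;
}

/* Replays steps 4-9 in order and returns the number of stale commands
 * the resumed reconnect would have touched. */
static int model_replay_race(void)
{
    struct model_cmd cmds[MODEL_NREQ] = {{false}};
    struct model_target t = { .state = MODEL_LIVE };
    int rc;

    for (size_t i = 0; i < MODEL_NREQ; i++)
        t.req[i] = &cmds[i];

    rc = model_reconnect_begin(&t);               /* step 4 */
    assert(rc == 0);

    if (model_reset_host(&t) == MODEL_FAILED)     /* steps 5-7 */
        for (size_t i = 0; i < MODEL_NREQ; i++)
            cmds[i].freed = true;                 /* step 8: device offline */

    return model_reset_reqs(&t);                  /* step 9 */
}
```

In this model, model_replay_race() reports every outstanding request as stale, which corresponds to step 10. The sketch deliberately leaves out any fix; it only reproduces the ordering described above.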
