From: Zhu Yanjun <yanjun....@oracle.com>
Date: Sun, 15 Apr 2018 21:02:07 -0400

> When a faulty cable is used or an HCA firmware error occurs, the HCA
> device goes offline. When the driver accesses this offline device, the
> following call trace appears.
 ...
> In the above call trace, mlx4_cmd_poll calls mlx4_cmd_post to access
> the HCA while the HCA is offline, and mlx4_cmd_post returns -EIO. On
> -EIO, mlx4_cmd_poll calls mlx4_cmd_reset_flow to reset the HCA, which
> produces the call trace above.
> 
> This is not reasonable. Since the HCA device is already offline when it
> is accessed, it should not be reset again.
> 
> With this patch, when the HCA is offline, mlx4_cmd_post returns -EINVAL.
> On -EINVAL, mlx4_cmd_poll returns directly instead of resetting the
> HCA.
> 
> CC: Srinivas Eeda <srinivas.e...@oracle.com>
> CC: Junxiao Bi <junxiao...@oracle.com>
> Suggested-by: Håkon Bugge <haakon.bu...@oracle.com>
> Signed-off-by: Zhu Yanjun <yanjun....@oracle.com>
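
For readers without the diff at hand, the control flow described in the quoted text can be sketched roughly as below. This is a standalone illustration, not the actual mlx4 code: struct fake_hca and the fake_* functions are hypothetical stand-ins for the device state, mlx4_cmd_post, mlx4_cmd_poll and mlx4_cmd_reset_flow, and the offline check is assumed to be a simple flag.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state; not the real struct mlx4_dev. */
struct fake_hca {
    bool offline;     /* device already offline (bad cable, firmware error) */
    bool post_fails;  /* simulate a failed post on a device that is online */
    int reset_count;  /* number of times the reset flow was triggered */
};

/* Stand-in for mlx4_cmd_post: an offline device yields -EINVAL, not -EIO. */
static int fake_cmd_post(struct fake_hca *hca)
{
    assert(hca != NULL);
    if (hca->offline)
        return -EINVAL;
    if (hca->post_fails)
        return -EIO;
    return 0;
}

/* Stand-in for mlx4_cmd_reset_flow: only counts invocations. */
static void fake_reset_flow(struct fake_hca *hca)
{
    hca->reset_count++;
}

/* Stand-in for mlx4_cmd_poll: reset on -EIO, return directly on -EINVAL. */
static int fake_cmd_poll(struct fake_hca *hca)
{
    int err = fake_cmd_post(hca);

    if (err == -EINVAL)
        return err;             /* device is offline: do not reset again */
    if (err == -EIO) {
        fake_reset_flow(hca);   /* genuine I/O error: trigger the reset flow */
        return err;
    }
    return 0;
}
```

The point of the distinct error code is that the caller can tell "already offline" apart from "failed while online" without inspecting device state itself.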

Tariq, I'm assuming you'll take this in and send it to me later.

Thanks.
