On 3/13/2014 10:40 AM, Eli Cohen wrote:
On Thu, Mar 13, 2014 at 10:12:19AM -0500, Carol Soto wrote:
In the mlx4 code, I do not recall a command timeout this big. So is
the 2 hr timeout in mlx5 just for debugging purposes? And if a command
hangs for any reason, the user cannot remove this module for the next
2 hrs?
Hi Carol,
Well, I haven't seen any such case with the latest firmware releases.
Anyway, 10 msec is really too short a timeout value, since there are
commands that can take longer than that (e.g. memory registration of
regions larger than 512 MB, though this will be changed soon). I
wonder what the original motivation was, and whether you have been
able to simulate PCI errors and see this in action.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Eli,
The motivation for reducing that timeout is that if a process is in
the middle of a HW command when a PCI error occurs, I would probably
not want to wait 2 hrs, since the card is dead and the command will
never complete. You are right, though: I forgot the case of big memory
registrations, where commands can take longer than that. Do you have
an idea of the longest time that a command can take in mlx5?
Carol