I just notified Mellanox of the issue with the 1.2.490 firmware ...

On Fri, Mar 27, 2009 at 4:48 PM, Roland Dreier <[email protected]> wrote:
> > I spent the last couple of days retracing my steps. In my haste, I
> > listed the wrong HCA firmware revision. It was firmware 1.2.940 that
> > caused the system to crash while booting to Linux. I have the mthca
> > driver built into the kernel; it is not a loadable driver. The system
> > boots fine with the 1.2.0 firmware.
>
> Oh, it's mthca firmware version dependent? That's a big clue: you're
> using mem-free firmware, which means the HCA uses system memory to store
> big chunks of internal state. If something is going wrong with how the
> memory is mapped to the HCA (or how the HCA writes to it) then that
> could cause memory corruption -- possibly tied to posting receives to
> the hardware as part of the MAD initialization.
>
> So it could be a driver bug exposed by the new firmware, or a firmware bug.
>
> Is Mellanox following this bug? Maybe they have some idea of how to
> figure out what the HCA is doing that could crash a system.
>
> - R.

_______________________________________________
general mailing list
[email protected]
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/general
