On Tue, Jun 30, 2020 at 02:21:59AM +0200, Andrew Lunn wrote:
> I've no practice experience with modules other than plain old SFPs,
> 1G. And those have all sorts of errors, even basic things like the CRC
> are systematically incorrect because they are not recalculated after
> adding the serial number. We have had people trying to submit patches
> to ethtool to make it ignore bits so that it dumps more information,
> because the manufacturer failed to set the correct bits, etc.
>
> Ido, Adrian, what is your experience with these QSFP-DD devices. Are
> they generally of better quality, the EEPROM can be trusted? Is there
> any form of compliance test.

Vadim, I know you tested with at least two different QSFP-DD modules,
can you please share your experience?

>
> If we go down the path of using the discovery information, it means we
> have no way for user space to try to correct for when the information
> is incorrect. It cannot request specific pages. So maybe we should
> consider an alternative?
>
> The netlink ethtool gives us more flexibility. How about we make a new
> API where user space can request any pages it want, and specify the
> size of the page. ethtool can start out by reading page 0. That should
> allow it to identify the basic class of device. It can then request
> additional pages as needed.

Just to make sure I understand, this also means adding a new API towards
drivers, right? So that they only read the requested info from HW.

> The nice thing about that is we don't need two parsers of the
> discovery information, one in user and second in kernel space. We
> don't need to guarantee these two parsers agree with each other, in
> order to correctly decode what the kernel sent to user space. And user
> space has the flexibility to work around known issues when
> manufactures get their EEPROM wrong.

Sounds sane to me... I know that in the past Vadim had to deal with
various faulty modules. Vadim, is this something we can support? What
happens if user space requests a page that does not exist? For example,
in the case of QSFP-DD, let's say we do not provide page 03h but user
space still wants it because it believes the manufacturer did not set
the correct bits.
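
To make the question concrete, here is a rough sketch of one possible
behaviour: a request for a page the module does not provide fails with
an error, and user space carries on with whatever it could read. This is
not an existing ethtool or driver API. Every name below (page_req,
fake_module, read_page, dump_pages) is made up for illustration, as are
the 128-byte page size and the choice of -ENOENT for a missing page.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_LEN  128	/* page size assumed for this sketch */
#define MAX_PAGES 4	/* pages 00h..03h only, to keep it small */

/* Hypothetical request: which page, and which bytes within it. */
struct page_req {
	uint8_t page;
	uint8_t offset;
	uint8_t length;
};

/* Stand-in for a module whose pages may or may not be implemented. */
struct fake_module {
	int present[MAX_PAGES];
	uint8_t data[MAX_PAGES][PAGE_LEN];
};

/*
 * Hypothetical driver-side read: only the requested bytes are fetched.
 * Returns the number of bytes copied, or a negative errno.
 */
int read_page(const struct fake_module *mod, const struct page_req *req,
	      uint8_t *buf, size_t buf_len)
{
	assert(mod && req && buf);

	if (req->length == 0 || req->offset + req->length > PAGE_LEN)
		return -EINVAL;
	if (buf_len < req->length)
		return -ENOSPC;
	/* Assumed answer to the open question: missing page is an error. */
	if (req->page >= MAX_PAGES || !mod->present[req->page])
		return -ENOENT;

	memcpy(buf, &mod->data[req->page][req->offset], req->length);
	return req->length;
}

/*
 * Hypothetical user-space flow: page 0 is mandatory, one extra page is
 * attempted regardless of what page 0 advertises. Returns the number of
 * pages read (1 or 2), or a negative errno if page 0 itself failed.
 */
int dump_pages(const struct fake_module *mod, uint8_t optional_page,
	       uint8_t out[2][PAGE_LEN])
{
	struct page_req req = { .page = 0, .offset = 0, .length = PAGE_LEN };
	int ret;

	ret = read_page(mod, &req, out[0], PAGE_LEN);
	if (ret < 0)
		return ret;

	req.page = optional_page;
	ret = read_page(mod, &req, out[1], PAGE_LEN);
	return ret < 0 ? 1 : 2;
}
```

With something along these lines, user space that distrusts the
advertised bits can still ask for page 03h and simply gets an error back
if the module (or the driver) cannot supply it.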