On 5/21/2024 9:11 PM, Paul Menzel wrote:
> Dear Jacob,
> 
> 
> Am 21.05.24 um 21:27 schrieb Jacob Keller:
> 
>> On 5/20/2024 10:55 PM, Paul Menzel wrote:
> 
>>> Am 20.05.24 um 23:39 schrieb Jacob Keller:
>>>> The ice driver reads data from the Shadow RAM portion of the NVM during
>>>> initialization, including data used to identify the NVM image and device,
>>>> such as the ETRACK ID used to populate devlink dev info fw.bundle.
>>>>
>>>> Currently it is using a fixed offset defined by ICE_CSS_HEADER_LENGTH to
>>>> compute the appropriate offset. This worked fine for E810 and E822 devices
>>>> which both have CSS header length of 330 words.
>>>>
>>>> Other devices, including both E825-C and E830 devices have different sizes
>>>> for their CSS header. The use of a hard coded value results in the driver
>>>> reading from the wrong block in the NVM when attempting to access the
>>>> Shadow RAM copy. This results in the driver reporting the fw.bundle as 0x0
>>>> in both the devlink dev info and ethtool -i output.
>>>>
>>>> The first E830 support was introduced by commit ba20ecb1d1bb ("ice: Hook up
>>>> 4 E830 devices by adding their IDs") and the first E825-C support was
>>>> introducted by commit f64e18944233 ("ice: introduce new E825C devices
>>>
>>> introduced
>>>
>>>> family")
>>>>
>>>> The NVM actually contains the CSS header length embedded in it. Remove the
>>>> hard coded value and replace it with logic to read the length from the NVM
>>>> directly. This is more resilient against all existing and future hardware,
>>>> vs looking up the expected values from a table. It ensures the driver will
>>>> read from the appropriate place when determining the ETRACK ID value used
>>>> for populating the fw.bundle_id and for reporting in ethtool -i.
>>>>
>>>> The CSS header length for both the active and inactive flash bank is stored
>>>> in the ice_bank_info structure to avoid unnecessary duplicate work when
>>>> accessing multiple words of the Shadow RAM. Both banks are read in the
>>>> unlikely event that the header length is different for the NVM in the
>>>> inactive bank, rather than being different only by the overall device
>>>> family.
>>>>
>>>> Fixes: ba20ecb1d1bb ("ice: Hook up 4 E830 devices by adding their IDs")
>>>> Co-developed-by: Paul Greenwalt <[email protected]>
>>>> Signed-off-by: Paul Greenwalt <[email protected]>
>>>> Signed-off-by: Jacob Keller <[email protected]>
>>>> Reviewed-by: Przemek Kitszel <[email protected]>
>>>> ---
>>>>    drivers/net/ethernet/intel/ice/ice_nvm.c  | 88 
>>>> ++++++++++++++++++++++++++++++-
>>>>    drivers/net/ethernet/intel/ice/ice_type.h | 14 +++--
>>>>    2 files changed, 93 insertions(+), 9 deletions(-)
>>>>
>>>> diff --git a/drivers/net/ethernet/intel/ice/ice_nvm.c 
>>>> b/drivers/net/ethernet/intel/ice/ice_nvm.c
>>>> index 84eab92dc03c..5968011e8c7e 100644
>>>> --- a/drivers/net/ethernet/intel/ice/ice_nvm.c
>>>> +++ b/drivers/net/ethernet/intel/ice/ice_nvm.c
>>>> @@ -374,11 +374,25 @@ ice_read_nvm_module(struct ice_hw *hw, enum 
>>>> ice_bank_select bank, u32 offset, u1
>>>>     *
>>>>     * Read the specified word from the copy of the Shadow RAM found in the
>>>>     * specified NVM module.
>>>> + *
>>>> + * Note that the Shadow RAM copy is always located after the CSS header, 
>>>> and
>>>> + * is aligned to 64-byte (32-word) offsets.
>>>>     */
>>>>    static int
>>>>    ice_read_nvm_sr_copy(struct ice_hw *hw, enum ice_bank_select bank, u32 
>>>> offset, u16 *data)
>>>>    {
>>>> -  return ice_read_nvm_module(hw, bank, ICE_NVM_SR_COPY_WORD_OFFSET + 
>>>> offset, data);
>>>> +  u32 sr_copy;
>>>> +
>>>> +  switch (bank) {
>>>> +  case ICE_ACTIVE_FLASH_BANK:
>>>> +          sr_copy = roundup(hw->flash.banks.active_css_hdr_len, 32);
>>>> +          break;
>>>> +  case ICE_INACTIVE_FLASH_BANK:
>>>> +          sr_copy = roundup(hw->flash.banks.inactive_css_hdr_len, 32);
>>>> +          break;
>>>> +  }
>>>> +
>>>> +  return ice_read_nvm_module(hw, bank, sr_copy + offset, data);
>>>>    }
>>>>    
>>>>    /**
>>>> @@ -1009,6 +1023,72 @@ static int ice_determine_active_flash_banks(struct 
>>>> ice_hw *hw)
>>>>            return 0;
>>>>    }
>>>>    
>>>> +/**
>>>> + * ice_get_nvm_css_hdr_len - Read the CSS header length from the NVM CSS 
>>>> header
>>>> + * @hw: pointer to the HW struct
>>>> + * @bank: whether to read from the active or inactive flash bank
>>>> + * @hdr_len: storage for header length in words
>>>> + *
>>>> + * Read the CSS header length from the NVM CSS header and add the 
>>>> Authentication
>>>> + * header size, and then convert to words.
>>>> + *
>>>> + * Return: zero on success, or a negative error code on failure.
>>>> + */
>>>> +static int
>>>> +ice_get_nvm_css_hdr_len(struct ice_hw *hw, enum ice_bank_select bank,
>>>> +                  u32 *hdr_len)
>>>> +{
>>>> +  u16 hdr_len_l, hdr_len_h;
>>>> +  u32 hdr_len_dword;
>>>> +  int status;
>>>> +
>>>> +  status = ice_read_nvm_module(hw, bank, ICE_NVM_CSS_HDR_LEN_L,
>>>> +                               &hdr_len_l);
>>>> +  if (status)
>>>> +          return status;
>>>> +
>>>> +  status = ice_read_nvm_module(hw, bank, ICE_NVM_CSS_HDR_LEN_H,
>>>> +                               &hdr_len_h);
>>>> +  if (status)
>>>> +          return status;
>>>> +
>>>> +  /* CSS header length is in DWORD, so convert to words and add
>>>> +   * authentication header size
>>>> +   */
>>>> +  hdr_len_dword = hdr_len_h << 16 | hdr_len_l;
>>>> +  *hdr_len = (hdr_len_dword * 2) + ICE_NVM_AUTH_HEADER_LEN;
>>>> +
>>>> +  return 0;
>>>> +}
>>>> +
>>>> +/**
>>>> + * ice_determine_css_hdr_len - Discover CSS header length for the device
>>>> + * @hw: pointer to the HW struct
>>>> + *
>>>> + * Determine the size of the CSS header at the start of the NVM module. 
>>>> This
>>>> + * is useful for locating the Shadow RAM copy in the NVM, as the Shadow 
>>>> RAM is
>>>> + * always located just after the CSS header.
>>>> + *
>>>> + * Return: zero on success, or a negative error code on failure.
>>>> + */
>>>> +static int ice_determine_css_hdr_len(struct ice_hw *hw)
>>>> +{
>>>> +  struct ice_bank_info *banks = &hw->flash.banks;
>>>> +  int status;
>>>> +
>>>> +  status = ice_get_nvm_css_hdr_len(hw, ICE_ACTIVE_FLASH_BANK,
>>>> +                                   &banks->active_css_hdr_len);
>>>> +  if (status)
>>>> +          return status;
>>>> +
>>>> +  status = ice_get_nvm_css_hdr_len(hw, ICE_INACTIVE_FLASH_BANK,
>>>> +                                   &banks->inactive_css_hdr_len);
>>>> +  if (status)
>>>> +          return status;
>>>> +
>>>> +  return 0;
>>>> +}
>>>> +
>>>>    /**
>>>>     * ice_init_nvm - initializes NVM setting
>>>>     * @hw: pointer to the HW struct
>>>> @@ -1055,6 +1135,12 @@ int ice_init_nvm(struct ice_hw *hw)
>>>>                    return status;
>>>>            }
>>>>    
>>>> +  status = ice_determine_css_hdr_len(hw);
>>>> +  if (status) {
>>>> +          ice_debug(hw, ICE_DBG_NVM, "Failed to determine Shadow RAM copy 
>>>> offsets.\n");
>>>
>>> As this is a new failure path, should the user be warned about this, if
>>> it cannot be determined, and NVM possibly be broken?
> 
>> Possibly. I'm also trying to avoid spamming the log with failure
>> messages which are more useful for developers who can enable them vs end
>> users who may not understand.
> 
> I agree that logging too much is also confusing. Excuse my ignorance, 
> but what happens if NVM is broken and the offset cannot be determined. 
> Will the user get any error message and know what to do (replace the 
> device or call support)? Or another view point, is the bug report with 
> the Linux log messages included going to have the information, so the 
> support folks or developers can pinpoint the problem without replying to 
> the user to enable debug messages?
> 

I'm not sure. In this case it probably does make sense to warn, since
this is unexpected and is unlikely to result in spamming the log
multiple times.

> 
> Kind regards,
> 
> Paul

Reply via email to