On 6/19/2020 10:02 PM, Michael Walle wrote: > Am 2020-06-19 18:54, schrieb Horia Geantă: >> On 6/19/2020 7:37 PM, Horia Geanta wrote: >>> On 6/17/2020 11:48 PM, Michael Walle wrote: >>>> Am 2020-06-17 21:15, schrieb Horia Geantă: >>>>> On 6/4/2020 6:48 PM, Michael Walle wrote: >>>>>> + >>>>>> + desc = memalign(ARCH_DMA_MINALIGN, desc_size); >>>>>> + if (!desc) { >>>>>> + debug("cannot allocate RNG init descriptor memory\n"); >>>>>> + return -ENOMEM; >>>>>> + } >>>>>> + >>>>>> + for (sh_idx = 0; sh_idx < RNG4_MAX_HANDLES; sh_idx++) { >>>>>> + /* >>>>>> + * If the corresponding bit is set, then it means the >>>>>> state >>>>>> + * handle was initialized by us, and thus it needs to be >>>>>> + * deinitialized as well >>>>>> + */ >>>>>> + >>>>>> + if (state_handle_mask & RDSTA_IF(sh_idx)) { >>>>>> + /* >>>>>> + * Create the descriptor for deinstantating >>>>>> this state >>>>>> + * handle. >>>>>> + */ >>>>>> + inline_cnstr_jobdesc_rng_deinstantiation(desc, >>>>>> sh_idx); >>>>>> + flush_dcache_range((unsigned long)desc, >>>>>> + (unsigned long)desc + >>>>>> desc_size); >>>>> Shouldn't this be roundup(desc_size, ARCH_DMA_MINALIGN) instead of >>>>> desc_size? >>>> >>>> I've seen the same idioms sometimes, but it wasn't clear to me why >>>> that >>>> would >>>> be needed; the hardware only uses the desc_size, right? >>>> >>> Yes, HW will use only [desc, desc + desc_size]. >>> >>> I think this is needed to avoid cacheline sharing issues >>> on non-coherent platforms: CPU needs to make sure a larger area >>> is written back to memory and corresponding cache lines are >>> invalidated. >>> >>> Looking at flush_dcache_range() implementation, it does its own >>> rounding, >>> based on CTR_EL0[DminLine] - "smallest data cache line size". >>> I guess this value might be smaller than ARCH_DMA_MINALIGN, >>> hence the explicit rounding to ARCH_DMA_MINALIGN is needed. >>> >> Btw, I think >> desc = memalign(ARCH_DMA_MINALIGN, desc_size); >> needs to be replaced with >> desc = malloc_cache_aligned(desc_size); > > But then the rounding is not needed, right? I mean there can't > be any other malloc() which might allocate memory in the same > cache line. > Yes, in this case the rounding is no longer needed.
Horia