Christophe Leroy <christophe.le...@csgroup.eu> writes:
> Le 17/06/2020 à 16:38, Peter Zijlstra a écrit :
>> On Thu, Jun 18, 2020 at 12:21:22AM +1000, Michael Ellerman wrote:
>>> Peter Zijlstra <pet...@infradead.org> writes:
>>>> On Mon, Jun 15, 2020 at 12:57:59PM +0000, Christophe Leroy wrote:
>> 
>>>>> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>>>>> +#define __HAVE_ARCH_PTEP_GET
>>>>> +static inline pte_t ptep_get(pte_t *ptep)
>>>>> +{
>>>>> + pte_t pte = {READ_ONCE(ptep->pte), 0, 0, 0};
>>>>> +
>>>>> + return pte;
>>>>> +}
>>>>> +#endif
>>>>
>>>> Would it make sense to have a comment with this magic? The casual reader
>>>> might wonder WTH just happened when he stumbles on this :-)
>>>
>>> I tried writing a helpful comment but it's too late for my brain to form
>>> sensible sentences.
>>>
>>> Christophe can you send a follow-up with a comment explaining it? In
>>> particular the zero entries stand out, it's kind of subtle that those
>>> entries are only populated with the right value when we write to the
>>> page table.
>> 
>> static inline pte_t ptep_get(pte_t *ptep)
>> {
>>      unsigned long val = READ_ONCE(ptep->pte);
>>      /* 16K pages have 4 identical value 4K entries */
>>      pte_t pte = {val, val, val, val);
>>      return pte;
>> }
>> 
>> Maybe something like that?
>
> This should work as well. Indeed nobody cares about what's in the other 
> three. They are only there to ensure that ptep++ increases the ptep 
> pointer by 16 bytes. Only the HW require 4 identical values, that's 
> taken care of in set_pte_at() and pte_update().

Right, but it seems less error-prone to have the in-memory
representation match what we have in the page table (well that's
in-memory too but you know what I mean).

> So we should use the most efficient. Thinking once more, maybe what you 
> propose is the most efficient as there is no need to load another 
> register with value 0 in order to write it in the stack.

On 64-bit I'd say it makes zero difference, the only thing that's going
to matter is the load from ptep->pte. I don't know whether that's true
on the 8xx cores though.

cheers

Reply via email to