On 05/18/2015 02:53 PM, Thomas Gleixner wrote:
> On Fri, 8 May 2015, Dave Hansen wrote:
> 
>>
>> From: Dave Hansen <[email protected]>
>>
>> Changes from v20:
>>
>>  * Fix macro confusion between BD and BT
>>  * Add accessor for bt_entry_size_bytes()
> 
> Forgot to say this earlier. Please put the changes after the changelog
> itself, i.e. after the '---'
> 
>> -#ifdef CONFIG_X86_64
>> -
>> -/* upper 28 bits [47:20] of the virtual address in 64-bit used to
>> - * index into bounds directory (BD).
>> +/*
>> + * The upper 28 bits [47:20] of the virtual address in 64-bit
>> + * are used to index into bounds directory (BD).
>> + *
>> + * The directory is 2G (2^31) in size, and with 8-byte entries
>> + * it has 2^28 entries.
>>   */
>> -#define MPX_BD_ENTRY_OFFSET 28
>> -#define MPX_BD_ENTRY_SHIFT  3
>> -/* bits [19:3] of the virtual address in 64-bit used to index into
>> - * bounds table (BT).
>> +#define MPX_BD_SIZE_BYTES_64        (1UL<<31)
>> +/* An entry is a long, so 8 bytes and a shift of 3 */
> 
> I can see the 8 bytes, but where is the shift constant?

The comment is old.  I'll zap it.

>> +static inline int bt_entry_size_bytes(struct mm_struct *mm)
>> +{
>> +    if (is_64bit_mm(mm))
>> +            return MPX_BT_ENTRY_BYTES_64;
>> +    else
>> +            return MPX_BT_ENTRY_BYTES_32;
>> +}
>> +
>> +/*
>> + * Take a virtual address and turns it in to the offset in bytes
>> + * inside of the bounds table where the bounds table entry
>> + * controlling 'addr' can be found.
>> + */
>> +static unsigned long mpx_get_bt_entry_offset_bytes(struct mm_struct *mm,
>> +            unsigned long addr)
>> +{
>> +    unsigned long bt_table_nr_entries;
>> +    unsigned long offset = addr;
>> +
>> +    if (is_64bit_mm(mm)) {
>> +            /* Bottom 3 bits are ignored on 64-bit */
>> +            offset >>= 3;
>> +            bt_table_nr_entries = MPX_BT_NR_ENTRIES_64;
>> +    } else {
>> +            /* Bottom 2 bits are ignored on 32-bit */
>> +            offset >>= 2;
>> +            bt_table_nr_entries = MPX_BT_NR_ENTRIES_32;
>> +    }
>> +    /*
>> +     * We know the size of the table in to which we are
>> +     * indexing, and we have eliminated all the low bits
>> +     * which are ignored for indexing.
>> +     *
>> +     * Mask out all the high bits which we do not need
>> +     * to index in to the table.
>> +     */
>> +    offset &= (bt_table_nr_entries-1);
> 
>              ....  entries - 1); 
> 
> And you might explain why nr_entries - 1 is a proper mask,
> i.e. nr_entries is always a power of 2.
> 
>> +    /*
>> +     * We now have an entry offset in terms of *entries* in
>> +     * the table.  We need to scale it back up to bytes.
>> +     */
>> +    offset *= bt_entry_size_bytes(mm);
> 
> You could store the scale value out in the if () construct above, but I
> guess the compiler can figure that out as well :)
> 
>> +    return offset;
>> +}
>> +
>> +/*
>> + * Total size of the process's virtual address space
>> + * Use a u64 because 4GB (for 32-bit) won't fit in a long.
>> + *
>> + * __VIRTUAL_MASK does not work here.  It only covers the
>> + * user address space and the tables cover the *entire*
>> + * virtual address space supported on the CPU.
>> + */
>> +static inline unsigned long long mm_virt_space(struct mm_struct *mm)
>> +{
>> +    if (is_64bit_mm(mm))
>> +            return 1ULL << 48;
> 
> cpu_info->x86_phys_bits will tell you the proper value
> 
>> +    else
>> +            return 1ULL << 32;
> 
> And for a 32bit kernel 32 might be wrong because with PAE you have 36
> bits.

That's physical space.  I really do need virtual space here.

But your comments stand for ->x86_virt_bits.  I'll fix.

>> +static unsigned long mpx_get_bd_entry_offset(struct mm_struct *mm,
>> +            unsigned long addr)
>> +{
>> +    /*
>> +     * There are several ways to derive the bd offsets.  We
>> +     * use the following approach here:
>> +     * 1. We know the size of the virtual address space
>> +     * 2. We know the number of entries in a bounds table
>> +     * 3. We know that each entry covers a fixed amount of
>> +     *    virtual address space.
>> +     * So, we can just divide the virtual address by the
>> +     * virtual space used by one entry to determine which
>> +     * entry "controls" the given virtual address.
>> +     */
>> +    if (is_64bit_mm(mm)) {
>> +            int bd_entry_size = 8; /* 64-bit pointer */
>> +            /*
>> +             * Take the 64-bit addressing hole in to account.
>> +             * This is a noop on 32-bit since it has no hole.
> 
> But a 32bit kernel will not take this code path because
> is_64bit_mm(mm) evaluates to false.

I meant that in case someone wondered why I didn't have that code in the
32-bit version.  I'll move the comment.

>> +             */
>> +            addr &= ~(mm_virt_space(mm) - 1);
>> +            return (addr / bd_entry_virt_space(mm)) * bd_entry_size;
>> +    } else {
>> +            int bd_entry_size = 4; /* 32-bit pointer */
>> +            return (addr / bd_entry_virt_space(mm)) * bd_entry_size;
>> +    }
>> +    /*
>> +     * The two return calls above are exact copies.  If we
>> +     * pull out a single copy and put it in here, gcc won't
>> +     * realize that we're doing a power-of-2 divide and use
>> +     * shifts.  It uses a real divide.  If we put them up
>> +     * there, it manages to figure it out (gcc 4.8.3).
> 
> Can't we provide the shift values from bd_entry_virt_space() so we
> don't have to worry about gcc versions being more or less clever?

Yes, I could go back and rework all the math to be done with shifts
instead of power-of-2 divides (which is what was done before).  But,
it's very clear the way that it stands, minus this wart.  The code look
a *lot* better this way.

This isn't super performance-sensitive code.  It's basically in the
munmap() path.  I just really didn't like the idea of an actual integer
divide in there.

So, if GCC breaks this, so be it.  I don't think we'll ever notice.  The
optimization was just too obvious to completely ignore.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [email protected]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to