On 26/10/16 13:51, Kyrill Tkachov wrote:
> Hi Andre,
>
> On 25/10/16 17:29, Andre Vieira (lists) wrote:
>> On 24/08/16 12:01, Andre Vieira (lists) wrote:
>>> On 25/07/16 14:23, Andre Vieira (lists) wrote:
>>>> This patch extends support for the ARMv8-M Security Extensions
>>>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>>>> information through unbanked registers.
>>>>
>>>> When returning from a nonsecure entry function we clear all
>>>> caller-saved
>>>> registers that are not used to pass return values, by writing either
>>>> the
>>>> LR, in case of general purpose registers, or the value 0, in case of FP
>>>> registers. We use the LR to write to APSR and FPSCR too. We
>>>> currently do
>>>> not support entry functions that pass arguments or return variables on
>>>> the stack and we diagnose this. This patch relies on the existing code
>>>> to make sure callee-saved registers used in cmse_nonsecure_entry
>>>> functions are saved and restored thus retaining their nonsecure mode
>>>> value, this should be happening already as it is required by AAPCS.
>>>>
>>>> This patch also clears padding bits for cmse_nonsecure_entry functions
>>>> with struct and union return types. For unions a bit is only considered
>>>> a padding bit if it is an unused bit in every field of that union. The
>>>> function that calculates these is used in a later patch to do the same
>>>> for arguments of cmse_nonsecure_call's.
>>>>
>>>> *** gcc/ChangeLog ***
>>>> 2016-07-25 Andre Vieira <andre.simoesdiasvie...@arm.com>
>>>> Thomas Preud'homme <thomas.preudho...@arm.com>
>>>>
>>>> * config/arm/arm.c (output_return_instruction): Clear
>>>> registers.
>>>> (thumb2_expand_return): Likewise.
>>>> (thumb1_expand_epilogue): Likewise.
>>>> (thumb_exit): Likewise.
>>>> (arm_expand_epilogue): Likewise.
>>>> (cmse_nonsecure_entry_clear_before_return): New.
>>>> (comp_not_to_clear_mask_str_un): New.
>>>> (compute_not_to_clear_mask): New.
>>>> * config/arm/thumb1.md (*epilogue_insns): Change length
>>>> attribute.
>>>> * config/arm/thumb2.md (*thumb2_return): Likewise.
>>>>
>>>> *** gcc/testsuite/ChangeLog ***
>>>> 2016-07-25 Andre Vieira <andre.simoesdiasvie...@arm.com>
>>>> Thomas Preud'homme <thomas.preudho...@arm.com>
>>>>
>>>> * gcc.target/arm/cmse/cmse.exp: Test different multilibs
>>>> separate.
>>>> * gcc.target/arm/cmse/struct-1.c: New.
>>>> * gcc.target/arm/cmse/bitfield-1.c: New.
>>>> * gcc.target/arm/cmse/bitfield-2.c: New.
>>>> * gcc.target/arm/cmse/bitfield-3.c: New.
>>>> * gcc.target/arm/cmse/baseline/cmse-2.c: Test that
>>>> registers are
>>>> cleared.
>>>> * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>>> * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>>> * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>>> * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>>> * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>>>
>>> Updated this patch to correctly clear only the cumulative
>>> exception-status (0-4,7) and the condition code bits (28-31) of the
>>> FPSCR. I also adapted the code to be handle the bigger floating point
>>> register files.
>>>
>>> ----
>>>
>>> This patch extends support for the ARMv8-M Security Extensions
>>> 'cmse_nonsecure_entry' attribute to safeguard against leak of
>>> information through unbanked registers.
>>>
>>> When returning from a nonsecure entry function we clear all caller-saved
>>> registers that are not used to pass return values, by writing either the
>>> LR, in case of general purpose registers, or the value 0, in case of FP
>>> registers. We use the LR to write to APSR. For FPSCR we clear only the
>>> cumulative exception-status (0-4, 7) and the condition code bits
>>> (28-31). We currently do not support entry functions that pass arguments
>>> or return variables on the stack and we diagnose this. This patch relies
>>> on the existing code to make sure callee-saved registers used in
>>> cmse_nonsecure_entry functions are saved and restored thus retaining
>>> their nonsecure mode value, this should be happening already as it is
>>> required by AAPCS.
>>>
>>> This patch also clears padding bits for cmse_nonsecure_entry functions
>>> with struct and union return types. For unions a bit is only considered
>>> a padding bit if it is an unused bit in every field of that union. The
>>> function that calculates these is used in a later patch to do the same
>>> for arguments of cmse_nonsecure_call's.
>>>
>>> *** gcc/ChangeLog ***
>>> 2016-07-xx Andre Vieira <andre.simoesdiasvie...@arm.com>
>>> Thomas Preud'homme <thomas.preudho...@arm.com>
>>>
>>> * config/arm/arm.c (output_return_instruction): Clear
>>> registers.
>>> (thumb2_expand_return): Likewise.
>>> (thumb1_expand_epilogue): Likewise.
>>> (thumb_exit): Likewise.
>>> (arm_expand_epilogue): Likewise.
>>> (cmse_nonsecure_entry_clear_before_return): New.
>>> (comp_not_to_clear_mask_str_un): New.
>>> (compute_not_to_clear_mask): New.
>>> * config/arm/thumb1.md (*epilogue_insns): Change length
>>> attribute.
>>> * config/arm/thumb2.md (*thumb2_return): Duplicate pattern for
>>> cmse_nonsecure_entry functions.
>>>
>>> *** gcc/testsuite/ChangeLog ***
>>> 2016-07-xx Andre Vieira <andre.simoesdiasvie...@arm.com>
>>> Thomas Preud'homme <thomas.preudho...@arm.com>
>>>
>>> * gcc.target/arm/cmse/cmse.exp: Test different multilibs
>>> separate.
>>> * gcc.target/arm/cmse/struct-1.c: New.
>>> * gcc.target/arm/cmse/bitfield-1.c: New.
>>> * gcc.target/arm/cmse/bitfield-2.c: New.
>>> * gcc.target/arm/cmse/bitfield-3.c: New.
>>> * gcc.target/arm/cmse/baseline/cmse-2.c: Test that registers
>>> are
>>> cleared.
>>> * gcc.target/arm/cmse/mainline/soft/cmse-5.c: New.
>>> * gcc.target/arm/cmse/mainline/hard/cmse-5.c: New.
>>> * gcc.target/arm/cmse/mainline/hard-sp/cmse-5.c: New.
>>> * gcc.target/arm/cmse/mainline/softfp/cmse-5.c: New.
>>> * gcc.target/arm/cmse/mainline/softfp-sp/cmse-5.c: New.
>>>
>> Hi,
>>
>> Rebased previous patch on top of trunk as requested. No changes to
>> ChangeLog.
>>
>> Cheers,
>> Andre
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index
> bb81e5662e81a26c7d3ccf9f749e8e356e6de35e..c6260323ecfd2f2842e6a5aab06b67da16619c73
> 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -17496,6 +17496,279 @@ note_invalid_constants (rtx_insn *insn,
> HOST_WIDE_INT address, int do_pushes)
> return;
> }
>
> +/* This function computes the clear mask and PADDING_BITS_TO_CLEAR for
> structs
> + and unions in the context of ARMv8-M Security Extensions. It is
> used as a
> + helper function for both 'cmse_nonsecure_call' and
> 'cmse_nonsecure_entry'
> + functions. The PADDING_BITS_TO_CLEAR pointer can be the base to
> either one
> + or four masks, depending on whether it is being computed for a
> + 'cmse_nonsecure_entry' return value or a 'cmse_nonsecure_call' argument
> + respectively. The tree for the type of the argument or a field
> within an
> + argument is passed in ARG_TYPE, the current register this argument
> or field
> + starts in is kept in the pointer REGNO and updated accordingly, the
> bit this
> + argument or field starts at is passed in STARTING_BIT and the last
> used bit
> + is kept in LAST_USED_BIT which is also updated accordingly. */
> +
> +static unsigned HOST_WIDE_INT
> +comp_not_to_clear_mask_str_un (tree arg_type, int * regno,
> + uint32_t * padding_bits_to_clear,
> + unsigned starting_bit, int * last_used_bit)
> +
> +{
> + unsigned HOST_WIDE_INT not_to_clear_reg_mask = 0;
> +
> + if (TREE_CODE (arg_type) == RECORD_TYPE)
> + {
> + unsigned current_bit = starting_bit;
> + tree field;
> + long int offset, size;
> +
> +
> + field = TYPE_FIELDS (arg_type);
> + while (field)
> + {
> + /* The offset within a structure is always an offset from
> + the start of that structure. Make sure we take that into the
> + calculation of the register based offset that we use here. */
> + offset = starting_bit;
> + offset += TREE_INT_CST_ELT (DECL_FIELD_BIT_OFFSET (field), 0);
> + offset %= 32;
> +
> + /* This is the actual size of the field, for bitfields this is the
> + bitfield width and not the container size. */
> + size = TREE_INT_CST_ELT (DECL_SIZE (field), 0);
> +
> + if (*last_used_bit != offset)
> + {
> + if (offset < *last_used_bit)
> + {
> + /* This field's offset is before the 'last_used_bit', that
> + means this field goes on the next register. So we need to
> + pad the rest of the current register and increase the
> + register number. */
> + uint32_t mask;
> + mask = UINT32_MAX - ((uint32_t) 1 << *last_used_bit);
> + mask++;
> +
> + *(padding_bits_to_clear + *regno) |= mask;
>
> padding_bits_to_clear[*regno] |= mask;
>
> + not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
> + (*regno)++;
> + }
> + else
> + {
> + /* Otherwise we pad the bits between the last field's end and
> + the start of the new field. */
> + uint32_t mask;
> +
> + mask = UINT32_MAX >> (32 - offset);
> + mask -= ((uint32_t) 1 << *last_used_bit) - 1;
> + *(padding_bits_to_clear + *regno) |= mask;
>
> Likewise.
>
> + }
> + current_bit = offset;
> + }
> +
> + /* Calculate further padding bits for inner structs/unions too. */
> + if (RECORD_OR_UNION_TYPE_P (TREE_TYPE (field)))
> + {
> + *last_used_bit = current_bit;
> + not_to_clear_reg_mask
> + |= comp_not_to_clear_mask_str_un (TREE_TYPE (field), regno,
> + padding_bits_to_clear, offset,
> + last_used_bit);
> + }
> + else
> + {
> + /* Update 'current_bit' with this field's size. If the
> + 'current_bit' lies in a subsequent register, update 'regno' and
> + reset 'current_bit' to point to the current bit in that new
> + register. */
> + current_bit += size;
> + while (current_bit >= 32)
> + {
> + current_bit-=32;
> + not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
> + (*regno)++;
> + }
> + *last_used_bit = current_bit;
> + }
> +
> + field = TREE_CHAIN (field);
> + }
> + not_to_clear_reg_mask |= HOST_WIDE_INT_1U << *regno;
> + }
> + else if (TREE_CODE (arg_type) == UNION_TYPE)
> + {
> + tree field, field_t;
> + int i, regno_t, field_size;
> + int max_reg = -1;
> + int max_bit = -1;
> + uint32_t mask;
> + uint32_t padding_bits_to_clear_res[NUM_ARG_REGS]
> + = {UINT32_MAX, UINT32_MAX, UINT32_MAX, UINT32_MAX};
> +
> + /* To compute the padding bits in a union we only consider bits as
> + padding bits if they are always either a padding bit or fall
> outside a
> + fields size for all fields in the union. */
> + field = TYPE_FIELDS (arg_type);
> + while (field)
> + {
> + uint32_t padding_bits_to_clear_t[NUM_ARG_REGS]
> + = {0U, 0U, 0U, 0U};
> + int last_used_bit_t = *last_used_bit;
> + regno_t = *regno;
> + field_t = TREE_TYPE (field);
> +
> + /* If the field's type is either a record or a union make sure to
> + compute their padding bits too. */
> + if (RECORD_OR_UNION_TYPE_P (field_t))
> + not_to_clear_reg_mask
> + |= comp_not_to_clear_mask_str_un (field_t, ®no_t,
> + &padding_bits_to_clear_t[0],
> + starting_bit, &last_used_bit_t);
> + else
> + {
> + field_size = TREE_INT_CST_ELT (DECL_SIZE (field), 0);
> + regno_t = (field_size / 32) + *regno;
> + last_used_bit_t = (starting_bit + field_size) % 32;
> + }
> +
> + for (i = *regno; i < regno_t; i++)
> + {
> + /* For all but the last register used by this field only keep
> the
> + padding bits that were padding bits in this field. */
> + padding_bits_to_clear_res[i] &= padding_bits_to_clear_t[i];
> + }
> +
> + /* For the last register, keep all padding bits that were padding
> + bits in this field and any padding bits that are still valid
> + as padding bits but fall outside of this field's size. */
> + mask = (UINT32_MAX - ((uint32_t) 1 << last_used_bit_t)) + 1;
> + padding_bits_to_clear_res[regno_t]
> + &= padding_bits_to_clear_t[regno_t] | mask;
> +
> + /* Update the maximum size of the fields in terms of registers used
> + ('max_reg') and the 'last_used_bit' in said register. */
> + if (max_reg < regno_t)
> + {
> + max_reg = regno_t;
> + max_bit = last_used_bit_t;
> + }
> + else if (max_reg == regno_t && max_bit < last_used_bit_t)
> + max_bit = last_used_bit_t;
> +
> + field = TREE_CHAIN (field);
> + }
> +
> + /* Update the current padding_bits_to_clear using the
> intersection of the
> + padding bits of all the fields. */
> + for (i=*regno; i < max_reg; i++)
> + padding_bits_to_clear[i] |= padding_bits_to_clear_res[i];
> +
>
> watch the spacing in the 'for' definition.
>
> + /* Do not keep trailing padding bits, we do not know yet whether
> this
> + is the end of the argument. */
> + mask = ((uint32_t) 1 << max_bit) - 1;
> + padding_bits_to_clear[max_reg]
> + |= padding_bits_to_clear_res[max_reg] & mask;
> +
> + *regno = max_reg;
> + *last_used_bit = max_bit;
> + }
> + else
> + /* This function should only be used for structs and unions. */
> + gcc_unreachable ();
> +
> + return not_to_clear_reg_mask;
> +}
> +
> +/* In the context of ARMv8-M Security Extensions, this function is used
> for both
> + 'cmse_nonsecure_call' and 'cmse_nonsecure_entry' functions to
> compute what
> + registers are used when returning or passing arguments, which is then
> + returned as a mask. It will also compute a mask to indicate
> padding/unused
> + bits for each of these registers, and passes this through the
> + PADDING_BITS_TO_CLEAR pointer. The tree of the argument type is
> passed in
> + ARG_TYPE, the rtl representation of the argument is passed in
> ARG_RTX and
> + the starting register used to pass this argument or return value is
> passed
> + in REGNO. It makes use of 'comp_not_to_clear_mask_str_un' to
> compute these
> + for struct and union types. */
> +
> +static unsigned HOST_WIDE_INT
> +compute_not_to_clear_mask (tree arg_type, rtx arg_rtx, int regno,
> + uint32_t * padding_bits_to_clear)
> +
> +{
> + int last_used_bit = 0;
> + unsigned HOST_WIDE_INT not_to_clear_mask;
> +
> + if (RECORD_OR_UNION_TYPE_P (arg_type))
> + {
> + not_to_clear_mask
> + = comp_not_to_clear_mask_str_un (arg_type, ®no,
> + padding_bits_to_clear, 0,
> + &last_used_bit);
> +
> +
> + /* If the 'last_used_bit' is not zero, that means we are still
> using a
> + part of the last 'regno'. In such cases we must clear the trailing
> + bits. Otherwise we are not using regno and we should mark it as to
> + clear. */
> + if (last_used_bit != 0)
> + *(padding_bits_to_clear + regno)
> + |= UINT32_MAX - ((uint32_t) 1 << last_used_bit) + 1;
>
> padding_bits_to_clear[regno] |= ...
>
> + else
> + not_to_clear_mask &= ~(HOST_WIDE_INT_1U << regno);
> + }
> + else
> + {
> + not_to_clear_mask = 0;
> + /* We are not dealing with structs nor unions. So these
> arguments may be
> + passed in floating point registers too. In some cases a BLKmode is
> + used when returning or passing arguments in multiple VFP
> registers. */
> + if (GET_MODE (arg_rtx) == BLKmode)
> + {
> + int i, arg_regs;
> + rtx reg;
> +
> + /* This should really only occur when dealing with the hard-float
> + ABI. */
> + gcc_assert (TARGET_HARD_FLOAT_ABI);
> +
> + for (i = 0; i < XVECLEN (arg_rtx, 0); i++)
> + {
> + reg = XEXP (XVECEXP (arg_rtx, 0, i), 0);
> + gcc_assert (REG_P (reg));
> +
> + not_to_clear_mask |= HOST_WIDE_INT_1U << REGNO (reg);
> +
> + /* If we are dealing with DF mode, make sure we don't
> + clear either of the registers it addresses. */
> + arg_regs = ARM_NUM_REGS (GET_MODE (reg));
>
> Better assert here that you're indeed dealing with DFmode and/or you
> have 2 registers.
>
The current code actually works for larger modes too. I don't think code
will ever be generated with larger types, but why assert if it works anyway?