Okay, Based on the comments so far, I will work on the 5th version of the
patch, major changes will include:
1. Change the return type of the routine .ACCESS_WITH_SIZE
FROM:
Pointer to the type of the element of the flexible array;
TO:
Pointer to the type of the flexible array;
And then wrap the call with an indirection reference.
2. Adjust all other parts with this change, (this will simplify the bound
sanitizer instrument);
3. Add the fixes to the kernel building failures, which include:
A. The operator “typeof” cannot return correct type for a->array;
(I guess that the above change 1 might automatically resolve this issue)
B. The operator “&” cannot return correct address for a->array;
4. Correctly handle the case when the value of “counted-by” is zero or negative
as following
4.1 . Update the counted-by doc with the following:
When the counted-by field is assigned a negative integer value, the
compiler will treat the value as zero.
4.2. (possibly) Adjust __bdos and array bound sanitizer to handle
correctly when “counted-by” is zero.
__bdos will return size 0 when counted-by is zero;
Array bound sanitizer will report out-of-bound when the counted-by is
zero for any array access.
Let me know if I missed anything.
Thanks a lot for all the comments
Qing
> On Jan 23, 2024, at 7:29 PM, Qing Zhao <[email protected]> wrote:
>
> Hi,
>
> This is the 4th version of the patch.
>
> It based on the following proposal:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/635884.html
> Represent the missing dependence for the "counted_by" attribute and its
> consumers
>
> ******The summary of the proposal is:
>
> * Add a new internal function ".ACCESS_WITH_SIZE" to carry the size
> information for every reference to a FAM field;
> * In C FE, Replace every reference to a FAM field whose TYPE has the
> "counted_by" attribute with the new internal function ".ACCESS_WITH_SIZE";
> * In every consumer of the size information, for example, BDOS or array bound
> sanitizer, query the size information or ACCESS_MODE information from the new
> internal function;
> * When expansing to RTL, replace the internal function with the actual
> reference to the FAM field;
> * Some adjustment to ipa alias analysis, and other SSA passes to mitigate the
> impact to the optimizer and code generation.
>
>
> ******The new internal function
>
> .ACCESS_WITH_SIZE (REF_TO_OBJ, REF_TO_SIZE, CLASS_OF_SIZE, SIZE_OF_SIZE,
> ACCESS_MODE, INDEX)
>
> INTERNAL_FN (ACCESS_WITH_SIZE, ECF_LEAF | ECF_NOTHROW, NULL)
>
> which returns the "REF_TO_OBJ" same as the 1st argument;
>
> Both the return type and the type of the first argument of this function have
> been converted from the incomplete array type to the corresponding pointer
> type.
>
> Please see the following link for why:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/638793.html
> https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639605.html
>
> 1st argument "REF_TO_OBJ": The reference to the object;
> 2nd argument "REF_TO_SIZE": The reference to the size of the object,
> 3rd argument "CLASS_OF_SIZE": The size referenced by the REF_TO_SIZE
> represents
> 0: unknown;
> 1: the number of the elements of the object type;
> 2: the number of bytes;
> 4th argument "PRECISION_OF_SIZE": The precision of the integer that
> REF_TO_SIZE points;
> 5th argument "ACCESS_MODE":
> -1: Unknown access semantics
> 0: none
> 1: read_only
> 2: write_only
> 3: read_write
> 6th argument "INDEX": the INDEX for the original array reference.
> -1: Unknown
>
> NOTE: The 6th Argument is added for bound sanitizer instrumentation.
>
> ****** The Patch sets included:
>
> 1. Provide counted_by attribute to flexible array member field;
> which includes:
> * "counted_by" attribute documentation;
> * C FE handling of the new attribute;
> syntax checking, error reporting;
> * testing cases;
>
> 2. Convert "counted_by" attribute to/from .ACCESS_WITH_SIZE.
> which includes:
> * The definition of the new internal function .ACCESS_WITH_SIZE in
> internal-fn.def.
> * C FE converts every reference to a FAM with "counted_by" attribute to
> a call to the internal function .ACCESS_WITH_SIZE.
> (build_component_ref in c_typeck.cc)
> This includes the case when the object is statically allocated and
> initialized.
> In order to make this working, we should update
> initializer_constant_valid_p_1 and output_constant in varasm.cc to include
> calls to .ACCESS_WITH_SIZE.
>
> However, for the reference inside "offsetof", ignore the "counted_by"
> attribute since it's not useful at all. (c_parser_postfix_expression in
> c/c-parser.cc)
>
> * Convert every call to .ACCESS_WITH_SIZE to its first argument.
> (expand_ACCESS_WITH_SIZE in internal-fn.cc)
> * adjust alias analysis to exclude the new internal from clobbering
> anything.
> (ref_maybe_used_by_call_p_1 and call_may_clobber_ref_p_1 in
> tree-ssa-alias.cc)
> * adjust dead code elimination to eliminate the call to
> .ACCESS_WITH_SIZE when
> it's LHS is eliminated as dead code.
> (eliminate_unnecessary_stmts in tree-ssa-dce.cc)
> * Provide the utility routines to check the call is .ACCESS_WITH_SIZE and
> get the reference from the call to .ACCESS_WITH_SIZE.
> (is_access_with_size_p and get_ref_from_access_with_size in tree.cc)
> * testing cases. (for offsetof, static initialization, generation of
> calls to
> .ACCESS_WITH_SIZE, code runs correctly after calls to
> .ACCESS_WITH_SIZE are
> converted back)
>
> 3. Use the .ACCESS_WITH_SIZE in builtin object size (sub-object only)
> which includes:
> * use the size info of the .ACCESS_WITH_SIZE for sub-object.
> * testing cases.
>
> 4 Use the .ACCESS_WITH_SIZE in bound sanitizer
> Since the result type of the call to .ACCESS_WITH_SIZE is a pointer to
> the element type. The original array_ref is converted to an
> indirect_ref,
> which introduced two issues for the instrumenation of bound sanitizer:
>
> A. The index for the original array_ref was mixed into the offset
> expression for the new indirect_ref.
>
> In order to make the instrumentation for the bound sanitizer easier,
> one
> more argument for the function .ACCESS_WITH_SIZE is added to record
> this
> original index for the array_ref. then later during bound
> instrumentation,
> get the index from the additional argument from the call to the
> function
> .ACCESS_WITH_SIZE.
>
> B. In the current bound sanitizer, no instrumentation will be inserted
> for
> an indirect_ref.
>
> In order to add instrumentation for an indirect_ref with a call to
> .ACCESS_WITH_SIZE, we should specially handle the indirect_ref with a
> call to .ACCESS_WITH_SIZE, and get the index and bound info from the
> arguments of the call.
>
> which includes:
> * Record the index to the 6th argument of the call to .ACCESS_WITH_SIZE.
> * Instrument indirect_ref with call to .ACCESS_WITH_SIZE for bound
> sanitizer.
> * testing cases. (additional testing cases for dynamic array support)
>
> ******Remaining works:
>
> 5 Improve __bdos to use the counted_by info in whole-object size for the
> structure with FAM.
> 6 Emit warnings when the user breaks the requirments for the new counted_by
> attribute
> compilation time: -Wcounted-by
> run time: -fsanitizer=counted-by
> * The initialization to the size field should be done before the first
> reference to the FAM field.
> * the array has at least # of elements specified by the size field all
> the time during the program.
>
> I have bootstrapped and regression tested on both x86 and aarch64, no issue.
>
> Let me know your comments.
>
> thanks.
>
> Qing