Hi! On 2023-06-29T11:29:57+0200, I wrote: > On 2023-06-21T15:58:24+0800, Pan Li via Gcc-patches <gcc-patches@gcc.gnu.org> > wrote: >> We extend the machine mode from 8 to 16 bits already. But there still >> one placing missing from the streamer. It has one hard coded array >> for the machine code like size 256. >> >> In the lto pass, we memset the array by MAX_MACHINE_MODE count but the >> value of the MAX_MACHINE_MODE will grow as more and more modes are >> added. While the machine mode array in tree-streamer still leave 256 as is. >> >> Then, when the MAX_MACHINE_MODE is greater than 256, the memset of >> lto_output_init_mode_table will touch the memory out of range unexpected. > > Uh. :-O > >> This patch would like to take the MAX_MACHINE_MODE as the size of the >> array in streamer, to make sure there is no potential unexpected >> memory access in future. Meanwhile, this patch also adjust some place >> which has MAX_MACHINE_MODE <= 256 assumption. > > Thanks to Jakub and Richard for guidance re the offloading compilation > case, where we've got different 'MAX_MACHINE_MODE's between stream-out > and stream-in, and a modes mapping table. > > However, with this patch, there are ICEs all over the place... I'm > having a look.
Your patch has all the right ideas, there are just a few additional changes necessary. Please merge in the attached "f into Streamer: Fix out of range memory access of machine mode", with 'Co-authored-by: Thomas Schwinge <tho...@codesourcery.com>'. This has already survived compiler-side 'lto.exp' testing and 'check-target-libgomp' with Nvidia GPU offloading; AMD GPU testing is now running (not expecting any bad surprises). Will let you know by (my) tomorrow morning in case there are any more problems. Explanation: >> --- a/gcc/lto-streamer-in.cc >> +++ b/gcc/lto-streamer-in.cc >> @@ -1985,8 +1985,6 @@ lto_input_mode_table (struct lto_file_decl_data >> *file_data) >> internal_error ("cannot read LTO mode table from %s", >> file_data->file_name); >> >> - unsigned char *table = ggc_cleared_vec_alloc<unsigned char> (1 << 8); >> - file_data->mode_table = table; >> const struct lto_simple_header_with_strings *header >> = (const struct lto_simple_header_with_strings *) data; >> int string_offset; >> @@ -1998,16 +1996,22 @@ lto_input_mode_table (struct lto_file_decl_data >> *file_data) >> header->string_size, vNULL); >> bitpack_d bp = streamer_read_bitpack (&ib); >> >> + unsigned mode_bits = bp_unpack_value (&bp, 5); >> + unsigned char *table = ggc_cleared_vec_alloc<unsigned char> (1 << >> mode_bits); >> + >> + file_data->mode_table = table; >> + file_data->mode_bits = mode_bits; Here, we set 'file_data->mode_bits' for the offloading case (where 'lto_input_mode_table' is called) -- but it's not set for the non-offloading case (where 'lto_input_mode_table' isn't called). (See my 'gcc/lto/lto-common.cc:lto_read_decls' change.) That's "not currently a problem", as 'file_data->mode_bits' isn't used anywhere... >> --- a/gcc/lto-streamer.h >> +++ b/gcc/lto-streamer.h >> @@ -604,6 +604,8 @@ struct GTY(()) lto_file_decl_data >> int order_base; >> >> int unit_base; >> + >> + unsigned mode_bits; >> }; >> inline machine_mode >> bp_unpack_machine_mode (struct bitpack_d *bp) >> { >> - return (machine_mode) >> - ((class lto_input_block *) >> - bp->stream)->mode_table[bp_unpack_enum (bp, machine_mode, 1 << 8)]; >> + int last = 1 << ceil_log2 (MAX_MACHINE_MODE); >> + lto_input_block *input_block = (class lto_input_block *) bp->stream; >> + int index = bp_unpack_enum (bp, machine_mode, last); >> + >> + return (machine_mode) input_block->mode_table[index]; >> } ..., but 'file_data->mode_bits' needs to be considered here, in the stream-in for offloading, where 'file_data->mode_bits' -- that is, the host 'MAX_MACHINE_MODE' -- very likely is different from the offload device 'MAX_MACHINE_MODE'. Easiest is in 'gcc/lto-streamer.h:class lto_input_block' to capture 'lto_file_decl_data *file_data' instead of just 'unsigned char *mode_table', and adjust all users. That's it. :-) >> --- a/gcc/tree-streamer.h >> +++ b/gcc/tree-streamer.h >> @@ -108,15 +108,19 @@ inline void >> bp_pack_machine_mode (struct bitpack_d *bp, machine_mode mode) >> { >> streamer_mode_table[mode] = 1; >> - bp_pack_enum (bp, machine_mode, 1 << 8, mode); >> + int last = 1 << ceil_log2 (MAX_MACHINE_MODE); >> + >> + bp_pack_enum (bp, machine_mode, last, mode); >> } That use of 'MAX_MACHINE_MODE' is safe, as that only concerns the stream-out phase. >> --- a/gcc/tree-streamer.cc >> +++ b/gcc/tree-streamer.cc >> @@ -35,7 +35,7 @@ along with GCC; see the file COPYING3. If not see >> During streaming in, we translate the on the disk mode using this >> table. For normal LTO it is set to identity, for ACCEL_COMPILER >> depending on the mode_table content. */ >> -unsigned char streamer_mode_table[1 << 8]; >> +unsigned char streamer_mode_table[MAX_MACHINE_MODE]; Likewise. Grüße Thomas ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
>From 88ff74f043235735701f71cdb51a83315f8a3895 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge <tho...@codesourcery.com> Date: Thu, 29 Jun 2023 21:33:06 +0200 Subject: [PATCH] f into Streamer: Fix out of range memory access of machine mode gcc/ * lto-streamer.h (class lto_input_block): Capture 'lto_file_decl_data *file_data' instead of just 'unsigned char *mode_table'. Adjust all users. * tree-streamer.h (bp_unpack_machine_mode): Use 'file_data->mode_bits'. gcc/lto/ * lto-common.cc (lto_read_decls) [!ACCEL_COMPILER]: Initialize 'file_data->mode_bits'. --- gcc/ipa-devirt.cc | 2 +- gcc/ipa-fnsummary.cc | 2 +- gcc/ipa-icf.cc | 2 +- gcc/ipa-modref.cc | 2 +- gcc/ipa-prop.cc | 4 ++-- gcc/ipa-sra.cc | 2 +- gcc/lto-cgraph.cc | 2 +- gcc/lto-section-in.cc | 2 +- gcc/lto-streamer-in.cc | 6 +++--- gcc/lto-streamer.h | 10 +++++----- gcc/lto/lto-common.cc | 3 ++- gcc/tree-streamer.h | 6 +++--- 12 files changed, 22 insertions(+), 21 deletions(-) diff --git a/gcc/ipa-devirt.cc b/gcc/ipa-devirt.cc index 2c61a497cee..87529be4515 100644 --- a/gcc/ipa-devirt.cc +++ b/gcc/ipa-devirt.cc @@ -4147,7 +4147,7 @@ ipa_odr_read_section (struct lto_file_decl_data *file_data, const char *data, class data_in *data_in; lto_input_block ib ((const char *) data + main_offset, header->main_size, - file_data->mode_table); + file_data); data_in = lto_data_in_create (file_data, (const char *) data + string_offset, diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc index a5f5a50c8a5..37c1edc2f3a 100644 --- a/gcc/ipa-fnsummary.cc +++ b/gcc/ipa-fnsummary.cc @@ -4528,7 +4528,7 @@ inline_read_section (struct lto_file_decl_data *file_data, const char *data, unsigned int f_count; lto_input_block ib ((const char *) data + main_offset, header->main_size, - file_data->mode_table); + file_data); data_in = lto_data_in_create (file_data, (const char *) data + string_offset, diff --git a/gcc/ipa-icf.cc b/gcc/ipa-icf.cc index cb9f768d85d..836d0914ded 100644 --- a/gcc/ipa-icf.cc +++ b/gcc/ipa-icf.cc @@ -2204,7 +2204,7 @@ sem_item_optimizer::read_section (lto_file_decl_data *file_data, unsigned int count; lto_input_block ib_main ((const char *) data + main_offset, 0, - header->main_size, file_data->mode_table); + header->main_size, file_data); data_in = lto_data_in_create (file_data, (const char *) data + string_offset, diff --git a/gcc/ipa-modref.cc b/gcc/ipa-modref.cc index e3196df8aa9..278b2dbd828 100644 --- a/gcc/ipa-modref.cc +++ b/gcc/ipa-modref.cc @@ -3816,7 +3816,7 @@ read_section (struct lto_file_decl_data *file_data, const char *data, unsigned int f_count; lto_input_block ib ((const char *) data + main_offset, header->main_size, - file_data->mode_table); + file_data); data_in = lto_data_in_create (file_data, (const char *) data + string_offset, diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc index 704fe01b02c..8f2119b72e3 100644 --- a/gcc/ipa-prop.cc +++ b/gcc/ipa-prop.cc @@ -5337,7 +5337,7 @@ ipa_prop_read_section (struct lto_file_decl_data *file_data, const char *data, unsigned int count; lto_input_block ib_main ((const char *) data + main_offset, - header->main_size, file_data->mode_table); + header->main_size, file_data); data_in = lto_data_in_create (file_data, (const char *) data + string_offset, @@ -5561,7 +5561,7 @@ read_replacements_section (struct lto_file_decl_data *file_data, unsigned int count; lto_input_block ib_main ((const char *) data + main_offset, - header->main_size, file_data->mode_table); + header->main_size, file_data); data_in = lto_data_in_create (file_data, (const char *) data + string_offset, header->string_size, vNULL); diff --git a/gcc/ipa-sra.cc b/gcc/ipa-sra.cc index 21d281a9756..c35e03b7abd 100644 --- a/gcc/ipa-sra.cc +++ b/gcc/ipa-sra.cc @@ -2944,7 +2944,7 @@ isra_read_summary_section (struct lto_file_decl_data *file_data, unsigned int count; lto_input_block ib_main ((const char *) data + main_offset, - header->main_size, file_data->mode_table); + header->main_size, file_data); data_in = lto_data_in_create (file_data, (const char *) data + string_offset, diff --git a/gcc/lto-cgraph.cc b/gcc/lto-cgraph.cc index aed5e9ddb18..32c0f5ac6db 100644 --- a/gcc/lto-cgraph.cc +++ b/gcc/lto-cgraph.cc @@ -2174,7 +2174,7 @@ input_cgraph_opt_section (struct lto_file_decl_data *file_data, unsigned int count; lto_input_block ib_main ((const char *) data + main_offset, - header->main_size, file_data->mode_table); + header->main_size, file_data); data_in = lto_data_in_create (file_data, (const char *) data + string_offset, diff --git a/gcc/lto-section-in.cc b/gcc/lto-section-in.cc index 07cf7326582..5ff00a3c130 100644 --- a/gcc/lto-section-in.cc +++ b/gcc/lto-section-in.cc @@ -262,7 +262,7 @@ lto_create_simple_input_block (struct lto_file_decl_data *file_data, *datar = data; return new lto_input_block (data + main_offset, header->main_size, - file_data->mode_table); + file_data); } diff --git a/gcc/lto-streamer-in.cc b/gcc/lto-streamer-in.cc index 2a0720b4e6f..1876e1967ec 100644 --- a/gcc/lto-streamer-in.cc +++ b/gcc/lto-streamer-in.cc @@ -1629,11 +1629,11 @@ lto_read_body_or_constructor (struct lto_file_decl_data *file_data, struct symta /* Set up the struct function. */ from = data_in->reader_cache->nodes.length (); lto_input_block ib_main (data + main_offset, header->main_size, - file_data->mode_table); + file_data); if (TREE_CODE (node->decl) == FUNCTION_DECL) { lto_input_block ib_cfg (data + cfg_offset, header->cfg_size, - file_data->mode_table); + file_data); input_function (fn_decl, data_in, &ib_main, &ib_cfg, dyn_cast <cgraph_node *>(node)); } @@ -1954,7 +1954,7 @@ lto_input_toplevel_asms (struct lto_file_decl_data *file_data, int order_base) string_offset = sizeof (*header) + header->main_size; lto_input_block ib (data + sizeof (*header), header->main_size, - file_data->mode_table); + file_data); data_in = lto_data_in_create (file_data, data + string_offset, header->string_size, vNULL); diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h index 443f0cd616e..0556b34c837 100644 --- a/gcc/lto-streamer.h +++ b/gcc/lto-streamer.h @@ -344,14 +344,14 @@ public: /* Special constructor for the string table, it abuses this to do random access but use the uhwi decoder. */ lto_input_block (const char *data_, unsigned int p_, unsigned int len_, - const unsigned char *mode_table_) - : data (data_), mode_table (mode_table_), p (p_), len (len_) {} + const lto_file_decl_data *file_data_) + : data (data_), file_data (file_data_), p (p_), len (len_) {} lto_input_block (const char *data_, unsigned int len_, - const unsigned char *mode_table_) - : data (data_), mode_table (mode_table_), p (0), len (len_) {} + const lto_file_decl_data *file_data_) + : data (data_), file_data (file_data_), p (0), len (len_) {} const char *data; - const unsigned char *mode_table; + const lto_file_decl_data *file_data; unsigned int p; unsigned int len; }; diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc index 537570204b3..973ab791712 100644 --- a/gcc/lto/lto-common.cc +++ b/gcc/lto/lto-common.cc @@ -1880,7 +1880,7 @@ lto_read_decls (struct lto_file_decl_data *decl_data, const void *data, uint32_t num_decl_states; lto_input_block ib_main ((const char *) data + main_offset, - header->main_size, decl_data->mode_table); + header->main_size, decl_data); data_in = lto_data_in_create (decl_data, (const char *) data + string_offset, header->string_size, resolutions); @@ -2275,6 +2275,7 @@ lto_file_finalize (struct lto_file_decl_data *file_data, lto_file *file, lto_input_mode_table (file_data); #else file_data->mode_table = lto_mode_identity_table; + file_data->mode_bits = ceil_log2 (MAX_MACHINE_MODE); #endif data = lto_get_summary_section_data (file_data, LTO_section_decls, &len); diff --git a/gcc/tree-streamer.h b/gcc/tree-streamer.h index ff8bccf901a..db01c8c7678 100644 --- a/gcc/tree-streamer.h +++ b/gcc/tree-streamer.h @@ -116,11 +116,11 @@ bp_pack_machine_mode (struct bitpack_d *bp, machine_mode mode) inline machine_mode bp_unpack_machine_mode (struct bitpack_d *bp) { - int last = 1 << ceil_log2 (MAX_MACHINE_MODE); - lto_input_block *input_block = (class lto_input_block *) bp->stream; + lto_input_block *ib = (class lto_input_block *) bp->stream; + int last = 1 << ib->file_data->mode_bits; int index = bp_unpack_enum (bp, machine_mode, last); - return (machine_mode) input_block->mode_table[index]; + return (machine_mode) ib->file_data->mode_table[index]; } #endif /* GCC_TREE_STREAMER_H */ -- 2.34.1