It looks like we cannot simply swap the code and mode fields in rtx_def: the
code field apparently has to keep the same bit width as tree_code in tree_base,
otherwise we hit an ICE like the one below.
rtx_def code 16 => 8 bits.
rtx_def mode 8 => 16 bits.
static inline decl_or_value
dv_from_value (rtx value)
{
  decl_or_value dv;
  dv = value;
  gcc_checking_assert (dv_is_value_p (dv)); /* <= ICE here */
  return dv;
}
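
For anyone following along, here is a toy model of why I think the two widths
have to agree. It assumes (my reading, not confirmed in this thread) that
dv_is_value_p inspects the leading code field through the tree view of the
pointer. The constant kValueCode, the helper names, and the low-bits-first
packing are all hypothetical and are not GCC's real layout.

```cpp
#include <cstdint>

// Hypothetical code number standing in for the VALUE rtx code.
constexpr std::uint32_t kValueCode = 1;

// Pack an rtx-style header: code in the low `code_bits`, mode right above it.
constexpr std::uint32_t pack_rtx(std::uint32_t code, std::uint32_t mode,
                                 unsigned code_bits) {
  return code | (mode << code_bits);
}

// Read the code the way a tree-style reader would: low `tree_code_bits` only.
constexpr std::uint32_t read_code_as_tree(std::uint32_t word,
                                          unsigned tree_code_bits) {
  return word & ((std::uint32_t{1} << tree_code_bits) - 1);
}

// Discriminator in the spirit of dv_is_value_p (simplified assumption).
constexpr bool looks_like_value(std::uint32_t word, unsigned tree_code_bits) {
  return read_code_as_tree(word, tree_code_bits) == kValueCode;
}

// Same width on both sides: the VALUE code is recovered.
static_assert(looks_like_value(pack_rtx(kValueCode, 5, 16), 16), "16/16");
// rtx code narrowed to 8 bits, tree code still 16: mode bits leak into the code.
static_assert(!looks_like_value(pack_rtx(kValueCode, 5, 8), 16), "8/16");
// Both narrowed to 12 bits: the check works again.
static_assert(looks_like_value(pack_rtx(kValueCode, 5, 12), 12), "12/12");
```

In this model the 8/16 case fails as soon as the mode is nonzero, which would
be consistent with the assert firing above.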
Thus we would also need to apply the same width change to tree_code, as shown
below. Unfortunately, 8 bits may not be sufficient there, because the build
then warns:
"../../gcc/tree-core.h:1034:28: warning: ‘tree_base::code’ is too small to hold
all values of ‘enum tree_code’".
tree_base code 16 => 8 bits.
So one possible approach for the bit adjustment may look like below. I am not
sure whether it is reasonable. Any ideas about this? Thank you all in
advance 😉.
rtx_def code 16 => 12 bits.
rtx_def mode 8 => 12 bits.
tree_base code 16 => 12 bits.
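
A rough sanity check of the 12/12 split, as a sketch only. bits_needed and
ModelHeader are hypothetical names, and the 8 remaining flag bits are my
assumption about what else shares the 32-bit rtx_def header word. The counts
252 and 512 are the mode counts mentioned elsewhere in this thread.

```cpp
#include <cstdint>

// Smallest bit-field width that can hold `count` distinct enumerator values.
constexpr unsigned bits_needed(std::uint64_t count) {
  unsigned bits = 0;
  while ((std::uint64_t{1} << bits) < count)
    ++bits;
  return bits;
}

// Hypothetical stand-in for the first word of rtx_def after the change.
struct ModelHeader {
  unsigned code : 12;
  unsigned mode : 12;
  unsigned flags : 8;  // assumed: the flag bits that follow code and mode
};

static_assert(bits_needed(252) == 8, "252 modes still fit in 8 bits");
static_assert(bits_needed(512) == 9, "512 modes need 9 bits");
static_assert(bits_needed(4096) == 12, "12 bits hold up to 4096 values");
static_assert(12 + 12 + 8 == 32, "12/12 split keeps a 32-bit budget");
```

So 12 bits for mode leaves plenty of headroom over the 9 bits RVV asked for;
whether 12 bits is enough for every front end's tree_code is the open question.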
Pan
-----Original Message-----
From: Li, Pan2
Sent: Saturday, May 6, 2023 10:49 AM
To: 'Kito Cheng' <[email protected]>
Cc: '[email protected]' <[email protected]>; 'rguenther'
<[email protected]>; 'richard.sandiford' <[email protected]>;
'jeffreyalaw' <[email protected]>; 'gcc-patches' <[email protected]>;
'palmer' <[email protected]>; 'jakub' <[email protected]>
Subject: RE: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to
16-bit
I have picked all the changes mentioned previously into a single patch,
attached. Please help review it and point out any mistakes.
Pan
-----Original Message-----
From: Li, Pan2
Sent: Saturday, May 6, 2023 10:20 AM
To: Kito Cheng <[email protected]>
Cc: [email protected]; rguenther <[email protected]>; richard.sandiford
<[email protected]>; jeffreyalaw <[email protected]>; gcc-patches
<[email protected]>; palmer <[email protected]>; jakub <[email protected]>
Subject: RE: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to
16-bit
Yes, that makes sense, will have a try and keep you posted.
Pan
-----Original Message-----
From: Kito Cheng <[email protected]>
Sent: Saturday, May 6, 2023 10:19 AM
To: Li, Pan2 <[email protected]>
Cc: [email protected]; rguenther <[email protected]>; richard.sandiford
<[email protected]>; jeffreyalaw <[email protected]>; gcc-patches
<[email protected]>; palmer <[email protected]>; jakub <[email protected]>
Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from 8-bit to
16-bit
I think x86 first? The major thing we want to make sure is that this change
won't affect those targets which do not really require 16 bit machine_mode too
much.
On Sat, May 6, 2023 at 10:12 AM Li, Pan2 via Gcc-patches
<[email protected]> wrote:
>
> Sure thing, I will pick them all together and trigger the test again (I will
> send out the overall diff before starting, to make sure my understanding is
> correct). BTW, which target do we prefer first, x86 or RISC-V?
>
> Pan
>
> From: [email protected] <[email protected]>
> Sent: Saturday, May 6, 2023 10:00 AM
> To: kito.cheng <[email protected]>; Li, Pan2 <[email protected]>
> Cc: rguenther <[email protected]>; richard.sandiford
> <[email protected]>; jeffreyalaw <[email protected]>;
> gcc-patches <[email protected]>; palmer <[email protected]>;
> jakub <[email protected]>
> Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from
> 8-bit to 16-bit
>
> Yeah, you should also swap mode and code in rtx_def according to
> Richard's suggestion, since that will not change the size of rtx_def.
>
> I think the only remaining problem is the mode in the tree data structures.
> ________________________________
> [email protected]<mailto:[email protected]>
>
> From: Kito Cheng<mailto:[email protected]>
> Date: 2023-05-06 09:53
> To: Li, Pan2<mailto:[email protected]>
> CC: Richard Biener<mailto:[email protected]>;
> 钟居哲<mailto:[email protected]>;
> richard.sandiford<mailto:[email protected]>; Jeff
> Law<mailto:[email protected]>;
> gcc-patches<mailto:[email protected]>;
> palmer<mailto:[email protected]>; jakub<mailto:[email protected]>
> Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size from
> 8-bit to 16-bit
>
> Hi Pan:
>
> Could you try to apply the following diff and measure again? This
> makes tree_type_common size unchanged.
>
>
> sizeof tree_type_common = 128 (mode = 8 bit)
> sizeof tree_type_common = 136 (mode = 16 bit)
> sizeof tree_type_common = 128 (mode = 16 bit w/ this diff)
>
> diff --git a/gcc/tree-core.h b/gcc/tree-core.h
> index af795aa81f98..b8ccfa407ed9 100644
> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -1680,6 +1680,8 @@ struct GTY(()) tree_type_common {
>    tree attributes;
>    unsigned int uid;
>
> +  ENUM_BITFIELD(machine_mode) mode : 16;
> +
>    unsigned int precision : 10;
>    unsigned no_force_blk_flag : 1;
>    unsigned needs_constructing_flag : 1;
> @@ -1687,7 +1689,6 @@ struct GTY(()) tree_type_common {
>    unsigned restrict_flag : 1;
>    unsigned contains_placeholder_bits : 2;
>
> -  ENUM_BITFIELD(machine_mode) mode : 16;
>
>    /* TYPE_STRING_FLAG for INTEGER_TYPE and ARRAY_TYPE.
>       TYPE_CXX_ODR_P for RECORD_TYPE and UNION_TYPE.  */
> @@ -1712,7 +1713,7 @@ struct GTY(()) tree_type_common {
>    unsigned empty_flag : 1;
>    unsigned indivisible_p : 1;
>    unsigned no_named_args_stdarg_p : 1;
> -  unsigned spare : 15;
> +  unsigned spare : 7;
>
>    alias_set_type alias_set;
>    tree pointer_to;
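
To illustrate why moving the field can keep the size unchanged, here is a
reduced sketch. ModeLate and ModeEarly are hypothetical stand-ins for
tree_type_common with the flag groups merged into flags_a/b/c; they are not the
real struct. Bit-field layout is implementation-defined, so the byte counts in
the comments are only what I would expect on a typical LP64 ABI.

```cpp
#include <cstddef>

// Mode placed after the first 16 flag bits, spare left at 15 bits.
struct ModeLate {
  void *attributes;
  unsigned uid;
  unsigned precision : 10;
  unsigned flags_a : 6;
  unsigned mode : 16;     // 10 + 6 + 16 = 32: fills the first unit
  unsigned flags_b : 8;
  unsigned flags_c : 16;
  unsigned spare : 15;    // 8 + 16 + 15 = 39: spills into a third unit
  int alias_set;
  void *pointer_to;
};

// Mode moved up front and spare shrunk to 7 bits, as in the diff above.
struct ModeEarly {
  void *attributes;
  unsigned uid;
  unsigned mode : 16;
  unsigned precision : 10;
  unsigned flags_a : 6;   // 16 + 10 + 6 = 32: fills the first unit
  unsigned flags_b : 8;
  unsigned flags_c : 16;
  unsigned spare : 7;     // 8 + 16 + 7 = 31: fits in the second unit
  int alias_set;
  void *pointer_to;
};

// Bytes saved by the reordered layout (0 if the ABI packs both the same).
constexpr std::size_t bytes_saved() {
  return sizeof(ModeLate) >= sizeof(ModeEarly)
             ? sizeof(ModeLate) - sizeof(ModeEarly)
             : 0;
}
```

On x86-64 Linux I would expect sizeof (ModeLate) to be 40 and
sizeof (ModeEarly) to be 32, mirroring the 136 versus 128 above, but I have not
checked other ABIs.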
>
> On Sat, May 6, 2023 at 9:10 AM Li, Pan2 via Gcc-patches
> <[email protected]<mailto:[email protected]>> wrote:
> >
> > Yes, I totally agree that the numbers cannot be very accurate. Below are
> > the corresponding bytes allocated for the x86 target.
> >
> > Bytes allocated with O2:
> > -----------------------------------------------------------------------------------------------------
> > Benchmark | upstream | with this PATCH
> > -----------------------------------------------------------------------------------------------------
> > 400.perlbench | 25286185160 | 25176544846 ~0.0%
> > 401.bzip2 | 1429883731 | 1391040027 -2.7%
> > 403.gcc | 55023568981 | 54798890746 ~0.0%
> > 429.mcf | 1360975660 | 1321537710 -2.9%
> > 445.gobmk | 12791636502 | 12666523431 -1.0%
> > 456.hmmer | 9354433652 | 9279189174 ~0.0%
> > 458.sjeng | 1991260562 | 1944031904 -2.4%
> > 462.libquantum | 1725112078 | 1684213981 -2.4%
> > 464.h264ref | 8597673515 | 8528855778 ~0.0%
> > 471.omnetpp | 37613034778 | 37432278047 ~0.0%
> > 473.astar | 3817295518 | 3772460508 -1.2%
> > 483.xalancbmk | 149418776991 | 148545162207 ~0.0%
> >
> > Bytes allocated with Ofast + funroll-loops:
> > ------------------------------------------------------------------------------------------
> > Benchmark | upstream | with this PATCH
> > ------------------------------------------------------------------------------------------
> > 400.perlbench | 30438407499 | 30574152897 ~0.0%
> > 401.bzip2 | 2277114519 | 2319432664 +1.9%
> > 403.gcc | 64499664264 | 64781232731 ~0.0%
> > 429.mcf | 1361486758 | 1399942116 +2.8%
> > 445.gobmk | 15258056111 | 15396801542 +1.0%
> > 456.hmmer | 10896615649 | 10936223486 ~0.0%
> > 458.sjeng | 2592620709 | 2641687496 +1.9%
> > 462.libquantum | 1814487525 | 1854518500 +2.2%
> > 464.h264ref | 13528736878 | 13614517066 ~0.0%
> > 471.omnetpp | 38721066702 | 38910524667 ~0.0%
> > 473.astar | 3924015756 | 3968057027 +1.1%
> > 483.xalancbmk | 165897692838 | 166843885880 ~0.0%
> >
> > Pan
> >
> >
> > -----Original Message-----
> > From: Richard Biener <[email protected]<mailto:[email protected]>>
> > Sent: Friday, May 5, 2023 2:25 PM
> > To: Li, Pan2 <[email protected]<mailto:[email protected]>>
> > Cc: 钟居哲 <[email protected]<mailto:[email protected]>>;
> > kito.cheng <[email protected]<mailto:[email protected]>>;
> > richard.sandiford
> > <[email protected]<mailto:[email protected]>>; Jeff
> > Law <[email protected]<mailto:[email protected]>>;
> > gcc-patches
> > <[email protected]<mailto:[email protected]>>; palmer
> > <[email protected]<mailto:[email protected]>>; jakub
> > <[email protected]<mailto:[email protected]>>
> > Subject: RE: Re: [PATCH] machine_mode type size: Extend enum size
> > from 8-bit to 16-bit
> >
> > On Fri, 5 May 2023, Li, Pan2 wrote:
> >
> > > > I tried memory profiling for this change with valgrind
> > > > --tool=memcheck --trace-children=yes, targeting the SPEC 2006 INT part
> > > > with rv64gcv. Note that we only count the bytes allocated reported in
> > > > the valgrind log, like this: "==2832896== total heap usage: 208 allocs,
> > > > 165 frees, 123,204 bytes allocated".
> > > >
> > > > Allowing for some variance in valgrind, it looks like the impact on
> > > > bytes allocated may be limited. However, I am still running this
> > > > for x86; it will take more than 30 hours for each iteration...
> >
> > I'm not sure I'd call +- 7% on memory use "limited" - but I fear the
> > numbers are off. Note since various structures reside in GC memory there's
> > also changes to GC overhead and fragmentation, so precise measurements are
> > difficult.
> >
> > Richard.
> >
> > > RISC-V GCC Version:
> > > > >> ~/bin/test-gnu-8-bits/bin/riscv64-unknown-linux-gnu-gcc --version
> > > > riscv64-unknown-linux-gnu-gcc (gd7cb9720ed5) 14.0.0 20230503 (experimental)
> > > > Copyright (C) 2023 Free Software Foundation, Inc.
> > > > This is free software; see the source for copying conditions.
> > > > There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
> > > > PARTICULAR PURPOSE.
> > >
> > > Bytes allocated with O2:
> > > -----------------------------------------------------------------------------------------------------
> > > Benchmark | upstream | with this PATCH
> > > -----------------------------------------------------------------------------------------------------
> > > 400.perlbench | 29699642875 | 29949876269 ~0.0%
> > > 401.bzip2 | 1641041659 | 1755563972 +6.95%
> > > 403.gcc | 68447500516 | 68900883291 ~0.0%
> > > 429.mcf | 1433156462 | 1433253373 ~0.0%
> > > 445.gobmk | 14239225210 | 14463438465 ~0.0%
> > > 456.hmmer | 9635955623 | 9808534948 +1.8%
> > > 458.sjeng | 2419478204 | 2545478940 +5.4%
> > > 462.libquantum | 1686404489 | 1800884197 +6.8%
> > > > 464.h264ref | 10190413900 | 10351134161 +1.6%
> > > 471.omnetpp | 40814627684 | 41185864529 ~0.0%
> > > 473.astar | 3807097529 | 3928428183 +3.2%
> > > 483.xalancbmk | 152959418167 | 154201738843 ~0.0%
> > >
> > > Bytes allocated with Ofast + funroll-loops:
> > > ------------------------------------------------------------------------------------------
> > > Benchmark | upstream | with this PATCH
> > > ------------------------------------------------------------------------------------------
> > > 400.perlbench | 39491184733 | 39223020267 ~0.0%
> > > > 401.bzip2 | 2843871517 | 2730383463 -4.0%
> > > > 403.gcc | 84195991898 | 83730632955 ~0.0%
> > > 429.mcf | 1481381164 | 1367309565 -7.7%
> > > 445.gobmk | 20123943663 | 19886116394 -1.2%
> > > 456.hmmer | 12302445139 | 12121745383 -1.5%
> > > 458.sjeng | 3884712615 | 3755481930 -3.3%
> > > 462.libquantum | 1966619940 | 1852274342 -5.8%
> > > 464.h264ref | 19219365552 | 19050288201 ~0.0%
> > > 471.omnetpp | 45701008325 | 45327805079 ~0.0%
> > > 473.astar | 4118600354 | 3995943705 -3.0%
> > > 483.xalancbmk | 179481305182 | 178160306301 ~0.0%
> > >
> > > Pan
> > >
> > >
> > > -----Original Message-----
> > > From: Gcc-patches
> > > <[email protected]<mailto:[email protected]>>
> > > > On Behalf Of 钟居哲
> > > Sent: Thursday, April 13, 2023 7:23 AM
> > > To: kito.cheng
> > > <[email protected]<mailto:[email protected]>>; rguenther
> > > <[email protected]<mailto:[email protected]>>
> > > Cc: richard.sandiford
> > > <[email protected]<mailto:[email protected]>>;
> > > Jeff Law <[email protected]<mailto:[email protected]>>;
> > > gcc-patches
> > > <[email protected]<mailto:[email protected]>>; palmer
> > > <[email protected]<mailto:[email protected]>>; jakub
> > > <[email protected]<mailto:[email protected]>>
> > > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size
> > > from 8-bit to 16-bit
> > >
> > > > Yeah, like Kito said.
> > > > It turns out the tuple type model in ARM SVE is the optimal solution
> > > > for RVV, and we like the ARM SVE style implementation.
> > > >
> > > > We now see that swapping rtx_code and mode in rtx_def keeps rtx_def
> > > > overall within 64 bits.
> > > > But it seems there is still a problem in tree_type_common and
> > > > tree_decl_common, is that right?
> > > >
> > > > After several tries (removing all redundant TI/TF vector modes and FP16
> > > > vector modes), there are now 252 modes in the RISC-V port. Basically, I
> > > > can keep supporting the recent new RVV intrinsics features.
> > > > However, we can't support more in the future, for example FP16 vectors,
> > > > BF16 vectors, matrix modes, VLS modes, etc.
> > > >
> > > > From the RVV side, I think extending machine mode by 1 more bit should
> > > > be enough for RVV (512 modes overall).
> > > > Is it possible to make that happen in tree_type_common and
> > > > tree_decl_common, Richards?
> > >
> > > Thank you so much for all comments.
> > >
> > >
> > > [email protected]<mailto:[email protected]>
> > >
> > > From: Kito Cheng
> > > Date: 2023-04-12 17:31
> > > To: Richard Biener
> > > CC: [email protected]<mailto:[email protected]>;
> > > richard.sandiford; jeffreyalaw; gcc-patches; palmer; jakub
> > > Subject: Re: Re: [PATCH] machine_mode type size: Extend enum size
> > > from 8-bit to 16-bit
> > > > > The concept of fractional LMUL is the same as the concept of
> > > > > AArch64's partial SVE vectors, so they can only access the
> > > > > lowest part, like SVE's partial vector.
> > > > >
> > > > > We want to spill/restore the exact size of those modes (1/2,
> > > > > 1/4, 1/8), so adding dedicated modes for those partial vector
> > > > > modes should be unavoidable IMO.
> > > > >
> > > > > And even if we use sub-vector, we still need to define those
> > > > > partial vector types.
> > > >
> > > > Could you use integer modes for the fractional vectors?
> > >
> > > You mean using the scalar integer mode like using (subreg:SI
> > > (reg:VNx4SI) 0) to represent
> > > LMUL=1/4?
> > > (Assume VNx4SI is mode for M1)
> > >
> > > If so I think it might not be able to model that right - it seems like we
> > > are using 32-bits but actually we are using poly_int16(1, 1) * 32 bits.
> > >
> > > > For computation you can always appropriately limit the LEN?
> > >
> > > > RVV provides zvl*b extensions of the form zvl<N>b (e.g. zvl128b or
> > > > zvl256b) to guarantee that the vector length is at least N bits, but
> > > > that only guarantees the minimal length, just as SVE guarantees that
> > > > the minimal vector length is 128 bits.
> > >
> > >
> >
> > --
> > Richard Biener <[email protected]<mailto:[email protected]>>
> > SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461
> > Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald,
> > Boudien Moerman; HRB 36809 (AG Nuernberg)
>