[Bug c++/115240] New: [alias] Does we assume the math function have pure attribute ?

2024-05-27 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * test: https://gcc.godbolt.org/z/5YbezdW89 ``` float foo (float num[], float r2inv, int n) { float sum = 0.0; for (int i=0; i

[Bug c/112306] New: [AArch64][neon] incorrect combine the (a -1)* b into fnmsub for fixed vector type

2023-10-31 Thread zhongyunde at huawei dot com via Gcc-bugs
Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * test: https://gcc.godbolt.org/z/sr6Mevf9G ``` float32x4_t test2_float_vec (float32x4_t a, float32x4_t b

[Bug c/111584] New: [aarch64] Redundant movprfx with ptrue

2023-09-25 Thread zhongyunde at huawei dot com via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * test: https://gcc.godbolt.org/z/E6Eez81jh ``` #include typedef svfloat32_t fvec32 __attribute__((arm_sve_vector_bits(256))); typedef svfloat32_t __m256_; __m256_

[Bug c/110638] New: [13 regression] memcpy should be inlined with sve loop

2023-07-12 Thread zhongyunde at huawei dot com via Gcc-bugs
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * test:https://gcc.godbolt.org/z/39KddjbE4 ``` void va(struct args_t * func_args) { for (int nl = 0; nl < iterations; nl++) { for (int i = 0; i <

[Bug c/110103] New: the pointers return from two malloc is not equal

2023-06-03 Thread zhongyunde at huawei dot com via Gcc-bugs
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- test:https://gcc.godbolt.org/z/j74z1qaT9 ``` int check_pointer (void) { int *pa = (int *) malloc (sizeof (int) * NUM); int *pb = (int *) malloc (sizeof (int) * NUM

[Bug c/109269] [sve] should check the upper bound for predicate sve

2023-03-23 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109269 --- Comment #3 from vfdff --- * test: https://gcc.godbolt.org/z/5s4Wbs466 ``` void mset (int *a, int num) { for (int i=0; i< num; i++) a[i] = 2; } ``` * the issue still exists with the int type, as we use a 32-bit register? See detail on

[Bug c/109269] New: [sve] should check the upper bound for predicate sve

2023-03-23 Thread zhongyunde at huawei dot com via Gcc-bugs
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * test case:https://gcc.godbolt.org/z/jde11xv53 ``` void mset (int *a, long long num) { for (long long i=0; i< num; i++) a[i] = 2; } ``` * Base on above c

[Bug c/108818] New: [aarch64] use a extra mov instruction compare to llvm

2023-02-16 Thread zhongyunde at huawei dot com via Gcc-bugs
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * test:https://gcc.godbolt.org/z/res6aTYqP ``` unsigned sel(unsigned X) { return X == 6 ? 6 : 8; } ``` * gcc: ``` sel: cmp w0, 6 mov w1, 8

[Bug middle-end/106323] [Suboptimal] memcmp(s1, s2, n) == 0 expansion on AArch64 compare to llvm

2022-12-06 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106323 --- Comment #4 from vfdff --- Now, llvm uses 4 loads and CMP+CCMP, https://gcc.godbolt.org/z/PM3jxEM9M

RE: [PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations

2022-11-05 Thread Zhongyunde via Gcc-patches
> -Original Message- > From: Andrew Pinski [mailto:pins...@gcc.gnu.org] > Sent: Saturday, November 5, 2022 2:34 PM > To: Zhongyunde > Cc: hongtao@intel.com; gcc-patches@gcc.gnu.org; Zhangwen(Esan) > ; Weiwei (weiwei, Compiler) > ; zhong_1985...@163.com > Subj

[PATCH] [PHIOPT] Add A ? B + CST : B match and simplify optimizations

2022-11-05 Thread Zhongyunde via Gcc-patches
Hi, this patch tries to fix the issue https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107190, could you give me some suggestions? Thanks. ~/source/gccUpstreamDir/gcc/testsuite(cfg) » git format-patch -1 --start-number=00 HEAD -o ~/patch /home/zhongyunde/patch/-PHIOPT-Add-A-B-CST-B

[Bug target/104611] memcmp/strcmp/strncmp can be optimized when the result is tested for [in]equality with 0 on aarch64

2022-10-29 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104611 --- Comment #3 from vfdff --- As load instructions usually have long latency, do we need some extra restrictions when we try this transformation?

[Bug tree-optimization/107090] [aarch64] sequence logic should be combined with mul and umulh

2022-10-28 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090 --- Comment #11 from vfdff --- Created attachment 53787 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53787&action=edit has different operand order base on different commit node hi @Andrew Pinski * Showed as the figure swap_order.jpg

[Bug target/107316] [aarch64] Init big const value should be improved compare to llvm

2022-10-22 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107316 --- Comment #2 from vfdff --- (In reply to Andrew Pinski from comment #1) > I suspect this is just a dup of bug 106583 and will be fixed by the patch > which was submitted recently >

[Bug c/107359] New: [aarch64] should avoid the punpklo/punpkhi compare to llvm

2022-10-22 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * case, https://godbolt.org/z/38bcszxdo ``` int check (char *mask, double *result, int n) { int count = 0; for (int i=0; i

[Bug c/107316] New: [aarch64] Init big const value should be improved compare to llvm

2022-10-19 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- test case: https://godbolt.org/z/ahreYnahE ``` int main (int argc, char** argv) { if (lshift_1 (0xull) != 0ll

[Bug target/104611] memcmp/strcmp/strncmp can be optimized when the result is tested for [in]equality with 0 on aarch64

2022-10-18 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104611 vfdff changed: What|Removed |Added CC||zhongyunde at huawei dot com --- Comment #2

[Bug middle-end/107208] [aarch64] _complex integer return types could be improved

2022-10-13 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107208 --- Comment #3 from vfdff --- It seems related to targetm.calls.function_value, called by assign_parms, which behaves differently for MODE_COMPLEX_FLOAT and MODE_COMPLEX_INT. With the following changes, it then chooses a pair of DI for the int

[Bug tree-optimization/107090] [aarch64] sequence logic should be combined with mul and umulh

2022-10-12 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090 --- Comment #10 from vfdff --- Created attachment 53698 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53698&action=edit the huge bb slightly change after match ResLo Thanks for your suggestion, and I think both ctz_table_index and

[Bug tree-optimization/107090] [aarch64] sequence logic should be combined with mul and umulh

2022-10-12 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090 --- Comment #8 from vfdff --- Hi @Andrew Pinski, for the 2nd issue, I also matched the huge pattern, but it needs to return two values, which doesn't seem to work with the current framework. So should I split it into two simpler ones to match the high and

[Bug c++/107208] New: [aarch64] llvm generate better code than gcc base on _Complex type mul

2022-10-10 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * gcc now generates 2 redundant mov instructions compared to llvm ``` mul64(unsigned long _Complex, unsigned long _Complex

[Bug tree-optimization/107090] [aarch64] sequence logic should be combined with mul and umulh

2022-10-10 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090 vfdff changed: What|Removed |Added Attachment #53684|0 |1 is obsolete|

[Bug tree-optimization/107090] [aarch64] sequence logic should be combined with mul and umulh

2022-10-09 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090 vfdff changed: What|Removed |Added CC||zhongyunde at huawei dot com --- Comment #5

[Bug c++/107190] New: [aarch64] regression with optimization -fexpensive-optimizations

2022-10-09 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- This case is simplified from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090, and we can see that the codegen of function `test_m

[Bug tree-optimization/107090] [aarch64] sequence logic should be combined with mul and umulh

2022-10-07 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090 --- Comment #4 from vfdff --- (In reply to Andrew Pinski from comment #1) > A few issues. > First is: > > if (_26 != 0) > goto ; [50.00%] > else > goto ; [50.00%] > >[local count: 536870913]: > ht_15 = ht_13 + 4294967296; >

[Bug tree-optimization/107090] [aarch64] sequence logic should be combined with mul and umulh

2022-10-01 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107090 --- Comment #2 from vfdff --- Thanks for your suggestion. As the combine pass can't handle more than 4 sequential insns, which pass would be more suitable to match the huge pattern after fixing the 1st issue?

[Bug c/107090] New: [aarch64] sequence logic should be combined with mul and umulh

2022-09-29 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * test case: https://godbolt.org/z/x5jMhqW8s ``` # define BN_BITS432 # define BN_MASK2(0xL) # define

[Bug fortran/106954] New: [12 Regression] compiler fail base on gfortran

2022-09-16 Thread zhongyunde at huawei dot com via Gcc-bugs
: fortran Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- Hi, based on https://gcc.godbolt.org/z/9bonaW4eh, we can see that `aarch64 gfortran` 12 has a regression. * x86-x64 gfortran 12 -- pass * aarch64 gfortran 12

[Bug fortran/106353] New: [suboptimal] Why is a 3D array initialized, use case 2 two-layer loop?

2022-07-19 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- We can see that icc uses a two-layer loop to initialize a 3D array, and the inner loop initializes the low 2D

[Bug c/106323] New: [Suboptimal] memcmp(s1, s2, n) == 0 expansion on AArch64 compare to llvm

2022-07-16 Thread zhongyunde at huawei dot com via Gcc-bugs
: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- test case, see detail https://gcc.godbolt.org/z/PM3jxEM9M ``` #include int src(char* s1, char* s2) { return memcmp(s1, s2

[Bug tree-optimization/106268] [suboptimal] Remove unnecessary loops releated to fortran compare to ifort

2022-07-12 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106268 --- Comment #2 from vfdff --- It seems different for the C version, see details at https://godbolt.org/z/vc1edYKhf In your case above, icc also doesn't elide the outer loop.

[Bug fortran/106268] New: [suboptimal] Remove unnecessary loops releated to fortran compare to ifort

2022-07-12 Thread zhongyunde at huawei dot com via Gcc-bugs
: normal Priority: P3 Component: fortran Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- For the kernel inner loop body, gcc generates a loop, while icc doesn't, see details at https://godbolt.org/z/G77nKnf8W

[Bug c/106255] New: [suboptinal] llvm uses instructions with larger access bit width

2022-07-11 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- llvm uses memory access instructions with a larger access bit width based on the following case, both on x86 and arm, see detail https

[Bug c/106254] New: [suboptinal] llvm uses instructions with larger access bit width

2022-07-11 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- llvm uses memory access instructions with a larger access bit width based on the following case, both on x86 and arm, see detail https

[Bug c/106200] New: Shrink-wrapping opportunity releated to function call

2022-07-05 Thread zhongyunde at huawei dot com via Gcc-bugs
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- case:https://godbolt.org/z/sc5rTzaeb ``` double advance(double dt, double dx, double dy, double dz) { double dSquared = dx * dx + dy * dy + dz * dz; double mag

[Bug c/106146] New: [instcombine] a redundant movprfx insn compare to llvm

2022-06-30 Thread zhongyunde at huawei dot com via Gcc-bugs
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- * test case, gcc has a redundant movprfx insn in the kernel loop body, see detail https://gcc.godbolt.org/z/8vG4PzM18. ``` #include #define ARRAY_ALIGNMENT 64

[Bug c/105181] New: [optimization] gcc generate worse code than clang base on neon

2022-04-06 Thread zhongyunde at huawei dot com via Gcc-bugs
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- test case: void loop(int N, double *a, double *b) { // #pragma clang loop vectorize_width(4, scalable) for (int i = 0; i < N

[Bug c/104045] New: [AArch64] combine related to insn fmaxnm

2022-01-15 Thread zhongyunde at huawei dot com via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- test case, see detail https://gcc.godbolt.org/z/95osxxjx5 float foo(float a) { float x = 1.0f; float y = 0.0f; float z = x / y; return fmax (a, z); } as the z

[Bug tree-optimization/94084] Optimizer produces suboptimal code related to loop-invariant

2021-06-23 Thread zhongyunde at huawei dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94084 vfdff changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug c++/101119] New: Missing the check about modify global variable for __attribute__((const)) function

2021-06-18 Thread zhongyunde at huawei dot com via Gcc-bugs
Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- As an __attribute__((const)) function should have no side effects, it won't modify any global variable

[Bug c/100697] New: Missing fwprop for argument register

2021-05-20 Thread zhongyunde at huawei dot com via Gcc-bugs
Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- cat test.c extern double top[100]; int foo (long long j, double a) { top[j] += a; return 0; } gcc10.3 -g0 -O3 -march=armv8.2-a test.c -save-temps -S, can also be test base

[Bug rtl-optimization/96031] suboptimal codegen for store low 16-bits value

2020-08-25 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96031 --- Comment #4 from zhongyunde at tom dot com --- > As for ivopt, I can see a minor improvement by replacing != exit condition > with <=, thus saving add 2 instruction computing _22, which happens to > "disable" the wr

[Bug c/96427] Missing align attribute for anchor section from local variables

2020-08-20 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96427 --- Comment #6 from zhongyunde at tom dot com --- Created attachment 49087 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49087&action=edit adjust the alignment according to the attribute If the user doesn't specify the alignment, we can do s

[Bug c/96586] New: suboptimal code generated for condition expression

2020-08-12 Thread zhongyunde at tom dot com
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at tom dot com Target Milestone: --- For the following case, we can easily see that the while loop will execute once, but the newest gcc 10.2 still generates suboptimal code with a condition expression. void

[Bug c/96427] Missing align attribute for anchor section from local variables

2020-08-05 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96427 vfdff changed: What|Removed |Added CC||zhongyunde at huawei dot com --- Comment #4

[Bug tree-optimization/93102] [optimization] is it legal to avoid accessing const local array from stack ?

2020-08-04 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93102 --- Comment #4 from zhongyunde at tom dot com --- The case from https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96427 generates *.LC0, but doesn't emit an aggregate copy a_1 = *.LC0, i.e. it is legal even for a non-const local array. typedef int v4si

[Bug c/96427] Missing align attribute for anchor section from local variables

2020-08-03 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96427 --- Comment #2 from zhongyunde at tom dot com --- should the data alignment honor the user specified ? Now, it seems compiler _do_ align the initializer according align load. so even if the local array doesn't specify the __attribute__

[Bug rtl-optimization/95696] regrename creates overlapping register allocations for vliw

2020-08-03 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95696 --- Comment #6 from zhongyunde at tom dot com --- Thanks for your notes; I think this issue can be closed now. Non-SMS cases don't need to be handled, as they will generally be rescheduled, which is good for performance in my tests.

[Bug c/96427] New: Missing align attribute for anchor section from local variables

2020-08-03 Thread zhongyunde at tom dot com
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at tom dot com Target Milestone: --- For the following code, we know the local array a_1 is aligned to 64 bytes, but gcc currently aligns only to the default 32 bytes for the related anchor

RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-31 Thread Zhongyunde
> -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Tuesday, July 28, 2020 1:33 AM > To: Zhongyunde > Cc: gcc-patches@gcc.gnu.org; Yangfei (Felix) > Subject: Re: [PATCH PR95696] regrename creates overlapping register >

RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-26 Thread Zhongyunde
I reconsidered the issue and attached an updated patch. Yes, if the kernel loop bb's information isn't used in regrename, it need not be collected either, which saves compile time. > -Original Message- > From: Zhongyunde > Sent: Sunday, July 26, 2020 3:29 PM > To: 'Richard Sandiford

RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-26 Thread Zhongyunde
> >> It's interesting that this is for a testcase using SMS. One of the > >> traditional problems with the GCC implementation of SMS has been > >> ensuring that later passes don't mess up the scheduled loop. So in > >> your testcase, does register allocation succeed for the SMS loop > >>

RE: RE: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-22 Thread Zhongyunde
> -Original Message- > From: Richard Sandiford [mailto:richard.sandif...@arm.com] > Sent: Wednesday, July 22, 2020 12:12 AM > To: Zhongyunde > Cc: gcc-patches@gcc.gnu.org; Yangfei (A) > Subject: Re: Reply: [PATCH PR95696] regrename creates overlapping > register

Reply: [PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-21 Thread Zhongyunde
Sent: July 21, 2020 0:05 To: Zhongyunde Cc: gcc-patches@gcc.gnu.org; Yangfei (A) Subject: Re: [PATCH PR95696] regrename creates overlapping register allocations for vliw Hi, Zhongyunde writes: > Hi, > > In most target, it is limited to issue two insns with change the same > register. S

[Bug rtl-optimization/96031] suboptimal codegen for store low 16-bits value

2020-07-20 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96031 --- Comment #3 from zhongyunde at tom dot com --- I find there is some difference between the two cases during ivopts. For the 2nd case, a UINT32 type iv sum is chosen [local count: 955630224]: # sum_15 = PHI <0(5), sum_

[Bug rtl-optimization/95696] regrename creates overlapping register allocations for vliw

2020-07-20 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95696 --- Comment #3 from zhongyunde at tom dot com --- (In reply to Richard Biener from comment #2) > Please send patches to gcc-patc...@gcc.gnu.org I have sent this patch by email according to your suggestion; please give me some advice, thanks!

[PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-19 Thread Zhongyunde
Hi, on most targets, issuing two insns that change the same register together is restricted. So a register is not really unused if there is another insn that sets the register in the same VLIW. For example, insn 73 starts with insn:TI, so it will be issued together with other insns until a new

[PATCH PR95696] regrename creates overlapping register allocations for vliw

2020-07-16 Thread zhongyunde via Gcc-patches

[Bug rtl-optimization/96031] suboptimal codegen for store low 16-bits value

2020-07-06 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96031 --- Comment #1 from zhongyunde at tom dot com --- This may be improved by ivopts. If the case is adjusted as follows, the 'and w2, w2, 65535' will disappear. typedef unsigned int UINT32; typedef unsigned short UINT16; UINT16

[Bug rtl-optimization/96031] New: suboptimal codegen for store low 16-bits value

2020-07-02 Thread zhongyunde at tom dot com
: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at tom dot com Target Milestone: --- For the following code, as the strh instruction only stores the low 16 bits of the value, the 'and w2, w2, 65535' is redundant. Tested on ARM64 gcc 8.2 on https

Support to check vliw overlapping register constraint created by regrename, please help to review, thanks

2020-06-20 Thread Zhongyunde
On some targets, issuing two insns that change the same register together is restricted. (Insn 73 starts with insn:TI, so it will be issued together with other insns until a new insn starts with insn:TI, such as insn 71.) Regrename knows that the mode V2VF in insn 73 needs two successive registers,

[Bug rtl-optimization/95696] regrename creates overlapping register allocations for vliw

2020-06-16 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95696 zhongyunde at tom dot com changed: What|Removed |Added CC||zhongyunde at tom dot com

[Bug rtl-optimization/95696] New: regrename creates overlapping register allocations for vliw

2020-06-16 Thread zhongyunde at tom dot com
Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at tom dot com Target Milestone: --- On some targets, issuing two insns that change the same register together is restricted. (Insn 73 starts with insn:TI, so

[Bug rtl-optimization/95267] [ICE][gcse]: in process_insert_insn at gcse.c

2020-05-22 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95267 zhongyunde at tom dot com changed: What|Removed |Added CC||zhongyunde at tom dot com

[Bug rtl-optimization/95210] internal compiler error: in prepare_copy_insn, at gcse.c:1988

2020-05-22 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95210 zhongyunde at tom dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution

[Bug c/95210] internal compiler error: in prepare_copy_insn, at gcse.c:1988

2020-05-19 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95210 --- Comment #1 from zhongyunde at tom dot com --- patch for this issue. @ linux-9z2e in ~/software/gcc/gcc on git:master o [23:02:26] $ git diff diff --git a/gcc/gcse.c b/gcc/gcse.c index 8b9518e..65982ec 100644 --- a/gcc/gcse.c +++ b/gcc

[Bug c/95210] New: internal compiler error: in prepare_copy_insn, at gcse.c:1988

2020-05-19 Thread zhongyunde at tom dot com
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at tom dot com Target Milestone: --- rtx_insn * prepare_copy_insn (rtx reg, rtx exp) { ... else { rtx_insn *insn = emit_insn (gen_rtx_SET (reg, exp

[Bug tree-optimization/95019] Optimizer produces suboptimal code related to -ftree-ivopts

2020-05-13 Thread zhongyunde at tom dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95019 --- Comment #2 from zhongyunde at tom dot com --- It is a generic issue for all targets; on x86, for example, IVOPTs is also not expanded, as the index is not used for DEST and Src directly. We may need to expand IVOPTs, then different targets can select

[Bug tree-optimization/95019] New: Optimizer produces suboptimal code related to -ftree-ivopts

2020-05-09 Thread zhongyunde at tom dot com
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at tom dot com Target Milestone: --- For the following code, we know that the variable C05A1 is only used for the offset of array Dest and Src, and the unit size

[Bug tree-optimization/94573] Optimizer produces suboptimal code related to -fstore-merging

2020-04-14 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94573 --- Comment #6 from vfdff --- Created attachment 48267 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48267&action=edit only get the adjust_bit_pos change Based on the adjust_bit_pos change only, I tested it on gcc 9 and found that it takes effect.

[Bug c/94573] New: Optimizer produces suboptimal code related to -fstore-merging

2020-04-12 Thread zhongyunde at tom dot com
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at tom dot com Target Milestone: --- For the following code, we know the initialization of the array C16DD is always consecutive, so we can use a bigger mode size. test base

[Bug c/94421] New: [memory free] bug related to predication speculative schedule

2020-03-31 Thread zhongyunde at huawei dot com
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- After we enable DO_PREDICATION in the scheduler, spec_dependency_cache will be allocated in function extend_dependency_caches, and it is obvious

[help] how can I have an email address with @gcc.gnu.org ?

2020-03-21 Thread Zhongyunde

[Bug tree-optimization/93781] Optimizer produces suboptimal code related to -ftree-vrp

2020-03-10 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93781 --- Comment #5 from vfdff --- Created attachment 48008 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48008&action=edit patch base on gcc 7.3, additional for 1st testcases

[Bug tree-optimization/93781] Optimizer produces suboptimal code related to -ftree-vrp

2020-03-10 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93781 --- Comment #4 from vfdff --- Following your hint, I tested it on gcc 7.3, and the second testcase works. --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -3301,6 +3301,18 @@ extract_range_from_binary_expr (value_range *vr, else

[Bug tree-optimization/93781] Optimizer produces suboptimal code related to -ftree-vrp

2020-03-08 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93781 --- Comment #2 from vfdff --- I tested a simpler testcase and found that arg_5(D) already gets the expected range, but the _2 = 1 << arg_9 is unexpected. unsigned int foo (unsigned int arg) { unsigned int C03FE = 4; if (arg + 1 < 4)

[Bug tree-optimization/93781] Optimizer produces suboptimal code related to -ftree-vrp

2020-03-06 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93781 --- Comment #2 from vfdff --- With more testing, I find that the following case2 gets the expected result, while case1 doesn't. == [case1] == unsigned int foo (unsigned int arg) { unsigned int C03FE = 4;

[Bug tree-optimization/94084] Optimizer produces suboptimal code related to loop-invariant

2020-03-06 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94084 --- Comment #2 from vfdff --- Thanks very much, you are right. I tried case2 with a global pointer, and it gets a similar result to case1. extern int base; extern int *dest, *src; void foo (int n) { int i; // #pragma no_swp for (i=0; i

[Bug c/94084] New: Optimizer produces suboptimal code related to loop-invariant

2020-03-06 Thread zhongyunde at huawei dot com
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- For the following case1 and case2, we know the global value base is loop invariant, so the load insn can be lifted out

[Bug c/93928] New: Is there any interface to define the map of two register in one pattern ?

2020-02-25 Thread zhongyunde at huawei dot com
: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- for example, if the following oril insns need two register for operand0 and operand1 have an implication constraint, i.e

[Bug c/93781] New: Optimizer produces suboptimal code related to -ftree-vrp

2020-02-17 Thread zhongyunde at huawei dot com
Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- For the following code, we know the value C03FE is always less than 5, so the return value should be true. test base on the x86-64 gcc

[Bug middle-end/90354] Skip the not first insn when traversing the insn node

2020-02-16 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90354 --- Comment #8 from vfdff --- I have a method to fix this issue: check the edge with bb_has_eh_pred, and avoid bundling the jump insn when it is true.

[Bug rtl-optimization/93561] [bounds checking] memory overflow for spill_for

2020-02-06 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93561 --- Comment #3 from vfdff --- thanks very much!

[Bug c/93561] New: [bounds checking] memory overflow for spill_for

2020-02-04 Thread zhongyunde at huawei dot com
: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- In function spill_for, there is the following code: mode = PSEUDO_REGNO_MODE (regno); ... for (i = 0; i < rclass_size; i++) { hard_regno = ira_class_hard_r

[Bug tree-optimization/93102] [optimization] is it legal to avoid accessing const local array from stack ?

2019-12-30 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93102 --- Comment #2 from vfdff --- Do you mean the optimization mentioned in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=47980 ? Yes, it can be optimized with the option '-fmerge-all-constants', but that option isn't active by default.

[Bug c/93102] New: [optimization] is it legal to avoid accessing const local array from stack ?

2019-12-30 Thread zhongyunde at huawei dot com
: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- test code: int foo (int m, int n) { const int C2029[10] = {0,1,2,3,2,3,0,1,2,3}; int index, sum = 0; for (index

[Bug middle-end/90354] [7 regression] Skip the not first insn when traversing the insn node

2019-10-22 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90354 --- Comment #6 from vfdff --- (In reply to Richard Biener from comment #5) > (In reply to Richard Biener from comment #2) > > Which target? Which GCC version did work for you? > > Which target are you working on? Since you mark this as

[Bug tree-optimization/90837] Generate infinite loop when using -ftree-vrp

2019-06-12 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90837 --- Comment #4 from vfdff --- this is an invalid issue, thanks

[Bug tree-optimization/90837] New: Generate infinite loop when using -ftree-vrp

2019-06-11 Thread zhongyunde at huawei dot com
: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- Created attachment 46481 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46481&action=edit a simplified test case compile loop.c with the following command on x86 tar

[Bug middle-end/90354] [7.3 regression] Skip the not first insn when traversing the insn node

2019-06-01 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90354 --- Comment #4 from vfdff --- Another issue: is it not suitable for targets that support issuing more than 2 insns together? But the following code has existed for a long time without problems. /* ??? Hopefully multiple delay slots are not

[Bug middle-end/90354] [7.3 regression] Skip the not first insn when traversing the insn node

2019-05-06 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90354 --- Comment #3 from vfdff --- I work on GCC 7.3. In function scan_trace, control = pat->insn (0), so it only checks whether the first insn of the sequence is a jump_insn. for (prev = insn, insn = NEXT_INSN (insn); insn; prev =

[Bug c++/90354] [7.3 regression] Skip the not first insn when traversing the insn node

2019-05-05 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90354 vfdff changed: What|Removed |Added CC||zhongyunde at huawei dot com --- Comment #1

[Bug c++/90354] New: [7.3 regression] Skip the not first insn when traversing the insn node

2019-05-05 Thread zhongyunde at huawei dot com
Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- simplified testcase based on g++.eh/ia64-1.C: ~/ICE » cat exp1.C

[Bug c/90267] [7.3 regression] wrong code generated wth -O2 as missing data dependence base on memory

2019-04-27 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90267 --- Comment #3 from vfdff --- But it doesn't warn about anything, even with -Wstrict-aliasing -Wall. According to http://blog.sina.com.cn/s/blog_74caf0ce010173up.html, we expect a warning similar to the following. warning: dereferencing

[Bug c/90267] New: [7.3 regression] wrong code generated wth -O2 as missing data dependence base on memory

2019-04-26 Thread zhongyunde at huawei dot com
Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- Created attachment 46254 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46254&action=edit a simple testcase t

[Bug rtl-optimization/56069] [7 Regression] RA pessimization

2019-04-11 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56069 vfdff changed: What|Removed |Added CC||zhongyunde at huawei dot com --- Comment #20

[Bug c/90042] New: [7.3 regression] Unreadable preprocessed files format

2019-04-10 Thread zhongyunde at huawei dot com
Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- the preprocessed file, based on gcc 7.3: # 1570 "/usr1/bmtest/zhongyunde/SAC_C11/SAC/UT/linux_hcc_SD6186/../../CODE/SRS/SAC_SRSMEAS_EQSINR3I.c

[Bug c/90027] misalign variable access by piece load/store even when define STRICT_ALIGNMENT nonzero

2019-04-09 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90027 --- Comment #2 from vfdff --- For the DejaGnu testcase gcc.c-torture/execute/20010518-2.c: as struct a_struct is defined with __attribute__ ((packed)), the member variable b is also not aligned to 4 bytes. Is this case undefined behavior? typedef

[Bug c/90027] New: misalign variable access by piece load/store even when define STRICT_ALIGNMENT nonzero

2019-04-09 Thread zhongyunde at huawei dot com
Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: zhongyunde at huawei dot com Target Milestone: --- Based on gcc 7.3.0, in function expand_expr_real_1 we can see the following code: else if (SLOW_UNALIGNED_ACCESS

[Bug c/89887] the local array data will be laid in different section by different optimization level

2019-04-01 Thread zhongyunde at huawei dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89887 --- Comment #8 from vfdff --- Is the output of a static variable in the assembly decided by a special option, flag_toplevel_reorder? /* Traditionally we do not eliminate static variables when not optimizing and when not doing toplevel
