Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-11 Thread Changbin Du via Gcc
On Mon, Jul 31, 2023 at 08:55:35PM +0800, Changbin Du wrote:
> Hello, folks.
> This is to discuss Gcc's heuristic strategy about Predicated Instructions and
> Branches. And probably something needs to be improved.
>
> [The story]
> Weeks ago, I built a huffman encoding program with O2, O3, and

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-11 Thread Changbin Du via Gcc-bugs
On Mon, Jul 31, 2023 at 08:55:35PM +0800, Changbin Du wrote:
> Hello, folks.
> This is to discuss Gcc's heuristic strategy about Predicated Instructions and
> Branches. And probably something needs to be improved.
>
> [The story]
> Weeks ago, I built a huffman encoding program with O2, O3, and

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Changbin Du via Gcc
On Mon, Jul 31, 2023 at 08:55:35PM +0800, Changbin Du wrote:
> The result (p-core, no ht, no turbo, performance mode):
>
>                 O2             O3             PGO
> cycles          2,581,832,749  8,638,401,568  9,394,200,585
>

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Changbin Du via Gcc-bugs
On Mon, Jul 31, 2023 at 08:55:35PM +0800, Changbin Du wrote:
> The result (p-core, no ht, no turbo, performance mode):
>
>                 O2             O3             PGO
> cycles          2,581,832,749  8,638,401,568  9,394,200,585
>

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Changbin Du via Gcc-bugs
On Tue, Aug 01, 2023 at 10:44:02AM +0200, Jan Hubicka wrote:
> > > If I comment it out as above patch, then O3/PGO can get 16% and 12% performance
> > > improvement compared to O2 on x86.
> > >
> > >                 O2             O3             PGO
> > > cycles

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Changbin Du via Gcc
On Tue, Aug 01, 2023 at 10:44:02AM +0200, Jan Hubicka wrote:
> > > If I comment it out as above patch, then O3/PGO can get 16% and 12% performance
> > > improvement compared to O2 on x86.
> > >
> > >                 O2             O3             PGO
> > > cycles

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Changbin Du via Gcc
On Mon, Jul 31, 2023 at 03:53:26PM +0200, Richard Biener wrote:
[snip]
> > The main difference in the compilation output about code around the
> > miss-prediction branch is:
> > o In O2: predicated instruction (cmov here) is selected to eliminate above
> > branch. cmov is true better

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Changbin Du via Gcc-bugs
On Mon, Jul 31, 2023 at 03:53:26PM +0200, Richard Biener wrote:
[snip]
> > The main difference in the compilation output about code around the
> > miss-prediction branch is:
> > o In O2: predicated instruction (cmov here) is selected to eliminate above
> > branch. cmov is true better

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Jan Hubicka via Gcc
> > If I comment it out as above patch, then O3/PGO can get 16% and 12% performance
> > improvement compared to O2 on x86.
> >
> >                 O2             O3             PGO
> > cycles          2,497,674,824  2,104,993,224  2,199,753,593
> > instructions

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-08-01 Thread Jan Hubicka via Gcc-bugs
> > If I comment it out as above patch, then O3/PGO can get 16% and 12% performance
> > improvement compared to O2 on x86.
> >
> >                 O2             O3             PGO
> > cycles          2,497,674,824  2,104,993,224  2,199,753,593
> > instructions

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-07-31 Thread Richard Biener via Gcc
On Mon, Jul 31, 2023 at 2:57 PM Changbin Du via Gcc wrote:
>
> Hello, folks.
> This is to discuss Gcc's heuristic strategy about Predicated Instructions and
> Branches. And probably something needs to be improved.
>
> [The story]
> Weeks ago, I built a huffman encoding program with O2, O3, and

Re: [Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-07-31 Thread Richard Biener via Gcc-bugs
On Mon, Jul 31, 2023 at 2:57 PM Changbin Du via Gcc wrote:
>
> Hello, folks.
> This is to discuss Gcc's heuristic strategy about Predicated Instructions and
> Branches. And probably something needs to be improved.
>
> [The story]
> Weeks ago, I built a huffman encoding program with O2, O3, and

[Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-07-31 Thread Changbin Du via Gcc-bugs
Hello, folks. This is to discuss Gcc's heuristic strategy about Predicated Instructions and Branches. And probably something needs to be improved. [The story] Weeks ago, I built a huffman encoding program with O2, O3, and PGO respectively. This program is nothing special, just a random code I

[Predicated Ins vs Branches] O3 and PGO result in 2x performance drop relative to O2

2023-07-31 Thread Changbin Du via Gcc
Hello, folks. This is to discuss Gcc's heuristic strategy about Predicated Instructions and Branches. And probably something needs to be improved. [The story] Weeks ago, I built a huffman encoding program with O2, O3, and PGO respectively. This program is nothing special, just a random code I