On Mon, Jul 31, 2023 at 08:55:35PM +0800, Changbin Du wrote:
> Hello, folks.
> This is to discuss GCC's heuristic strategy for predicated instructions and
> branches. Something probably needs to be improved.
>
> [The story]
> Weeks ago, I built a Huffman encoding program with O2, O3, and
On Mon, Jul 31, 2023 at 08:55:35PM +0800, Changbin Du wrote:
> The result (p-core, no ht, no turbo, performance mode):
>
>           O2             O3             PGO
> cycles    2,581,832,749  8,638,401,568  9,394,200,585
>
On Tue, Aug 01, 2023 at 10:44:02AM +0200, Jan Hubicka wrote:
> > > If I comment it out as in the above patch, then O3/PGO get 16% and 12%
> > > performance improvements over O2 on x86.
> > >
> > > O2 O3 PGO
> > > cycles
On Mon, Jul 31, 2023 at 03:53:26PM +0200, Richard Biener wrote:
[snip]
> > The main difference in the compilation output for the code around the
> > mispredicted branch is:
> > o In O2: a predicated instruction (cmov here) is selected to eliminate the
> > above branch. cmov is true better
> > If I comment it out as in the above patch, then O3/PGO get 16% and 12%
> > performance improvements over O2 on x86.
> >
> >           O2             O3             PGO
> > cycles    2,497,674,824  2,104,993,224  2,199,753,593
> > instructions
On Mon, Jul 31, 2023 at 2:57 PM Changbin Du via Gcc wrote:
>
> Hello, folks.
> This is to discuss GCC's heuristic strategy for predicated instructions and
> branches. Something probably needs to be improved.
>
> [The story]
> Weeks ago, I built a Huffman encoding program with O2, O3, and
Hello, folks.
This is to discuss GCC's heuristic strategy for predicated instructions and
branches. Something probably needs to be improved.
[The story]
Weeks ago, I built a Huffman encoding program with O2, O3, and PGO respectively.
This program is nothing special, just some random code I