Hi everyone,
So this is another question on peephole optimisation for x86_64.
Occasionally you get situations where you write a load of constants to
the stack - in this case it's part of an array parameter to a function call:
movl $23199763,32(%rsp)
movl $262149,36(%rsp)
On 22.01.2014 23:22, Martin Frb wrote:
On 22/01/2014 21:29, Florian Klämpfl wrote:
On 22.01.2014 04:06, Martin Frb wrote:
On 21/01/2014 21:28, Florian Klämpfl wrote:
Can you post some example code? It might be worth thinking about
improving this already at the node level.
While
On 23/01/2014 19:35, Florian Klämpfl wrote:
On 22.01.2014 23:22, Martin Frb wrote:
One of the optimizations, you said, would be better avoided by not
generating the code in the first place. I agree.
Only, even if that is achieved at some point, who guarantees that it will
not come back (unnoticed)?
Are there tests,
On 23.01.2014 20:52, Martin Frb wrote:
On 23/01/2014 19:35, Florian Klämpfl wrote:
On 22.01.2014 23:22, Martin Frb wrote:
One of the optimizations, you said, would be better avoided by not
generating the code in the first place. I agree.
Only, even if that is achieved at some point, who guarantees that it will
On 23/01/2014 20:04, Florian Klämpfl wrote:
On 23.01.2014 20:52, Martin Frb wrote:
On 23/01/2014 19:35, Florian Klämpfl wrote:
I think this is hard to achieve as well.
Why?
I consider it complicated, and it covers only cases one can foresee.
Some statistical analysis of benchmark timings
On 23.01.2014 21:15, Martin Frb wrote:
On 23/01/2014 20:04, Florian Klämpfl wrote:
On 23.01.2014 20:52, Martin Frb wrote:
On 23/01/2014 19:35, Florian Klämpfl wrote:
I think this is hard to achieve as well.
Why?
I consider it complicated, and it covers only cases one can foresee.
Some
On 23/01/2014 20:34, Florian Klaempfl wrote:
Yes and no. It is extra code and extra code is always bad ;) and it
requires a separate compiler run. I wouldn't waste effort in it.
Test cases are extra code too. ;) scnr
OK, I see what you mean. No problem. It was just an idea.
On 22/01/2014 21:23, Florian Klämpfl wrote:
Submit them to a bug report, I can look during the weekend into them.
Done: 0025584, 0025586, 0025587
http://bugs.freepascal.org/view.php?id=25584
http://bugs.freepascal.org/view.php?id=25586
http://bugs.freepascal.org/view.php?id=25587
On 22.01.2014 00:27, Martin Frb wrote:
On 21/01/2014 21:28, Florian Klämpfl wrote:
On 20.01.2014 01:18, Martin wrote:
It used
(taicpu(p).oper[1]^.regtaicpu(hp1).oper[0]^^.ref^.base) and
(taicpu(p).oper[1]^.regtaicpu(hp1).oper[0]^^.ref^.index) then
but should only compare the supregister
On 22.01.2014 04:06, Martin Frb wrote:
On 21/01/2014 21:28, Florian Klämpfl wrote:
Can you post some example code? It might be worth thinking about
improving this already at the node level.
While getting examples, another issue:
with -O2, -O3 or -O4
Note the
movl
On 22/01/2014 21:29, Florian Klämpfl wrote:
On 22.01.2014 04:06, Martin Frb wrote:
On 21/01/2014 21:28, Florian Klämpfl wrote:
Can you post some example code? It might be worth thinking about
improving this already at the node level.
While getting examples, another issue:
with -O2,
On 20.01.2014 01:18, Martin wrote:
Just been looking at the peephole opt (i386). Other than the 2 items
already mailed, I found that:
1) Code as follows is sometimes generated (at various opt levels)
.Ll2:
# [36] i := 1;
movl $1,%eax
.Ll3:
# [38] i := i + 1;
movl
On 21/01/2014 21:28, Florian Klämpfl wrote:
On 20.01.2014 01:18, Martin wrote:
It used
(taicpu(p).oper[1]^.regtaicpu(hp1).oper[0]^^.ref^.base) and
(taicpu(p).oper[1]^.regtaicpu(hp1).oper[0]^^.ref^.index) then
but should only compare the supregister part
I replaced that
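
To illustrate why comparing only the super-register part matters, here is a toy model in Python. It is not FPC code: the SUPREG table and both helper functions are hypothetical names made up for this sketch. On x86, %eax, %ax and %al are sub-registers of the same physical register, so a check that compares whole register identifiers treats them as unrelated and can miss an overlap.

```python
# Toy model, not FPC code: SUPREG and the helpers below are hypothetical.
# Each register name maps to (super-register, sub-register size in bits).
SUPREG = {
    "%eax": ("A", 32), "%ax": ("A", 16), "%al": ("A", 8),
    "%edx": ("D", 32), "%dx": ("D", 16), "%dl": ("D", 8),
}

def same_register(a, b):
    """Naive check: the full register identifiers must match."""
    return SUPREG[a] == SUPREG[b]

def same_supregister(a, b):
    """Aliasing-aware check: only the super-register part is compared."""
    return SUPREG[a][0] == SUPREG[b][0]

print(same_register("%eax", "%ax"))      # False
print(same_supregister("%eax", "%ax"))   # True
print(same_supregister("%eax", "%edx"))  # False
```

The naive check says %eax and %ax differ, while the super-register check reports that they share storage, which is the property a peephole rule needs before it reorders or removes an instruction.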
On 21/01/2014 21:28, Florian Klämpfl wrote:
Can you post some example code? It might be worth thinking about
improving this already at the node level.
While getting examples, another issue:
with -O2, -O3 or -O4
Note the
movl %eax,%edx
movl %edx,%eax
with -O1 it is
On 21/01/2014 23:27, Martin Frb wrote:
On 21/01/2014 21:28, Florian Klämpfl wrote:
On 20.01.2014 01:18, Martin wrote:
It used
(taicpu(p).oper[1]^.regtaicpu(hp1).oper[0]^^.ref^.base) and
(taicpu(p).oper[1]^.regtaicpu(hp1).oper[0]^^.ref^.index) then
but should only compare the supregister
Just been looking at the peephole opt (i386). Other than the 2 items
already mailed, I found that:
1) Code as follows is sometimes generated (at various opt levels)
.Ll2:
# [36] i := 1;
movl $1,%eax
.Ll3:
# [38] i := i + 1;
movl $2,%eax
I could not find any code dealing with it,
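
As a rough illustration of what a peephole rule for the two patterns in this thread could look like, here is a toy pass in Python. It is a sketch under stated assumptions, not the FPC optimizer: instructions are plain (opcode, source, destination) tuples in AT&T operand order, only 32-bit movl with plain registers is handled, and it assumes no jump target sits between the two instructions (labels are not modelled at all). The function names are made up for this example.

```python
import re

# Toy model, not FPC code. An instruction is an (opcode, source, destination)
# tuple in AT&T operand order; labels and jump targets are not modelled.
REGISTER = re.compile(r"%\w+")

def is_register(operand):
    # True only for a plain register operand such as "%eax".
    return REGISTER.fullmatch(operand) is not None

def reads(operand, reg):
    # True if the operand is reg or uses it inside a memory reference.
    return reg in REGISTER.findall(operand)

def peephole(code):
    out = []
    for ins in code:
        op, src, dst = ins
        if out and op == "movl" and out[-1][0] == "movl":
            _, prev_src, prev_dst = out[-1]
            # movl A,B ; movl B,A : A already holds the value, drop the second move.
            if is_register(src) and is_register(dst) and (prev_src, prev_dst) == (dst, src):
                continue
            # movl X,R ; movl Y,R with Y not reading R: the first move is a dead store.
            if is_register(dst) and prev_dst == dst and not reads(src, dst):
                out[-1] = ins
                continue
        out.append(ins)
    return out

print(peephole([("movl", "$1", "%eax"), ("movl", "$2", "%eax")]))
# [('movl', '$2', '%eax')]
print(peephole([("movl", "%eax", "%edx"), ("movl", "%edx", "%eax")]))
# [('movl', '%eax', '%edx')]
```

In the second case the sketch keeps the first move, because whether %edx is still needed afterwards depends on liveness information that this toy does not track.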