This is an interesting idea indeed.
https://www.researchgate.net/publication/343322212_Automatic_Code_Optimization_With_Machine_Learning_And_Combinatorial_Optimization
https://www.ijresm.com/Vol.2_2019/Vol2_Iss4_April19/IJRESM_V2_I4_149.pdf -
Compiler Optimization using Artificial Intelligence
Just as a point for consideration, I'm not sure whether data alignment will
improve speed on future processors:
https://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/
Food for thought: imagine we had dynamic arrays of 1 million single
(32-bit floating point) values
I think that you'll find some interesting links on this thread:
https://forum.lazarus.freepascal.org/index.php/topic,38569.msg262288.html#msg262288
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
I regret to say that I can't reproduce my initial result showing 9%
improvement on 3.2.0rc1 against 3.0.4. Both versions show the same speed
now.
I also compared 3.0.4 against trunk in another environment:
Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-1014-gcp x86_64)
cpu model name: Intel(R) Xeon(R) CPU
Just tested with my own neural networks API and I can confirm that it works!
Environment: WIN10 64bits AVX
Tested with:
https://github.com/joaopauloschuler/neural-api/blob/master/examples/SimpleImageClassifier/SimpleImageClassifier.lpr
In this test, there is a performance gain (speed) against
Hello,
I'm not sure if it's happening only to me, but I have a feeling that trunk
produces more "marked as inline is not inlined" messages than FPC 3.0.4.
Here is an example for anyone who wants to build it and see:
https://github.com/joaopauloschuler/neural-api/tree/master/examples/XorAndOr
BTW, trunk is
Dear Moreton,
I think that you might have touched the most important question of all.
I'll express my own professional opinion regarding this (not wishing to
convince others - just expressing my own).
I've been thinking about this question for more than 20 years. If you own a
company and your
Hello Simon - I'm wondering if you have code examples that provoke the
problems you are experiencing? It will be easier to measure/test improvements
with test cases. Solutions might not come from a single person/team, so I'm
not sure how to apply the bounty in the most effective/fair way.
Assuming: v=1; x=10; y=3
(v-x) < (y-x) == (1-10) < (3-10) == -9 < -7 == *true*
(v>=x) and (v<=y) == (1>=10) and (1<=3) == false and true == *false*
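The counterexample above can be checked with a small Free Pascal program. This is only a sketch: the function names InRangeTwoCompares and InRangeSignedSub are hypothetical and chosen for illustration, and the operands are assumed to be signed 32-bit integers (LongInt).

```pascal
program RangeCheckDemo;
{$mode objfpc}

// Reference form: two explicit comparisons.
function InRangeTwoCompares(v, x, y: LongInt): Boolean;
begin
  Result := (v >= x) and (v <= y);
end;

// Candidate replacement: one comparison after subtracting the lower bound.
// Hypothetical helper, using signed arithmetic as in the example above.
function InRangeSignedSub(v, x, y: LongInt): Boolean;
begin
  Result := (v - x) < (y - x);
end;

begin
  // Values taken from the example: v=1; x=10; y=3
  WriteLn('two compares: ', InRangeTwoCompares(1, 10, 3)); // FALSE
  WriteLn('signed sub:   ', InRangeSignedSub(1, 10, 3));   // TRUE
end.
```

The two forms disagree for these values, so the subtraction form cannot replace the pair of comparisons unless more is known about the operands (for instance, that x <= y).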
I can confirm that this works:
VEXTRACTF32x4 xmm2, zmm0, 1
VEXTRACTF32x4 xmm3, zmm0, 2
VEXTRACTF32x4 xmm4, zmm0, 3
Well done!
I have more good news: I've just finished coding support for AVX512 in my
own project: https://www.youtube.com/watch?v=qGnfwpKUTIQ
I'm getting loads of
Quick update in reply to my own question: VEXTRACTF128 should not support
zmm registers. Therefore, the current behavior is correct. This is the
reference:
https://www.felixcloutier.com/x86/VEXTRACTF128:VEXTRACTF32x4:VEXTRACTF64x2:VEXTRACTF32x8:VEXTRACTF64x4.html
Anyway, supporting VEXTRACTF32x4
Hello,
Almost everything I tested works perfectly.
This is what I tested so far:
zmm registers are properly recognized:
end [
'RAX', 'RCX', 'RDX',
'ymm2', 'ymm3', 'ymm4', 'ymm5', 'ymm0'
{$IFDEF AVX512},'zmm2', 'zmm3', 'zmm0'{$ENDIF}
];
*These commands work:*
VBROADCASTSS
THANK YOU SO MUCH!!! I intend to test over the weekend.
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
86ins.dat, which contains the syntax
> information for all of the x86-64 assembler commands.
>
> A tool that's run by "make" will then generate a number of .inc files that
> are then referenced by the source code.
>
> Gareth aka. Kit
>
>
>
> On Sun 17/06/18 20:59
I can try to add support for vaddps and the other AVX512 instructions I need
the most. Which file contains the code for the above, please?
On Sun, Jun 17, 2018 at 6:30 PM, Florian Klämpfl
wrote:
> On 17.06.2018 at 06:37, Joao Schuler wrote:
>
>> Hi,
>> I started testing
Hi,
I started testing the AVX512 branch:
https://svn.freepascal.org/svn/fpc/branches/tg74/avx512/
This is the code:
{$ASMMODE intel}
asm
vaddps zmm1, zmm2, zmm3
end;
The error message is: invalid combination of opcode and operands.
The assembly code looks correct to me: