Re: [fpc-devel] An interesting thought... AI

2022-11-10 Thread Joao Schuler via fpc-devel
This is an interesting idea indeed. https://www.researchgate.net/publication/343322212_Automatic_Code_Optimization_With_Machine_Learning_And_Combinatorial_Optimization https://www.ijresm.com/Vol.2_2019/Vol2_Iss4_April19/IJRESM_V2_I4_149.pdf - Compiler Optimization using Artificial Intelligence

Re: [fpc-devel] State of SSE/AVX intrinsics

2020-04-21 Thread Joao Schuler
Just as a point for consideration, I'm not sure if data alignment will improve speed on future processors: https://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/ Food for thought: imagine if we had dynamic arrays of 1 million single (32-bit floating point) values

Re: [fpc-devel] FPC and Z80

2020-04-19 Thread Joao Schuler
I think that you'll find some interesting links in this thread: https://forum.lazarus.freepascal.org/index.php/topic,38569.msg262288.html#msg262288 ___ fpc-devel maillist - fpc-devel@lists.freepascal.org

Re: [fpc-devel] FPC 3.2.0RC1 released!

2020-04-01 Thread Joao Schuler
I regret to say that I can't reproduce my initial result showing a 9% improvement for 3.2.0rc1 over 3.0.4. Both versions show the same speed now. I also compared 3.0.4 against trunk in another environment: Ubuntu 18.04.2 LTS (GNU/Linux 4.15.0-1014-gcp x86_64) cpu model name: Intel(R) Xeon(R) CPU

Re: [fpc-devel] FPC 3.2.0RC1 released!

2020-03-30 Thread Joao Schuler
Just tested with my own neural networks API and I can confirm that it works! Environment: WIN10 64bits AVX Tested with: https://github.com/joaopauloschuler/neural-api/blob/master/examples/SimpleImageClassifier/SimpleImageClassifier.lpr In this test, there is a performance gain (speed) against

[fpc-devel] marked as inline is not inlined

2019-12-27 Thread Joao Schuler
Hello, I'm not sure if it's happening only to me, but I have a feeling that trunk produces more "marked as inline is not inlined" messages than FPC 3.0.4. Here is an example, if anyone wants to build it and see: https://github.com/joaopauloschuler/neural-api/tree/master/examples/XorAndOr BTW, trunk is

Re: [fpc-devel] Simplicity vs. Complexity

2019-03-26 Thread Joao Schuler
Dear Moreton, I think that you might have touched on the most important question of all. I'll express my own professional opinion on this (not wishing to convince others, just expressing my own). I've been thinking about this question for more than 20 years. If you own a company and your

Re: [fpc-devel] The 15k bounty: Optimizing executable speed for Linux x86 / LLVM

2018-10-25 Thread Joao Schuler
Hello Simon, I was wondering if you have code examples that provoke the problems you are experiencing? It will be easier to measure and test improvements with test cases. Solutions might not come from a single person or team, so I'm not sure how to apply the bounty in the most effective and fair way.

Re: [fpc-devel] Attn. Florian, r39759

2018-09-17 Thread Joao Schuler
Assuming: v=1; x=10; y=3

(v-x) < (y-x) == (1-10) < (3-10) == -9 < -7 == *true*
(v>=x) and (v<=y) == (1>=10) and (1<=3) == false and true == *false*
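
The counter-example above can be checked mechanically. The following is a minimal Python sketch, not FPC code; the function names `subtracted_form` and `range_form` are made up for illustration. It evaluates both expressions with plain signed integers for the values given in the message.

```python
def subtracted_form(v, x, y):
    """Form that subtracts the lower bound from both sides (signed arithmetic)."""
    return (v - x) < (y - x)


def range_form(v, x, y):
    """Plain range test: x <= v <= y."""
    return (v >= x) and (v <= y)


if __name__ == "__main__":
    # Values taken from the message: note that x > y, so the range is empty.
    v, x, y = 1, 10, 3
    print("(v-x) < (y-x)     ->", subtracted_form(v, x, y))
    print("(v>=x) and (v<=y) ->", range_form(v, x, y))
```

For these inputs the two functions disagree, which matches the hand calculation: the two forms are not interchangeable when x is greater than y.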

Re: [fpc-devel] AVX 512 - Can't compile vaddps zmm1, zmm2, zmm3

2018-09-07 Thread Joao Schuler
I can confirm that this works:
VEXTRACTF32x4 xmm2, zmm0, 1
VEXTRACTF32x4 xmm3, zmm0, 2
VEXTRACTF32x4 xmm4, zmm0, 3
Well done! I have more good news: I've just finished coding support for AVX512 in my own project: https://www.youtube.com/watch?v=qGnfwpKUTIQ I'm getting loads of

Re: [fpc-devel] AVX 512 - Can't compile vaddps zmm1, zmm2, zmm3

2018-08-26 Thread Joao Schuler
Quick update in reply to my own question: VEXTRACTF128 should not support zmm registers. Therefore, the current behavior is correct. This is the reference: https://www.felixcloutier.com/x86/VEXTRACTF128:VEXTRACTF32x4:VEXTRACTF64x2:VEXTRACTF32x8:VEXTRACTF64x4.html Anyway, supporting VEXTRACTF32x4
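
To illustrate why the two instructions take different source registers, here is a rough Python model of the lane selection. It is only a sketch under stated assumptions: registers are modelled as lists of 32-bit float values (8 for ymm, 16 for zmm), the function names are made up for illustration, and only the zmm form of VEXTRACTF32x4 is modelled.

```python
def extract_lane(values, lane, lane_width=4):
    """Return one 128-bit lane (four 32-bit floats) from a register modelled as a list."""
    start = lane * lane_width
    return values[start:start + lane_width]


def vextractf128(ymm, imm8):
    """Model of VEXTRACTF128: 256-bit source, bit 0 of imm8 selects the lane."""
    if len(ymm) != 8:
        raise ValueError("VEXTRACTF128 takes a 256-bit (ymm) source")
    return extract_lane(ymm, imm8 & 0b1)


def vextractf32x4(zmm, imm8):
    """Model of the zmm form of VEXTRACTF32x4: bits 1:0 of imm8 select the lane."""
    if len(zmm) != 16:
        raise ValueError("this sketch models only the 512-bit (zmm) source form")
    return extract_lane(zmm, imm8 & 0b11)


if __name__ == "__main__":
    zmm0 = [float(i) for i in range(16)]
    for lane in (1, 2, 3):
        print(lane, vextractf32x4(zmm0, lane))
```

In this model a 512-bit value has four 128-bit lanes, so a one-bit selector such as the one VEXTRACTF128 uses cannot address all of them.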

Re: [fpc-devel] AVX 512 - Can't compile vaddps zmm1, zmm2, zmm3

2018-08-25 Thread Joao Schuler
Hello, Almost everything I tested works perfectly. This is what I tested so far: zmm registers are properly recognized: end [ 'RAX', 'RCX', 'RDX', 'ymm2', 'ymm3', 'ymm4', 'ymm5', 'ymm0' {$IFDEF AVX512},'zmm2', 'zmm3', 'zmm0'{$ENDIF} ]; *These commands work:* VBROADCASTSS

Re: [fpc-devel] AVX 512 - Can't compile vaddps zmm1, zmm2, zmm3

2018-08-22 Thread Joao Schuler
THANK YOU SO MUCH!!! I intend to test over the weekend. ___ fpc-devel maillist - fpc-devel@lists.freepascal.org http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel

Re: [fpc-devel] AVX 512 - Can't compile vaddps zmm1, zmm2, zmm3

2018-06-17 Thread Joao Schuler
> 86ins.dat, which contains the syntax information for all of the x86-64 assembler commands.
>
> A tool that's run by "make" will then generate a number of .inc files that are then referenced by the source code.
>
> Gareth aka. Kit
>
> On Sun 17/06/18 20:59

Re: [fpc-devel] AVX 512 - Can't compile vaddps zmm1, zmm2, zmm3

2018-06-17 Thread Joao Schuler
I can try adding support for vaddps and the other AVX512 instructions I need the most. Where is the code (which file) for the above, please?

On Sun, Jun 17, 2018 at 6:30 PM, Florian Klämpfl wrote:
> On 17.06.2018 at 06:37, Joao Schuler wrote:
>
>> Hi,
>> I started testing

[fpc-devel] AVX 512 - Can't compile vaddps zmm1, zmm2, zmm3

2018-06-17 Thread Joao Schuler
Hi, I started testing the AVX512 branch: https://svn.freepascal.org/svn/fpc/branches/tg74/avx512/

This is the code:

  {$ASMMODE intel}
  asm
    vaddps zmm1, zmm2, zmm3
  end;

The error message is: invalid combination of opcode and operands. The assembly code looks correct to me: