Re: [fpc-devel] Bitset assembler
On 11/09/16 15:11, Jeppe Johansen wrote:
> Here's an ARM version that runs in 5 cycles on a Cortex A8:
>
>     mov r2, r1, lsr #5
>     mov r12, #1
>     ldr r3, [r0, r2, lsl #2]!
>     orr r2, r3, r12, lsl r1
>     str r2, [r0]
>     and r0, r12, r3, lsr r1
>
> It's one cycle faster than what the compiler can generate due to it not
> doing the pre-indexed writeback optimization when the address
> calculation has shifts.

Given that this code will be in a non-inlinable routine (we can't inline routines with inline assembler), the Pascal version is then probably faster (since you won't have the call/return overhead).

Jonas
___
fpc-devel maillist - fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
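For comparison, a minimal Pascal sketch of the inlinable alternative Jonas mentions might look as follows. It assumes the ARM sequence is meant to set bit Index in an array of 32-bit words and return the bit's previous value. The name BitsetTestAndSet, the explicit "and 31" mask and the demo array are illustrative choices, not code from the thread.

```pascal
program BitsetSketch;
{$mode objfpc}

// Hypothetical helper (not from the thread): sets bit Index in a buffer
// treated as an array of 32-bit words and returns the bit's previous value.
function BitsetTestAndSet(var Bits; Index: LongWord): Boolean; inline;
var
  P: PLongWord;
  Mask: LongWord;
begin
  P := PLongWord(@Bits) + (Index shr 5);      // word that holds the bit
  Mask := LongWord(1) shl (Index and 31);     // bit position inside that word
  Result := (P^ and Mask) <> 0;               // previous state of the bit
  P^ := P^ or Mask;                           // set the bit
end;

var
  Storage: array[0..3] of LongWord = (0, 0, 0, 0);
begin
  WriteLn(BitsetTestAndSet(Storage, 37));     // first call: bit was clear
  WriteLn(BitsetTestAndSet(Storage, 37));     // second call: bit already set
  WriteLn(Storage[1]);                        // bit 37 lives in the second word
end.
```

Because the routine is plain Pascal and marked inline, the compiler is free to expand it at the call site, which is the advantage over an assembler routine described above.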
Re: [fpc-devel] Bitset assembler
On 09/08/2016 11:02 AM, Jy V wrote:
> Hello to all assembler experts,
>
> I would greatly appreciate if some people could help me prepare some asm
> code for FreePascal for Win32, Win64, Linux x86, Linux x64 (and maybe
> some ARM32bit + AARCH64).
>
> I am using Lazarus 1.6, FPC 3.0.0 SVN revision 51630
> x86_64-win64-win32/win64

Here's an ARM version that runs in 5 cycles on a Cortex A8:

    mov r2, r1, lsr #5
    mov r12, #1
    ldr r3, [r0, r2, lsl #2]!
    orr r2, r3, r12, lsl r1
    str r2, [r0]
    and r0, r12, r3, lsr r1

It's one cycle faster than what the compiler can generate due to it not doing the pre-indexed writeback optimization when the address calculation has shifts.
Re: [fpc-devel] Bitset assembler
Thank you Thomas,
I will experiment with {$ASMMODE INTEL}

On Sunday, September 11, 2016, Tomas Hajny wrote:
> On Sun, September 11, 2016 09:43, Jy V wrote:
>> Thank you Jonas,
>> so back to my original question,
>> is there an asm expert out there who knows if the syntax is invalid, or
>> simply the compiler does not implement bt, bts, btr instructions
> .
> .
>
> In general, GNU assembler syntax requires you to specify the operand size
> in the instruction name. You can use the Intel syntax (i.e. the working
> code you tried with DCC) by adding {$ASMMODE INTEL} at the top. You can
> also have a look at the translated GNU assembler syntax version by
> compiling with command line parameter -a and looking at the generated file
> with extension .s.
>
> Hope this helps
>
> Tomas
Re: [fpc-devel] Bitset assembler
On Sun, September 11, 2016 09:43, Jy V wrote:
> Thank you Jonas,
> so back to my original question,
> is there an asm expert out there who knows if the syntax is invalid, or
> simply the compiler does not implement bt, bts, btr instructions
 .
 .

In general, GNU assembler syntax requires you to specify the operand size in the instruction name. You can use the Intel syntax (i.e. the working code you tried with DCC) by adding {$ASMMODE INTEL} at the top. You can also have a look at the translated GNU assembler syntax version by compiling with command line parameter -a and looking at the generated file with extension .s.

Hope this helps

Tomas
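As an illustration of the Intel-syntax route, a sketch of BitsetGet might look like the following. It is untested here and rests on several assumptions: an x86-64 target, the Win64 and System V register conventions quoted in the thread (rcx/edx and rdi/esi), and a Pascal fallback for other CPUs that is not part of the thread. The use of setc instead of the sbb/and pair is also a substitution. Note that in GNU (AT&T) syntax the operands appear in the opposite order and the mnemonic carries a size suffix, so the equivalent of the Intel line below would be "btl %edx, (%rcx)".

```pascal
program BitsetGetIntel;
{$mode objfpc}

{$IFDEF CPUX86_64}
{$ASMMODE INTEL}
// Sketch only: returns True if bit Index is set in the buffer Bits.
// Assumed register usage: Win64 rcx = @Bits, edx = Index;
// other x86-64 targets rdi = @Bits, esi = Index.
function BitsetGet(const Bits; Index: UInt32): Boolean; assembler; nostackframe;
asm
{$IFDEF WIN64}
  bt   dword ptr [rcx], edx   // copy the selected bit into the carry flag
{$ELSE}
  bt   dword ptr [rdi], esi
{$ENDIF}
  setc al                     // Boolean result is returned in al
end;
{$ELSE}
// Assumed portable fallback for non-x86-64 targets.
function BitsetGet(const Bits; Index: UInt32): Boolean;
begin
  Result := (PLongWord(@Bits)[Index shr 5]
             and (LongWord(1) shl (Index and 31))) <> 0;
end;
{$ENDIF}

var
  Words: array[0..1] of LongWord = ($00000001, $80000000);
begin
  WriteLn(BitsetGet(Words, 0));    // lowest bit of the first word
  WriteLn(BitsetGet(Words, 1));
  WriteLn(BitsetGet(Words, 63));   // highest bit of the second word
end.
```

With a memory operand, bt treats the register as a signed bit offset from the base address, so this sketch should only be trusted for indexes below 2^31.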
[fpc-devel] Overflow in TMemoryStream?
Hi,

While working on the MSEgui fork of classes unit I saw a suspect piece of code in streams.inc:
"
function TMemoryStream.Realloc(var NewCapacity: PtrInt): Pointer;
begin
  If NewCapacity<0 Then
    NewCapacity:=0
  else
    begin
      // if growing, grow at least a quarter
      if (NewCapacity>FCapacity) and (NewCapacity < (5*FCapacity) div 4) then
        NewCapacity := (5*FCapacity) div 4;
"
Isn't there an overflow if the capacity grows above high(ptrint) div 5 (about 430MB on 32 bit)? IIRC there was a discussion on the list about memory problems with big TMemoryStreams.

Martin
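One way to avoid the intermediate 5*FCapacity product is sketched below. This is not the RTL's code: GrowCapacity is a hypothetical stand-alone helper that takes the current capacity as a parameter instead of reading FCapacity, and clamping to High(PtrInt) is an assumed policy. It relies on the fact that, for non-negative values, Current + Current div 4 equals (5*Current) div 4.

```pascal
program GrowthSketch;
{$mode objfpc}

// Hypothetical helper (not the RTL implementation): returns the capacity
// to allocate, growing by at least a quarter without computing 5*Current.
function GrowCapacity(Current, Requested: PtrInt): PtrInt;
var
  Minimum: PtrInt;
begin
  if Requested < 0 then
    Exit(0);
  Result := Requested;
  if Requested > Current then
  begin
    // Check the addition against the upper bound before performing it.
    if Current > High(PtrInt) - Current div 4 then
      Minimum := High(PtrInt)
    else
      Minimum := Current + Current div 4;
    if Requested < Minimum then
      Result := Minimum;
  end;
end;

var
  Current: PtrInt;
begin
  // Just above the point where 5*Current no longer fits in a PtrInt.
  Current := High(PtrInt) div 5 + 1000;
  WriteLn(GrowCapacity(Current, Current + 1) > Current);
  WriteLn(GrowCapacity(100, 101));
end.
```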
Re: [fpc-devel] Bitset assembler
Thank you Jonas,
so back to my original question,
is there an asm expert out there who knows if the syntax is invalid, or simply the compiler does not implement bt, bts, btr instructions

function BitsetGet(const Bits; Index: UInt32): Boolean; assembler;
asm
{$IFDEF WIN64}
  // Win64 IN: rcx = Bits, edx = Index OUT: rax = Result
  bt (%rcx), %edx      // -> Error asm: [bt reg32,mem32]
  // bt (%rcx), %rdx   // -> Error asm: [bt reg64,mem64]
  sbb %eax, %eax
  and %eax, $1
{$ELSE}
  // Linux IN: rdi = Bits, esi = Index OUT: rax = Result
  bt (%rdi), %esi
  sbb %rax, %rax
  and %rax, $1
{$ENDIF}
end;

On Sat, Sep 10, 2016 at 5:17 PM, Jonas Maebe wrote:
> On 10/09/16 12:55, Jy V wrote:
>
>> I am not sure the FreePascal compiler is able to convert the code of the
>> procedure BitsetSet(var Bits; Index: UInt32);
>> PUInt64Array(@Bits)^[Index shr 6] := PUInt64Array(@Bits)^[Index shr 6]
>> or (Int64(1) shl (Index and 63));
>>
>> into a single instruction:
>>
>> bts [eax], edx
>>
>
> It could easily do it with
>
>   type
>     tbitarray = bitpacked array[0..high(qword)-1] of boolean;
>     pbitarray = ^tbitarray;
>
>   var
>     ba: pbitarray;
>     index: qword;
>   begin
>     ...
>     ba^[index]:=1;
>   end.
>
> but only *if* someone would first override
> thlcgobj.a_load_regconst_subsetref_intern()
> for x86 in the compiler source code and implement the special cases for
> setting a single bit to 0 or 1 (which is not the case, currently). The bts
> instruction is already used for include(setvar,value), but sets are
> obviously limited to 256 elements.
>
>
> Jonas
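A small self-contained variant of the bitpacked-array approach Jonas describes could look like this. The fixed bound of 1024 elements and the names TBitArray and Flags are assumptions made to keep the sketch compact, and the element is assigned True because the element type is Boolean. Per Jonas's message, the compiler does not currently turn such an assignment into a single bts instruction.

```pascal
program BitpackedSketch;
{$mode objfpc}

type
  // Assumed fixed size for illustration; each element is stored as one bit.
  TBitArray = bitpacked array[0..1023] of Boolean;

var
  Flags: TBitArray;
begin
  FillChar(Flags, SizeOf(Flags), 0);   // start with every bit cleared
  Flags[70] := True;                   // set a single bit
  WriteLn(Flags[70]);
  WriteLn(Flags[71]);
end.
```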