Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-24 Thread Michael Ring
The PIC32MX chips have one or two shadow register sets. On systems with 
only one shadow set, it is hardcoded to the highest interrupt priority (7).


I have implemented detection of shadow register use; in that case the 
registers are not pushed on the stack, which saves quite a few CPU cycles.


But in a microcontroller system you usually have several peripherals 
running at different interrupt levels, and less important tasks run at 
lower priorities, so register saving is still an issue.


Until now I did not look into floating point at all. The small 
PIC32MX1/2 devices have no floating point processor, so I never used the 
real datatype anyway. But if code for hardware floating point is always 
generated, this will of course create issues at runtime as soon as 
somebody uses that datatype. But that's a story for another rainy day.


Right now it is sufficient for me to know that it is most likely a 
bug or an unimplemented feature (as Sergei said); I was afraid that I 
had done something wrong when defining the target.


Perhaps in a future far, far away I will look into what is needed to also 
support PIC32MM and PIC32MZ, but as they use the microMIPS instruction set 
(something like Thumb mode on ARM) I will need to learn more about the 
inner workings of fpc. So please bear with me when I continue asking 
questions about the inner workings of fpc. I am slowly understanding more 
and more of how things work, but I still do not see the big picture.


Michael

On 24.08.16 at 10:09, Florian Klämpfl wrote:

On 13.08.2016 at 18:57, Michael Ring wrote:

Hi!

I am trying to bring interrupt handling routine size down (and speed up) for 
mipsel-embedded target.

I need to use inline assembler routines like this one

procedure TSystemCore.setCoreTimerComp(value : longWord); assembler; 
nostackframe;
asm
   mtc0 $a1,$11,0
end ['a1'];

inside of the interrupt handler, but as soon as I include the call to this 
procedure the number of
registers that get saved explodes. When I only need to modify some peripheral I 
usually get away
with only $v0 and $v1 registers getting saved, but with asm routine included 
all registers get saved.

Same is true if I put the asm block directly inside of the interrupt handler.

As you can see I have added the used registers list for this procedure so my 
expectation was that
only the register declared does get added to the list of used registers.

Is this a bug on mips platform or is there in general no way to define the list 
of used registers
for an assembler routine so that register allocation works more efficient?

Or is there another way for me to trick freepascal in not saving all registers?


Did you read the suggestion from Michael Schnell? Does the MIPS you use have a 
shadow register set?

___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
http://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel




Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-24 Thread Florian Klämpfl
On 13.08.2016 at 18:57, Michael Ring wrote:
> Hi!
> 
> I am trying to bring interrupt handling routine size down (and speed up) for 
> mipsel-embedded target.
> 
> I need to use inline assembler routines like this one
> 
> procedure TSystemCore.setCoreTimerComp(value : longWord); assembler; 
> nostackframe;
> asm
>   mtc0 $a1,$11,0
> end ['a1'];
> 
> inside of the interrupt handler, but as soon as I include the call to this 
> procedure the number of
> registers that get saved explodes. When I only need to modify some peripheral 
> I usually get away
> with only $v0 and $v1 registers getting saved, but with asm routine included 
> all registers get saved.
> 
> Same is true if I put the asm block directly inside of the interrupt handler.
> 
> As you can see I have added the used registers list for this procedure so my 
> expectation was that
> only the register declared does get added to the list of used registers.
> 
> Is this a bug on mips platform or is there in general no way to define the 
> list of used registers
> for an assembler routine so that register allocation works more efficient?
> 
> Or is there another way for me to trick freepascal in not saving all 
> registers?
> 

Did you read the suggestion from Michael Schnell? Does the MIPS you use have a 
shadow register set?



Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-23 Thread Michael Ring
I tried a small test program with -CfSoft, -CfMIPS2 and -CfMIPS3; the 
result is always the same. I did a grep on the *.s files.


I see allocations of float registers within the procedures, but where the 
procedures are called only CPU registers are marked as allocated.


Michael


Output:

test.s:# Register v0 allocated
test.s:# Register f2 allocated
test.s:# Register v0 released
test.s:# Register v0 allocated
test.s:# Register v0 released
test.s:# Register f0 allocated
test.s:# Register f0 released
test.s:# Register f2 released
test.s:# Register v0 allocated
test.s:# Register v1 allocated
test.s:# Register v0,v1 released
test.s:# Register a0 allocated
test.s:# Register a0 released
test.s:# Register v0 allocated
test.s:# Register v0 released
test.s:# Register a0 allocated
test.s:# Register a0 released
test.s:# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra allocated
test.s:# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra released
test.s:# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra allocated
test.s:# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra released
test.s:# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra allocated
test.s:# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra released

test.s:# Register v0 allocated
test.s:# Register v0 released
test.s:# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra allocated
test.s:# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra released

test1.s:   # Register a0 allocated
test1.s:   # Register a0 released
test1.s:   # Register v0 allocated
test1.s:   # Register f2 allocated
test1.s:   # Register v0 released
test1.s:   # Register v0 allocated
test1.s:   # Register v0 released
test1.s:   # Register f0 allocated
test1.s:   # Register f0 released
test1.s:   # Register f2 released

On 22.08.16 at 21:52, Florian Klämpfl wrote:

On 21.08.2016 at 13:32, Michael Ring wrote:

Was getting high hopes for a moment...

@Charlie: the last point you mention, this optimization is already there. As 
long as I do not call a
procedure and directly include inline assembler in the interrupt routine all is 
fine, only really
used registers are in the list of registers that need to get saved and the 
interrupt handlers gets
quite lean & efficient.

@Sergej: I just started wondering on usage of fp registers, when I call a 
routine that uses floating
point I see that the fp registers are not marked as reserved by the compiler, 
what do you think?

The procedure below (test) uses $f0 and $f2 but they are not marked as 
allocated:

 # Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra 
allocated
 jal P$TEST_$$_TEST
 nop
 # Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra 
released

could this be a bug? (I have also modified 
tcpuparamanager.get_volatile_registers_fp to return [] so
i'd expect to see $f0..$f19 pushed to stack but I see nothing)

Soft float activated?



Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-22 Thread Florian Klämpfl
On 21.08.2016 at 13:32, Michael Ring wrote:
> Was getting high hopes for a moment...
> 
> @Charlie: the last point you mention, this optimization is already there. As 
> long as I do not call a
> procedure and directly include inline assembler in the interrupt routine all 
> is fine, only really
> used registers are in the list of registers that need to get saved and the 
> interrupt handlers gets
> quite lean & efficient.
> 
> @Sergej: I just started wondering on usage of fp registers, when I call a 
> routine that uses floating
> point I see that the fp registers are not marked as reserved by the compiler, 
> what do you think?
> 
> The procedure below (test) uses $f0 and $f2 but they are not marked as 
> allocated:
> 
> # Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra 
> allocated
> jal P$TEST_$$_TEST
> nop
> # Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra 
> released
> 
> could this be a bug? (I have also modified 
> tcpuparamanager.get_volatile_registers_fp to return [] so
> i'd expect to see $f0..$f19 pushed to stack but I see nothing)

Soft float activated?



Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-22 Thread Michael Schnell



On 13.08.2016 18:57, Michael Ring wrote:
As you can see I have added the used registers list for this procedure 
so my expectation was that only the register declared does get added 
to the list of used registers.

Just an additional comment:

There are MIPS CPUs (e.g. the PIC32 series) that feature multiple register 
sets. With those, (certain definable) ISRs don't need to save any 
registers; instead the hardware automatically activates an alternate 
register set until the return from interrupt is performed.


I suggest that this case should be handled as well.

-Michael


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-21 Thread Sergei Gorelkin

On 21.08.2016 14:32, Michael Ring wrote:

Was getting high hopes for a moment...

@Sergej: I just started wondering on usage of fp registers, when I call a 
routine that uses floating
point I see that the fp registers are not marked as reserved by the compiler, 
what do you think?

The procedure below (test) uses $f0 and $f2 but they are not marked as 
allocated:

 # Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra 
allocated
 jal P$TEST_$$_TEST
 nop
 # Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra 
released

could this be a bug? (I have also modified 
tcpuparamanager.get_volatile_registers_fp to return [] so
i'd expect to see $f0..$f19 pushed to stack but I see nothing)

Could of course be me causing this bug, but I checked my diff to trunk, I have 
not knowingly changed
fp behaviour besides changing get_volatile_registers_fp

It's either a bug or an unimplemented feature, not your fault. Currently calls allocate non-integer 
register types only if the caller uses registers of that type itself. This is fine for calls between 
procedures with the same calling convention (i.e. equal sets of volatile registers), but not for calls 
where the callee's set of volatile registers is larger than the caller's.


But you can probably force 'use' of fp registers by adding one of them after 
the 'asm' block.
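
The allocation rule described above can be modelled as a small program that runs on the host. This is only a sketch under stated assumptions, not compiler code: the names TRegType, TRegTypes and TypesAllocatedAtCall are made up for illustration and do not exist in the fpc sources.

```pascal
program fp_alloc_model;
{$mode objfpc}

type
  // Hypothetical names, used only for this illustration
  TRegType = (rtInt, rtFpu);
  TRegTypes = set of TRegType;

{ Register types whose volatile registers get marked as allocated around a
  call, following the rule quoted above. }
function TypesAllocatedAtCall(const CallerUses: TRegTypes): TRegTypes;
begin
  Result := [rtInt];            // integer volatiles are always allocated
  if rtFpu in CallerUses then
    Include(Result, rtFpu);     // fp volatiles only if the caller uses fp itself
end;

var
  Allocated: TRegTypes;
begin
  // test_interrupt only touches integer registers, but calls test,
  // which uses $f0 and $f2
  Allocated := TypesAllocatedAtCall([rtInt]);
  if not (rtFpu in Allocated) then
    WriteLn('fp registers changed by the callee are not marked in the caller');
end.
```

In this model the interrupt routine never sees the fp registers of the callee, which is the situation shown in the allocation comments above.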

Regards,
Sergei


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-21 Thread Michael Ring

Was getting high hopes for a moment...

@Charlie: the last point you mention, this optimization is already 
there. As long as I do not call a procedure and directly include inline 
assembler in the interrupt routine all is fine: only the registers that 
are really used are in the list of registers that need to be saved, and 
the interrupt handler gets quite lean & efficient.


@Sergej: I just started wondering about the usage of fp registers. When I 
call a routine that uses floating point, I see that the fp registers are 
not marked as reserved by the compiler. What do you think?


The procedure below (test) uses $f0 and $f2 but they are not marked as 
allocated:


# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra allocated
jal P$TEST_$$_TEST
nop
# Register at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra released


Could this be a bug? (I have also modified 
tcpuparamanager.get_volatile_registers_fp to return [], so I'd expect to 
see $f0..$f19 pushed to the stack, but I see nothing.)


Of course it could be me causing this bug, but I checked my diff to trunk 
and I have not knowingly changed fp behaviour besides changing 
get_volatile_registers_fp.


Michael

procedure test;
var
  b : real;
begin
  b := sqrt(a);
end;

procedure test_interrupt; interrupt;
var
  b : real;
begin
  inc(a);
  asm
    nop
  end ['a0'];
  test;
  //b := round(a);
end;

Michael


On 21.08.16 at 12:25, Karoly Balogh (Charlie/SGR) wrote:

Hi,

On Sun, 21 Aug 2016, Sergei Gorelkin wrote:


It is actually the opposite way around.
g_save_registers/g_restore_registers methods are only used for first
implemented targets (i386 and maybe m68k). All newer targets are written
without calling them, since they are completely inappropriate to
implement stack frame optimizations or push/pop-alike instructions for
register saving.

Well, they still have stubs in the HLCG, which is why I thought it must be
newer than just dumping everything in g_proc_entry. Actually, I
implemented them quite recently for 68k, and they're still routed in live
code in psub.pas.

However, since historically I missed the large compiler refactor in the
mid-'00s, I believe you. Anyway...


MIPS codegenerator does check which registers are actually used. The
issue is, a procedure with 'interrupt' modifier must not modify any
registers at all because it can be called asynchronously. As soon as an
'interrupt' procedure calls another (regular) procedure, the callee may
modify any registers from volatile list, and the caller has no way to
know which ones. Therefore, it has no other option than to save/restore
all volatile registers.

Well, one possible optimization would be to only save all volatiles if the
interrupt routine actually calls another function. Otherwise only save the
ones used by the current proc. This would allow some very small and very
fast interrupt functions, right? I'm not sure though if there's an easy
way to determine if there is a function call inside the function I'm
generating code for.

Maybe at the point of generating a function call, if the current proc is
an interrupt, mark all volatiles as used somehow?

Charlie


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-21 Thread Sergei Gorelkin



On 21.08.2016 13:25, Karoly Balogh (Charlie/SGR) wrote:



Well, one possible optimization would be to only save all volatiles if the
interrupt routine actually calls another function. Otherwise only save the
ones used by the current proc. This would allow some very small and very
fast interrupt functions, right? I'm not sure though if there's an easy
way to determine if there is a function call inside the function I'm
generating code for.

Maybe at the point of generating a function call, if the current proc is
an interrupt, mark all volatiles as used somehow?


What you suggest has been implemented for ages. Any call node allocates all volatile registers 
before doing the call and releases them afterwards, to indicate that these registers are modified by 
the callee. The same is done when a call to a helper function is generated at low level (see 
thlcg.g_call_system_proc). Furthermore, any call sets pi_do_call in current_procinfo.flags, to 
simplify checking for this case.
There's no difference in processing calls between 'interrupt' and regular caller procedures. The 
difference only appears when generating the caller's prologue/epilogue: a regular procedure won't save 
volatile registers even if they are used, but an 'interrupt' one will save them because it considers 
all registers non-volatile.


Things are a bit trickier with inline assembler, because in general the author may write anything 
there, and the compiler does no deep checks. For this reason, an assembler block without a list of 
modified registers is historically considered equal to a call (i.e. it modifies all volatile 
registers). An explicit list of modified registers overrides that behavior, however.
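
The rules above (calls and list-less asm blocks mark all volatiles as used; only the prologue/epilogue differs between regular and 'interrupt' procedures) can be sketched as a host-side model. It is a simplification for illustration, not the compiler's code: TReg, VolatileRegs, UsedRegs, SavedRegs and RegCount are hypothetical names, and the volatile set is copied from the "Register ... allocated" comments quoted in this thread.

```pascal
program isr_save_model;
{$mode objfpc}

type
  // Subset of the MIPS integer registers, enough for the illustration
  TReg = (r_at, r_v0, r_v1, r_a0, r_a1, r_a2, r_a3,
          r_t0, r_t1, r_t2, r_t3, r_t4, r_t5, r_t6, r_t7,
          r_s0, r_t8, r_t9, r_ra);
  TRegSet = set of TReg;

const
  // Volatile set as it appears in the allocation comments of this thread
  VolatileRegs: TRegSet = [r_at..r_t7, r_t8, r_t9, r_ra];

{ Registers marked as used by a procedure body: its own code, the explicit
  asm register list, plus all volatiles for any call or list-less asm block. }
function UsedRegs(const OwnCode: TRegSet; HasCall, AsmWithoutList: Boolean;
  const AsmList: TRegSet): TRegSet;
begin
  Result := OwnCode + AsmList;
  if HasCall or AsmWithoutList then
    Result := Result + VolatileRegs;
end;

{ Registers saved by the prologue: a regular procedure skips volatiles,
  an 'interrupt' procedure treats every used register as non-volatile. }
function SavedRegs(const Used: TRegSet; IsInterrupt: Boolean): TRegSet;
begin
  if IsInterrupt then
    Result := Used
  else
    Result := Used - VolatileRegs;
end;

function RegCount(const S: TRegSet): Integer;
var
  R: TReg;
begin
  Result := 0;
  for R := Low(TReg) to High(TReg) do
    if R in S then
      Inc(Result);
end;

begin
  // ISR with inline asm and an explicit register list, no call
  WriteLn(RegCount(SavedRegs(UsedRegs([r_v0, r_v1], False, False, [r_a0]), True)));
  // The same ISR, but calling a regular procedure
  WriteLn(RegCount(SavedRegs(UsedRegs([r_v0, r_v1], True, False, [r_a0]), True)));
  // A regular procedure that uses $s0 and makes a call
  WriteLn(RegCount(SavedRegs(UsedRegs([r_v0, r_s0], True, False, []), False)));
end.
```

The three cases print 3, 18 and 1: the lean ISR saves only what it touches, the ISR with a call saves every volatile register, and the regular procedure saves only its non-volatile $s0.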


Regards,
Sergei


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-21 Thread Karoly Balogh (Charlie/SGR)
Hi,

On Sun, 21 Aug 2016, Sergei Gorelkin wrote:

> It is actually the opposite way around.
> g_save_registers/g_restore_registers methods are only used for first
> implemented targets (i386 and maybe m68k). All newer targets are written
> without calling them, since they are completely inappropriate to
> implement stack frame optimizations or push/pop-alike instructions for
> register saving.

Well, they still have stubs in the HLCG, which is why I thought it must be
newer than just dumping everything in g_proc_entry. Actually, I
implemented them quite recently for 68k, and they're still routed in live
code in psub.pas.

However, since historically I missed the large compiler refactor in the
mid-'00s, I believe you. Anyway...

> MIPS codegenerator does check which registers are actually used. The
> issue is, a procedure with 'interrupt' modifier must not modify any
> registers at all because it can be called asynchronously. As soon as an
> 'interrupt' procedure calls another (regular) procedure, the callee may
> modify any registers from volatile list, and the caller has no way to
> know which ones. Therefore, it has no other option than to save/restore
> all volatile registers.

Well, one possible optimization would be to only save all volatiles if the
interrupt routine actually calls another function. Otherwise only save the
ones used by the current proc. This would allow some very small and very
fast interrupt functions, right? I'm not sure though if there's an easy
way to determine if there is a function call inside the function I'm
generating code for.

Maybe at the point of generating a function call, if the current proc is
an interrupt, mark all volatiles as used somehow?

Charlie


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-21 Thread Sergei Gorelkin



On 21.08.2016 12:06, Karoly Balogh (Charlie/SGR) wrote:

Hi,

On Sun, 21 Aug 2016, Michael Ring wrote:


So unless there is a way to find out which registers get used by a procedure
the only thing I can do to make interrupt routines as lean as possible is to
not use procedures in them if possible.


There is a way, of course. Seems like the MIPS CG was never updated to
depend on cgobj.g_save_registers/cgobj.g_restore_registers (or implement
these on its own), which takes into account which registers were used in
the procedure, and only saves those. See g_save/restore_registers
implementation in cgobj.pas for an inspiration how it should be done. The
m68k CG also reimplements these methods with some CPU-specific extensions.

It seems to use the old-style approach of just always saving all volatile
registers, and do everything on its own in g_proc_entry/exit which is the
old way (and also still used in some other CGs).

Not sure why it was never updated tho' for MIPS, I don't know anything
about the MIPS CG, and very little about the architecture itself. But it
sounds like this is definitely the improvement you want.




It is actually the other way around. The g_save_registers/g_restore_registers methods 
are only used for the first implemented targets (i386 and maybe m68k). All newer targets are written 
without calling them, since they are completely inappropriate for implementing stack frame optimizations 
or push/pop-like instructions for register saving.


The MIPS code generator does check which registers are actually used. The issue is that a procedure with 
the 'interrupt' modifier must not modify any registers at all, because it can be called asynchronously. 
As soon as an 'interrupt' procedure calls another (regular) procedure, the callee may modify any 
register from the volatile list, and the caller has no way to know which ones. Therefore, it has no 
other option than to save/restore all volatile registers.


Regards,
Sergei


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-21 Thread Karoly Balogh (Charlie/SGR)
Hi,

On Sun, 21 Aug 2016, Michael Ring wrote:

> So unless there is a way to find out which registers get used by a procedure
> the only thing I can do to make interrupt routines as lean as possible is to
> not use procedures in them if possible.

There is a way, of course. Seems like the MIPS CG was never updated to
depend on cgobj.g_save_registers/cgobj.g_restore_registers (or implement
these on its own), which takes into account which registers were used in
the procedure, and only saves those. See g_save/restore_registers
implementation in cgobj.pas for an inspiration how it should be done. The
m68k CG also reimplements these methods with some CPU-specific extensions.

It seems to use the old-style approach of just always saving all volatile
registers, and do everything on its own in g_proc_entry/exit which is the
old way (and also still used in some other CGs).

Not sure why it was never updated tho' for MIPS, I don't know anything
about the MIPS CG, and very little about the architecture itself. But it
sounds like this is definitely the improvement you want.

Cheers,
--
Charlie


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-21 Thread Michael Ring

Thank you, your comments made me think (and write some more test code).

This is how I understand things right now:

The problem for interrupt routines is that whenever a procedure is 
called the default list of volatile registers is allocated.
This is a non-issue in normal code, as those registers never end up on the 
stack because they get filtered out in g_proc_entry and g_proc_exit.


For an interrupt things are a little different: all registers that get 
touched need to be saved on the stack. So each time a procedure is 
called in an interrupt routine the whole list of these registers gets 
saved on the stack, because for this case I changed 
tcpuparamanager.get_volatile_registers_int() to return [].


So unless there is a way to find out which registers get used by a 
procedure the only thing I can do to make interrupt routines as lean as 
possible is to not use procedures in them if possible.


Fortunately inline assembler inside of the interrupt routine works just 
fine (It seems I did something wrong when testing this):


The simple test program below saves only 3 registers on the stack 
($v0, $v1, $a0) when I comment out the call of the procedure. The moment I 
remove the comment, the following registers get saved:


at,v0,v1,a0,a1,a2,a3,t0,t1,t2,t3,t4,t5,t6,t7,t8,t9,ra,fp and I guess I 
must live with this fact.


But I am good with this, because using inline assembler in the interrupt 
works and so I can streamline the interrupt routines.



Thank you both for your valuable help,


Michael


program test;
{$MODE OBJFPC}
var
 a : longWord;

procedure test;
begin
  inc(a);
end;

procedure test_interrupt; interrupt;
begin
  inc(a);
  asm
    nop
  end ['a0'];
  //test;
end;

begin
  a := 0;
end.

On 19.08.16 at 23:00, Jonas Maebe wrote:

On 19/08/16 22:49, Michael Ring wrote:


On 19.08.16 at 14:49, Jonas Maebe wrote:


Michael Ring wrote on Sat, 13 Aug 2016:


I am trying to bring interrupt handling routine size down (and speed
up) for mipsel-embedded target.

I need to use inline assembler routines like this one

procedure TSystemCore.setCoreTimerComp(value : longWord); assembler;
nostackframe;
asm
  mtc0 $a1,$11,0
end ['a1'];


Mentioning changed registers at the end of a pure assembler routine
has no effect. The compiler normally prints a warning about this. The
set of changed registers by a routine always only depends on its
calling convention. On most platforms we only support the official
ABI's calling convention, which is also the default.


I also tried something like this:

procedure TSystemCore.setCoreTimerComp(value : longWord);
begin
  asm
    mtc0 $a1,$11,0
  end ['a1'];
end;

with the same result: all registers are saved instead of only a few.


It is not clear what you mean by this. In your original message, you 
said that all registers were saved "as soon as I include the call to 
this procedure". As explained, the registers that are saved when 
calling a routine only depend on what the ABI says are 
volatile/callee-saved registers. Which registers are actually used by 
the called routine has no influence at all.



Same is true if I put the asm block directly inside of the interrupt
handler.


In that case, the list of changed registers should be taken into
account. OTOH, using an inline assembler block disables the use of
register variables for that routine by the compiler, but that
should result in fewer registers getting saved rather than more.


Do you remember where this is coded or for what I should search in the
fpc sourcecode? Then I can try to find out what is going on in the mips
case.


It's the last part of the _asm_statement function in 
compiler/pstatemnt.pas


FWIW, tcpuparamanager.get_volatile_registers_int() in mips/cpupara.pas 
suggests that all integer registers except for R16-R23 are volatile, 
so no matter what you do, if any of those registers contains a value 
that is still needed after a call, they will be saved and restored.



Jonas


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-19 Thread Jonas Maebe

On 19/08/16 22:49, Michael Ring wrote:


On 19.08.16 at 14:49, Jonas Maebe wrote:


Michael Ring wrote on Sat, 13 Aug 2016:


I am trying to bring interrupt handling routine size down (and speed
up) for mipsel-embedded target.

I need to use inline assembler routines like this one

procedure TSystemCore.setCoreTimerComp(value : longWord); assembler;
nostackframe;
asm
  mtc0 $a1,$11,0
end ['a1'];


Mentioning changed registers at the end of a pure assembler routine
has no effect. The compiler normally prints a warning about this. The
set of changed registers by a routine always only depends on its
calling convention. On most platforms we only support the official
ABI's calling convention, which is also the default.


I also tried something like this:

procedure TSystemCore.setCoreTimerComp(value : longWord);
begin
  asm
    mtc0 $a1,$11,0
  end ['a1'];
end;

with the same result: all registers are saved instead of only a few.


It is not clear what you mean by this. In your original message, you 
said that all registers were saved "as soon as I include the call to 
this procedure". As explained, the registers that are saved when calling 
a routine only depend on what the ABI says are volatile/callee-saved 
registers. Which registers are actually used by the called routine has 
no influence at all.



Same is true if I put the asm block directly inside of the interrupt
handler.


In that case, the list of changed registers should be taken into
account. OTOH, using an inline assembler block disables the use of
register variables for that routine by the compiler, but that
should result in fewer registers getting saved rather than more.


Do you remember where this is coded or for what I should search in the
fpc sourcecode? Then I can try to find out what is going on in the mips
case.


It's the last part of the _asm_statement function in compiler/pstatemnt.pas

FWIW, tcpuparamanager.get_volatile_registers_int() in mips/cpupara.pas 
suggests that all integer registers except for R16-R23 are volatile, so 
no matter what you do, if any of those registers contains a value that 
is still needed after a call, they will be saved and restored.
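
To make the register numbers easier to relate to the names in the generated *.s files, here is a small host-side sketch that prints a volatile set by O32 register name. The set literal is an assumption taken from the "Register ... allocated" comments quoted elsewhere in this thread, not a copy of the code in mips/cpupara.pas; RegName and VolatileInt are names made up for this example.

```pascal
program volatile_model;
{$mode objfpc}

const
  // O32 names of the integer registers R0..R31
  RegName: array[0..31] of string = (
    'zero', 'at', 'v0', 'v1', 'a0', 'a1', 'a2', 'a3',
    't0', 't1', 't2', 't3', 't4', 't5', 't6', 't7',
    's0', 's1', 's2', 's3', 's4', 's5', 's6', 's7',
    't8', 't9', 'k0', 'k1', 'gp', 'sp', 'fp', 'ra');
  // Assumed volatile set, as it shows up in the allocation comments
  VolatileInt: set of 0..31 = [1..15, 24, 25, 31];

var
  R: Integer;
begin
  // Print the name of every register in the assumed volatile set
  for R := 0 to 31 do
    if R in VolatileInt then
      Write(RegName[R], ' ');
  WriteLn;
end.
```

R16-R23 are $s0..$s7 in this numbering, which is why they never appear in the lists of registers allocated around a call.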



Jonas


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-19 Thread Florian Klämpfl
On 19.08.2016 at 22:49, Michael Ring wrote:
> 
> On 19.08.16 at 14:49, Jonas Maebe wrote:
>>
>> Michael Ring wrote on Sat, 13 Aug 2016:
>>
>>> I am trying to bring interrupt handling routine size down (and speed up) 
>>> for mipsel-embedded target.
>>>
>>> I need to use inline assembler routines like this one
>>>
>>> procedure TSystemCore.setCoreTimerComp(value : longWord); assembler; 
>>> nostackframe;
>>> asm
>>>   mtc0 $a1,$11,0
>>> end ['a1'];
>>
>> Mentioning changed registers at the end of a pure assembler routine has no 
>> effect. The compiler
>> normally prints a warning about this. The set of changed registers by a 
>> routine always only
>> depends on its calling convention. On most platforms we only support the 
>> official ABI's calling
>> convention, which is also the default.
>>
> 
> I also tried something like this:
> 
> procedure TSystemCore.setCoreTimerComp(value : longWord);
> begin
>   asm
>     mtc0 $a1,$11,0
>   end ['a1'];
> end;
> 
> with the same result: all registers are saved instead of only a few.

All? Or only the non-volatiles?

> 
>>> inside of the interrupt handler, but as soon as I include the call to this 
>>> procedure the number
>>> of registers that get saved explodes. When I only need to modify some 
>>> peripheral I usually get
>>> away with only $v0 and $v1 registers getting saved, but with asm routine 
>>> included all registers
>>> get saved.
>>
>> If the ABI default calling convention states that a routine may change all 
>> registers, that is to
>> be expected.
>>
>>> Same is true if I put the asm block directly inside of the interrupt 
>>> handler.
>>
>> In that case, the list of changed registers should be taken into account. 
>> OTOH, using an inline
>> assembler block disables the use of register variables for that 
>> routine by the compiler,
>> but that should result in fewer registers getting saved rather than more.
> 
> Do you remember where this is coded or for what I should search in the fpc 
> sourcecode? Then I can
> try to find out what is going on in the mips case.

This is sligthly spread over the compiler, a starting point might be 
tcgmips.g_proc_entry in
compiler/mips/cgcpu.pas


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-19 Thread Michael Ring


On 19.08.16 at 14:49, Jonas Maebe wrote:
>
> Michael Ring wrote on Sat, 13 Aug 2016:
>
>> I am trying to bring interrupt handling routine size down (and speed
>> up) for mipsel-embedded target.
>>
>> I need to use inline assembler routines like this one
>>
>> procedure TSystemCore.setCoreTimerComp(value : longWord); assembler;
>> nostackframe;
>> asm
>>   mtc0 $a1,$11,0
>> end ['a1'];
>
> Mentioning changed registers at the end of a pure assembler routine
> has no effect. The compiler normally prints a warning about this. The
> set of registers changed by a routine always depends only on its
> calling convention. On most platforms we only support the official
> ABI's calling convention, which is also the default.

I also tried something like this:

procedure TSystemCore.setCoreTimerComp(value : longWord);
begin
  asm
    mtc0 $a1,$11,0
  end ['a1'];
end;

with the same result: all registers are saved instead of only a few.

>> inside of the interrupt handler, but as soon as I include the call to
>> this procedure the number of registers that get saved explodes. When
>> I only need to modify some peripheral I usually get away with only
>> $v0 and $v1 registers getting saved, but with asm routine included
>> all registers get saved.
>
> If the ABI default calling convention states that a routine may change
> all registers, that is to be expected.
>
>> Same is true if I put the asm block directly inside of the interrupt
>> handler.
>
> In that case, the list of changed registers should be taken into
> account. OTOH, using an inline assembler block disables the use of
> register variables for that routine by the compiler, but that
> should result in fewer registers getting saved rather than more.

Do you remember where this is coded or for what I should search in the
fpc source code? Then I can try to find out what is going on in the mips
case.

Thank you,

Michael

> Jonas


Re: [fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-19 Thread Jonas Maebe


Michael Ring wrote on Sat, 13 Aug 2016:

> I am trying to bring interrupt handling routine size down (and speed
> up) for mipsel-embedded target.
>
> I need to use inline assembler routines like this one
>
> procedure TSystemCore.setCoreTimerComp(value : longWord); assembler;
> nostackframe;
> asm
>   mtc0 $a1,$11,0
> end ['a1'];

Mentioning changed registers at the end of a pure assembler routine
has no effect. The compiler normally prints a warning about this. The
set of registers changed by a routine always depends only on its
calling convention. On most platforms we only support the official
ABI's calling convention, which is also the default.

> inside of the interrupt handler, but as soon as I include the call
> to this procedure the number of registers that get saved explodes.
> When I only need to modify some peripheral I usually get away with
> only $v0 and $v1 registers getting saved, but with asm routine
> included all registers get saved.

If the ABI default calling convention states that a routine may change
all registers, that is to be expected.

> Same is true if I put the asm block directly inside of the interrupt handler.

In that case, the list of changed registers should be taken into
account. OTOH, using an inline assembler block disables the use of
register variables for that routine by the compiler, but that
should result in fewer registers getting saved rather than more.



Jonas
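
For illustration, a minimal sketch of the second variant discussed above: the asm block placed directly in the interrupt handler, with the clobbered register named in the list after "end". It assumes a mipsel-embedded target. The unit name, the handler name and the compare value are hypothetical and do not come from the thread. Whether the MIPS code generator then restricts the saved registers to the listed one is exactly the open question here, so this shows the intended usage, not a confirmed result.

```pascal
unit coretimer_sketch;   // hypothetical unit name

interface

// Hypothetical handler; hooking it into the vector table is target-specific
// and deliberately left out of this sketch.
procedure CoreTimerHandler; interrupt;

implementation

procedure CoreTimerHandler; interrupt;
begin
  // Load a (hypothetical) compare value into $v0 and write it to
  // CP0 register 11, select 0, as in the mtc0 line from the thread.
  // Only $v0 is named as changed in the register list.
  asm
    li   $v0, 100000
    mtc0 $v0, $11, 0
  end ['v0'];
end;

end.
```

Because no routine is called from the handler, the calling convention's set of changed registers does not come into play; only the register list on the asm block should matter.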


[fpc-devel] Is there a way to make Register Allocation inside of Interrupt Service Routines more efficient when using inline-assembler?

2016-08-13 Thread Michael Ring

Hi!

I am trying to bring interrupt handling routine size down (and speed up) 
for mipsel-embedded target.


I need to use inline assembler routines like this one

procedure TSystemCore.setCoreTimerComp(value : longWord); assembler; 
nostackframe;

asm
  mtc0 $a1,$11,0
end ['a1'];

inside of the interrupt handler, but as soon as I include the call to 
this procedure the number of registers that get saved explodes. When I 
only need to modify some peripheral I usually get away with only $v0 and 
$v1 registers getting saved, but with asm routine included all registers 
get saved.


Same is true if I put the asm block directly inside of the interrupt 
handler.


As you can see, I have added the list of used registers to this procedure,
so I expected that only the declared register would be added to the list
of used registers.


Is this a bug on the mips platform, or is there in general no way to
define the list of used registers for an assembler routine so that
register allocation works more efficiently?


Or is there another way for me to trick Free Pascal into not saving all
registers?


Thank you,

Michael

