Re: [beagleboard] unexpected "low speed" of PRU 1

2021-05-13 Thread TJF
Hi Kasimir, sorry my post overlapped.

Kasimir wrote on Thursday, 13 May 2021 at 20:36:18 UTC+2:

> So the loop instruction is not known ( UNKNOWN in disassembler list )
> Is not a solution for Beaglebone black.
> Assembler did not warn or complain.
>

The LOOP instruction works in PASM assembler.

Note: nested LOOP instructions are not allowed.

-- 
For more options, visit http://beagleboard.org/discuss
--- 
You received this message because you are subscribed to the Google Groups 
"BeagleBoard" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to beagleboard+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/beagleboard/5562829e-1b8f-4f89-b8cc-a7d2649acccbn%40googlegroups.com.


Re: [beagleboard] unexpected "low speed" of PRU 1

2021-05-13 Thread din...@gmail.com
Which assembler are you using? It should have warned you that "loop weiter" 
body must be at least two instructions, whereas you have zero. 

Also, you cannot nest HW-assisted loops.

Regards,
Dimitar



Re: [beagleboard] unexpected "low speed" of PRU 1

2021-05-13 Thread TJF

Kasimir wrote on Wednesday, 12 May 2021 at 21:49:33 UTC+2:

> It works fine; only the delay loop needs better resolution. At the
> moment the time for a single loop iteration is too long.
> I have no idea how to optimize it.
>

Twice as fast:

LOOP EndWait, R17.w0 // note: max 16 bit counter
EndWait:
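
To see what a faster delay loop would buy in resolution, loop counts can be converted to time. The following is a host-side C sketch, not PRU code. It assumes the 5 ns cycle (200 MHz) quoted in this thread, 2 cycles per iteration for the sub/qbne loop, and 1 cycle per iteration for the LOOP variant, which is one reading of "twice as fast". The names delay_ns and count_for_delay are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PRU_CYCLE_NS 5u  /* 1 / 200 MHz, as stated in the thread */

/* Delay produced by `count` iterations of a loop that costs
 * `cycles_per_iter` cycles each (assumed: 2 for sub+qbne, 1 for LOOP). */
static inline uint64_t delay_ns(uint32_t count, uint32_t cycles_per_iter)
{
    assert(cycles_per_iter > 0);
    return (uint64_t)count * cycles_per_iter * PRU_CYCLE_NS;
}

/* Smallest iteration count whose delay is at least `ns` nanoseconds. */
static inline uint32_t count_for_delay(uint64_t ns, uint32_t cycles_per_iter)
{
    assert(cycles_per_iter > 0);
    uint64_t step = (uint64_t)cycles_per_iter * PRU_CYCLE_NS;
    return (uint32_t)((ns + step - 1) / step);  /* round up to a whole step */
}
```

Under these assumptions the delay step shrinks from 10 ns to 5 ns, and a 16-bit counter such as R17.w0 caps the delay at delay_ns(65535, 1), i.e. 327675 ns.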
 

> Also from
>    or      r30, r30, (1<<4)    ; debug, trigger signal for oscilloscope
> to
> naechster:
>    lbbo    , r15, 4, 1         ; (r15) = pattern
> I measure 250 ns. I was expecting 25 ns.
>
> I can see some jitter on my oscilloscope (Tektronix THS730A). It has
> nothing to do with the GND connection, long wires, etc.; all of that is
> perfect, and the oscilloscope works fine.
>
> Is it possible that something from the Linux/ARM side is disturbing my
> timing?
>

The LBBO , r15, 4, 1 instruction needs at least 3+1 cycles (as long as
the address in R15 is not in the PRU local memory map). It may take
additional cycles when there is heavy traffic on the L3 bus.

Note: to count cycles you don't need an oscilloscope. Instead you can use the
CYCLE register (offset = 0Ch) in the PRUSS_PRU_CTRL register space.
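
As a sketch of how the CYCLE register could be used, the C fragment below models the start of the PRU control register block as a struct and converts the difference between two CYCLE readings to nanoseconds. It is host-testable and does not touch hardware. Only the CYCLE offset (Ch) comes from this thread; the struct name pru_ctrl_regs, the helper names, and the assumed order of the preceding registers (CONTROL, STATUS, WAKEUP_EN) are illustrative and should be checked against the AM335x TRM. On the PRU itself the block would be accessed through a pointer to the control register space, and the counter typically has to be enabled in the CONTROL register first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout of the first PRU control registers (verify in the TRM). */
struct pru_ctrl_regs {
    volatile uint32_t control;    /* 0x00 */
    volatile uint32_t status;     /* 0x04 */
    volatile uint32_t wakeup_en;  /* 0x08 */
    volatile uint32_t cycle;      /* 0x0C: CYCLE register named in the thread */
};

#define PRU_CYCLE_NS 5u  /* 1 / 200 MHz */

/* Read the cycle counter from a (real or simulated) register block. */
static inline uint32_t read_cycle(const struct pru_ctrl_regs *regs)
{
    assert(regs != NULL);
    return regs->cycle;
}

/* Elapsed time between two CYCLE readings. The counter is assumed not to
 * have overflowed between the readings, so `end` must not be below `start`. */
static inline uint64_t elapsed_ns(uint32_t start, uint32_t end)
{
    assert(end >= start);
    return (uint64_t)(end - start) * PRU_CYCLE_NS;
}
```

Reading the counter right before and right after the code under test, then calling elapsed_ns on the two values, gives the duration without a scope; a difference of 50 cycles corresponds to the 250 ns quoted above.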



Re: [beagleboard] unexpected "low speed" of PRU 1

2021-05-13 Thread Kasimir
Hi Mark,
I was trying to use the loop instruction.

    .global ausgabe
ausgabe:
    ldi     r18, 0              ; initialisation
    ldi     r30, 0x10           ; debug
    ldi     r17, 0x00           ; debug
    mov     r20, r15            ; save start address
    mov     r21, r14            ; save number of patterns
naechster:
    loop    next_pattern, r14   ; for each pattern
    lbbo    , r15, 4, 1         ; output (r15) = pattern
    lbbo    , r15, 0, 2         ; load number of delay loops
    loop    weiter, R17         ; delay loop
weiter:
    add     r15, r15, 5         ; increment address pointer by 5 (next data structure element)
next_pattern:
    mov     r15, r20            ; load saved start address into address pointer
    mov     r14, r21            ; load saved number of patterns into pattern counter
    lbbo    , r16, 0, 1         ; check for stop request
    or      r30, r30, (1<<4)    ; debug
    qbeq    naechster, r18, 0   ; if handshake[0] == 0, continue
    jmp     r3.w2               ; otherwise return; r3 contains the return address

I used prudebug to test the behavior.
The loop instruction is not recognized (it shows up as UNKNOWN in the
disassembler listing), so it does not look like a solution for the
BeagleBone Black. The assembler did not warn or complain.

Bottom line: independent of the above code, I am not seeing the 200 MHz
performance. I am far off, by a factor of about 20:1, if I assume 5 ns
instruction cycles for register operations.

There is something else going on; so far I have no idea what it could be.

Thanks for your help and thoughts
Kasimir




Re: [beagleboard] unexpected "low speed" of PRU 1

2021-05-12 Thread 'Mark Lazarewicz' via BeagleBoard
Hello Kasimir,

I will take a look, and hopefully others who are using the PRU can also help.
I began programming in asm many, many years ago but haven't used the PRU
assembler.
Can you reply whether you have an oscilloscope or a high-speed logic analyzer?
This is what we used for debugging many years ago.
You could remove all memory accesses by hard-coding the data (modify your
code): just do a tight loop toggling a GPIO and measure the frequency.
This will tell you the max frequency of your GPIO.
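
The result of such a tight-loop experiment is easy to sanity-check. As a sketch, assuming the 200 MHz PRU clock mentioned in this thread and a loop that toggles the pin once per pass, the expected square-wave frequency follows from the cycles per pass; conversely, a measured frequency gives the cycles actually spent. The function names are illustrative, and the 2-cycle loop (one XOR on R30 plus one branch) is an assumption, not something measured here.

```c
#include <assert.h>
#include <stdint.h>

#define PRU_CLOCK_HZ 200000000u  /* 200 MHz, as stated in the thread */

/* Output frequency when the pin is toggled once every `cycles` cycles:
 * one full period of the square wave needs two toggles. */
static inline uint32_t toggle_freq_hz(uint32_t cycles)
{
    assert(cycles > 0);
    return (uint32_t)(PRU_CLOCK_HZ / (2ull * cycles));
}

/* Inverse: cycles spent per toggle for a frequency measured on the scope. */
static inline uint32_t toggle_cycles_from_freq(uint32_t measured_hz)
{
    assert(measured_hz > 0);
    return (uint32_t)(PRU_CLOCK_HZ / (2ull * measured_hz));
}
```

For example, toggle_freq_hz(2) gives 50 MHz, while a scope reading of 2 MHz would correspond to 50 cycles per toggle, i.e. 250 ns.
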
Perhaps write some test code doing just that and share the results. Staring at
source code isn't always the fastest way to find an error, especially since we
don't have your exact setup.
In the meantime, hopefully someone sees something obvious. I'm sure the max
frequency of what you are attempting has been discussed before.
Maybe someone will comment on what they have achieved and share their solution.
Break the problem into pieces. Resisting the temptation to be drawn into
detours can be challenging when getting input.
By running experiments you can stay busy while waiting for input from group
members.
I hope that's helpful.
Mark






Re: [beagleboard] unexpected "low speed" of PRU 1

2021-05-12 Thread Kasimir
This is my code to output the pattern on __R30:

    .global ausgabe
ausgabe:
    ldi     r18, 0              ; initial value
    ldi     r30, 0x10           ; debug
    ldi     r17, 0x00           ; debug
    mov     r13, r15            ; R15 contains start address, save in R13
    mov     r12, r14            ; R14 contains number of data points
naechster:
    lbbo    , r15, 4, 1         ; (r15) = pattern
    lbbo    , r15, 0, 2         ; (r17) = time to wait to output next pattern
warte:
    sub     r17, r17, 1         ; delay loop
    qbne    warte, r17, 0       ;
    add     r15, r15, 5         ; next element, update pointer
    sub     r14, r14, 1         ; number of remaining elements - 1
    qbne    naechster, r14, 0   ; was it the last one?
    mov     r15, r13            ; yes, reload address pointer with saved value
    mov     r14, r12            ; and reload loop counter with saved number of elements
    lbbo    , r16, 0, 1         ; load variable; if 0 run again, if != 0 exit
    or      r30, r30, (1<<4)    ; debug, trigger signal for oscilloscope
    qbeq    naechster, r18, 0   ; as long as handshake[0] == 0
    jmp     r3.w2               ; r3 contains return address

The data structure:

typedef struct Event Event_t;

struct Event
{
    unsigned int  time;     // number of loops to the next event
    unsigned char pattern;  // Bit 7 | 6 | 5 | 4 |  3 | 2 |  1 | 0 |
                            // ------+---+---+---+----+---+----+---+
                            //       |   |   | d |~z34|z34|~z12|z12|
                            // ------+---+---+---+----+---+----+---+
};

int main( int argc, char *argv[])
{
    int i;
    int j;
    Event_t event_knoten[500];
    ...

    ausgabe(pattern_liste.anzahl, _knoten[0].time, [0]);  // asm to write pattern
                                                          // as long as handshake[0] == 0

It works fine; only the delay loop needs better resolution. At the moment
the time for a single loop iteration is too long.
I have no idea how to optimize it.
Also, from
    or      r30, r30, (1<<4)    ; debug, trigger signal for oscilloscope
to
naechster:
    lbbo    , r15, 4, 1         ; (r15) = pattern
I measure 250 ns. I was expecting 25 ns.

I can see some jitter on my oscilloscope (Tektronix THS730A). It has nothing
to do with the GND connection, long wires, etc.; all of that is perfect, and
the oscilloscope works fine.

Is it possible that something from the Linux/ARM side is disturbing my
timing?

Thanks again for any helpful input.
Kasimir



Re: [beagleboard] unexpected "low speed" of PRU 1

2021-05-12 Thread Peter Lange
Hi Mark,
thank you very much for the quick response.
I am going to post the ASM. Looking forward to it.
Kasimir



Re: [beagleboard] unexpected "low speed" of PRU 1

2021-05-12 Thread 'Mark Lazarewicz' via BeagleBoard
The memory access will add some cycles. Post your assembler code with
comments. You're correct, it doesn't make sense; maybe someone will see the
issue. The PRU labs discuss measuring cycle times in CCS if you have JTAG, but
toggling a GPIO and measuring with a scope is probably easier.






[beagleboard] unexpected "low speed" of PRU 1

2021-05-12 Thread Kasimir
Hi,
I'm working on a sine-triangle modulator running on a BeagleBone Black, on
PRU 1.
On the Linux/ARM side I calculate the pattern for one period as a data
structure: the pattern to output and the time to the next event.
The output is PRU 1 __R30, bits 0, 1, 2 and 3 (bit 4 is only for debugging,
as an oscilloscope trigger).
It works, but I'm not impressed by the speed.
The output loop of the PRU is written in a few lines of ASM.
Frequencies: the triangle should be 400 kHz, better 800 kHz;
the sine wave is between 20 kHz and 100 kHz.
The BeagleBone has to drive a high-speed GaN H-bridge.

The data transport and handshake between Linux and the PRU work fine.
A C program on the PRU watches for new data. The new data (the
pattern-time structure) are then copied into local RAM to get the best
speed (lowest latency).
Once the data are stored in local RAM, the assembler routine is called to
output the given pattern. First the arguments are saved in registers,
then the output runs in a loop:
pick up the pattern from local RAM and output it,
load the delay count from local RAM,
run the delay loop,
update the index register,
check for possible new data,
if there is none, go back to the top and output the next period.

As I said, it works. But with a cycle time of 5 ns (1/200 MHz) and 1
cycle for most (ASM) instructions, I can't see that speed.
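
To put the target frequencies in terms of PRU cycles, here is a small host-side C sketch. It assumes the 200 MHz clock (5 ns cycle) quoted above; the helper name cycles_per_period is illustrative.

```c
#include <assert.h>
#include <stdint.h>

#define PRU_CLOCK_HZ 200000000u  /* 200 MHz, i.e. a 5 ns cycle */

/* Number of PRU cycles available within one period of a signal at `hz`. */
static inline uint32_t cycles_per_period(uint32_t hz)
{
    assert(hz > 0);
    return PRU_CLOCK_HZ / hz;
}
```

This gives 500 cycles per triangle period at 400 kHz and 250 cycles at 800 kHz, so the cycles spent on loads and loop overhead between two output events take a noticeable share of the budget.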

So there is something wrong in my setup or code.
If somebody would like to help with debugging, let me know.
Sources with Makefile etc. are available.

Everything is based on the latest Debian image, all updates are installed,
and HDMI is off.

So let me know; I think it only makes sense to upload that stuff if there
really is somebody able to help with it.

Thanks in advance
Kasimir
