Thanks, Kumar! I ended up writing a slightly different program before reading your comment. I used the STALL register to get how many clock cycles an instruction stalls, which means the instruction actually takes 1 + stall cycles. I came up with these values on my BBB rev. A5A for the instructions that matter for my application (all tests used 32-bit accesses only):

LBBO/LBCO = 3 clocks for DRAM and Shared RAM
SBBO/SBCO = 2 clocks for DRAM and Shared RAM
LBBO = 43+ clocks for DDR reading (43.3 on average over 10000 tries)
LBBO = 42 (or 43) clocks for ADC FIFO0DATA reading (41.0001 on average over 10000 tries)

The ADC clock had no impact here (I tried 3 MHz and 8 MHz by changing the ADC_CLKDIV register).

I believe I will have to adapt my programs to always use the difference of the CYCLE register... It seems to be the only way to be deterministic. Until now I was manually counting instructions and subtracting them from the number of delay-loop iterations.

On Tuesday, May 26, 2015 at 13:34:17 UTC-3, Kumar Abhishek wrote:
>
> You could also use the code snippet in this article to calculate clock
> cycles for individual instructions:
>
> http://theembeddedkitchen.net/beaglelogic-building-a-logic-analyzer-with-the-prus-part-1/449
>
> On Tuesday, May 26, 2015 at 7:09:38 PM UTC+5:30, [email protected]
> wrote:
>>
>> Sorry, just saw that you actually mentioned that the shared memory has
>> the same performance as the DRAM.
>> Also, I found this:
>> http://processors.wiki.ti.com/index.php/Programmable_Realtime_Unit#Load_.2F_Store_Instructions
>> where it is said that LBBO should take (1+word count) cycles. If that's
>> right, an LBBO instruction up to 4 bytes should take 2 cycles for VBUS and
>> 3 cycles for VBUSP. For now I need to study more to understand which one is
>> the case, but VBUSP matches your findings.
>>
>> On Friday, January 3, 2014 at 23:05:30 UTC-2, Lenny wrote:
>>>
>>> Hello,
>>>
>>> I am using a Beaglebone Black. When I measured the number of PRU clock
>>> cycles needed for the execution of various assembler instructions, I found
>>> surprisingly large values for memory access.
>>> Here follows a list, in which
>>> one cycle corresponds to a delay of 5 ns as expected:
>>>
>>> Most operations, such as ADD, SUB, QBxx, MOV, JMP etc.: 1 cycle
>>>
>>> LBBO 1, 2, 4 bytes from PRU DRAM: 3 cycles
>>> LBBO 8 bytes from PRU DRAM: 4 cycles
>>> LBBO 12 bytes from PRU DRAM: 5 cycles
>>> LBBO 16 bytes from PRU DRAM: 6 cycles
>>>
>>> LBCO 4 bytes from DDR: 43 cycles
>>> LBCO 8 bytes from DDR: 44 cycles
>>> LBCO 12 bytes from DDR: 45 cycles
>>> LBCO 16 bytes from DDR: 46 cycles
>>>
>>> By PRU DRAM, I mean any address between 0x00000000 and 0x00004000
>>> and the shared PRU RAM (12 kB starting from 0x00010000). Any other address
>>> I tried had the delay stated for "DDR".
>>>
>>> Can anybody confirm the long DDR (and other delays if possible) readout
>>> times that I have measured? Does anybody have an explanation for these
>>> large delays?
>>>
>>> Thanks in advance!
>>> Lenny
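
For anyone who wants to play with the numbers in this thread off-target, here is a rough Python sketch of the arithmetic. It is not PRU code and not taken from TI documentation. The base costs are simply fitted to the cycle counts reported above, the function names are made up for illustration, and the 2-cycle delay loop (one SUB plus one QBNE) is an assumption about how the loop is written.

```python
import math

NS_PER_CYCLE = 5  # one PRU cycle is 5 ns, as stated in the thread

# Base cost in cycles before the per-word cost, fitted to the values
# reported in this thread (assumption: not an official TI figure).
LOAD_BASE_CYCLES = {
    "pru_ram": 2,  # PRU data RAM and shared RAM: 3 cycles for up to 4 bytes
    "ddr": 42,     # DDR: 43 cycles for 4 bytes (measured as "43+", so a lower bound)
}


def cycles_from_stall(stall_count):
    """Total cycles for one instruction, using the 1 + stall rule from the thread."""
    return 1 + stall_count


def load_cycles(region, nbytes):
    """Estimated cycles for an LBBO/LBCO of nbytes from the given region."""
    words = math.ceil(nbytes / 4)  # one extra cycle per 32-bit word
    return LOAD_BASE_CYCLES[region] + words


def delay_iterations(period_cycles, used_cycles, cycles_per_iteration=2):
    """Hypothetical helper: delay-loop iterations left after used_cycles.

    Assumes each loop iteration costs cycles_per_iteration cycles.
    """
    remaining = period_cycles - used_cycles
    if remaining < 0:
        raise ValueError("work already exceeds the period")
    return remaining // cycles_per_iteration


if __name__ == "__main__":
    for nbytes in (4, 8, 12, 16):
        print(nbytes, load_cycles("pru_ram", nbytes), load_cycles("ddr", nbytes))
    # Padding a 200-cycle (1 us) period that contains one 4-byte DDR read:
    print(delay_iterations(200, load_cycles("ddr", 4)))
```

Since the DDR figure is only a lower bound, padding computed this way can drift. That is why taking the difference of the CYCLE register at run time looks like the safer option.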
