I don't see any references to VBUS/VBUSP either, but based on the
context they appear to be internal PRU buses.

To answer your original question, access to anything outside the PRU
block (the two PRU cores, data memories, and local peripherals) requires
communicating over the SoC's internal interconnect fabric. Writes to
these resources are posted, so they complete with zero wait states (at
least until you saturate the posted-write logic), but reads stall the
PRU until data is returned from the far end.
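
As a rough illustration of that asymmetry, here is a toy Python model of
posted writes versus blocking reads. The queue depth, drain time and read
latency are made-up numbers chosen only to show the shape of the
behavior; they are not taken from TI documentation, and the function name
is hypothetical. The model also ignores any ordering between reads and
pending writes.

```python
def pru_cycles(ops, queue_depth=4, drain_cycles=10, read_latency=40):
    """Estimate PRU cycles spent issuing ops ('w' = write, 'r' = read).

    All three parameters are hypothetical, for illustration only.
    """
    cycle = 0
    pending = []  # completion times of posted writes still in flight
    for op in ops:
        # Forget writes that the far end has already absorbed.
        pending = [t for t in pending if t > cycle]
        if op == "w":
            if len(pending) >= queue_depth:
                # Posted-write logic saturated: stall until the oldest drains.
                cycle = pending.pop(0)
            # Writes drain one after another at the far end.
            start = max(cycle, pending[-1] if pending else cycle)
            pending.append(start + drain_cycles)
            cycle += 1  # the write itself issues in a single cycle
        else:
            # A read blocks the PRU for the full round trip.
            cycle += read_latency
    return cycle


print(pru_cycles("wwww"))   # burst that fits in the queue
print(pru_cycles("wwwww"))  # one write more than the queue holds
print(pru_cycles("r"))      # single read
```

In this model a short burst of writes costs one cycle each, a longer
burst is throttled to the rate at which the far end drains them, and
every read pays the full latency.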

I characterized the performance when accessing the GPIO registers from
the PRU, and got results similar to your DDR memory timings:

https://github.com/machinekit/machinekit/blob/master/src/hal/drivers/hal_pru_generic/pru_generic.p#L137-L165

Note that all timings are approximate.  The exact number of PRU cycles
a write or read takes depends on things like internal bus utilization,
clock-domain crossings (whose latency varies by nature), and how quickly
the far end can respond.  The DDR DRAM controller in particular needs to
schedule the read request, and many factors can cause the read latency
to vary.
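
For what it's worth, the cycle counts Lenny reported in the quoted
message below fit a simple linear model: a fixed overhead plus one cycle
per 32-bit word. The Python sketch below just reproduces those numbers.
The overhead values are fitted to the quoted measurements, not taken
from TI documentation, and the DDR figure in particular should be read
as typical rather than guaranteed.

```python
import math

PRU_CYCLE_NS = 5  # 200 MHz PRU clock, as stated in the quoted message

# Fixed overhead in cycles, fitted to the quoted measurements
# (assumption: these are empirical values, not datasheet figures).
OVERHEAD = {"local": 2, "ddr": 42}


def load_cycles(nbytes, target):
    """Approximate cycles for a load of nbytes from 'local' or 'ddr'."""
    words = math.ceil(nbytes / 4)  # loads are counted in 32-bit words
    return OVERHEAD[target] + words


for target in ("local", "ddr"):
    for nbytes in (4, 8, 12, 16):
        cycles = load_cycles(nbytes, target)
        print(f"{target:5s} {nbytes:2d} bytes: "
              f"{cycles:2d} cycles, about {cycles * PRU_CYCLE_NS} ns")
```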

On 3/14/2017 2:24 PM, ags wrote:
> Would someone kindly decode what VBUS and VBUSP are? Searched but could not 
> find other relevant references. Thanks.
> 
> On Tuesday, May 26, 2015 at 6:39:38 AM UTC-7, [email protected] wrote:
>>
>> Sorry, just saw that you actually mentioned that the shared memory has the 
>> same performance as the DRAM.
>> Also, I found this: 
>> http://processors.wiki.ti.com/index.php/Programmable_Realtime_Unit#Load_.2F_Store_Instructions
>> where it says that LBBO should take (1 + word count) cycles. If that's 
>> right, an LBBO instruction of up to 4 bytes should take 2 cycles for VBUS 
>> and 3 cycles for VBUSP. I still need to study more to understand which 
>> one applies, but VBUSP matches your findings.
>>
>> On Friday, January 3, 2014 at 11:05:30 PM UTC-2, Lenny wrote:
>>>
>>> Hello, 
>>>
>>> I am using a BeagleBone Black. When I measured the number of PRU clock 
>>> cycles needed to execute various assembler instructions, I found 
>>> surprisingly large values for memory access. Here is a list, in which 
>>> one cycle corresponds to a delay of 5 ns, as expected:
>>>
>>> Most operations, such as ADD,SUB,QBxx,MOV,JMP etc.: 1 cycle
>>>
>>> LBBO 1,2,4 Bytes from PRU DRAM: 3 cycles
>>> LBBO 8 Bytes from PRU DRAM: 4 cycles
>>> LBBO 12 Bytes from PRU DRAM: 5 cycles
>>> LBBO 16 Bytes from PRU DRAM: 6 cycles
>>>
>>> LBCO 4 Bytes from DDR: 43 cycles
>>> LBCO 8 Bytes from DDR: 44 cycles
>>> LBCO 12 Bytes from DDR: 45 cycles
>>> LBCO 16 Bytes from DDR: 46 cycles
>>>
>>> By PRU DRAM, I mean any address between 0x00000000 and 0x00004000, plus 
>>> the shared PRU RAM (12 kB starting from 0x00010000). Any other address I 
>>> tried had the delay stated for "DDR".
>>>
>>> Can anybody confirm the long DDR (and other delays if possible) readout 
>>> times that I have measured? Does anybody have an explanation for these 
>>> large delays?
>>>
>>> Thanks in advance! Lenny
>>>
>>
> 


-- 
Charles Steinkuehler
[email protected]
