On 9/19/2014 3:51 PM, Cedric Malitte wrote:
>
> On Friday, 19 September 2014 16:46:05 UTC-4, Cedric Malitte wrote:
>>
>> Hi all,
>>
>> I had a few hours to play with the PRUSS, but I came to a dead end...
>>
>> My goal is to read ADCs, the ADS8326 to be precise.
>> It's a kind of SPI ADC with one clock, one select, one out.
>>
>> I'd like to use 4 in parallel, which means only one clock, one select and
>> 4 inputs on the PRUSS.
>> I try to pull up the CLK line and then read each input, shifting them into
>> variables to be sent to the main app.
>>
>> When I look at the CLK line on a scope, it's taking way too much time to
>> get the input states and shift, even though the asm code should only take
>> a few cycles.
>> I'm lazy, I write the PRUSS code in C, but the asm looks nice.
>>
>> Here's the code in C
<snip>
>> My big problem is that it takes too much time, in fact way too much.
>>
>> Using this code, the CLK line is at 757 kHz.
>> CLK high is around 1 us and low is the rest...
>>
>> I'd like to achieve at least 2 MHz for the CLK line.
>>
>> I might have misread the doc, but isn't an instruction supposed to be
>> 5 ns?
>> That should be 35 ns for the first part and 40 ns for the second part.
>>
>> Any clue or help?
>>
>> The learning curve is a bit harder than I thought :)
>>
>> Thanks
>>
> Well, I misread the doc... not all instructions are created equal :)
>
> Even so, it's still slow as hell to read the inputs...

The *INSTRUCTION* takes 5 ns (or maybe 10-15 ns, depending on exactly
what you're doing), but since you're reading data from outside the PRU
domain, the round-trip time for each GPIO read is killing your
performance. You need to use the direct PRU inputs (the ones read
through register R31), not general purpose I/O accessed through the
interconnect fabric.

I have some details on read/write timings to the GPIO via the
interconnect fabric in the comments of my PRU code for Machinekit:

https://github.com/machinekit/machinekit/blob/master/src/hal/drivers/hal_pru_generic/pru_generic.p#L135-L163

Note that *WRITES* from the PRU to the GPIO are fairly quick, but
*READS* are very slow. This is because a write can be posted, allowing
the PRU to continue executing code, but on a read the PRU stalls until
the data is returned.

Executive summary of PRU <-> GPIO timing:

  Peak GPIO write speed      :   10 ns (100 MHz)
  Sustained GPIO write speed :   40 ns ( 25 MHz)
  GPIO read speed            : ~165 ns ( ~6 MHz)

You are then making things much worse by reading from the GPIO bank
multiple times in your code. At ~165 ns per read, four reads cost
roughly 660 ns per bit on their own, which already rules out a 2 MHz
clock (500 ns period). You should factor all the
HWREG(SOC_GPIO_3_REGS + GPIO_DATAIN) accesses into a single read to a
local variable, then use the local variable to do the bit
manipulations, rather than performing the expensive read four times.
Also, don't blame the compiler for not optimizing this for you.
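
As a rough illustration of that single-read approach, here is a minimal
sketch. The bit positions (ADC0_BIT through ADC3_BIT) and the helper
name shift_in_bits are made up for the example, since the original code
was snipped; only HWREG(SOC_GPIO_3_REGS + GPIO_DATAIN) comes from the
original post.

```c
#include <stdint.h>

/* Hypothetical bit positions of the four ADC data-out lines within the
 * GPIO bank; the real pin assignments were not given in the thread. */
#define ADC0_BIT 14u
#define ADC1_BIT 15u
#define ADC2_BIT 16u
#define ADC3_BIT 17u

/*
 * Shift one freshly sampled bit per ADC into its accumulator.
 * `snapshot` must come from a SINGLE read of the GPIO data-in register,
 * so all four bits are extracted from a local copy instead of paying
 * for four separate trips across the interconnect.
 */
static void shift_in_bits(uint32_t snapshot, uint32_t acc[4])
{
    static const uint8_t adc_bit[4] = { ADC0_BIT, ADC1_BIT, ADC2_BIT, ADC3_BIT };
    int i;

    for (i = 0; i < 4; i++) {
        acc[i] = (acc[i] << 1) | ((snapshot >> adc_bit[i]) & 1u);
    }
}

/*
 * Intended use on the PRU, once per clock cycle (not compiled here):
 *
 *     uint32_t snapshot = HWREG(SOC_GPIO_3_REGS + GPIO_DATAIN);
 *     shift_in_bits(snapshot, acc);
 */
```

This still pays for one ~165 ns GPIO read per bit; moving the data
lines to the direct PRU inputs would remove that cost as well.
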
If you are wondering why this didn't get optimized: the compiler cannot
treat a GPIO register read as a generic (i.e. cacheable) memory read,
since the value read can potentially be different each time (i.e. the
access is volatile). It is therefore up to you to do any read or write
combining that is acceptable; the compiler can't do it for you. Also,
even standard memory reads from DDR via the PRU are really volatile,
since the ARM core is running in the background and could be changing
the values between PRU accesses.

Clean up your code a bit and I expect you'll see much better results!

--
Charles Steinkuehler
