Sounds promising... Anyway, I have a feeling that it will not beat the PRU's speed, for reasons already explained in this discussion. If you are willing to get rid of the OS and run your application directly on the BeagleBone, maybe using an FPGA would be a better idea. I don't know FPGAs very well, but I know you can even "program" some FPGA devices to work as a CPU/FPU. I have no idea of the performance or the costs involved; this is just a guess on an interesting topic.

On Wednesday, August 5, 2015 at 18:49:24 UTC-3, Lenny wrote:
> Hehe, a beast indeed :)
>
> I downloaded the StarterWare software and I like it. I'll summarize my
> current understanding, and if someone wants to correct me where
> necessary, I'll be glad:
>
> As far as I understand, StarterWare has no OS overhead, so you get to
> execute your code directly on the MPU, bare-metal access so to say.
> I imagine the same can be accomplished by properly embedding your
> compiled file into a bootloader at the right place. The provided
> examples are reasonably clear. For example, to set a GPIO pin, you find
> the instruction
>
>     GPIOPinWrite(GPIO_INSTANCE_ADDRESS,
>                  GPIO_INSTANCE_PIN_NUMBER,
>                  GPIO_PIN_LOW);
>
> Checking what is behind it, it is really a simple instruction
>
>     HWREG(baseAdd + GPIO_CLEARDATAOUT) = (1 << pinNumber);
>
> where the macro HWREG only provides a properly shaped pointer to the
> address in brackets. Using these examples saves a painful and
> time-consuming study of the TRM, where you can find the addresses of
> all those registers.
>
> So as far as I understand, this operation should also take only the
> equivalent of one single assembler instruction after compilation. Two
> questions now remain:
> 1) How many cycles does it take the MPU to execute this instruction (or
>    any other one)? This is not specified in the TRM, but I am sure it is
>    somewhere in the ARM documentation.
> 2) How long does it take until the value arrives at the output pin?
>
> The second question aims at Charles's concern. Again, from the TRM I
> deduce that, for example, the GPIO modules are connected to the MPU
> through the L4 interconnect. The interface clock rate is specified in
> the GPIO chapter of the TRM to be 100 MHz. Now, I do not understand how
> this bus works in detail, but the fact that it can handle several
> sources and destinations simultaneously raises the concern that there
> is a buffer involved that comes with some extra latency.
> But I would assume that, by running only the one code snippet that I
> define with no OS processes in the background, all other devices are
> disabled, and therefore the bus is really only used when my program
> uses it. So there should be top-priority handling for my packets, and
> they should arrive with a minimal and, up to clock mis-synchronization,
> deterministic delay. So I would estimate a delay of a few interface
> clock cycles, i.e. a latency of less than, say, 1/(20 MHz) = 50 ns. Is
> my reasoning correct here, or am I forgetting something?
>
> I guess my next step will be reading up on the MPU itself, to see
> whether one can hope to implement really fast algorithms at a very low
> level here. If I'm not mistaken, this
> <https://web.eecs.umich.edu/~prabal/teaching/eecs373-f10/readings/ARMv7-M_ARM.pdf>
> is the document to read. A first glance tells me that maybe I'll
> understand what the A in Cortex-A9 actually means on a low level :)
>
> If there is a catch that I am not aware of, thanks for letting me know!
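
To make the two points in the quoted message concrete, here is a rough host-side sketch in C, not StarterWare's actual source. The HWREG macro has the same shape as the one described above (a cast of an address to a volatile 32-bit register), with a uintptr_t cast added as an assumption so it also compiles on a 64-bit PC. The array fake_gpio and the helpers pin_clear, pin_set, fake_reg, fake_gpio_base and cycles_to_ns are hypothetical names introduced only for this illustration. On the real chip the base address would come from the TRM instead of a RAM array. The "5 cycles" figure used below is just the guess from the message (1/(20 MHz) at a 100 MHz interface clock), not a measured value.

```c
#include <stdint.h>

/* Treat an address as a volatile 32-bit register (same idea as HWREG).
   The uintptr_t cast is an assumption added so this builds on 64-bit hosts. */
#define HWREG(addr) (*((volatile uint32_t *)(uintptr_t)(addr)))

/* Offsets inside one AM335x GPIO module (check against the TRM). */
#define GPIO_CLEARDATAOUT 0x190u
#define GPIO_SETDATAOUT   0x194u

/* Hypothetical stand-in for the memory-mapped GPIO module: plain RAM,
   large enough to cover offsets 0x000..0x194. */
static uint32_t fake_gpio[0x198 / 4];

/* Base "address" of the fake module. */
uintptr_t fake_gpio_base(void)
{
    return (uintptr_t)fake_gpio;
}

/* Drive a pin low: a single 32-bit store to CLEARDATAOUT. */
void pin_clear(uintptr_t base, unsigned pin)
{
    HWREG(base + GPIO_CLEARDATAOUT) = (1u << pin);
}

/* Drive a pin high: a single 32-bit store to SETDATAOUT. */
void pin_set(uintptr_t base, unsigned pin)
{
    HWREG(base + GPIO_SETDATAOUT) = (1u << pin);
}

/* Read back what was stored at a given offset in the fake module. */
uint32_t fake_reg(uint32_t offset)
{
    return fake_gpio[offset / 4];
}

/* Convert a number of clock cycles into nanoseconds. */
double cycles_to_ns(unsigned cycles, double clock_hz)
{
    return cycles * 1e9 / clock_hz;
}
```

Calling pin_clear(fake_gpio_base(), 21) stores 0x00200000 at offset 0x190 of the fake module, and cycles_to_ns(5, 100e6) gives 50.0, which is where the 1/(20 MHz) estimate comes from. The arithmetic only restates the guess; it says nothing about how many cycles the interconnect really needs, which would have to be measured on the board.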
