> > > > >> You can have arm, arm64, mips64, etc. directories inside > > > > >> examples/ipfrag. > > > > > > > > > > This does not scale in practice. If another example comes along, does > > > > > that also need to split whatever is determined to be "arch" code into > > > > > "arch" directories within its own folder? > > > > > > > > > > I recommend that this contribution be accepted without any further > > > > change. > > > > > > > > From my perspective, separate arm, aarch64 and generic headers would be > > > > sufficient, even if they are not put under separate directories. That > > > > was just a suggestion. > > > > > > > > > > The main concern I have with this is that: ODP examples should > > > demonstrate how > > > ODP APIs are used to do X. In other words: "If you implement X like this > > > with > > > ODP APIs, your code is: > > > > > > * 100% portable > > > * still good performance > > > * clean and easy to understand / maintain" > > > > > > Super optimized, architecture specific implementation of X would not > > > drive that > > > purpose. It also leads to maintenance issues later on - a bug found and > > > fixed > > > on CPU arch Y does not get automatically fixed for arch Z. > > > > > > We are drawing a line here if embedded assembly or otherwise arch > > > dependent > > > code belongs to example applications. I'd say it does not, since HW > > > abstraction is the sole purpose of ODP. > > > > You know far more about ODP than I, so please correct me if I'm wrong here, > > but my understanding is that ODP is primarily about abstracting hardware > > **accelerators** (in a networking context). It does not, and should not, be > > overly involved with the details of how programmers may wish to write and > > abstract other parts of their programs. > > > > This example should run perfectly well on a wide range of CPUs — that is, > > pretty much anything with 32 bit or 64 bit pointers and that has a compiler > > supporting GCC's "__atomic_compare_exchange_n" for double the pointer size — > > and consists of code that is almost entirely independent of other > > architectural > > details. There is a single exception to this, in that the standard 128-bit > > "__atomic_compare_exchange_n" implementation on ARM is not lock-free, and so > > there are some short snippets of inline assembly, confined to a single file > > (in the new version of the patch I have locally), that resolve this. I don't > > see a problem here. > > ODP is about HW abstraction and CPU instructions are HW features. Of course, > applications are free to use other libs/APIs/GCC features, but should our > examples do that?
I'm not sure I agree that ODP primarily exists to abstract non-accelerator HW, as far as I'm concerned that's really the job of the language, other libraries, and at the end of the day the programmer. (Talk about biting off more than you can chew: abstracting all hardware in the system?!) But regardless... > "There is a single exception to this, in that the standard 128-bit > "__atomic_compare_exchange_n" implementation on ARM is not lock-free ..." > > This is the point. How about the built-in implementation on other CPU > architectures: MIPS, PPC, Tilera, Kalray, ... etc ? Does the code build on > those. If it does build, does it have race conditions or dead locks due to a > slightly different synchronization scheme? A change of architecture shouldn't introduce any additional race conditions or deadlocks, no. If this were the case, then I agree that this example would be far too fragile to include. Any standards compliant implementation of "__atomic_compare_exchange_n" will work fine — including the default one used by ARM platforms. As the algorithm underlying this example is supposed to be lock-free, though, a proper lock-free compare exchange implementation is the icing on the cake, showing off the properties of the algorithm in full. > I think the first option would be to remove 128 atomics. Other option is to > separate those into different header files, test this on different archs and > enable it only on tested archs (may be only on x86 and arm first). I would > not want an example to fail on e.g. Kalray platform, just because the example > decided to by-pass ODP API. > > -Petri
