On Tue, Dec 1, 2015 at 7:56 AM, Dan Cross <[email protected]> wrote:

> ...but you were talking about the dates of his *blog* post. His *blog
> post* doesn't mention those macros at all. Rather, he talks about a
> technique and makes a statement of a general principle. Those macros were
> written in the 1980s or early 1990s.

No, everything started from the macros and got steered away to a blog post which, in order to prove its own point, was making false assumptions about what other people (not its author) would be doing, and I commented on those points.

> Actually, that's not what happened. The benchmark itself is correct; it's
> rather that I had screwed things up so that the only path ever executed was
> the fast path. *Cough* *cough* my bad.
>
> But regardless, the "slow" version is only 4 times slower, and runs in
> something like a little over a nanosecond on my machine. Is that enough
> overhead to argue about? Maybe, but it's not immediately clear.

For reading/writing operands and results in an ioctl-like syscall, it does not matter. That was agreed about 15 posts ago 😉 If you are writing a library function, and you do not know beforehand how and where it will be used, it does matter, especially when dealing with APIs of this kind, which could indeed be used in tight, high-frequency loops.

>> But that code *is* portable, provided the proper machine description
>> definitions.
>
> Great. Run void *p = 0x110011; uint32_t d = *(uint32_t *)p; on an MC68k
> and tell me what happens. Saying, "no one cares about 68k" doesn't count as
> an answer. :-)

I think All Saints' Day just passed, so we can stop bringing back the deceased CPUs ☺ Nobody was thinking about running that code as is, without ifdef guards.

> Or you could just have -I/$objtype/include and have an, 'endian.h' in
> /$objtype/include that has static-inline functions that do the right thing.

I note you have started to steer away from assembly (remember, this branch of the discussion was born from you posting an assembly solution and stating it was a better deal), but OBJTYPE/endian.h ... is not that simple. Take Linux for example. You have ARCH (the whole 15 or so of them), and within each ARCH, you have many CPU models and revisions.
Choices that are good for an Intel P4 might not be good for a Haswell (certainly fast unaligned access, but also things which depend on pipeline length). Let's not even go into the ARM world, where your head can literally explode. So you have something like 15 ARCHs, each with an average of, say, 3 CPU revisions: 45 combos, instead of two variables, LE and FAST_UNALIGNED. Yes, you can use symlinks and/or other makefile-generated machinery, but you still have to deal with it.

> Eh? I didn't *defend* it, I just explained it. That code was a direct
> import from OpenBSD.... I didn't write it, and given what it's doing, I saw
> no reason to change it. :-)

You are arguing for simplicity, and yet you want to leave complex and useless optimizations in place? ☺
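
For illustration only, the "two variables" approach argued for above (LE, FAST_UNALIGNED) might look roughly like the following in C. This is a minimal sketch under stated assumptions: the macro spellings CPU_LE and CPU_FAST_UNALIGNED and the helper names load_le32 and le32_selfcheck are hypothetical, and none of this is taken from the macros, the blog post, or the OpenBSD code discussed in the thread. The fallback path reads one byte at a time, so it should not fault on targets that trap on unaligned 32-bit loads.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical configuration knobs, named after the two variables in the
 * thread. A real build would set them per target; both default to the
 * conservative value here. */
#ifndef CPU_LE
#define CPU_LE 0
#endif
#ifndef CPU_FAST_UNALIGNED
#define CPU_FAST_UNALIGNED 0
#endif

/* Read a 32-bit little-endian value from a buffer of any alignment. */
static inline uint32_t load_le32(const void *p)
{
#if CPU_LE && CPU_FAST_UNALIGNED
	/* Fast path: the host byte order already matches and unaligned access
	 * is cheap. memcpy avoids dereferencing a misaligned pointer, and
	 * compilers commonly reduce it to a single load. */
	uint32_t v;
	memcpy(&v, p, sizeof v);
	return v;
#else
	/* Portable path: assemble the value byte by byte, which is
	 * independent of host endianness and alignment rules. */
	const uint8_t *b = p;
	return (uint32_t)b[0] |
	       (uint32_t)b[1] << 8 |
	       (uint32_t)b[2] << 16 |
	       (uint32_t)b[3] << 24;
#endif
}

/* Usage example: read from a deliberately odd offset. */
void le32_selfcheck(void)
{
	unsigned char buf[5] = { 0xAA, 0x78, 0x56, 0x34, 0x12 };
	assert(load_le32(buf + 1) == 0x12345678u);
}
```

Whether the fast path is actually worth enabling on a given CPU revision is exactly the per-target question raised above; the sketch only shows that the choice can be confined to two definitions.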
