On Tue, Dec 1, 2015 at 4:46 AM, Dan Cross <[email protected]> wrote:

> So in fairness, that code was originally written well before 'static
>>> inline' was a thing.
>>>
>>
>> They were certainly here in 2012 😉
>> TUESDAY, APRIL 03, 2012
>>
>
> ...but Rob didn't write any macros in that blog post....
>

But this whole thing started from arguing about the PBITx/GBITx macros in
the first place, which came from Plan 9.


Let's run it through gcc-4.9.2 (see attached: gcc-mp-4.9 -std=c11 -fasm -Os
> -S bench.c). Yeah, the assembly is not as pretty, but did you actually
> measure the elapsed runtime? They appear to be about the same to me:
>
> : hurricane; time ./bench abcd fast
> t 1684234849000000000
>
> real 0m0.339s
> user 0m0.333s
> sys 0m0.003s
> : hurricane; time ./bench abcd slow
> t 1684234849000000000
>
> real 0m0.334s
> user 0m0.328s
> sys 0m0.003s
> : hurricane;
>
> Further, you're arguing for a technique based on hardware that didn't make
> this fast until pretty recently (I can't remember when unaligned access
> became fast in x86). Sure, the object code is a bit bigger (4 words instead
> of 2 bytes) so it takes up more space in icache, but for something this
> small, I don't think it matters. Moral: measure, but only when it can be
> shown that it's important.
>

Amended.
You need to check the generated assembly when benchmarking, as GCC can
strip out entire code sections if it decides their results are unused.




>
> Even in ARM, ARM64 that is (the only thing that matters, eventually, for
>>>> Akaros), a single load/store is faster than open coding.
>>>> Unaligned faulting (or sucking) junk is thing of the past. Processors
>>>> doing that are either dead, or turning around with new silicon versions.
>>>>
>>>
>>> That's a dangerous assumption, and as there's clearly no harm in writing
>>> it the portable way since I get the same output anyway, I don't see a point
>>> in making the assumption.
>>>
>>
>> Note that nobody was trying to push anything which wasn't portable. You
>> came up with the assembly thing.
>>
>
> '*(uint32_t *)p;' isn't portable because of alignment issues (unless you
> can guarantee that p always points to properly aligned data). Sure, you can
> wrap that up in an 'ifdef' so that you don't compile it on a system where
> alignment is important, but the code itself is still inherently unportable.
> ifdef'ing it out or handwaving away platforms where it matters doesn't
> really change that. I'd rather just write one version of the code that's
> portable.
>

But that code *is* portable, provided the proper machine description
definitions are in place.
It is much more portable than assembly, provided two CPU-level
configuration knobs exist.
Narrowing to what we are dealing with here: one we already have
(endianness), and one (fast unaligned access) which can default to 0, so at
worst you fall back to the slow behavior.
Both of them could even be auto-generated by autoconf snippets (endianness
already is, for sure) for non-OS software.




>
> I think the overarching point of Rob's post was that if a programmer feels
>>>>> like s/he needs to write something to deal with endianness of the machine
>>>>> one is on, one's almost certainly going to be wrong.
>>>>>
>>>>
>>>> Really? And who's this guy? Anyone I can recognize here?😀
>>>>
>>>
>>> Rob Pike? No, he's not one of the scientists in that picture (cool
>>> picture by the way). But he is this guy:
>>> https://en.wikipedia.org/wiki/The_Unix_Programming_Environment
>>> https://en.wikipedia.org/wiki/The_Practice_of_Programming
>>>
>>
>> I will always be taking hard shots to the guys which assume that either
>> "other people will get it wrong", or, along the same lines, "other people
>> will fail because they failed".
>>
>
> ...but he didn't fail at anything. His point is absolutely correct.
>

We seem to have different kinds of heroes. Mine are the ones who don't talk
down to people, telling them they will fail.
He did fail at something, though: he failed to produce an OS used on more
than 10 computers around the globe.
Sorry, but you were asking for it 😀


 Yes, like, in Linux, 8 of them ☺

>
> Check out glibc.
>
> : chandra; find glibc-2.19 -name '*.[Ss]' | wc -l
>     2061
> : chandra;
>

Yes, glibc. It has entire floating-point emulation libraries written in
assembly (a LOT of single-function .S files, one per FP instruction).
In Linux (and many other OSes I have had my eyes on), assembly is used in
boot-related code and maybe a *few* hot or peculiar places where writing
inline assembly would not be practical.
You certainly do not see anywhere the kind of thing we are talking about
here: makefile machinery orchestrating assembly files.



If a system has 100 valid combinations, you have to handle those 100
>> combinations.
>>
>
> Ah, but ifdef's don't just cover the *valid* combinations and that's part
> of the problem with them. Ifdefs allow you to introduce a tweak-able knob
> that introduces a decision space much bigger than what's actually needed.
> If I restrict myself only to boolean expression predicated on the existence
> or lack thereof of a preprocessor symbol, then I have a number of
> combinations that's exponential in the number of terms; for anything
> non-trivial, that gets big fast. But probably only a handful of
> combinations are actually meaningful. So the set I actually use is much
> smaller than the decision space I've created. A classic problem with
> preprocessor magic is what happens when I tweak the knobs to force a
> decision that isn't handled in the code. This makes things fragile, and
> really brittle to change.
>

...


>
> On the other hand, if I use separate compilation units then I can provide
> exactly what I support and nothing more.
>
> Either you do it with Makefile magic (makefiles, which are driven by
>> configs themselves - they are just called $(FOO)), or you do it with C
>> pre-processing magic.
>>
>
> Err, if by makefile magic you mean a directory name in a variable, then I
> guess so.... I think history has shown again and again that that's much
> cleaner than using the preprocessor. Plan 9 ran on a dozen architectures
> without a single #ifdef related to portability.
>

And yet, most of the software you are using today (certainly the Unix-based
software) is built on auto- or manually-generated HAVE_FEATURE_X macros and
ifdef machinery at the C/C++ level.
Software that drops C in exchange for a bunch of assembly files covering
the different branches of the conditionals looks pretty rare to me.
It could be that almost everyone else is wrong, though. But in this
particular case, I am with almost everyone else ☺

I just would like to understand what you are arguing for.
In one email you are fighting over potential over-optimization; in another
you defend code with silly ones (see the array alloc).
In one email macros are fine (PBIT/GBIT, for example: function-like
patterns, exactly where macros should not be used); in another email macros
are evil (in places like variable declaration expansion, where you can't do
without CPP features like # and ##).

-- 
You received this message because you are subscribed to the Google Groups 
"Akaros" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
For more options, visit https://groups.google.com/d/optout.
