On Sun, Nov 29, 2015 at 9:59 AM, 'Davide Libenzi' via Akaros <[email protected]> wrote:
> On Sun, Nov 29, 2015 at 5:28 AM, barret rhoden <[email protected]> wrote:
>
>> They do come from the original plan 9 code, but we can still change
>> them to be more descriptive (and inlines).
>>
>> One thing though is that we do want to keep the Plan 9 implementations
>> of those functions that do not use ifdefs:
>> http://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html
>> (at least I think we do).
>
> I dispute almost every single point he makes there, but I kept my inlines
> the slow, fully open coded version ☺
> First, is it more code? Yes, but on many CPUs (like Intel, which is what
> matters most) it is much faster.

But very often these functions are on the boundary of a program, a place where performance doesn't matter that much. And I'm not sure that Intel matters all that much. There are probably more ARM machines out there than Intel machines at this point. And who knows how many other weirdo processors are running embedded things? AVR and PIC are super common; I imagine that eZ80s and HC11s and all sorts of random junk still have a huge footprint out there.

> Second, you can have swap32() and swap64() APIs, which on Intel turn
> into dedicated, single-insn operations (it would not matter for LE, the
> most common case).

Little-endian being the most common case has only been true for a comparatively short time. That assertion reminds me of Henry Spencer's "Ten Commandments for C Programmers", the tenth of which warns against falling into the mistaken belief that "all the world's a VAX" (https://www.lysator.liu.se/c/ten-commandments.html). The point is that what's most common today may not be tomorrow....

Regardless, a dedicated interface for these functions is really good. Even in Plan 9, such a thing existed (though the implementation probably used shifts and ors; Barret alludes to that with the conv* functions). Under Plan 9 today, there are functions for getting big- and little-endian data from somewhere.
I think the overarching point of Rob's post was that if a programmer feels like s/he needs to write something to deal with the endianness of the machine one is on, one's almost certainly going to be wrong. You program to some defined interface and don't worry about it beyond that. Binary data is fine, provided it's well-defined with respect to byte order etc., and presented in some packed representation so that I don't have to care about things like alignment of elements within the data stream. But the point is that one divorces the specifics of the processor one is working on from the format of the data itself.

> Third, alignment? Define a HAVE_FAST_UNALIGNED_ACCESS in the arch/ domain,
> and use that to select the proper behavior (#if LE && FAST_UNALIGNED -> use
> a single insn load/store).

But now one is resorting to tricks with conditional compilation (which is kind of Rob's point), which gives rise to a combinatorial explosion of options that I can't compile and test everywhere. That sort of thing usually leads to ugly assumptions being baked in over time, and the naive single-load code won't work on every *machine*. For example, the 68000 would trap on unaligned accesses to multi-byte data types:

	MOVE.L	$10001,D0	| Oh no you didn't! (longword read from an odd address: address error trap)

One could argue that 68k doesn't matter anymore (and one would mostly be right). But what about Power, MIPS, ARM, or some new hypothetical processor that hasn't been invented yet? Having every application carry an #ifdef that does the right thing is a step backward; conditional compilation is really nasty stuff because it's hard to test without a number of separate compilations that quickly becomes exponential in the number of boolean terms in the conditional.... Instead, a far better option is to program to a library that makes it not matter, and implement that library in the most performant way for whatever platform one is running on.
#ifdef is a technique to put all that code in a single file, but it is only one of many (and not a very good one, frankly: conditional compilation leads to fragile code that's hard to read). So instead of LE && FAST_UNALIGNED, one has /sys/src/libc/x86_64/accessdata.s that has things like:

	.globl	gets32le
gets32le:
	movl	(%rdi), %eax
	ret

Then I just link against the library that's right for my platform. No #ifdefs in sight.

> Fourth, *(int)ptr? Really? ☺ Those should be using u16, u32, u64, and not
> short, int, long, etc...

Indeed. But there's a lot of code out there that's written in terms of 'int'. Rob's no dummy; he's writing from really long experience using lots of production systems. He's seen that code out in the wild; so have I. :-(

> For encoding/decoding a few tenths on bytes coming into a file API, maybe
> you can get along with the fully open coded version.

A tenth of a byte, eh? That's less than a bit unless you're on a PDP-10. :-P (Just kidding! Just kidding!)

> But say, if you are decoding packed docdata, or a posting list payloads,
> the three lines of code more per function will make a difference.

...or decoding IP or TCP headers. I don't think his point was to always write things using (foo[0] << 24 | ...); I think rather his point was that application programmers who are worrying about those issues are thinking at the wrong level of abstraction. Hide it behind a function call or similar, but don't think about "I need an #ifdef because I'm dealing with binary data...."

To sum up, program to the *data*, not the CPU that's manipulating the data.

	- Dan C.

-- 
You received this message because you are subscribed to the Google Groups "Akaros" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
For more options, visit https://groups.google.com/d/optout.
