On 08/11/18 19:42, Alexei Starovoitov wrote:
> same link let's continue at 1pm PST. 
So, one thing we didn't really get onto was maps, and you mentioned that it
 wasn't really clear what I was proposing there.
What I have in mind comes in two parts:
1) map type.  A new BTF_KIND_MAP with metadata 'key_type', 'value_type'
 (both are type_ids referencing other BTF type records), describing the
 type "map from key_type to value_type".
2) record in the 'instances' table.  This would have a name_off (the
 name of the map), a type_id (pointing at a BTF_KIND_MAP in the 'types'
 table), and potentially also some indication of what symbol (from
 section 'maps') refers to this map.  This is pretty much the exact
 same metadata that a function in the 'instances' table has, the only
 differences being
 (a) function's type_id points at a BTF_KIND_FUNC record
 (b) function's symbol indication refers to a symbol in a program
 section (typically .text)
 (c) in future functions may be nested inside other functions, whereas
 AIUI a map can't live inside a function.  (But a variable, which is
 the other thing that would want to go in an 'instances' table, can.)
So the 'instances' table record structure looks like

struct btf_instance {
    __u32 type_id; /* Type of object declared.  An index into type section */
    __u32 name_off; /* Name of object.  An offset into string section */
    __u32 parent; /* Containing object if any (else 0).  An index into instance section */
};

and we extend the BTF header:

struct btf_header {
    __u16   magic;
    __u8    version;
    __u8    flags;
    __u32   hdr_len;

    /* All offsets are in bytes relative to the end of this header */
    __u32   type_off;      /* offset of type section       */
    __u32   type_len;      /* length of type section       */
    __u32   str_off;       /* offset of string section     */
    __u32   str_len;       /* length of string section     */
    __u32   inst_off;      /* offset of instance section   */
    __u32   inst_len;      /* length of instance section   */
};
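
To make the intended use of inst_off/inst_len concrete, here is a rough
sketch of how a consumer might fetch one btf_instance record from a BTF
blob.  The helper name btf_instance_get is made up for illustration, the
stdint types stand in for __u32 etc., and it assumes instance ids are
1-based (so that 0 can mean "no parent"); none of that is settled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Proposed record layouts, with stdint types standing in for __u32 etc. */
struct btf_instance {
    uint32_t type_id;   /* index into type section */
    uint32_t name_off;  /* offset into string section */
    uint32_t parent;    /* containing instance, or 0 for none */
};

struct btf_header {
    uint16_t magic;
    uint8_t  version;
    uint8_t  flags;
    uint32_t hdr_len;
    uint32_t type_off;
    uint32_t type_len;
    uint32_t str_off;
    uint32_t str_len;
    uint32_t inst_off;  /* proposed: offset of instance section */
    uint32_t inst_len;  /* proposed: length of instance section */
};

/*
 * Hypothetical helper: copy instance record 'inst_id' into *out.
 * Assumes ids are 1-based, so that 0 can mean "no parent".
 * Returns 0 on success, -1 if the id or the blob is out of range.
 */
static int btf_instance_get(const uint8_t *btf, size_t btf_size,
                            uint32_t inst_id, struct btf_instance *out)
{
    struct btf_header hdr;
    uint64_t off;

    assert(btf != NULL && out != NULL);
    if (btf_size < sizeof(hdr))
        return -1;
    memcpy(&hdr, btf, sizeof(hdr));
    if (hdr.hdr_len < sizeof(hdr))
        return -1;
    if (inst_id == 0 || inst_id > hdr.inst_len / sizeof(*out))
        return -1;

    /* Section offsets are relative to the end of the header. */
    off = (uint64_t)hdr.hdr_len + hdr.inst_off +
          (uint64_t)(inst_id - 1) * sizeof(*out);
    if (off + sizeof(*out) > btf_size)
        return -1;
    memcpy(out, btf + off, sizeof(*out));
    return 0;
}
```

The same lookup serves functions, maps and (eventually) variables; only
the kind of the type that type_id points at differs.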

Then in the .BTF.ext section, we have both

struct bpf_func_info {
    __u32 prog_symbol; /* Index of symbol giving address of subprog */
    __u32 inst_id; /* Index into instance section */
};

struct bpf_map_info {
    __u32 map_symbol; /* Index of symbol creating this map */
    __u32 inst_id; /* Index into instance section */
};

(either living in different subsections, or in a single table with
 the addition of a kind field, or in a single table relying on the
 ultimately referenced type to distinguish funcs from maps).

Note that the name (in btf_instance) of a map or function need not
 match the name of the corresponding symbol; we use the .BTF.ext
 section to tie together btf_instance IDs and symbol IDs.  Then in
 the case of functions (subprogs), the prog_symbol can be looked
 up in the ELF symbol table to find the address (== insn_offset)
 of the subprog, as well as the section containing it (since that
 might not be .text).  Similarly in the case of maps the BTF info
 about the map is connected with the info in the maps section.
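
As a rough illustration of the subprog lookup just described, the sketch
below resolves a bpf_func_info record against an ELF symbol table that
has already been read into memory.  Elf64_Sym and SHN_UNDEF come from
<elf.h>; struct subprog_loc and resolve_subprog are names invented here,
and it assumes a relocatable object, where st_value is the byte offset
of the symbol within its section.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>

/* Proposed .BTF.ext record, with stdint types standing in for __u32. */
struct bpf_func_info {
    uint32_t prog_symbol;  /* index of symbol giving address of subprog */
    uint32_t inst_id;      /* index into instance section */
};

/* Hypothetical result type: where the subprog lives, and what it is. */
struct subprog_loc {
    uint16_t shndx;     /* section containing the subprog (maybe not .text) */
    uint64_t insn_off;  /* byte offset of the subprog within that section */
    uint32_t inst_id;   /* btf_instance describing the subprog */
};

/*
 * Hypothetical helper: follow prog_symbol into the symbol table.
 * Returns 0 on success, -1 if the symbol index is out of range or the
 * symbol is undefined.
 */
static int resolve_subprog(const struct bpf_func_info *fi,
                           const Elf64_Sym *symtab, size_t nsyms,
                           struct subprog_loc *loc)
{
    const Elf64_Sym *sym;

    assert(fi != NULL && symtab != NULL && loc != NULL);
    if (fi->prog_symbol >= nsyms)
        return -1;
    sym = &symtab[fi->prog_symbol];
    if (sym->st_shndx == SHN_UNDEF)
        return -1;

    loc->shndx = sym->st_shndx;
    loc->insn_off = sym->st_value;
    loc->inst_id = fi->inst_id;
    return 0;
}
```

Nothing here compares names: the btf_instance name and the ELF symbol
name are never consulted, which is the point of going via symbol index.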

Now when the loader has munged this, what it passes to the kernel
 might not have map_symbol, but instead map_fd.  Instead of
 prog_symbol it will have whatever identifies the subprog in the
 blob of stuff it feeds to the kernel (so probably insn_offset).
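
Purely as a sketch of that loader step for maps: assuming the loader has
already created the maps and kept a table of fds indexed by symbol, the
rewrite could look something like the following.  struct
bpf_map_info_loaded, rewrite_map_info and the fd_by_symbol table are all
hypothetical; the kernel-facing layout is not something I'm proposing in
detail here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Proposed on-disk .BTF.ext record (stdint types in place of __u32). */
struct bpf_map_info {
    uint32_t map_symbol;  /* index of symbol creating this map */
    uint32_t inst_id;     /* index into instance section */
};

/* Hypothetical kernel-facing record: the symbol is replaced by an fd. */
struct bpf_map_info_loaded {
    int32_t  map_fd;      /* fd of the already-created map */
    uint32_t inst_id;     /* unchanged: index into instance section */
};

/*
 * Hypothetical loader step.  fd_by_symbol[i] holds the fd of the map
 * created for symbol i, or a negative value if symbol i is not a map.
 * Returns 0 on success, -1 if any record names a symbol with no map fd.
 */
static int rewrite_map_info(const struct bpf_map_info *in, size_t n,
                            const int *fd_by_symbol, size_t nsyms,
                            struct bpf_map_info_loaded *out)
{
    size_t i;

    assert(in != NULL && fd_by_symbol != NULL && out != NULL);
    for (i = 0; i < n; i++) {
        uint32_t sym = in[i].map_symbol;

        if (sym >= nsyms || fd_by_symbol[sym] < 0)
            return -1;
        out[i].map_fd = fd_by_symbol[sym];
        out[i].inst_id = in[i].inst_id;
    }
    return 0;
}
```

The inst_id passes through untouched, so the kernel can still reach the
BTF_KIND_MAP type via the instance record.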

All this would of course require a bit more compiler support than
 the current BPF_ANNOTATE_KV_PAIR, since that just causes the
 existing BTF machinery to declare a specially constructed struct
 type.  At the C level you could still have BPF_ANNOTATE_KV_PAIR
 and the '____bpf_map_foo' name, but then the compiler would
 recognise that and convert it into an instance record by looking
 up the name 'foo' in its "maps" section.  That way the special
 ____bpf_map_* handling (which ties map names to symbol names,
 also) would be entirely compiler-internal and not 'leak out' into
 the definition of the format.  Frontends for other languages
 which do possess a native map type (e.g. Python dict) might have
 other ways of indicating the key/value type of a map at source
 level (e.g. PEP 484) and could directly generate the appropriate
 BTF_KIND_MAP and bpf_map_info records rather than (as they would
 with the current design) having to encode the information as a
 struct ____bpf_map_foo type-definition.


While I realise the desire to concentrate on one topic at once, I
 think this question of maps should be discussed in tomorrow's
 call, since it is when we start having other kinds of instances
 besides functions that the advantages of my design become
 apparent, unifying the process of 'declaration' of functions,
 maps, and (eventually) variables while separating them all from
 the process of 'definition' of the types of all three.

Thank you for your continued patience with me.
-Ed
