Re: [PATCH bpf-next v2 0/3] bpf: add boot parameters for sysctl knobs

2018-05-24 Thread Jesper Dangaard Brouer
On Wed, 23 May 2018 15:02:45 -0700
Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote:

> On Wed, May 23, 2018 at 02:18:19PM +0200, Eugene Syromiatnikov wrote:
> > Some BPF sysctl knobs affect the loading of BPF programs, and during
> > system boot/init stages these sysctls are not yet configured.
> > A concrete example is systemd, that has implemented loading of BPF
> > programs.
> > 
> > Thus, to allow controlling these setting at early boot, this patch set
> > adds the ability to change the default setting of these sysctl knobs
> > as well as option to override them via a boot-time kernel parameter
> > (in order to avoid rebuilding kernel each time a need of changing these
> > defaults arises).
> > 
> > The sysctl knobs in question are kernel.unprivileged_bpf_disable,
> > net.core.bpf_jit_harden, and net.core.bpf_jit_kallsyms.  
> 
> - systemd is root. today it only uses cgroup-bpf progs which require root,
>   so disabling unpriv during boot time makes no difference to systemd.
>   what is the actual reason to present time?
> 
> - say in the future systemd wants to use so_reuseport+bpf for faster
>   networking. With unpriv disable during boot, it will force systemd
>   to do such networking from root, which will lower its security barrier.
>   How that make sense?
> 
> - bpf_jit_kallsyms sysctl has immediate effect on loaded programs.
>   Flipping it during the boot or right after or any time after
>   is the same thing. Why add such boot flag then?
> 
> - jit_harden can be turned on by systemd. so turning it during the boot
>   will make systemd progs to be constant blinded.
>   Constant blinding protects kernel from unprivileged JIT spraying.
>   Are you worried that systemd will attack the kernel with JIT spraying?


I think you are missing that we want the ability to change these
defaults in order to avoid depending on /etc/sysctl.conf settings, and
that these sysctl.conf settings are applied too late.

For example, with jit_harden there will be a difference between a BPF
program that systemd loaded at boot time (no constant blinding) and the
same program after someone reloads that systemd service once
/etc/sysctl.conf has been evaluated and has set bpf_jit_harden (now
slower due to constant blinding).  This is inconsistent behavior.
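
One way to observe this is to print the knob values once early in boot
and again after sysctl.conf has been applied, then compare.  Below is a
minimal sketch, not part of the patch set.  The function name
show_bpf_knobs is made up for illustration, and the /proc/sys paths are
my assumption of where the three knobs live (the first one is spelled
unprivileged_bpf_disabled in mainline).  A knob that is missing or not
readable by the current user is reported as unavailable.

```shell
#!/bin/sh
# Print the current value of each BPF sysctl knob discussed in this thread.
# Reads /proc/sys directly, so it does not depend on the sysctl(8) binary.
# NOTE: the function name and the exact paths are illustrative assumptions.
show_bpf_knobs() {
    for knob in kernel/unprivileged_bpf_disabled \
                net/core/bpf_jit_harden \
                net/core/bpf_jit_kallsyms; do
        path="/proc/sys/$knob"
        if [ -r "$path" ]; then
            printf '%s = %s\n' "$knob" "$(cat "$path")"
        else
            # Knob absent (e.g. no BPF JIT in this kernel) or not readable.
            printf '%s = <unavailable>\n' "$knob"
        fi
    done
}

show_bpf_knobs
```

Running it from an early unit and again after the sysctl settings have
been applied should show whether a program loaded in between saw
different values.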

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/3] bpf: add boot parameters for sysctl knobs

2018-05-23 Thread Jesper Dangaard Brouer
On Wed, 23 May 2018 13:35:47 +0200
Eugene Syromiatnikov <e...@redhat.com> wrote:

> On Mon, May 21, 2018 at 11:58:13AM -0700, Alexei Starovoitov wrote:
> > On Mon, May 21, 2018 at 02:29:30PM +0200, Eugene Syromiatnikov wrote:  
> > > Hello.
> > > 
> > > This patch set adds ability to set default values for
> > > kernel.unprivileged_bpf_disable, net.core.bpf_jit_harden,
> > > net.core.bpf_jit_kallsyms sysctl knobs as well as option to override
> > > them via a boot-time kernel parameter.  
> > 
> > Commits log not only should explain 'what' is being done by the patch,
> > but 'why' as well.  
> 
> Some BPF sysctl knobs affect the loading of BPF programs, and during
> system boot/init stages these sysctls are not yet configured. A
> concrete example is systemd, that has implemented loading of BPF
> programs.
> 
> Thus, to allow controlling these setting at early boot, this patch set
> adds the ability to change the default setting of these sysctl knobs
> as well as option to override them via a boot-time kernel parameter
> (in order to avoid rebuilding kernel each time a need of changing these
> defaults arises).
> 
> The sysctl knobs in question are kernel.unprivileged_bpf_disable,
> net.core.bpf_jit_harden, and net.core.bpf_jit_kallsyms.

Hi Eugene,

You have to resend the entire patchset with this explanation in the
cover letter.  Your old patchset has been dropped from patchwork[1]
because it was marked "Changes Requested".

Please remember to give it a "V2" tag and also specify which git tree
you are targeting[2], like [PATCH bpf-next V2].


[1] http://patchwork.ozlabs.org/project/netdev/list/?series=45617=%2a

[2] 
https://github.com/netoptimizer/linux/blob/bpf_doc10/Documentation/bpf/bpf_devel_QA.rst#q-how-do-i-indicate-which-tree-bpf-vs-bpf-next-my-patch-should-be-applied-to
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [bpf-next PATCH 5/5] bpf, doc: howto use/run the BPF selftests

2018-05-14 Thread Jesper Dangaard Brouer
On Mon, 14 May 2018 17:15:54 +0200
Silvan Jegen <s.je...@gmail.com> wrote:

> Hi
> 
> Some typo fixes below.
> 
> On Mon, May 14, 2018 at 3:43 PM Jesper Dangaard Brouer <bro...@redhat.com>
> wrote:
> > I always forget howto run the BPF selftests. Thus, lets add that info
> > to the QA document.  
> 
> > Documentation was based on Cilium's documentation:
> >   http://cilium.readthedocs.io/en/latest/bpf/#verifying-the-setup  
> 
> > Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
> > ---
> >   Documentation/bpf/bpf_devel_QA.rst |   29 +
> >   1 file changed, 29 insertions(+)  
> 
> > diff --git a/Documentation/bpf/bpf_devel_QA.rst  
> b/Documentation/bpf/bpf_devel_QA.rst
> > index 2254bdeae990..0e7c1d946e83 100644
> > --- a/Documentation/bpf/bpf_devel_QA.rst
> > +++ b/Documentation/bpf/bpf_devel_QA.rst
> > @@ -417,6 +417,33 @@ submitted by the BPF maintainers to the stable  
> maintainers.
> >   Testing patches
> >   ===  
> 
> > +Q: How to run BPF selftests
> > +---
> > +A: After you have booted into the newly compiled kernel, navigate to
> > +the BPF selftests_ suite in order to test BPF functionality (current
> > +working directory points to the root of the cloned git tree)::
> > +
> > +  $ cd tools/testing/selftests/bpf/
> > +  $ make
> > +
> > +To run the verifier tests::
> > +
> > +  $ sudo ./test_verifier
> > +
> > +The verifier tests print out all the current checks being
> > +performed. The summary at the end of running all tests will dump
> > +information of test successes and failures::  
> 
> Two colons at the end of the line. Don't think that was intended.

It is intended; that is part of the RST formatting.

> 
> > +
> > +  Summary: 418 PASSED, 0 FAILED
> > +
> > +In order to run through all BPF selftests, the following command is
> > +needed::
> > +
> > +  $ sudo make run_tests
> > +
> > +See the kernels selftest `Documentation/dev-tools/kselftest.rst`_  
> 
> s/kernels/kernel's/

I guess that is more correct...

> I also think the underscore at the end of this line is misplaced (or it
> should be a dash instead).

This is also part of the RST formatting.  This is a link. 


> > +document for further documentation.
> > +
> >   Q: Which BPF kernel selftests version should I run my kernel against?
> >   -
> >   A: If you run a kernel ``xyz``, then always run the BPF kernel selftests
> > @@ -607,5 +634,7 @@ when:
> >   .. _netdev FAQ: ../networking/netdev-FAQ.txt
> >   .. _samples/bpf/: ../../samples/bpf/
> >   .. _selftests: ../../tools/testing/selftests/bpf/
> > +.. _Documentation/dev-tools/kselftest.rst:
> > +   https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html  

The link is defined above/here.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


[bpf-next PATCH 3/5] bpf, doc: convert bpf_design_QA.rst to use RST formatting

2018-05-14 Thread Jesper Dangaard Brouer
The RST formatting is done such that, when rendered or converted
to different formats, an automatic index with links to the
subsections is created.

Thus, the questions are created as sections (or subsections), in order
to get the wanted auto-generated FAQ/QA index.

Special thanks to Quentin Monnet <quentin.mon...@netronome.com>, who
has reviewed and corrected both RST formatting and GitHub rendering
issues in this file.  Those commits have been squashed.

I've manually tested that this also renders nicely if included as part
of the kernel 'make htmldocs', as the end goal is for this to become
more integrated with the kernel-doc project/movement.

Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/bpf_design_QA.rst |  223 +++
 1 file changed, 144 insertions(+), 79 deletions(-)

diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst
index f3e458a0bb2f..6780a6d81745 100644
--- a/Documentation/bpf/bpf_design_QA.rst
+++ b/Documentation/bpf/bpf_design_QA.rst
@@ -1,156 +1,221 @@
+==============
+BPF Design Q&A
+==============
+
 BPF extensibility and applicability to networking, tracing, security
 in the linux kernel and several user space implementations of BPF
 virtual machine led to a number of misunderstanding on what BPF actually is.
 This short QA is an attempt to address that and outline a direction
 of where BPF is heading long term.
 
+.. contents::
+:local:
+:depth: 3
+
+Questions and Answers
+=
+
 Q: Is BPF a generic instruction set similar to x64 and arm64?
+-
 A: NO.
 
 Q: Is BPF a generic virtual machine ?
+-
 A: NO.
 
-BPF is generic instruction set _with_ C calling convention.
+BPF is generic instruction set *with* C calling convention.
+---
 
 Q: Why C calling convention was chosen?
+~~~
+
 A: Because BPF programs are designed to run in the linux kernel
-   which is written in C, hence BPF defines instruction set compatible
-   with two most used architectures x64 and arm64 (and takes into
-   consideration important quirks of other architectures) and
-   defines calling convention that is compatible with C calling
-   convention of the linux kernel on those architectures.
+which is written in C, hence BPF defines instruction set compatible
+with two most used architectures x64 and arm64 (and takes into
+consideration important quirks of other architectures) and
+defines calling convention that is compatible with C calling
+convention of the linux kernel on those architectures.
 
 Q: can multiple return values be supported in the future?
+~
 A: NO. BPF allows only register R0 to be used as return value.
 
 Q: can more than 5 function arguments be supported in the future?
+~
 A: NO. BPF calling convention only allows registers R1-R5 to be used
-   as arguments. BPF is not a standalone instruction set.
-   (unlike x64 ISA that allows msft, cdecl and other conventions)
+as arguments. BPF is not a standalone instruction set.
+(unlike x64 ISA that allows msft, cdecl and other conventions)
 
 Q: can BPF programs access instruction pointer or return address?
+-
 A: NO.
 
 Q: can BPF programs access stack pointer ?
-A: NO. Only frame pointer (register R10) is accessible.
-   From compiler point of view it's necessary to have stack pointer.
-   For example LLVM defines register R11 as stack pointer in its
-   BPF backend, but it makes sure that generated code never uses it.
+--
+A: NO.
+
+Only frame pointer (register R10) is accessible.
+From compiler point of view it's necessary to have stack pointer.
+For example LLVM defines register R11 as stack pointer in its
+BPF backend, but it makes sure that generated code never uses it.
 
 Q: Does C-calling convention diminishes possible use cases?
-A: YES. BPF design forces addition of major functionality in the form
-   of kernel helper functions and kernel objects like BPF maps with
-   seamless interoperability between them. It lets kernel call into
-   BPF programs and programs call kernel helpers with zero overhead.
-   As all of them were native C code. That is particularly the case
-   for JITed BPF programs that are indistinguishable from
-   native kernel C code.
+---
+A: YES.
+
+BPF design forces addition of major functionality in the form
+of kernel helper functions and kernel objects like BPF maps with
+seamless interoperability between them. It lets kernel call into
+BPF programs and programs call kernel helpers with zero overhead.
+As all of them were n

[bpf-next PATCH 5/5] bpf, doc: howto use/run the BPF selftests

2018-05-14 Thread Jesper Dangaard Brouer
I always forget how to run the BPF selftests.  Thus, let's add that info
to the QA document.

Documentation was based on Cilium's documentation:
 http://cilium.readthedocs.io/en/latest/bpf/#verifying-the-setup

Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/bpf_devel_QA.rst |   29 +
 1 file changed, 29 insertions(+)

diff --git a/Documentation/bpf/bpf_devel_QA.rst b/Documentation/bpf/bpf_devel_QA.rst
index 2254bdeae990..0e7c1d946e83 100644
--- a/Documentation/bpf/bpf_devel_QA.rst
+++ b/Documentation/bpf/bpf_devel_QA.rst
@@ -417,6 +417,33 @@ submitted by the BPF maintainers to the stable maintainers.
 Testing patches
 ===
 
+Q: How to run BPF selftests
+---
+A: After you have booted into the newly compiled kernel, navigate to
+the BPF selftests_ suite in order to test BPF functionality (current
+working directory points to the root of the cloned git tree)::
+
+  $ cd tools/testing/selftests/bpf/
+  $ make
+
+To run the verifier tests::
+
+  $ sudo ./test_verifier
+
+The verifier tests print out all the current checks being
+performed. The summary at the end of running all tests will dump
+information of test successes and failures::
+
+  Summary: 418 PASSED, 0 FAILED
+
+In order to run through all BPF selftests, the following command is
+needed::
+
+  $ sudo make run_tests
+
+See the kernels selftest `Documentation/dev-tools/kselftest.rst`_
+document for further documentation.
+
 Q: Which BPF kernel selftests version should I run my kernel against?
 -
 A: If you run a kernel ``xyz``, then always run the BPF kernel selftests
@@ -607,5 +634,7 @@ when:
 .. _netdev FAQ: ../networking/netdev-FAQ.txt
 .. _samples/bpf/: ../../samples/bpf/
 .. _selftests: ../../tools/testing/selftests/bpf/
+.. _Documentation/dev-tools/kselftest.rst:
+   https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html
 
 Happy BPF hacking!



[bpf-next PATCH 4/5] bpf, doc: convert bpf_devel_QA.rst to use RST formatting

2018-05-14 Thread Jesper Dangaard Brouer
Same story as bpf_design_QA.rst RST format conversion.

Again thanks to Quentin Monnet <quentin.mon...@netronome.com> for
fixes and patches that have been squashed.

Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/bpf_devel_QA.rst |  799 +++-
 1 file changed, 420 insertions(+), 379 deletions(-)

diff --git a/Documentation/bpf/bpf_devel_QA.rst b/Documentation/bpf/bpf_devel_QA.rst
index da57601153a0..2254bdeae990 100644
--- a/Documentation/bpf/bpf_devel_QA.rst
+++ b/Documentation/bpf/bpf_devel_QA.rst
@@ -1,424 +1,446 @@
+=
+HOWTO interact with BPF subsystem
+=
+
 This document provides information for the BPF subsystem about various
 workflows related to reporting bugs, submitting patches, and queueing
 patches for stable kernels.
 
 For general information about submitting patches, please refer to
-Documentation/process/. This document only describes additional specifics
+`Documentation/process/`_. This document only describes additional specifics
 related to BPF.
 
-Reporting bugs:

+.. contents::
+:local:
+:depth: 2
 
-Q: How do I report bugs for BPF kernel code?
+Reporting bugs
+==
 
+Q: How do I report bugs for BPF kernel code?
+
 A: Since all BPF kernel development as well as bpftool and iproute2 BPF
-   loader development happens through the netdev kernel mailing list,
-   please report any found issues around BPF to the following mailing
-   list:
+loader development happens through the netdev kernel mailing list,
+please report any found issues around BPF to the following mailing
+list:
 
- net...@vger.kernel.org
+ net...@vger.kernel.org
 
-   This may also include issues related to XDP, BPF tracing, etc.
+This may also include issues related to XDP, BPF tracing, etc.
 
-   Given netdev has a high volume of traffic, please also add the BPF
-   maintainers to Cc (from kernel MAINTAINERS file):
+Given netdev has a high volume of traffic, please also add the BPF
+maintainers to Cc (from kernel MAINTAINERS_ file):
 
- Alexei Starovoitov <a...@kernel.org>
- Daniel Borkmann <dan...@iogearbox.net>
+* Alexei Starovoitov <a...@kernel.org>
+* Daniel Borkmann <dan...@iogearbox.net>
 
-   In case a buggy commit has already been identified, make sure to keep
-   the actual commit authors in Cc as well for the report. They can
-   typically be identified through the kernel's git tree.
+In case a buggy commit has already been identified, make sure to keep
+the actual commit authors in Cc as well for the report. They can
+typically be identified through the kernel's git tree.
 
-   Please do *not* report BPF issues to bugzilla.kernel.org since it
-   is a guarantee that the reported issue will be overlooked.
+**Please do NOT report BPF issues to bugzilla.kernel.org since it
+is a guarantee that the reported issue will be overlooked.**
 
-Submitting patches:

+Submitting patches
+==
 
 Q: To which mailing list do I need to submit my BPF patches?
-
+
 A: Please submit your BPF patches to the netdev kernel mailing list:
 
- net...@vger.kernel.org
+ net...@vger.kernel.org
 
-   Historically, BPF came out of networking and has always been maintained
-   by the kernel networking community. Although these days BPF touches
-   many other subsystems as well, the patches are still routed mainly
-   through the networking community.
+Historically, BPF came out of networking and has always been maintained
+by the kernel networking community. Although these days BPF touches
+many other subsystems as well, the patches are still routed mainly
+through the networking community.
 
-   In case your patch has changes in various different subsystems (e.g.
-   tracing, security, etc), make sure to Cc the related kernel mailing
-   lists and maintainers from there as well, so they are able to review
-   the changes and provide their Acked-by's to the patches.
+In case your patch has changes in various different subsystems (e.g.
+tracing, security, etc), make sure to Cc the related kernel mailing
+lists and maintainers from there as well, so they are able to review
+the changes and provide their Acked-by's to the patches.
 
 Q: Where can I find patches currently under discussion for BPF subsystem?
-
+-
 A: All patches that are Cc'ed to netdev are queued for review under netdev
-   patchwork project:
+patchwork project:
 
- http://patchwork.ozlabs.org/project/netdev/list/
+  http://patchwork.ozlabs.org/project/netdev/list/
 
-   Those patches which target BPF, are assigned to a 'bpf' delegate for
-   further processing from BPF maintainers. The current queue with
-   patches under review can be found

[bpf-next PATCH 2/5] bpf, doc: rename txt files to rst files

2018-05-14 Thread Jesper Dangaard Brouer
This will cause them to get auto rendered, e.g. when viewing them on GitHub.
Followup patches will correct the content to be RST compliant.

Also adjust README.rst to point to the renamed files.

Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/README.rst|4 
 Documentation/bpf/bpf_design_QA.rst |  156 ++
 Documentation/bpf/bpf_design_QA.txt |  156 --
 Documentation/bpf/bpf_devel_QA.rst  |  570 +++
 Documentation/bpf/bpf_devel_QA.txt  |  570 ---
 5 files changed, 728 insertions(+), 728 deletions(-)
 create mode 100644 Documentation/bpf/bpf_design_QA.rst
 delete mode 100644 Documentation/bpf/bpf_design_QA.txt
 create mode 100644 Documentation/bpf/bpf_devel_QA.rst
 delete mode 100644 Documentation/bpf/bpf_devel_QA.txt

diff --git a/Documentation/bpf/README.rst b/Documentation/bpf/README.rst
index 329469c33db8..b9a80c9e9392 100644
--- a/Documentation/bpf/README.rst
+++ b/Documentation/bpf/README.rst
@@ -28,8 +28,8 @@ Two sets of Questions and Answers (Q&A) are maintained.
 
 
 .. Links:
-.. _bpf_design_QA: bpf_design_QA.txt
-.. _bpf_devel_QA:  bpf_devel_QA.txt
+.. _bpf_design_QA: bpf_design_QA.rst
+.. _bpf_devel_QA:  bpf_devel_QA.rst
 .. _Documentation/networking/filter.txt: ../networking/filter.txt
 .. _man-pages: https://www.kernel.org/doc/man-pages/
 .. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
diff --git a/Documentation/bpf/bpf_design_QA.rst b/Documentation/bpf/bpf_design_QA.rst
new file mode 100644
index ..f3e458a0bb2f
--- /dev/null
+++ b/Documentation/bpf/bpf_design_QA.rst
@@ -0,0 +1,156 @@
+BPF extensibility and applicability to networking, tracing, security
+in the linux kernel and several user space implementations of BPF
+virtual machine led to a number of misunderstanding on what BPF actually is.
+This short QA is an attempt to address that and outline a direction
+of where BPF is heading long term.
+
+Q: Is BPF a generic instruction set similar to x64 and arm64?
+A: NO.
+
+Q: Is BPF a generic virtual machine ?
+A: NO.
+
+BPF is generic instruction set _with_ C calling convention.
+
+Q: Why C calling convention was chosen?
+A: Because BPF programs are designed to run in the linux kernel
+   which is written in C, hence BPF defines instruction set compatible
+   with two most used architectures x64 and arm64 (and takes into
+   consideration important quirks of other architectures) and
+   defines calling convention that is compatible with C calling
+   convention of the linux kernel on those architectures.
+
+Q: can multiple return values be supported in the future?
+A: NO. BPF allows only register R0 to be used as return value.
+
+Q: can more than 5 function arguments be supported in the future?
+A: NO. BPF calling convention only allows registers R1-R5 to be used
+   as arguments. BPF is not a standalone instruction set.
+   (unlike x64 ISA that allows msft, cdecl and other conventions)
+
+Q: can BPF programs access instruction pointer or return address?
+A: NO.
+
+Q: can BPF programs access stack pointer ?
+A: NO. Only frame pointer (register R10) is accessible.
+   From compiler point of view it's necessary to have stack pointer.
+   For example LLVM defines register R11 as stack pointer in its
+   BPF backend, but it makes sure that generated code never uses it.
+
+Q: Does C-calling convention diminishes possible use cases?
+A: YES. BPF design forces addition of major functionality in the form
+   of kernel helper functions and kernel objects like BPF maps with
+   seamless interoperability between them. It lets kernel call into
+   BPF programs and programs call kernel helpers with zero overhead.
+   As all of them were native C code. That is particularly the case
+   for JITed BPF programs that are indistinguishable from
+   native kernel C code.
+
+Q: Does it mean that 'innovative' extensions to BPF code are disallowed?
+A: Soft yes. At least for now until BPF core has support for
+   bpf-to-bpf calls, indirect calls, loops, global variables,
+   jump tables, read only sections and all other normal constructs
+   that C code can produce.
+
+Q: Can loops be supported in a safe way?
+A: It's not clear yet. BPF developers are trying to find a way to
+   support bounded loops where the verifier can guarantee that
+   the program terminates in less than 4096 instructions.
+
+Q: How come LD_ABS and LD_IND instruction are present in BPF whereas
+   C code cannot express them and has to use builtin intrinsics?
+A: This is artifact of compatibility with classic BPF. Modern
+   networking code in BPF performs better without them.
+   See 'direct packet access'.
+
+Q: It seems not all BPF instructions are one-to-one to native CPU.
+   For example why BPF_JNE and other compare and jumps are not cpu-like?
+A: This was necessary to avoid introducing flags into ISA which are
+   impossible to make generic and efficient across CPU architectures.
+
+

[bpf-next PATCH 1/5] bpf, doc: add basic README.rst file

2018-05-14 Thread Jesper Dangaard Brouer
A README.rst file in a directory has special meaning for sites like
GitHub, which auto-render the contents.  Search engines like Google
also index these README.rst files.

Auto-rendering allows us to use links for (re)directing eBPF users to
other places where docs live.  The end goal would be to direct users
towards https://www.kernel.org/doc/html/latest but we haven't written
the full docs yet, so we start out small and take this incrementally.

This directory itself contains some useful docs, which can be linked
to from the README.rst file (verified this works for github).

Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/README.rst |   36 
 1 file changed, 36 insertions(+)
 create mode 100644 Documentation/bpf/README.rst

diff --git a/Documentation/bpf/README.rst b/Documentation/bpf/README.rst
new file mode 100644
index ..329469c33db8
--- /dev/null
+++ b/Documentation/bpf/README.rst
@@ -0,0 +1,36 @@
+=
+BPF documentation
+=
+
+This directory contains documentation for the BPF (Berkeley Packet
+Filter) facility, with a focus on the extended BPF version (eBPF).
+
+This kernel side documentation is still work in progress.  The main
+textual documentation is (for historical reasons) described in
+`Documentation/networking/filter.txt`_, which describe both classical
+and extended BPF instruction-set.
+The Cilium project also maintains a `BPF and XDP Reference Guide`_
+that goes into great technical depth about the BPF Architecture.
+
+The primary info for the bpf syscall is available in the `man-pages`_
+for `bpf(2)`_.
+
+
+
+Frequently asked questions (FAQ)
+
+
+Two sets of Questions and Answers (Q&A) are maintained.
+
+* QA for common questions about BPF see: bpf_design_QA_
+
+* QA for developers interacting with BPF subsystem: bpf_devel_QA_
+
+
+.. Links:
+.. _bpf_design_QA: bpf_design_QA.txt
+.. _bpf_devel_QA:  bpf_devel_QA.txt
+.. _Documentation/networking/filter.txt: ../networking/filter.txt
+.. _man-pages: https://www.kernel.org/doc/man-pages/
+.. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
+.. _BPF and XDP Reference Guide: http://cilium.readthedocs.io/en/latest/bpf/



[bpf-next PATCH 0/5] bpf, doc: convert Documentation/bpf to RST-formatting

2018-05-14 Thread Jesper Dangaard Brouer
The kernel is moving files under Documentation to use the RST
(reStructuredText) format and Sphinx [1].  This patchset converts the
files under Documentation/bpf/ into RST format.  The Sphinx
integration is left as followup work.

[1] https://www.kernel.org/doc/html/latest/doc-guide/sphinx.html

This patchset has been uploaded as branch bpf_doc10 on GitHub[2], so
reviewers can see how GitHub renders it.

[2] https://github.com/netoptimizer/linux/tree/bpf_doc10/Documentation/bpf

---

Jesper Dangaard Brouer (5):
  bpf, doc: add basic README.rst file
  bpf, doc: rename txt files to rst files
  bpf, doc: convert bpf_design_QA.rst to use RST formatting
  bpf, doc: convert bpf_devel_QA.rst to use RST formatting
  bpf, doc: howto use/run the BPF selftests


 Documentation/bpf/README.rst|   36 ++
 Documentation/bpf/bpf_design_QA.rst |  221 
 Documentation/bpf/bpf_design_QA.txt |  156 -
 Documentation/bpf/bpf_devel_QA.rst  |  640 +++
 Documentation/bpf/bpf_devel_QA.txt  |  570 ---
 5 files changed, 897 insertions(+), 726 deletions(-)
 create mode 100644 Documentation/bpf/README.rst
 create mode 100644 Documentation/bpf/bpf_design_QA.rst
 delete mode 100644 Documentation/bpf/bpf_design_QA.txt
 create mode 100644 Documentation/bpf/bpf_devel_QA.rst
 delete mode 100644 Documentation/bpf/bpf_devel_QA.txt

--


Re: [PATCH bpf-next v3 8/8] bpf: add documentation for eBPF helpers (58-64)

2018-04-19 Thread Jesper Dangaard Brouer
On Thu, 19 Apr 2018 13:44:41 +0100
Quentin Monnet <quentin.mon...@netronome.com> wrote:

> 2018-04-18 17:43 UTC+0200 ~ Jesper Dangaard Brouer <bro...@redhat.com>
> > On Wed, 18 Apr 2018 15:09:41 +0100
> > Quentin Monnet <quentin.mon...@netronome.com> wrote:
> >   
> >> 2018-04-18 15:34 UTC+0200 ~ Jesper Dangaard Brouer <bro...@redhat.com>  
> >>> On Tue, 17 Apr 2018 15:34:38 +0100
> >>> Quentin Monnet <quentin.mon...@netronome.com> wrote:
> >>> 
> >>>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> >>>> index 350459c583de..3d329538498f 100644
> >>>> --- a/include/uapi/linux/bpf.h
> >>>> +++ b/include/uapi/linux/bpf.h
> >>>> @@ -1276,6 +1276,50 @@ union bpf_attr {
> >>>>   *  Return
> >>>>   *  0 on success, or a negative error in case of failure.
> >>>>   *
> >>>> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
> >>>> + *  Description
> >>>> + *  Redirect the packet to the endpoint referenced by *map* 
> >>>> at
> >>>> + *  index *key*. Depending on its type, his *map* can 
> >>>> contain
> >>> ^^^
> >>>
> >>> "his" -> "this"
> >>
> >> Thanks!
> >>  
> >>>> + *  references to net devices (for forwarding packets 
> >>>> through other
> >>>> + *  ports), or to CPUs (for redirecting XDP frames to 
> >>>> another CPU;
> >>>> + *  but this is only implemented for native XDP (with driver
> >>>> + *  support) as of this writing).
> >>>> + *
> >>>> + *  All values for *flags* are reserved for future usage, 
> >>>> and must
> >>>> + *  be left at zero.
> >>>> + *  Return
> >>>> + *  **XDP_REDIRECT** on success, or **XDP_ABORT** on error.
> >>>> + *
> >>>
> >>> "XDP_ABORT" -> "XDP_ABORTED"
> >>
> >> Ouch. And I did the same for bpf_redirect(). Thanks for the catch.
> >>  
> >>>
> >>> I don't know if it's worth mentioning in the doc/man-page; that for XDP
> >>> using bpf_redirect_map() is a HUGE performance advantage, compared to
> >>> the bpf_redirect() call ?
> >>
> >> It seems worth to me. How would you simply explain the reason for this
> >> difference?  
> > 
> > The basic reason is "bulking effect", as devmap avoids the NIC
> > tailptr/doorbell update on every packet... how to write that in a doc
> > format?
> > 
> > I wrote about why XDP_REDIRECT with maps are smart here:
> >  
> > http://vger.kernel.org/netconf2017_files/XDP_devel_update_NetConf2017_Seoul.pdf
> > 
> > Using maps for redirect, hopefully makes XDP_REDIRECT the last driver
> > XDP action code we need.  As new types of redirect can be introduced
> > without driver changes. See that AF_XDP also uses a map.
> > 
> > It is more subtle, but maps also function as a sorting step. Imagine
> > your XDP program need to redirect out different interfaces (or CPUs in
> > cpumap case), and packets arrive intermixed.  Packets get sorted into
> > the different map indexes, and the xdp_do_flush_map() will trigger the
> > flush operation.
> > 
> > 
> > Happened to have an i40e NIC benchmark setup, and ran a single flow pktgen 
> > test.
> > 
> > Results with 'xdp_redirect_map'
> >  13589297 pps (13,589,297) 
> > 
> > Results with 'xdp_redirect' NOT using devmap:
> >   7567575 pps (7,567,575)
> > 
> > Just to point out the performance benefit of devmap...  
> 
> 
> Thanks for those details! This is an impressive change in performance
> indeed.
> 
> I think I will just keep it simple for the documentation. I will add the
> following for bpf_redirect_map():
> 
> When used to redirect packets to net devices, this helper
> provides a high performance increase over **bpf_redirect**\ ().
> This is due to various implementation details of the underlying
> mechanisms, one of which is the fact that **bpf_redirect_map**\ ()
> tries to send packet as a "bulk" to the device.
> 
> And also append the following to bpf_redirect():
> 
> The same effect can be attained with the more generic
> **bpf_redirect_map**\ (), which requires specific maps
> to be used but offers better performance.

This sounds good to me! :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH bpf-next v3 8/8] bpf: add documentation for eBPF helpers (58-64)

2018-04-18 Thread Jesper Dangaard Brouer
On Wed, 18 Apr 2018 15:09:41 +0100
Quentin Monnet <quentin.mon...@netronome.com> wrote:

> 2018-04-18 15:34 UTC+0200 ~ Jesper Dangaard Brouer <bro...@redhat.com>
> > On Tue, 17 Apr 2018 15:34:38 +0100
> > Quentin Monnet <quentin.mon...@netronome.com> wrote:
> >   
> >> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> >> index 350459c583de..3d329538498f 100644
> >> --- a/include/uapi/linux/bpf.h
> >> +++ b/include/uapi/linux/bpf.h
> >> @@ -1276,6 +1276,50 @@ union bpf_attr {
> >>   *Return
> >>   *0 on success, or a negative error in case of failure.
> >>   *
> >> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
> >> + *Description
> >> + *Redirect the packet to the endpoint referenced by *map* at
> >> + *index *key*. Depending on its type, his *map* can contain  
> > ^^^
> > 
> > "his" -> "this"  
> 
> Thanks!
> 
> >> + *references to net devices (for forwarding packets through other
> >> + *ports), or to CPUs (for redirecting XDP frames to another CPU;
> >> + *but this is only implemented for native XDP (with driver
> >> + *support) as of this writing).
> >> + *
> >> + *All values for *flags* are reserved for future usage, and must
> >> + *be left at zero.
> >> + *Return
> >> + ***XDP_REDIRECT** on success, or **XDP_ABORT** on error.
> >> + *  
> > 
> > "XDP_ABORT" -> "XDP_ABORTED"  
> 
> Ouch. And I did the same for bpf_redirect(). Thanks for the catch.
> 
> > 
> > I don't know if it's worth mentioning in the doc/man-page; that for XDP
> > using bpf_redirect_map() is a HUGE performance advantage, compared to
> > the bpf_redirect() call ?  
> 
> It seems worth to me. How would you simply explain the reason for this
> difference?

The basic reason is "bulking effect", as devmap avoids the NIC
tailptr/doorbell update on every packet... how to write that in a doc
format?

I wrote about why XDP_REDIRECT with maps is smart here:
 http://vger.kernel.org/netconf2017_files/XDP_devel_update_NetConf2017_Seoul.pdf

Using maps for redirect hopefully makes XDP_REDIRECT the last driver
XDP action code we need, as new types of redirect can be introduced
without driver changes. Note that AF_XDP also uses a map.

It is more subtle, but maps also function as a sorting step. Imagine
your XDP program needs to redirect out of different interfaces (or to
different CPUs in the cpumap case), and packets arrive intermixed.
Packets get sorted into the different map indexes, and
xdp_do_flush_map() will trigger the flush operation.
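
To make the sorting/bulking argument a bit more concrete, here is a toy
userspace model in C.  It is not kernel code and only counts doorbell
updates.  NUM_DEST, POLL_BUDGET and both function names are made up for
illustration, and "one flush per non-empty destination per poll cycle"
is a simplifying assumption about what the flush step does.

```c
#include <stddef.h>

#define NUM_DEST    4   /* hypothetical number of map indexes in use */
#define POLL_BUDGET 64  /* assumed packets handled per driver poll cycle */

/* Model without a map: every packet is sent on its own, so each
 * packet costs one tailptr/doorbell update. */
unsigned long doorbells_per_packet(const unsigned *dest, size_t n)
{
	(void)dest;	/* destination does not matter in this model */
	return n;
}

/* Model with a map: packets of one poll cycle are first sorted per
 * destination index, then each non-empty destination is flushed once. */
unsigned long doorbells_bulked(const unsigned *dest, size_t n)
{
	unsigned long doorbells = 0;

	for (size_t start = 0; start < n; start += POLL_BUDGET) {
		size_t end = start + POLL_BUDGET < n ? start + POLL_BUDGET : n;
		unsigned pending[NUM_DEST] = {0};

		/* Sorting step: intermixed packets land in their index. */
		for (size_t i = start; i < end; i++)
			pending[dest[i] % NUM_DEST]++;

		/* Flush step: one doorbell per destination with packets. */
		for (unsigned d = 0; d < NUM_DEST; d++)
			if (pending[d])
				doorbells++;
	}
	return doorbells;
}
```

With 256 packets spread round-robin over 4 destinations, the per-packet
model counts 256 doorbell updates, while the bulked model counts 16
(4 destinations in each of 4 poll cycles).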


Happened to have an i40e NIC benchmark setup, and ran a single flow pktgen test.

Results with 'xdp_redirect_map'
 13589297 pps (13,589,297) 

Results with 'xdp_redirect' NOT using devmap:
  7567575 pps (7,567,575)

Just to point out the performance benefit of devmap...
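
For a sense of scale, the pps numbers above can be converted into a
per-packet time budget.  A small C sketch of that arithmetic (the helper
names are mine and not part of any kernel API):

```c
/* Nanoseconds spent per packet at a given packets-per-second rate. */
double ns_per_packet(double pps)
{
	return 1e9 / pps;
}

/* How many times faster the first rate is compared to the second. */
double speedup(double pps_fast, double pps_slow)
{
	return pps_fast / pps_slow;
}
```

Feeding in the two results gives roughly 73.6 ns per packet with devmap
versus roughly 132.1 ns without it, i.e. about a 1.8x speedup on this
single flow test.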

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [PATCH bpf-next v3 8/8] bpf: add documentation for eBPF helpers (58-64)

2018-04-18 Thread Jesper Dangaard Brouer
On Tue, 17 Apr 2018 15:34:38 +0100
Quentin Monnet <quentin.mon...@netronome.com> wrote:

> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 350459c583de..3d329538498f 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1276,6 +1276,50 @@ union bpf_attr {
>   *   Return
>   *   0 on success, or a negative error in case of failure.
>   *
> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
> + *   Description
> + *   Redirect the packet to the endpoint referenced by *map* at
> + *   index *key*. Depending on its type, his *map* can contain
^^^

"his" -> "this"

> + *   references to net devices (for forwarding packets through other
> + *   ports), or to CPUs (for redirecting XDP frames to another CPU;
> + *   but this is only implemented for native XDP (with driver
> + *   support) as of this writing).
> + *
> + *   All values for *flags* are reserved for future usage, and must
> + *   be left at zero.
> + *   Return
> + *   **XDP_REDIRECT** on success, or **XDP_ABORT** on error.
> + *

"XDP_ABORT" -> "XDP_ABORTED"

I don't know if it's worth mentioning in the doc/man-page that, for XDP,
using bpf_redirect_map() is a HUGE performance advantage compared to
the bpf_redirect() call?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [RFC bpf-next v2 8/8] bpf: add documentation for eBPF helpers (58-64)

2018-04-11 Thread Jesper Dangaard Brouer
On Tue, 10 Apr 2018 15:41:57 +0100
Quentin Monnet <quentin.mon...@netronome.com> wrote:

> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 7343af4196c8..db090ad03626 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1250,6 +1250,51 @@ union bpf_attr {
>   *   Return
>   *   0 on success, or a negative error in case of failure.
>   *
> + * int bpf_redirect_map(struct bpf_map *map, u32 key, u64 flags)
> + *   Description
> + *   Redirect the packet to the endpoint referenced by *map* at
> + *   index *key*. Depending on its type, his *map* can contain
> + *   references to net devices (for forwarding packets through other
> + *   ports), or to CPUs (for redirecting XDP frames to another CPU;
> + *   but this is not fully implemented as of this writing).

Stating that CPUMAP redirect "is not fully implemented" is confusing.
The issue is that CPUMAP only works for "native" XDP.

What about saying:

"[...] or to CPUs (for redirecting XDP frames to another CPU;
 but this is only implemented for native XDP as of this writing)"

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [PATCH net-next 2/2] tools: bpf: add bpftool

2017-09-27 Thread Jesper Dangaard Brouer
On Wed, 27 Sep 2017 03:57:42 -0700
Jakub Kicinski <jakub.kicin...@netronome.com> wrote:

> On Wed, 27 Sep 2017 12:45:11 +0200, Jesper Dangaard Brouer wrote:
> > On Wed, 27 Sep 2017 00:02:08 +0100
> > Jakub Kicinski <jakub.kicin...@netronome.com> wrote:
> >   
> > > On Tue, 26 Sep 2017 15:24:06 -0700, Alexei Starovoitov wrote:
> > > > On Tue, Sep 26, 2017 at 08:35:22AM -0700, Jakub Kicinski wrote:  
> > > > > Add a simple tool for querying and updating BPF objects on the system.
> > > > > 
> > > > > Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>
> > > > > Reviewed-by: Simon Horman <simon.hor...@netronome.com>
> > [...]  
> > > > >  tools/bpf/Makefile |  18 +-
> > > > >  tools/bpf/bpftool/Makefile |  80 +
> > > > >  tools/bpf/bpftool/common.c | 214 
> > > > >  tools/bpf/bpftool/jit_disasm.c |  83 +
> > > > >  tools/bpf/bpftool/main.c   | 212 
> > > > >  tools/bpf/bpftool/main.h   |  99 ++
> > > > >  tools/bpf/bpftool/map.c| 742 +
> > > > >  tools/bpf/bpftool/prog.c   | 392 ++
> > > > >  8 files changed, 1837 insertions(+), 3 deletions(-)
> > > > ...  
> > > > > +static int do_help(int argc, char **argv)
> > > > > +{
> > > > > + fprintf(stderr,
> > > > > + "Usage: %s %s show   [MAP]\n"
> > > > > + "   %s %s dumpMAP\n"
> > > > > + "   %s %s update  MAP  key BYTES value VALUE [UPDATE_FLAGS]\n"
> > > > > + "   %s %s lookup  MAP  key BYTES\n"
> > > > > + "   %s %s getnext MAP [key BYTES]\n"
> > > > > + "   %s %s delete  MAP  key BYTES\n"
> > > > > + "   %s %s pin MAP  FILE\n"
> > > > > + "   %s %s help\n"
> > > > > + "\n"
> > > > > + "   MAP := { id MAP_ID | pinned FILE }\n"
> > > > > + "   " HELP_SPEC_PROGRAM "\n"
> > > > > + "   VALUE := { BYTES | MAP | PROG }\n"
> > > > > + "   UPDATE_FLAGS := { any | exist | noexist }\n"
> > > > > + "",
> > > > 
> > > > overall looks good to me, but still difficult to grasp how to use it.
> > > > Can you add README with example usage and expected output?  
> > > 
> > > I have a README on GitHub, but I was thinking about perhaps writing a
> > > proper man page?  Do you prefer one over the other?
> > 
> > I would prefer adding a README.rst file, in RST-format, as the rest of
> > the kernel documentation is moving in that direction[1] (your github
> > version is in README.md format).  A man page will always be
> > out-of-sync, and even out-of-sync on different distros.
> > 
> >  See[1]: https://www.kernel.org/doc/html/latest/
> > 
> > And then I would find some place in Documentation/admin-guide/ and
> > include the README.rst file, so it shows up at [1].
> > 
> > RST have an include method like:
> > 
> > .. include:: ../../tools/bpf/bpftool/README.rst  
> 
> Can the docs in new format be rendered into a man page?  Call me old
> fashioned but I think we should provide some form of a man page.. :)

Yes, simply create the man page like:

 rst2man README.rst README.man

You can add this to your local makefile.

The standard sphinx build can also generate man-pages, but that target
has been removed from the kernel makefile targets:

Documentation targets:
 Linux kernel internal documentation in different formats from ReST:
  htmldocs- HTML
  latexdocs   - LaTeX
  pdfdocs - PDF
  epubdocs- EPUB
  xmldocs - XML
  linkcheckdocs   - check for broken external links (will connect to external hosts)
  cleandocs   - clean all generated files

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [net-next PATCH 0/4] Documenting eBPF - extended Berkeley Packet Filter

2017-02-08 Thread Jesper Dangaard Brouer
On Tue, 7 Feb 2017 14:23:23 -0700
Jonathan Corbet <cor...@lwn.net> wrote:

> On Tue, 7 Feb 2017 21:51:49 +0100
> Jesper Dangaard Brouer <bro...@redhat.com> wrote:
> 
> > It sounds like Daniel (see other email) has bigger plans for what
> > Documentation/BPF/ should contain.  E.g. consolidating
> > Documentation/networking/filter.txt which covers the cBPF/eBPF internals.
> > If that is the case (and I like the idea), then it goes beyond a
> > "userspace-guide".  And perhaps "BPF" is a "book" of its own?  
> 
> One of the real problems with the kernel's documentation is that there is
> really almost no thought given to who the audience is.  We have docs for
> kernel developers, for system admins, for user-space developers, etc., and
> it's all mixed up into one big jumble.
> 
> An objective of mine in launching into this whole project is to try to fix
> that, so that people can readily find the documentation they need.  So I
> don't think a single top-level directory, with a mix of user-space API
> info and "internals", is the right direction to go.  The internals docs
> should, IMO, go elsewhere, probably in the core-api manual.
> 
> See what I'm getting at here?

At first I was reluctant (as it would be "easier" just to cram every eBPF
thing into one directory).  Thinking more about it, I agree with you, and
I like your vision.  Focusing on the target audience, and avoiding mixing
different target audiences in the same document/book, is the way forward.

My audience and objective is helping developers get started using
eBPF, not core developers of eBPF (like Daniel).  I do see that, if we
start mixing in too much "internals", then we lose sight of the
original target audience, and then they "exit" as they get "lost" in
details that do not concern them.

Separating BPF docs into different directories (or "books") will make
us think about the target audience.

I would like to propose this directory structure:

 Documentation/user-guide/bpf/
 Documentation/core-api/bpf/


> > And it seems Daniel is proposing capital-letters BPF for the directory
> > name "Documentation/BPF/"?  Any opinions on that? (I'm neutral)  
> 
> I think we should paint it green; otherwise I'm not too concerned about
> this particular point...:)

True, bikeshedding...

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [net-next PATCH 0/4] Documenting eBPF - extended Berkeley Packet Filter

2017-02-07 Thread Jesper Dangaard Brouer
On Tue, 07 Feb 2017 17:43:38 +0100
Daniel Borkmann <dan...@iogearbox.net> wrote:

> Hi Jesper,
> 
> On 02/07/2017 03:30 PM, Jesper Dangaard Brouer wrote:
> > Question: What kernel tree should this go into???
> >
> > If going through Jonathan Corbet, will it appear sooner here???
> >   https://www.kernel.org/doc/html/latest/
> > If it will not appear sooner that way, then it's likely best to keep
> > it in sync with the tree that takes eBPF code changes.  
> 
> For initial parts, I don't have a preference (Jonathan has though,
> so seems fine via docs tree then). If at some /later/ point in time
> features come in along with doc updates (similar to test case updates),
> probably best to route them via net-next.
> 
> > This marks the beginning of user-facing developer documentation for
> > using eBPF (extended Berkeley Packet Filter) as part of the kernel
> > Documentation/ tree.
> >
> > This documentation is also available here[1], as an intermediate quick
> > way of prototyping and releasing the documentation.  The authoritative
> > and official version of the documentation is what gets included in the
> > kernel tree.  The docs at [2] will get updated based on what gets
> > accepted after the standard peer-review kernel process.  
> 
> Thanks for your effort of writing a doc. Some high-level comments on
> the set from my PoV first.
> 
> I think it's definitely the right direction to move everything BPF
> related to Documentation/BPF/. Right now, there are a lot of different
> places with different kind of documentation, f.e.:

Agree that we need some in-kernel place to centralize bpf related
documentation, as it is too scattered at the moment.

> * Documentation/networking/filter.txt
>Covers some cBPF/eBPF internals, tooling, etc; mostly technical,
>historically the central spot for BPF documentation. "filter" in
>filter.txt is long obsolete name, but looks like various sites,
>talks, blogs, etc still link to it. (At best, we should keep the
>file saying that the doc moved to Documentation/BPF/.)

Agree.
 
> * bpf(2) man page
>Has a good start, but right now is heavily behind the current user
>facing kernel code.

Yes, the man-page has proven to get out-of-sync.  This is one of the
reasons I prefer this in-kernel-tree documentation, as documentation
can follow the patchset submission, instead of being something
developers need to submit _after_ patches are accepted.

 
> * include/uapi/linux/bpf.h
>Mostly relevant for helper function API description.
> 
> * netdev conference slides/proceedings
>Also contain mostly technical details on eBPF.
> 
> * https://github.com/iovisor/bpf-docs
>Non-exhaustive collection of various talks from different confs.
> 
> * https://qmonnet.github.io/whirl-offload/2016/09/01/dive-into-bpf/
>Even bigger and more complete list of documentation material.
> 
> * Various lwn articles ;), blog posts (f.e. from Brendan), etc.
> 
> Now, challenge is to bring the relevant parts together and logically
> separated into Documentation/BPF/ and bpf(2) man page. I think everything
> user API relevant would help most if it updates bpf(2) man page. That can
> be explanation of different map types, interaction with maps, quirks, etc.

Sorry, but I disagree.  The man-page bpf(2) should only describe the
bpf syscall.  Details on map types should be documented in this
documentation.  Why?  Because this allows us to enforce that
documentation of a new map type is included together with the code
submission (else it will never get documented).


> Eventually also helper functions. Right now they're all documented in
> include/uapi/linux/bpf.h and that's okay as it ships along with the
> kernel code, so they're in sync. Eventually, there should be some more
> elaborate description of them, perhaps with tiny examples, in bpf(2)
> as well, since it's part of the uapi and stable (helpers themselves at
> least).

IMHO descriptions of bpf helper functions do not belong in the
man-page for the bpf(2) syscall.  The bpf helpers are something that
gets used by the eBPF program code, which runs kernel side.  (And again
the same argument applies: when new helpers are introduced, the man
page will not get updated.)


> The Documentation/networking/filter.txt would need to be reworked a
> bit and split into pieces for Documentation/BPF/, so we keep that as a
> central place for the technical parts. Documentation/RCU/ is doing a
> great job at that, and I would like to see Documentation/BPF/ being as
> helpful for developers here. Part of that would be to add missing
> pieces from the various available sources mentioned above or
> elsewhere, so people can get a deeper understanding on internals
> beyond reading just m

Re: [net-next PATCH 0/4] Documenting eBPF - extended Berkeley Packet Filter

2017-02-07 Thread Jesper Dangaard Brouer
On Tue, 7 Feb 2017 09:46:08 -0700
Jonathan Corbet <cor...@lwn.net> wrote:

> On Tue, 7 Feb 2017 17:09:08 +0100
> Jesper Dangaard Brouer <bro...@redhat.com> wrote:
> 
> > > > Question: What kernel tree should this go into???
> > > > 
> > > > If going through Jonathan Corbet, will it appear sooner here???
> > > >  https://www.kernel.org/doc/html/latest/
> > 
> > What about this question?  Or let me ask in another way, what tree is
> > https://www.kernel.org/doc/html/latest/ based on?  
> 
> I believe it's generated from the current -rc.  If this stuff goes into
> 4.11, it should show up there next week.
> 
> > Yes, I was also wondering hard where to put it... and a book for
> > user-space developer documentation would likely be the right place, but
> > it was not there, as you mention ;-) 
> > 
> > I'm fine with moving it later under another "book". Linking to it as
> > HTML would still be the same right? 
> > (https://www.kernel.org/doc/html/latest/bpf/index.html)
> > And is the Documentation/bpf/ directory the correct place?  
> 
> Moving it would change the URL, of course.  If we want to avoid that, we
> should try to come up with the proper placement from the outset.  And we
> would want to move it; I really want to clean up the mess that is the
> top-level directory.
> 
> How about if it goes into Documentation/userspace-guide/bpf ?  The
> intermediate directory could just be empty for now, I'll put the book
> structure into place later on.  Then the URL for the BPF guide itself
> wouldn't change.

It sounds like Daniel (see other email) has bigger plans for what
Documentation/BPF/ should contain.  E.g. consolidating
Documentation/networking/filter.txt which covers the cBPF/eBPF internals.
If that is the case (and I like the idea), then it goes beyond a
"userspace-guide".  And perhaps "BPF" is a "book" of its own?

And it seems Daniel is proposing capital-letters BPF for the directory
name "Documentation/BPF/"?  Any opinions on that? (I'm neutral)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [net-next PATCH 0/4] Documenting eBPF - extended Berkeley Packet Filter

2017-02-07 Thread Jesper Dangaard Brouer
On Tue, 7 Feb 2017 08:37:17 -0700
Jonathan Corbet <cor...@lwn.net> wrote:

> On Tue, 07 Feb 2017 15:30:11 +0100
> Jesper Dangaard Brouer <bro...@redhat.com> wrote:
> 
> > Question: What kernel tree should this go into???
> > 
> > If going through Jonathan Corbet, will it appear sooner here???
> >  https://www.kernel.org/doc/html/latest/

What about this question?  Or let me ask in another way, what tree is
https://www.kernel.org/doc/html/latest/ based on?


> > If it will not appear sooner that way, then it's likely best to keep
> > it in sync with the tree that takes eBPF code changes.  
> 
> I've developed a fairly strong preference for carrying patches touching
> index.rst; otherwise I spend a lot of time explaining merge conflicts to
> Linus.
> 
> If the consensus is that this is ready to go, I expect I can squeeze it in
> for 4.11.  I'm not too worried about regressions...:)
> 
> I haven't actually built it yet, but from a first look it seems like an
> awfully good start.  The one thing that comes to mind is that I'm likely
> to want to move it at some point.  I'd really like to start a separate
> book for user-space developer documentation, and this would certainly
> belong there.  That book doesn't exist yet, though, so I can't quite blame
> you, hard as I might try, for not putting this document there.

Yes, I was also wondering hard where to put it... and a book for
user-space developer documentation would likely be the right place, but
it was not there, as you mention ;-) 

I'm fine with moving it later under another "book". Linking to it as
HTML would still be the same right? 
(https://www.kernel.org/doc/html/latest/bpf/index.html)
And is the Documentation/bpf/ directory the correct place?

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


[net-next PATCH 0/4] Documenting eBPF - extended Berkeley Packet Filter

2017-02-07 Thread Jesper Dangaard Brouer
Question: What kernel tree should this go into???

If going through Jonathan Corbet, will it appear sooner here???
 https://www.kernel.org/doc/html/latest/
If it will not appear sooner that way, then it's likely best to keep
it in sync with the tree that takes eBPF code changes.


This marks the beginning of user-facing developer documentation for
using eBPF (extended Berkeley Packet Filter) as part of the kernel
Documentation/ tree.

This documentation is also available here[1], as an intermediate quick
way of prototyping and releasing the documentation.  The authoritative
and official version of the documentation is what gets included in the
kernel tree.  The docs at [2] will get updated based on what gets
accepted after the standard peer-review kernel process.

[1] http://prototype-kernel.readthedocs.io/en/latest/bpf/index.html
[2] https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/Documentation

Thanks to the following people, who have already reviewed and fixed
earlier versions of this documentation on the IOvisor mailing-list:

 Alexander Alemayhu <alexan...@alemayhu.com>
 Alexei Starovoitov <a...@fb.com>
 Daniel Borkmann <dan...@iogearbox.net>
 Quentin Monnet <quentin.mon...@6wind.com>


---

Jesper Dangaard Brouer (4):
  doc/bpf: start eBPF documentation tree bpf/
  doc/bpf: document interacting with eBPF maps
  doc/bpf: describes the different types of eBPF maps available
  doc/bpf: describe BCC the BPF Compiler Collection


 Documentation/bpf/bcc_tool_chain.rst  |   37 +
 Documentation/bpf/ebpf_maps.rst   |  256 +
 Documentation/bpf/ebpf_maps_types.rst |  119 +++
 Documentation/bpf/index.rst   |   68 +
 Documentation/index.rst   |1 
 5 files changed, 481 insertions(+)
 create mode 100644 Documentation/bpf/bcc_tool_chain.rst
 create mode 100644 Documentation/bpf/ebpf_maps.rst
 create mode 100644 Documentation/bpf/ebpf_maps_types.rst
 create mode 100644 Documentation/bpf/index.rst

--
--


[net-next PATCH 1/4] doc/bpf: start eBPF documentation tree bpf/

2017-02-07 Thread Jesper Dangaard Brouer
The learning curve for eBPF programs is steep.  The purpose of this
documentation (subtree) is to make it easier for developers to get
started using and writing eBPF programs.

Including bpf/index under section User-oriented documentation.

Thanks to Quentin Monnet <quentin.mon...@6wind.com> for improving
this document with the areas eBPF is used in and for early review.

Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/index.rst |   66 +++
 Documentation/index.rst |1 +
 2 files changed, 67 insertions(+)
 create mode 100644 Documentation/bpf/index.rst

diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
new file mode 100644
index ..f262fe8f9f95
--- /dev/null
+++ b/Documentation/bpf/index.rst
@@ -0,0 +1,66 @@
+======================================
+eBPF - extended Berkeley Packet Filter
+======================================
+
+Introduction
+============
+
+The Berkeley Packet Filter (BPF) started (`article 1992`_) as a
+special-purpose virtual machine (register based filter evaluator) for
+filtering network packets, best known for its use in tcpdump. It is
+documented in the kernel tree, in the first part of:
+`Documentation/networking/filter.txt`_
+
+The extended BPF (eBPF) variant has become a universal in-kernel
+virtual machine, that has hooks all over the kernel.  The eBPF
+instruction set is quite different, see description in section "BPF
+kernel internals" of `Documentation/networking/filter.txt`_ or look at
+this `presentation by Alexei`_.
+
+Areas using eBPF:
+ * XDP - eXpress Data Path
+ * `Traffic control`_
+ * Sockets
+ * Firewalling (``xt_bpf`` module)
+ * Tracing
+ * Tracepoints
+ * kprobe (dynamic tracing of a kernel function call)
+ * cgroups
+
+Documentation
+=============
+
+The primary user documentation for extended BPF is in the man-page for
+the `bpf(2)`_ syscall.
+
+This documentation is focused on the kernel tree's `samples/bpf/`_ and
+`tools/lib/bpf/`_.  It is worth mentioning that other projects exist,
+like BCC_, that have a slightly different user-facing
+syntax, but interface with the same kernel facilities as those
+covered by this documentation.
+
+.. toctree::
+   :maxdepth: 1
+
+.. links:
+
+.. _article 1992: http://www.tcpdump.org/papers/bpf-usenix93.pdf
+
+.. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
+
+.. _Documentation/networking/filter.txt:
+   https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/Documentation/networking/filter.txt
+
+.. _presentation by Alexei:
+   http://www.slideshare.net/AlexeiStarovoitov/bpf-inkernel-virtual-machine
+
+.. _samples/bpf/:
+   https://github.com/torvalds/linux/blob/master/samples/bpf/
+
+.. _tools/lib/bpf/:
+   https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/tools/lib/bpf/
+
+.. _Traffic control: http://man7.org/linux/man-pages/man8/tc-bpf.8.html
+
+.. _BCC: https://github.com/iovisor/bcc
+
diff --git a/Documentation/index.rst b/Documentation/index.rst
index cb5d77699c60..dacf202febb8 100644
--- a/Documentation/index.rst
+++ b/Documentation/index.rst
@@ -23,6 +23,7 @@ trying to get it to work optimally on a given system.
    :maxdepth: 2
 
    admin-guide/index
+   bpf/index
 
 Introduction to kernel development
 ----------------------------------

--


[net-next PATCH 2/4] doc/bpf: document interacting with eBPF maps

2017-02-07 Thread Jesper Dangaard Brouer
Describe what eBPF maps are, how to create them, and how to
interact with them.

Thanks to Quentin Monnet <quentin.mon...@6wind.com> for improving
this document by fixing many typos and early review.

Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/ebpf_maps.rst |  255 +++
 Documentation/bpf/index.rst |2 
 2 files changed, 257 insertions(+)
 create mode 100644 Documentation/bpf/ebpf_maps.rst

diff --git a/Documentation/bpf/ebpf_maps.rst b/Documentation/bpf/ebpf_maps.rst
new file mode 100644
index ..b8808f3bc31c
--- /dev/null
+++ b/Documentation/bpf/ebpf_maps.rst
@@ -0,0 +1,255 @@
+=========
+eBPF maps
+=========
+
+This document describes what eBPF maps are, how you create them
+(`Creating a map`_), and how to interact with them (`Interacting with
+maps`_).
+
+Using eBPF maps is a method to keep state between invocations of the
+eBPF program, and allows sharing data between eBPF kernel programs,
+and also between kernel and user-space applications.
+
+Basically a key/value store with arbitrary structure (from man-page
+`bpf(2)`_):
+
+ eBPF maps are a generic data structure for storage of different data
+ types.  Data types are generally treated as binary blobs, so a user
+ just specifies the size of the key and the size of the value at
+ map-creation time.  In other words, a key/value for a given map can
+ have an arbitrary structure.
+
+The map handles are file descriptors, and multiple maps can be created
+and accessed by multiple programs (from man-page `bpf(2)`_):
+
+ A user process can create multiple maps (with key/value-pairs being
+ opaque bytes of data) and access them via file descriptors.
+ Different eBPF programs can access the same maps in parallel.  It's
+ up to the user process and eBPF program to decide what they store
+ inside maps.
+
+.. _`Creating a map`:
+
+Creating a map
+==============
+
+A map is created based on a request from userspace, via the `bpf`_
+syscall (specifically `bpf_cmd`_ BPF_MAP_CREATE), which returns a new
+file descriptor that refers to the map.  On error, -1 is returned and
+errno is set to EINVAL, EPERM, or ENOMEM. These are the struct
+``bpf_attr`` setup arguments to use when creating a map via the
+syscall:
+
+.. code-block:: c
+
+ bpf(BPF_MAP_CREATE, &bpf_attr, sizeof(bpf_attr));
+
+Notice how this kernel ABI is extensible, as more struct arguments can
+easily be added later, because sizeof(bpf_attr) is passed along to the
+syscall.  This also implies that API users must clear/zero all
+sizeof(bpf_attr) bytes, as the compiler can size-align the struct
+differently, to avoid garbage data being interpreted as parameters by
+future kernels.
+
+The following configuration attributes are needed when creating the map:
+
+.. code-block:: c
+
+ union bpf_attr {
+  struct { /* anonymous struct used by BPF_MAP_CREATE command */
+         __u32   map_type;    /* one of enum bpf_map_type */
+         __u32   key_size;    /* size of key in bytes */
+         __u32   value_size;  /* size of value in bytes */
+         __u32   max_entries; /* max number of entries in a map */
+         __u32   map_flags;   /* prealloc or not */
+  };
+ }
+
+.. _bpf_cmd: http://lxr.free-electrons.com/ident?i=bpf_cmd
+
+
+Kernel sample/bpf ELF convention
+--------------------------------
+
+For programs under samples/bpf/, defining a map has been integrated
+with the ELF binary generated by LLVM.  This is purely one example of a
+userspace convention and not part of the kernel ABI.  It still invokes
+the bpf syscall.
+
+Maps are defined by declaring a ``struct bpf_map_def`` with the ELF
+section __attribute__ ``SEC("maps")`` in the xxx_kern.c file.
+The map file descriptors are available in the userspace xxx_user.c
+file via the global array variable ``map_fd[]``; the array index
+corresponds to the order in which the maps sections were defined in
+the ELF file built from xxx_kern.c.  Behind the scenes, the
+``load_bpf_file()`` call (from `samples/bpf/bpf_load`_) parses the ELF
+file compiled by LLVM, picks up the 'maps' section, and creates the
+maps via the bpf syscall.
+
+.. code-block:: c
+
+  struct bpf_map_def {
+   unsigned int type;
+   unsigned int key_size;
+   unsigned int value_size;
+   unsigned int max_entries;
+   unsigned int map_flags;
+  };
+
+  struct bpf_map_def SEC("maps") my_map = {
+   .type        = BPF_MAP_TYPE_XXX,
+   .key_size    = sizeof(u32),
+   .value_size  = sizeof(u64),
+   .max_entries = 42,
+   .map_flags   = 0
+  };
+
+.. section links
+
+.. _samples/bpf/bpf_load:
+   https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/samples/bpf/bpf_load.c
+
+Qdisc Traffic Control convention
+--------------------------------
+
+It is worth mentioning that qdisc TC (Traffic Control) also uses ELF
+files for defining maps, but with a different layout.  See man-page
+`tc-bpf(8)`_ and `tc bpf 

[net-next PATCH 4/4] doc/bpf: describe BCC the BPF Compiler Collection

2017-02-07 Thread Jesper Dangaard Brouer
It is worth mentioning BCC (BPF Compiler Collection) in order
to direct developers to that community.

Reviewed-by: Alexander Alemayhu <alexan...@alemayhu.com>
Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/bcc_tool_chain.rst |   37 ++
 Documentation/bpf/index.rst  |5 ++---
 2 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/bpf/bcc_tool_chain.rst

diff --git a/Documentation/bpf/bcc_tool_chain.rst b/Documentation/bpf/bcc_tool_chain.rst
new file mode 100644
index ..b721875065bc
--- /dev/null
+++ b/Documentation/bpf/bcc_tool_chain.rst
@@ -0,0 +1,37 @@
+=============================
+BCC (BPF Compiler Collection)
+=============================
+
+BCC is a toolkit to make eBPF programs easier to write, with
+front-ends in Python and Lua.  BCC requires LLVM and clang (in version
+3.7.1 or newer) to be available on target, because BCC programs do
+runtime compilation of the restricted-C code into eBPF instructions.
+
+BCC includes several useful tools_ and examples_, developed by
+recognized performance analyst `Brendan Gregg`_ and covered with a
+tutorial_ and slides_.
+
+.. _tools:
+   https://github.com/iovisor/bcc/tree/master/tools
+
+.. _examples:
+   https://github.com/iovisor/bcc/tree/master/examples
+
+.. _`Brendan Gregg`: http://www.brendangregg.com/
+
+.. _tutorial:
+   https://github.com/iovisor/bcc/blob/master/docs/tutorial.md
+
+.. _slides:
+   http://www.slideshare.net/brendangregg/linux-bpf-superpowers/43/
+
+The project maintains an overview of `eBPF supported kernels`_ and
+what versions got which specific features.  There is also a `BCC
+Reference Guide`_.
+
+.. _eBPF supported kernels:
+   https://github.com/iovisor/bcc/blob/master/docs/kernel-versions.md
+
+.. _BCC Reference Guide:
+   https://github.com/iovisor/bcc/blob/master/docs/reference_guide.md
+
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index 618a28f7e959..686cc33fffab 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -35,7 +35,7 @@ the `bpf(2)`_ syscall.
 
 This documentation is focused on the kernel tree's `samples/bpf/`_ and
 `tools/lib/bpf/`_.  It is worth mentioning that other projects exist,
-like BCC_, that has a slightly different user-facing
+like :doc:`bcc_tool_chain`, that has a slightly different user-facing
 syntax, but is interfacing with the same kernel facilities as those
 covered by this documentation.
 
@@ -44,6 +44,7 @@ covered by this documentation.
 
ebpf_maps
ebpf_maps_types
+   bcc_tool_chain
 
 .. links:
 
@@ -65,5 +66,3 @@ covered by this documentation.
 
 .. _Traffic control: http://man7.org/linux/man-pages/man8/tc-bpf.8.html
 
-.. _BCC: https://github.com/iovisor/bcc
-

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[net-next PATCH 3/4] doc/bpf: describes the different types of eBPF maps available

2017-02-07 Thread Jesper Dangaard Brouer
The purpose is to help choose the right map type based on the
individual use-case.

To start with, only BPF_MAP_TYPE_ARRAY is described. It is the plan
that all types should have descriptions here.

Signed-off-by: Jesper Dangaard Brouer <bro...@redhat.com>
---
 Documentation/bpf/ebpf_maps.rst   |3 +
 Documentation/bpf/ebpf_maps_types.rst |  119 +
 Documentation/bpf/index.rst   |1 
 3 files changed, 122 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/bpf/ebpf_maps_types.rst

diff --git a/Documentation/bpf/ebpf_maps.rst b/Documentation/bpf/ebpf_maps.rst
index b8808f3bc31c..a5ba49f07cf5 100644
--- a/Documentation/bpf/ebpf_maps.rst
+++ b/Documentation/bpf/ebpf_maps.rst
@@ -4,7 +4,8 @@ eBPF maps
 
 This document describes what eBPF maps are, how you create them
 (`Creating a map`_), and how to interact with them (`Interacting with
-maps`_).
+maps`_).  The different map types available are described here:
+:doc:`ebpf_maps_types`.
 
 Using eBPF maps is a method to keep state between invocations of the
 eBPF program, and allows sharing data between eBPF kernel programs,
diff --git a/Documentation/bpf/ebpf_maps_types.rst b/Documentation/bpf/ebpf_maps_types.rst
new file mode 100644
index ..82136efecb04
--- /dev/null
+++ b/Documentation/bpf/ebpf_maps_types.rst
@@ -0,0 +1,119 @@
+==================
+Types of eBPF maps
+==================
+
+This document describes the different types of eBPF maps available,
+and goes into detail about the individual map types.  The purpose is
+to help choose the right type based on the individual use-case.
+Creating and interacting with maps is described in another document
+here: :doc:`ebpf_maps`.
+
+The different types of maps available are defined by ``enum
+bpf_map_type`` in include/uapi/linux/bpf.h.  These type definition
+"names" are needed when creating the map.  Below is an example of
+``bpf_map_type``, but remember to `lookup latest`_ available maps in
+the source code.
+
+.. code-block:: c
+
+ enum bpf_map_type {
+   BPF_MAP_TYPE_UNSPEC,
+   BPF_MAP_TYPE_HASH,
+   BPF_MAP_TYPE_ARRAY,
+   BPF_MAP_TYPE_PROG_ARRAY,
+   BPF_MAP_TYPE_PERF_EVENT_ARRAY,
+   BPF_MAP_TYPE_PERCPU_HASH,
+   BPF_MAP_TYPE_PERCPU_ARRAY,
+   BPF_MAP_TYPE_STACK_TRACE,
+   BPF_MAP_TYPE_CGROUP_ARRAY,
+   BPF_MAP_TYPE_LRU_HASH,
+   BPF_MAP_TYPE_LRU_PERCPU_HASH,
+ };
+
+.. section links
+
+.. _lookup latest:
+   http://lxr.free-electrons.com/ident?i=bpf_map_type
+
+Implementation details
+======================
+
+In order to understand and follow the descriptions of the different
+map types, it is useful for the reader to understand how a map type is
+implemented by the kernel.
+
+On the kernel side, implementing a map type requires defining some
+function pointers via `struct bpf_map_ops`_.  The eBPF programs
+(and userspace) have access to the functions
+``map_lookup_elem``, ``map_update_elem`` and ``map_delete_elem``,
+which are invoked from eBPF via bpf-helpers in `kernel/bpf/helpers.c`_,
+or from userspace via the bpf syscall (as described in :doc:`ebpf_maps`).
+
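The function-pointer dispatch described above can be modelled in
userspace.  This is only a sketch under stated assumptions:
``struct demo_map_ops``, ``struct demo_map`` and the one-slot map type
are invented for illustration and are much simpler than the kernel's
``struct bpf_map_ops``.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_map;

/* Hypothetical, simplified counterpart of struct bpf_map_ops. */
struct demo_map_ops {
	void *(*map_lookup_elem)(struct demo_map *map, void *key);
	int (*map_update_elem)(struct demo_map *map, void *key, void *value);
	int (*map_delete_elem)(struct demo_map *map, void *key);
};

/* A toy map type that stores a single long and ignores the key. */
struct demo_map {
	const struct demo_map_ops *ops;
	long slot;
	int used;
};

static void *one_lookup(struct demo_map *map, void *key)
{
	(void)key;
	return map->used ? &map->slot : NULL;
}

static int one_update(struct demo_map *map, void *key, void *value)
{
	(void)key;
	map->slot = *(long *)value;
	map->used = 1;
	return 0;
}

static int one_delete(struct demo_map *map, void *key)
{
	(void)key;
	if (!map->used)
		return -ENOENT;
	map->used = 0;
	return 0;
}

const struct demo_map_ops demo_one_slot_ops = {
	.map_lookup_elem = one_lookup,
	.map_update_elem = one_update,
	.map_delete_elem = one_delete,
};

/* Generic entry points: callers never need to know the map type. */
void *demo_map_lookup_elem(struct demo_map *map, void *key)
{
	assert(map != NULL && map->ops != NULL);
	return map->ops->map_lookup_elem(map, key);
}

int demo_map_update_elem(struct demo_map *map, void *key, void *value)
{
	assert(map != NULL && map->ops != NULL);
	return map->ops->map_update_elem(map, key, value);
}

int demo_map_delete_elem(struct demo_map *map, void *key)
{
	assert(map != NULL && map->ops != NULL);
	return map->ops->map_delete_elem(map, key);
}
```

A new map type in this model only has to supply its own ops table; the
generic entry points stay unchanged.
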
+:ref:`Creating a map` requires supplying the following configuration
+attributes: map_type, key_size, value_size, max_entries and map_flags.
+
+.. section links
+
+.. _struct bpf_map_ops: http://lxr.free-electrons.com/ident?i=bpf_map_ops
+
+.. _kernel/bpf/helpers.c:
+   https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/kernel/bpf/helpers.c
+
+
+BPF_MAP_TYPE_ARRAY
+==================
+
+Implementation defined in `kernel/bpf/arraymap.c`_ via struct
+bpf_map_ops `array_ops`_.
+
+As the name ``BPF_MAP_TYPE_ARRAY`` indicates, this can be seen as an
+array.  All array elements are pre-allocated and zero-initialized at
+init time.  The key is an index into the array and must be exactly
+4 bytes (32-bit).  The constant size is defined by ``max_entries``.
+This init-time constant also implies that bpf_map_delete_elem
+(`array_map_delete_elem`_) is an invalid operation.
+
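The array semantics described above (pre-allocated, zero-initialized,
index keys, no delete) can be illustrated with a small userspace model.
It is a sketch only: ``struct demo_array_map`` and the ``demo_array_*``
functions are hypothetical names, the values are fixed at 64 bits for
brevity, and none of this is the code in kernel/bpf/arraymap.c.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical userspace model of an array map. */
struct demo_array_map {
	uint32_t max_entries;
	uint64_t *values;	/* max_entries slots, zeroed at creation */
};

/* Allocate and zero every element up front. Returns 0 or -errno. */
int demo_array_init(struct demo_array_map *map, uint32_t max_entries)
{
	assert(map != NULL);
	if (max_entries == 0)
		return -EINVAL;
	map->values = calloc(max_entries, sizeof(*map->values));
	if (!map->values)
		return -ENOMEM;
	map->max_entries = max_entries;
	return 0;
}

/* The key is a 32-bit index; out-of-range keys yield NULL. */
uint64_t *demo_array_lookup(struct demo_array_map *map, const uint32_t *key)
{
	assert(map != NULL && key != NULL);
	if (*key >= map->max_entries)
		return NULL;
	return &map->values[*key];
}

/* Elements always exist, so deleting one is rejected. */
int demo_array_delete(struct demo_array_map *map, const uint32_t *key)
{
	(void)map;
	(void)key;
	return -EINVAL;
}

void demo_array_free(struct demo_array_map *map)
{
	free(map->values);
	map->values = NULL;
	map->max_entries = 0;
}
```

A lookup of any valid index succeeds immediately after creation and
reads zero, which is why no separate "insert" step is needed for an
array map.
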
+The array map is optimized for the fastest possible lookup.  The size
+is constant for the life of the eBPF program, which allows the
+verifier+JIT to perform a wider range of optimizations.  E.g.
+`array_map_lookup_elem()`_ may be 'inlined' by the JIT.
+
+A small size gotcha: the ``value_size`` is rounded up to a multiple of
+8 bytes.
+
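The rounding gotcha is plain arithmetic and can be sketched as below.
The helper names ``demo_round_up_8()`` and ``demo_array_value_bytes()``
are made up for this example, and the byte count deliberately ignores
any per-map header or bookkeeping overhead.

```c
#include <assert.h>
#include <stdint.h>

/* Round a size up to the next multiple of 8 (8 is a power of two). */
uint32_t demo_round_up_8(uint32_t size)
{
	assert(size <= UINT32_MAX - 7u);	/* avoid wrap-around */
	return (size + 7u) & ~UINT32_C(7);
}

/* Bytes used by the element storage alone for a given map geometry. */
uint64_t demo_array_value_bytes(uint32_t value_size, uint32_t max_entries)
{
	return (uint64_t)demo_round_up_8(value_size) * max_entries;
}
```

Under this model a 4-byte value and an 8-byte value both occupy 8 bytes
per element, so 256 entries take 2048 bytes of element storage either
way.
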
+Example usage BPF_MAP_TYPE_ARRAY, based on `samples/bpf/sockex1_kern.c`_:
+
+.. code-block:: c
+
+  struct bpf_map_def SEC("maps") my_map = {
+   .type = BPF_MAP_TYPE_ARRAY,
+   .key_size = sizeof(u32),
+   .value_size = sizeof(long),
+   .max_entries = 256,
+  };
+
+  u32 index = 42;
+  long *value;
+  value = bpf_map_lookup_elem(&my_map, &index);
+  if (value)
+   __sync_fetch_and_add(value, 1);
+
+The lookup (from kernel side) ``bpf_map_lookup_elem()`` returns a pointer
+into the array element.  To avoid data races with userspace reading
+the value, the API-user must use primitives like ``__sync_fetch_and_add()``
+when u

Re: XDP (eXpress Data Path) documentation

2016-09-22 Thread Jesper Dangaard Brouer

On Tue, 20 Sep 2016 11:08:44 +0200 Jesper Dangaard Brouer <bro...@redhat.com> wrote:

> As promised, I've started documenting the XDP eXpress Data Path):
> 
>  [1] 
> https://prototype-kernel.readthedocs.io/en/latest/networking/XDP/index.html
> 
> IMHO the documentation have reached a stage where it is useful for the
> XDP project, BUT I request collaboration on improving the documentation
> from all. (Native English speakers are encouraged to send grammar fixes ;-))

I want to publicly thank Edward Cree for being the first contributor
to the XDP documentation, with wording and grammar fixes.

Pulled and pushed:
 https://github.com/netoptimizer/prototype-kernel/commit/fb6a3de95

Thanks!
-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


Re: XDP (eXpress Data Path) documentation

2016-09-22 Thread Jesper Dangaard Brouer
On Wed, 21 Sep 2016 17:03:24 -0700
Tom Herbert <t...@herbertland.com> wrote:

> On Tue, Sep 20, 2016 at 2:08 AM, Jesper Dangaard Brouer
> <bro...@redhat.com> wrote:
> > Hi all,
> >
> > As promised, I've started documenting the XDP eXpress Data Path):
> >
> >  [1] 
> > https://prototype-kernel.readthedocs.io/en/latest/networking/XDP/index.html
> >
> > IMHO the documentation have reached a stage where it is useful for the
> > XDP project, BUT I request collaboration on improving the documentation
> > from all. (Native English speakers are encouraged to send grammar fixes ;-))
> >  
> Hi Jesper,
> 
> Thanks for taking the initiative on the this, The document reads more
> like a design doc than description right now, that's probably okay
> since we could use a design doc.

Yes, I fully agree.

I want to state very clearly: this document is not an attempt to hijack
the XDP project and control the "spec".  This is an attempt to collaborate.
We discuss things on the mailing list, each with our own vision of the
project, and most of the time we reach an agreement.  But nobody documents
this agreement.

Months later, we make implementation choices that go against these
agreements, because we simply forgot.  If someone remembers, we have to
reiterate the same arguments again (as just happened with the
XDP_ABORTED action, my mistake).  And can anybody remember the
consensus around VLANs?  It never got implemented that way...

I had to start the doc project somewhere, so I dumped my own vision
into the docs, and what I could remember from upstream discussions.
I need collaboration from others to adjust and "fix" my vision of the
project, into something that becomes a common ground we all can agree
on.

If some part of the docs provokes you, good, then you have a reason to
correct and fix it.  I'll do my best to keep a very open mind about
any changes.  This should be a very "live" document.


> Under "Important to understand" there are some disclaimers that XDP
> does not implement qdiscs or BQL and fairness otherwise. This is true
> for it's own traffic, but it does not (or at least should not) affect
> these mechanisms or normal stack traffic running simultaneously. I
> think we've made assumptions about fairness between XDP and non-XDP
> queues, we probably want to clarify fairness (and also validate
> whatever assumptions we've made with testing).

I love people pointing out mistakes in the documentation.  I want to
update it ASAP when people point them out.  I'll even do the work of
integrating and committing these changes, for people too lazy to git
clone the repo themselves.

For your section, Tom, I fully agree, but I don't know how to formulate
and adjust this in the text.

For people that want to edit the docs, notice the link "Edit on GitHub"
which takes you directly to the file you need to edit...



> > You wouldn't believe it: But this pretty looking documentation actually
> > follows the new Kernel documentation format.  It is actually just
> > ".rst" text files stored in my github repository under kernel/Documentation 
> > [2]
> >
> >  [2] 
> > https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/Documentation
> >
> > Thus, just git clone my repository and started editing and send me
> > patches (or github pull requests). Like:
> >
> >  $ git clone https://github.com/netoptimizer/prototype-kernel
> >  $ cd prototype-kernel/kernel/Documentation/
> >  $ make html
> >  $ firefox _build/html/index.html &
> >
> > This new documentation format combines the best of two worlds, pretty
> > online browser documentation with almost plain text files, and changes
> > being tracked via git commits [3] (and auto git hooks to generate the
> > readthedocs.org page). You got to love it! :-)
> >
> > --
> > Best regards,
> >   Jesper Dangaard Brouer
> >   MSc.CS, Principal Kernel Engineer at Red Hat
> >   Author of http://www.iptv-analyzer.org
> >   LinkedIn: http://www.linkedin.com/in/brouer
> >
> > [3] https://github.com/netoptimizer/prototype-kernel/commits/master  

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer


Re: [iovisor-dev] XDP (eXpress Data Path) documentation

2016-09-21 Thread Jesper Dangaard Brouer
On Tue, 20 Sep 2016 19:47:07 -0700
Alexei Starovoitov <alexei.starovoi...@gmail.com> wrote:

> On Tue, Sep 20, 2016 at 11:08:44AM +0200, Jesper Dangaard Brouer via 
> iovisor-dev wrote:
> > Hi all,
> > 
> > As promised, I've started documenting the XDP eXpress Data Path):
> > 
> >  [1] 
> > https://prototype-kernel.readthedocs.io/en/latest/networking/XDP/index.html
> > 
> > IMHO the documentation have reached a stage where it is useful for the
> > XDP project, BUT I request collaboration on improving the documentation
> > from all. (Native English speakers are encouraged to send grammar fixes ;-))
> > 
> > You wouldn't believe it: But this pretty looking documentation actually
> > follows the new Kernel documentation format.  It is actually just
> > ".rst" text files stored in my github repository under kernel/Documentation 
> > [2]
> > 
> >  [2] 
> > https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/Documentation
> >   
> 
> Thanks so much for doing it. This is great start!
> Some minor editing is needed here and there.
> To make it into official doc do you mind preparing a patch for Jon's doc tree 
> ?
> If you think the doc is too volatile and not suitable for kernel.org,
> another alternative is to host it on https://github.com/iovisor
> since it's LF collaborative project it won't disappear suddenly.
> You can be a maintainer of that repo if you like.

I do see this as kernel documentation that eventually should end up in
Jon's doc tree.  Right now it is too volatile.  Once XDP has
"stabilized" some more, I plan to push/submit the *relevant* pieces
into the kernel.  E.g. I plan to have a "proposals" section, which is
not meant for the upstream doc, as it is an intermediate specification
step before implementing, after which it should move to another doc
section.  Likewise, some of the use-case documents might be "rejected"
before reaching the upstream doc.

The reason I've not created a separate repository for the XDP doc only
is that I also plan to document other parts of the kernel in this
repo[3], not just XDP.  Like my page_pool work.  The documentation is
not really official documentation before it reaches a kernel git tree,
together with the code it documents.

I hope this approach can help us document while developing, and
turn email discussions into specifications.  Like I forgot the
XDP_ABORTED not warning argument, which you had to re-iterate, but now
it is documented[4][5].

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

[3] https://github.com/netoptimizer/prototype-kernel/
[4] https://github.com/netoptimizer/prototype-kernel/commit/a4e60e2d7a894
[5] https://prototype-kernel.readthedocs.io/en/latest/networking/XDP/implementation/userspace_api.html#troubleshooting-and-monitoring


XDP (eXpress Data Path) documentation

2016-09-20 Thread Jesper Dangaard Brouer
Hi all,

As promised, I've started documenting XDP (eXpress Data Path):

 [1] https://prototype-kernel.readthedocs.io/en/latest/networking/XDP/index.html

IMHO the documentation has reached a stage where it is useful for the
XDP project, BUT I request collaboration on improving the documentation
from all. (Native English speakers are encouraged to send grammar fixes ;-))

You wouldn't believe it, but this pretty-looking documentation actually
follows the new kernel documentation format.  It is just ".rst" text
files stored in my github repository under kernel/Documentation [2]

 [2] https://github.com/netoptimizer/prototype-kernel/tree/master/kernel/Documentation

Thus, just git clone my repository, start editing, and send me
patches (or github pull requests). Like:

 $ git clone https://github.com/netoptimizer/prototype-kernel
 $ cd prototype-kernel/kernel/Documentation/
 $ make html
 $ firefox _build/html/index.html &

This new documentation format combines the best of two worlds, pretty
online browser documentation with almost plain text files, and changes
being tracked via git commits [3] (and auto git hooks to generate the
readthedocs.org page). You got to love it! :-)

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  Author of http://www.iptv-analyzer.org
  LinkedIn: http://www.linkedin.com/in/brouer

[3] https://github.com/netoptimizer/prototype-kernel/commits/master