from:"Harald Welte"

Re: [PATCH v2 net-next 2/4] net: add skeleton of bpfilter kernel module

2018-05-07 Thread Harald Welte

Hi Alexei + netdev list,

On Wed, May 02, 2018 at 09:36:02PM -0700, Alexei Starovoitov wrote:
> Later bpfilter_process_sockopt() will be called from bpfilter hooks
> in get/setsockopt() to pass iptable commands into umh via bpfilter.ko

This is a part I'm quite heavily opposed to - at least at this point.
Unless bpfilter offered something that is semantically compatible to
what netfilter/iptables is currently implementing, I don't think
bpfilter should be [allowed to] overriding the iptables
{get,set}sockopt() calls.

I appreciate that people are working on a different architecture packet
filter than what we used to.   I also understand that there is a need
for backwards compatibility.  I still think it's wrong to offer that
compatibility on the {set,get}sockopt level, rather than on the
"iptables command line utility replacement" level.  But nevermind, you
guys have a different opinion on that, on which we can agree to
disagree.

However, no matter what you do, the most important part from the user
point of view is to make sure you don't break semantics.

netfilter/iptables semantics have an intricate notion abut when which
chain of which table is executed, in which order, at what particular
point of the packet traversal during the network stack.  The packet
filtering rulesets that people have created over more than 18 years
are based on those semantics.  If you offer the same interface, but not
that very same semantics, the packet filtering policies can an will
break - and they will break so in a hidden way.  To the user, it appears
as if the ruleset is loaded with the assumed semantics, but in reality
it isn't.

Unless you can replicate those semantics 1:1, I think it is not only
wrong to override the iptables sockopt interface, but it's outright
dangerous.

Having less matches/targets implemented than original iptables is
something that I believe is acceptable (and inevitable, at least in the
beginning).  If somebody tries to load a related ruleset with bpfilter
active, it will fail gracefully and the user can chose to not use that
match/target in his ruleset, or to not use bpfilter.

But if the ruleset loads but behaves different than before (because e.g.
it's executed from a completely different place in the stack), that's
IMHO an absolute no-go that must be avoided at all cost.  If that's the
case, you are actively breaking network security, rather than creating
it.

So I think there's only two ways to go:

a) replicate the exact semantics/order of the filter/mangle/raw/...
   tables and their chains, both among themselves as well as in terms of
   ordering with other parts of the network stack, or

b) not use the existing tables/chains with their pre-defined semantics
   but rather start new 'tables' which can then have different semantics
   as defined at the time of their implementation.

My apologies if I misunderstood something about bpfilter.  Feel free to
correct me where I'm wrong.  Thanks.

Regards,
    Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-19 Thread Harald Welte

Hi David,

On Mon, Feb 19, 2018 at 12:29:08PM -0500, David Miller wrote:
> People with an Android phone in their pocket is using iptables, and
> the overhead and performance of those rules really does matter.  It
> determines how long your battery life is, etc.

I am not the android expert.  However, I just dumped the ruleset on my
Galaxy Tab S2 (Android 7.1.2 / LineageOS), and it was a whooping 91
rules across all tables.  The longest chain interation I could spot
was 24 rules.  That's not the kind of ruleset where I would expect
performance worries.

And if there was, nftables is around for quite some time and would be
much faster.

Sure, that was just one tablet, but I wonder how much Android packet
filter performance issue there are.  Would be interesting to hear about
those (and on whether they benchmarked against nftables).

> > I can just as well ask how many millions of users / devices are
> > already using eBPF or XDP?
> 
> Every time someone connects to a major provider, they are using it.

I was speaking of actual *users* as in indiiduals running their own
systems, companies running their own servers/datacenter.  The fact that
some ISP (or its supplier) decisdes that one of my IP packets is routed
via a smartnic with XDP offloading somewhere is great, but still doesn't
turn me into a "user" of that technology.  Not in my linke of thinking,
at least.

> And by in large, for system tracing and analysis eBPF is basically
> a hard requirement for people doing anything serious these days.

That's great, but misses the point.  I was referring to usage in the
context of the kernel network stack.  Sorry for not being explicit
enough.

Also, the entire point was about "new technologies need time to be
adopted widely".  Doesn't matter which new kernel feature that is.

Sure, one data center / hosting / "cloud" provider can quickly roll out
a change in their network.  But I'm referring to significant,
(Linux-)industry-wide adoption.  That would first include major
distributions to include/enable/support the feature, and then people
actually building their systems/products/software on top of those.

> Please see the wonderful work by Brendan Gregg and others which has
> basically made the GPL'ing of DTrace by Oracle entirely irrelevant and
> our Linux's tracing infrastructure has become must more powerful and
> capable thanks to eBPF.

Agreed.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-19 Thread Harald Welte

Hi David,

On Mon, Feb 19, 2018 at 10:31:39AM -0500, David Miller wrote:
> > Why is it practical to replace your kernel but not practical to replace
> > a small userspace tool running on top of it?
> 
> The container is just userspace components.  Those are really baked in
> and are never changing.

never until you have to apply a bug fix for any of the many components you bake
into it.  I am doing this on an (at least) weekly basis for my Docker 
containers.
That's no different from a classic Linux distribution where you update your 
apt/rpm
packages all the time.

A container that is static and cannot continuously updated with new versions
for security (and other) fixes is broken by design.  If some people are doing
this, they IMHO have no sense of IT security, and such usage pattersn are not
what kernel development should cite as primary use case (again IMHO).

> This is how cloud hosting environments work.

Yes, *one* particular use case.  By far not every use case of Linux, or
Linux packet filtering.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-19 Thread Harald Welte

Hi David,

On Mon, Feb 19, 2018 at 10:36:51AM -0500, David Miller wrote:

> nftables has been proported as "better" for years, yet large
> institutions did not migrate to it.  In fact, they explicitly
> disabled NFTABLES in their kernel config.

It's like with any migration.  People were using ipchains for a long
time even after iptables existed.  Many people simply don't care
about packet filter performance.  It's only a small fraction of their
entire CPU workload, so probably not worth optimzing.  For dedicated
firewall devices, that's of course a different story.

How long did it take for the getrandom() system call to be actually used
by applications [even glibc!]?  Or many other things that get introduced
in the kernel?

I can just as well ask how many millions of users / devices are already
using eBPF or XDP? How many major Linux distributions are enabling
and/or supporting this yet?  I'm not criticizing, I'm just attempting
to illustrate that technologies always take time to establish
themselves - and of course those people with the biggest benefit (and
knowing about it) will be the early adopters, while many others have no
motivation to migrate.

> In my opinion, any resistence to integration with eBPF and XDP will
> lead to even less adoption of netfilter as a technology.

1) I may not have made my point clear, sorry.  I have not argued against
   any integration with eBPF, I have just made some specific arguments
   against specific aspects of the current RFC.

2) You have indicated repeatedly that there are millions and millions of
   netfilter/iptables users out there.  So I fail to see the "even less
   adoption" part.  "Even less" than those millions and millions? SCNR.

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-19 Thread Harald Welte

Dear David,

On Mon, Feb 19, 2018 at 10:27:27AM -0500, David Miller wrote:
> > Would you be willing to merge nftables into kernel tools directory
> > then?
> 
> Did you miss the part where I explained that people explicitly disable
> NFTABLES in their kernel configs in most if not all large datacenters?

If people to chose to disable a certain feature, then that is their own
decision to do so.  We should respect that decision.  Clearly they seem
to have no interest in a better or more featureful packet filter, then.

I mean, it's not like somebody proposes to implement NTFS inside the FAT
filesystem kernel module because distributors (or data centers) tend to
disable the NTFS module?!

How is kernel development these days constrained by what some users may
or may not put in their Kconfig?  If they want a given feature, they
must enable it.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-19 Thread Harald Welte

Hi David,

On Mon, Feb 19, 2018 at 10:13:35AM -0500, David Miller wrote:

> Florian, first of all, the whole "change the iptables binary" idea is
> a non-starter.  For the many reasons I have described in the various
> postings I have made today.
> 
> It is entirely impractical.

Why is it practical to replace your kernel but not practical to replace
a small userspace tool running on top of it?

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-19 Thread Harald Welte

Hi David,

On Mon, Feb 19, 2018 at 09:44:51AM -0500, David Miller wrote:
> I see talk about "just replacing the iptables binary".
> 
> A long time ago in a galaxy far far away, that would be a reasonable
> scheme.  But that kind of approach won't work in the realities of
> today.
>
> You aren't going to be able to replace the iptables binary in the tens
> of thousands of container images out there, nor the virtualization
> installations either.

I appear to have been under the impression that the entire movement to
DevOps and automatic provisioning of containers/nodes/pods/Vms with
puppet, ansible, Dockerfiles & Co is to be *more* agile in deployments,
rather than less.

If you cannot even rebuild your thousands of container images with
updated binaries, then what is this all worth?  You need to be able to
update the iptables (or any other) binary in case there's an important
(security or otherwise) bug that needs fixing.  I don't see this any
different.

Also, as long as legacy ip_tables/x_tables is still in the kernel, you
can still run your old userspace against that old implementation in the
kernel.  Nobody forces you to use anything else [for another decade or
so].  Just if you want to take advantage of new more
modular/performant/... things like nftables or an eBPF backend, then you
would have to go that extra mile.

I don't think the kernel (network) developers should burden themselves
with too many things.  There's sufficient on their plate as-is.  So 
* if there's some new system (nftables, bpfilter, ...)
* and some documented migration paths for the vast majority of the use
  cases (replacing iptables binaries with a compat wrapper)
* and the old system continues to work as-is (x_tables kernel code stays for
  several more years)

Then people who care about the new features or performance will migrate
to the new system.  And those who don't care stay with the old system -
which is not a problem as they clearly wouldn't need the new system
anyway.

> Like it or not iptables ABI based filtering is going to be in the data
> path for many years if not a decade or more to come.  

I beg to differ.  For some people, yes.  but then, as Florian points
out, they can just as well use the existing x_tables kernel code.  If
they want something better, they can either replace their iptables
program with xtables-compat from nftables, or whatever else might
exist for eBPF support.

> iptables is a victim of it's own success, like it or not :-) Yes, the
> ABI is terrible, but obviously it was useful enough for lots of
> people.

and it continues to do so.  I just don't think it is a great idea
to kludge any new packet filter against such an arcane uapi.

> Therefore it behooves us to accept this reality and align the data
> path generated to match what the rest of the kernel is moving towards
> and that is eBPF and XDP.

This argument is unrelated to the question of the uapi. I'm not arguing
against an eBPF backend/implementation for packet filtering.  It's more
a question of _how_.

> Furthrmore, on a long term maintainence perspective, it means that
> every data path used by the kernel for iptables will be fully verified
> by the eBPF verifier.  This means that the iptables data path will be
> guaranteed to never get into a loop, access out of bounds data, etc.
> 
> That to me is real power, and something we should pursue.

Once again, both not related to the question of the uapi.

> I know you can't see how offloading is possible, but I hope
> are some further discussion you can see how that might work.

I'm looking forward to that point.

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-19 Thread Harald Welte

Hi Daniel,

On Mon, Feb 19, 2018 at 01:03:17PM +0100, Daniel Borkmann wrote:
> Hi Harald,
> 
> On 02/17/2018 01:11 PM, Harald Welte wrote:
> [...]
> >> As rule translation can potentially become very complex, this is performed
> >> entirely in user space. In order to ease deployment, request_module() code
> >> is extended to allow user mode helpers to be invoked. Idea is that user 
> >> mode
> >> helpers are built as part of the kernel build and installed as traditional
> >> kernel modules with .ko file extension into distro specified location,
> >> such that from a distribution point of view, they are no different than
> >> regular kernel modules. 
> > 
> > That just blew my mind, sorry :)  This goes much beyond
> > netfilter/iptables, and adds some quiet singificant new piece of
> > kernel/userspace infrastructure.  To me, my apologies, it just sounds
> > like a quite strange hack.  But then, I may lack the vision of how this
> > might be useful in other contexts.
> 
> Thought was that it would be more suitable to push all the complexity of
> such translation into user space [...]

Sure, you have no complaints from my side about that goal.  I'm just not
sure if turning the kernel module loader into a new mechanism to start
userspace processes is.  I guess that's a question that the people
involved with core kernel code and module loader have to answer.  To me
it seems like a very lng detour away from the actual topic (packet
filtering).

> Given normal user mode helpers make this rather painful since they
> need to be shipped as extra package by the various distros, the idea
> was that the module loader back end could treat umh similarly as
> kernel modules and hook them in through request_module() approach
> while still operating out of user space. In any case, I could image
> this approach might be interesting and useful in general also for
> other subsystems requiring umh in one way or another.

I completely agree this approach has some logic to it.  I just think the
approach taken is *very* different from what has been traditionally done
in the Linux world.  All sorts of userspace programs to configure kernel
features (iptables being one of them iproute2, etc.) have always been
distributed as separate/independent application programs, which are
packaged separately, etc.

Making the kernel source tree build such userspace utilities and
executing them in a new fashion via the kernel module loaders are to me
two quite large conceptual changes on how "Linux works", and I believe
you will have to "sell" this idea to many people outside the kernel
networking communit, i.e. core kernel developers, people who do
packaging, etc.

I'm not saying I'm fundamentally opposed to it.  Will be curious to see
how the wider kernel community thinks of that architecture.

> Right, having a custom iptables, libiptc or LD_PRELOAD approach would work
> as well of course, but it still wouldn't address applications that have
> their own custom libs programmed against iptables uapi directly or those
> that reused a builtin or modified libiptc directly in their application.

How many of those wide-spread applications are you aware of?  The two
projects you have pointed out (docker and kubernetes) don't. As the
assumption that many such tools would need to be supported drives a lot
of the design decisions, I would argue one needs a solid empircal basis.

Also, the LD_PRELOAD wrapper *would* work with all those programs.  Only
the iptables command line replacement wouldn't catch those.

> Such requests could only be covered transparently by having a small shim
> layer in kernel and it also wouldn't require any extra packages from distro
> side.

What is wrong with extra packages in distributions?  Distributions also
will have to update the kernel to include your new code, so they could
at the same time use a new iptables (or $whatever) package.  This is
true for virtually all new kernel features.  Your userland needs to go
along with it, if it wants to use those new features.

> > Some of those can be implemented easily in BPF (like recomputing the
> > checksum or the like).   Some others I would find much more difficult -
> > particularly if you want to off-load it to the NIC.  They require access
> > to state that only the kernel has (like 'cgroup' or 'owner' matching).
> 
> Yeah, when it comes to offloading, the latter two examples are heavily tied
> to upper layers of the (local) stack, so for cases like those it wouldn't
> make much sense, but e.g. matches, mangling or forwarding based on packet
> data are obvious candidates that can already be offloaded today in a
> flexible and programmable manner all with existing BPF infra, so for those
> it could definitely be highly interesting to m

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-17 Thread Harald Welte

ctionality traditionally done
> by kernel modules into user space processes (either root or !root) and
> reduces security attack surface of such new code, meaning in case of
> potential bugs only the umh would crash but not the kernel. Another
> advantage coming with that would be that bpfilter.ko can be debugged and
> tested out of user space as well (e.g. opening the possibility to run
> all clang sanitizers, fuzzers or test suites for checking translation).
> Also, such architecture makes the kernel/user boundary very precise,
> meaning requests can be handled and BPF translated in control plane part
> in user space with its own user memory etc, while minimal data plane
> bits are in kernel. 

I understand that it has advantages to have the compiler in userspace.
But then, why first send your rules into the kernel and back?

> In the implemented proof of concept we show that simple /32 src/dst IPs
> are translated in such manner. 

Of course this is the first that one starts with.  However, as we all
know, iptables was never very good or efficient about 5-tuple matching.
If you want a fast implementation of this, you don't use iptables which
does linear list iteration.  The reason/rationale/use-case of iptables
is its many (I believe more than 100 now?) extensions both on the area
of matches and targets.

Some of those can be implemented easily in BPF (like recomputing the
checksum or the like).   Some others I would find much more difficult -
particularly if you want to off-load it to the NIC.  They require access
to state that only the kernel has (like 'cgroup' or 'owner' matching).

> In the below example, we show that dumping, loading and offloading of
> one or multiple simple rules work, we show the bpftool XDP dump of the
> generated BPF instruction sequence as well as a simple functional ping
> test to enforce policy in such way.

Could you please clarify why the 'filter' table INPUT chain was used if
you're using XDP?  AFAICT they have completely different semantics.

There is a well-conceived and generally understood notion of where
exactly the filter/INPUT table processing happens.  And that's not as
early as in the NIC, but it's much later in the processing of the
packet.

I believe _if_ one wants to use the approach of "hiding" eBPF behind
iptables, then either

a) the eBPF programs must be executed at the exact same points in the
   stack as the existing hooks of the built-in chains of the
   filter/nat/mangle/raw tables, or

b) you must introduce new 'tables', like an 'xdp' table which then has
   the notion of processing very early in processing, way before the
   normal filter table INPUT processing happens.

> Feedback very welcome!

Thanks.  Despite being a former netfilter core team member, I'm trying
to look at this as neutral as possible.  So please don't perceive my
comments as overly defensive or the like.

My main points are:

1) What is the goal of this?

2) Why iptables and not nftables?

3) If something looks like existing iptables, it must behave *exactly*
   like existing iptables, otherwise it is prone to break users security
   in subtle and very dangerous ways.

Looking forward to the following discussion and on other points of view.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-17 Thread Harald Welte

Hi Daniel,

On Fri, Feb 16, 2018 at 09:44:01PM +0100, Daniel Borkmann wrote:
> We started out with the
> iptables part in the demo as the majority of bigger infrastructure projects
> all still rely heavily on it (e.g. docker, k8s to just name two big ones).

docker is exec'ing the iptables command line program.  So one could simply
offer a syntactically compatible userspace replacement that does the compilation
in userspce and avoid the iptables->libiptc->setsockopt->userspace roundtrip
and the associated changes to the kernel module loader you introduced.

kubernetes is using iptables-restore, which is part of iptables and
again has the same syntax.  However, it aovids the per-rule fork+exec
overhead, which is why the netfilter project has been recommending it to
be used in such situations.

Do you have a list of known projects that use the legacy sockopt-based
iptables uapi directly, without using code from the iptables.git
codebase (e.g. libiptc, iptables or iptables-restore)?  IMHO only
those projects would benefit from the approach you have taken vs. an
approach that simply offers a compatible commandline syntax.

> Usually they have their requests to iptables baked into their code directly
> which probably won't change any time soon, so thought was that they could
> benefit initially from it once there would be sufficient coverage.

If the binary offeers the same syntax (it could even be a fork/version
of the iptables codebase, only using the parsing without the existing
backend generating the ruleS), the same goal could be achieved.

The above of course assumes that you have a 100% functional replacement
(for 100% of the features that your use cases use) underneath the
"iptables command syntax" compatibility.  But you need that in both
cases, whether you use the existing userspace api or not.

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH RFC 0/4] net: add bpfilter

2018-02-17 Thread Harald Welte

Hi David,

On Fri, Feb 16, 2018 at 05:33:54PM -0500, David Miller wrote:
> From: Florian Westphal <f...@strlen.de>
> 
> > Any particular reason why translating iptables rather than nftables
> > (it should be possible to monitor the nftables changes that are
> >  announced by kernel and act on those)?
> 
> As Daniel said, iptables is by far the most deployed of the two
> technologies.  Therefore it provides the largest environment for
> testing and coverage.

As I outlined earlier, this way you are perpetuating the architectural
mistakes and constraints that were created ~ 18 years ago without any
benefit from the lessons learned ever since.  In netfilter, we already
wanted to replace it as early as 2006 (AFAIR) with nfnetlink based
pkttables (which never materialized).

I would strongly suggest to focus on nftables (or even some other way of
configuration / userspace interaction) to ensure that the iptables
userspace interface can at some point be phased out eventually.  Like we
did with ipchains before, and before that with ipfwadm.

By making a new implementation dependant on the oldest interface you are
perpetuating it.  Sure, one can go that way, but I would suggest this to
be a *very* carefully weighed decision after a detailed
analysis/discusison.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH v7 net-next 00/13] gtp: Additional feature support - Part I

2017-11-12 Thread Harald Welte

Hi Tom,

sorry for the delayed response.  But I remain committed in pushing
the non-controversial part of your GTP patches forward.

On Sat, Oct 28, 2017 at 06:47:59PM +0200, Harald Welte wrote:
> Thanks.  As indicated, I'm planning some testing later this weekend on
> the non-IPv6 patches, and am happy to add my Acked-by and/or re-submit
> those to Dave after that.

After some more delays and returning from netdev 2.2, I've finally put
together a testing setup and successfully (manually) tested with the
following patches:

01/13 vxlan: Move gro_cells_init to ndo_init
02/13 iptunnel: Add common functions to get a tunnel route
04/13 gtp: Call common functions to get tunnel routes and add dst_cache
05/13 iptunnel: Generalize tunnel update pmtu
06/13 gtp: Change to use gro_cells
07/13 gtp: Use goto for exceptions in gtp_udp_encap_recv funcs
08/13 gtp: udp recv clean up
09/13 gtp: Call function to update path mtu
10/13 gtp: Eliminate pktinfo and add port configuration

I hereby acknowledge those patches.  How should we proceed?  Should I

a) do nothing, you will add Acked-By and re-submit?

b) send an individual Acked-By in a reply to each related patch here on
   netdev and you will re-submit those patches?

c) simply create a rebased set from those patches and
   re-submit them to the list for net-next myself, with the Acked-by?

d) be preposterous and provide a gtp git tree for DaveM to pull from?

As discussed before, I will not merge/ack IPv6 will until we have an
implementation that is interoperable.  I have a TODO list of other
bugfixes and improvements for Kernel GTP, but I'm hopeful that IPv6 can
still be addressed before the end of 2017.

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH v7 net-next 00/13] gtp: Additional feature support - Part I

2017-10-28 Thread Harald Welte

Hi Tom,

On Sat, Oct 28, 2017 at 09:16:01AM -0700, Tom Herbert wrote:
> Here is what the Kconfig for the EXPERIMENTAL option says:
> 
> "This is an experimental implementation that allows encapsulating IPv6
> over GTP and using GTP over IPv6 for testing and development purposes.
> This is not a standards conformant implementation for IPv6 and GTP.
> More work is needed to reach that level."
> 
> I don't see any ambiguity here about it not being standards complete.
> Nor is there any ambiguity about the its purpose to enable further
> development and the fact that more work is needed.

As stated repeatedly: I have no issue with an *incomplete* implementation,
but I have a problem with an *incompatible* one that takes left turns
where the spec takes right turns.

An *incomplete* implementation could still interoperate with other
implementations but is e.g. missing some optional bits.  It can later
extended with those missing bits and wills stay compatible to users of
the incomplete as well as to any later complete implementation.

An *incompatible* implementation is what I have issues with.

> This a foundation for an IPv6 datapath and is sufficient to do
> benchmarking and performance to determine the prospects of replacing
> proprietary HW with commodity servers running Linux kernel. 

I completely agree with that.

> This is a forward step to get IPv6 into GTP, and frankly the _only_ code that
> has been proposed. There is no reason why someone can't build upon
> this to make a first rate conformant implementation.

I also agree with this.  However, I don't think it makes sense to have it in
the kernel given it implements something that's actually not GTP as per
the relevant specs.  No matter how many disclaimers you put at it,
people will still assume it's GTP if it's called GTP.  And if it's only
useful for benchmarking the poential of a later proper IPv6
implementation, I don't think it should go in.

> In any case, I've invested as much time in this as I can for now. I'll
> leave it up to DaveM to decide if we wants to take all, none, or some
> subset of these patches.

Thanks.  As indicated, I'm planning some testing later this weekend on
the non-IPv6 patches, and am happy to add my Acked-by and/or re-submit
those to Dave after that.

For sure, the kernel networking maintainer can merge any patches,
including the proposed IPv6 patches as-is, and I will accept that.  But
my vote as the original author and co-maintainer of the kernel GTP code
goes politely and respectfully against that - as I have made quite clear
by now.

Thanks + Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH v7 net-next 00/13] gtp: Additional feature support - Part I

2017-10-28 Thread Harald Welte

Hi Tom,

my apologies for only getting back to reviews now, after return from
holidays I was too overloaded with plenty of other tasks.

On Fri, Oct 27, 2017 at 05:09:24PM -0700, Tom Herbert wrote:
>   - Experimental IPv6 support

As far as I can tell, my previous comments/concerns regarding an IPv6
support that is in clear violation of how it is specified is not
acceptable to me, sorry.

The question is - as illustrated quite verbosely before - not one
of missing certain bits, but it is simply *different* from what
the protocol specification says.

This makes absolutely no sense to me.  All it will do, is it will raise
the impression that IPv6 is supported in a spec-compliant way.
Furthermore, it will mean that once somebody does convert this into
proper IPv6 support later, they will break the existing use cases of
the non-compliant implementation that you're adding in this patch
series.

To me, I simply don't understand how that makes any sense. There are no
known users of GTP outside of 3GPP networks, and particularly none of a
different GTP flavour than the one specified in 3GPP.  If there are, I
would want to hear of it.  And if there really are, I would strongly
recommend them to use some other tunneling protocol, one that's more
reasonable for normal use cases and with better protocol-level security.

I wouldn't mind merging *incomplete* IPv6 support.  However, I do mind
merging *incompatible* or *incompliant* IPv6 support.

>   - Configurable networking interfaces so that GTP kernel can be
> used and tested without needing GSN network emulation (i.e. no user
> space daemon needed).
>   - Addition of a dst_cache in the GTP structure and other cleanup

No issues with this from my side.  I plan on setting up a test system
with your patches over the weekend and give it some more testing from
the use cases I know to avoid regressions.

>   - Common functions to get a route fo for an IP tunnel
>   - Fix VXLAN gro cells initialization

Also no issues from my side, but that's for core networking folks to
decide.

> For IPv6 support, the mobile subscriber needs to allow IPv6 addresses,
> and the remote endpoint can be IPv6.

You are raising the impression - again - that this implementation will
work with any mobile subscriber.  I would bet at lot on the fact that
the current IPv6 implementation will in fact *not* work with any
existing equipment/devices out there.

> Tested:
> 
> Configured the matrix of IPv4/IPv6 mobile subscriber, 

Please indicate how that testing was done, see my comment above.

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH v4 net-next 00/12] gtp: Additional feature support - Part I

2017-09-27 Thread Harald Welte

Hi Tom,

thanks for your updated series!

I'll revie as soon as I a, but that may likely not be before next
week.  As indicated before, I'm on a motorbike roadtrip on vacation,
with very limited connectivity and even more limited time for any
technical work.  Thanks for your understanding.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone

2017-09-24 Thread Harald Welte

Hi Tom,

On Sun, Sep 24, 2017 at 08:55:49AM -0700, Tom Herbert wrote:
> Do you believe that these patches are not at all on the right track,
> that they can't be built upon to get to a standards-compliant
> implementation, and that we are going to have to throw all of this and
> start from scratch to provide IPv6 support?

I believe I have pointed out where the problem areas are, several times
by now.  I see no reason why things would have to be started from
scratch.  However, the issues pointed out in the IPv6 support patch[es]
have to be resolved *before* any merge to mainline.

I don't mind merging "incomplete" code that doesn't cover all parts of a
spec but provides basic interoperability.  I also am not arguing that
code must be bug-free at the time it is merged (which is impossible
anyway).  But I am arguing that we cannot merge something that is
a wrong implementation as per the spec, and hence it must be brought
in-line with the spec before it can be merged.

> > There's no use in merging an IPv6 support patch if already by code
> > review it can be shown that it's impossible to create a spec-compliant
> > implementation using that patch.  To me, that would be "merging IPv6
> > support so we can check off a box on a management form or marketing
> > sheet", but not for any practical value.
> 
> To be clear, these patches are not done because to be a bullet point
> on a marketing sheet. 

Great.

> IPv6 is becoming _the_ Internet protocol.

I'm all aware of that, and I've been a very early adopter, since the
1990ies with 6bone.

My argument is not against IPv6 support.  My argument is against merging
something that introdues IPv6 in a way that's not in-line with the GTP
protocol specifications, as such a way is of no use to anyone (except
marketing sheets).

> We should be far past the days of vendors only providing IPv4 in the
> kernel support because "that's what our customers use" and they'll get
> to IPv6 support at their leisure. 

I'm not sure where a "vendor" is involved with the GTP patches so far.  I
think we have to draw a distinction between what you expect from
professional, corporate "vendors" with a commercial interest in mind
(such as supporting their hardware) and what you can expect from people
doing things in their spare time, out of enthusiasm to finally bring
some Free Software into the closed world of telecommunications.

The Telecom world should have implemented something like a GTP kernel
module a decade to 15 years ago.  They could have saved significant
investments in proprietary hardware by running open source GGSNs with an
accelerated user plane in the kernel.  Nobody seemed to have an interest
in that, until today - as you can see from Pablo and me working on this
in our spare time, whenever we have a couple of spare cycles next to
many other projects.  You can see from the osmo-gtp-kernel commit log it
took years of being a ultra-low-priority on-and-off project  to ever get
to a point where we thought it was worth submitting it mainline.
Andreas deserves the praise for finally pushing it ahead.

I'm looking forward to reviewing the next version of the patch series.
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone

2017-09-23 Thread Harald Welte

osmocom.org/osmo-gtp-kernel/commit/?id=3bc7019c7afd06b5c7d94e5621728d092b82bb85
it was actually intended to make IPv6 support easier, but the
partial IPv6 support was removed before mainline submission.

* 07/14 Support encapsulation of IPv6 packets
Not acceptable in its current form, see extensive review

* 08/14 Support encpasulating over IPv6
No concerns in principle. Pending you making it dependent on AF
of socket

* 09/14 Allow configuring GTP interface as standalone
Can be merged unless strong objection from Pablo/Andreas (see above)

* 10/14 Add support for devnet
No concerns from my side

* 12/14 Configuration for zero UDP checksum
Up to Dave, he raised a question on it

* 13/14 Support for GRO
No concerns from my side

* 14/14 GSO support
No concerns from my side

BTW: Where have the iproute2/ip patches been posted, which you mention
in your cover page of the patch series?

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 03/14] gtp: Call common functions to get tunnel routes and add dst_cache

2017-09-23 Thread Harald Welte

Hi Andreas,

On Wed, Sep 20, 2017 at 05:37:52PM +0200, Andreas Schultz wrote:
> I think we had this discussion before. The sending IP and port are not part
> of the identity of the PDP context. So IMHO the sender is permitted
> to change the source IP at random.

Thanks for the reminder: You are correct, at least in the uplink case
(MS->GGSN) where there is mobility of the MS. In the downlink case
(GGSN->MS), which is the "sending" part for the kernel GTP code used at
a GGSN, I'm not sure if that theory holds true in reality.

Do you agree that the current behavior of not using automatic source
address selection for encapsulated GTP packets but rather using the
source address of the socket is intended?

Do you further agree that the dst_cache support patch by Tom retains
that intended behavior and it should be merged?

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone

2017-09-21 Thread Harald Welte

he test itself (GGSN_Test) executing against the IUT

The code so far implements PDP context activation + address allocation
for IPv4-only and IPv6-only cases and can be run against a GGSN
implementing those.  The IPv6-only PDP context unit tests include the
convoluted two-phase address assignment including sending the router
solicitation from the simulated phone as well as verifying the router
advertisement sent in response from the GGSN.

Yes, I know they're written in an unknown niche programming language
called TTCN-3, but this was the best tool at hand for the job I could
find, and the tests are open source as is the Eclipse Titan toolchain
for compiling it.

We even have a Dockerfile that will build you a docker container
containing the compiled GGSN_test at
http://git.osmocom.org/docker-playground/tree/ggsn-test

That docker container is used by a jenkins build test job to test
current OsmoGGSN master every night against the test suite.

I was stupid enough to break the testsuite with an accidential commit,
so tests between August 27th and today failed.  I've just reverted that
accident - don't let that mistake of mine mislead you.

I'm happy to contribute further to this by adding actual user-IP GTP-U
functional testing beyond the router soliciatation/advertisement to it.

So my suggestion in terms of a "realistic testbed without having to
configure + run dozens of programs and using real RF + phones" is to use
that testsuite.

What one cannot get around is having to implement support for new
features added on the kernel side such as IPv6 in libgtpnl and at least
one of the GGSN's using it, sorry.  Without that, there is no way to
know if that code would do anything useful.  You simply cannot
realistically test GTP-U alone without GTP-C.

I'd love to offer help on this, but it's really impossible right now.
I'm on holidays on a motorbike tour through rural Taiwan's mountains,
had to deal with a flat tire today, have limited connectivity and am
already cutting down hard on sleep every night to be able to respond to
the absolute minimally required work e-mails.  And review/follow-up to
your much appreciated patch series the last couple of days has also used
a lot of (unexpected not scheduled for) time.  I'm not complaining, I'm
just saying I am really not able to contribute more to this effort right
now beyond my review, the offer of free hardware for a real cellular
network, and the extension of the test cases for GTP-U beyond the
already implemented very important IPv6 address allocation/assignment
which I believe your current code would not pass.

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 00/14] gtp: Additional feature support

2017-09-21 Thread Harald Welte

Hi Tom,

On Tue, Sep 19, 2017 at 04:47:11PM -0700, Tom Herbert wrote:
> On Tue, Sep 19, 2017 at 4:19 PM, Harald Welte <lafo...@gnumonks.org> wrote:
>
> > I think there has to be a clear plan/architecture on how to implement
> > those bits in terms of the kernel/userspace split, and at least a proof
> > of concept implementation that we can show works with some real phones
> > out there - otherwise there's no point in having IPv6 support that works
> > well with some custom tools.
> >
> OTOH, I will argue that the GTP patches should never have been allowed
> in the kernel in the first place without IPv6 support! ;-) 

Well, it could be shown that the code works with integration to OpenGGSN
(and later ergw, and now OsmoGGSN) and works within a complete 3GPP
network.  So we were not merging something that we hypothesized it would
work once the rest would be implemented, but we could actually show it
was working before it got merged.

> I think the best plan forward is to get the IPv6 data path running
> that so can demonstrate a functional GTP/IPv6 datapath 

Yes, but a functional "datapath" (= GTP-U) unfortunately includes the
way how address allocation/assignment is done.  Putting something into
the mainline kernel that we know will for sure not work/interop in a
real scenario is not a good idea.  I think what's worse than not
supporting a feature is to implying support for it while actually
doing it in an incomplete/incompliant way.

So from my point of view, what's needed is
* making sure router advertisement/solicitation are covered in some
  way, either by doing it in the kernel or having a clear strategy how
  it could be done from userspace while not a) introducing races regarding
  who owns the sequence numbers
* making sure the implementation covers entire /64 prefixes for each
  PDP context and not single addresses.  That is non-negotiable and
  mandatory by 3GPP specs.

> Since "real" configuration path doesn't use the path to set up a
> standalone interface, I would presume that that will be fleshed when
> someone has cycles and expertise to work on both sides of the problem.

The configuration will be different, yes. but we need to ensure that the
actual *implementation* of the data path does what it is expected to do,
no matter who configures it via which interface.

> Even if this requires structural changes to how IPv6 is managed in
> GTP, I doubt that the fundamental TX/RX, GRO/GSO data paths will
> change much.  In other words, please consider this to be a step on an
> evolutionary path. More work is required to reach the ultimate
> deployable solution.

Agreed. But then I'm still against merging something that we know for
sure will not be compatible with real-world use case.  It should be kept
out of mainline until we are sure of that, at the very least
theoretically, but even better which we can prove in practise will do
what it claims to do.

> As for testing on real phones, that is cannot be a requirement for a
> kernel feature. 

GTP is not implemented on phones.  GTP is implemented only inside the
fixed (land-side) of the cellular network.  However, the inner IP data
originates from phones, and large parts of what you see on GTP
originates from phones in a different format.  The inner IP data inside
GTP-U originates from phones.  And that's where address configuration
for IPv6 works.

> If you expect Linux community to support this, then we
> need a way to be develop and test on commodity PC hardware. 

I'm not arguing you need to run any of the code on a different
architecture such as a phone.  I'm arguing for you to run a cellular
network including the kernel GTP code, and then use that cellular
network from a real phone.  This is the only way to know for sure you
interoperate.  See my other mail related to the Open Soruce based
configurations for both 2G and 3G that can be used for this.

As a replacement, one can e.g. look at protocol traces of real phones
and then simulate the behavior of one or several different phones and
implement that as test caess.  This is what I did in the GGSN_Test
pointed out in my other mail in this thread.  And btw, all I've asked is
for showing it works with *one* phone model at all.  I'm not talking
about the various different implementation specifics, such as whether or
not the phone will insist on using neighbor solicitation and mandate
neighbor advertisement (on a point-to-point link, how absurd!) after
the (mandatory) router discovery.

> That is one of the major values of creating a standalone interface
> configuration-- we can test the datapath just like any other
> encapsulation supported by the kernel.

Well, you cannot.  You might be able to do some benchmarking to compare
if an old version of the kernel gtp driver will perform better or worse
than some optimizations introduced.  But you can *not* have a rea

Re: [PATCH net-next 12/14] gtp: Configuration for zero UDP checksum

2017-09-20 Thread Harald Welte

Hi Tom,

On Wed, Sep 20, 2017 at 11:09:29AM -0700, Tom Herbert wrote:
> On Mon, Sep 18, 2017 at 9:24 PM, David Miller <da...@davemloft.net> wrote:
> > From: Tom Herbert <t...@quantonium.net>
> >> Add configuration to control use of zero checksums on transmit for both
> >> IPv4 and IPv6, and control over accepting zero IPv6 checksums on
> >> receive.
> >
> > I thought we were trying to move away from this special case of allowing
> > zero UDP checksums with tunnels, especially for ipv6.
> 
> I don't have a strong preference either way. I like consistency with
> VXLAN and foo/UDP, but I guess it's not required. Interestingly, since
> GTP only carries IP, IPv6 zero checksums are actually safer here than
> VXLAN or GRE/UDP.

Just for the record: I don't care either way and I defer to the kernel
networking developers to decide if they want to have zero UDP checksum
in GTP or not.

The 3GPP specs don't say anything about UDP checksums.  So there's no
requirement to use them, and hence operation without UDP checksums
should be compliant.  Cisco GTP implementation has udp checksumming
configurable, so other implementations also seem to provide both ways.

In general, I would argue one wants UDP checksumming of GTP in all
setups, as while the inner IP packet might be protected, the GTP header
itself is not, and that's what contains important data suhc as the TEID
(Tunnel Endpoint ID).  But that's of course just my personal opinion,
and I'm not saying we should prevent people from using lower protection
if that's what they want.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 08/14] gtp: Support encpasulating over IPv6

2017-09-20 Thread Harald Welte

Hi Tom,

On Wed, Sep 20, 2017 at 01:40:54PM -0700, Tom Herbert wrote:
> On Wed, Sep 20, 2017 at 12:45 PM, David Miller <da...@davemloft.net> wrote:
> > There is a socket associated with the tunnel to do the encapsulation
> > and it has an address family, right?
> 
> If fd's are set from userspace for the sockets then we could derive
> the address family from them. I'll change that. Although, looking at
> now I am wondering why were passing fds into GTP instead of just
> having the kernel create the UDP port like is done for other encaps.

because the userspace process has to take care of those bits of GTP-U
that the kernel doesn't, such as responding to GTP ECHO requests with
GTP echo responses.  Only the "GTP Message type G-PDU" is handled in the
kernel, as only those frames contain user plane.  See table 1 of Section
7.1 of 3GPP TS 29.060.

If you create the socket in the kernel, how would you hand the socket to
the userspace process later on?

IMHO, it feels more natural to simply create it in userspace (like you
would do in the non-kernel-accelerated case) and then simply handle the
G-PDU messages in the kernel while doing the rest in userspace.

But if there's another method that feels more usual to the kernel
community, I'm not against any changes - but given kernel policies, we'd
have to keep userspace compatbility, right?

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 09/14] gtp: Allow configuring GTP interface as standalone

2017-09-20 Thread Harald Welte

Hi Tom,

On Wed, Sep 20, 2017 at 09:24:07AM -0700, Tom Herbert wrote:
> On Wed, Sep 20, 2017 at 9:07 AM, Andreas Schultz <aschu...@tpip.net> wrote:
> > GTP isn't special, I just don't like to have testing only features in there
> > when the same goal can be reached without having to add extra stuff. Adding
> > code that is not going to be useful in real production setups (or in this
> > case would even break production setups when enabled accidentally) makes the
> > implementation more complex than it needs to be.
> 
> Well, you could make the same argument that allowing GTP to configured
> as standalone interface is a problem since GTP is only allowed to be
> with used with GTP-C. But, then we have something in the kernel that
> the community is expected to support, but requires jumping through a
> whole bunch of hoops just to run a simple netperf. 

"A whole bunch of hoops" without your new interface would consist of
running a single command-line program that is supplied with libgtpnl.
This is not a complete 3GPP network, but a simple libmnl-based helper
library with no other depenencies.

I'm not neccessarily against introducing features like the 'standalone
interface configuration'.  However, we must make sure that any
significant new feature contributions like IPv6 are tested in a
"realistic setup" and not just using those 'interfaces added for easy
development'.  Also, I would argue those 'interfaces added for easy
deveopment/benchmarking' should probably be clearly marked as such to
avoid raising the impression that this is what leads to a
standard-conforming / production-type setup.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 00/14] gtp: Additional feature support

2017-09-19 Thread Harald Welte

Hi Tom,

On Tue, Sep 19, 2017 at 08:59:28AM -0700, Tom Herbert wrote:
> On Tue, Sep 19, 2017 at 5:43 AM, Harald Welte <lafo...@gnumonks.org>
> wrote:
> > On Mon, Sep 18, 2017 at 05:38:50PM -0700, Tom Herbert wrote:
> >>   - IPv6 support
> >
> > see my detailed comments in other mails.  It's unfortunately only
> > support for the already "deprecated" IPv6-only PDP contexts, not the
> > more modern v4v6 type.  In order to interoperate with old and new
> > approach, all three cases (v4, v6 and v4v6) should be supported from
> > one code base.
> >
> It sounds like something that can be subsequently added. 

Not entirely, at least on the netlink (and any other configuration
interface) you will have to reflect this from the very beginning.  You
have to have an explicit PDP type and cannot rely on the address type to
specify the type of PDP context.  Whatever interfaces are introduced
now will have to remain compatible to any future change.

My strategy to avoid any such possible 'road blocks' from being
introduced would be to simply add v4v6 and v6 support in one go.  The
differences are marginal (having both an IPv6 prefix and a v4 address in
parallel, rather than mutually exclusive only).

> Do you have a reference to the spec?

See http://osmocom.org/issues/2418#note-7 which lists Section 11.2.1.3.2
of 3GPP TS 29.061 in combination with RFC3314, RFC7066, RFC6459 and
3GPP TS 23.060 9.2.1 as well as a summary of my understanding of it some
months ago.

> >>   - Configurable networking interfaces so that GTP kernel can be
> >>   used and tested without needing GSN network emulation (i.e. no
> >>   user space daemon needed).
> >
> > We have some pretty decent userspace utilities for configuring the
> > GTP interfaces and tunnels in the libgtpnl repository, but if it
> > helps people to have another way of configuration, I won't be
> > against it.
> >
> AFAIK those userspace utilities don't support IPv6. 

Of course not [yet]. libgtpnl and the command line tools have been
implemented specifically for the in-kernel GTP driver, and you have to
make sure to add related support on both the kernel and the userspace
side (libgtpnl). So there's little point in adding features on either
side before the other side.  There would be no way to test...

> Being able to configure GTP like any other encapsulation will
> facilitate development of IPv6 and other features.

That may very well be the case, but adding "IPv6 support" to kernel GTP
in a way that is not in line with the existing userspace libraries and
control-plane implementations means that you're developing those
features in an artificial environment that doesn't resemble real 3GPP
interoperable networks out there.

As indicated, I'm not against adding additional interfaces, but we have
to make sure that we add IPv6 support (or any new feature support) to at
least libgtpnl, and to make sure we test interoperability with existing
3GPP network equipment such as real IPv6 capable phones and SGSNs.

> > I'm not sure if this is a useful feature.  GTP is used only in
> > operator-controlled networks and only on standard ports.  It's not
> > possible to negotiate any non-standard ports on the signaling plane
> > either.
> >
> Bear in mind that we're not required to do everything the GTP spec
> says. 

Yes, we are, at least as long as it affects interoperability with other
implemetations out there.

GTP uses well-known port numbers on *both* sides of the tunnel, and you
cannot deviate from that.

There's no point in having all kinds of feetures in the GTP user plane
which are not interoperable with other implementations, and which are
completely outside of the information model / architecture of GTP.

In the real world, GTP-U is only used in combination with GTP-C.  And in
GTP-C you can only negotiate the IP address of both sides of GTP-U, and
not the port number information.  As a result, the port numbers are
static on both sides.

> My impression is GTP designers probably didn't think in terms of
> getting best performance. But we can ;-)

I think it's wasted efforts if it's about "random udp ports" as no
standards-compliant implementation out there with which you will have to
interoperate will be able to support it.

GTP is used between home and roaming operator.  If you want to introduce
changes to how it works, you will have to have control over both sides
of the implementation of both the GTP-C and the GTP-u plane, which is
very unlikely and rather the exception in the hundreds of operators you
interoperate with.  Also keep in mind that there often are various
"middleboxes" that will suddenly have to reflect your changes.  That
starts from packet filters at various locations in the operator networks
and/or roaming hubs, down to GTP hubs a

Re: [PATCH net-next 00/14] gtp: Additional feature support

2017-09-19 Thread Harald Welte

 of v4v6 it works differently:
* MS/UE requests dynamic or fixed IPv4 address in EUA IE of PDP context
  activation
* GGSN responds with an IPv6 address, but that address is *not* used
  for communication, but simply used as an "interface identifier" to
  build a link-local address.
* MS then uses router solicitation using that link-local address
* GGSN responds with router advertisement, allocating a single /64
  prefix, from which the MS then generates a fully-qualified IPv6
  source address for communication.

How did you envision this to be done with the v6 support you just added?
At the very least, the /64 prefix matching would have to be implemented
so that in fact all addresses within that /64 prefix are matched +
encapsulated for a given PDP context in the downlink (to phone)
direction.

Also, I think the responsibility for the router advertisements would be
in the kernel, too.  Otherwise, a GTP-C userspace implementation would
have to inject packets into the user plane (which is otherwise handled
completely inside the kernel).  Injecting packets would mean that in caes
GTP sequence numbers are used, that userspace implementation would have
to alter the sequence numbers of the kernel gtp.ko code using netlink,
but therre would  be race conditions, ...

The router advertisements and neighbor advertisements basically have the
semantics of one link per PDP context.  Each of them is a point-to-point
link, and it's not one router advertisement that's sent to all of the
PDP contexts on that gtp-device.

I know it all sucks.  I'm still happy to see somebody tackling v6
support in gtp.c :)

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 08/14] gtp: Support encpasulating over IPv6

2017-09-19 Thread Harald Welte

Hi Tom,

On Mon, Sep 18, 2017 at 05:38:58PM -0700, Tom Herbert wrote:
> Allow peers to be specified by IPv6 addresses.

> + u16 peer_af;
> + union {
> + struct in_addr  peer_addr_ip4;
> + struct in6_addr peer_addr_ip6;
> + };

this will not really work, as an union means that a PDP context
will be either IPv4-only or IPV6-only, while in reality there
are three types, see my other mail.  So you have to  deal
with v4-only, v6-only or v4v6.

The v6-only is legacy by now, and all modern phones I've tested in
recent years can do v4v6 rather than having a v4-only and a v6-only
PDP context in parallel.

>From the operator point of view, v4v6 is very desirable, as it basically
halves the amount of PDP contexts compared to the old approach, which
significantly reduces signalling load across your network, as well as
the amount of memory (and thus capacity) in your core network elements.

I've recently implemented v6 + v4v6 support in osmo-ggsn (see
http://git.osmocom.org/osmo-ggsn/) in case you would like to see another
FOSS implementation for v6 + v4v6 - though in userspace, of course.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 13/14] gtp: Support for GRO

2017-09-19 Thread Harald Welte

On Mon, Sep 18, 2017 at 05:39:03PM -0700, Tom Herbert wrote:
> Populate GRO receive and GRO complete functions for GTP-Uv0 and v1.

looks fine to me, though I'm not the GRO expert here.  Let's say what
the netdev gurus have to say in their review.

If you say it is tested with GRO-capable and non-GRO capable device
drivers, I'm fine with the patch.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 07/14] gtp: Support encapsulation of IPv6 packets

2017-09-19 Thread Harald Welte

Hi Dave,

On Mon, Sep 18, 2017 at 09:19:08PM -0700, David Miller wrote:

> > +static inline u32 ipv6_hashfn(const struct in6_addr *a)
> > +{
> > +   return __ipv6_addr_jhash(a, gtp_h_initval);
> > +}
> 
> I know you are just following the pattern of the existing "ipv4_hashfn()" here
> but this kind of stuff is not very global namespace friendly.  Even simply
> adding a "gtp_" prefix to these hash functions would be a lot better.

I would agree if this was an inline function defined in a header file or
a non-static function.  But where is the global namespace concern in
case of static inline functions defined and used in the same .c file?

If it makes you happy, I'm all for adding the prefix - I just would like
to understand the rationale better, thanks :)

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 04/14] gtp: udp recv clean up

2017-09-19 Thread Harald Welte

Hi Tom,

I think this patch does too many things at once:
* introduce separate rx functions
* convert from netif_rx to gro_cells_receive
* cosmetic changes like "return -1" to "goto drop"

In the context of reviewability and the "one patch per topic", I would
prefer to see those separated, thanks.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 03/14] gtp: Call common functions to get tunnel routes and add dst_cache

2017-09-19 Thread Harald Welte

Hi Dave,

On Mon, Sep 18, 2017 at 09:17:51PM -0700, David Miller wrote:
> This and the new dst caching code ignores any source address selection
> done by ip_route_output_key() or the new tunnel route lookup helpers.
> 
> Either source address selection should be respected, or if saddr will
> never be modified by a route lookup for some specific reason here,
> that should be documented.

The IP source address is fixed by signaling on the GTP-C control plane
and nothing that the kernel can unilaterally decide to change.  Such a
change of address would have to be decided by and first be signaled on
GTP-C to the peer by the userspace daemon, which would then update the
PDP context in the kernel.

So I guess you're asking us to document that rationale as form of a
source code comment ?

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 07/14] gtp: Support encapsulation of IPv6 packets

2017-09-19 Thread Harald Welte

On Mon, Sep 18, 2017 at 05:38:57PM -0700, Tom Herbert wrote:
> Allow IPv6 mobile subscriber packets. This entails adding an IPv6 mobile
> subscriber address to pdp context and IPv6 specific variants to find pdp
> contexts by address.

Please note that there are three different PDP contexts for IP:
* IPv4 only (what gtp.c implements so far)
* IPv6 only
* dual IPv4+IPv6 (called IPv46)

This information will have to be provisioned by the control plane
via netlink for each PDP context.  The kernel module then needs to
make sure that on a v4-only context no IPv6 packets are accepted
and vice-versa.

Your proposed patch is missing this kind of screening function and
I would imagine it could introduce all kinds of security problems :/

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next 05/14] gtp: Remove special mtu handling

2017-09-19 Thread Harald Welte

Hi Tom,

On Mon, Sep 18, 2017 at 05:38:55PM -0700, Tom Herbert wrote:
> Removes MTU handling in gtp_build_skb_ip4. This is non standard relative
> to how other tunneling protocols handle MTU. The model espoused is that
> the inner interface should set it's MTU to be less than the expected
> path MTU on the overlay network. Path MTU discovery is not typically
> used for modifying tunnel MTUs.

The point of the kernel GTP module is to interoperate with existing
other GTP implementations and the practises established by cellular
operators when operating GTP in their networks.

While what you describe (chose interface MTU to be less than the
expected path MTU) is generally best practise in the Linux IP/networking
world, this is not generally reflected in the cellular
universe. You see quite a bit of GTP fragmentation due to the fact
that the transport network simply has to deal with the MTU that has
been established via the control plane between SGSN and MS/UE, without
the GGSN even being part of that negotiation.

Also, you may very well have one "gtp0" tunnel device at the GGSN,
but you are establishing individual GTP tunnels to dozesn to hundreds of
different SGSNs at operators all over the world.  You cannot reliably
set the "gtp0" interface MTU to "the path MTU of the overlay network",
as the overlay network is in fact different for each of the SGSNs you're
talking to - and each may have a different path MTU.

So unless I'm missing something, I would currently vote for staying with
the current code, which uses the path MTU to the specific destination IP
address (the SGSN).

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v1 1/3] gtp: refactor to support flow-based gtp encap and decap

2017-07-14 Thread Harald Welte

Hi Jiannan,

 
> > >   gtp = netdev_priv(dev);
> > > + gtp->net = src_net;
> >·
> > Isn't this a generic change that's independent of your work on OVS GTP?
> 
> It is meant to be OVS independent. What makes it not? Should I leave 
> this field un-initialized?

In general, in all FOSS projects I have worked (and particularly the
Linux kernel), it is a strict rule that any given patch adresses only
one logical change.  So if your change is for flow-based "OVS" support
in the GTP code, and the "gtp->net = src_net" is a generic change (and
not something specifically required by flow/OVS) then it should be a
separate patch.  Similarly to the cosmetic changes which should be a
separate patch.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v1 2/3] gtp: Support creating flow-based gtp net_device

2017-07-14 Thread Harald Welte

Hi Jiannan,

On Fri, Jul 14, 2017 at 01:01:34AM +, Jiannan Ouyang wrote:
> > you're unconditionally binding to both GTP0 and GTP1 UDP ports.  This is
> > done selectively based on netlink attributes in the existing "normal"
> > non-OVS kernel code, i.e. the control is left to the user.
> > 
> > Is this function is only called/used in the context of OVS?  If so,
> > since you explicitly implement only GTPv1, why bind to GTPv0 port?
> > 
> 
> I had doubts on how this flow-based GTPv1 code path should fit in, which is 
> why
> the GTPv0 and the GTPv1 code pieces are mixed in my changes. 

Well, I know nothing about flow-based paths and OVS, so if you are the
one proposing related changes to me as the maintainer, you need to make
sure that your changes are consistend and useful within your use
case/scenario while making sure that the existing features don't break.

If you refactor generic code (used by "classic" GTP tunneling + your new
flow based tunneling), and that old code worked with GTPv0 and GTPv1,
then your modifications must make sure that they continue to support
GTPv0 and v1 in the "classic tunnel" use case.

If your new code for flow-based tunneling simply only implements v0, it
is fine to me - but then those restrictions must be in the flow-based
part only, and things must be consistent.  I.e. in this case, for the
flow-based tunnel approach you must not bind the v0 port, if you don't
handle related packets.

> Should I explicitly claim that the flow-based change is for GTPv1
> only?

That's definitely important, too - but is not the point I raised (see
above).

> > > + setup_udp_tunnel_sock(net, sock1u, _cfg);
> > 
> > even here, you're only setting up the v1 and not v0.
> 
> same reason as above.

yes, but if I read your code correctly, this is generic/shared code that
will break the existing GTPv0 support in the "classic tunnel" case!

> > > + /* Assume largest header, ie. GTPv0. */
> > > + dev->needed_headroom= LL_MAX_HEADER +
> > > + sizeof(struct iphdr) +
> > > + sizeof(struct udphdr) +
> > > + sizeof(struct gtp0_header);
> > 
> > ... and here you're using headroom for a GTPv0 header, despite (I think)
> > only supporting GTPv1 from this configuration?
> 
> Yes, only GTPv1 is supported.

well, then I suggest you don't generate headroom for a v0 header (which
is larger) in a v1-only code path :)

> > > + err = gtp_hashtable_new(gtp, GTP_PDP_HASHSIZE); // JO: when to free??
> > 
> > I think that question about when to free needs to be resolved before any
> > merge.  Did you check that it persists even after the device is
> > closed/removed?
> 
> I didn't investigate it. What do you mean by persist?

"persist" means "remains allocated after the release of the network
device".  Whatever you allocate during device creation you must
de-allocate on device release.  I cannot tell you when exactly (as I'm
not familiar with OVS or flow-based tunneling, as indicateD).  However,
I know for sure we cannot introduce code that looks like it introduces
memory leaks to the kernel :)

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v1 2/3] gtp: Support creating flow-based gtp net_device

2017-07-13 Thread Harald Welte

Hi Jiannan,

please keep in mind I have zero clue about OVS, so I cannot really
comment on what is customary in that context and what not.  However,
some general review from the GTP point of view below:

On Wed, Jul 12, 2017 at 05:44:54PM -0700, Jiannan Ouyang wrote:
> Add the gtp_create_flow_based_dev() interface to create flow-based gtp
> net_device, which sets gtp->collect_md. Under flow-based mode, UDP sockets are
> created and maintained in kernel.
> 
> Signed-off-by: Jiannan Ouyang <ouya...@fb.com>
> ---
>  drivers/net/gtp.c | 213 
> +-
>  include/net/gtp.h |   8 ++
>  2 files changed, 217 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
> index 5a7b504..09712c9 100644
> --- a/drivers/net/gtp.c
> +++ b/drivers/net/gtp.c
> @@ -642,9 +642,94 @@ static netdev_tx_t gtp_dev_xmit(struct sk_buff *skb, 
> struct net_device *dev)
>   return NETDEV_TX_OK;
>  }
>  
> +static int gtp_hashtable_new(struct gtp_dev *gtp, int hsize);
> +static void gtp_hashtable_free(struct gtp_dev *gtp);
> +static int gtp_encap_enable(struct gtp_dev *gtp, struct nlattr *data[]);
> +
> +static int gtp_change_mtu(struct net_device *dev, int new_mtu, bool strict)
> +{
> + int max_mtu = IP_MAX_MTU - dev->hard_header_len - sizeof(struct iphdr)
> + - sizeof(struct udphdr) - sizeof(struct gtp1_header);
> +
> + if (new_mtu < ETH_MIN_MTU)
> + return -EINVAL;
> +
> + if (new_mtu > max_mtu) {
> + if (strict)
> + return -EINVAL;
> +
> + new_mtu = max_mtu;
> + }
> +
> + dev->mtu = new_mtu;
> + return 0;
> +}
> +
> +static int gtp_dev_open(struct net_device *dev)
> +{
> + struct gtp_dev *gtp = netdev_priv(dev);
> + struct net *net = gtp->net;
> + struct socket *sock1u;
> + struct socket *sock0;
> + struct udp_tunnel_sock_cfg tunnel_cfg;
> + struct udp_port_cfg udp_conf;
> + int err;
> +
> + memset(_conf, 0, sizeof(udp_conf));
> +
> + udp_conf.family = AF_INET;
> + udp_conf.local_ip.s_addr = htonl(INADDR_ANY);
> + udp_conf.local_udp_port = htons(GTP1U_PORT);
> +
> + err = udp_sock_create(gtp->net, _conf, );
> + if (err < 0)
> + return err;
> +
> + udp_conf.local_udp_port = htons(GTP0_PORT);
> + err = udp_sock_create(gtp->net, _conf, );
> + if (err < 0)
> + return err;

you're unconditionally binding to both GTP0 and GTP1 UDP ports.  This is
done selectively based on netlink attributes in the existing "normal"
non-OVS kernel code, i.e. the control is left to the user.

Is this function is only called/used in the context of OVS?  If so,
since you explicitly implement only GTPv1, why bind to GTPv0 port?

> + setup_udp_tunnel_sock(net, sock1u, _cfg);

even here, you're only setting up the v1 and not v0.

> + /* Assume largest header, ie. GTPv0. */
> + dev->needed_headroom= LL_MAX_HEADER +
> + sizeof(struct iphdr) +
> + sizeof(struct udphdr) +
> + sizeof(struct gtp0_header);

... and here you're using headroom for a GTPv0 header, despite (I think)
only supporting GTPv1 from this configuration?

> + err = gtp_hashtable_new(gtp, GTP_PDP_HASHSIZE); // JO: when to free??

I think that question about when to free needs to be resolved before any
merge.  Did you check that it persists even after the device is
closed/removed?

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v1 1/3] gtp: refactor to support flow-based gtp encap and decap

2017-07-13 Thread Harald Welte

 +static inline void gtp1_push_header(struct sk_buff *skb, __be32 tid)
>  {
>   int payload_len = skb->len;
>   struct gtp1_header *gtp1;
> @@ -426,7 +452,7 @@ static inline void gtp1_push_header(struct sk_buff *skb, 
> struct pdp_ctx *pctx)
>   gtp1->flags = 0x30; /* v1, GTP-non-prime. */
>   gtp1->type  = GTP_TPDU;
>   gtp1->length= htons(payload_len);
> - gtp1->tid   = htonl(pctx->u.v1.o_tei);
> + gtp1->tid   = tid;
>  
>   /* TODO: Suppport for extension header, sequence number and N-PDU.
>*   Update the length field if any of them is available.
> @@ -438,34 +464,17 @@ struct gtp_pktinfo {
>   struct iphdr*iph;
>   struct flowi4   fl4;
>   struct rtable   *rt;
> - struct pdp_ctx  *pctx;
>   struct net_device   *dev;
>   __be16  gtph_port;
>  };
>  
> -static void gtp_push_header(struct sk_buff *skb, struct gtp_pktinfo *pktinfo)
> -{
> - switch (pktinfo->pctx->gtp_version) {
> - case GTP_V0:
> - pktinfo->gtph_port = htons(GTP0_PORT);
> - gtp0_push_header(skb, pktinfo->pctx);
> - break;
> - case GTP_V1:
> - pktinfo->gtph_port = htons(GTP1U_PORT);
> - gtp1_push_header(skb, pktinfo->pctx);
> - break;
> - }
> -}
> -
>  static inline void gtp_set_pktinfo_ipv4(struct gtp_pktinfo *pktinfo,
>   struct sock *sk, struct iphdr *iph,
> - struct pdp_ctx *pctx, struct rtable *rt,
> - struct flowi4 *fl4,
> + struct rtable *rt, struct flowi4 *fl4,
>   struct net_device *dev)
>  {
>  [...]
> + __be32 tun_id;

you are breaking GTPv0 functionality here.  GTPv0 has 64 bit tunnel
identifiers, and this function is called both from GTPv1 and GTPv0
context.

This makes me wonder how you did verify that your changes do not break
the existing operation with both GTPv0 and GTPv1?

> + // flow-based GTP1U encap
> + info = skb_tunnel_info(skb);
> + if (gtp->collect_md && info && ntohs(info->key.tp_dst) == GTP1U_PORT) {

I think it's typically safe to assume that GTP is only operated on
standard ports, but it is something you chould/should think about, i.e.
whether you want that kind of restriction.  In the existing use case, we
have the v0/v1 information stored in the per-pdp context structure.

> + tun_id  = htonl(pctx->u.v1.o_tei);

here is where you're assuming GTPv1 in two ways from code that is called
from both v0 and v1.
* you're dereferencing a v1 specific element in the pctx union
* you're storing the result in a 32bit variable

>   gtp = netdev_priv(dev);
> + gtp->net = src_net;

Isn't this a generic change that's independent of your work on OVS GTP?

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v1 0/3] Flow Based GTP Tunneling

2017-07-13 Thread Harald Welte

hi Jiannan,

net-next si closed, as it has been pointed out already by Joe.

On Wed, Jul 12, 2017 at 05:44:52PM -0700, Jiannan Ouyang wrote:
> ovs-ofctl add-flow br0
> "in_port=2,tun_src=192.168.60.141,tun_id=123, \
> actions=set_field:02:00:00:00:00:00->eth_src, \
> set_field:ff:ff:ff:ff:ff:ff->eth_dst,LOCAL"

I'm not familiar with the details here, but does this imply that you
are matching on the outer (transport layer) source IP address? If so,
please note this is in violation of the 3GPP specifications for GTP,
which require explicitly that the TEID is the *only* criteria for
matching an encapsulated packet to the tunnel.  Basically anyone can
send you an encapsulated packet from any random source IP, just as long
as the TEID matches a tunnel, it will be decapsulated.

This is [presumably] in order to take care of mobility, as the
subscribers' phone moves around different MME/S-GW/SGSN, each having
different source IP addresses.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: loosing netdevices with namespaces and unshare?

2017-06-01 Thread Harald Welte

Hi Eric,

On Thu, Jun 01, 2017 at 01:32:49AM -0500, Eric W. Biederman wrote:

> If a network device does not implement rntl_link_ops it is returned to
> the initial network namespace.   Anything else will loose physical
> devices.

Thanks a lot for your statement.  This is a big relief, my line of
thinking thus is confirmed:  We shall not loose physical devices.

> Only for pure software based devices do we delete them.  Perhaps your
> sub interface implements rtnl_link_ops?  Either that or something is
> still holding a reference to your network namespace, which would prevent
> the network device from being returned.

My question is how to debug this further?  Monitoring
/proc/*/ns/net* showed that the ID of the namespace is gone after
terminating my processes in the namespace.  Short of adding printk() or
playing with kprobes: to the related kernel code, how can I track the
reference count or get an idea who might hold references?

Regards,
    Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: loosing netdevices with namespaces and unshare?

2017-05-31 Thread Harald Welte

Hi Cong,

On Wed, May 31, 2017 at 03:40:33PM -0700, Cong Wang wrote:
> But you have other choices than using the physical interface
> directly in non-root ns, for example, creating a virtual pair and
> connect it with the physical one with a bridge. There are various
> ways to achieve this.

Yes, but then those "workarounds" expose the given physical device to
the root namespace, which is exactly what I try to avoid here.  The
interface has no purpose outside of the specific target namespace, and
under no circumstances should the various applications on a normal Linux
system (whether it's network manager or whatever else) start to use the
device.  The same also applies to the kernel itself.  It is not
desirable to have the "root netns" start to do things like ipv6
stateless autoconfiguration, etc.

I of course know that all of those things can be individually disabled.

I just think having a physical netdev inside "single application"
namespaces is more complicated than it could be.

However, I have sufficiently made my argument clear, and I understand
that you don't share my concern.  This is perfectly fine. We agree to
disagree :)

I simply have to find the least intrusive work-around to my liking for
the intentional but so far undocumented behavior of netdevices vanishing
into thin air.  I'll manage.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: loosing netdevices with namespaces and unshare?

2017-05-31 Thread Harald Welte

Hi Cong,

On Wed, May 31, 2017 at 10:44:53AM -0700, Cong Wang wrote:
> >> Net namespace simply unregisters all netdevices inside when it is
> >> gone, no matter where they are from.
> >
> > ah, ok. I missed that part.  Is there a good piece of documentation on
> > netwokr namespaces that I should read?
> 
> I don't know any doc mentioning this.

That's of course a pity.  I'll see what can be done about amending the
netns related manpage or the like.

> >> > Of course I know I could simply do something like "ip link set eth0
> >> > netns 1" from within the namespace before leaving.  But what if the
> >> > process is not bash and the process exits abnormally?   I'd consider
> >> > that explicit reassignment more like a hack than a proper solution...
> >>
> >> It doesn't make sense to move it back to where it is from, for example,
> >> what if you move a veth0 from netns1 to netns2 and netns1 is gone
> >> before netns2?
> >
> > for virtual devices, I would agree.  For physical devices, I think the
> > default behavior to unregister them is - from my of course very
> > subjective point of view - quite questionable.
> 
> Network namespace does not special-case the physical devices,
> it treats them all equally as abstract net devices.

I hear you, and I understand that of course from a developer point of
view it makes sense to treat all devices the same.  I just wonder if
from an usability point of view this is the best choice.  Virtual
devices can be (re)created at any time, physical not.

I mean, what is the *use case* for loosing any refrence to a physical
network device and unregistering it from the stack?  Is there any API by
which a new netdevice structure can be instantiated on the actual
hardware?  Registering the netdev is what the driver does during
discovering the system hardware.  If there's a method to "automagically"
loose devices, at the very least I wold expect some reasonable method to
resurrect them.  Unloading the kernel module and reloading it is for
sure not elegant, particularly not if you have multiple Ethernet
devices/ports sharing the same driver.

One could e.g. also think of something like a special namespace that
collects all the "orphan" netdevices.  Something analogous to the old
Unix tradition of "pid 1" collecting all the orphan tasks whose parents
died.  Transferring them into that "netdev orphanage" could
automatically set the link down so that no accidential
routing/forwarding of traffic between the devices is possible.

This is just my two cents.  Given my past involvement in Linux
networking I allow myself having an opinion on such matters.  But if the
kernel networking community thinks it is ok to loose all references to a
physical network device due to processes terminating irregularly (which
will happen, as indicated in OOM or software bug cases), then I will of
course have to accept that.

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: loosing netdevices with namespaces and unshare?

2017-05-31 Thread Harald Welte

Hi Cong,

On Tue, May 30, 2017 at 04:18:17PM -0700, Cong Wang wrote:
> On Tue, May 30, 2017 at 3:07 PM, Harald Welte <lafo...@gnumonks.org> wrote:
> > But, to the contrary, this doesn't happen.  The unshare-created netns is
> > gone, but the netdevice did not get moved back to the root namespace
> > either.  The only hack to get back to the "eth0" device is to unload the
> > driver and re-load it.
> 
> 
> Net namespace simply unregisters all netdevices inside when it is
> gone, no matter where they are from. 

ah, ok. I missed that part.  Is there a good piece of documentation on
netwokr namespaces that I should read?

> I am pretty sure you can move it back to root-ns if you want, 

Yes, I can explicitly do that, but this of course doesn't work if e.g.
my [single] process in that namespace crashes due to some bug, OOM or
the like.

> it is a little tricky because you have to give the root-ns a name
> first.

It's actually not, as you can just identify the root-ns by pid 1, so
"ip link set $DEV netns 1" will move it back.  As indicated, I'm worried
about the error paths.

> > What am I missing here?  Is this the intended behavior?
> 
> Yes it is.

thanks for your confirmation.  Guess I have to get used to it.

> > Of course I know I could simply do something like "ip link set eth0
> > netns 1" from within the namespace before leaving.  But what if the
> > process is not bash and the process exits abnormally?   I'd consider
> > that explicit reassignment more like a hack than a proper solution...
> 
> It doesn't make sense to move it back to where it is from, for example,
> what if you move a veth0 from netns1 to netns2 and netns1 is gone
> before netns2?

for virtual devices, I would agree.  For physical devices, I think the
default behavior to unregister them is - from my of course very
subjective point of view - quite questionable.

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

loosing netdevices with namespaces and unshare?

2017-05-30 Thread Harald Welte

Hi all,

I know I'm kind of late to the party in terms of deeper exploration of
Linux network namespaces.  Also, I'm not sure if the netdev list is the
riight place to ask, but a moderate amount of web searching didn't bring
up a solution in multiple hours, and it seems like I could trigger the
kernel (4.11.0) to loose netdevices, which I think is a serious issue.

What I'm doing:
* start a process using the 'unshare' command line tool provided with
  util-linux, e.g. "unshare -nUr bash". I do this as a non-privileged
  user but now that is mapped to uid '0' inside the new
  process/namespace, so I can adjust interface configuration.
* I use "echo $$" to get the PID of that bash process.
* On another terminal in a root shell, I use "ip link set eth0 netns $PID"
  in order to move a given physical device into that namespace.
* I then "exit" that bash, which should - to my knowledge - return the
  "eth0" netdev back to the root namespace, as the bash process was the
  only one using that network namespace

But, to the contrary, this doesn't happen.  The unshare-created netns is
gone, but the netdevice did not get moved back to the root namespace
either.  The only hack to get back to the "eth0" device is to unload the
driver and re-load it.

I can reproduce the above without starting any other process inside that
namespace.  I have verified that there are no /proc/*/ns/net symlinks
left pointing to the ID of that namespace.  What am I missing here?  Is
this the intended behavior?

Of course I know I could simply do something like "ip link set eth0
netns 1" from within the namespace before leaving.  But what if the
process is not bash and the process exits abnormally?   I'd consider
that explicit reassignment more like a hack than a proper solution...

Regards,
Harald

p.s.: In case you're wondering what I'm actually trying to achieve: Find
an easy way to run a single program in an isolated namespace that only
has one physical (usb) ethernet device.  I would like to execute that
program as unprivileged user but still be able to bind to privileged
ports.  And I want to do this using simple command-line tools without
all the bloat and overhead of "container" solutions that have 99% of
features I don't need.  But let that not distract you, I think the
mysteriously disappearing netdevices are a more general and important
issue.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v3 0/2] GTP SGSN-side tunnels

2017-03-24 Thread Harald Welte

Hi Jonas,

looks fine to me, but I haven't tested it.  Did you manually test it
using the extended libgtpnl + tools?

Also, in code like this:

+   if (gtp->role == GTP_ROLE_SGSN) {
+   pctx = ipv4_pdp_find(gtp, iph->saddr);
+   } else {

I think general Linux kernel coding style is to not have curly-brackets
around single-line blocks. See "Do not unnecessarily use braces where a
single statement will do." in line 169 of
Documentation/process/coding-style.rst

I won't mind your current style, and it is not a blocker issue to me,
but still it would be nice for general consistency.

Acked-by: Harald Welte <lafo...@gnumonks.org>

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH 1/1] gtp: support SGSN-side tunnels

2017-03-15 Thread Harald Welte

On Wed, Mar 15, 2017 at 08:10:38PM +0100, Harald Welte wrote:
> I've modified the patch slightly, see below (compile-tested, but not
> otherwise tested yet).  Basically rename the flags attribute to 'role',
> expand the commit log and removed unrelated cosmetic changes.

I also have a version against current net-next/master, in case anyone is 
interested..

>From 3274a3303d1ec997392a07a92666d57b13997658 Mon Sep 17 00:00:00 2001
From: Jonas Bonn <jo...@southpole.se>
Date: Wed, 15 Mar 2017 20:24:28 +0100
Subject: [PATCH] gtp: support SGSN-side tunnels

The GTP-tunnel driver is explicitly GGSN-side as it searches for PDP
contexts based on the incoming packets _destination_ address.  For
real-world use cases, this is sufficient, as the other side of a GTP
tunnel is not in fact implemented by GTP, but by the protocol stacking
of a mobile station / user equipment on the radio interface (like PDCP,
SNDCP).

However, if we want to simulate the mobile station, radio access network
and SGSN (for example to test the GGSN side implementation), then we
want to be identifying PDP contexts based on _source_ address.

This patch adds a "role" attribute at GTP-link creation time to specify
whether we behave like the GGSN or SGSN role of the tunnel; this
attribute is then used to determine which part of the IP packet to use
in determining the PDP context.

Signed-off-by: Jonas Bonn <jo...@southpole.se>
Signed-off-by: Harald Welte <lafo...@gnumonks.org>
---
 drivers/net/gtp.c| 46 +---
 include/uapi/linux/gtp.h |  2 +-
 include/uapi/linux/if_link.h |  7 +++
 3 files changed, 43 insertions(+), 12 deletions(-)

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 3e1854f34420..3ab593b9be85 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -74,6 +74,7 @@ struct gtp_dev {
 
struct net_device   *dev;
 
+   enum ifla_gtp_role  role;
unsigned inthash_size;
struct hlist_head   *tid_hash;
struct hlist_head   *addr_hash;
@@ -154,8 +155,8 @@ static struct pdp_ctx *ipv4_pdp_find(struct gtp_dev *gtp, 
__be32 ms_addr)
return NULL;
 }
 
-static bool gtp_check_src_ms_ipv4(struct sk_buff *skb, struct pdp_ctx *pctx,
- unsigned int hdrlen)
+static bool gtp_check_ms_ipv4(struct sk_buff *skb, struct pdp_ctx *pctx,
+ unsigned int hdrlen, enum ifla_gtp_role role)
 {
struct iphdr *iph;
 
@@ -164,27 +165,31 @@ static bool gtp_check_src_ms_ipv4(struct sk_buff *skb, 
struct pdp_ctx *pctx,
 
iph = (struct iphdr *)(skb->data + hdrlen);
 
-   return iph->saddr == pctx->ms_addr_ip4.s_addr;
+   if (role == GTP_ROLE_SGSN)
+   return iph->daddr == pctx->ms_addr_ip4.s_addr;
+   else
+   return iph->saddr == pctx->ms_addr_ip4.s_addr;
 }
 
 /* Check if the inner IP source address in this packet is assigned to any
  * existing mobile subscriber.
  */
-static bool gtp_check_src_ms(struct sk_buff *skb, struct pdp_ctx *pctx,
-unsigned int hdrlen)
+static bool gtp_check_ms(struct sk_buff *skb, struct pdp_ctx *pctx,
+unsigned int hdrlen, enum ifla_gtp_role role)
 {
switch (ntohs(skb->protocol)) {
case ETH_P_IP:
-   return gtp_check_src_ms_ipv4(skb, pctx, hdrlen);
+   return gtp_check_ms_ipv4(skb, pctx, hdrlen, role);
}
return false;
 }
 
-static int gtp_rx(struct pdp_ctx *pctx, struct sk_buff *skb, unsigned int 
hdrlen)
+static int gtp_rx(struct pdp_ctx *pctx, struct sk_buff *skb, unsigned int 
hdrlen,
+ enum ifla_gtp_role role)
 {
struct pcpu_sw_netstats *stats;
 
-   if (!gtp_check_src_ms(skb, pctx, hdrlen)) {
+   if (!gtp_check_ms(skb, pctx, hdrlen, role)) {
netdev_dbg(pctx->dev, "No PDP ctx for this MS\n");
return 1;
}
@@ -239,7 +244,7 @@ static int gtp0_udp_encap_recv(struct gtp_dev *gtp, struct 
sk_buff *skb)
return 1;
}
 
-   return gtp_rx(pctx, skb, hdrlen);
+   return gtp_rx(pctx, skb, hdrlen, gtp->role);
 }
 
 static int gtp1u_udp_encap_recv(struct gtp_dev *gtp, struct sk_buff *skb)
@@ -281,7 +286,7 @@ static int gtp1u_udp_encap_recv(struct gtp_dev *gtp, struct 
sk_buff *skb)
return 1;
}
 
-   return gtp_rx(pctx, skb, hdrlen);
+   return gtp_rx(pctx, skb, hdrlen, gtp->role);
 }
 
 static void gtp_encap_destroy(struct sock *sk)
@@ -481,7 +486,11 @@ static int gtp_build_skb_ip4(struct sk_buff *skb, struct 
net_device *dev,
 * Prepend PDP header with TEI/TID from PDP ctx.
 */
iph = ip_hdr(skb);
-   pctx = ipv4_pdp_find(gtp, iph->daddr);
+   if (gtp->role == GTP_ROLE_SGSN)
+   pctx = ipv4_pdp_find(gtp, iph->sad

Re: [PATCH 1/1] gtp: support SGSN-side tunnels

2017-03-15 Thread Harald Welte

Hi Pablo,

On Wed, Mar 15, 2017 at 06:23:48PM +0100, Pablo Neira Ayuso wrote:
> On Wed, Mar 15, 2017 at 05:39:16PM +0100, Harald Welte wrote:
> > 
> > I would definitely like to see this move forward, particularly in order
> > to test the GGSN-side code.
> 
> Agreed.

I've modified the patch slightly, see below (compile-tested, but not
otherwise tested yet).  Basically rename the flags attribute to 'role',
expand the commit log and removed unrelated cosmetic changes.

I've also prepared a corresponding change to libgtpnl into the
laforge/sgsn-rol branch, see
http://git.osmocom.org/libgtpnl/commit/?h=laforge/sgsn-role

This is not yet tested in any way, but I'm planning to add some
associated support to the command line tools and then give it some
testing (both against the kernel GTP in GGSN mode, as well as an
independent userspace GTP implementation).

> It would be good if we provide a way to configure GTP via iproute2 for
> testing purposes.

I don't really care about which tool is used, as long as it is easily
available [and FOSS, of course].

> We would need to create some dummy socket from
> kernel too though so we don't need any userspace daemon for this
> testing mode.

I don't really like that latter idea. It sounds too much like a hack to
me.  But then, I don't have enough phantasy right now ti imagine how an
actual implementation would look like.

To me, it is perfectly fine to run a simple, small utility in userspace
even for testing.

Regards,
Harald

>From 63920950f9498069993def78e178bde85c174e0c Mon Sep 17 00:00:00 2001
From: Jonas Bonn <jo...@southpole.se>
Date: Wed, 15 Mar 2017 17:52:28 +0100
Subject: [PATCH] gtp: support SGSN-side tunnels

The GTP-tunnel driver is explicitly GGSN-side as it searches for PDP
contexts based on the incoming packets _destination_ address.  For
real-world use cases, this is sufficient, as the other side of a GTP
tunnel is not in fact implemented by GTP, but by the protocol stacking
of a mobile station / user equipment on the radio interface (like PDCP,
SNDCP).

However, if we want to simulate the mobile station, radio access network
and SGSN (for example to test the GGSN side implementation), then we
want to be identifying PDP contexts based on _source_ address.

This patch adds a "role" attribute at GTP-link creation time to specify
whether we behave like the GGSN or SGSN role of the tunnel; this
attribute is then used to determine which part of the IP packet to use
in determining the PDP context.

Signed-off-by: Jonas Bonn <jo...@southpole.se>
Signed-off-by: Harald Welte <lafo...@gnumonks.org>

diff --git a/drivers/net/gtp.c b/drivers/net/gtp.c
index 99d3df788ce8..9aef4217f6e1 100644
--- a/drivers/net/gtp.c
+++ b/drivers/net/gtp.c
@@ -71,6 +71,7 @@ struct gtp_dev {

struct net_device   *dev;

+   enum ifla_gtp_role  role;
unsigned inthash_size;
struct hlist_head   *tid_hash;
struct hlist_head   *addr_hash;
@@ -149,8 +150,8 @@ static struct pdp_ctx *ipv4_pdp_find(struct gtp_dev *gtp, 
__be32 ms_addr)
return NULL;
 }

-static bool gtp_check_src_ms_ipv4(struct sk_buff *skb, struct pdp_ctx *pctx,
- unsigned int hdrlen)
+static bool gtp_check_ms_ipv4(struct sk_buff *skb, struct pdp_ctx *pctx,
+ unsigned int hdrlen, enum ifla_gtp_role role)
 {
struct iphdr *iph;

@@ -159,18 +160,21 @@ static bool gtp_check_src_ms_ipv4(struct sk_buff *skb, 
struct pdp_ctx *pctx,

iph = (struct iphdr *)(skb->data + hdrlen);

-   return iph->saddr == pctx->ms_addr_ip4.s_addr;
+   if (role == GTP_ROLE_SGSN)
+   return iph->daddr == pctx->ms_addr_ip4.s_addr;
+   else
+   return iph->saddr == pctx->ms_addr_ip4.s_addr;
 }

 /* Check if the inner IP source address in this packet is assigned to any
  * existing mobile subscriber.
  */
-static bool gtp_check_src_ms(struct sk_buff *skb, struct pdp_ctx *pctx,
-unsigned int hdrlen)
+static bool gtp_check_ms(struct sk_buff *skb, struct pdp_ctx *pctx,
+unsigned int hdrlen, enum ifla_gtp_role role)
 {
switch (ntohs(skb->protocol)) {
case ETH_P_IP:
-   return gtp_check_src_ms_ipv4(skb, pctx, hdrlen);
+   return gtp_check_ms_ipv4(skb, pctx, hdrlen, role);
}
return false;
 }
@@ -204,7 +208,7 @@ static int gtp0_udp_encap_recv(struct gtp_dev *gtp, struct 
sk_buff *skb,
goto out_rcu;
}

-   if (!gtp_check_src_ms(skb, pctx, hdrlen)) {
+   if (!gtp_check_ms(skb, pctx, hdrlen, gtp->role)) {
netdev_dbg(gtp->dev, "No PDP ctx for this MS\n");
ret = -1;
goto out_rcu;
@@ -261,7 +265,7 @@ static int gtp1u_udp_encap_recv(struct gtp_dev *gtp

Re: [PATCH 1/1] gtp: support SGSN-side tunnels

2017-03-15 Thread Harald Welte

Hi Jonas,

are you working on the review feedback that was provided back in early
February?  I think there were some comments like
* remove unrelated cosmetic change in comment
* change from FLAGS to a dedicated MODE netlink attribute
* add libgtpnl code and some usage information or even sample scripts

I would definitely like to see this move forward, particularly in order
to test the GGSN-side code.

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH] netfilter: logging copyrights is useless

2017-03-15 Thread Harald Welte

Hi Corentin,

On Wed, Mar 15, 2017 at 02:17:39PM +0100, Corentin Labbe wrote:
> Logging copyrights does not add any useful information in logs.
> This patch remove such logging

Historically, there were plenty of more copyright notices for certain
drivers or sections of the code being printed while booting.  I still
remember fondly the many ethernet driver notices of a Donald Becker, for
example.

I understand that it is questionable whether or not such statements are
"useful".  You might argue, their use is in
* stating a legal formality to the user
* making it simpler to determine if a given part of code is used in a
  given device (e.g. as part of GPL enforcement) while just logging the
  serial console and no requirement to (find a way to) dump the internal
  flash

Besides such practical arguments (it is of what use to whom), there are
legal concerns regarding the removal of copyright statements.  This
holds true on whether or not it is Free Software, or whether or not it
is GPL licensed.  If an author puts a copyright statement somehwere, he
exercises his right to be regarded as the author of the work.  It is
typically not permitted to remove such notices, as that would be a
copyright infringement in itself.

Also, beyond general legal concerns, the GPLv2 states explicitly:

> c) If the modified program normally reads commands interactively
> when run, you must cause it, when started running for such
> interactive use in the most ordinary way, to print or display an
> announcement including an appropriate copyright notice and a notice
> that there is no warranty (or else, saying that you provide a
> warranty) and that users may redistribute the program under these
> conditions, and telling the user how to view a copy of this License.
> (Exception: if the Program itself is interactive but does not normally
> print such an announcement, your work based on the Program is not
> required to print an announcement.)

Now you can argue whether the kernel is a an interactive program, but at
least you can see some intent to not remove any notices/messages that
were originally present in the program.

So I think your patch could only applied if the respective copyright
holders agree to remove their respective notices.

I personally would argue to keep them.  Nobody has complained about them
so far, and they have probably saved many weeks of my work time in GPL
compliance / enforcement work.  I understand this is a "niche use case",
though ;)

-- 
- Harald Welte <lafo...@netfilter.org> http://netfilter.org/

  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."-- Paul Vixie

Re: [PATCH net-next 0/4] gtp: support multiple APN's per GTP endpoint

2017-03-14 Thread Harald Welte

Hi Andreas,

On Tue, Mar 14, 2017 at 02:42:16PM +0100, Pablo Neira Ayuso wrote:
> On Tue, Mar 14, 2017 at 01:42:44PM +0100, Andreas Schultz wrote:
> > The only supported and documented API for the GTP module is libgtpnl.
> 
> No, the netlink interface itself if the API.
> 
> Stopping trying to find a reason to break API, that is a no-go.

As much as one might dislike it as a developer in this particular case,
the Linux kernel has the very well communicated rule: All userspace
visible interfaces must not change in an incompatible way.  This
includes of course all the syscalls, the ioctl() parameters but also the
netlink interfaces of the networking stack.

The statement "nobody ever used it" is a statement you can never make in
FOSS software, as you don't know of 99.999% of all the users of your
software.  The fact that none of the FOSS projects that any of us was
involved in may not have used a certain feature doesn't mean nobody else
has been using it privately, quietly.  Keep in mind that several Linux
distributions have already been shipping the gtp module as part of their
stable releases meanwhile.

Also, no matter what Pablo or I may think about, there are general rules
about how Linux kernel development is done (from coding style to merge
windows, and also userspace compatibility), and we all have to obey
them.  There's little point in discussing about them, we all just have
to live with them.

Regards,
    Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v5 0/7] gtp: misc improvements

2017-03-13 Thread Harald Welte

Hi Pablo,

On Mon, Mar 13, 2017 at 05:48:12PM +0100, Pablo Neira Ayuso wrote:
> I would like you take the time to develop the VRF idea that you want
> to introduce, I just would like that we avoid bloating the GTP tunnel
> with features that we can just achieve via composite of different
> subsystems.

I presume you're talking about the "several APN behind one GGSN IP"
which requires the de-coupling of the UDP socket from the gtp
net-device?

I actually think it is a very straight forward idea what Andreas is
proposing, and I see no bloat associated.  It's rather like splitting
existing combined functionality in two parts, which can still be used
together, but also be used separately.

Or are you referring to something else?

In any case, I'm looking forward to the related technical discussion on
this mailing list[s] :)

Regards,
    Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v5 0/7] gtp: misc improvements

2017-03-11 Thread Harald Welte

Hi Andreas,

> Andreas Schultz (7):
>   gtp: switch from struct socket to struct sock for the GTP sockets
>   gtp: make GTP sockets in gtp_newlink optional
>   gtp: merge gtp_get_net and gtp_genl_find_dev
>   gtp: consolidate gtp socket rx path
>   gtp: unify genl_find_pdp and prepare for per socket lookup
>   gtp: consolidate pdp context destruction into helper
>   gtp: add socket to pdp context


I agree with the conceptual and architectural direction that you're
taking the code, and I also think your current patchset is good to go
ahead, so feel free to add my "Acked-By: Harald Welte <lafo...@gnumonks.org>".

However, the usual disclaimer: I've been out of kernel networking land
for a long time, so my review skills are clearly not the best anymore.
I'm happy for any input others can give from that point of view.

There are some minor typos in the commit logs, but at the rate my typos
increase over the years, I am the last one to complain about wasting
time on fixing these ;)

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Basic test setup for testing of Kernel GTP-U

2017-02-24 Thread Harald Welte

Hi all,

[intentionally breaking the thread here]

On Thu, Feb 23, 2017 at 05:46:57PM +0100, Harald Welte wrote:
> I'll try to cook up some instructions extending
> https://osmocom.org/projects/openggsn/wiki/OpenGGSN to cover also
> sgsnemu for a basic use case of establishing one single tunnel.  That's
> of course like a manual "HOWTO" and not yet anything that can be tested
> automatically.

I've documented the instructions at
https://osmocom.org/projects/linux-kernel-gtp-u/wiki/Basic_Testing

Please let me know any updates/corrections/questions if you try to
reproduce this.  The above instructions were working for me yesterday.

Please find an ASCII export of this below (much less readable than the
wiki).

Regards,
Harald

h1. Basic Testing

This page documents some basic testing setup for the Kenrel GTP-U code. It 
follows the below rationale:
* focus on testing the kernel GTP-U module without too much external 
dependencies
* test GTP-U interoperability of the kernel with at least one other 
implementation, not just kernel-to-kernel (which currently is not supported in 
the kernel, as it only implements the GGSN/P-GW role)
* limit testing to SGSN/S-GW and GGSN/P-GW, without a real cellular network 
(which is possible e.g. using [[OsmoSGSN:]] and [[OsmoPCU:]])

h2. Building / Installing dependencies

In order to follow below test instructions, you will need
* A Linux kernel including the GTP-U driver (@drivers/net/gtp.c@) either 
compiled-in or as kernel module
* [[libgtpnl]] - the userspace library providing an API around the kernel GTP-U 
netlink interface
* [[OpenGGSN:]] - a minimal C-language implementation of a 3GPP GGSN, also 
contains a SGSN-side emulator called [[OpenGGSN:sgsnemu]]
** make sure you use Version 0.93 or later
** make sure you compile it with @--enable-gtp-linux@ enable during the 
@./configure@ step

You can find some instructions on how to build [[OpenGGSN:]] with support for 
[[libgtpnl]] and kernel GTP-U at this wiki page: [[OpenGGSN:Kernel_GTP]]

h2. Test setup description

We will run the GGSN natively on the host, and put the emulated SGSN inside a 
separate network namespace.

The two namespaces are interconnected by a virtual ethernet device using the 
transfer network 172.31.1.0/24

The GGSN is configured to provide a pool of IP addresses from the 
192.168.71.0/24 range.  Each PDP context will be allocated one dynamic address 
from that pool

h2. Test instructions

h3. create the network namespace for the SGSN

 ip netns add sgsn

h3. add veth to be used between SGSN and GGSN

 ip link add veth0 type veth peer name veth1

h3. remote (SGSN) side of veth device

ip link set veth1 netns sgsn
ip netns exec sgsn ip addr add 172.31.1.2/24 dev veth1
ip netns exec sgsn ip link set veth1 up

h3. local (GGSN) side of veth device

ip addr add  172.31.1.1/24 dev veth0
ip link set veth0 up

h3. execute the GGSN on the host

 ggsn -g -c ./ggsn.conf.test

(use the file attached to this wiki page)

The @-g@ option is responsible for activating kernel-GTP support. If you cannot 
get the example described in this document to work, try the pure GTP-U 
userspace implementation by removing @-g@ from the above command line.  If it 
works then, there's something related to Kernel GTP-U that breaks the setup.

h3. execute the emulated SGSN inside the sgsn namespace

 ip netns exec sgsn sgsnemu -d -r 172.31.1.1 -l 172.31.1.2 --defaultroute 
--createif

h3. verify the existnace of the GTP tunnel

ggsn:~# gtp-tunnel list
version 1 tei 1/1 ms_addr 192.168.71.2 sgsn_addr 172.31.1.2

h3. further testing

in the @sgsn@ namespace, there's now a default-route that points into the GTP 
tunnel. You can use this to ping any network address that's reachable to the 
GGSN host.  If that host is connected to the internet, you can e.g. run a ping 
command from within the namespace using 

 ip netns exec sgsn ping -c 10 8.8.8.8

which will send some IP packets to 8.8.8.8 via the tun0 device (created by 
[[OpenGGSN:sgsnemu]]). It will be encapsulated by the userspace GTP-U 
implementation of sgsnemu, sent via the veth device to the host, where it ends 
up inthe GTP-U kernel module, decapsulating the package and passing in on to 
the gtp0 device there.  Anything beyond that point depends on your local 
routing configuration.

== Building OpenGGSN ==

h1. OpenGGSN and Linux Kernel accelerated GTP-U

OpenGGSN has support to use the Linux kernel GTP-U code to accelerate the 
data/user plane while still implementing the control plane (GTP-C) in userspace 
in OpenGGSN.

For more information about the Linux kernel GTP-U code, please see 
[[linux-kernel-gtp-u:]]

h2. Building OpenGGSN with kernel which has GTP-U support

At the time of writing (2017-01-09) of this wiki, below listed distributions 
have support of GTP kernels :

* Debian 4.9.1-1~exp1 - Debian's latest experimental build 
* Ubuntu 16.10 kernel 4.8
* OpenSUSE Tumbleweed (stable) 4.8 (.12)
* Fedora 25 kernel 4.8 

Fol

Re: [PATCH net-next v4 4/7] gtp: consolidate gtp socket rx path

2017-02-23 Thread Harald Welte

Hi Tom,

On Thu, Feb 23, 2017 at 10:07:03AM -0800, Tom Herbert wrote:
> I'm looking at the GTP encapsulation, it's not particularly complex.
> Is there any real reason why we can't just implement a netdev
> interface with static tunnels and configuration to do performance
> testing and development? For instance, if we want to add GSO/GRO
> support to GTP that's all we really need, the control plane should be
> inconsequential in that case.

As outlined several times in this thread, GTP tunneling is not
symmetric.  The current mainline code can only act as one of the two
roles (GGSN/G-GW), not as the other one.  The rationale is that in 3GPP
networks, that is the only point in the network that takes native IP
data (e.g.from the internet) and puts it into GTP.  At all other places
in the network, you don't have this combination.  By the time the inner
IP data arrives at the mobile phone, it is no longer encapsulated in
GTP, as the lower layers have been adapted one or multiple times to
other protocols by other network elements.

There's a patch that has recently been submitted which adds the
capability for the SGSN/S-GW side of a GTP-U tunnel, but that patch has
so far not been merged (due to concersn about its netlink interface),
and the patch *only* exists for testing, it has no real-world
application in 3GPP networks.

So as of now, it is not possible to run both endpoints of a tunnel in
Linux.

> > For performance testing, you would need a SGSN-side implementation of
> > I'll try to cook up some instructions extending
> > https://osmocom.org/projects/openggsn/wiki/OpenGGSN to cover also
> > sgsnemu for a basic use case of establishing one single tunnel.  That's
> > of course like a manual "HOWTO" and not yet anything that can be tested
> > automatically.
> >
> That would be good. Thanks!

I've spent some hours earlier today on this, I expect the document to be
ready at some point over the weekend.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next] net/gtp: Add udp source port generation according to flow hash

2017-02-23 Thread Harald Welte

Hi Tom,

On Thu, Feb 23, 2017 at 08:35:13AM -0800, Tom Herbert wrote:
> On Thu, Feb 23, 2017 at 6:00 AM, Pablo Neira Ayuso <pa...@netfilter.org> 
> wrote:
> > On Thu, Feb 23, 2017 at 10:35:56AM +0100, Andreas Schultz wrote:
> >> - On Feb 22, 2017, at 10:47 PM, Tom Herbert t...@herbertland.com wrote:
> >> So in a complete 3GPP node (GGSN, P-GW) that uses the GTP tunnel
> >> implementation, malicious traffic should be blocked before it can reach
> >> the tunnel.
> >>
> >> And as I stated before, the GTP tunnel module is not supposed to be
> >> use without any of those components. So the DDOS concern should not
> >> be handled at the tunnel level.
> >
> > I think that Tom's point is that this tunnel driver will have to deal
> > with DDOS scenarios anyway, because reality is that you can't always
> > block it before it can reach the tunnel.
> >
> Right, if only we had a dime for every time someone thought their
> perimeter security was sufficient only to be proven wrong!

Yes and no.  Welcome to the cellular world. Everything is designed in
a way that it assumes everyone can be trusted and that none of those
interfaces (except the radio interface) are ever exposed to the public.

There is really very little one can do to change that world.  It's like
the Internet in the early 1990ies, and they are reluctant to learn.  And
whenever a new system is designed (like the step from MAP to DIAMETER),
they make damn sure that all the security issues are inherited from the
previous standards to ensure interoperability ;)

I understand and support the motivation to design robust systsems even
in the presence of broken/ignorant specs, but I think this is one of the
situations where it is useless (and IMHO impossible) to do anything
about it.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v4 4/7] gtp: consolidate gtp socket rx path

2017-02-23 Thread Harald Welte

Hi Tom,

On Thu, Feb 23, 2017 at 08:28:47AM -0800, Tom Herbert wrote:

> If there is a way for us to test this without setting up a full mobile
> network please provide details on how to do that in the commit log.

There are different ways how to do this.  With the existing in-kernel
code (that lacks the "SGSN role" patch which would be for testing only),
you can e.g. use two relatively small C-language programs from the
openggnsn package [http://git.osmocom.org/openggsn/]:

* OpenGGSN with support for libgtpnl and the kernel GTP-U module
* sgsnemu (included in openggsn source tree) which implements a minimal
  SGSN-side of the tunnel.  It will
  * perform the GTP-C signaling required with OpenGGSN (which in turn
will then instruct the kernel to open a  tunnel via the netlink
interface).
  * start a tun/tap device for the "mobile station end" of the tunnel
perform en/decapsulation of data between that tun/tap device and GTP
in userspace

The wiki at https://osmocom.org/projects/openggsn/wiki/OpenGGSN contains
step-by-step instructions how to build and configure OpenGGSN with
support for kernel GTP-U. It's nothing more than autotools based
compile+install of libgtpnl followed by "./configure --enable-gtp-linux"
and "make" for OpenGGSN 

What is not described on the above page is how to use sgsnemu to
simulate the SGSN side, as within Osmocom we typically run the open
source OsmoSGSN (a more "real" SGSN).

So using two small binaries that can be compiled without much external
dependencies (and very few lines of configuration), it is possible to do
some functional testing of the kernel GTP module.  For performance
testing this of course won't work, as sgsnemu is running entirely in
userspace.

For performance testing, you would need a SGSN-side implementation of
GTP-U that performs equally well or better than the GGSN-side
implementation in the kernel.  One option is the patch that has recently
been submitted to netdev and which is under discussion.  However, then
you simply test one implementation against itself, which provides no
interoperability guarantees with other implementations, and thus also
limited in different regards.

> > Adding static tunnel support into the kernel module in any form
> > makes IMHO no sense. GTP as defined by 3GPP always need a control
> > instance and there are much better options for static tunnel
> > encapsulations.
> >
> That misses the point. From the kernel point of view GTP is just
> another encapsulation protocol and we now have a lot of experience
> with those. The problem is when you post patches to improve it or fix
> issues, if there is no practical way to perform independent  analysis
> or investigation. This makes GTP no different than a proprietary
> protocol to us and that severely limits the value of trying to work
> with the community.

I wholeheartedly agree.  That's one of the reasons why I recently posted
a RFC about what to do for (regression and other) testing of the kernel
GTP-U module.

I'll try to cook up some instructions extending
https://osmocom.org/projects/openggsn/wiki/OpenGGSN to cover also
sgsnemu for a basic use case of establishing one single tunnel.  That's
of course like a manual "HOWTO" and not yet anything that can be tested
automatically.

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v3 1/8] gtp: add documentation

2017-02-22 Thread Harald Welte

ever I believe there
> is nothing that would stop a SGSN/S-GW from switching GPT-U tunnel source
> transparently across IP's (for example a system with multiple shelves might
> use different shelve in DL/UL direction and have therefore multiple IP's.

yes, it might.  But then, on the other hand - at least for what I know
about 2G/3G - it is typically very cumbersome to announce IP addresses
to roaming partners.  It involves manual updating of related GSMA
[paper?] forms, exchanging them with all roaming partners, and
coordinating a change with them.  AFAIK, it is not possible to announce
'this is my range of S-GW IP addreses', but you have to individually
manually announce each of them in the related documents.   There are
projects out there where people are building additional layers of
address translation just to work around those long and error-prone
manual procedures, making sure that the externally visible IP addresses
of a given SGSN/GGSN are always the same ones.

So yes, all of the above are practical/procedural concerns, but they
would be explanations why this scenario - even if it might be supported
by the specs - is unlikely in practise, at least within current/past
procedures and practises.

Finally, from the point of view of packet filtering (which operators
typically implement, otherwise the manual announcement of SGSN IP
addresses by the above procedures wouldn't be required), it would be
impossible to do state based filtering if there were use cases where an
random IP address colud send valid traffic for a given tunnel.

> > I find it somewhat surprising, given how much this opens the door for
> > arbitrary spoofing from anyone (with access to the respective private
> > network such as GRX).
> 
> Yes it is, but it is mandated without a doubt by the specification for
> GTP-C. For the reasons outline before, I think that this applies to
> GTP-U as well.

For GTP-U it may be the case.  I'm not entirely convinced, though.
Also, even if the specifications wanted to support such scenarios, I
think it is doubtful that this is actually implemented in practise.

If it was my choice, and I had to support the "loose matching", I would
make it a configuration option (sysctl? netlink attribute?) and default
to the more strict matching, including the source address.  It just
seems to make much more sense and be more safe/sane.  For those people
who really need the loose matching (and who are willing to pay the price
for it in terms of opening up to spoofing and making packet filtering
harder), they can set the option.

But then, I'm not the one doing that implementation, so it is up to you.
My opinion is just an opinion.

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v3 1/8] gtp: add documentation

2017-02-20 Thread Harald Welte

Hi Andreas,

this is not really about the documentation, but it still makes sense to
discuss here as the issue is best described:

On Mon, Feb 13, 2017 at 04:36:17PM +0100, Andreas Schultz wrote:
> +Local GTP-U entity and tunnel identification
> +
> +
> +GTP-U uses UDP for transporting PDU's. The receiving UDP port is 2152 for
> +GTPv1-U and 3386 for GTPv0-U.
> +
> +There is only one GTP-U entity (and therefor SGSN/GGSN/S-GW/PDN-GW instance)
> +per IP address. Tunnel Endpoint Identifier (TEID) are unique per GTP-U 
> entity.
> +
> +A specific tunnel is only defined by the destination entity. Since the
> +destination port is constant, only the destination IP and TEID define
> +a tunnel. The source IP and Port have no meaning for the tunnel.

Are you absolutely sure about this?

> +[3GPP TS 29.281] Section 4.3.0 defines this so:
> +
> +> The TEID in the GTP-U header is used to de-multiplex traffic incoming from
> +> remote tunnel endpoints so that it is delivered to the User plane entities
> +> in a way that allows multiplexing of different users, different packet
> +> protocols and different QoS levels. Therefore no two remote GTP-U endpoints
> +> shall send traffic to a GTP-U protocol entity using the same TEID value 
> except
> +> for data forwarding as part of mobility procedures.
> +
> +The definition above only defines that two remote GTP-U endpoints *should 
> not*
> +send to the same TEID, it *does not* forbid or exclude such a scenario. In
> +fact, the mentioned mobility procedures make it necessary that the GTP-U 
> entity
> +accepts traffic for TEID's from multiple or unknown peers.

I so far always assumed that you use GTP-C "MODIFY PDP CONTEXT" in case
of such mobility situations, i.e. the control plane explicitly notifies
the GTP-U entity of a change in the SGSN/S-GW address.

At least for 3G this seems to be the case, and my assumption appears to
be confirmed with sources such as e.g. page 15 of
http://www.nwadmin.de/presentation_mobility_manag ement_in_UMTS.pdf

Also, for LTE, it seems that there's a GTPv2-C "Modify Bearer
Request/Response" involved in relocation from one S-GW to another S-GW:
http://www.lteandbeyond.com/2012/03/x2-based-handover-with-sgw-relocation.html

> Step 3. The target Serving GW assigns addresses and TEIDs (one per
> bearer) for downlink traffic from the PDN GW. The Serving GW allocates
> DL TEIDs on S5/S8 even for non-accepted bearers. It sends a Modify
> Bearer Request (Serving GW addresses for user plane and TEID(s))
> message per PDN connection to the PDN GW(s). The SGW also includes
> User Location Information IE and/or UE Time Zone IE if it is present
> in step 2. The PDN GW updates its context field and returns a Modify
> Bearer Response (Charging Id, MSISDN, etc.) message to the Serving GW.
> The MSISDN is included if the PDN GW has it stored in its UE context.
> The PDN GW starts sending downlink packets to the target GW using the
> newly received address and TEIDs. These downlink packets will use the
> new downlink path via the target Serving GW to the target eNodeB. The
> Serving GW shall allocate TEIDs for the failed bearers and inform to
> the MME.

Section 5.5.1.1.3 of TS 23.401 agrees with the GTPv2C signalling during
relocation, AFAICT.

> +Therefor the receiving side only identifies tunnels based on TEID's, not 
> based
> +on the source IP.

I'm wondering if this really is neccesary. Do you have actual protocol
traces, specs or any other literature confirming this?  

I find it somewhat surprising, given how much this opens the door for
arbitrary spoofing from anyone (with access to the respective private
network such as GRX).

So in which situations specifically will thre be a S-GW side Address
change without associated GTP-C signaling informing the P-GW about the
new S-GW side Address + TEID?

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

[PATCH net-next] GTP: Add some basic documentation about drivers/net/gtp.c

2017-02-18 Thread Harald Welte

In order to clarify what the module actually does, and how to use it,
let's add some basic documentation to the kernel tree, together with
pointers to related specs and projects.

Signed-off-by: Harald Welte <lafo...@gnumonks.org>
---
 Documentation/networking/gtp.txt | 135 +++
 1 file changed, 135 insertions(+)
 create mode 100644 Documentation/networking/gtp.txt

diff --git a/Documentation/networking/gtp.txt b/Documentation/networking/gtp.txt
new file mode 100644
index ..93e96750f103
--- /dev/null
+++ b/Documentation/networking/gtp.txt
@@ -0,0 +1,135 @@
+The Linux kernel GTP tunneling module
+==
+Documentation by Harald Welte <lafo...@gnumonks.org>
+
+In 'drivers/net/gtp.c' you are finding a kernel-level implementation
+of a GTP tunnel endpoint.
+
+== What is GTP ==
+
+GTP is the Generic Tunnel Protocol, which is a 3GPP protocol used for
+tunneling User-IP payload between a mobile station (phone, modem)
+and the interconnection between an external packet data network (such
+as the internet).
+
+So when you start a 'data connection' from your mobile phone, the
+phone will use the control plane to signal for the establishment of
+such a tunnel between that external data network and the phone.  The
+tunnel endpoints thus reside on the phone and in the gateway.  All
+intermediate nodes just transport the encapsulated packet.
+
+The phone itself does not implement GTP but uses some other
+technology-dependent protocol stack for transmitting the user IP
+payload, such as LLC/SNDCP/RLC/MAC.
+
+At some network element inside the cellular operator infrastructure
+(SGSN in case of GPRS/EGPRS or classic UMTS, hNodeB in case of a 3G
+femtocell, eNodeB in case of 4G/LTE), the cellular protocol stacking
+is translated into GTP *without breaking the end-to-end tunnel*.  So
+intermediate nodes just perform some specific relay function.
+
+At some point the GTP packet ends up on the so-called GGSN (GSM/UMTS)
+or P-GW (LTE), which terminates the tunnel, decapsulates the packet
+and forwards it onto an external packet data network.  This can be
+public internet, but can also be any private IP network (or even
+theoretically some non-IP network like X.25).
+
+You can find the protocol specification in 3GPP TS 29.060, available
+publicly via the 3GPP website at http://www.3gpp.org/DynaReport/29060.htm
+
+A direct PDF link to v13.6.0 is provided for convenience below:
+http://www.etsi.org/deliver/etsi_ts/129000_129099/129060/13.06.00_60/ts_129060v130600p.pdf
+
+== The Linux GTP tunnelling module ==
+
+The module implements the function of a tunnel endpoint, i.e. it is
+able to decapsulate tunneled IP packets in the uplink originated by
+the phone, and encapsulate raw IP packets received from the external
+packet network in downlink towards the phone.
+
+It *only* implements the so-called 'user plane', carrying the User-IP
+payload, called GTP-U.  It does not implement the 'control plane',
+which is a signaling protocol used for establishment and teardown of
+GTP tunnels (GTP-C).
+
+So in order to have a working GGSN/P-GW setup, you will need a
+userspace program that implements the GTP-C protocol and which then
+uses the netlink interface provided by the GTP-U module in the kernel
+to configure the kernel module.
+
+This split architecture follows the tunneling modules of other
+protocols, e.g. PPPoE or L2TP, where you also run a userspace daemon
+to handle the tunnel establishment, authentication etc. and only the
+data plane is accelerated inside the kernel.
+
+Don't be confused by terminology:  The GTP User Plane goes through
+kernel accelerated path, while the GTP Control Plane goes to
+Userspace :)
+
+The official homepge of the module is at
+https://osmocom.org/projects/linux-kernel-gtp-u/wiki
+
+== Userspace Programs with Linux Kernel GTP-U support ==
+
+At the time of this writing, there are at least two Free Software
+implementations that implement GTP-C and can use the netlink interface
+to make use of the Linux kernel GTP-U support:
+
+* OpenGGSN (classic 2G/3G GGSN in C):
+  https://osmocom.org/projects/openggsn/wiki/OpenGGSN
+
+* ergw (GGSN + P-GW in Erlang):
+  https://github.com/travelping/ergw
+
+== Userspace Library / Command Line Utilities ==
+
+There is a userspace library called 'libgtpnl' which is based on
+libmnl and which implements a C-language API towards the netlink
+interface provided by the Kernel GTP module:
+
+http://git.osmocom.org/libgtpnl/
+
+== Protocol Versions ==
+
+There are two different versions of GTP-U: v0 and v1.  Both are
+implemented in the Kernel GTP module.  Version 0 is a legacy version,
+and deprecated from recent 3GPP specifications.
+
+There are three versions of GTP-C: v0, v1, and v2.  As the kernel
+doesn't implement GTP-C, we don't have to worry about this.  It's the
+responsibility of the control plane implementation in userspace t

Re: RFC: unit tests for kernel GTP module

2017-02-17 Thread Harald Welte

Hi Andreas,

On Fri, Feb 17, 2017 at 04:02:59PM +0100, Andreas Schultz wrote:
> The test suite is proprietary, so we can only share the results but
> not the test setup itself.

it would be great to have some CI setup where both current stable as
well as a development branch of the code is tested, and reports
published regularly.  Not sure how realistic that is.

> We more or less removed static GTP tunnels. The tools in libgtpnl
> only work when an application is keeping the GTP socket alive.

Well, then those tools need to be adapted accordingly.  That's what I
also indicated in other mails:  Changes in the kernel GTP code should
always be followed-up with correspondign changes in libgtpnl and
associated tools.  I mean, the tool could just open the socket and then
continue to run until terminated explicitly...

I think there's a lot of value in some very low-level tools for testing
and experimentation, without the complexity of configuring + running an
Erlang GGSN/P-GW with all its dependencies.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

RFC: unit tests for kernel GTP module

2017-02-16 Thread Harald Welte

Dear GTP-interested folks,

I would love to somehow get towards some degree of unit testing (or even
"continuous integration") for teh kernel GTP code.

We currently have the original code in the kernel, we had some recent
small fixes and now are getting more patches into place.  With
relatively few active users out there (and probably none of them in
production yet), it's particularly easy to introduce regressions while
working on the code.  Also, having tested new code even against a
test set with limited covrage could help to get more confidence in new
patches and thus get them merged sooner.

Using tools like sgsnemu of OpenGGSN and the command line tools included
in libgtpnl, it should be possibel to cook up some scripts for testing.
Even the most basic set of tests would be an improvement over what we
have now.  One could also think about pcap replay to test with
hand-crafted or real-world packets from other GTP implementations.

As much as I'd like to put something like this into place myself, I
don't think I will be able to work much on this in the near future. The
GTP module at this point is a pure hobby and contrary to some years ago
while I started it, I don't have any contract work in the GTP area at
this point, so other projects currently unfortunately get more
attention.

So in case somebody among the GTP-interested parties (Travelping, OAI,
...) would want to do something in terms of testing, I'd be more than
happy if somebody would step ahead.  Otherwise it's all just vapourware
going to end up on my ever-growing TODO list :/

Also, if netdev folks have some ideas/pointers about possible
frameworks/tools for this kind of testing [it must exist for at least
some other kernel networking code?]: Please let me know.  I'd be
interested to have a look if there's something that can be used as a
basis (starting network namespaces, sending/transmitting packets, test
case startup/teardown, ...)

My "old school" approach would have been to start one or multiple
user-mode-linux kernels (those that are to be tested), and then have
scripts that set up a gtp socket and gtp tunnels via the libgtp command
line tools, and throw packets at that.   But I'm sure there must be
quite powerful frameworks for that kind of testing in the 21st century?
How do other tunneling implementations handle this?

Regards,
    Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH 1/1] gtp: support SGSN-side tunnels

2017-02-13 Thread Harald Welte

Hi Andreas, Pablo, Jonas,

I think that the SGSN/GGSN role flag (or whatever it may end up being
called) logically belongs in the gtp-device at this point, and will in
the future belong to the UDP/GTP-socket (with Andreas' proposed
changes).   Having it per-pdp-context indeed seems odd and just provide
a way to create broken configurations (and increase the memory use per
pdp context, of which you have many more than netdevs or gtp-sockets).

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH 1/1] gtp: support SGSN-side tunnels

2017-02-06 Thread Harald Welte

Hi Jonas,

On Mon, Feb 06, 2017 at 02:33:07PM +0100, Jonas Bonn wrote:
> Fair enough.  The use-case I am looking at involves PGW load-testing where
> the simulated load is generated locally on the SGSN so it _is_ seeing IP
> packets and the SNDCP is left out altogether.  

Ok, it would have been useful to document that test-only feature in the
changelog and/or code.  Like "support simulated RAN-side tunnels" or
"support SGSN/S-GW simulation".

> Perhaps this is too pathological to warrant messing with the upstream
> driver... I don't know: the symmetry does not cost much even if it's
> of limited use.

There are plenty of features in the mainline kernel related to testing,
see pktgen for example.  So I think if it doesn't impose complexity,
performance issues or stretches the existing architecture, I think
there's no reason to keep it out.

Looking at the code, I think the one conditional on the flags is not
going to kill significant performance of the "normal" use case.  But
that's of course just guessing, without any benchmark to back that up.

Semantically, I'm not sure if the FLAGS and the re-use of the
SGSN_ADDRESS TLV is the best choice.  If suddenly the meaning of the TLV
is "Peer GSN Address" then it should be called that way.  We could have
a #define SGSN_ADDRESS to GSN_PEER_ADDRESS to make old code compile.
I'll let Pablo respond to this as he came up with the netlink interface,
as far as I can remember :)

Also, like with any changes to the kernel and netlink interface code, I
think we should always mandate similar changes to be made to libgtpnl so
the feature can actually be used/tested with the standard
tools/utilities available to anyone.

> Couldn't the SNDCP theoretically be a separate node and push IP packets to
> the SGSN, thus making this useful?  Perhaps it's a stretch...

No, because you would introduce an hop (or even two!) at the IP level,
breaking
* the notion of who the remote IP address is (remote poin-to-point address)
  of the PDP context
* packets get modified (TTL decrement, ...) where they are not supposed to
* you suddenly might get TTL exceeded, dest unreachable, ...) out of
  nowhere into your user IP
* you introduce serious security issues by having the kernel IP routing
  code between the outer IP (the operator RAN/core network) and the
  inner user IP payload.

Regards,
Harald
-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v2 3/6] gtp: unify genl_find_pdp and prepare for per socket lookup

2017-02-06 Thread Harald Welte

On Mon, Jan 30, 2017 at 05:37:10PM +0100, Andreas Schultz wrote:
> Signed-off-by: Andreas Schultz <aschu...@tpip.net>

Acked-by: Harald Welte <lafo...@gnumonks.org>

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v2 5/6] gtp: add socket to pdp context

2017-02-06 Thread Harald Welte

Dear All,

On Thu, Feb 02, 2017 at 04:07:23PM +0100, Andreas Schultz wrote:
> - On Feb 2, 2017, at 3:46 PM, pablo pa...@netfilter.org wrote:
> > On Thu, Feb 02, 2017 at 03:38:07PM +0100, Andreas Schultz wrote:
> >> - On Feb 2, 2017, at 3:28 PM, pablo pa...@netfilter.org wrote:
> >> > I suggest you just call kfree_rcu() from 4/6.
> >> > 
> >> > Regarding holding socket reference, see my comment for patch 1/6.
> >> 
> >> This is going to be a problem at this stage of the changes.
> >> 
> >> The final goal is to have a reference from the socket to the pdp context.
> > 
> > Is this just a cleanup? Or you need this sk caching for some follow up
> > work?
> 
> It's not caching, the plan is to completely remove the socket from the
> GTP netdevice (as far as that is possible without breaking the existing API).

I agree this is the way to go.  When I originally thought about the GTP
kernel tunneling module early on, I was not aware of the fact that
operators actually in practise run multiple "virtual GGSNs" on one IP
address/port.  From a pure technical point of view you would say "why
bother"?  They could just use separate IP addresses for each of them.
However, the reailty is that each new IP address that an operator uses
for a GGSN results in paper forms required to be exchanged between this
operator and all his roming partners, followed-up by manual
re-configuration of the policies on all of those roaming partners.  This
is time-consuming and error-prone, but hey, it's how the procedures
between GSMA members seem to work ;)

> A GGSN or PGW can serve multiple APN's on the same GTP-U socket. Those APN's
> can have overlapping IP address ranges. The only sensible way to handle
> this, is to have a netdevice per APN. This breaks the current 1:1 relation
> between sockets and netdevices.

Indeed.  So the question is how to do this best and how to keep
backwards compatibility of the netlink interface.  I don't claim to have
answers to that, sorry.

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v2 0/6] gtp: misc improvements

2017-02-06 Thread Harald Welte

Hi Dave and List,

On Tue, Jan 31, 2017 at 01:05:43PM -0500, David Miller wrote:
> Please someone with GTP knowledge properly review this series.

I am happy to provide review, but I was travelling and my time to work
on things outside my dayjob is typically quite limited.  So I'd like to
ask for a bit more patience for patch review from me.  Thanks for your
understanding.

Regards,
Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v2 1/6] gtp: make GTP sockets in gtp_newlink optional

2017-02-06 Thread Harald Welte

Hi Andreas,

my kernel coding skills are getting a bit rusty (no pun intended), and
I'll think others on this list are more capable to do so.  But let me at
least provide feedback from the "3GPP / GTP side":

On Mon, Jan 30, 2017 at 05:37:08PM +0100, Andreas Schultz wrote:
> Having both GTPv0-U and GTPv1-U is not always desirable.
> Fallback from GTPv1-U to GTPv0-U was depreciated from 3GPP
> Rel-8 onwards. Post Rel-8 implementation are discuraged
> from listening on the v0 port (see 3GPP TS 29.281, Sect. 1).

I confirm this and I think the related change should be applied.

> A future change will completely decouple the sockets from the
> network device. Till then, at least one of the sockets needs to
> be specified (either v0 or v1), the other is optional.

Makes sense.

-- 
- Harald Welte <lafo...@netfilter.org> http://netfilter.org/

  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."-- Paul Vixie

Re: [PATCH net-next v2 2/6] gtp: merge gtp_get_net and gtp_genl_find_dev

2017-02-06 Thread Harald Welte

Hi Andreas,

On Mon, Jan 30, 2017 at 05:37:09PM +0100, Andreas Schultz wrote:
> Both function are always used together with the final goal to
> get the gtp_dev. This simplifies the code by merging them together.

Ok, some code restructuring / unification, seems useful.

However:

> - netdev_dbg(dev, "GTPv0-U: update tunnel id = %llx (pdp 
> %p)\n",
> -pctx->u.v0.tid, pctx);
> + pr_debug("GTPv0-U: update tunnel id = %llx (pdp %p)\n",
> +  pctx->u.v0.tid, pctx);

(and other related changes) appear to be purely cosmetic and should thus
be unrelated to the function merging described in the change log
message.

-- 
- Harald Welte <lafo...@netfilter.org> http://netfilter.org/

  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."-- Paul Vixie

Re: [PATCH 1/1] gtp: support SGSN-side tunnels

2017-02-06 Thread Harald Welte

Dear Jonas,

On Fri, Feb 03, 2017 at 10:12:31AM +0100, Jonas Bonn wrote:
> The GTP-tunnel driver is explicitly GGSN-side as it searches for PDP
> contexts based on the incoming packets _destination_ address.  If we
> want to write an SGSN, then we want to be idenityfing PDP contexts
> based on _source_ address.

A SGSN should never see the "native" User-IP payload.  It either has a
Gb interface towards the BSS which has a BSSGP/NS/UDP/IP protocol
stacking (for GERAN) or an IuUP or GTP stacking (for UTRAN).

The user-ip tunnel (PDP context) exists between the mobile station and
the GGSN.  Any intermediate nodes (BTS, BSC, NodeB, RNC, SGSN, ...) do
not appear as intermediate IP Layer nodes in that User IP tunnel, but
only serve to transparently pass the user-ip inside that tunnel between
the two tunnel end-points.

As such, I am very surprised by your patch.  Exposing the User IP to the
Linux network stack in the SGSN seems to be a severe layering violation
and contradict everything I know about 3GPP network architecture.  But
maybe I'm missing something here? Please explain.

The only SGSN-level user plane acceleration that I can think of is
quite different:

For an UMTS SGSN that only supports IuPS, and only supports IuPS over
IP (which is a sub-class of a sub-class of all use cases), what would
make sense is some Kernel-level support to map from one GTP
socket/tunnel to another GTP socket/tunnel based on the TEIDs.  So
basically you have a GTP tunnel on the RAN side and another GTP tunnel
on the CN side, without any decapsultaion.

The TEIDs on both sides *could* be identical, or *cold* be separate, as
a matter of implementation policy.  

The IP addresses /port numbers on both sides will in almost all
non-laboratory use cases be separate, as an operator typically doesn't
want a transparent/routed IP network between the RAN and Core Network (CN).

So the GTP tunnels between RNC/hNodeB/heNodeB on the RAN side get mapped
1:1 to GTP tunnels between SGSN and GGSN on the CN side.  However, as no
encapsulation/decapsulation is performed, this is outside of the scope
of the current kernel GTP tunneling module.  Rather, it's more something
similar to static NAT between two pairs of addresses.

Regards,
    Harald

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH net-next v2 4/6] gtp: consolidate pdp context destruction into helper

2017-02-06 Thread Harald Welte

Hi Andreas,

On Mon, Jan 30, 2017 at 05:37:11PM +0100, Andreas Schultz wrote:
> Consolidate duplicate code into helper.

Makes complete sense.

However:

> +static void pdp_context_delete(struct pdp_ctx *pctx);
> +

why introduce the forward-declaration and then define the function
further down in the file?  I think in general, it is preferred to put
helper functions on top, before their first use, so forward declarations
can be avoided?  Particularly if the helper function doesn't call any
other functions that are not yet declared at this point.

Please note the above might just be my personal taste, not sure if
that's a general habit in the kernel networking code these days.

So with or without the re-ordering:

Acked-by: Harald Welte <lafo...@gnumonks.org>

-- 
- Harald Welte <lafo...@gnumonks.org>   http://laforge.gnumonks.org/

"Privacy in residential applications is a desirable marketing option."
  (ETSI EN 300 175-7 Ch. A6)

Re: [PATCH 0/5] simple gtp improvements

2017-01-24 Thread Harald Welte

Hi Andreas,

I agree with your changes (particularly those related to 3GPP specs)
like 2/5 and 5/5.  Also, 1/5 is of course obvious.

For kernel topics like 3/5 and 4/5 I trust Pablo and the general netdev
crew to have better judgement than me.

-- 
- Harald Welte <lafo...@netfilter.org> http://netfilter.org/

  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."-- Paul Vixie

Re: [PATCH v2 00/18] gtp: fixes and support multiple VRF's per GTP socket

2017-01-24 Thread Harald Welte

Hi Andreas,

On Tue, Jan 24, 2017 at 04:28:30PM +0100, Andreas Schultz wrote:
> During that work some smaller problems where found and fixes for them are
> included.

As this is a rather large patch-set, and I think it might be better to
split it in clear bugfixes of the existing code (or no-brainers like the
module alias) and in introducing new featur regarding socket/netdev.
However, that's just my comment "from the side", I'm not the one who
needs to be convinced in terms of merging them :)

I would just assume that the clear fixes + no-brainers could go in right
away without any further review...

-- 
- Harald Welte <lafo...@netfilter.org> http://netfilter.org/

  "Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed."-- Paul Vixie

[PATCH] fix br_fdb_fini() section mismatch

2007-12-05 Thread Harald Welte

When compiling a kernel (current linus git or 2.6.24-rc4) with built-in
CONFIG_BRIDGE, I get the following error:

  LD  .tmp_vmlinux1
`br_fdb_fini' referenced in section `.init.text' of net/built-in.o: defined in 
discarded section `.exit.text' of net/built-in.o
make: *** [.tmp_vmlinux1] Error 1

This patch fixes it.

Signed-off-by: Harald Welte [EMAIL PROTECTED]

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index eb57502..bc40377 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -44,7 +44,7 @@ int __init br_fdb_init(void)
return 0;
 }
 
-void __exit br_fdb_fini(void)
+void br_fdb_fini(void)
 {
kmem_cache_destroy(br_fdb_cache);
 }
-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

Privacy in residential applications is a desirable marketing option.
  (ETSI EN 300 175-7 Ch. A6)
--
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH/RFC 00/10] Transparent proxying patches version 4

2007-01-08 Thread Harald Welte

On Sun, Jan 07, 2007 at 05:11:06PM +0100, Lennert Buytenhek wrote:
 On Sun, Jan 07, 2007 at 03:11:34PM +0100, Harald Welte wrote:
 
   So instead of using NAT to dynamically redirect traffic to local
   addresses, we now rely on native non-locally-bound sockets and do
   early socket lookups for inbound IPv4 packets. 
  
  It's good to see a solid implementation of this 'old idea'.  
  
  Just as a quick historical note to netdev:  This is the way how the
  netfilter project  advised the balabit guys to implement fully
  transparent proxy support, after having seen the complexity of the old
  nat-based TPROXY patches.
 
 Didn't rusty tell the balabit guys to use the NAT approach?

that was originally, way back.  It turned out to be a bad idea, after
all... way too complex.  At least that's how I look at it.  Too sad :(

Rusty and me then had the idea about the routing based approach at some
point, if I remember correctly.  We talked about it with Krisztian and
Balazs at least on one occasion.

All that isn't really important.  All I wanted to say was:

I (and AFAIR the netfilter core team) believe this is the way to
implement good support for transparent proxying.  It's already the
second completely independent implementation, let's merge it after all.

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


signature.asc
Description: Digital signature

Re: [PATCH/RFC 00/10] Transparent proxying patches version 4

2007-01-07 Thread Harald Welte

Hi Krisztian!

On Wed, Jan 03, 2007 at 05:33:57PM +0100, KOVACS Krisztian wrote:
 So instead of using NAT to dynamically redirect traffic to local
 addresses, we now rely on native non-locally-bound sockets and do
 early socket lookups for inbound IPv4 packets. 

It's good to see a solid implementation of this 'old idea'.  

Just as a quick historical note to netdev:  This is the way how the
netfilter project  advised the balabit guys to implement fully
transparent proxy support, after having seen the complexity of the old
nat-based TPROXY patches.

So I personally support this patchset and vote for it to be included
(with whatever modifications netdev deems apropriate)

It might be that there now is the experimental netchannels system which
might provide an even better way for transparent proxy support.

However, ever since ip_tables was merged in the 2.3.x days, we have
lacked good support for transparent proxies.  Now that the first
incarnation of the NAT based TPROXY patch for 2.4.x had to be developed
and maintained out-of-tree for many years, I definitely think it's
better to merge the new, way less intrusive, patchset.  

Some interested party can work on a netchannels implementation later on,
but that's the next generation...

Cheers,
-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


signature.asc
Description: Digital signature

Re: [IPROUTE] manpage for lnstat

2006-10-26 Thread Harald Welte

On Thu, Oct 26, 2006 at 10:41:26PM +0200, Michael Prokop wrote:
 Hello,
 
 I wrote a manpage for lnstat, would be great if it could be applied
 to the next release.

Stephen: Please include it into your next release, looks fine to me.

 @Harald, I'm following your 'If somebody wants to do a manpage, feel
 free to send me a patch :)' of lnstat's README. :)

thanks a lot!
-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus


pgpudwX1x8eA0.pgp
Description: PGP signature

[BUG 2.6.18] unaligned access in ipvv6_rcv, nf_ip6_checksum, tcp_error, __ipv6_addr_type, fib6_lookup_1

2006-09-30 Thread Harald Welte

Hi!

I've just built 2.6.18 on a sparc64 box (Ultra 5) using gcc-3.3.5
(debian woody).  After booting the kernel, I get tons of unaligned
access messages related to various bits of the IPv6 code:

Kernel unaligned access at TPC[10022cf0] ipv6_rcv+0xb8/0x320 [ipv6]
Kernel unaligned access at TPC[10023800] __ipv6_addr_type+0x8/0x140 [ipv6]
Kernel unaligned access at TPC[1002fd64] fib6_lookup_1+0x2c/0x120 [ipv6]
Kernel unaligned access at TPC[10093878] tcp_error+0x40/0x2c0 [nf_conntrack]
Kernel unaligned access at TPC[1004ce54] nf_ip6_checksum+0x13c/0x1c0 [ipv6]
Kernel unaligned access at TPC[1004ce58] nf_ip6_checksum+0x140/0x1c0 [ipv6]
Kernel unaligned access at TPC[1004ce60] nf_ip6_checksum+0x148/0x1c0 [ipv6]

As I'm not really familiar at all with sparc assembly, the usual look
at objectdump, try to see what it does, look at corresponding source
code strategy doesn't work for me in this case.

There could be 7 different bugs, but I think it's more likely that some
common data structure is misaligned and thus causes unaligned accesses
all over the place.

I've put the ipv6.ko and nf_conntrack.ko modules online at
http://people.netfilter.org/laforge/tmp/2618_bug/

This could be a known issue, but I couldn't find any reference to it on
netdev, sparclinux or via google. 

Any assistance is appreciated, thanks!

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus


pgpmlwbGFRFtJ.pgp
Description: PGP signature

Re: driver for pptp

2006-06-14 Thread Harald Welte

On Sat, Jun 03, 2006 at 11:06:19AM +0400, [EMAIL PROTECTED] wrote:
 I have developed the driver for Point-to-Point Tunneling Protocol (PPTP).

great news.  something that I always thought of a nice-to-have. 

 I have published the project on http://accel-pptp.sourceforge.net/

Please don't expect Linux Kernel networking developers to actually go to
sourceforge download and extract code that you want to have
reviewed/submitted.

Please read Documentation/SubmittingPatches (and CodingStyle) and submit
your kernel patch to netdev.

 Hope this driver will go to a kernel tree and will make linux more productive.

not without you pushing it actively and getting through review cycles
(which I hope you will!).

Some initial comments:

1) why wasn't it possible to use the PPPoX infrastructure of the kernel
   which is already being used by PPPoE ?  Or at least model it somehow
   similar to the existing PPPoE/PPPoX infrastructure?

2) why are you using a timer for asynchronous processing of GRE frames?
   First of all, why does it have to happen asynchronously at all?
   Secondly, why using a timer when there's nothing time related (or do
   I miss something)?  If deferred, out-of-context execution is
   required, there are other primitives such as tasklets.

3) you conflict with the ip_gre.c genric GRE encapsulation driver.  this
   is because both want to reigster a proto handler for GRE.  Ideally,
   there needs to be another demultiplex between the GRE protocl and its
   users.  The code registered for GRE would look at the packet and
   determine whether e.g. it is a PPTP GRE packet and then pass it on to
   the pptp module.

4) your code doesn't look nonlinear skb clean

5) why did you chose to implement  /dev/pptp rather than a socket family
   like the existing pppox/pppoe code?

6) lots of codingstyle issues

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus


pgpuEiqiED7WD.pgp
Description: PGP signature

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Harald Welte

On Tue, Jun 13, 2006 at 02:12:41PM -0700, Daniel Phillips wrote:
 
 This has the makings of a nice stable internal kernel api.  Why do we want
 to provide this nice stable internal api to proprietary modules?

because there is IMHO legally nothing we can do about it anyway.  Use of
an industry-standard API that is provided in multiple operating system
is one of the clearest idnication of some program _not_ being a
derivative work.

Whether we like it or not, it doesn't really matter if we export them
GPL-only or not.  Anybody using those scoket API calls will be having an
easy time arguing in favor of non-derivative work.

The GPL doesn't extend beyon what copyright law allows you to do...

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus


pgpFWlVVgch8f.pgp
Description: PGP signature

Re: [RFC/PATCH 1/2] in-kernel sockets API

2006-06-14 Thread Harald Welte

On Wed, Jun 14, 2006 at 04:29:04PM +0200, Erik Mouw wrote:
 On Wed, Jun 14, 2006 at 03:30:22PM +0200, Harald Welte wrote:
  On Tue, Jun 13, 2006 at 02:12:41PM -0700, Daniel Phillips wrote:
   
   This has the makings of a nice stable internal kernel api.  Why do we want
   to provide this nice stable internal api to proprietary modules?
  
  because there is IMHO legally nothing we can do about it anyway.  Use of
  an industry-standard API that is provided in multiple operating system
  is one of the clearest idnication of some program _not_ being a
  derivative work.
 
 IMHO there is no industry-standard API for in-kernel use of sockets.
 There is however one for user space.

it doesn't matter in what space you are.  If the API really is similar
enough, then any piece of code (no matter where it was originally
intended to run) will be able to work with any such socket API.

The whole point of this is: Where is the derivation of an existing work?
I can write a program against some BSD socket api somewhere, and I can
easily make it use the proposed in-kernel sockets API.  No derivation of
anything that is inside the kernel and GPL licensed.

 (IANAL, etc)

Neither am I, but I'm constantly dealing with legal questions related to
the GPL while running gpl-violations.org.

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

We all know Linux is great...it does infinite loops in 5 seconds. -- Linus


pgpIzJfnLL2Vq.pgp
Description: PGP signature

Re: [RFC] Geographical/regulatory information for ieee80211

2006-04-28 Thread Harald Welte

On Wed, Apr 26, 2006 at 07:54:31PM -0500, Larry Finger wrote:
 I am leaning toward putting the geographical information into a
 userland daemon. 

I like that idea very much.  This is all control metadate that doesn't
really need to be in the kernel.

 That way we won't have to patch the kernel every time a country
 modifies its regulations. 

that's another advantage.

 In addition, the kernel will be smaller. The downside is that the
 daemon will have to be updated and supplied in some convenient form,
 perhaps as part of a wireless tools package.

Ideally the daemon would get the table of country restrictions from a
policy file (some human-readable ascii?).  That file can then be
downloaded by a cronjob to keep it updated, if desired.  A bit like the
PCI/USB device id databases...

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

Privacy in residential applications is a desirable marketing option.
  (ETSI EN 300 175-7 Ch. A6)


pgpYfGS2xTcML.pgp
Description: PGP signature

Re: [RFC: 2.6 patch] the overdue removal of ip{,6}_queue

2006-04-11 Thread Harald Welte

On Sun, Apr 09, 2006 at 08:40:05PM +0200, Patrick McHardy wrote:
 Adrian Bunk wrote:
  This patch contains the overdue removal of ip{,6}_queue.
 
 I think this is too early, none of the distributions seem to have
 picked up the compat library already. I believe (not sure, Harald?)
 that it requires at least recompilation of programs using libipq,
 so the 6 month period seems too short. 

Yes, apparently the distro's don't seem to be picking up the new
userspace libraries.  The question is, however, whether any further
delay really is the right motivation for them to pick them up soon ;)

Maybe we should add a printk ('app foo is using obsolete ip_queue
system').

 It also appears the compat library won't work on older system, 

could you elaborate on that?

 so I have to admit I'm not sure about the removal at all.

I'm still for the removal, but probably giving it some more time.

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


pgpjVgSowAMmi.pgp
Description: PGP signature

Re: [RFC: 2.6 patch] the overdue removal of ip{,6}_queue

2006-04-11 Thread Harald Welte

On Tue, Apr 11, 2006 at 04:42:50PM +0200, Patrick McHardy wrote:
  Maybe we should add a printk ('app foo is using obsolete ip_queue
  system').
 
 Good idea, that will probably help speed it up. But I still think
 we need to give them at least another six month.

ok. I'll prepare a patch for both the printk and the update of
feature-removal-schedule.

 I think we need to do two things:
 
 - make libipq_compat a drop-in replacement that doesn't require
   recompilation

libipq is not distributed as a shared library (at least not by us), so I
don't see any purpose for doing so.  Do you think anyone is going to
re-link statically linked code against the new lib

 - make it work with both ipq and nfnetlink_queue

that should be possible, though.  

I still don't really think that it is worth all the effort, especially
since there is only one hand full of applications using libipq.

The backwards compatibility library is expected to perform a lot worse
thna both the old libipq as well as the new native libnetfilter_queue
due to additional data copies, etc.  When I wrote it, it was more
intended as some intermediate aid, something that helps while people
migrate.  We shouldn't make it too perfect, otherwise they won't migrate
at all.

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


pgpUyDjQahTpM.pgp
Description: PGP signature

Re: [PATCH] [NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'

2006-02-13 Thread Harald Welte

On Mon, Feb 13, 2006 at 10:32:49AM +0900, Yasuyuki KOZAKAI wrote:

 Hi, Harald,

 From: Harald Welte [EMAIL PROTECTED]
 Date: Sun, 12 Feb 2006 18:56:22 +0100

  [NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'

  This patch moves all helper related data fields of 'struct nf_conn' into a
  separate structure 'struct nf_conn_help'.  This new structure is only
  present in conntrack entries for which we actually have a helper loaded.

  Also, this patch cleans up the nf_conntrack 'features' mechanism to
  resemble what the original idea was:  Just glue the feature-specific
  data structures at the end of 'struct nf_conn', and explicitly re-calculate
  the pointer to it when needed rather than keeping pointers around.

  Saves 20 bytes per conntrack on my x86_64 box. A non-helped conntrack is
  276 bytes. We still need to save another 20 bytes in order to fit into to
  target of 256bytes.

  Signed-off-by: Harald Welte [EMAIL PROTECTED]

 This patch seems not to include fixes I pointed out and you said fixed.
 Please confirm if you really sent the intended one.

oops, thanks for checking.  Dave: Please hold, I sent an old version.
You'll get a new one ASAP.

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie

pgpRGzy12Mxgh.pgp
Description: PGP signature

Re: [PATCH] [NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'

2006-02-13 Thread Harald Welte

Hi Dave!

This is the correct (latest) version of this patch.  Sorry for the
confusion.  Please apply to net-2.6.17, thanks!


[NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'

This patch moves all helper related data fields of 'struct nf_conn' into a
separate structure 'struct nf_conn_help'.  This new structure is only
present in conntrack entries for which we actually have a helper loaded.

Also, this patch cleans up the nf_conntrack 'features' mechanism to
resemble what the original idea was:  Just glue the feature-specific
data structures at the end of 'struct nf_conn', and explicitly re-calculate
the pointer to it when needed rather than keeping pointers around.

Saves 20 bytes per conntrack on my x86_64 box. A non-helped conntrack is
276 bytes. We still need to save another 20 bytes in order to fit into to
target of 256bytes.

Signed-off-by: Harald Welte [EMAIL PROTECTED]

---
commit aba8cf3ac5b60e10eb1ce9f30b6bb4c007ab9868
tree 758aedbe80b8306c1991a05c47eab8905d33a13b
parent 94d3d40c84672b74e59ea5252f61602610e1513e
author Harald Welte [EMAIL PROTECTED] Sun, 29 Jan 2006 19:19:06 +0100
committer Harald Welte [EMAIL PROTECTED] Sun, 29 Jan 2006 19:19:06 +0100

 include/net/netfilter/nf_conntrack.h   |   56 +++
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |   22 ++---
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   39 +---
 net/netfilter/nf_conntrack_core.c  |  117 ++--
 net/netfilter/nf_conntrack_ftp.c   |2 
 net/netfilter/nf_conntrack_netlink.c   |   39 +---
 net/netfilter/nf_conntrack_standalone.c|1 
 net/netfilter/xt_helper.c  |8 +-
 8 files changed, 147 insertions(+), 137 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h 
b/include/net/netfilter/nf_conntrack.h
index 6d075ca..9d1f0e6 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -67,6 +67,18 @@ do { 
\
 
 struct nf_conntrack_helper;
 
+/* nf_conn feature for connections that have a helper */
+struct nf_conn_help {
+   /* Helper. if any */
+   struct nf_conntrack_helper *helper;
+   
+   union nf_conntrack_help help;
+   
+   /* Current number of expected connections */
+   unsigned int expecting;
+};
+
+
 #include net/netfilter/ipv4/nf_conntrack_ipv4.h
 struct nf_conn
 {
@@ -81,6 +93,9 @@ struct nf_conn
/* Have we seen traffic both ways yet? (bitset) */
unsigned long status;
 
+   /* If we were expected by an expectation, this will be it */
+   struct nf_conn *master;
+
/* Timer function; drops refcnt when it goes off. */
struct timer_list timeout;
 
@@ -88,38 +103,22 @@ struct nf_conn
/* Accounting Information (same cache line as other written members) */
struct ip_conntrack_counter counters[IP_CT_DIR_MAX];
 #endif
-   /* If we were expected by an expectation, this will be it */
-   struct nf_conn *master;
-   
-   /* Current number of expected connections */
-   unsigned int expecting;
 
/* Unique ID that identifies this conntrack*/
unsigned int id;
 
-   /* Helper. if any */
-   struct nf_conntrack_helper *helper;
-
/* features - nat, helper, ... used by allocating system */
u_int32_t features;
 
-   /* Storage reserved for other modules: */
-
-   union nf_conntrack_proto proto;
-
 #if defined(CONFIG_NF_CONNTRACK_MARK)
u_int32_t mark;
 #endif
 
-   /* These members are dynamically allocated. */
-
-   union nf_conntrack_help *help;
+   /* Storage reserved for other modules: */
+   union nf_conntrack_proto proto;
 
-   /* Layer 3 dependent members. (ex: NAT) */
-   union {
-   struct nf_conntrack_ipv4 *ipv4;
-   } l3proto;
-   void *data[0];
+   /* features dynamically at the end: helper, nat (both optional) */
+   char data[0];
 };
 
 struct nf_conntrack_expect
@@ -373,10 +372,23 @@ nf_conntrack_expect_event(enum ip_conntr
 #define NF_CT_F_NUM4
 
 extern int
-nf_conntrack_register_cache(u_int32_t features, const char *name, size_t size,
-   int (*init_conntrack)(struct nf_conn *, u_int32_t));
+nf_conntrack_register_cache(u_int32_t features, const char *name, size_t size);
 extern void
 nf_conntrack_unregister_cache(u_int32_t features);
 
+/* valid combinations:
+ * basic: nf_conn, nf_conn .. nf_conn_help
+ * nat: nf_conn .. nf_conn_nat, nf_conn .. nf_conn_nat, nf_conn help
+ */
+static inline struct nf_conn_help *nfct_help(const struct nf_conn *ct)
+{
+   unsigned int offset = sizeof(struct nf_conn);
+
+   if (!(ct-features  NF_CT_F_HELP))
+   return NULL;
+
+   return (struct nf_conn_help *) ((void *)ct + offset);
+}
+
 #endif /* __KERNEL__ */
 #endif /* _NF_CONNTRACK_H */
diff --git a/net/ipv4/netfilter

[PATCH] 2.6.16: [NETFILTER] Fix Kconfig menu level for x_tables

2006-02-13 Thread Harald Welte

Hi Dave, please apply this to the 2.6.16 tree and push it to Linus.

Thanks!

[NETFILTER] Fix Kconfig menu level for x_tables

The new x_tables related Kconfig options appear at the wrong menu level
without this patch.

Signed-off-by: Harald Welte [EMAIL PROTECTED]

---
commit 2d3e1788278398e3d9f33409066f3232730336a6
tree d23977bfeafb91fdbc5e5400d516ef623cde5a1d
parent 0b2d6ca5297c6343d9ae4c2259f85046e487ed01
author Harald Welte [EMAIL PROTECTED] Mon, 13 Feb 2006 12:04:25 +0100
committer Harald Welte [EMAIL PROTECTED] Mon, 13 Feb 2006 12:04:25 +0100

 net/netfilter/Kconfig |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 99c0a0f..0e55012 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -102,8 +102,6 @@ config NF_CT_NETLINK
help
  This option enables support for a netlink-based userspace interface
 
-endmenu
-
 config NETFILTER_XTABLES
tristate Netfilter Xtables support (required for ip_tables)
help
@@ -361,3 +359,5 @@ config NETFILTER_XT_MATCH_TCPMSS
 
  To compile it as a module, choose M here.  If unsure, say N.
 
+endmenu
+
-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


pgpHFnw0n6kSi.pgp
Description: PGP signature

[PATCH 1/1] [NETFILTER] bridge: fix link error with recent netfilter-bridge change

2006-02-12 Thread Harald Welte

[NETFILTER] bridge: fix link error with recent netfilter-bridge change

commit 5dce971acf2ae20c80d5e9d1f6bbf17376870911 removed the
has_bridge_parent macro from net/bridge/br_netfilter.c, resulting in the
following warning at compile time:

 net/bridge/br_netfilter.c: In function `br_nf_post_routing':
 net/bridge/br_netfilter.c:808: warning: implicit declaration of function
 `has_bridge_parent'

and this error at link time:

 net/built-in.o(.text+0xeae28): In function `br_nf_post_routing':
 net/bridge/br_netfilter.c:808: undefined reference to `has_bridge_parent'
 make: *** [.tmp_vmlinux1] Error 1

The patch below fixes the problem.

Signed-off-by: Jesper Juhl [EMAIL PROTECTED]
Signed-off-by: Harald Welte [EMAIL PROTECTED]

---
commit d63ec539a20c35537c930ebadeac0f58bccee86f
tree 65a56c18c840f137bf1c2f778af6afba215d143e
parent de7cdf0307f39c1399922431144d2dc8f327ee42
author Jesper Juhl [EMAIL PROTECTED] Sun, 12 Feb 2006 12:35:40 +0100
committer Harald Welte [EMAIL PROTECTED] Sun, 12 Feb 2006 12:35:40 +0100

 net/bridge/br_netfilter.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/bridge/br_netfilter.c b/net/bridge/br_netfilter.c
index b501816..c06cb09 100644
--- a/net/bridge/br_netfilter.c
+++ b/net/bridge/br_netfilter.c
@@ -805,7 +805,7 @@ static unsigned int br_nf_post_routing(u
 print_error:
if (skb-dev != NULL) {
printk([%s], skb-dev-name);
-   if (has_bridge_parent(skb-dev))
+   if (bridge_parent(skb-dev))
printk([%s], bridge_parent(skb-dev)-name);
}
printk( head:%p, raw:%p, data:%p\n, skb-head, skb-mac.raw,

--
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] [NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'

2006-02-12 Thread Harald Welte

Hi Dave!

This is net-2.6.17 stuff, please apply.

Thanks!



[NETFILTER] nf_conntrack: clean up to reduce size of 'struct nf_conn'

This patch moves all helper related data fields of 'struct nf_conn' into a
separate structure 'struct nf_conn_help'.  This new structure is only
present in conntrack entries for which we actually have a helper loaded.

Also, this patch cleans up the nf_conntrack 'features' mechanism to
resemble what the original idea was:  Just glue the feature-specific
data structures at the end of 'struct nf_conn', and explicitly re-calculate
the pointer to it when needed rather than keeping pointers around.

Saves 20 bytes per conntrack on my x86_64 box. A non-helped conntrack is
276 bytes. We still need to save another 20 bytes in order to fit into to
target of 256bytes.

Signed-off-by: Harald Welte [EMAIL PROTECTED]

---
commit aba8cf3ac5b60e10eb1ce9f30b6bb4c007ab9868
tree 758aedbe80b8306c1991a05c47eab8905d33a13b
parent 94d3d40c84672b74e59ea5252f61602610e1513e
author Harald Welte [EMAIL PROTECTED] Sun, 29 Jan 2006 19:19:06 +0100
committer Harald Welte [EMAIL PROTECTED] Sun, 29 Jan 2006 19:19:06 +0100

 include/net/netfilter/nf_conntrack.h   |   56 +++
 net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c |   22 ++---
 net/ipv6/netfilter/nf_conntrack_l3proto_ipv6.c |   39 +---
 net/netfilter/nf_conntrack_core.c  |  117 ++--
 net/netfilter/nf_conntrack_ftp.c   |2 
 net/netfilter/nf_conntrack_netlink.c   |   39 +---
 net/netfilter/nf_conntrack_standalone.c|1 
 net/netfilter/xt_helper.c  |8 +-
 8 files changed, 147 insertions(+), 137 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack.h 
b/include/net/netfilter/nf_conntrack.h
index 6d075ca..9d1f0e6 100644
--- a/include/net/netfilter/nf_conntrack.h
+++ b/include/net/netfilter/nf_conntrack.h
@@ -67,6 +67,18 @@ do { 
\
 
 struct nf_conntrack_helper;
 
+/* nf_conn feature for connections that have a helper */
+struct nf_conn_help {
+   /* Helper. if any */
+   struct nf_conntrack_helper *helper;
+   
+   union nf_conntrack_help help;
+   
+   /* Current number of expected connections */
+   unsigned int expecting;
+};
+
+
 #include net/netfilter/ipv4/nf_conntrack_ipv4.h
 struct nf_conn
 {
@@ -81,6 +93,9 @@ struct nf_conn
/* Have we seen traffic both ways yet? (bitset) */
unsigned long status;
 
+   /* If we were expected by an expectation, this will be it */
+   struct nf_conn *master;
+
/* Timer function; drops refcnt when it goes off. */
struct timer_list timeout;
 
@@ -88,38 +103,22 @@ struct nf_conn
/* Accounting Information (same cache line as other written members) */
struct ip_conntrack_counter counters[IP_CT_DIR_MAX];
 #endif
-   /* If we were expected by an expectation, this will be it */
-   struct nf_conn *master;
-   
-   /* Current number of expected connections */
-   unsigned int expecting;
 
/* Unique ID that identifies this conntrack*/
unsigned int id;
 
-   /* Helper. if any */
-   struct nf_conntrack_helper *helper;
-
/* features - nat, helper, ... used by allocating system */
u_int32_t features;
 
-   /* Storage reserved for other modules: */
-
-   union nf_conntrack_proto proto;
-
 #if defined(CONFIG_NF_CONNTRACK_MARK)
u_int32_t mark;
 #endif
 
-   /* These members are dynamically allocated. */
-
-   union nf_conntrack_help *help;
+   /* Storage reserved for other modules: */
+   union nf_conntrack_proto proto;
 
-   /* Layer 3 dependent members. (ex: NAT) */
-   union {
-   struct nf_conntrack_ipv4 *ipv4;
-   } l3proto;
-   void *data[0];
+   /* features dynamically at the end: helper, nat (both optional) */
+   char data[0];
 };
 
 struct nf_conntrack_expect
@@ -373,10 +372,23 @@ nf_conntrack_expect_event(enum ip_conntr
 #define NF_CT_F_NUM4
 
 extern int
-nf_conntrack_register_cache(u_int32_t features, const char *name, size_t size,
-   int (*init_conntrack)(struct nf_conn *, u_int32_t));
+nf_conntrack_register_cache(u_int32_t features, const char *name, size_t size);
 extern void
 nf_conntrack_unregister_cache(u_int32_t features);
 
+/* valid combinations:
+ * basic: nf_conn, nf_conn .. nf_conn_help
+ * nat: nf_conn .. nf_conn_nat, nf_conn .. nf_conn_nat, nf_conn help
+ */
+static inline struct nf_conn_help *nfct_help(const struct nf_conn *ct)
+{
+   unsigned int offset = sizeof(struct nf_conn);
+
+   if (!(ct-features  NF_CT_F_HELP))
+   return NULL;
+
+   return (struct nf_conn_help *) ((void *)ct + offset);
+}
+
 #endif /* __KERNEL__ */
 #endif /* _NF_CONNTRACK_H */
diff --git a/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c 
b/net/ipv4/netfilter/nf_conntrack_l3proto_ipv4.c

Re: [PATCH] [NETFILTER] nfnetlink_log: add sequence numbers for log events

2006-01-31 Thread Harald Welte

On Tue, Jan 31, 2006 at 12:32:21AM +0100, Patrick McHardy wrote:
 Harald Welte wrote:
  Hi Dave,
  
  please apply, thanks!
  
  [NETFILTER] nfnetlink_log: add sequence numbers for log events
  
  By using a sequence number for every logged netfilter event, we can
  determine from userspace whether logging information was lots somewhere
  downstream.
 
 BTW, I have a patch I wanted to submit on top of this, which changes the
 *LOG targets to do reliable logging, which means if we encounter any
 errors during logging (for example from netlink), the packet will be
 dropped. This makes as sure as possible that no connections will be
 silently accepted. Its a slight change of user-visible behaviour, but
 since it only affects corner-cases I think it should be OK. I could add
 some flags to retain the current behaviour, but I think its not worth
 it. 

I think it is very much required to have such a flag.  (we can actually
add it to the nfnetlink_log flags).  It really depends on your setup.
Some people really really want to have logging reliable, others fear
that they might easily be DoS'ed, if logging has higher priority than
packet forwarding.
 
 Any objections?

If it's optional (and the default is unreliable), then I think it's a
great idea.

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


pgpnfOO9W6O2I.pgp
Description: PGP signature

ip_conntrack related slab error (Re: Fw: Re: 2.6.16-rc1-mm3)

2006-01-31 Thread Harald Welte

 Begin forwarded message:

 Date: Sat, 28 Jan 2006 00:47:06 +1300
 From: Reuben Farrelly [EMAIL PROTECTED]
 To: Andrew Morton [EMAIL PROTECTED]
 Cc: linux-kernel@vger.kernel.org, netdev@vger.kernel.org
 Subject: Re: 2.6.16-rc1-mm3

 Just triggered this one, which had a fairly bad effect on connectivity to the 
 box:

 i2c /dev entries driver
 slab error in kmem_cache_destroy(): cache `ip_conntrack': Can't free all 
 objects
   [b010412b] show_trace+0xd/0xf
   [b01041cc] dump_stack+0x17/0x19
   [b0155d04] kmem_cache_destroy+0x9b/0x1a9
   [f0ebf701] ip_conntrack_cleanup+0x5d/0x10e [ip_conntrack]
   [f0ebe31e] init_or_cleanup+0x1f8/0x283 [ip_conntrack]
   [f0ec2c4e] fini+0xa/0x66 [ip_conntrack]
   [b0136d06] sys_delete_module+0x161/0x1fb
   [b0102b3f] sysenter_past_esp+0x54/0x75
 Removing netfilter NETLINK layer.
 [EMAIL PROTECTED] log]#

 I was just reading IMAP mail at the time, ie same as I'd been doing for an 
 hour 
 or two beforehand and not altering config of the box in any way.  I was able 
 to 
 log on via console but lost all network connectivity and had to reboot :(

The codepath you see in that backtrace is only hit during load or
removal of the 'ip_conntrack' module.  While this certainly still should
not oops, your description of 'not doing anything but IMAP reading' is
certainly not true.  

Could you please describe what actually happened when that bug happened?
It looks to me that you were unloading ip_conntrack_netlink.ko followed
by ip_conntrack.ko.

 Generic details such as .config is at http://www.reub.net/files/kernel/

You don't have permission to access /files/kernel/ on this server.

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie

pgp97NXr97jrX.pgp
Description: PGP signature

Re: [Netlink] virtual interface

2006-01-31 Thread Harald Welte

On Tue, Jan 24, 2006 at 09:19:02PM +0100, Mathieu Geli wrote:

 Here is a patch that add a netlink virtual interface.

I really like the possibility that it offers.  In fact, I always wanted
something like a 'promiscuous netlink socket'.

Whether it is the right thing to add a full-blown net_device for it, I
don't know.  This really only is a kludge for scapy/ethereal/whatever,
since they just expect tu use network interfaces and AF_PACKET sockets.

The question is whether we should introduce a net_device to work around
scapy/ethereal limitations - or whether we'd rather have some different
interface.

I would like to hear comments from Dave, Jamal, Thomas Graf, Patrick
McHardy on this...

btw: There seem to be a number of coding style and whitespace issue in
your patch.  Please make sure you follow Documentation/CodingStyle as
closely as possible.

 Dissectors are being developed for scapy [1] at the moment.

please note that a number of netlink related data structures are host
endian, and others are network endian, where again others are mixed
host/network endian.  This can create soem headache while writing
dissectors ;)

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

Privacy in residential applications is a desirable marketing option.
  (ETSI EN 300 175-7 Ch. A6)


pgpzqQ2CKzt4F.pgp
Description: PGP signature

Re: wireless-2.6 status (25 January 2006)

2006-01-31 Thread Harald Welte

On Fri, Jan 27, 2006 at 07:33:05AM -0500, John W. Linville wrote:
 On Fri, Jan 27, 2006 at 09:17:50AM +, Christoph Hellwig wrote:
 
  Folks, please stop these stupid ideas.  There's a free driver, let's improve
  and merge it.  That's a thousand times better than any half-free driver
  with buggy binary blobs.
 
 I presume you mean the ath-driver.org stuff?  I'm perfectly
 open to using that code, but some had raised issues about the
 reverse-engineering process used to create it.  Has that been resolved?

It is unclear.  I'm in contact with lawyers from the institute for low
and open source software here in Germany, who conducted an analysis on
reverse engineering.  

The result of their analysis leaves more open questions than it answers.

The first problem is that there appears to be no precedent whatsoever on
the reverse engineering interoperability exceptions.  Also, the wording
of those exceptions seem to assume software/software interoperability,
not software/hardware.  Furthermore, the .de implementation on the EU
directive on 'reverse engineering' explicitly states 'decompilation' is
forbidden.  Technically speaking, that's different from disassembly.
But what would a court rule?  Also, the interoperability exceptions seem
to be written with closed-source software in mind, since the result of
such conditionally allowed re-engineering is not allowed to be
published.

So for now, I personally would ignore the concerns.  Please tell me how
many Linux device drivers we would have, if nobody had ever looked into
some disassembled proprietary driver? 

Don't get me wrong, I'm very much against any kind of 'carbon copying'
code from a proprietary driver into a free one.  

However, merely looking at existing proprietary code in order to figure
out which bit of a register resembles what function, and then
implementing a driver from scratch using that information seems pretty
safe to me.

Actually, running the proprietary code in a simulator and just looking
at the inputs/outputs of that simulator seems to be the safest way.

Also, who will be able to prove that you actually looked at some
assembly instructions rather than observing the proprietary program
running in a simulator?

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

Privacy in residential applications is a desirable marketing option.
  (ETSI EN 300 175-7 Ch. A6)


pgpY8sDn6yEHs.pgp
Description: PGP signature

[PATCH] [NETFILTER] nfnetlink_log: add sequence numbers for log events

2006-01-30 Thread Harald Welte

Hi Dave,

please apply, thanks!

[NETFILTER] nfnetlink_log: add sequence numbers for log events

By using a sequence number for every logged netfilter event, we can
determine from userspace whether logging information was lots somewhere
downstream.

The user has a choice of either having per-instance local sequence
counters, or using a global sequence counter, or both.

Signed-off-by: Harald Welte [EMAIL PROTECTED]

---
commit e3c7a1f99300fbd6de35a40fcd9c4dc1b0fbfee2
tree fa0df65edd0ec729af8033c0605b19dc49c1a688
parent 1259fffe55307fa41e932a594442aa78e1e37cfd
author Harald Welte [EMAIL PROTECTED] Thu, 26 Jan 2006 15:05:29 +0100
committer Harald Welte [EMAIL PROTECTED] Thu, 26 Jan 2006 15:05:29 +0100

 include/linux/netfilter/nfnetlink_log.h |6 
 net/netfilter/nfnetlink_log.c   |   46 +++
 2 files changed, 52 insertions(+), 0 deletions(-)

diff --git a/include/linux/netfilter/nfnetlink_log.h 
b/include/linux/netfilter/nfnetlink_log.h
index b04b038..a7497c7 100644
--- a/include/linux/netfilter/nfnetlink_log.h
+++ b/include/linux/netfilter/nfnetlink_log.h
@@ -47,6 +47,8 @@ enum nfulnl_attr_type {
NFULA_PAYLOAD,  /* opaque data payload */
NFULA_PREFIX,   /* string prefix */
NFULA_UID,  /* user id of socket */
+   NFULA_SEQ,  /* instance-local sequence number */
+   NFULA_SEQ_GLOBAL,   /* global sequence number */
 
__NFULA_MAX
 };
@@ -77,6 +79,7 @@ enum nfulnl_attr_config {
NFULA_CFG_NLBUFSIZ, /* u_int32_t buffer size */
NFULA_CFG_TIMEOUT,  /* u_int32_t in 1/100 s */
NFULA_CFG_QTHRESH,  /* u_int32_t */
+   NFULA_CFG_FLAGS,/* u_int16_t */
__NFULA_CFG_MAX
 };
 #define NFULA_CFG_MAX (__NFULA_CFG_MAX -1)
@@ -85,4 +88,7 @@ enum nfulnl_attr_config {
 #define NFULNL_COPY_META   0x01
 #define NFULNL_COPY_PACKET 0x02
 
+#define NFULNL_CFG_F_SEQ   0x0001
+#define NFULNL_CFG_F_SEQ_GLOBAL0x0002
+
 #endif /* _NFNETLINK_LOG_H */
diff --git a/net/netfilter/nfnetlink_log.c b/net/netfilter/nfnetlink_log.c
index e10512e..a98d0b7 100644
--- a/net/netfilter/nfnetlink_log.c
+++ b/net/netfilter/nfnetlink_log.c
@@ -11,6 +11,10 @@
  * it under the terms of the GNU General Public License version 2 as
  * published by the Free Software Foundation.
  *
+ * 2006-01-26 Harald Welte [EMAIL PROTECTED]
+ * - Add optional local and global sequence number to detect lost 
+ *   events from userspace
+ *
  */
 #include linux/module.h
 #include linux/skbuff.h
@@ -68,11 +72,14 @@ struct nfulnl_instance {
unsigned int nlbufsiz;  /* netlink buffer allocation size */
unsigned int qthreshold;/* threshold of the queue */
u_int32_t copy_range;
+   u_int32_t seq;  /* instance-local sequential counter */
u_int16_t group_num;/* number of this queue */
+   u_int16_t flags;
u_int8_t copy_mode; 
 };
 
 static DEFINE_RWLOCK(instances_lock);
+static atomic_t global_seq;
 
 #define INSTANCE_BUCKETS   16
 static struct hlist_head instance_table[INSTANCE_BUCKETS];
@@ -310,6 +317,16 @@ nfulnl_set_qthresh(struct nfulnl_instanc
return 0;
 }
 
+static int
+nfulnl_set_flags(struct nfulnl_instance *inst, u_int16_t flags)
+{
+   spin_lock_bh(inst-lock);
+   inst-flags = ntohs(flags);
+   spin_unlock_bh(inst-lock);
+
+   return 0;
+}
+
 static struct sk_buff *nfulnl_alloc_skb(unsigned int inst_size, 
unsigned int pkt_size)
 {
@@ -373,6 +390,8 @@ static void nfulnl_timer(unsigned long d
spin_unlock_bh(inst-lock);
 }
 
+/* This is an inline function, we don't really care about a long
+ * list of arguments */
 static inline int 
 __build_packet_message(struct nfulnl_instance *inst,
const struct sk_buff *skb, 
@@ -511,6 +530,17 @@ __build_packet_message(struct nfulnl_ins
read_unlock_bh(skb-sk-sk_callback_lock);
}
 
+   /* local sequence number */
+   if (inst-flags  NFULNL_CFG_F_SEQ) {
+   tmp_uint = htonl(inst-seq++);
+   NFA_PUT(inst-skb, NFULA_SEQ, sizeof(tmp_uint), tmp_uint);
+   }
+   /* global sequence number */
+   if (inst-flags  NFULNL_CFG_F_SEQ_GLOBAL) {
+   tmp_uint = atomic_inc_return(global_seq);
+   NFA_PUT(inst-skb, NFULA_SEQ_GLOBAL, sizeof(tmp_uint), 
tmp_uint);
+   }
+
if (data_len) {
struct nfattr *nfa;
int size = NFA_LENGTH(data_len);
@@ -603,6 +633,11 @@ nfulnl_log_packet(unsigned int pf,
 
spin_lock_bh(inst-lock);
 
+   if (inst-flags  NFULNL_CFG_F_SEQ)
+   size += NFA_SPACE(sizeof(u_int32_t));
+   if (inst-flags  NFULNL_CFG_F_SEQ_GLOBAL)
+   size += NFA_SPACE(sizeof(u_int32_t));
+
qthreshold

Re: WCONF, netlink based WE replacement.

2006-01-20 Thread Harald Welte

On Fri, Jan 13, 2006 at 02:15:03PM +0100, Thomas Graf wrote:
 * Michael Buesch [EMAIL PROTECTED] 2006-01-12 18:24
  This is an attempt to rewrite the Wireless Extensions
  userspace API, using netlink sockets.
  There should also be a notification API, to inform
  userspace for changes (config changes, state changes, etc).
  It is not implemented, yet.
 
 I'll only comment on the netlink bits and leave the rest to
 others. I'd highly recommend the use of attributes instead
 of fixed message structures to allow the interface to be
 flexible to extensions while staying binary compatible.

I can only second Thomas' suggestion.  The power and flexibility of the
use of netlink is only really unveiled if you use attributes.   This way
you can easily extend your data structures later on without breaking
compatibility, etc.

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

Privacy in residential applications is a desirable marketing option.
  (ETSI EN 300 175-7 Ch. A6)


pgpSNFb09iVlx.pgp
Description: PGP signature

[PATCH 2/3] [NETFILTER] ip6tables: remove unused definitions

2006-01-16 Thread Harald Welte

[NETFILTER] ip6tables: remove unused definitions

These definitions ware used for only internal use in kernel = 2.6.13,
which had not introduced the unified parser of IPv6 extension header yet.

Signed-off-by: Yasuyuki Kozakai [EMAIL PROTECTED]
Signed-off-by: Harald Welte [EMAIL PROTECTED]

---
commit de63a83ad960be726fd487ce8f8a7115ecd015d5
tree 6499aea0dd46054b92a4393b4365d8b87b602cd1
parent 396728c2753f13781973532b4074bfdb398ad675
author Yasuyuki Kozakai [EMAIL PROTECTED] Mon, 16 Jan 2006 17:46:49 +0100
committer Harald Welte [EMAIL PROTECTED] Mon, 16 Jan 2006 17:46:49 +0100

 include/linux/netfilter_ipv6/ip6t_ah.h   |9 -
 include/linux/netfilter_ipv6/ip6t_esp.h  |9 -
 include/linux/netfilter_ipv6/ip6t_frag.h |9 -
 include/linux/netfilter_ipv6/ip6t_opts.h |9 -
 include/linux/netfilter_ipv6/ip6t_rt.h   |9 -
 5 files changed, 0 insertions(+), 45 deletions(-)

diff --git a/include/linux/netfilter_ipv6/ip6t_ah.h 
b/include/linux/netfilter_ipv6/ip6t_ah.h
index c4f0793..8531879 100644
--- a/include/linux/netfilter_ipv6/ip6t_ah.h
+++ b/include/linux/netfilter_ipv6/ip6t_ah.h
@@ -18,13 +18,4 @@ struct ip6t_ah
 #define IP6T_AH_INV_LEN0x02/* Invert the sense of length. 
*/
 #define IP6T_AH_INV_MASK   0x03/* All possible flags. */
 
-#define MASK_HOPOPTS128
-#define MASK_DSTOPTS64
-#define MASK_ROUTING32
-#define MASK_FRAGMENT   16
-#define MASK_AH 8
-#define MASK_ESP4
-#define MASK_NONE   2
-#define MASK_PROTO  1
-
 #endif /*_IP6T_AH_H*/
diff --git a/include/linux/netfilter_ipv6/ip6t_esp.h 
b/include/linux/netfilter_ipv6/ip6t_esp.h
index 01142b9..a91b6ab 100644
--- a/include/linux/netfilter_ipv6/ip6t_esp.h
+++ b/include/linux/netfilter_ipv6/ip6t_esp.h
@@ -7,15 +7,6 @@ struct ip6t_esp
u_int8_t  invflags; /* Inverse flags */
 };
 
-#define MASK_HOPOPTS128
-#define MASK_DSTOPTS64
-#define MASK_ROUTING32
-#define MASK_FRAGMENT   16
-#define MASK_AH 8
-#define MASK_ESP4
-#define MASK_NONE   2
-#define MASK_PROTO  1
-
 /* Values for invflags field in struct ip6t_esp. */
 #define IP6T_ESP_INV_SPI   0x01/* Invert the sense of spi. */
 #define IP6T_ESP_INV_MASK  0x01/* All possible flags. */
diff --git a/include/linux/netfilter_ipv6/ip6t_frag.h 
b/include/linux/netfilter_ipv6/ip6t_frag.h
index 449a57e..66070a0 100644
--- a/include/linux/netfilter_ipv6/ip6t_frag.h
+++ b/include/linux/netfilter_ipv6/ip6t_frag.h
@@ -21,13 +21,4 @@ struct ip6t_frag
 #define IP6T_FRAG_INV_LEN  0x02/* Invert the sense of length. */
 #define IP6T_FRAG_INV_MASK 0x03/* All possible flags. */
 
-#define MASK_HOPOPTS128
-#define MASK_DSTOPTS64
-#define MASK_ROUTING32
-#define MASK_FRAGMENT   16
-#define MASK_AH 8
-#define MASK_ESP4
-#define MASK_NONE   2
-#define MASK_PROTO  1
-
 #endif /*_IP6T_FRAG_H*/
diff --git a/include/linux/netfilter_ipv6/ip6t_opts.h 
b/include/linux/netfilter_ipv6/ip6t_opts.h
index e259b62..a07e363 100644
--- a/include/linux/netfilter_ipv6/ip6t_opts.h
+++ b/include/linux/netfilter_ipv6/ip6t_opts.h
@@ -20,13 +20,4 @@ struct ip6t_opts
 #define IP6T_OPTS_INV_LEN  0x01/* Invert the sense of length. */
 #define IP6T_OPTS_INV_MASK 0x01/* All possible flags. */
 
-#define MASK_HOPOPTS128
-#define MASK_DSTOPTS64
-#define MASK_ROUTING32
-#define MASK_FRAGMENT   16
-#define MASK_AH 8
-#define MASK_ESP4
-#define MASK_NONE   2
-#define MASK_PROTO  1
-
 #endif /*_IP6T_OPTS_H*/
diff --git a/include/linux/netfilter_ipv6/ip6t_rt.h 
b/include/linux/netfilter_ipv6/ip6t_rt.h
index f1070fb..5215602 100644
--- a/include/linux/netfilter_ipv6/ip6t_rt.h
+++ b/include/linux/netfilter_ipv6/ip6t_rt.h
@@ -30,13 +30,4 @@ struct ip6t_rt
 #define IP6T_RT_INV_LEN0x04/* Invert the sense of length. 
*/
 #define IP6T_RT_INV_MASK   0x07/* All possible flags. */
 
-#define MASK_HOPOPTS128
-#define MASK_DSTOPTS64
-#define MASK_ROUTING32
-#define MASK_FRAGMENT   16
-#define MASK_AH 8
-#define MASK_ESP4
-#define MASK_NONE   2
-#define MASK_PROTO  1
-
 #endif /*_IP6T_RT_H*/

--
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH 3/3] [NETFILTER] ip6tables: whitespace and indent cosmetic cleanup

2006-01-16 Thread Harald Welte

[NETFILTER] ip6tables: whitespace and indent cosmetic cleanup

Signed-off-by: Yasuyuki Kozakai [EMAIL PROTECTED]
Signed-off-by: Harald Welte [EMAIL PROTECTED]

---
commit 663dafd271c08d6cf69ecaa1daf34380276b9945
tree 6194991675051520a186f40f587c1a55827a5081
parent de63a83ad960be726fd487ce8f8a7115ecd015d5
author Yasuyuki Kozakai [EMAIL PROTECTED] Mon, 16 Jan 2006 17:47:39 +0100
committer Harald Welte [EMAIL PROTECTED] Mon, 16 Jan 2006 17:47:39 +0100

 net/ipv6/netfilter/ip6t_dst.c|  147 
 net/ipv6/netfilter/ip6t_eui64.c  |   64 +-
 net/ipv6/netfilter/ip6t_frag.c   |  153 -
 net/ipv6/netfilter/ip6t_hbh.c|  147 
 net/ipv6/netfilter/ip6t_ipv6header.c |   79 ++---
 net/ipv6/netfilter/ip6t_owner.c  |   28 ++---
 net/ipv6/netfilter/ip6t_rt.c |  211 ++
 7 files changed, 417 insertions(+), 412 deletions(-)

diff --git a/net/ipv6/netfilter/ip6t_dst.c b/net/ipv6/netfilter/ip6t_dst.c
index 80fe826..b4c153a 100644
--- a/net/ipv6/netfilter/ip6t_dst.c
+++ b/net/ipv6/netfilter/ip6t_dst.c
@@ -36,19 +36,19 @@ MODULE_AUTHOR(Andras Kis-Szabo [EMAIL PROTECTED]
 #endif
 
 /*
- * (Type  0xC0)  6
- * 0   - ignorable
- * 1   - must drop the packet
- * 2   - send ICMP PARM PROB regardless and drop packet
- * 3   - Send ICMP if not a multicast address and drop packet
+ *  (Type  0xC0)  6
+ * 0   - ignorable
+ * 1   - must drop the packet
+ * 2   - send ICMP PARM PROB regardless and drop packet
+ * 3   - Send ICMP if not a multicast address and drop packet
  *  (Type  0x20)  5
- * 0   - invariant
- * 1   - can change the routing
+ * 0   - invariant
+ * 1   - can change the routing
  *  (Type  0x1F) Type
- *  0  - Pad1 (only 1 byte!)
- *  1  - PadN LENGTH info (total length = length + 2)
- *  C0 | 2 - JUMBO 4 x x x x (   64k )
- *  5  - RTALERT 2 x x
+ * 0   - Pad1 (only 1 byte!)
+ * 1   - PadN LENGTH info (total length = length + 2)
+ * C0 | 2  - JUMBO 4 x x x x (   64k )
+ * 5   - RTALERT 2 x x
  */
 
 static int
@@ -60,16 +60,16 @@ match(const struct sk_buff *skb,
   unsigned int protoff,
   int *hotdrop)
 {
-   struct ipv6_opt_hdr _optsh, *oh;
-   const struct ip6t_opts *optinfo = matchinfo;
-   unsigned int temp;
-   unsigned int ptr;
-   unsigned int hdrlen = 0;
-   unsigned int ret = 0;
-   u8 _opttype, *tp = NULL;
-   u8 _optlen, *lp = NULL;
-   unsigned int optlen;
-   
+   struct ipv6_opt_hdr _optsh, *oh;
+   const struct ip6t_opts *optinfo = matchinfo;
+   unsigned int temp;
+   unsigned int ptr;
+   unsigned int hdrlen = 0;
+   unsigned int ret = 0;
+   u8 _opttype, *tp = NULL;
+   u8 _optlen, *lp = NULL;
+   unsigned int optlen;
+
 #if HOPBYHOP
if (ipv6_find_hdr(skb, ptr, NEXTHDR_HOP, NULL)  0)
 #else
@@ -77,42 +77,41 @@ match(const struct sk_buff *skb,
 #endif
return 0;
 
-   oh = skb_header_pointer(skb, ptr, sizeof(_optsh), _optsh);
-   if (oh == NULL){
-  *hotdrop = 1;
-   return 0;
-   }
-
-   hdrlen = ipv6_optlen(oh);
-   if (skb-len - ptr  hdrlen){
-  /* Packet smaller than it's length field */
-   return 0;
-   }
-
-   DEBUGP(IPv6 OPTS LEN %u %u , hdrlen, oh-hdrlen);
-
-   DEBUGP(len %02X %04X %02X ,
-   optinfo-hdrlen, hdrlen,
-   (!(optinfo-flags  IP6T_OPTS_LEN) ||
-   ((optinfo-hdrlen == hdrlen) ^
-   !!(optinfo-invflags  IP6T_OPTS_INV_LEN;
-
-   ret = (oh != NULL)
-   
-   (!(optinfo-flags  IP6T_OPTS_LEN) ||
-   ((optinfo-hdrlen == hdrlen) ^
-   !!(optinfo-invflags  IP6T_OPTS_INV_LEN)));
-
-   ptr += 2;
-   hdrlen -= 2;
-   if ( !(optinfo-flags  IP6T_OPTS_OPTS) ){
-  return ret;
+   oh = skb_header_pointer(skb, ptr, sizeof(_optsh), _optsh);
+   if (oh == NULL) {
+   *hotdrop = 1;
+   return 0;
+   }
+
+   hdrlen = ipv6_optlen(oh);
+   if (skb-len - ptr  hdrlen) {
+   /* Packet smaller than it's length field */
+   return 0;
+   }
+
+   DEBUGP(IPv6 OPTS LEN %u %u , hdrlen, oh-hdrlen);
+
+   DEBUGP(len %02X %04X %02X ,
+  optinfo-hdrlen, hdrlen,
+  (!(optinfo-flags  IP6T_OPTS_LEN) ||
+   ((optinfo-hdrlen == hdrlen) ^
+!!(optinfo-invflags  IP6T_OPTS_INV_LEN;
+
+   ret = (oh != NULL) 
+ (!(optinfo-flags  IP6T_OPTS_LEN) ||
+  ((optinfo-hdrlen == hdrlen) ^
+   !!(optinfo-invflags  IP6T_OPTS_INV_LEN)));
+
+   ptr += 2;
+   hdrlen

[PATCH 1/3] [NETFILTER] Makefile cleanup

2006-01-16 Thread Harald Welte

[NETFILTER] Makefile cleanup

These are replaced with x_tables matches and no longer exist.

Signed-off-by: Yasuyuki Kozakai [EMAIL PROTECTED]
Signed-off-by: Harald Welte [EMAIL PROTECTED]

---
commit 396728c2753f13781973532b4074bfdb398ad675
tree cefbd947576c9ec4511153242109a6566079ebdf
parent 78dbb647a1b797fcda30a65557ca21b3b73ee097
author Yasuyuki Kozakai [EMAIL PROTECTED] Mon, 16 Jan 2006 17:45:12 +0100
committer Harald Welte [EMAIL PROTECTED] Mon, 16 Jan 2006 17:45:12 +0100

 net/ipv4/netfilter/Makefile |1 -
 net/ipv6/netfilter/Makefile |1 -
 2 files changed, 0 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/netfilter/Makefile b/net/ipv4/netfilter/Makefile
index bcefe64..e5c5b32 100644
--- a/net/ipv4/netfilter/Makefile
+++ b/net/ipv4/netfilter/Makefile
@@ -46,7 +46,6 @@ obj-$(CONFIG_IP_NF_NAT) += iptable_nat.o
 obj-$(CONFIG_IP_NF_RAW) += iptable_raw.o
 
 # matches
-obj-$(CONFIG_IP_NF_MATCH_HELPER) += ipt_helper.o
 obj-$(CONFIG_IP_NF_MATCH_HASHLIMIT) += ipt_hashlimit.o
 obj-$(CONFIG_IP_NF_MATCH_IPRANGE) += ipt_iprange.o
 obj-$(CONFIG_IP_NF_MATCH_MULTIPORT) += ipt_multiport.o
diff --git a/net/ipv6/netfilter/Makefile b/net/ipv6/netfilter/Makefile
index 663b474..db6073c 100644
--- a/net/ipv6/netfilter/Makefile
+++ b/net/ipv6/netfilter/Makefile
@@ -4,7 +4,6 @@
 
 # Link order matters here.
 obj-$(CONFIG_IP6_NF_IPTABLES) += ip6_tables.o
-obj-$(CONFIG_IP6_NF_MATCH_LENGTH) += ip6t_length.o
 obj-$(CONFIG_IP6_NF_MATCH_RT) += ip6t_rt.o
 obj-$(CONFIG_IP6_NF_MATCH_OPTS) += ip6t_hbh.o ip6t_dst.o
 obj-$(CONFIG_IP6_NF_MATCH_IPV6HEADER) += ip6t_ipv6header.o

--
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie
-
To unsubscribe from this list: send the line unsubscribe netdev in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[PATCH] fix ip[6]t_policy compiler warnings after x_tables merge

2006-01-14 Thread Harald Welte

Hi Dave!

Please apply, thanks.

[NETFILTER] ip[6]t_policy: Fix compilation warnings

ip[6]t_policy argument conversion slipped when merging with x_tables

Signed-off-by: Benoit Boissinot [EMAIL PROTECTED]
Signed-off-by: Harald Welte [EMAIL PROTECTED]

--- a/net/ipv4/netfilter/ipt_policy.c   2006-01-14 16:11:52.0 +0100
+++ b/net/ipv4/netfilter/ipt_policy.c   2006-01-14 18:55:26.0 +0100
@@ -95,7 +95,10 @@
 static int match(const struct sk_buff *skb,
  const struct net_device *in,
  const struct net_device *out,
- const void *matchinfo, int offset, int *hotdrop)
+ const void *matchinfo,
+ int offset,
+ unsigned int protoff,
+ int *hotdrop)
 {
const struct ipt_policy_info *info = matchinfo;
int ret;
@@ -113,7 +116,7 @@
return ret;
 }
 
-static int checkentry(const char *tablename, const struct ipt_ip *ip,
+static int checkentry(const char *tablename, const void *ip_void,
   void *matchinfo, unsigned int matchsize,
   unsigned int hook_mask)
 {
--- a/net/ipv6/netfilter/ip6t_policy.c  2006-01-14 16:11:52.0 +0100
+++ b/net/ipv6/netfilter/ip6t_policy.c  2006-01-14 18:52:58.0 +0100
@@ -118,7 +118,7 @@
return ret;
 }
 
-static int checkentry(const char *tablename, const struct ip6t_ip6 *ip,
+static int checkentry(const char *tablename, const void *ip_void,
   void *matchinfo, unsigned int matchsize,
   unsigned int hook_mask)
 {

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


pgpdPv8g0AvFU.pgp
Description: PGP signature

Re: Resubmit: [PATCH] natsemi: NAPI support

2006-01-04 Thread Harald Welte

On Wed, Jan 04, 2006 at 07:32:49AM +, Mark Brown wrote:
 This patch against 2.6.14 converts the natsemi driver to use NAPI.  It
 was originally based on one written by Harald Welte, though it has since
 been modified quite a bit, most extensively in order to remove the
 ability to disable NAPI since none of the other drivers seem to provide
 that functionality any more.

Mark, sorry for not responding earlier to your emails with regard to
your natsemi patch.

Thanks for pushing this persistently.  From reviewing your patch, I
personally thin it's fine to be merged.

I will test it on my natsemi based boxes later today.

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

Privacy in residential applications is a desirable marketing option.
  (ETSI EN 300 175-7 Ch. A6)


pgpXjBH2ZtIut.pgp
Description: PGP signature

Re: Integrating IXP4xx NPE support in kernel (with IXP4xx accesslibrary)

2006-01-04 Thread Harald Welte

On Tue, Dec 27, 2005 at 09:22:31AM -0500, jamal wrote:
 On Tue, 2005-27-12 at 15:05 +0100, Lennert Buytenhek wrote:
  On Tue, Dec 27, 2005 at 08:49:50AM -0500, jamal wrote:
  
   A question for you: Does the 4XX allow you to rewrite the microcode?
  
  As far as I know, there are no docs for this, at least not any publicly
  available ones.
  
  Chapter 4 in the IXP42X Developer's Manual is titled Network Processor
  Engines (NPE), but that's just a 2-page marketing blurb on how great
  their stuff is.  If you ask Intel for docs, they just tell you to go
  look at the available (evil-licensed) source code.
 
 They probably will never open it up.

which is quite surprising, given the fact that you get full
documentation (even as printed books) for the ixp2xxx microengines.

Yes, all the actual example microcode is evil-licensed, but as lennert
has shown it is possible to develop gpl licensed microengine code using
their docs.

why can't they just do the same for ixp4xx?

-- 
- Harald Welte [EMAIL PROTECTED]  http://gnumonks.org/

Privacy in residential applications is a desirable marketing option.
  (ETSI EN 300 175-7 Ch. A6)


pgpmx85Sbq20y.pgp
Description: PGP signature

Re: 2.6.15-rc2-mm1

2005-11-24 Thread Harald Welte

On Wed, Nov 23, 2005 at 11:22:18AM -0800, Andrew Morton wrote:
 Michal Piotrowski [EMAIL PROTECTED] wrote:
 
   Debug: sleeping function called from invalid context at
   include/asm/semaphore.h:123
   in_atomic():1, irqs_disabled():0
[c0103be6] dump_stack+0x17/0x19
[c011a0c3] __might_sleep+0x9c/0xae
[fd9a090d] translate_table+0x147/0xc14 [ip_tables]
[fd9a2b2a] ipt_register_table+0x93/0x20d [ip_tables]
[f98fe027] init+0x27/0x9e [iptable_filter]
[c01376d0] sys_init_module+0xd7/0x26c
[c0102cc7] sysenter_past_esp+0x54/0x75
   ---
   | preempt count: 0001 ]
   | 1 level deep critical section nesting:
   
   .. [fd9a2aca]  ipt_register_table+0x33/0x20d [ip_tables]
   .[f98fe027] ..   ( = init+0x27/0x9e [iptable_filter])
  
 
 ipt_register_table() does get_cpu() then calls translate_table(), and
 somewhere under translate_table() we do something which sleeps, only I'm not
 sure what it is - netfilter likes to hide things in unexpected places.

I'll investigate this.  the get_cpu() scheme was introduced as a fix for
a different (less serious) problem.  

You'll get a reply until later today.

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


pgprVtGQHXCIr.pgp
Description: PGP signature

Re: Fw: [Fwd: [Bug 5644] New: NFS v3 TCP 3-way handshake incorrect, iptables blocks access]

2005-11-24 Thread Harald Welte

On Wed, Nov 23, 2005 at 02:44:19PM -0800, David S. Miller wrote:
 Please make sure this gets looked at, and at least reviewed.

Jozsef Kadlecsik doesn't recall those patches/changes (even though he's
our Mr. TCP state tracking and is indicated as the author of one of
the two patches.

I also don't recall having seen any of those patches before.  But that
doesn't mean all too much, my brain is like a sieve some times.

Jozsef is most familiar with that code, he said he'll cook up a new
patch and do nfsim testing over the next days.

Also, it was indicated that those fixes did go mainline at some point?
Can soembody please indicate when that was supposed to happen?  I don't
really recall any of this, and I would be surprised if we deliberately
backed out such changes at a later point.

Thanks!

-- 
- Harald Welte [EMAIL PROTECTED] http://netfilter.org/

  Fragmentation is like classful addressing -- an interesting early
   architectural error that shows how much experimentation was going
   on while IP was being designed.-- Paul Vixie


pgpQbFIBHhzlb.pgp
Description: PGP signature

1 2 3 >

1 - 100 of 200 matches

Mail list logo