Re: [RFC/RFT, net-next, 00/17] net: Convert neighbor tables to per-namespace

2018-08-11 Thread Vasily Averin
On 07/17/2018 03:06 PM, dsah...@kernel.org wrote:
> From: David Ahern 
> 
> Nikita Leshenko reported that neighbor entries in one namespace can
> evict neighbor entries in another. The problem is that the neighbor
> tables have entries across all namespaces without separate accounting
> and with global limits on when to scan for entries to evict.
> 
> Resolve by making the neighbor tables for ipv4, ipv6 and decnet per
> namespace and making the accounting and threshold limits per namespace.

Dear David,
I prepared own patch set to fix this problem and found your one.
It looks perfect for me, and I hope David Miller will merge it soon,
however I have found a few drawbacks:

1) I know that if net_device exist it always have correct net reference,
so dev_net(dev) will be always correct.
However I afraid that device reference itself is correct in some places.
For example, 
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
@@ -376,9 +377,10 @@ mlxsw_sp_span_entry_gretap4_parms(const struct net_device 
*to_dev,
return mlxsw_sp_span_entry_unoffloadable(sparmsp);
 
l3edev = mlxsw_sp_span_gretap4_route(to_dev, &saddr.addr4, &gw.addr4);
+   tbl = ipv4_neigh_table(dev_net(l3edev));
return mlxsw_sp_span_entry_tunnel_parms_common(l3edev, saddr, daddr, gw,
   tparm.iph.ttl,
-  &arp_tbl, sparmsp);
+  tbl, sparmsp);
 }
 
mlxsw_sp_span_entry_tunnel_parms_common() have "if (!edev)" check inside,
so it seems l3edev can be set to NULL here and lead to crash inside 
dev_net(l3edev).
There are few other suspicious places and I think they should be carefully 
re-checked.

2) modified arp_net_init() does not check return value neigh_sysctl_register() 
and lacks correct rollback.
It was acceptable in arp_init, because it was called only once on boot, but now 
it will be called 
for each new net namespace, it can have real chances to fail lead to memory 
crash/memory corruption.

3) modified neigh_table_init() is called many times per netns but it can panic 
in case failed memory allocation.
I think it should be reworked to return errors in such cases, its callers 
should check it and add correct rollbacks.

4) currently neigh_table_clear() always return 0, I think it makes sense to 
change it to return void.

Thank you,
Vasily Averin 


Re: url filtering with netfiler

2018-08-11 Thread Saber Rezvani




On 08/12/2018 12:24 AM, Oleg wrote:

On Sat, Aug 11, 2018 at 12:15:26PM +0200, Pablo Neira Ayuso wrote:

We used to have mmap for nfq but that was removed because there was no
performance gain from it.

   Interesting. I didn't know about it. Was that a work without
kernelspace to userspace copying?


I think it's unlikely we'll see this infra
again in place. Moreover, there's already a number of mechanism in
place for nfq that were providing similar numbers.

   What mechanisms for example?

So in the final decision, is it worthy to implement it in Netfilter 
subsystem? like other features which Netfilter provides to users. A new 
features which we can call it an HTTP URL filtering.




Re: url filtering with netfiler

2018-08-11 Thread Oleg
On Sat, Aug 11, 2018 at 12:15:26PM +0200, Pablo Neira Ayuso wrote:
> We used to have mmap for nfq but that was removed because there was no
> performance gain from it.

  Interesting. I didn't know about it. Was that a work without
kernelspace to userspace copying?

> I think it's unlikely we'll see this infra
> again in place. Moreover, there's already a number of mechanism in
> place for nfq that were providing similar numbers.

  What mechanisms for example?

-- 
Олег Неманов (Oleg Nemanov)


Re: [PATCH 3/3 nft] src: osf: import nfnl_osf.c to load osf fingerprints

2018-08-11 Thread Fernando Fernandez Mancera




On 08/11/2018 12:03 PM, Pablo Neira Ayuso wrote:

+#endif /* _NF_OSF_H */
diff --git a/include/nfnl_osf.h b/include/nfnl_osf.h
new file mode 100644
index 000..d9287e9
--- /dev/null
+++ b/include/nfnl_osf.h
@@ -0,0 +1,6 @@
+#ifndef _NFNL_OSF_H
+#define _NFNL_OSF_H
+
+int nfnl_osf_load_fingerprints(struct netlink_ctx *ctx, int del);
+
+#endif /* _NFNL_OSF_H */
diff --git a/include/osf.h b/include/osf.h
index 715b04e..0a35b07 100644
--- a/include/osf.h
+++ b/include/osf.h
@@ -1,6 +1,8 @@
  #ifndef NFTABLES_OSF_H
  #define NFTABLES_OSF_H
  
+bool osf_init;


I think you can probably place osf_init in struct netlink_ctx?



If we place osf_init in struct netlink_ctx we will need to modify 
osf_expr_alloc() and I am not sure if we can get access to netlink_ctx 
from netlink_parse_osf() in netlink_delinearize.c. Also we will need 
access to netlink_ctx from parser_bison.y.


So I propose to add osf_init in nfnl_osf.h in order to have only one 
extra include in rule.c. Thanks.



  struct expr *osf_expr_alloc(const struct location *loc);
  
  #endif /* NFTABLES_OSF_H */

diff --git a/src/Makefile.am b/src/Makefile.am
index ed3640e..e569029 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -57,6 +57,7 @@ libnftables_la_SOURCES =  \
services.c  \
mergesort.c \
osf.c   \
+   nfnl_osf.c  \
tcpopt.c\
socket.c\
libnftables.c
diff --git a/src/nfnl_osf.c b/src/nfnl_osf.c
new file mode 100644
index 000..07bf682
--- /dev/null
+++ b/src/nfnl_osf.c
@@ -0,0 +1,449 @@
+/*
+ * Copyright (c) 2005 Evgeniy Polyakov 
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, 
USA.
+ */
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#include 
+
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#define OPTDEL ','
+#define OSFPDEL':'
+#define MAXOPTSTRLEN   128
+
+static struct nf_osf_opt IANA_opts[] = {
+   { .kind = 0, .length = 1,},
+   { .kind=1, .length=1,},
+   { .kind=2, .length=4,},
+   { .kind=3, .length=3,},
+   { .kind=4, .length=2,},
+   { .kind=5, .length=1,}, /* SACK length is not defined */
+   { .kind=6, .length=6,},
+   { .kind=7, .length=6,},
+   { .kind=8, .length=10,},
+   { .kind=9, .length=2,},
+   { .kind=10, .length=3,},
+   { .kind=11, .length=1,},/* CC: Suppose 1 */
+   { .kind=12, .length=1,},/* the same */
+   { .kind=13, .length=1,},/* and here too */
+   { .kind=14, .length=3,},
+   { .kind=15, .length=1,},/* TCP Alternate Checksum Data. 
Length is not defined */
+   { .kind=16, .length=1,},
+   { .kind=17, .length=1,},
+   { .kind=18, .length=3,},
+   { .kind=19, .length=18,},
+   { .kind=20, .length=1,},
+   { .kind=21, .length=1,},
+   { .kind=22, .length=1,},
+   { .kind=23, .length=1,},
+   { .kind=24, .length=1,},
+   { .kind=25, .length=1,},
+   { .kind=26, .length=1,},
+};
+
+static void uloga(const char *f, struct netlink_ctx *ctx, ...)
+{
+   if (!(ctx->debug_mask & NFT_DEBUG_NETLINK))
+   return;
+
+   nft_print(ctx->octx, "%s", f);
+}


I think you can use uloga() all the time, so you can remove ulog()
function.



I agree. Changes done.


+static void ulog(const char *f, struct netlink_ctx *ctx, ...)
+{
+   char str[64];
+   struct tm tm;
+   struct timeval tv;
+
+   gettimeofday(&tv, NULL);
+   localtime_r((time_t *)&tv.tv_sec, &tm);
+   strftime(str, sizeof(str), "%F %R:%S", &tm);
+
+   if (!(ctx->debug_mask & NFT_DEBUG_NETLINK))
+   return;
+
+   nft_print(ctx->octx, "%s.%lu %ld %s", str, tv.tv_usec,
+ syscall(__NR_gettid), f);
+}
+
+#define ulog_err(f, ctx, a...) uloga(f ": %s [%d].\n", ctx, ##a, 
strerror(errno), errno)


And this macro too.

Other than that, this looks good to me, thanks.



[PATCH v2] nft: doc: Fixed typos in asciidoc

2018-08-11 Thread Arushi Singhal
Correct all the typo mistakes done while converting man page source to
asciidoc.

Signed-off-by: Arushi Singhal 
---
changes in v2
-submit the patch on top of current git HEAD

 doc/data-types.txt | 2 +-
 doc/nft.txt| 6 +++---
 doc/primary-expression.txt | 2 +-
 doc/statements.txt | 6 +++---
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/doc/data-types.txt b/doc/data-types.txt
index 1d46bdc..8796259 100644
--- a/doc/data-types.txt
+++ b/doc/data-types.txt
@@ -268,7 +268,7 @@ ICMPV6 TYPE TYPE
 |==
 |Name | Keyword | Size | Base type
 |ICMPv6 Type |
-icmpv6_type |
+icmpx_code |
 8 bit |
 integer
 |===
diff --git a/doc/nft.txt b/doc/nft.txt
index 5b477d8..6c73bc5 100644
--- a/doc/nft.txt
+++ b/doc/nft.txt
@@ -61,8 +61,8 @@ For a full summary of options, run *nft --help*.
 
 *-e*::
 *--echo*::
-   When  inserting  items  into the ruleset using add, insert or replace 
commands, print notifications
-   just like nft monitor.
+   When  inserting  items  into the ruleset using *add*, *insert* or 
*replace* commands, print notifications
+   just like *nft monitor*.
 
 *-I*::
 *--includepath directory*::
@@ -72,7 +72,7 @@ For a full summary of options, run *nft --help*.
 *-f*::
 *--file 'filename'*::
Read input from 'filename'. If 'filename' is -, read from stdin. +
-   nft scripts must start #!/usr/sbin/nft -f
+   nft scripts must start *#!/usr/sbin/nft -f*
 
 *-i*::
 *--interactive*::
diff --git a/doc/primary-expression.txt b/doc/primary-expression.txt
index 4ca096d..12b88bd 100644
--- a/doc/primary-expression.txt
+++ b/doc/primary-expression.txt
@@ -130,7 +130,7 @@ raw prerouting meta secpath exists accept
 SOCKET EXPRESSION
 ~
 [verse]
-socket {transparent}
+*socket* \{transparent\}
 
 Socket expression can be used to search for an existing open TCP/UDP socket and
 its attributes that can be associated with a packet. It looks for an 
established
diff --git a/doc/statements.txt b/doc/statements.txt
index b8b7a60..b93cf2c 100644
--- a/doc/statements.txt
+++ b/doc/statements.txt
@@ -62,8 +62,8 @@ tcp flags syn tcp option maxseg size set rt mtu
 LOG STATEMENT
 ~
 [verse]
-log [prefix quoted_string] [level syslog-level] [flags log-flags]
-log group nflog_group [prefix quoted_string] [queue-threshold value] [snaplen 
size]
+*log* [prefix 'quoted_string'] [level 'syslog-level'] [flags 'log-flags']
+*log* group 'nflog_group' [prefix 'quoted_string'] [queue-threshold 'value'] 
[snaplen 'size']
 
 The log statement enables logging of matching packets. When this statement is
 used from a rule, the Linux kernel will print some information on all matching
@@ -165,7 +165,7 @@ COUNTER STATEMENT
 A counter statement sets the hit count of packets along with the number of 
bytes.
 
 [verse]
-counter [ packets 'number' bytes 'number' ]
+*counter* [ packets 'number' bytes 'number' ]
 
 CONNTRACK STATEMENT
 ~~~
-- 
2.7.4



Re: url filtering with netfiler

2018-08-11 Thread Pablo Neira Ayuso
On Fri, Aug 10, 2018 at 04:50:09PM +0300, Oleg wrote:
> On Fri, Aug 10, 2018 at 02:01:25PM +0200, Pablo Neira Ayuso wrote:
> > On Thu, Aug 02, 2018 at 10:44:14PM +0300, Oleg wrote:
> > > On Thu, Aug 02, 2018 at 06:44:26PM +0430, Saber Rezvani wrote:
> > > IMHO, this can be easier implemented with help of userspace.
> > > This can be nfq-based program(something like
> > > https://github.com/lego12239/trfl), that assembles tcp session packets
> > > and mark matched connections for blocking.
> > 
> > We can do this from the kernel, by implementing a template based
> > approach with aho-corasick (to find all string keys you want to use
> > for matching in one single go) then match the values.
> > 
> > Userspace needs to provide a description of the layout of the
> > application protocol that you want to match through template. The
> > template describes keys, datatype and field length. It should be
> > flexiable enough to model a number of target application protocol that
> > are of interest.
>
>   Such a template language will be complex. The best template is a
> C code to parse a packet and split it to needed fields :-).

I'm not debating fully userspace vs. fully kernelspace: I think
extending the kernel to have capabilities to match on application
layer as a pre-classifier can be useful, ie. if kernel can parse the
bits that are needed, then there will be no need to pass up this
packet to userspace.

As said, I would start by looking at the application protocols that
are of interest, and then making an abstraction that would fit for
interpreting application layer layouts in a way that allows us to
extend this infrastructure incrementally.

> > To deal with segmentation, in case kernel cannot parse the packet,
> > we can pass it to userspace for further inspection.
> 
>   IMHO, we don't need to complicate a kernel side if we can
> do nfq without full packet copying(in this case nfq speed will be
> enough). Something like a mmap to packet kernel buffer or transmiting
> with nfq just an address of a packet start and a length in the kernel
> space; or something else. I don't know kernel internals so good to say
> the exact way to achieve this.

We used to have mmap for nfq but that was removed because there was no
performance gain from it. I think it's unlikely we'll see this infra
again in place. Moreover, there's already a number of mechanism in
place for nfq that were providing similar numbers.

Thanks.


Re: [PATCH 3/3 nft] src: osf: import nfnl_osf.c to load osf fingerprints

2018-08-11 Thread Pablo Neira Ayuso
On Fri, Aug 10, 2018 at 03:02:00PM +0200, Fernando Fernandez Mancera wrote:
> Import iptables/utils/nfnl_osf.c into nftables tree with some changes in order
> to load OS fingerprints automatically from pf.os file.
> 
> Signed-off-by: Fernando Fernandez Mancera 
> ---
>  include/linux/netfilter/nfnetlink_osf.h | 119 +++
>  include/nfnl_osf.h  |   6 +
>  include/osf.h   |   2 +
>  src/Makefile.am |   1 +
>  src/nfnl_osf.c  | 449 
>  src/osf.c   |   2 +
>  src/rule.c  |   5 +
>  7 files changed, 584 insertions(+)
>  create mode 100644 include/linux/netfilter/nfnetlink_osf.h
>  create mode 100644 include/nfnl_osf.h
>  create mode 100644 src/nfnl_osf.c
> 
> diff --git a/include/linux/netfilter/nfnetlink_osf.h 
> b/include/linux/netfilter/nfnetlink_osf.h
> new file mode 100644
> index 000..15a39d2
> --- /dev/null
> +++ b/include/linux/netfilter/nfnetlink_osf.h
> @@ -0,0 +1,119 @@
> +#ifndef _NF_OSF_H
> +#define _NF_OSF_H
> +
> +#include 
> +
> +#define MAXGENRELEN  32
> +
> +#define NF_OSF_GENRE (1 << 0)
> +#define NF_OSF_TTL   (1 << 1)
> +#define NF_OSF_LOG   (1 << 2)
> +#define NF_OSF_INVERT(1 << 3)
> +
> +#define NF_OSF_LOGLEVEL_ALL  0   /* log all matched fingerprints 
> */
> +#define NF_OSF_LOGLEVEL_FIRST1   /* log only the first 
> matced fingerprint */
> +#define NF_OSF_LOGLEVEL_ALL_KNOWN2   /* do not log unknown packets */
> +
> +#define NF_OSF_TTL_TRUE  0   /* True ip and 
> fingerprint TTL comparison */
> +
> +/* Check if ip TTL is less than fingerprint one */
> +#define NF_OSF_TTL_LESS  1
> +
> +/* Do not compare ip and fingerprint TTL at all */
> +#define NF_OSF_TTL_NOCHECK   2
> +
> +#define NF_OSF_FLAGMASK  (NF_OSF_GENRE | NF_OSF_TTL | \
> +  NF_OSF_LOG | NF_OSF_INVERT)
> +/* Wildcard MSS (kind of).
> + * It is used to implement a state machine for the different wildcard values
> + * of the MSS and window sizes.
> + */
> +struct nf_osf_wc {
> + __u32   wc;
> + __u32   val;
> +};
> +
> +/* This struct represents IANA options
> + * http://www.iana.org/assignments/tcp-parameters
> + */
> +struct nf_osf_opt {
> + __u16   kind, length;
> + struct nf_osf_wcwc;
> +};
> +
> +struct nf_osf_info {
> + chargenre[MAXGENRELEN];
> + __u32   len;
> + __u32   flags;
> + __u32   loglevel;
> + __u32   ttl;
> +};
> +
> +struct nf_osf_user_finger {
> + struct nf_osf_wcwss;
> +
> + __u8ttl, df;
> + __u16   ss, mss;
> + __u16   opt_num;
> +
> + chargenre[MAXGENRELEN];
> + charversion[MAXGENRELEN];
> + charsubtype[MAXGENRELEN];
> +
> + /* MAX_IPOPTLEN is maximum if all options are NOPs or EOLs */
> + struct nf_osf_opt   opt[MAX_IPOPTLEN];
> +};
> +
> +struct nf_osf_nlmsg {
> + struct nf_osf_user_finger   f;
> + struct iphdrip;
> + struct tcphdr   tcp;
> +};
> +
> +/* Defines for IANA option kinds */
> +enum iana_options {
> + OSFOPT_EOL = 0, /* End of options */
> + OSFOPT_NOP, /* NOP */
> + OSFOPT_MSS, /* Maximum segment size */
> + OSFOPT_WSO, /* Window scale option */
> + OSFOPT_SACKP,   /* SACK permitted */
> + OSFOPT_SACK,/* SACK */
> + OSFOPT_ECHO,
> + OSFOPT_ECHOREPLY,
> + OSFOPT_TS,  /* Timestamp option */
> + OSFOPT_POCP,/* Partial Order Connection Permitted */
> + OSFOPT_POSP,/* Partial Order Service Profile */
> +
> + /* Others are not used in the current OSF */
> + OSFOPT_EMPTY = 255,
> +};
> +
> +/*
> + * Initial window size option state machine: multiple of mss, mtu or
> + * plain numeric value. Can also be made as plain numeric value which
> + * is not a multiple of specified value.
> + */
> +enum nf_osf_window_size_options {
> + OSF_WSS_PLAIN   = 0,
> + OSF_WSS_MSS,
> + OSF_WSS_MTU,
> + OSF_WSS_MODULO,
> + OSF_WSS_MAX,
> +};
> +
> +enum nf_osf_attr_type {
> + OSF_ATTR_UNSPEC,
> + OSF_ATTR_FINGER,
> + OSF_ATTR_MAX,
> +};
> +
> +/*
> + * Add/remove fingerprint from the kernel.
> + */
> +enum nf_osf_msg_types {
> + OSF_MSG_ADD,
> + OSF_MSG_REMOVE,
> + OSF_MSG_MAX,
> +};
> +
> +#endif /* _NF_OSF_H */
> diff --git a/include/nfnl_osf.h b/include/nfnl_osf.h
> new file mode 100644
> index 000..d9287e9
> --- /dev/null
> +++ b/include/nfnl_osf.h
> @@ -0,0 +1,6 @@
> +#ifndef _NFNL_OSF_H
> +#define _NFNL_OSF_H
> +
> +int nfnl_osf_load_fingerprints(struct netlink_ctx *ctx, int del);
> +
> +#endif   /* _NFNL_OSF_H */
> diff --git a/include/osf.h b/include/osf.h
> index 715b04e..0a35b07 100644
> --- a/include/osf.h
> +++ b/includ

Re: [PATCH 1/3 nft] files: osf: copy iptables/utils/pf.os into nftables tree

2018-08-11 Thread Pablo Neira Ayuso
On Fri, Aug 10, 2018 at 03:01:58PM +0200, Fernando Fernandez Mancera wrote:
> As we are going to need pf.os file to load OS fingerprints from the incoming
> nfnl_osf.c, we copy it into the nftables tree directory "files/osf/".

Could you also add a Makefile.am for files/osf/ and update
files/Makefile.am to install via dist_pkgsysconf_DATA?

Then, the C code should use DEFAULT_INCLUDE_PATH to build the path to
the pf.os that nft will autoload, I mean, to find the pf.os file.

Thanks.