Re: [PATCH][PING] Hide private symbols in libnfnetlink

2018-05-03 Thread Jan Engelhardt
On Thursday 2018-05-03 17:03, Yuri Gribov wrote:

>Hi all,
>
>Here's the updated version of the patch.
>
>diff --git a/src/Makefile.am b/src/Makefile.am
>index d0098cc..d91c9f7 100644
>--- a/src/Makefile.am
>+++ b/src/Makefile.am
>@@ -3,7 +3,8 @@ include $(top_srcdir)/Make_global.am
> lib_LTLIBRARIES = libnfnetlink.la
> 
> libnfnetlink_la_LDFLAGS = -Wc,-nostartfiles   \
>--version-info $(LIBVERSION)
>+-version-info $(LIBVERSION) \
>+-Wl,--version-script=$(srcdir)/nfnl.version
> libnfnetlink_la_SOURCES = libnfnetlink.c iftable.c rtnl.c
> 
> noinst_HEADERS = iftable.h rtnl.h

One more line will be needed,

EXTRA_libnfnetlink_la_DEPENDENCIES = nfnl.version

otherwise the linker won't rerun if the .version file gets modified.
--
To unsubscribe from this list: send the line "unsubscribe netfilter-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH iptables] extensions: libipt_DNAT: use size of nf_nat_range2 for rev2

2018-05-03 Thread Thierry Du Tre
On 03-05-18 21:40, Florian Westphal wrote:
> DNAT tests fail on nf-next.git: the kernel complains about a target size
> mismatch (40 vs 48). This patch fixes it for me.
>
> Fixes: 36976c4b5406 ("extensions: libipt_DNAT: support shifted portmap ranges")
> Signed-off-by: Florian Westphal 
> ---
>  extensions/libip6t_DNAT.c | 4 ++--
>  extensions/libipt_DNAT.c  | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/extensions/libip6t_DNAT.c b/extensions/libip6t_DNAT.c
> index 2a7574b02444..89c5ceb15325 100644
> --- a/extensions/libip6t_DNAT.c
> +++ b/extensions/libip6t_DNAT.c
> @@ -393,8 +393,8 @@ static struct xtables_target dnat_tg_reg[] = {
>   .version= XTABLES_VERSION,
>   .family = NFPROTO_IPV6,
>   .revision   = 2,
> - .size   = XT_ALIGN(sizeof(struct nf_nat_range)),
> - .userspacesize  = XT_ALIGN(sizeof(struct nf_nat_range)),
> + .size   = XT_ALIGN(sizeof(struct nf_nat_range2)),
> + .userspacesize  = XT_ALIGN(sizeof(struct nf_nat_range2)),
>   .help   = DNAT_help_v2,
>   .print  = DNAT_print_v2,
>   .save   = DNAT_save_v2,
> diff --git a/extensions/libipt_DNAT.c b/extensions/libipt_DNAT.c
> index b89d3ca5f0d4..4907a2e83d06 100644
> --- a/extensions/libipt_DNAT.c
> +++ b/extensions/libipt_DNAT.c
> @@ -537,8 +537,8 @@ static struct xtables_target dnat_tg_reg[] = {
>   .version= XTABLES_VERSION,
>   .family = NFPROTO_IPV4,
>   .revision   = 2,
> - .size   = XT_ALIGN(sizeof(struct nf_nat_range)),
> - .userspacesize  = XT_ALIGN(sizeof(struct nf_nat_range)),
> + .size   = XT_ALIGN(sizeof(struct nf_nat_range2)),
> + .userspacesize  = XT_ALIGN(sizeof(struct nf_nat_range2)),
>   .help   = DNAT_help_v2,
>   .print  = DNAT_print_v2,
>   .save   = DNAT_save_v2,
Hi Florian,

I'm going to verify, but that looks like a logical fix indeed.

Thierry


[PATCH iptables] extensions: libipt_DNAT: use size of nf_nat_range2 for rev2

2018-05-03 Thread Florian Westphal
DNAT tests fail on nf-next.git: the kernel complains about a target size
mismatch (40 vs 48). This patch fixes it for me.

Fixes: 36976c4b5406 ("extensions: libipt_DNAT: support shifted portmap ranges")
Signed-off-by: Florian Westphal 
---
 extensions/libip6t_DNAT.c | 4 ++--
 extensions/libipt_DNAT.c  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/extensions/libip6t_DNAT.c b/extensions/libip6t_DNAT.c
index 2a7574b02444..89c5ceb15325 100644
--- a/extensions/libip6t_DNAT.c
+++ b/extensions/libip6t_DNAT.c
@@ -393,8 +393,8 @@ static struct xtables_target dnat_tg_reg[] = {
.version= XTABLES_VERSION,
.family = NFPROTO_IPV6,
.revision   = 2,
-   .size   = XT_ALIGN(sizeof(struct nf_nat_range)),
-   .userspacesize  = XT_ALIGN(sizeof(struct nf_nat_range)),
+   .size   = XT_ALIGN(sizeof(struct nf_nat_range2)),
+   .userspacesize  = XT_ALIGN(sizeof(struct nf_nat_range2)),
.help   = DNAT_help_v2,
.print  = DNAT_print_v2,
.save   = DNAT_save_v2,
diff --git a/extensions/libipt_DNAT.c b/extensions/libipt_DNAT.c
index b89d3ca5f0d4..4907a2e83d06 100644
--- a/extensions/libipt_DNAT.c
+++ b/extensions/libipt_DNAT.c
@@ -537,8 +537,8 @@ static struct xtables_target dnat_tg_reg[] = {
.version= XTABLES_VERSION,
.family = NFPROTO_IPV4,
.revision   = 2,
-   .size   = XT_ALIGN(sizeof(struct nf_nat_range)),
-   .userspacesize  = XT_ALIGN(sizeof(struct nf_nat_range)),
+   .size   = XT_ALIGN(sizeof(struct nf_nat_range2)),
+   .userspacesize  = XT_ALIGN(sizeof(struct nf_nat_range2)),
.help   = DNAT_help_v2,
.print  = DNAT_print_v2,
.save   = DNAT_save_v2,
-- 
2.16.1



[PATCH net] ipvs: fix stats update from local clients

2018-05-03 Thread Julian Anastasov
Local clients are not properly synchronized on 32-bit CPUs when
updating stats (3.10+). It is possible for estimation_timer (a timer),
which is a stats reader, to interrupt the local client in the middle of
a write_seqcount_{begin,end} sequence, leading to an endless loop
(DEADLOCK). The same interruption can come from a received packet
(SoftIRQ) that updates the same per-CPU stats.

Fix it by disabling BH while updating stats.

Found with debug:

WARNING: inconsistent lock state
4.17.0-rc2-00105-g35cb6d7-dirty #2 Not tainted

inconsistent {IN-SOFTIRQ-R} -> {SOFTIRQ-ON-W} usage.
ftp/2545 [HC0[0]:SC0[0]:HE1:SE1] takes:
86845479 (&syncp->seq#6){+.+-}, at: ip_vs_schedule+0x1c5/0x59e [ip_vs]
{IN-SOFTIRQ-R} state was registered at:
 lock_acquire+0x44/0x5b
 estimation_timer+0x1b3/0x341 [ip_vs]
 call_timer_fn+0x54/0xcd
 run_timer_softirq+0x10c/0x12b
 __do_softirq+0xc1/0x1a9
 do_softirq_own_stack+0x1d/0x23
 irq_exit+0x4a/0x64
 smp_apic_timer_interrupt+0x63/0x71
 apic_timer_interrupt+0x3a/0x40
 default_idle+0xa/0xc
 arch_cpu_idle+0x9/0xb
 default_idle_call+0x21/0x23
 do_idle+0xa0/0x167
 cpu_startup_entry+0x19/0x1b
 start_secondary+0x133/0x182
 startup_32_smp+0x164/0x168
irq event stamp: 42213

other info that might help us debug this:
Possible unsafe locking scenario:

       CPU0
       ----
  lock(&syncp->seq#6);
  <Interrupt>
    lock(&syncp->seq#6);

*** DEADLOCK ***

Fixes: ac69269a45e8 ("ipvs: do not disable bh for long time")
Signed-off-by: Julian Anastasov 
---
 net/netfilter/ipvs/ip_vs_core.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
index 5f6f73c..0679dd1 100644
--- a/net/netfilter/ipvs/ip_vs_core.c
+++ b/net/netfilter/ipvs/ip_vs_core.c
@@ -119,6 +119,8 @@ ip_vs_in_stats(struct ip_vs_conn *cp, struct sk_buff *skb)
struct ip_vs_cpu_stats *s;
struct ip_vs_service *svc;
 
+   local_bh_disable();
+
s = this_cpu_ptr(dest->stats.cpustats);
u64_stats_update_begin(&s->syncp);
s->cnt.inpkts++;
@@ -137,6 +139,8 @@ ip_vs_in_stats(struct ip_vs_conn *cp, struct sk_buff *skb)
s->cnt.inpkts++;
s->cnt.inbytes += skb->len;
u64_stats_update_end(&s->syncp);
+
+   local_bh_enable();
}
 }
 
@@ -151,6 +155,8 @@ ip_vs_out_stats(struct ip_vs_conn *cp, struct sk_buff *skb)
struct ip_vs_cpu_stats *s;
struct ip_vs_service *svc;
 
+   local_bh_disable();
+
s = this_cpu_ptr(dest->stats.cpustats);
u64_stats_update_begin(&s->syncp);
s->cnt.outpkts++;
@@ -169,6 +175,8 @@ ip_vs_out_stats(struct ip_vs_conn *cp, struct sk_buff *skb)
s->cnt.outpkts++;
s->cnt.outbytes += skb->len;
u64_stats_update_end(&s->syncp);
+
+   local_bh_enable();
}
 }
 
@@ -179,6 +187,8 @@ ip_vs_conn_stats(struct ip_vs_conn *cp, struct 
ip_vs_service *svc)
struct netns_ipvs *ipvs = svc->ipvs;
struct ip_vs_cpu_stats *s;
 
+   local_bh_disable();
+
s = this_cpu_ptr(cp->dest->stats.cpustats);
u64_stats_update_begin(&s->syncp);
s->cnt.conns++;
@@ -193,6 +203,8 @@ ip_vs_conn_stats(struct ip_vs_conn *cp, struct 
ip_vs_service *svc)
u64_stats_update_begin(&s->syncp);
s->cnt.conns++;
u64_stats_update_end(&s->syncp);
+
+   local_bh_enable();
 }
 
 
-- 
2.9.5



[PATCH net] ipvs: fix refcount usage for conns in ops mode

2018-05-03 Thread Julian Anastasov
Connections in one-packet scheduling mode (-o, --ops) are
removed with refcnt=0 because they are not hashed in the conn table.
To keep refcount_dec from reporting this as an error, remove them
with refcount_dec_if_one, like all other connections.

refcount_t hit zero at ip_vs_conn_put+0x31/0x40 [ip_vs]
in sh[15519], uid/euid: 497/497
WARNING: CPU: 0 PID: 15519 at ../kernel/panic.c:657
refcount_error_report+0x94/0x9e
Modules linked in: ip_vs_rr cirrus ttm sb_edac
edac_core drm_kms_helper crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel pcbc mousedev drm aesni_intel aes_x86_64
crypto_simd glue_helper cryptd psmouse evdev input_leds led_class
intel_agp fb_sys_fops syscopyarea sysfillrect intel_rapl_perf mac_hid
intel_gtt serio_raw sysimgblt agpgart i2c_piix4 i2c_core ata_generic
pata_acpi floppy cfg80211 rfkill button loop macvlan ip_vs
nf_conntrack libcrc32c crc32c_generic ip_tables x_tables ipv6
crc_ccitt autofs4 ext4 crc16 mbcache jbd2 fscrypto ata_piix libata
atkbd libps2 scsi_mod crc32c_intel i8042 rtc_cmos serio af_packet
dm_mod dax fuse xen_netfront xen_blkfront
CPU: 0 PID: 15519 Comm: sh Tainted: GW
4.15.17 #1-NixOS
Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
RIP: 0010:refcount_error_report+0x94/0x9e
RSP: :a344dde039c8 EFLAGS: 00010296
RAX: 0057 RBX: 92f20e06 RCX: 0006
RDX: 0007 RSI: 0086 RDI: a344dde165c0
RBP: a344dde03b08 R08: 0218 R09: 0004
R10: 93006a80 R11: 0001 R12: a344d68cd100
R13: 01f1 R14: 92f12fb0 R15: 0004
FS:  7fc9d2040fc0() GS:a344dde0()
knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 0262a000 CR3: 16a0c004 CR4: 001606f0
Call Trace:
 <IRQ>
 ex_handler_refcount+0x4e/0x80
 fixup_exception+0x33/0x40
 do_trap+0x83/0x140
 do_error_trap+0x83/0xf0
 ? ip_vs_conn_drop_conntrack+0x120/0x1a5 [ip_vs]
 ? ip_finish_output2+0x29c/0x390
 ? ip_finish_output2+0x1a2/0x390
 invalid_op+0x1b/0x40
RIP: 0010:ip_vs_conn_put+0x31/0x40 [ip_vs]
RSP: :a344dde03bb8 EFLAGS: 00010246
RAX: 0001 RBX: a344df31cf00 RCX: a344d7450198
RDX: 0003 RSI: fe01 RDI: a344d7450140
RBP: 0002 R08: 0476 R09: 
R10: a344dde03b28 R11: a344df20 R12: a344d7d09000
R13: a344def3a980 R14: c04f6e20 R15: 0008
 ip_vs_in.part.29.constprop.36+0x34f/0x640 [ip_vs]
 ? ip_vs_conn_out_get+0xe0/0xe0 [ip_vs]
 ip_vs_remote_request4+0x47/0xa0 [ip_vs]
 ? ip_vs_in.part.29.constprop.36+0x640/0x640 [ip_vs]
 nf_hook_slow+0x43/0xc0
 ip_local_deliver+0xac/0xc0
 ? ip_rcv_finish+0x400/0x400
 ip_rcv+0x26c/0x380
 __netif_receive_skb_core+0x3a0/0xb10
 ? inet_gro_receive+0x23c/0x2b0
 ? netif_receive_skb_internal+0x24/0xb0
 netif_receive_skb_internal+0x24/0xb0
 napi_gro_receive+0xb8/0xe0
 xennet_poll+0x676/0xb40 [xen_netfront]
 net_rx_action+0x139/0x3a0
 __do_softirq+0xde/0x2b4
 irq_exit+0xae/0xb0
 xen_evtchn_do_upcall+0x2c/0x40
 xen_hvm_callback_vector+0x7d/0x90
 </IRQ>
RIP: 0033:0x7fc9d11c91f9
RSP: 002b:7ffebe8a2ea0 EFLAGS: 0202 ORIG_RAX:
ff0c
RAX:  RBX: 02609808 RCX: 0054
RDX: 0001 RSI: 02605440 RDI: 025f940e
RBP: 025f940e R08: 0260213d R09: 1999
R10: 0262a808 R11: 025f942d R12: 025f940e
R13: 7fc9d1301e20 R14: 025f9408 R15: 7fc9d1302720
Code: 48 8b 95 80 00 00 00 41 55 49 8d 8c 24 e0 05 00
00 45 8b 84 24 38 04 00 00 41 89 c1 48 89 de 48 c7 c7 a8 2f f2 92 e8
7c fa ff ff <0f> 0b 58 5b 5d 41 5c 41 5d c3 0f 1f 44 00 00 55 48 89 e5
41 56

Reported-by: Net Filter 
Fixes: b54ab92b84b6 ("netfilter: refcounter conversions")
Signed-off-by: Julian Anastasov 
---
 net/netfilter/ipvs/ip_vs_conn.c | 17 ++---
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_conn.c b/net/netfilter/ipvs/ip_vs_conn.c
index 370abbf..75de465 100644
--- a/net/netfilter/ipvs/ip_vs_conn.c
+++ b/net/netfilter/ipvs/ip_vs_conn.c
@@ -232,7 +232,10 @@ static inline int ip_vs_conn_unhash(struct ip_vs_conn *cp)
 static inline bool ip_vs_conn_unlink(struct ip_vs_conn *cp)
 {
unsigned int hash;
-   bool ret;
+   bool ret = false;
+
+   if (cp->flags & IP_VS_CONN_F_ONE_PACKET)
+   return refcount_dec_if_one(&cp->refcnt);
 
hash = ip_vs_conn_hashkey_conn(cp);
 
@@ -240,15 +243,13 @@ static inline bool ip_vs_conn_unlink(struct ip_vs_conn 
*cp)
spin_lock(&cp->lock);
 
if (cp->flags & IP_VS_CONN_F_HASHED) {
-   ret = false;
/* Decrease refcnt and unlink conn only if we are last user */
if (refcount_dec_if_one(&cp->refcnt)) {
hlist_del_rcu(&cp->c_list);
cp->flags &= 

[PATCH][PING] Hide private symbols in libnfnetlink

2018-05-03 Thread Yuri Gribov
Hi all,

Here's the updated version of the patch.

-Y


0001-Hide-private-symbols-v4.patch
Description: Binary data


[PATCH] netfilter: nf_queue: Replace conntrack entry

2018-05-03 Thread Kristian Evensen
SKBs are assigned a conntrack entry before being passed to any NFQUEUEs,
and if no entry is found then a new one is created. This behavior causes
problems for some traffic patterns. For example, if two UDP packets
to/from the same host (using the same ports) arrive at the "same" time,
both are assigned a new conntrack entry. After the first packet has
traversed all chains, its conntrack entry is inserted into the
global table. The second packet is then dropped during the
insertion step, because an entry for the same flow already exists. One
type of application that frequently generates this traffic pattern is a
DNS resolver.

This commit introduces a new function that checks, and potentially
replaces, the conntrack entry for any additional "new" SKBs mapping to
an existing flow. This is not a perfect solution, as there are still
situations where to-be-dropped SKBs can slip through, but the situation
is improved considerably. On the routers I have used for testing, packets
belonging to the same UDP flow are let through (when generating the
traffic pattern described above). Without the change in this commit, all
packets except the first one were dropped.

With the change in this commit, a user can implement "perfect" solutions
in user-space. An application can for example keep track of seen UDP
flows, and then only release packets belonging to one flow when the
entry has been created. Without the change, an SKB is stuck with the
original conntrack entry.

Signed-off-by: Kristian Evensen 
---
 net/netfilter/nfnetlink_queue.c | 68 +
 1 file changed, 68 insertions(+)

diff --git a/net/netfilter/nfnetlink_queue.c b/net/netfilter/nfnetlink_queue.c
index c97966298..150c11ff4 100644
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -43,6 +43,9 @@
 
 #if IS_ENABLED(CONFIG_NF_CONNTRACK)
 #include 
+#include 
+#include 
+#include 
 #endif
 
 #define NFQNL_QMAX_DEFAULT 1024
@@ -1046,6 +1049,53 @@ static int nfq_id_after(unsigned int id, unsigned int 
max)
return (int)(id - max) > 0;
 }
 
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+static void nfqnl_update_ct(struct net *net, struct sk_buff *skb)
+{
+   const struct nf_conntrack_l3proto *l3proto;
+   const struct nf_conntrack_l4proto *l4proto;
+   struct nf_conntrack_tuple_hash *h;
+   struct nf_conntrack_tuple tuple;
+   enum ip_conntrack_info ctinfo;
+   struct nf_conn *ct = NULL;
+   unsigned int dataoff;
+   u16 l3num;
+   u8 l4num;
+
+   ct = nf_ct_get(skb, &ctinfo);
+   l3num = nf_ct_l3num(ct);
+   l3proto = nf_ct_l3proto_find_get(l3num);
+
+   if (l3proto->get_l4proto(skb, skb_network_offset(skb), &dataoff,
+&l4num) <= 0) {
+   return;
+   }
+
+   l4proto = nf_ct_l4proto_find_get(l3num, l4num);
+
+   if (!nf_ct_get_tuple(skb, skb_network_offset(skb), dataoff, l3num,
+l4num, net, &tuple, l3proto, l4proto)) {
+   return;
+   }
+
+#if IS_ENABLED(CONFIG_NF_CONNTRACK_ZONES)
+   h = nf_conntrack_find_get(net, &ct->zone, &tuple);
+#else
+   h = nf_conntrack_find_get(net, NULL, &tuple);
+#endif
+
+   if (h) {
+   pr_debug("%s: tuple %u %pI4:%hu -> %pI4:%hu\n", __func__,
+tuple.dst.protonum, &tuple.src.u3.ip,
+ntohs(tuple.src.u.all), &tuple.dst.u3.ip,
+ntohs(tuple.dst.u.all));
+   nf_ct_put(ct);
+   ct = nf_ct_tuplehash_to_ctrack(h);
+   nf_ct_set(skb, ct, IP_CT_NEW);
+   }
+}
+#endif
+
 static int nfqnl_recv_verdict_batch(struct net *net, struct sock *ctnl,
struct sk_buff *skb,
const struct nlmsghdr *nlh,
@@ -1060,6 +1110,7 @@ static int nfqnl_recv_verdict_batch(struct net *net, 
struct sock *ctnl,
LIST_HEAD(batch_list);
u16 queue_num = ntohs(nfmsg->res_id);
struct nfnl_queue_net *q = nfnl_queue_pernet(net);
+   enum ip_conntrack_info ctinfo;
 
queue = verdict_instance_lookup(q, queue_num,
NETLINK_CB(skb).portid);
@@ -1090,6 +1141,16 @@ static int nfqnl_recv_verdict_batch(struct net *net, 
struct sock *ctnl,
list_for_each_entry_safe(entry, tmp, &batch_list, list) {
if (nfqa[NFQA_MARK])
entry->skb->mark = ntohl(nla_get_be32(nfqa[NFQA_MARK]));
+
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+   nf_ct_get(entry->skb, &ctinfo);
+
+   if (ctinfo == IP_CT_NEW && verdict != NF_STOLEN &&
+   verdict != NF_DROP) {
+   nfqnl_update_ct(net, entry->skb);
+   }
+#endif
+
nf_reinject(entry, verdict);
}
return 0;
@@ -1213,6 +1274,13 @@ static int nfqnl_recv_verdict(struct net *net, struct 
sock *ctnl,
if (nfqa[NFQA_MARK])
entry->skb->mark = 

[PATCH nf-next v5] netfilter: nf_osf: nf_osf_ttl() and nf_osf_match()

2018-05-03 Thread Fernando Fernandez Mancera
Add nf_osf_ttl() and nf_osf_match() to nf_osf.c as a first step toward the
nftables OSF implementation.

Signed-off-by: Fernando Fernandez Mancera 
---
 include/linux/netfilter/nf_osf.h  |  29 
 include/uapi/linux/netfilter/nf_osf.h |  93 +++
 include/uapi/linux/netfilter/xt_osf.h | 108 +++--
 net/netfilter/Kconfig |   4 +
 net/netfilter/Makefile|   1 +
 net/netfilter/nf_osf.c| 218 ++
 net/netfilter/xt_osf.c| 202 +---
 7 files changed, 365 insertions(+), 290 deletions(-)
 create mode 100644 include/linux/netfilter/nf_osf.h
 create mode 100644 include/uapi/linux/netfilter/nf_osf.h
 create mode 100644 net/netfilter/nf_osf.c

diff --git a/include/linux/netfilter/nf_osf.h b/include/linux/netfilter/nf_osf.h
new file mode 100644
index ..6b0167a9274c
--- /dev/null
+++ b/include/linux/netfilter/nf_osf.h
@@ -0,0 +1,29 @@
+#include 
+
+/*
+ * Initial window size option state machine: multiple of mss, mtu or
+ * plain numeric value. Can also be made as plain numeric value which
+ * is not a multiple of specified value.
+ */
+enum nf_osf_window_size_options {
+   OSF_WSS_PLAIN   = 0,
+   OSF_WSS_MSS,
+   OSF_WSS_MTU,
+   OSF_WSS_MODULO,
+   OSF_WSS_MAX,
+};
+
+enum osf_fmatch_states {
+   /* Packet does not match the fingerprint */
+   FMATCH_WRONG = 0,
+   /* Packet matches the fingerprint */
+   FMATCH_OK,
+   /* Options do not match the fingerprint, but header does */
+   FMATCH_OPT_WRONG,
+};
+
+bool nf_osf_match(const struct sk_buff *skb, u_int8_t family,
+   int hooknum, struct net_device *in, struct net_device *out,
+   const struct nf_osf_info *info, struct net *net,
+   const struct list_head *nf_osf_fingers);
+
diff --git a/include/uapi/linux/netfilter/nf_osf.h 
b/include/uapi/linux/netfilter/nf_osf.h
new file mode 100644
index ..076ec2ee5906
--- /dev/null
+++ b/include/uapi/linux/netfilter/nf_osf.h
@@ -0,0 +1,93 @@
+#ifndef _NF_OSF_H
+#define _NF_OSF_H
+
+#define MAXGENRELEN 32
+
+#define NF_OSF_GENRE   (1<<0)
+#define NF_OSF_TTL (1<<1)
+#define NF_OSF_LOG (1<<2)
+#define NF_OSF_INVERT  (1<<3)
+
+#define NF_OSF_LOGLEVEL_ALL 0   /* log all matched fingerprints */
+#define NF_OSF_LOGLEVEL_FIRST  1   /* log only the first matched fingerprint */
+#define NF_OSF_LOGLEVEL_ALL_KNOWN  2   /* do not log unknown packets */
+
+#define NF_OSF_TTL_TRUE 0   /* True ip and fingerprint TTL comparison */
+
+/* Do not compare ip and fingerprint TTL at all */
+#define NF_OSF_TTL_NOCHECK 2
+
+/*
+ * Wildcard MSS (kind of).
+ * It is used to implement a state machine for the different wildcard values
+ * of the MSS and window sizes.
+ */
+struct nf_osf_wc {
+   __u32   wc;
+   __u32   val;
+};
+
+/*
+ * This struct represents IANA options
+ * http://www.iana.org/assignments/tcp-parameters
+ */
+struct nf_osf_opt {
+   __u16   kind, length;
+   struct  nf_osf_wc   wc;
+};
+
+struct nf_osf_info {
+   char    genre[MAXGENRELEN];
+   __u32   len;
+   __u32   flags;
+   __u32   loglevel;
+   __u32   ttl;
+};
+
+struct nf_osf_user_finger {
+   struct nf_osf_wc    wss;
+
+   __u8    ttl, df;
+   __u16   ss, mss;
+   __u16   opt_num;
+
+   char    genre[MAXGENRELEN];
+   char    version[MAXGENRELEN];
+   char    subtype[MAXGENRELEN];
+
+   /* MAX_IPOPTLEN is maximum if all options are NOPs or EOLs */
+   struct nf_osf_opt   opt[MAX_IPOPTLEN];
+};
+
+struct nf_osf_finger {
+   struct rcu_head rcu_head;
+   struct list_head    finger_entry;
+   struct nf_osf_user_finger   finger;
+};
+
+struct nf_osf_nlmsg {
+   struct nf_osf_user_finger   f;
+   struct iphdr    ip;
+   struct tcphdr   tcp;
+};
+
+/* Defines for IANA option kinds */
+
+enum iana_options {
+   OSFOPT_EOL = 0, /* End of options */
+   OSFOPT_NOP, /* NOP */
+   OSFOPT_MSS, /* Maximum segment size */
+   OSFOPT_WSO, /* Window scale option */
+   OSFOPT_SACKP,   /* SACK permitted */
+   OSFOPT_SACK,    /* SACK */
+   OSFOPT_ECHO,
+   OSFOPT_ECHOREPLY,
+   OSFOPT_TS,  /* Timestamp option */
+   OSFOPT_POCP,    /* Partial Order Connection Permitted */
+   OSFOPT_POSP,    /* Partial Order Service Profile */
+
+   /* Others are not used in the current OSF */
+   OSFOPT_EMPTY = 255,
+};
+
+#endif /* _NF_OSF_H */
diff --git a/include/uapi/linux/netfilter/xt_osf.h 
b/include/uapi/linux/netfilter/xt_osf.h
index dad197e2ab99..5d5874b5d747 100644
--- a/include/uapi/linux/netfilter/xt_osf.h
+++ b/include/uapi/linux/netfilter/xt_osf.h
@@ -23,101 

Re: Silently dropped UDP packets on kernel 4.14

2018-05-03 Thread Kristian Evensen
Hi Michal,

Thanks for providing a nice summary of your experience when dealing
with this problem. Always nice to know that I am not alone :)

On Thu, May 3, 2018 at 11:42 AM, Michal Kubecek  wrote:
> One of the ideas I had was this:
>
>   - keep also unconfirmed conntracks in some data structure
>   - check new packets also against unconfirmed conntracks
>   - if it matches an unconfirmed conntrack, defer its processing
> until that conntrack is either inserted or discarded

I was thinking about something along the same lines and came to the
same conclusion: it is a lot of hassle and work for a very special
case. I think that replacing the conntrack entry is a good compromise:
it improves on the current situation and allows for the creation of
"perfect" solutions in user-space. For example, a user can keep track
of seen UDP flows, and then only release new packets belonging to the
same flow when the conntrack entry is created.

BR,
Kristian


[PATCH nft 5/5] src: use location to display error messages

2018-05-03 Thread Pablo Neira Ayuso
 # nft add chain foo bar
 Error: Could not process rule: No such file or directory
 add chain foo bar
   ^^^

Signed-off-by: Pablo Neira Ayuso 
---
 src/evaluate.c | 156 ++---
 1 file changed, 94 insertions(+), 62 deletions(-)

diff --git a/src/evaluate.c b/src/evaluate.c
index fdc536479785..c5f86395c84b 100644
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -21,6 +21,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -42,8 +43,8 @@ static const char * const byteorder_names[] = {
__stmt_binary_error(ctx, &(s1)->location, NULL, fmt, ## args)
 #define monitor_error(ctx, s1, fmt, args...) \
__stmt_binary_error(ctx, &(s1)->location, NULL, fmt, ## args)
-#define cmd_error(ctx, fmt, args...) \
-   __stmt_binary_error(ctx, &(ctx->cmd)->location, NULL, fmt, ## args)
+#define cmd_error(ctx, loc, fmt, args...) \
+   __stmt_binary_error(ctx, loc, NULL, fmt, ## args)
 
 static int __fmtstring(3, 4) set_error(struct eval_ctx *ctx,
   const struct set *set,
@@ -190,8 +191,9 @@ static int expr_evaluate_symbol(struct eval_ctx *ctx, 
struct expr **expr)
 
table = table_lookup_global(ctx);
if (table == NULL)
-   return cmd_error(ctx, "Could not process rule: Table 
'%s' does not exist",
-ctx->cmd->handle.table.name);
+   return cmd_error(ctx, &ctx->cmd->handle.table.location,
+"Could not process rule: %s",
+strerror(ENOENT));
 
set = set_lookup(table, (*expr)->identifier);
if (set == NULL)
@@ -2746,13 +2748,15 @@ static int setelem_evaluate(struct eval_ctx *ctx, 
struct expr **expr)
 
table = table_lookup_global(ctx);
if (table == NULL)
-   return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-ctx->cmd->handle.table.name);
+   return cmd_error(ctx, &ctx->cmd->handle.table.location,
+"Could not process rule: %s",
+strerror(ENOENT));
 
set = set_lookup(table, ctx->cmd->handle.set.name);
if (set == NULL)
-   return cmd_error(ctx, "Could not process rule: Set '%s' does 
not exist",
-ctx->cmd->handle.set.name);
+   return cmd_error(ctx, &ctx->cmd->handle.set.location,
+"Could not process rule: %s",
+strerror(ENOENT));
 
ctx->set = set;
expr_set_context(&ctx->ectx, set->key->dtype, set->key->len);
@@ -2769,8 +2773,9 @@ static int set_evaluate(struct eval_ctx *ctx, struct set 
*set)
 
table = table_lookup_global(ctx);
if (table == NULL)
-   return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-ctx->cmd->handle.table.name);
+   return cmd_error(ctx, &ctx->cmd->handle.table.location,
+"Could not process rule: %s",
+strerror(ENOENT));
 
if (!(set->flags & NFT_SET_INTERVAL) && set->automerge)
return set_error(ctx, set, "auto-merge only works with interval 
sets");
@@ -2831,8 +2836,9 @@ static int flowtable_evaluate(struct eval_ctx *ctx, 
struct flowtable *ft)
 
table = table_lookup_global(ctx);
if (table == NULL)
-   return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-ctx->cmd->handle.table.name);
+   return cmd_error(ctx, &ctx->cmd->handle.table.location,
+"Could not process rule: %s",
+strerror(ENOENT));
 
ft->hooknum = str2hooknum(NFPROTO_NETDEV, ft->hookstr);
if (ft->hooknum == NF_INET_NUMHOOKS)
@@ -2923,8 +2929,9 @@ static int chain_evaluate(struct eval_ctx *ctx, struct 
chain *chain)
 
table = table_lookup_global(ctx);
if (table == NULL)
-   return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-ctx->cmd->handle.table.name);
+   return cmd_error(ctx, &ctx->cmd->handle.table.location,
+"Could not process rule: %s",
+strerror(ENOENT));
 
if (chain == NULL) {
if (chain_lookup(table, &ctx->cmd->handle) == NULL) {
@@ -3087,12 +3094,14 @@ static int cmd_evaluate_get(struct eval_ctx *ctx, 
struct cmd *cmd)
case CMD_OBJ_SETELEM:
table = table_lookup(&cmd->handle, ctx->cache);
if (table == NULL)
-   return cmd_error(ctx, "Could not process rule: Table 
'%s' does not exist",
-

[PATCH nft 1/5] src: add table_spec

2018-05-03 Thread Pablo Neira Ayuso
Store location object in handle to improve error reporting.

Signed-off-by: Pablo Neira Ayuso 
---
 include/rule.h|  7 ++-
 src/evaluate.c| 42 +-
 src/monitor.c |  4 ++--
 src/netlink.c | 40 
 src/netlink_delinearize.c |  2 +-
 src/parser_bison.y|  3 ++-
 src/rule.c| 42 +-
 7 files changed, 73 insertions(+), 67 deletions(-)

diff --git a/include/rule.h b/include/rule.h
index ee22cf217ac6..88750f0a4b54 100644
--- a/include/rule.h
+++ b/include/rule.h
@@ -27,6 +27,11 @@ struct position_spec {
uint64_t id;
 };
 
+struct table_spec {
+   struct location location;
+   const char  *name;
+};
+
 /**
  * struct handle - handle for tables, chains, rules and sets
  *
@@ -42,7 +47,7 @@ struct position_spec {
  */
 struct handle {
uint32_t family;
-   const char  *table;
+   struct table_spec   table;
const char  *chain;
const char  *set;
const char  *obj;
diff --git a/src/evaluate.c b/src/evaluate.c
index 4384e2710176..76125fcd884d 100644
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -191,7 +191,7 @@ static int expr_evaluate_symbol(struct eval_ctx *ctx, 
struct expr **expr)
table = table_lookup_global(ctx);
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table 
'%s' does not exist",
-ctx->cmd->handle.table);
+ctx->cmd->handle.table.name);
 
set = set_lookup(table, (*expr)->identifier);
if (set == NULL)
@@ -2747,7 +2747,7 @@ static int setelem_evaluate(struct eval_ctx *ctx, struct 
expr **expr)
table = table_lookup_global(ctx);
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-ctx->cmd->handle.table);
+ctx->cmd->handle.table.name);
 
set = set_lookup(table, ctx->cmd->handle.set);
if (set == NULL)
@@ -2770,7 +2770,7 @@ static int set_evaluate(struct eval_ctx *ctx, struct set 
*set)
table = table_lookup_global(ctx);
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-ctx->cmd->handle.table);
+ctx->cmd->handle.table.name);
 
if (!(set->flags & NFT_SET_INTERVAL) && set->automerge)
return set_error(ctx, set, "auto-merge only works with interval 
sets");
@@ -2832,7 +2832,7 @@ static int flowtable_evaluate(struct eval_ctx *ctx, 
struct flowtable *ft)
table = table_lookup_global(ctx);
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-ctx->cmd->handle.table);
+ctx->cmd->handle.table.name);
 
ft->hooknum = str2hooknum(NFPROTO_NETDEV, ft->hookstr);
if (ft->hooknum == NF_INET_NUMHOOKS)
@@ -2924,7 +2924,7 @@ static int chain_evaluate(struct eval_ctx *ctx, struct 
chain *chain)
table = table_lookup_global(ctx);
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-ctx->cmd->handle.table);
+ctx->cmd->handle.table.name);
 
if (chain == NULL) {
if (chain_lookup(table, &ctx->cmd->handle) == NULL) {
@@ -3088,7 +3088,7 @@ static int cmd_evaluate_get(struct eval_ctx *ctx, struct 
cmd *cmd)
table = table_lookup(&cmd->handle, ctx->cache);
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table 
'%s' does not exist",
-cmd->handle.table);
+cmd->handle.table.name);
set = set_lookup(table, cmd->handle.set);
if (set == NULL || set->flags & (NFT_SET_MAP | NFT_SET_EVAL))
return cmd_error(ctx, "Could not process rule: Set '%s' 
does not exist",
@@ -3111,7 +3111,7 @@ static int cmd_evaluate_list_obj(struct eval_ctx *ctx, 
const struct cmd *cmd,
table = table_lookup(&cmd->handle, ctx->cache);
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does 
not exist",
-cmd->handle.table);
+cmd->handle.table.name);
if (obj_lookup(table, cmd->handle.obj, obj_type) == NULL)
return cmd_error(ctx, "Could not process rule: Object '%s' does 
not exist",

[PATCH nft 3/5] src: add set_spec

2018-05-03 Thread Pablo Neira Ayuso
Store location object in handle to improve error reporting.

Signed-off-by: Pablo Neira Ayuso 
---
 include/rule.h  |  7 ++-
 src/evaluate.c  | 36 ++--
 src/expression.c|  4 ++--
 src/netlink.c   |  6 +++---
 src/netlink_linearize.c | 10 +-
 src/parser_bison.y  |  6 --
 src/rule.c  | 18 +-
 src/segtree.c   |  4 ++--
 8 files changed, 49 insertions(+), 42 deletions(-)

diff --git a/include/rule.h b/include/rule.h
index 4ea09c52b12e..68d32f10c353 100644
--- a/include/rule.h
+++ b/include/rule.h
@@ -37,6 +37,11 @@ struct chain_spec {
const char  *name;
 };
 
+struct set_spec {
+   struct location location;
+   const char  *name;
+};
+
 /**
  * struct handle - handle for tables, chains, rules and sets
  *
@@ -54,7 +59,7 @@ struct handle {
uint32_t family;
struct table_spec   table;
struct chain_spec   chain;
-   const char  *set;
+   struct set_spec set;
const char  *obj;
const char  *flowtable;
struct handle_spec  handle;
diff --git a/src/evaluate.c b/src/evaluate.c
index 78ff6071230a..79fa3221e20d 100644
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -84,7 +84,7 @@ static struct expr *implicit_set_declaration(struct eval_ctx 
*ctx,
 
set = set_alloc(&expr->location);
set->flags  = NFT_SET_ANONYMOUS | expr->set_flags;
-   set->handle.set = xstrdup(name);
+   set->handle.set.name = xstrdup(name);
set->key= key;
set->init   = expr;
set->automerge  = set->flags & NFT_SET_INTERVAL;
@@ -2749,10 +2749,10 @@ static int setelem_evaluate(struct eval_ctx *ctx, struct expr **expr)
return cmd_error(ctx, "Could not process rule: Table '%s' does not exist",
 ctx->cmd->handle.table.name);
 
-   set = set_lookup(table, ctx->cmd->handle.set);
+   set = set_lookup(table, ctx->cmd->handle.set.name);
if (set == NULL)
return cmd_error(ctx, "Could not process rule: Set '%s' does not exist",
-ctx->cmd->handle.set);
+ctx->cmd->handle.set.name);
 
ctx->set = set;
expr_set_context(&ctx->ectx, set->key->dtype, set->key->len);
@@ -2813,7 +2813,7 @@ static int set_evaluate(struct eval_ctx *ctx, struct set *set)
}
ctx->set = NULL;
 
-   if (set_lookup(table, set->handle.set) == NULL)
+   if (set_lookup(table, set->handle.set.name) == NULL)
set_add_hash(set_get(set), table);
 
/* Default timeout value implies timeout support */
@@ -3089,10 +3089,10 @@ static int cmd_evaluate_get(struct eval_ctx *ctx, struct cmd *cmd)
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does not exist",
 cmd->handle.table.name);
-   set = set_lookup(table, cmd->handle.set);
+   set = set_lookup(table, cmd->handle.set.name);
if (set == NULL || set->flags & (NFT_SET_MAP | NFT_SET_EVAL))
return cmd_error(ctx, "Could not process rule: Set '%s' does not exist",
-cmd->handle.set);
+cmd->handle.set.name);
 
return setelem_evaluate(ctx, &cmd->expr);
default:
@@ -3144,30 +3144,30 @@ static int cmd_evaluate_list(struct eval_ctx *ctx, struct cmd *cmd)
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does not exist",
 cmd->handle.table.name);
-   set = set_lookup(table, cmd->handle.set);
+   set = set_lookup(table, cmd->handle.set.name);
if (set == NULL || set->flags & (NFT_SET_MAP | NFT_SET_EVAL))
return cmd_error(ctx, "Could not process rule: Set '%s' does not exist",
-cmd->handle.set);
+cmd->handle.set.name);
return 0;
case CMD_OBJ_METER:
table = table_lookup(&cmd->handle, ctx->cache);
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does not exist",
 cmd->handle.table.name);
-   set = set_lookup(table, cmd->handle.set);
+   set = set_lookup(table, cmd->handle.set.name);
if (set == NULL || !(set->flags & NFT_SET_EVAL))
return cmd_error(ctx, "Could not process rule: Meter '%s' does not exist",
-cmd->handle.set);
+   

[PATCH nft 2/5] src: add chain_spec

2018-05-03 Thread Pablo Neira Ayuso
Store location object in handle to improve error reporting.

Signed-off-by: Pablo Neira Ayuso 
---
 include/rule.h|  7 ++-
 src/evaluate.c|  4 ++--
 src/netlink.c | 14 +++---
 src/netlink_delinearize.c |  4 ++--
 src/parser_bison.y|  6 --
 src/rule.c| 16 
 6 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/include/rule.h b/include/rule.h
index 88750f0a4b54..4ea09c52b12e 100644
--- a/include/rule.h
+++ b/include/rule.h
@@ -32,6 +32,11 @@ struct table_spec {
const char  *name;
 };
 
+struct chain_spec {
+   struct location location;
+   const char  *name;
+};
+
 /**
  * struct handle - handle for tables, chains, rules and sets
  *
@@ -48,7 +53,7 @@ struct table_spec {
 struct handle {
uint32_t            family;
struct table_spec   table;
-   const char  *chain;
+   struct chain_spec   chain;
const char  *set;
const char  *obj;
const char  *flowtable;
diff --git a/src/evaluate.c b/src/evaluate.c
index 76125fcd884d..78ff6071230a 100644
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -3176,7 +3176,7 @@ static int cmd_evaluate_list(struct eval_ctx *ctx, struct cmd *cmd)
 cmd->handle.table.name);
if (chain_lookup(table, &cmd->handle) == NULL)
return cmd_error(ctx, "Could not process rule: Chain '%s' does not exist",
-cmd->handle.chain);
+cmd->handle.chain.name);
return 0;
case CMD_OBJ_QUOTA:
return cmd_evaluate_list_obj(ctx, cmd, NFT_OBJECT_QUOTA);
@@ -3319,7 +3319,7 @@ static int cmd_evaluate_rename(struct eval_ctx *ctx, struct cmd *cmd)
 ctx->cmd->handle.table.name);
if (chain_lookup(table, &ctx->cmd->handle) == NULL)
return cmd_error(ctx, "Could not process rule: Chain '%s' does not exist",
-ctx->cmd->handle.chain);
+ctx->cmd->handle.chain.name);
break;
default:
BUG("invalid command object type %u\n", cmd->obj);
diff --git a/src/netlink.c b/src/netlink.c
index 0c078d643344..e33e094e1992 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -145,8 +145,8 @@ struct nftnl_chain *alloc_nftnl_chain(const struct handle *h)
nftnl_chain_set_str(nlc, NFTNL_CHAIN_TABLE, h->table.name);
if (h->handle.id)
nftnl_chain_set_u64(nlc, NFTNL_CHAIN_HANDLE, h->handle.id);
-   if (h->chain != NULL)
-   nftnl_chain_set_str(nlc, NFTNL_CHAIN_NAME, h->chain);
+   if (h->chain.name != NULL)
+   nftnl_chain_set_str(nlc, NFTNL_CHAIN_NAME, h->chain.name);
 
return nlc;
 }
@@ -161,8 +161,8 @@ struct nftnl_rule *alloc_nftnl_rule(const struct handle *h)
 
nftnl_rule_set_u32(nlr, NFTNL_RULE_FAMILY, h->family);
nftnl_rule_set_str(nlr, NFTNL_RULE_TABLE, h->table.name);
-   if (h->chain != NULL)
-   nftnl_rule_set_str(nlr, NFTNL_RULE_CHAIN, h->chain);
+   if (h->chain.name != NULL)
+   nftnl_rule_set_str(nlr, NFTNL_RULE_CHAIN, h->chain.name);
if (h->handle.id)
nftnl_rule_set_u64(nlr, NFTNL_RULE_HANDLE, h->handle.id);
if (h->position.id)
@@ -540,7 +540,7 @@ static int list_rule_cb(struct nftnl_rule *nlr, void *arg)
 
if (h->family != family ||
strcmp(table, h->table.name) != 0 ||
-   (h->chain && strcmp(chain, h->chain) != 0))
+   (h->chain.name && strcmp(chain, h->chain.name) != 0))
return 0;
 
netlink_dump_rule(nlr, ctx);
@@ -697,7 +697,7 @@ static int list_chain_cb(struct nftnl_chain *nlc, void *arg)
 
if (h->family != family || strcmp(table, h->table.name) != 0)
return 0;
-   if (h->chain && strcmp(name, h->chain) != 0)
+   if (h->chain.name && strcmp(name, h->chain.name) != 0)
return 0;
 
chain = netlink_delinearize_chain(ctx, nlc);
@@ -1720,7 +1720,7 @@ static void trace_print_rule(const struct nftnl_trace *nlt,
 
h.family = nftnl_trace_get_u32(nlt, NFTNL_TRACE_FAMILY);
h.table.name  = nftnl_trace_get_str(nlt, NFTNL_TRACE_TABLE);
-   h.chain  = nftnl_trace_get_str(nlt, NFTNL_TRACE_CHAIN);
+   h.chain.name  = nftnl_trace_get_str(nlt, NFTNL_TRACE_CHAIN);
 
if (!h.table.name)
return;
diff --git a/src/netlink_delinearize.c b/src/netlink_delinearize.c
index 8b42850ecd43..eb509917e01d 100644
--- a/src/netlink_delinearize.c
+++ b/src/netlink_delinearize.c
@@ -2444,8 +2444,8 @@ struct rule *netlink_delinearize_rule(struct netlink_ctx *ctx,
 
memset(&h, 0, sizeof(h));

[PATCH nft 4/5] src: add obj_spec

2018-05-03 Thread Pablo Neira Ayuso
Store location object in handle to improve error reporting.

Signed-off-by: Pablo Neira Ayuso 
---
 include/rule.h |  7 ++-
 src/evaluate.c |  4 ++--
 src/netlink.c  |  8 
 src/parser_bison.y |  6 --
 src/rule.c | 18 +-
 5 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/include/rule.h b/include/rule.h
index 68d32f10c353..b265690d3c96 100644
--- a/include/rule.h
+++ b/include/rule.h
@@ -42,6 +42,11 @@ struct set_spec {
const char  *name;
 };
 
+struct obj_spec {
+   struct location location;
+   const char  *name;
+};
+
 /**
  * struct handle - handle for tables, chains, rules and sets
  *
@@ -60,7 +65,7 @@ struct handle {
struct table_spec   table;
struct chain_spec   chain;
struct set_spec set;
-   const char  *obj;
+   struct obj_spec obj;
const char  *flowtable;
struct handle_spec  handle;
struct position_spec position;
diff --git a/src/evaluate.c b/src/evaluate.c
index 79fa3221e20d..fdc536479785 100644
--- a/src/evaluate.c
+++ b/src/evaluate.c
@@ -3112,9 +3112,9 @@ static int cmd_evaluate_list_obj(struct eval_ctx *ctx, const struct cmd *cmd,
if (table == NULL)
return cmd_error(ctx, "Could not process rule: Table '%s' does not exist",
 cmd->handle.table.name);
-   if (obj_lookup(table, cmd->handle.obj, obj_type) == NULL)
+   if (obj_lookup(table, cmd->handle.obj.name, obj_type) == NULL)
return cmd_error(ctx, "Could not process rule: Object '%s' does not exist",
-cmd->handle.obj);
+cmd->handle.obj.name);
return 0;
 }
 
diff --git a/src/netlink.c b/src/netlink.c
index e465daa79c84..864947b4d2f0 100644
--- a/src/netlink.c
+++ b/src/netlink.c
@@ -293,8 +293,8 @@ __alloc_nftnl_obj(const struct handle *h, uint32_t type)
 
nftnl_obj_set_u32(nlo, NFTNL_OBJ_FAMILY, h->family);
nftnl_obj_set_str(nlo, NFTNL_OBJ_TABLE, h->table.name);
-   if (h->obj != NULL)
-   nftnl_obj_set_str(nlo, NFTNL_OBJ_NAME, h->obj);
+   if (h->obj.name != NULL)
+   nftnl_obj_set_str(nlo, NFTNL_OBJ_NAME, h->obj.name);
 
nftnl_obj_set_u32(nlo, NFTNL_OBJ_TYPE, type);
if (h->handle.id)
@@ -1410,7 +1410,7 @@ struct obj *netlink_delinearize_obj(struct netlink_ctx *ctx,
obj->handle.family = nftnl_obj_get_u32(nlo, NFTNL_OBJ_FAMILY);
obj->handle.table.name =
xstrdup(nftnl_obj_get_str(nlo, NFTNL_OBJ_TABLE));
-   obj->handle.obj =
+   obj->handle.obj.name =
xstrdup(nftnl_obj_get_str(nlo, NFTNL_OBJ_NAME));
obj->handle.handle.id =
nftnl_obj_get_u64(nlo, NFTNL_OBJ_HANDLE);
@@ -1564,7 +1564,7 @@ int netlink_reset_objs(struct netlink_ctx *ctx, const struct cmd *cmd,
int err;
 
obj_cache = mnl_nft_obj_dump(ctx, h->family,
-h->table.name, h->obj, type, dump, true);
+h->table.name, h->obj.name, type, dump, true);
if (obj_cache == NULL)
return -1;
 
diff --git a/src/parser_bison.y b/src/parser_bison.y
index e4b83523b411..5b3860368bc5 100644
--- a/src/parser_bison.y
+++ b/src/parser_bison.y
@@ -1925,7 +1925,8 @@ flowtable_identifier  :   identifier
 obj_spec   :   table_spec  identifier
{
$$  = $1;
-   $$.obj  = $2;
+   $$.obj.name = $2;
+   $$.obj.location = @2;
}
;
 
@@ -1940,7 +1941,8 @@ objid_spec:   table_spec  HANDLE NUM
 obj_identifier :   identifier
{
memset(&$$, 0, sizeof($$));
-   $$.obj  = $1;
+   $$.obj.name = $1;
+   $$.obj.location = @1;
}
;
 
diff --git a/src/rule.c b/src/rule.c
index 7d18bd08c1fb..2f0123b7a4a5 100644
--- a/src/rule.c
+++ b/src/rule.c
@@ -48,8 +48,8 @@ void handle_merge(struct handle *dst, const struct handle *src)
dst->set.name = xstrdup(src->set.name);
if (dst->flowtable == NULL && src->flowtable != NULL)
dst->flowtable = xstrdup(src->flowtable);
-   if (dst->obj == NULL && src->obj != NULL)
-   dst->obj = xstrdup(src->obj);
+   if (dst->obj.name == NULL && src->obj.name != NULL)
+   dst->obj.name = xstrdup(src->obj.name);
if (dst->handle.id == 0)
dst->handle = src->handle;

Re: Silently dropped UDP packets on kernel 4.14

2018-05-03 Thread Michal Kubecek
On Thu, May 03, 2018 at 07:03:45AM +0200, Florian Westphal wrote:
> Kristian Evensen  wrote:
> > I went for the early-insert approached and have patched
> 
> I'm sorry for suggesting that.
> 
> It doesn't work, because of NAT.
> NAT rewrites packet content and changes the reply tuple, but the tuples
> determine the hash insertion location.
> 
> I don't know how to solve this problem.

It's an old problem which surfaces from time to time when some special
conditions make it more visible. When I was facing it in 2015, I found
this thread from as early as 2009:

  https://www.spinics.net/lists/linux-net/msg16712.html

In our case, the customer was using IPVS in "one packet scheduling" mode
(it drops the conntrack entry after each packet) which increased the
probability of insert collisions significantly. Using NFQUEUE 

We were lucky, though, as it turned out the only reason the customer
needed connection tracking was to make sure fragments of long UDP
datagrams were not sent to different real servers. For newer kernels
after commit 6aafeef03b9d ("netfilter: push reasm skb through instead of
original frag skbs"), this was no longer necessary, so they could
disable connection tracking for these packets.

For older kernels without this change, I tried several ideas, each of
which failed for some reason. We ended up with a rather hacky
workaround: not dropping the packet on collision (so that its conntrack
wasn't inserted into the table and was dropped once the packet was
sent). It worked fine for our customer but, like the early insert
approach, it wouldn't work with NAT.

One of the ideas I had was this:

  - keep also unconfirmed conntracks in some data structure
  - check new packets also against unconfirmed conntracks
  - if it matches an unconfirmed conntrack, defer its processing
until that conntrack is either inserted or discarded

But as it would be rather complicated to implement without races and
without harming performance, I didn't want to actually try it until I
ran out of other ideas. With NAT coming into play, there don't seem
to be many other options.
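
To make the three steps above a bit more concrete, here is a minimal single-threaded userspace sketch in C. It is only a toy model under stated assumptions: `struct flow_key`, `struct unconfirmed_table`, `classify_packet()` and `resolve_unconfirmed()` are hypothetical names, not kernel APIs, and the sketch deliberately ignores the locking and performance concerns that make the real thing hard.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flow key; a real conntrack tuple carries more fields. */
struct flow_key {
	uint32_t saddr, daddr;
	uint16_t sport, dport;
	uint8_t proto;
};

enum verdict { VERDICT_NEW, VERDICT_DEFER };

#define MAX_UNCONFIRMED 64

/* Step 1: keep unconfirmed entries in some data structure (a toy array). */
struct unconfirmed_table {
	struct flow_key keys[MAX_UNCONFIRMED];
	bool used[MAX_UNCONFIRMED];
};

/* Compare field by field so struct padding never affects the result. */
static bool key_equal(const struct flow_key *a, const struct flow_key *b)
{
	return a->saddr == b->saddr && a->daddr == b->daddr &&
	       a->sport == b->sport && a->dport == b->dport &&
	       a->proto == b->proto;
}

static int find_unconfirmed(const struct unconfirmed_table *t,
			    const struct flow_key *k)
{
	for (int i = 0; i < MAX_UNCONFIRMED; i++)
		if (t->used[i] && key_equal(&t->keys[i], k))
			return i;
	return -1;
}

/*
 * Steps 2 and 3: a packet matching an unconfirmed entry is deferred;
 * otherwise it is recorded as a new unconfirmed entry. If the toy table
 * is full, the packet is still reported as new but is not recorded.
 */
static enum verdict classify_packet(struct unconfirmed_table *t,
				    const struct flow_key *k)
{
	if (find_unconfirmed(t, k) >= 0)
		return VERDICT_DEFER;

	for (int i = 0; i < MAX_UNCONFIRMED; i++) {
		if (!t->used[i]) {
			t->keys[i] = *k;
			t->used[i] = true;
			break;
		}
	}
	return VERDICT_NEW;
}

/*
 * Called once the entry has been inserted into the main table or
 * discarded; deferred packets for this key could then be retried.
 */
static void resolve_unconfirmed(struct unconfirmed_table *t,
				const struct flow_key *k)
{
	int i = find_unconfirmed(t, k);

	if (i >= 0)
		t->used[i] = false;
}
```

In this model, a second packet with the same key gets `VERDICT_DEFER` until `resolve_unconfirmed()` is called for that key. Whether a lookup like this could be done cheaply and race-free on the real packet path is exactly the open question.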

Michal Kubecek


[arptables PATCH] arptables: cleanup sysvinit script

2018-05-03 Thread Arturo Borrero Gonzalez
This file belongs to downstream distributions. Also, it's unmaintained.

Signed-off-by: Arturo Borrero Gonzalez 
---
 Makefile   |8 +---
 arptables.sysv |  103 
 2 files changed, 2 insertions(+), 109 deletions(-)
 delete mode 100644 arptables.sysv

diff --git a/Makefile b/Makefile
index 7bead0d..139c9ca 100644
--- a/Makefile
+++ b/Makefile
@@ -7,7 +7,6 @@ LIBDIR:=$(PREFIX)/lib
 BINDIR:=$(PREFIX)/sbin
 MANDIR:=$(PREFIX)/man
 man8dir=$(MANDIR)/man8
-INITDIR:=/etc/rc.d/init.d
 SYSCONFIGDIR:=/etc/sysconfig
 DESTDIR:=
 
@@ -46,15 +45,12 @@ $(DESTDIR)$(BINDIR)/arptables: arptables
 tmp1:=$(shell printf $(BINDIR) | sed 's/\//\\\//g')
 tmp2:=$(shell printf $(SYSCONFIGDIR) | sed 's/\//\\\//g')
 .PHONY: scripts
-scripts: arptables-save arptables-restore arptables.sysv
+scripts: arptables-save arptables-restore
cat arptables-save | sed 's/__EXEC_PATH__/$(tmp1)/g' > arptables-save_
install -m 0755 arptables-save_ $(DESTDIR)$(BINDIR)/arptables-save
cat arptables-restore | sed 's/__EXEC_PATH__/$(tmp1)/g' > arptables-restore_
install -m 0755 arptables-restore_ $(DESTDIR)$(BINDIR)/arptables-restore
-   cat arptables.sysv | sed 's/__EXEC_PATH__/$(tmp1)/g' | sed 's/__SYSCONFIG__/$(tmp2)/g' > arptables.sysv_
-   if [ "$(DESTDIR)" != "" ]; then mkdir -p $(DESTDIR)$(INITDIR); fi
-   if test -d $(DESTDIR)$(INITDIR); then install -m 0755 arptables.sysv_ $(DESTDIR)$(INITDIR)/arptables; fi
-   rm -f arptables-save_ arptables-restore_ arptables.sysv_
+   rm -f arptables-save_ arptables-restore_
 
 .PHONY: install-man
 install-man: $(MANS)
diff --git a/arptables.sysv b/arptables.sysv
deleted file mode 100644
index ea5cf09..000
--- a/arptables.sysv
+++ /dev/null
@@ -1,103 +0,0 @@
-#!/bin/bash
-#
-# init script for arptables
-#
-# Original by Dag Wieers .
-# Modified/changed to arptables by
-#  Rok Papez .
-#
-# chkconfig: - 16 84
-# description: Arp filtering tables
-#
-# config: __SYSCONFIG__/arptables
-
-source /etc/init.d/functions
-source /etc/sysconfig/network
-
-# Check that networking is up.
-[ ${NETWORKING} = "no" ] && exit 0
-
-[ -x __EXEC_PATH__/arptables ] || exit 1
-[ -x __EXEC_PATH__/arptables-save ] || exit 1
-[ -x __EXEC_PATH__/arptables-restore ] || exit 1
-
-[ "$1" != "save" -o -r __SYSCONFIG__/arptables ] || exit 1
-
-RETVAL=0
-prog="arptables"
-desc="Arp filtering"
-
-start() {
-   echo -n $"Starting $desc ($prog): "
-   __EXEC_PATH__/arptables-restore < __SYSCONFIG__/arptables || RETVAL=1
-
-   if [ $RETVAL -eq 0 ]; then
-   success "$prog startup"
-   rm -f /var/lock/subsys/$prog
-   else
-   failure "$prog startup"
-   fi
-
-   echo
-   return $RETVAL
-}
-
-stop() {
-   echo -n $"Stopping $desc ($prog): "
-   __EXEC_PATH__/arptables-restore < /dev/null || RETVAL=1
-
-   if [ $RETVAL -eq 0 ]; then
-   success "$prog shutdown"
-   rm -f %{_localstatedir}/lock/subsys/$prog
-   else
-   failure "$prog shutdown"
-   fi
-
-   echo
-   return $RETVAL
-}
-
-restart() {
-   stop
-   start
-}
-
-save() {
-   echo -n $"Saving $desc ($prog): "
-   __EXEC_PATH__/arptables-save > __SYSCONFIG__/arptables || RETVAL=1
-
-   if [ $RETVAL -eq 0 ]; then
-   success "$prog saved"
-   else
-   failure "$prog saved"
-   fi
-   echo
-}
-
-case "$1" in
-  start)
-   start
-   ;;
-  stop)
-   stop
-   ;;
-  restart|reload)
-   restart
-   ;;
-  condrestart)
-   [ -e /var/lock/subsys/$prog ] && restart
-   RETVAL=$?
-   ;;
-  save)
-   save
-   ;;
-  status)
-   __EXEC_PATH__/arptables-save
-   RETVAL=$?
-   ;;
-  *)
-   echo $"Usage $0 {start|stop|restart|condrestart|save|status}"
-   RETVAL=1
-esac
-
-exit $RETVAL



Re: Silently dropped UDP packets on kernel 4.14

2018-05-03 Thread Kristian Evensen
Hi Florian,

On Thu, May 3, 2018 at 7:03 AM, Florian Westphal  wrote:
> I'm sorry for suggesting that.
>
> It doesn't work, because of NAT.
> NAT rewrites packet content and changes the reply tuple, but the tuples
> determine the hash insertion location.
>
> I don't know how to solve this problem.

No problem. This has anyway served as a good exercise for getting more
familiar with the conntrack/nat code in the kernel. I did some more
tests and I see that on my router (or routers actually), just
replacing the ct solves the issue. While not a perfect solution, the
situation is improved considerably. Do you think a patch where the ct
is replaced would be acceptable, or would upstream rather wait for a
"proper" fix to this problem? When replacing the ct, it is at least
possible to work around the problem in userspace, while without
replacing ct we are stuck with the original entry.

BR,
Kristian


Re: [ANNOUNCE] libnftnl 1.1.0 release

2018-05-03 Thread Pablo Neira Ayuso
On Thu, May 03, 2018 at 01:08:36AM +1000, Duncan Roe wrote:
> On Wed, May 02, 2018 at 10:09:04AM +0200, Pablo Neira Ayuso wrote:
> > On Wed, May 02, 2018 at 11:32:13AM +1000, Duncan Roe wrote:
> > > On Tue, May 01, 2018 at 11:33:33PM +0200, Florian Westphal wrote:
> [...]
> > > Hey Florian,
> > >
> > > I just downloaded
> > > https://netfilter.org/projects/libnftnl/downloads.html#libnftnl-1.1.0 but 
> > > the
> > > sha256sum doesn't match:
> > > ec0eaca11b165110c2b61e6a7b50a7a0a9b17fa04a0c333f795bec2d19f78f6c instead 
> > > of the
> > > expected 36c6d99c7684851d4d72e75bd07ff3f0ff1baaf4b6f069eb7244990cd1a9a462.
> > > Installing it anyway,
> >
> > Just fixed website, sorry about this.
> >
> > The right checksum is
> > ec0eaca11b165110c2b61e6a7b50a7a0a9b17fa04a0c333f795bec2d19f78f6c as:
> >
> > http://ftp.netfilter.org/pub/libnftnl/libnftnl-1.1.0.tar.bz2.sha256sum
> >
> > says.
> >
> > Thanks for reporting.
> 
> I refreshed
> https://netfilter.org/projects/libnftnl/downloads.html#libnftnl-1.1.0 several
> times, but still see
> 36c6d99c7684851d4d72e75bd07ff3f0ff1baaf4b6f069eb7244990cd1a9a462 displayed on
> the screen. The downloaded one is right, but I always use the screen.

I compiled the website but forgot to push the changes out to the
website, sorry.