Hello,
On Fri, 4 Dec 2020, dust.li wrote:
>
> On 12/3/20 4:48 PM, Julian Anastasov wrote:
> >
> > - work will use spin_lock_bh(&s->lock) to protect the
> > entries, we do not want delays between /proc readers and
> > the work if using
ible with the RCU locking, we need a safe way
to move entries to neighbour list, say if work walks
row 0 we can rebalance between rows 32 and 33 which are
1 second away from row 0. And not all list primitives allow
it for _rcu.
- the next option is to insert entries in some current list,
if their count reaches, say 128, then move to the next list
for inserting. This option tries to provide exact 2000ms
delay for the first estimation for the newly added entry.
We can start with some implementation and see if
your tests are happy.
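A minimal user-space sketch of the second option, not taken from any posted patch. The names est_table, est_insert and EST_NROWS are hypothetical, and so is the number of rows; only the 128-entry threshold comes from the text above. New entries go into the current row until it holds 128, then insertion moves to the next row:

```c
#define EST_NROWS   50   /* hypothetical number of rows (lists) */
#define EST_ROW_MAX 128  /* per-row threshold mentioned above */

struct est_table {
	unsigned int count[EST_NROWS]; /* entries currently in each row */
	unsigned int cur;              /* row used for new insertions */
};

/* Pick the row for a new entry and account for it.
 * Returns the row index, or -1 if every row is full.
 */
static int est_insert(struct est_table *t)
{
	unsigned int tries;

	for (tries = 0; tries < EST_NROWS; tries++) {
		if (t->count[t->cur] < EST_ROW_MAX) {
			t->count[t->cur]++;
			return (int)t->cur;
		}
		/* current row reached the threshold, move to the next one */
		t->cur = (t->cur + 1) % EST_NROWS;
	}
	return -1;
}
```

Returning -1 when all rows are full is a choice made for the sketch only; a real implementation would still need locking and the actual list handling.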
Regards
--
Julian Anastasov
ate_net* return NULL
> when PROC is not used.
>
> Fixes: b17fc9963f83 ("IPVS: netns, ip_vs_stats and its procfs")
> Fixes: 61b1ab4583e2 ("IPVS: netns, add basic init per netns.")
> Reported-by: Hulk Robot
> Signed-off-by: Wang Hai
"ip_vs_stats_percpu", ipvs->net->proc_net);
err_percpu:
> + remove_proc_entry("ip_vs_stats", ipvs->net->proc_net);
err_stats:
> + remove_proc_entry("ip_vs", ipvs->net->proc_net);
err_vs:
#endif
> free_percpu(ipvs->tot_stats.cpustats);
> return -ENOMEM;
> }
> --
Regards
--
Julian Anastasov
Hello,
On Mon, 16 Nov 2020, Yejune Deng wrote:
> atomic_inc_return() looks better
>
> Signed-off-by: Yejune Deng
Looks good to me for -next, thanks!
Acked-by: Julian Anastasov
> ---
> net/netfilter/ipvs/ip_vs_core.c | 2 +-
> net/netfilter/ipvs/ip_vs_sy
ests(): #ifdef can be before declarations,
try to use long-to-short lines (reverse xmas tree order
for variables in declarations)
- print_service_entry(): no need to check d before free(d),
free() checks it itself, just like kfree() in kernel.
- ipvs_services_dests_parse_cb: we should stop if realloc() fails,
sadly, existing code does not check realloc() result but
for new code we should do it
- ipvs_get_services_dests(): kernel avoids using assignments in
'if' condition, we do the same for new code. You have to
split such code to assignment+condition.
- there are extra parentheses in code such as sizeof(*(get->index)),
that should be fine instead: sizeof(*get->index), same for
sizeof(get->index[0]). Extra parens also for &(get->dests),
etc.
- as new code runs only for LIBIPVS_USE_NL, check if it is wrapped
in proper #ifdef in libipvs/libipvs.c. Make sure
ipvsadm compiles without LIBIPVS_USE_NL.
- the extern word should not be used in .h files anymore
Some of the above styling issues are also reported by
linux# scripts/checkpatch.pl --strict /tmp/ipvsadm.patch
As we try to apply to ipvsadm the same styling rules
that are used for networking in kernel, you should be able
to fix all such places with help from checkpatch.pl. Probably,
you know about this file:
Documentation/process/coding-style.rst
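As an illustration of several points from the list above (checking the realloc() result, keeping assignments out of 'if' conditions, no extra parentheses in sizeof, no NULL check before free()), here is a small sketch. The struct and function names are hypothetical and are not from the ipvsadm patch:

```c
#include <stdlib.h>

struct idx_array {
	unsigned int *index;
	size_t len;
};

/* Append one value; returns 0 on success, -1 if realloc() fails. */
static int idx_array_add(struct idx_array *a, unsigned int val)
{
	unsigned int *tmp;

	/* assignment first, condition second (no assignment inside 'if') */
	tmp = realloc(a->index, (a->len + 1) * sizeof(*a->index));
	if (!tmp)
		return -1;	/* old buffer is untouched and still owned by 'a' */

	a->index = tmp;
	a->index[a->len++] = val;
	return 0;
}

static void idx_array_free(struct idx_array *a)
{
	/* no NULL check needed, free(NULL) is a no-op */
	free(a->index);
	a->index = NULL;
	a->len = 0;
}
```

Using a temporary for the realloc() result keeps the old pointer valid on failure, so the caller can stop parsing and still free what was collected so far.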
Regards
--
Julian Anastasov
) {
> + if (ip_vs_genl_dump_service_dests(skb, cb, ipvs,
> + svc, &ctx))
> + goto nla_put_failure;
> + }
> + ctx.idx_svc = 0;
> + ctx.start_svc = 0;
ctx->idx_dest = 0;
ctx->start_dest = 0;
> + }
row = 0; /* Not needed */
tab++; /* tab = 2 to indicate EOF */
> +
> +nla_put_failure:
> + cb->args[0] = ctx.idx_svc;
> + cb->args[1] = ctx.idx_dest;
> + cb->args[2] = tab;
> + cb->args[3] = row;
> +
> +out_err:
> + mutex_unlock(&__ip_vs_mutex);
> +
> + return skb->len;
> +}
> +
> static int ip_vs_genl_parse_dest(struct ip_vs_dest_user_kern *udest,
>struct nlattr *nla, bool full_entry)
> {
> @@ -3991,6 +4143,12 @@ static const struct genl_small_ops ip_vs_genl_ops[] = {
> .flags = GENL_ADMIN_PERM,
> .doit = ip_vs_genl_set_cmd,
> },
> + {
> + .cmd= IPVS_CMD_GET_SERVICE_DEST,
> + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
> + .flags = GENL_ADMIN_PERM,
> + .dumpit = ip_vs_genl_dump_services_destinations,
> + },
> };
>
> static struct genl_family ip_vs_genl_family __ro_after_init = {
> --
> 2.25.1
Regards
--
Julian Anastasov
is behavior while writing this patch and even
> created a few crude validation scripts running parallel agents and
> checking the diff in [1].
Ok, make sure your tests cover cases with multiple
dests, so that single service occupies multiple packets,
I'm not sure if 100 dests fit in one packet or not.
Regards
--
Julian Anastasov
+
> static int ip_vs_genl_parse_dest(struct ip_vs_dest_user_kern *udest,
>struct nlattr *nla, bool full_entry)
> {
> @@ -3991,6 +4094,12 @@ static const struct genl_small_ops ip_vs_genl_ops[] = {
> .flags = GENL_ADMIN_PERM,
> .doit = ip_vs_genl_set_cmd,
> },
> + {
> + .cmd= IPVS_CMD_GET_SERVICE_DEST,
> + .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
> + .flags = GENL_ADMIN_PERM,
> + .dumpit = ip_vs_genl_dump_services_destinations,
> + },
> };
>
> static struct genl_family ip_vs_genl_family __ro_after_init = {
> --
Regards
--
Julian Anastasov
*skb, int
> skb_af,
>
> ip_vs_drop_early_demux_sk(skb);
>
> + skb->tstamp = 0;
> +
Should be after all skb_forward_csum() calls in ip_vs_xmit.c
> if (skb_headroom(skb) < max_headroom || skb_cloned(skb)) {
> new_skb = skb_realloc_headroom(skb, max_headroom);
> if (!new_skb)
Regards
--
Julian Anastasov
Hello,
On Mon, 28 Sep 2020, longguang.yue wrote:
> Outputting client,virtual,dst addresses info when tcp state changes,
> which makes the connection debug more clear
>
> Signed-off-by: longguang.yue
OK, v5 can be used instead of fixing v4.
Acked-by: Juli
Hello,
On Sun, 27 Sep 2020, longguang.yue wrote:
> outputting client,virtual,dst addresses info when tcp state changes,
> which makes the connection debug more clear
>
> Signed-off-by: longguang.yue
Looks good to me, thanks!
Acked-by: Juli
7 ("ipvs: Fix faulty IPv6 extension header handling in
> IPVS").
> Signed-off-by: Yaroslav Bolyukin
Looks good to me, thanks! Maybe the maintainers will
remove the extra dot after the Fixes line.
Acked-by: Julian Anastasov
> ---
> Missed canonical patch format sectio
IP_VS
> config IP_VS_IPV6
> bool "IPv6 support for IPVS"
> depends on IPV6 = y || IP_VS = IPV6
> - select IP6_NF_IPTABLES
> select NF_DEFRAG_IPV6
> help
> Add IPv6 support to IPVS.
> --
Regards
--
Julian Anastasov
PV6
> - select IP6_NF_IPTABLES
> select NF_DEFRAG_IPV6
> help
> Add IPv6 support to IPVS.
> --
> 2.28.0
Regards
--
Julian Anastasov
appspot.com/bug?id=46ebfb92a8a812621a001ef04d90dfa459520fe2
> Suggested-by: Julian Anastasov
> Signed-off-by: Peilin Ye
Looks good to me, thanks!
Acked-by: Julian Anastasov
> ---
> Changes in v2:
> - Target net-next tree. (Suggested by Julian Anastasov )
> - Reject all `len == 0` requests
ID(cmd)]);
> @@ -2547,9 +2549,6 @@ do_ip_vs_set_ctl(struct sock *sk, int cmd, void __user
> *user, unsigned int len)
> break;
> case IP_VS_SO_SET_DELDEST:
> ret = ip_vs_del_dest(svc, &udest);
> - break;
> - default:
> - ret = -EINVAL;
> }
>
>out_unlock:
Regards
--
Julian Anastasov
Hello,
On Wed, 22 Jul 2020, Pablo Neira Ayuso wrote:
> On Fri, Jul 17, 2020 at 08:36:36PM +0300, Julian Anastasov wrote:
> >
> > On Fri, 17 Jul 2020, Andrew Sy Kim wrote:
> >
> > > Adds missing "*ipvs" to ip_vs_enqueue_expire_nodest_co
"udp: use a separate rx queue for packet reception")
> Reported-by: zhouxudong
> Signed-off-by: guodeqing
Looks good to me, thanks!
Acked-by: Julian Anastasov
Simon, Pablo, this patch should be applied to the nf tree.
As the reader_queue appears in 4.13, this patch ca
Hello,
On Fri, 17 Jul 2020, Andrew Sy Kim wrote:
> Adds missing "*ipvs" to ip_vs_enqueue_expire_nodest_conns when
> CONFIG_SYSCTL is disabled
>
> Signed-off-by: Andrew Sy Kim
Acked-by: Julian Anastasov
Pablo, please apply this too.
> ---
> inc
!skb_queue_empty(&up->reader_queue)) {
Here too
> len = ip_vs_receive(tinfo->sock, tinfo->buf,
> ipvs->bcfg.sync_maxlen);
> if (len <= 0) {
> --
> 2.7.4
Regards
--
Julian Anastasov
nt ip_vs_in_icmp_v6(struct netns_ipvs *ipvs,
> struct sk_buff *skb,
> }
>
> if (resched) {
> + if (uses_ct)
> + cp->flags &= ~IP_VS_CONN_F_NFCT;
> if (!atomic_read
Hello,
On Thu, 4 Jun 2020, Christoph Paasch wrote:
> On Wed, Jun 11, 2014 at 11:05 PM Julian Anastasov wrote:
> >
> >
> > > The behavior that we want is for the receipt of the duplicate bare
> > > ACK to not result in waking up user space. The socke
ead of
> "then the client program".
> Or a more detailed explanation.
Yes, if the packet is SYN we can create new connection.
If it is ACK, the retransmission will get RST.
Regards
--
Julian Anastasov
nnection with unavailable dest,
as before
- create new connection to available destination that will be found
first in lists. But it can work only when sysctl var "conntrack" is 0,
we do not want to create two netfilter conntracks to different
real servers.
Note that we intentionally removed the timer_pending() check
because we can not see existing ONE_PACKET connections in table.
Regards
--
Julian Anastasov
/* try to expire the connection immediately */
> ip_vs_conn_expire_now(cp);
> }
You can also look at the discussion which resulted in
the last patch for this place:
http://archive.linuxvirtualserver.org/html/lvs-devel/2018-07/msg00014.html
Regards
--
Julian Anastasov
Hello,
On Mon, 30 Sep 2019, zhang kai wrote:
> In the end of function __ip_vs_get_out_rt/__ip_vs_get_out_rt_v6,the
> 'local' variable is always zero.
>
> Signed-off-by: zhang kai
Looks good to me, thanks!
Acked-by: Julian Anastasov
Simon, thi
Hello,
On Tue, 17 Sep 2019, David Ahern wrote:
> On 9/17/19 12:50 PM, Julian Anastasov wrote:
> >
> > Looks good to me, thanks!
> >
> > Reviewed-by: Julian Anastasov
> >
>
> BTW, do you have any tests for the rt_uses_gateway paths -
are always used
> together, and then re-use that u8 for rt_uses_gateway. End result is that
> rtable size is unchanged.
>
> Fixes: 1550c171935d ("ipv4: Prepare rtable for IPv6 gateway")
> Reported-by: Julian Anastasov
> Signed-off-by: David Ahern
Looks good to
te get LOCAL_IP oif eth0' where extra 'via GW' line is
shown.
Regards
--
Julian Anastasov
ULL by default.
> We could not fix it in __metadata_dst_init() as there is no dev supplied.
> On the other hand, the reason we need rt->dst.dev is to get the net.
> So we can just try get it from skb->dev when rt->dst.dev is NULL.
>
> v4: Julian Anastasov remind skb-&g
unreach6() and other IPv6 places have
workarounds to avoid skb->dev being NULL but IPv4 and IPv6 are
different: IPv4 never required skb->dev to be non-NULL, so better
do not change that. Just check dst.dev to avoid crash.
> + net = dev_net(skb_in->dev);
>
> /*
>* Find the original header. It is expected to be valid, of course.
> --
> 2.19.2
Regards
--
Julian Anastasov
__s32 previous_delta; /* Delta in sequence numbers
>* before last resized pkt */
> };
>
> --
> 2.17.1
Regards
--
Julian Anastasov
o read
> the value from the user buffer, and save only when it is valid.
> I delete proc_do_sync_mode and use extra1/2 in table for the
> proc_dointvec_minmax call.
>
> Fixes: f73181c8288f ("ipvs: add support for sync threads")
> Signed-off-by: Junwei Hu
> Acked-by:
o read
> the value from the user buffer, and save only when it is valid.
> I delete proc_do_sync_mode and use extra1/2 in table for the
> proc_dointvec_minmax call.
>
> Fixes: f73181c8288f ("ipvs: add support for sync threads")
> Signed-off-by: Junwei Hu
Looks g
max(NEIGH_VAR(neigh->parms, RETRANS_TIME),
> @@ -1140,6 +1141,7 @@ int __neigh_event_send(struct neighbour *neigh, struct
> sk_buff *skb)
> }
> } else if (neigh->nud_state & NUD_STALE) {
> neigh_dbg(2, "neigh %p is delayed\n", neigh);
> + neigh_del_timer(neigh);
> neigh->nud_state = NUD_DELAY;
> neigh->updated = jiffies;
> neigh_add_timer(neigh, jiffies +
> --
> 2.21.0
Regards
--
Julian Anastasov
> memset(&(IPCB(skb)->opt), 0, sizeof(IPCB(skb)->opt));
As part of your patch, the new tunnel type should be registered
also in ip_vs_rs_hash(), GRE will use port 0 just like IPIP, eg:
case IP_VS_CONN_F_TUNNEL_TYPE_IPIP:
+ case IP_VS_CONN_F_TUNNEL_TYPE_GRE:
port = 0;
break;
Then I'll post a patch for ip_vs_in_icmp() that strips
the GRE header from ICMP errors by adding ipvs_gre_decap().
I also created ipvsadm patch for GRE.
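The port selection suggested above can be sketched in user space as follows. The enum values are local stand-ins for the IP_VS_CONN_F_TUNNEL_TYPE_* constants, the function name is hypothetical, and the GUE branch (using the tunnel port) is an assumption that is not stated in this message:

```c
#include <stdint.h>

/* Local stand-ins for the IP_VS_CONN_F_TUNNEL_TYPE_* constants */
enum tun_type { TUN_IPIP, TUN_GUE, TUN_GRE };

/* Port used as part of the real-server hash key; -1 for unknown type. */
static int rs_hash_port(enum tun_type type, uint16_t tun_port)
{
	switch (type) {
	case TUN_GUE:
		return tun_port;	/* assumption: UDP encapsulation has a port */
	case TUN_IPIP:
	case TUN_GRE:
		return 0;		/* no port in these encapsulations */
	default:
		return -1;
	}
}
```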
Regards
--
Julian Anastasov
max_headroom,
>&next_protocol, &payload_len,
> @@ -1208,8 +1297,17 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct
> ip_vs_conn *cp,
> goto tx_error;
>
> gso_type = __tun_gso_type_mask(AF_INET6, cp->af);
> - if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GUE)
> - gso_type |= SKB_GSO_UDP_TUNNEL;
> + if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GUE) {
> + if ((tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM) ||
> + (tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_REMCSUM))
> + gso_type |= SKB_GSO_UDP_TUNNEL_CSUM;
> + else
> + gso_type |= SKB_GSO_UDP_TUNNEL;
> + if ((tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_REMCSUM) &&
> + skb->ip_summed == CHECKSUM_PARTIAL) {
> + gso_type |= SKB_GSO_TUNNEL_REMCSUM;
> + }
> + }
>
> if (iptunnel_handle_offloads(skb, gso_type))
> goto tx_error;
> @@ -1218,8 +1316,18 @@ ip_vs_tunnel_xmit_v6(struct sk_buff *skb, struct
> ip_vs_conn *cp,
>
> skb_set_inner_ipproto(skb, next_protocol);
>
> - if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GUE)
> - ipvs_gue_encap(net, skb, cp, &next_protocol);
> + if (tun_type == IP_VS_CONN_F_TUNNEL_TYPE_GUE) {
> + bool check = false;
> +
> + if (ipvs_gue_encap(net, skb, cp, &next_protocol))
> + goto tx_error;
> +
> + if ((tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_CSUM) ||
> + (tun_flags & IP_VS_TUNNEL_ENCAP_FLAG_REMCSUM))
> + check = true;
> +
> + udp6_set_csum(!check, skb, &saddr, &cp->daddr.in6, skb->len);
> + }
>
> skb_push(skb, sizeof(struct ipv6hdr));
> skb_reset_network_header(skb);
> --
> 2.21.0
Regards
--
Julian Anastasov
> Fixes: efe41606184e ("ipvs: convert to use pernet nf_hook api")
> Signed-off-by: YueHaibing
> ---
> net/netfilter/ipvs/ip_vs_core.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/net/netfilter/ipvs/ip_vs_core.c b/net/netfilter/ipvs/ip_vs_core.c
> index 1445755..33205db 100644
> --- a/net/netfilter/ipvs/ip_vs_core.c
> +++ b/net/netfilter/ipvs/ip_vs_core.c
> @@ -2320,6 +2320,7 @@ static void __net_exit __ip_vs_cleanup(struct net *net)
> ip_vs_control_net_cleanup(ipvs);
> ip_vs_estimator_net_cleanup(ipvs);
> IP_VS_DBG(2, "ipvs netns %d released\n", ipvs->gen);
> + synchronize_net();
> net->ipvs = NULL;
> }
Regards
--
Julian Anastasov
%d released\n", ipvs->gen);
> + synchronize_net();
Grace period in net_exit handler should be avoided.
It can be added to ip_vs_cleanup() but maybe we have to
reorder the operations, so that we can have single grace
period. Note that ip_vs_conn_cleanup() already includes
rcu_barrier() and we can use it to split the cleanups to
two steps: 1: unregister hooks (__ip_vs_dev_cleanup) to
stop traffic and 2: cleanups when traffic is stopped.
Note that the problem should occur only when the module
is removed, the case with netns exit in cleanup_net()
should not cause a problem.
I'll have more time this weekend to reorganize the
code...
> net->ipvs = NULL;
> }
>
> --
> 2.7.4
Regards
--
Julian Anastasov
ck.inet6
> directly, it should be accessed via tcp_inet6_sk() or inet6_sk().
>
> This happened when we added the first u64 field in struct tcp_sock.
>
> Fixes: 93a77c11ae79 ("tcp: add tcp_inet6_sk() helper")
> Signed-off-by: Eric Dumazet
> Bisected-by: Julian Anas
.
>
> Signed-off-by: Xin Long
Looks good to me, thanks!
Acked-by: Julian Anastasov
I guess, it is for the nf-next/net-next tree because it just
eliminates a duplicate check.
> ---
> net/netfilter/ipvs/ip_vs_proto_sctp.c | 7 ++-
> 1 file changed, 2 inse
ned-off-by: Matteo Croce
Looks good to me, thanks!
Acked-by: Julian Anastasov
> ---
> net/netfilter/ipvs/ip_vs_core.c | 49 +++-
> net/netfilter/ipvs/ip_vs_proto_tcp.c | 3 +-
> net/netfilter/ipvs/ip_vs_proto_udp.c | 3 +-
> 3 files change
> This reduces the performance impact of the Spectre mitigation, and
> should give a small improvement even with RETPOLINES disabled.
>
> Signed-off-by: Matteo Croce
Looks good to me, thanks!
Acked-by: Julian Anastasov
> ---
> include/net/ip_vs.h
>
> Fix this by checking whether the timer already started.
>
> Signed-off-by: Tan Hu
> Reviewed-by: Jiang Biao
v3 looks good to me,
Acked-by: Julian Anastasov
Simon and Pablo, this can be applied to ipvs/nf tree...
> ---
> v2: fix use-after-free in CONN_ONE_PAC
x this by checking whether the timer already started.
>
> Signed-off-by: Tan Hu
> Reviewed-by: Jiang Biao
> ---
> v2: fix use-after-free in CONN_ONE_PACKET case suggested by Julian Anastasov
>
> net/netfilter/ipvs/ip_vs_core.c | 15 +++
> 1 file changed
to some path (via alive ISP), use route lookup just
to select alive path for the first packet in connection. So, what
we balance are connections, not packets (which does not work with
different ISPs). Probe GWs to keep only alive routes in the table.
Regards
--
Julian Anastasov
Hello,
On Thu, 21 Jun 2018, Grant Taylor wrote:
> On 06/21/2018 01:57 PM, Julian Anastasov wrote:
> > Hello,
>
> > http://ja.ssi.bg/dgd-usage.txt
>
> "DGD" or "Dead Gateway Detection" sounds very familiar. I referenced it in an
> ea
Hello,
On Wed, 20 Jun 2018, Grant Taylor wrote:
> On 06/20/2018 01:00 PM, Julian Anastasov wrote:
> > You can also try alternative routes.
>
> "Alternative routes"? I can't say as I've heard that description as a
> specific technique / feature
tool puts only alive routes in service after doing health
checks of all near gateways.
Regards
--
Julian Anastasov
ountering
pmtu exception")
Fixes: 7343ff31ebf0 ("ipv6: Don't create clones of host routes.")
Signed-off-by: Julian Anastasov
---
net/ipv6/route.c | 3 ---
1 file changed, 3 deletions(-)
Note: I failed to build 2.6.38 kernel for the test but I think
commit 7343ff31ebf0 looks as
Hello,
On Tue, 5 Jun 2018, Martin KaFai Lau wrote:
> On Sat, May 05, 2018 at 03:58:25PM +0300, Julian Anastasov wrote:
> > So, except the RTF_LOCAL check in __ip6_rt_update_pmtu
> > we should have no other issues.
> Hi Julian,
>
> Do you have a chance
* - it can be when cp was dropped on load
*/
cp->state == IP_VS_TCP_S_SYN_RECV) {
IP_VS_DBG(4, "del conn template\n");
ip_vs_conn_expire_now(cp_c);
}
}
It is not perfect, i.e. it does not know if there was
some conn that was established in the past:
- CONN1: SYN, SYN+ACK, ESTABLISH, FIN, FIN+ACK, expire
- CONN2: expire in SYN state, drop tpl before persistent timeout
But it should work in the general case.
Anyways, give me some days to think more on this issue.
Regards
--
Julian Anastasov
: 0001
> Code: 08 5b 41 5c 41 5d 41 5e 41 5f 5d c3 0f 0b 48 89 df e8 d2 8f 48 fa eb de
> 55 48 89 fe 48 c7 c7 60 65 64 88 48 89 e5 e8 91 dd f3 f9 <0f> 0b 90 90 90 90
> 90 90 90 90 90 90 90 55 48 89 e5 41 57 41 56
> RIP: fortify_panic+0x13/0x20 lib/string.c:1051 RSP: 8801c976f800
> ---[ end trace 624046f2d9af7702 ]---
Just to let you know that I tested a patch with
the syzbot, will do more tests before submitting...
Regards
--
Julian Anastasov
Hello,
On Mon, 7 May 2018, Martin KaFai Lau wrote:
> On Sat, May 05, 2018 at 03:58:25PM +0300, Julian Anastasov wrote:
> >
> > So, except the RTF_LOCAL check in __ip6_rt_update_pmtu
> > we should have no other issues. Only one minor bit is s
tu(rt6, mtu);
+ /* update rt6_ex->stamp for cache */
rt6_update_exception_stamp_rt(rt6);
+ }
} else if (daddr) {
Regards
--
Julian Anastasov
Hello,
On Wed, 2 May 2018, David Ahern wrote:
> On 5/2/18 12:41 AM, Julian Anastasov wrote:
> > Allow some non-cached routes to use non-expired fnhe:
> >
> > 1. ip_del_fnhe: moved above and now called by find_exception.
> > The 4.5+ commit deed49df7390 exp
Hello,
On Wed, 2 May 2018, Martin KaFai Lau wrote:
> On Wed, May 02, 2018 at 09:38:43AM +0300, Julian Anastasov wrote:
> >
> > - initial traffic for port 21 does not use GSO. But after
> > every packet IPVS calls maybe_update_pmtu (rt->dst.ops->update_pmtu
on local routes with
oif")
Fixes: deed49df7390 ("route: check and remove route cache when we get route")
Cc: David Ahern
Cc: Xin Long
Signed-off-by: Julian Anastasov
---
net/ipv4/route.c | 118 +--
1 file changed, 53 insertions(+)
skb->len > mtu && !skb_is_gso(skb) &&
- !ip_vs_iph_icmp(ipvsh))) {
+ if (unlikely(__mtu_check_toobig(skb, mtu))) {
icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
htonl(mtu));
IP_VS_DBG(1, "frag needed for %pI4\n",
Regards
--
Julian Anastasov
Hello,
On Mon, 23 Apr 2018, Cong Wang wrote:
> Similarly, tbl->entries is not initialized after kmalloc(),
> therefore causes an uninit-value warning in ip_vs_lblc_check_expire(),
> as reported by syzbot.
>
> Reported-by:
> Cc: Simon Horman
> Cc: Julian Anas
Hello,
On Mon, 23 Apr 2018, Cong Wang wrote:
> tbl->entries is not initialized after kmalloc(), therefore
> causes an uninit-value warning in ip_vs_lblc_check_expire()
> as reported by syzbot.
>
> Reported-by:
> Cc: Simon Horman
> Cc: Julian Anastasov
<< 2) +
> + skb_network_header_len(skb);
> +
> + if (mtu > hdr_len && mtu - hdr_len < skb_shinfo(skb)->gso_size)
> + skb_decrease_gso_size(skb_shinfo(skb),
> + skb_shinfo(skb)->gso_size -
> + (mtu - hdr_len));
So, software segmentation happens and we want the
tunnel header to be accounted immediately and not after PMTU
probing period? Is this a problem only for IPVS tunnels?
Do we see such delays with other tunnels? Maybe this should
be solved for all protocols (not just TCP) and for all tunnels?
Looking at ip6_xmit, on GSO we do not return -EMSGSIZE to
local sender, so we should really alter the gso_size for proper
segmentation?
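For reference, the arithmetic in the quoted hunk can be sketched in user space. The function name is hypothetical; hdr_len stands for the combined header length computed in the patch, and the result is the gso_size left after the hunk's decrease:

```c
/* Return the gso_size to use once hdr_len bytes of headers must fit in mtu. */
static unsigned int fit_gso_size(unsigned int mtu, unsigned int hdr_len,
				 unsigned int gso_size)
{
	if (mtu > hdr_len && mtu - hdr_len < gso_size)
		return mtu - hdr_len;	/* same as decreasing by the excess */
	return gso_size;		/* already fits, or mtu is unusable */
}
```

With mtu 1480, 40 bytes of headers and a gso_size of 1460, the segment payload shrinks to 1440 so that each segment fits after the tunnel header is accounted for.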
Regards
--
Julian Anastasov
Hello,
On Thu, 12 Apr 2018, Stephen Suryaputra wrote:
> Thanks for the feedbacks. Please see the detail below:
>
> On Wed, Apr 11, 2018 at 3:37 PM, Julian Anastasov wrote:
> [snip]
> >> - __IP_INC_STATS(net, IPSTATS_MIB_INHDRERRORS);
> >> + _
t_sync_thread should be resolved soon...
> > IPVS: sync thread started: state = BACKUP, mcast_ifn = lo, syncid = 0, id =
> > 0
> > IPVS: stopping backup sync thread 4546 ...
> >
> >
> > IPVS: stopping backup sync thread 4559 ...
> > WARNING: possible recursive locking detected
Regards
--
Julian Anastasov
st
'skb_dst(skb)->dev'.
> icmp_send(skb, ICMP_TIME_EXCEEDED, ICMP_EXC_TTL, 0);
> return false;
> }
The patch probably has other errors, for example,
using rt->dst.dev (lo) when rt->dst.error != 0 in ip_error,
maybe 'dev' should be used instead...
Regards
--
Julian Anastasov
ipvs/ip_vs_csh.c | 339
> +
> net/netfilter/ipvs/ip_vs_sh.c | 32 +---
> 5 files changed, 381 insertions(+), 31 deletions(-)
> create mode 100644 net/netfilter/ipvs/ip_vs_csh.c
Regards
--
Julian Anastasov
ut it shouldn't matter.
>
> Signed-off-by: Vincent Bernat
Looks good to me, thanks! Simon, please apply, if possible
with the extra space removed, see below...
Acked-by: Julian Anastasov
> ---
> net/netfilter/ipvs/ip_vs_dh.c| 3 ++-
> net/netfilter/ipvs/i
sk' part is not needed, still,
it does not generate extra code. I see that other code uses
hash_32(val, bits) from include/linux/hash.h but note that it
used different ratio before Linux 4.7, in case someone backports
this patch on old kernels. So, I have no preference for what should
be used, maybe return hash_32(ntohl(addr_fold), IP_VS_DH_TAB_BITS)
is better.
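A user-space sketch of what such a hash_32()-based bucket computation does. It assumes the multiplier used by include/linux/hash.h since 4.7 and a table of 2^8 buckets; hash32 and dh_bucket are local names, not kernel functions:

```c
#include <stdint.h>

#define GOLDEN_RATIO_32 0x61C88647u	/* multiplier used by hash_32() since 4.7 */
#define DH_TAB_BITS     8		/* assumed table size: 2^8 buckets */

/* Keep the top 'bits' bits of the 32-bit product (1 <= bits <= 32). */
static uint32_t hash32(uint32_t val, unsigned int bits)
{
	return (uint32_t)(val * GOLDEN_RATIO_32) >> (32 - bits);
}

/* addr_fold is expected in host byte order, i.e. after ntohl() */
static unsigned int dh_bucket(uint32_t addr_fold)
{
	return hash32(addr_fold, DH_TAB_BITS);
}
```

Because the result depends on the multiplier, a backport to a kernel older than 4.7 would place the same address in a different bucket, which is the caveat raised above.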
Regards
--
Julian Anastasov
d.c:238
> ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:406
> Sending NMI from CPU 1 to CPUs 0:
> NMI backtrace for cpu 0 skipped: idling at native_safe_halt+0x6/0x10
> arch/x86/include/asm/irqflags.h:54
>
>
> ---
> This bug is generated by a dumb bot. It may contain errors.
> See https://goo.gl/tpsmEJ for details.
> Direct all questions to syzkal...@googlegroups.com.
>
> syzbot will keep track of this bug report.
> If you forgot to add the Reported-by tag, once the fix for this bug is merged
> into any tree, please reply to this email with:
> #syz fix: exact-commit-title
> To mark this as a duplicate of another syzbot report, please reply with:
> #syz dup: exact-subject-of-another-report
> If it's a one-off invalid bug report, please reply with:
> #syz invalid
> Note: if the crash happens again, it will cause creation of a new bug report.
> Note: all commands must start from beginning of the line in the email body.
Regards
--
Julian Anastasov
to {make,setup}_{send,receive}_sock ...
> > stack backtrace:
> > rtnl_lock+0x17/0x20 net/core/rtnetlink.c:74
> > ip_mc_drop_socket+0x88/0x230 net/ipv4/igmp.c:2643
> > inet_release+0x4e/0x1c0 net/ipv4/af_inet.c:413
> > sock_release+0x8d/0x1e0 net/socket.c:595
> > start_sync_thread+0x2213/0x2b70 net/netfilter/ipvs/ip_vs_sync.c:1924
> > do_ip_vs_set_ctl+0x1139/0x1cc0 net/netfilter/ipvs/ip_vs_ctl.c:2389
Regards
--
Julian Anastasov
Convert ip_vs_ftp_ops
The IPVS patches 4-6 look good to me,
Acked-by: Julian Anastasov
> net/l2tp/l2tp_core.c|1 +
> net/mpls/af_mpls.c |1 +
> net/netfilter/ipvs/ip_vs_core.c |2 ++
> net/netfilter/ipvs/ip_vs_ftp.c |1 +
> n
ool copy)
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 2465607..e140ba4 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4864,6 +4864,7 @@ void skb_scrub_packet(struct sk_buff *skb, bool xnet)
> if (!xnet)
> return;
>
> + ipvs_reset(skb);
> skb_orphan(skb);
> skb->mark = 0;
> }
> --
> 1.7.12.4
Regards
--
Julian Anastasov
> Cc: Simon Horman
> Cc: Julian Anastasov
> Cc: Pablo Neira Ayuso
> Cc: Jozsef Kadlecsik
> Cc: Florian Westphal
> Cc: "David S. Miller"
> Cc: netdev@vger.kernel.org
> Cc: lvs-de...@vger.kernel.org
> Cc: netfilter-de...@vger.kernel.org
> Cc: coret...@netfilter
size=4096)
> Prot LocalAddress:Port Scheduler Flags
> -> RemoteAddress:Port Forward Weight ActiveConn InActConn
> TCP 0A010102:0050 wlc
>
> Signed-off-by: KUWAZAWA Takuya
Looks good to me
Acked-by: Julian Anastasov
Simon, please apply to ipvs tree.
>
> The original issue was reported only once to us from the regression rack only
> so the exact steps to reproduce is unknown.
OK, let's see, maybe others can explain what happens.
Regards
--
Julian Anastasov
S_REFCOUNTED flag
and later to see this flag in __dst_destroy_metrics_generic
So, I'm not sure where exactly is the bug with the
metrics.
Maybe I'm missing some posting but I don't see whether
the patch was tested successfully.
Regards
--
Julian Anastasov
IT
> packet
> netfilter: ipvs: do not create conn for ABORT packet in
> sctp_conn_schedule
>
> net/netfilter/ipvs/ip_vs_proto_sctp.c | 8 ++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
Patchset looks ok to me,
Acked-by: Julian Anastasov
)) {
if (NEIGH_VAR(neigh->parms, MCAST_PROBES) +
NEIGH_VAR(neigh->parms, APP_PROBES)) {
- unsigned long next, now = jiffies;
+ unsigned long next;
atomic_set(&neigh->probes,
NEIGH_VAR(neigh->parms, UCAST_PROBES));
Regards
--
Julian Anastasov
Hello,
On Wed, 16 Aug 2017, Julian Anastasov wrote:
> I thought about this, it is possible in
> neigh_event_send:
>
> if (neigh->used != now)
> neigh->used = now;
> else if (neigh->nud_state == NUD_INCOMPLET
Hello,
On Tue, 15 Aug 2017, Eric Dumazet wrote:
> On Tue, 2017-08-15 at 22:45 +0300, Julian Anastasov wrote:
> > Hello,
> >
> > On Tue, 15 Aug 2017, Eric Dumazet wrote:
> >
> > > Please try this :
> > > diff --git a/net/core/n
ly caller,
neigh_event_send. Now we risk to enter the
'if (!(neigh->nud_state & (NUD_STALE | NUD_INCOMPLETE))) {' block...
> write_lock_bh(&neigh->lock);
>
> rc = 0;
> - if (neigh->nud_state & (NUD_CONNECTED | NUD_DELAY | NUD_PROBE))
> - goto out_unlock_bh;
> if (neigh->dead)
> goto out_dead;
Regards
--
Julian Anastasov
le and free it. This patch refactors and shares common code
> between neigh_forced_gc and the newly added neigh_remove_one.
>
> A similar issue exists for IPv6 Neighbor Cache entries, and is fixed
> in a similar manner by this patch.
>
> Signed-off-by: Sowmini Varadhan
>
Change looks ok to me but with some non-fatal
warnings, see below.
Reviewed-by: Julian Anastasov
> ---
> v2: kbuild-test-robot compile error
> v3: do not export_symbol neigh_remove_one (David Miller comment)
> v4: rearrange locking for tbl->lo
Hello,
On Wed, 31 May 2017, Sowmini Varadhan wrote:
> On (06/01/17 00:41), Julian Anastasov wrote:
> >
> > So, we do not hold reference to neigh while accessing
> > its fields. I suspect we need to move the table lock from
> > neigh_remove_one here, f
en gc_thresh1 is not reached,
so such solution is not good enough.
Regards
--
Julian Anastasov
other objects)
>
> I tried to make this patch as small as possible to ease its backport,
> instead of being super clean. Note that we believe that only ipv4 dst
> need to take care of the metric refcount. But if this is wrong,
> this patch adds the basic infrastructure to extend this to
GARP replies cannot work for 1394, is_garp will be
cleared. Maybe the 'tha' check should be moved into the if expression,
for example:
if (is_garp && ar_op == htons(ARPOP_REPLY) && tha)
is_garp = !memcmp(tha, sha, dev->addr_len);
> + !memcmp(tha, sha, dev->addr_len);
> +
> + return is_garp;
> +}
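The suggested form of the check can be sketched in user space as below. The function name and parameter layout are hypothetical, ar_op is taken in host byte order for simplicity, and the initial sender/target IP comparison is an assumption about how is_garp is first computed:

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define ARPOP_REPLY 2	/* ARP reply opcode */

/* Sketch of the suggested check; ar_op is in host byte order here. */
static bool is_garp_sketch(uint32_t sip, uint32_t tip, uint16_t ar_op,
			   const unsigned char *sha, const unsigned char *tha,
			   size_t addr_len)
{
	bool is_garp = (sip == tip);	/* assumption: equal sender/target IP */

	/* Replies also need matching hardware addresses, but only when the
	 * target hardware address is available (tha may be NULL).
	 */
	if (is_garp && ar_op == ARPOP_REPLY && tha)
		is_garp = !memcmp(tha, sha, addr_len);

	return is_garp;
}
```

With the 'tha' test inside the condition, a reply without a target hardware address keeps the result of the IP comparison instead of being cleared.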
Regards
--
Julian Anastasov
p;
(addr_type == RTN_UNICAST ||
(addr_type < 0 &&
inet_addr_type_dev_table(net, dev, sip) == RTN_UNICAST)
n = __neigh_lookup(&arp_tbl, &sip, dev, 1);
As a result, we will avoid the unneeded
inet_addr_type_dev_table() call for ARP requests (non-GARP)
which are too common to see. Maybe there is another way
to make this code nicer...
Regards
--
Julian Anastasov
Hello,
On Mon, 15 May 2017, Cong Wang wrote:
> On Mon, May 15, 2017 at 1:37 PM, Julian Anastasov wrote:
> > Any user that does not set FIB_LOOKUP_NOREF
> > will need nh_dev refcounts. The assumption is that the
> > NHs are accessed, who knows, may be
Hello,
On Mon, 15 May 2017, Ihar Hrachyshka wrote:
> On Mon, May 15, 2017 at 1:05 PM, Julian Anastasov wrote:
> >
> > It seems arp_accept value currently has influence on
> > the locktime for GARP requests. My understanding is that
> > locktime is
Hello,
On Mon, 15 May 2017, Cong Wang wrote:
> On Fri, May 12, 2017 at 2:27 PM, Julian Anastasov wrote:
> > Now the main question: is FIB_LOOKUP_NOREF used
> > everywhere in IPv4? I guess so. If not, it means
> > someone can walk its res->fi NHs which i
was honoured by the kernel
> layer. This would require tracking timestamps for state transitions
> separately from timestamps when actual updates are received. This would
> probably involve changes in neighbour struct. Therefore, the patch
> doesn't tackle the issue of the first
Hello,
On Fri, 12 May 2017, Cong Wang wrote:
> On Thu, May 11, 2017 at 11:39 PM, Julian Anastasov wrote:
> >
> > fib_flush will unlink the FIB infos at NETDEV_UNREGISTER
> > time, so we can not see them in any hash tables later on
> > NETDEV_UNREGI
e don't set nh_dev to NULL this is not
needed.
What else? What about nh_pcpu_rth_output and
nh_rth_input holding routes? We should think about
them too. I should think more about whether the nh_oif trick can work
for them, maybe not because nh_oif is optional...
Maybe we should purge them somehow?
Regards
--
Julian Anastasov
Hello,
On Wed, 10 May 2017, Cong Wang wrote:
> On Wed, May 10, 2017 at 12:38 AM, Julian Anastasov wrote:
> >
> > During NETDEV_UNREGISTER packets for dev should not
> > be flying but packets for other devs can walk the nexthops
> > for multipath ro
ce that holds
routes without listening for NETDEV_UNREGISTER? On fib_flush
the infos are unlinked from trees, so after a grace period packets
should not see/hold such infos. If we hold routes somewhere for
long time, problem can happen also for routes with single nexthop.
Regards
--
Julian Anastasov
d you please keep me posted as this is merged?
Sure. Thanks for the confirmation! I'll do some
tests and will post official patch in few days.
Regards
--
Julian Anastasov
Hello,
On Mon, 24 Apr 2017, Paolo Abeni wrote:
> Hi,
>
> The problem with the patched code is that it tries to resolve ipv6
> addresses that are not created/validated by the kernel.
OK. Simon, please apply to ipvs tree.
Acked-by: Julian Anastasov
Regard
t in rare
cases it can happen also for TCP or remote clients when the
real server sends the reply traffic via the director.
So, better to be more precise for the reply traffic.
As replies are not expected for DR/TUN connections, better
to not touch them.
Reported-by: Nick Moriarty
Signed-off-by: J