Re: [vpp-dev] det44 and map plugins interfere with linux-cp

2021-11-19 Thread Ben McKeegan

On 19/11/2021 13:32, Ben McKeegan via lists.fd.io wrote:


Firstly, with the map plugin it appears to break IPv6 connectivity: the 
control plane can no longer successfully do NDP to the external gateway 
(a layer 3 switch).  NDP replies from the gateway to the control plane 
do not arrive.   There is a very simple workaround: if I put in a static 
neighbour entry in Linux (with 'ip neigh replace ...') everything else 
works.  I have not yet understood why this happens although as I have a 
workaround I did not spent too long on investigating it.




It turns out this was fairly straightforward, see patch below which 
fixed it for me.  Previously, the code checked for ICMP6 echo request 
and reply codes and handled these locally, attempting to relay 
everything else (and discarding any that are not suitable for relaying). 
 For now I have added similar exceptions for NDP and RAs, but this 
seems a little backward to me.   Should we make IP6_MAP_NEXT_IP6_LOCAL 
the default and only set IP6_MAP_NEXT_IP6_ICMP_RELAY for one of the four 
ICMP6 error codes that ip6_map_icmp_relay() actually checks for? The 
comment in the code says:


   * ICMP IPv6 packet
   *   - Error -> Pass to ICMPv6/ICMPv4 relay
   *   - Info -> Pass to IPv6 local

... which makes sense, but doesn't match what the code was doing.



diff --git a/src/plugins/map/ip6_map.c b/src/plugins/map/ip6_map.c
index 1193dda0a..d400c634c 100644
--- a/src/plugins/map/ip6_map.c
+++ b/src/plugins/map/ip6_map.c
@@ -246,8 +246,11 @@ ip6_map (vlib_main_t * vm, vlib_node_runtime_t * 
node, vlib_frame_t * frame)

{
  icmp46_header_t *icmp = (void *) (ip60 + 1);
  next0 = (icmp->type == ICMP6_echo_request
-  || icmp->type ==
-  ICMP6_echo_reply) ? IP6_MAP_NEXT_IP6_LOCAL :
+  || icmp->type == ICMP6_echo_reply
+  || icmp->type == ICMP6_neighbor_solicitation
+  || icmp->type == ICMP6_neighbor_advertisement
+  || icmp->type == ICMP6_router_solicitation
+  || icmp->type == ICMP6_router_advertisement) ? 
IP6_MAP_NEXT_IP6_LOCAL :

IP6_MAP_NEXT_IP6_ICMP_RELAY;
}
  else if (ip60->protocol == IP_PROTOCOL_IPV6_FRAGMENTATION)
@@ -273,8 +276,11 @@ ip6_map (vlib_main_t * vm, vlib_node_runtime_t * 
node, vlib_frame_t * frame)

{
  icmp46_header_t *icmp = (void *) (ip61 + 1);
  next1 = (icmp->type == ICMP6_echo_request
-  || icmp->type ==
-  ICMP6_echo_reply) ? IP6_MAP_NEXT_IP6_LOCAL :
+  || icmp->type == ICMP6_echo_reply
+  || icmp->type == ICMP6_neighbor_solicitation
+  || icmp->type == ICMP6_neighbor_advertisement
+  || icmp->type == ICMP6_router_solicitation
+  || icmp->type == ICMP6_router_advertisement) ? 
IP6_MAP_NEXT_IP6_LOCAL :

IP6_MAP_NEXT_IP6_ICMP_RELAY;
}
  else if (ip61->protocol == IP_PROTOCOL_IPV6_FRAGMENTATION)
@@ -451,8 +457,11 @@ ip6_map (vlib_main_t * vm, vlib_node_runtime_t * 
node, vlib_frame_t * frame)

{
  icmp46_header_t *icmp = (void *) (ip60 + 1);
  next0 = (icmp->type == ICMP6_echo_request
-  || icmp->type ==
-  ICMP6_echo_reply) ? IP6_MAP_NEXT_IP6_LOCAL :
+  || icmp->type == ICMP6_echo_reply
+  || icmp->type == ICMP6_neighbor_solicitation
+  || icmp->type == ICMP6_neighbor_advertisement
+  || icmp->type == ICMP6_router_solicitation
+  || icmp->type == ICMP6_router_advertisement) ? 
IP6_MAP_NEXT_IP6_LOCAL :

IP6_MAP_NEXT_IP6_ICMP_RELAY;
}
  else if (ip60->protocol == IP_PROTOCOL_IPV6_FRAGMENTATION &&


Regards,
Ben.

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20523): https://lists.fd.io/g/vpp-dev/message/20523
Mute This Topic: https://lists.fd.io/mt/87167458/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] det44 and map plugins interfere with linux-cp

2021-11-19 Thread Ben McKeegan

Hi,

I have been experimenting recently with use of the linux-cp plugin to 
run BGP, my intention being to implement failover and load-sharing of 
various networking functions such as map or det44 among pools of VPP 
instances. (The map plugin's protocols are ideally suited to BGP 
anycasting, whereas for det44 I am running VRRP on the inside interface 
and will be adjusting my BGP advertisements of the outside addresses 
based on the inside VRRP status)


linux-cp works a treat on its own (v21.10) and I have no trouble 
establishing BGP sessions using FRR, but I've been hitting some issues 
when enabling extra plugins on the same interfaces that my BGP control 
plane is using.


Firstly, with the map plugin it appears to break IPv6 connectivity: the 
control plane can no longer successfully do NDP to the external gateway 
(a layer 3 switch).  NDP replies from the gateway to the control plane 
do not arrive.   There is a very simple workaround: if I put in a static 
neighbour entry in Linux (with 'ip neigh replace ...') everything else 
works.  I have not yet understood why this happens although as I have a 
workaround I did not spent too long on investigating it.


Secondly, with the det44 plugin enabled (on a different instance) then 
all IPv4 connectivity to the control plane was broken.   In my case 
(running BGP only on the 'outside' interface) this is because the 
det44_out2in_node function is rather overzealous: it grabs all IPv4 
packets received and attempts to translate them, and if it has no 
matching mapping configuration for the destination address it will drop 
the packet.


I have come up with a very simple fix which works for me: if the 
destination address belongs to the receiving interface, then skip the 
TTL check and NAT:



diff --git a/src/plugins/nat/det44/det44_out2in.c 
b/src/plugins/nat/det44/det44_out2in.c

index 111bc61c4..e128b794e 100644
--- a/src/plugins/nat/det44/det44_out2in.c
+++ b/src/plugins/nat/det44/det44_out2in.c
@@ -425,6 +425,10 @@ VLIB_NODE_FN (det44_out2in_node) (vlib_main_t * vm,
   tcp0 = (tcp_header_t *) udp0;

   sw_if_index0 = vnet_buffer (b0)->sw_if_index[VLIB_RX];
+  /* do not interfere with packets to the interface address 
(control plane) */

+  if (PREDICT_FALSE (!det44_is_interface_addr (node, sw_if_index0,
+ 
ip0->dst_address.as_u32)))

+goto trace0;

   if (PREDICT_FALSE (ip0->ttl == 1))
{
@@ -543,6 +547,10 @@ VLIB_NODE_FN (det44_out2in_node) (vlib_main_t * vm,
   tcp1 = (tcp_header_t *) udp1;

   sw_if_index1 = vnet_buffer (b1)->sw_if_index[VLIB_RX];
+  /* do not interfere with packets to the interface address 
(control plane) */

+  if (PREDICT_FALSE (!det44_is_interface_addr (node, sw_if_index1,
+ 
ip1->dst_address.as_u32)))

+goto trace1;

   if (PREDICT_FALSE (ip1->ttl == 1))
{
@@ -688,6 +696,10 @@ VLIB_NODE_FN (det44_out2in_node) (vlib_main_t * vm,
   tcp0 = (tcp_header_t *) udp0;

   sw_if_index0 = vnet_buffer (b0)->sw_if_index[VLIB_RX];
+  /* do not interfere with packets to the interface address 
(control plane) */

+  if (PREDICT_FALSE (!det44_is_interface_addr (node, sw_if_index0,
+ 
ip0->dst_address.as_u32)))

+goto trace00;

   if (PREDICT_FALSE (ip0->ttl == 1))
{

This works well for me as I have BGP advertise routes to the outside 
prefixes of my det44 mappings, with the non-overlapping interface 
address as the next hop.


The downside of this patch is if anyone is relying on being able to NAT 
using the first interface address, rather than a routed prefix or 
secondary address, then it would break it for them.  Is that a use case 
we should try to retain support for? (Based on my recent discovery of 
the long standing session scavenging bug in this plugin, I suspect there 
are few if any other serious users of it.)


My first attempt of the patch put the interface check after the mapping 
table lookup, inside the  "if (PREDICT_FALSE (!mp0))" clauses, so it 
would still be possible to NAT the interface's address if you did not 
need to use if for control plane traffic.  However, I found my BGP 
packets then fell foul of the TTL check in this node.  Thus to continue 
to support NAT of the first interface address would require additional 
refactoring of the code to move the TTL check after the map lookup, and 
special handling for the ICMP code path too.  I feel this would add too 
much complexity to the code for little gain.


What do people think?

Regards,
Ben.


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20521): https://lists.fd.io/g/vpp-dev/message/20521
Mute This Topic: https://lists.fd.io/mt/87167458/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] Memory leak in map plugin

2021-11-19 Thread Ben McKeegan

Hi Ben,

Thank you for the patch, that fixes it!

Regards,
Ben.


On 17/11/2021 08:24, Benoit Ganne (bganne) via lists.fd.io wrote:

Hi Ben,

Thanks for the detailed report, you are correct we are missing a vec_free() 
before returning from the function.
Can you give a try to https://gerrit.fd.io/r/c/vpp/+/34536 ?

Best
ben


-Original Message-
From: vpp-dev@lists.fd.io  On Behalf Of Ben McKeegan
Sent: mardi 16 novembre 2021 15:04
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] Memory leak in map plugin

Hello,

I have identified a memory leak the ip4_map function of
src/plugins/ip4_map.c.  I am using the 21.10 release.

Enabling memory trace of the main-heap via the debug CLI and backtracing
with gdb both point to all the leaked memory being allocated from the
vec_add1(buffer0,pi0) macro at line 293 of ip4_map.c.   In tests it is
leaking approximately 50 bytes for every packet passing through this
function (invariant on packet size).

Here is an extract of the relevant code:

exit:
  /* Send fragments that were added in the frame */
  if (free_original_buffer0)
{
  vlib_buffer_free_one (vm, pi0);   /* Free original packet */
}
  else
{
  vec_add1 (buffer0, pi0);   <<<< leak is here on line 293
}

  frag_from0 = buffer0;
  frag_left0 = vec_len (buffer0);

  while (frag_left0 > 0)
{
  while (frag_left0 > 0 && n_left_to_next > 0)
{
  u32 i0;
  i0 = to_next[0] = frag_from0[0];
  frag_from0 += 1;
  frag_left0 -= 1;
  to_next += 1;
  n_left_to_next -= 1;

  vlib_get_buffer (vm, i0)->error =
error_node->errors[error0];
  vlib_validate_buffer_enqueue_x1 (vm, node, next_index,
   to_next, n_left_to_next,
   i0, next0);
}
  vlib_put_next_frame (vm, node, next_index, n_left_to_next);
  vlib_get_next_frame (vm, node, next_index, to_next,
   n_left_to_next);
}
  vec_reset_length (buffer0);
}
vlib_put_next_frame (vm, node, next_index, n_left_to_next);


I must admit I do not fully understand exactly what this code is doing,
but I am suspicious of the use of 'vec_reset_length' macro.   I have
looked at the definition of this and it appears that although this sets
the length of the vector back to zero (if the pointer is non-zero), it
does not release any memory that may have been allocated.   Do we not
need a call to 'vec_free' somewhere?

Regards,
Ben.







--
Ben McKeegan
Netservers Limited
01223 446000 ext 8103


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20520): https://lists.fd.io/g/vpp-dev/message/20520
Mute This Topic: https://lists.fd.io/mt/87095064/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Re: [vpp-dev] det44 plugin

2021-11-19 Thread Ben McKeegan

Hi Filip,

Thank you for providing the patch.  To confirm, I have now tested your 
patch and it does indeed fix the problem.


Regards,
Ben.

On 02/11/2021 17:46, Filip Varga via lists.fd.io wrote:

Hi Ben,

Thank you for pointing out the issue. Indeed it looks like the node runs just 
once. I will provide a patch shortly.

Best regards,
Filip Varga


-Original Message-
From: vpp-dev@lists.fd.io  On Behalf Of Ben McKeegan
Sent: Monday, November 1, 2021 7:24 PM
To: vpp-dev@lists.fd.io
Subject: [vpp-dev] det44 plugin

Hello,

I am fairly new to VPP so please bear with me.  I am trying to use the
det44 NAT plugin on 21.06 but I am experiencing some difficulties with
running out of ports.   It would appear that my det44 sessions are never
removed despite passing the expire time.   For example, I have the
following setting:

show det44 timeout
udp timeout: 300sec
tcp established timeout: 7440sec
tcp transitory timeout: 240sec
icmp timeout: 60sec

However, if I generate a series of ICMP pings from my test host and then run 'show 
det44 sessions' I have a session listed for every individual ping packet, as 
expected, but these remain long after the 60 second timeout configured.  For example 
on my last test I sent a flood of 100 pings which generated 100 sessions in the 
lists, all "state: icmp-active
expire" with expiry times ranging from 171 to 173.   I have just sent
another 100 pings and now have another 100 sessions with expiry times ranging 
from 2647 to 2650, and the original 100 sessions are still there still with 
expiry times from 171 to 173 so these have not been refreshed or expired.

I have taken a look at the source code of the plugin and I can that
det44_create_expire_walk_process() is called from det44_plugin_enable().
   This function appears to start a new vlib process with the 'main loop'
   function det44_expire_walk_fn().

According to the documentation here
https://docs.fd.io/vpp/21.06/dd/d64/vlib__process__doc_8h.html I understand 
these despatch functions should be implemented as a while (1) {} loop that 
never ends.  However, my reading of the
det44_expire_walk_fn() function code is that it will only perform a single walk 
of the det44 data structures before returning to its caller.

Is this a bug in det44_expire_walk_fn(), is the documentation wrong or
am I misreading it?   My hypothesis is that det44_expire_walk_fn() runs
just once, when the plugin is first enabled (and the session table is already 
empty), and does not get run again thereafter.  Therefore, the sessions never 
get expired.


Regards,
Ben.











--
Ben McKeegan
Netservers Limited
01223 446000 ext 8103


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20519): https://lists.fd.io/g/vpp-dev/message/20519
Mute This Topic: https://lists.fd.io/mt/86748366/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] Memory leak in map plugin

2021-11-16 Thread Ben McKeegan

Hello,

I have identified a memory leak the ip4_map function of 
src/plugins/ip4_map.c.  I am using the 21.10 release.


Enabling memory trace of the main-heap via the debug CLI and backtracing 
with gdb both point to all the leaked memory being allocated from the 
vec_add1(buffer0,pi0) macro at line 293 of ip4_map.c.   In tests it is 
leaking approximately 50 bytes for every packet passing through this 
function (invariant on packet size).


Here is an extract of the relevant code:

exit:
  /* Send fragments that were added in the frame */
  if (free_original_buffer0)
{
  vlib_buffer_free_one (vm, pi0);   /* Free original packet */
}
  else
{
  vec_add1 (buffer0, pi0);    leak is here on line 293
}

  frag_from0 = buffer0;
  frag_left0 = vec_len (buffer0);

  while (frag_left0 > 0)
{
  while (frag_left0 > 0 && n_left_to_next > 0)
{
  u32 i0;
  i0 = to_next[0] = frag_from0[0];
  frag_from0 += 1;
  frag_left0 -= 1;
  to_next += 1;
  n_left_to_next -= 1;

  vlib_get_buffer (vm, i0)->error =
error_node->errors[error0];
  vlib_validate_buffer_enqueue_x1 (vm, node, next_index,
   to_next, n_left_to_next,
   i0, next0);
}
  vlib_put_next_frame (vm, node, next_index, n_left_to_next);
  vlib_get_next_frame (vm, node, next_index, to_next,
   n_left_to_next);
}
  vec_reset_length (buffer0);
}
  vlib_put_next_frame (vm, node, next_index, n_left_to_next);


I must admit I do not fully understand exactly what this code is doing, 
but I am suspicious of the use of 'vec_reset_length' macro.   I have 
looked at the definition of this and it appears that although this sets 
the length of the vector back to zero (if the pointer is non-zero), it 
does not release any memory that may have been allocated.   Do we not 
need a call to 'vec_free' somewhere?


Regards,
Ben.

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20496): https://lists.fd.io/g/vpp-dev/message/20496
Mute This Topic: https://lists.fd.io/mt/87095064/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



Recommended NICs and drivers (was Re: [vpp-dev] About Risc-V Porting)

2021-11-09 Thread Ben McKeegan

Hi,

On 08/11/2021 19:15, Damjan Marion via lists.fd.io wrote:


and that VPP works better when DPDK is not involved.




Apologies for hijacking the thread but as a new VPP user who has spent 
many many hours reading through every scrap of documentation I can find, 
this statement comes as a surprise to me.   Unfortunately a lot of the 
VPP documentation seems a bit out of date.  Indeed, the documentation 
for most the recent release still says it 'Leverages best-of-breed open 
source driver technology: DPDK':


https://s3-docs.fd.io/vpp/21.10/whatisvpp/performance.html

So if DPDK isn't the best option any more, what do the people of this 
list recommend as a NIC and driver combination, and why? 
(Unfortunately, I don't have a huge budget at my disposal, so I'm 
looking for some best value-for-money options.)


Thanks,
Ben.

-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20460): https://lists.fd.io/g/vpp-dev/message/20460
Mute This Topic: https://lists.fd.io/mt/86928356/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-



[vpp-dev] det44 plugin

2021-11-01 Thread Ben McKeegan

Hello,

I am fairly new to VPP so please bear with me.  I am trying to use the 
det44 NAT plugin on 21.06 but I am experiencing some difficulties with 
running out of ports.   It would appear that my det44 sessions are never 
removed despite passing the expire time.   For example, I have the 
following setting:


show det44 timeout
udp timeout: 300sec
tcp established timeout: 7440sec
tcp transitory timeout: 240sec
icmp timeout: 60sec

However, if I generate a series of ICMP pings from my test host and then 
run 'show det44 sessions' I have a session listed for every individual 
ping packet, as expected, but these remain long after the 60 second 
timeout configured.  For example on my last test I sent a flood of 100 
pings which generated 100 sessions in the lists, all "state: icmp-active 
expire" with expiry times ranging from 171 to 173.   I have just sent 
another 100 pings and now have another 100 sessions with expiry times 
ranging from 2647 to 2650, and the original 100 sessions are still there 
still with expiry times from 171 to 173 so these have not been refreshed 
or expired.


I have taken a look at the source code of the plugin and I can that 
det44_create_expire_walk_process() is called from det44_plugin_enable(). 
 This function appears to start a new vlib process with the 'main loop' 
 function det44_expire_walk_fn().


According to the documentation here 
https://docs.fd.io/vpp/21.06/dd/d64/vlib__process__doc_8h.html I 
understand these despatch functions should be implemented as a while (1) 
{} loop that never ends.  However, my reading of the 
det44_expire_walk_fn() function code is that it will only perform a 
single walk of the det44 data structures before returning to its caller.


Is this a bug in det44_expire_walk_fn(), is the documentation wrong or 
am I misreading it?   My hypothesis is that det44_expire_walk_fn() runs 
just once, when the plugin is first enabled (and the session table is 
already empty), and does not get run again thereafter.  Therefore, the 
sessions never get expired.



Regards,
Ben.





-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#20402): https://lists.fd.io/g/vpp-dev/message/20402
Mute This Topic: https://lists.fd.io/mt/86748366/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-