Re: [vpp-dev] can't establish tcp connection with new introduced transport_endpoint_freelist

2023-03-14 Thread Florin Coras
Hi, 

Are you looking for behavior similar to what happens when random local ports are 
allocated, i.e., if the port is already in use we check whether the full 
5-tuple is still available? 

I don’t think we explicitly supported this before, but here’s a patch [1]. 
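Conceptually, a fixed local port should only be rejected when the full 5-tuple 
is taken; otherwise the tracked local endpoint can be reused and its refcnt 
bumped. Roughly along these lines (just a sketch, not the actual change in [1]; 
five_tuple_is_free () is a hypothetical stand-in for the session table lookup): 

static int
transport_endpoint_mark_used_sketch (u8 proto, ip46_address_t *ip, u16 port,
                                     transport_endpoint_t *rmt)
{
  transport_main_t *tm = &tp_main;
  local_endpoint_t *lep;
  u32 tei;

  tei = transport_endpoint_lookup (&tm->local_endpoints_table, proto, ip, port);
  if (tei != ENDPOINT_INVALID_INDEX)
    {
      /* lcl ip:port already tracked. Instead of failing outright, allow
       * reuse if the 5-tuple towards this particular remote is still free */
      if (!five_tuple_is_free (proto, ip, port, rmt))
        return SESSION_E_PORTINUSE;
      lep = pool_elt_at_index (tm->local_endpoints, tei);
      lep->refcnt += 1;
      return 0;
    }
  /* otherwise allocate and track the endpoint exactly as before */
  return 0;
}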

Regards,
Florin

[1] https://gerrit.fd.io/r/c/vpp/+/38486


> On Mar 14, 2023, at 12:56 AM, Zhang Dongya  wrote:
> 
> With this patch applied, the connection can be re-established after it is closed.
> 
> However, I found another possible bug when using the same local ip + local
> port towards different target servers: transport_endpoint_mark_used returns
> an error if it finds the local ip + port already allocated.
> 
> I think it should increase the refcnt instead if the 6-tuple is unique.
> 
>> static int
>> transport_endpoint_mark_used (u8 proto, ip46_address_t *ip, u16 port)
>> {
>>   transport_main_t *tm = &tp_main;
>>   local_endpoint_t *lep;
>>   u32 tei;
>> 
>>   ASSERT (vlib_get_thread_index () <= transport_cl_thread ());
>> 
>>   // BUG??? maybe should allow reuse ???
>>   tei =
>>     transport_endpoint_lookup (&tm->local_endpoints_table, proto, ip, port);
>>   if (tei != ENDPOINT_INVALID_INDEX)
>>     return SESSION_E_PORTINUSE;
>> 
>>   /* Pool reallocs with worker barrier */
>>   lep = transport_endpoint_alloc ();
>>   clib_memcpy_fast (&lep->ep.ip, ip, sizeof (*ip));
>>   lep->ep.port = port;
>>   lep->proto = proto;
>>   lep->refcnt = 1;
>> 
>>   transport_endpoint_table_add (&tm->local_endpoints_table, proto, &lep->ep,
>>                                 lep - tm->local_endpoints);
>> 
>>   return 0;
>> }
> 
> Florin Coras <fcoras.li...@gmail.com>
> wrote on Tue, Mar 14, 2023 at 11:38:
>> Hi, 
>> 
>> Could you try this out [1]? I’ve hit this issue myself today but with udp 
>> sessions. Unfortunately, as you’ve correctly pointed out, we were forcing a 
>> cleanup only on the non-fixed local port branch. 
>> 
>> Regards, 
>> Florin
>> 
>> [1] https://gerrit.fd.io/r/c/vpp/+/38473
>> 
>>> On Mar 13, 2023, at 7:35 PM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>>> 
>>> Hi list,
>>> 
>>> We have updated our code base with the upstream session changes and found
>>> a possible bug which causes tcp connections to no longer be established.
>>> 
>>> Our scenario is that we connect to a remote tcp server with a specified
>>> local port and local ip. However, the new vpp code has introduced a
>>> lcl_endpts_freelist which is only flushed when the number of pending local
>>> endpoints exceeds the limit (32) or when transport_alloc_local_port is
>>> called.
>>> 
>>> However, since we specify the local port and local ip and the total session
>>> count is limited (< 32), transport_cleanup_freelist is never called, so the
>>> endpoint used by the previous session with that local port and local ip is
>>> not released after the session is aborted.
>>> 
>>> I think we should also try to free the list in this case, as I did in the
>>> following code:
>>> 
>>>> int
>>>> transport_alloc_local_endpoint (u8 proto, transport_endpoint_cfg_t * rmt_cfg,
>>>>                                 ip46_address_t * lcl_addr, u16 * lcl_port)
>>>> {
>>>>   // ZDY:
>>>>   transport_main_t *tm = &tp_main;
>>>>   transport_endpoint_t *rmt = (transport_endpoint_t *) rmt_cfg;
>>>>   session_error_t error;
>>>>   int port;
>>>> 
>>>>   /*
>>>>    * Find the local address
>>>>    */
>>>>   if (ip_is_zero (&rmt_cfg->peer.ip, rmt_cfg->peer.is_ip4))
>>>>     {
>>>>       error = transport_find_local_ip_for_remote (&rmt_cfg->peer.sw_if_index,
>>>>                                                   rmt, lcl_addr);
>>>>       if (error)
>>>>         return error;
>>>>     }
>>>>   else
>>>>     {
>>>>       /* Assume session layer vetted this address */
>>>>       clib_memcpy_fast (lcl_addr, &rmt_cfg->peer.ip,
>>>>                         sizeof (rmt_cfg->peer.ip));
>>>>     }
>>>> 
>>>>   /*
>>>>    * Allocate source port
>>>>    */
>>>>   if (rmt_cfg->peer.port == 0)
>>>>     {
>>>>       port = transport_alloc_local_port (proto, lcl_addr, rmt_cfg);
>>>>       if (port < 1)
>>>>         return SESSION_E_NOPORT;
>>>>       *lcl_port = port;
>>>>     }
>>>>   else
>>>>     {
>>>>       port = clib_net_to_host_u16 (rmt_cfg->peer.port);
>>>>       *lcl_port = port;
>>>> 
>>>>       // ZDY: need to add this cleanup because in the specified src port
>>>>       // case we never reach transport_alloc_local_port, so the freelist
>>>>       // would only be freed when it is full (>32).
>>>>       /* Cleanup freelist if need be */
>>>>       if (vec_len (tm->lcl_endpts_freelist))
>>>>         transport_cleanup_freelist ();
>>>> 
>>>>       return transport_endpoint_mark_used (proto, lcl_addr, port);
>>>>     }
>>>> 
>>>>   return 0;
>>>> }
>>> 
>>> 
>>> 
>>> 
>> 
>> 
>> 
>> 
> 
> 





Re: [vpp-dev] #vnet A bug which may cause assertion error in vnet/session

2023-03-20 Thread Florin Coras
Hi, 

First of all, could you try this [1] with latest vpp? It’s really interesting 
that iperf does not exhibit this issue. 

Regarding your config, some observations:
- I see you have configured 4 workers. I would then recommend using 4 rx-queues 
and 5 tx-queues (main can also send packets), as opposed to 2. 
- tcp defaults to cubic, so that config can be omitted.
- evt_qs_memfd_seg is now deprecated, so it can be omitted as well
- any particular reason for "set interface rx-mode eth1 polling”? dpdk 
interfaces are in polling mode by default
- you’re using the binary api socket "api-socket-name /run/vpp/api.sock”. That 
works, but going forward we’ll slowly deprecate that api, so I’d recommend 
using the app socket api. See for instance [2] for changes needed to the 
session stanza and vcl. 
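For reference, the [2] changes boil down to something like this (the socket 
path below is the default used in that guide and may differ on your setup): 

  # startup.conf
  session { enable use-app-socket-api }

  # vcl.conf
  vcl {
    app-socket-api /var/run/vpp/app_ns_sockets/default
  }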

Regards,
Florin

[1] https://gerrit.fd.io/r/c/vpp/+/38529
[2] https://wiki.fd.io/view/VPP/HostStack/LDP/iperf


> On Mar 20, 2023, at 5:50 AM, Chen Weihao  wrote:
> 
> Thanks for your reply.
> I gave a more detailed backtrace and config in 
> https://lists.fd.io/g/vpp-dev/message/22731.
> My installation method is to clone vpp from github and make build on Ubuntu 
> 22.04 (kernel version 5.19), and I use make run for testing and make debug 
> for debugging. Yes, I tried to make the server and client attach to the same 
> vpp instance. I tried the latest version of vpp on github yesterday and the 
> problem still exists.
> I am looking forward to your reply.
> 
> 
> 
> 





Re: [vpp-dev] Sigabrt in tcp46_input_inline for tcp_lookup_is_valid

2023-03-20 Thread Florin Coras
Hi, 

That last thing is pretty interesting. It’s either the issue fixed by this 
patch [1] or sessions are somehow cleaned up multiple times. If it’s the 
latter, I’d really like to understand how that happens. 

Regards,
Florin

[1] https://gerrit.fd.io/r/c/vpp/+/38507 

> On Mar 20, 2023, at 6:52 PM, Zhang Dongya  wrote:
> 
> Hi,
> 
> After merging this patch and updating the test environment, the issue still 
> persists.
> 
> Let me clarify my client app config:
> 1. register a reset callback, which calls vnet_disconnect there and also 
> triggers a reconnect by sending an event to the ctrl process.
> 2. register a connected callback, which handles connect errors by triggering 
> a reconnect; on success it records the session handle and extracts the tcp 
> sequence for our app's usage.
> 3. register a disconnect callback, which basically does the same as the reset 
> callback.
> 4. register a cleanup callback and an accept callback, which basically keep 
> the session layer happy without any actual work to do.
> 
> There is a ctrl process on the master thread, which handles reconnects 
> periodically or when triggered by an event.
> 
> BTW, I also frequently see the warning 'session %u hash delete rv -3' in 
> session_delete in my environment; hope this helps the investigation.
> 
> Florin Coras <fcoras.li...@gmail.com>
> wrote on Mon, Mar 20, 2023 at 23:29:
>> Hi, 
>> 
>> Understood and yes, connect will synchronously fail if port is not 
>> available, so you should be able to retry it later. 
>> 
>> Regards, 
>> Florin
>> 
>>> On Mar 20, 2023, at 1:58 AM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> It seems the issue occurs when disconnect is called, because our network
>>> can't guarantee a tcp connection won't be reset even after the 3-way
>>> handshake completes (firewall issue :( ).
>>> 
>>> When we detect the app layer timeout, we first disconnect (because we record
>>> the session handle, this session might be a half-open session). Does the
>>> vnet session layer guarantee that, if we reconnect from the master thread
>>> while the half-open session has not yet been released (due to asynchronous
>>> logic), the reconnect fails? If so, we can retry the connect later.
>>> 
>>> I prefer not to register a half-open callback because I think it makes the
>>> app complicated from a TCP programming perspective.
>>> 
>>> For your patch, I think it should work: I can't delete the half-open session
>>> immediately because a worker is configured, so the half-open will only be
>>> removed from the bihash when syn retransmission times out. I have merged
>>> the patch and will provide feedback later.
>>> 
>>> Florin Coras <fcoras.li...@gmail.com>
>>> wrote on Mon, Mar 20, 2023 at 13:09:
>>>> Hi, 
>>>> 
>>>> Inline.
>>>> 
>>>>> On Mar 19, 2023, at 6:47 PM, Zhang Dongya >>>> <mailto:fortitude.zh...@gmail.com>> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> It can be aborted both in established state or half open state because I 
>>>>> will do timeout in our app layer. 
>>>> 
>>>> [fc] Okay! Is the issue present irrespective of the state of the session 
>>>> or does it happen only after a disconnect in half-open state? More below. 
>>>> 
>>>>> 
>>>>> Regarding your question,
>>>>> 
>>>>> - Yes we add a builtin in app relys on C apis that  mainly use 
>>>>> vnet_connect/disconnect to connect or disconnect session.
>>>> 
>>>> [fc] Understood
>>>> 
>>>>> - We call these api in a vpp ctrl process which should be running on the 
>>>>> master thread, we never do session setup/teardown on worker thread. (the 
>>>>> environment that found this issue is configured with 1 master + 1 worker 
>>>>> setup.)
>>>> 
>>>> [fc] With vpp latest it’s possible to connect from first workers. It’s an 
>>>> optimization meant to avoid 1) worker barrier on syns and 2) entering poll 
>>>> mode on main (consume less cpu)
>>>> 
>>>>> - We started to develop the app using 22.06 and I keep to merge upstream 
>>>>> changes to latest vpp by cherry-picking. The reason for line mismatch is 
>>>>> that I added some comment to the session layer code, it should be equal 
>>>>> to the master branch now.
>>>> 
>>>> [fc] Ack
&

Re: [vpp-dev] Sigabrt in tcp46_input_inline for tcp_lookup_is_valid

2023-03-23 Thread Florin Coras
Hi Zhang, 

Thanks for confirming! Give me a few more days to check if there’s any other 
improvements to be made in that area. 

Regards,
Florin 

> On Mar 23, 2023, at 12:00 AM, Zhang Dongya  wrote:
> 
> Hi,
> 
> The new patch works as expected; no assert-triggered abort anymore.
> 
> Really appreciate your help and thanks a lot.
> 
> Florin Coras <fcoras.li...@gmail.com>
> wrote on Wed, Mar 22, 2023 at 11:54:
>> Hi Zhang, 
>> 
>> Awesome! Thanks!
>> 
>> Regards,
>> Florin
>> 
>>> On Mar 21, 2023, at 7:41 PM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>>> 
>>> Hi Florin,
>>> 
>>> Thanks a lot, the previous patch with reset disabled has been running for 
>>> 1 day without issue.
>>> 
>>> I will enable reset with your new patch and provide feedback later.
>>> 
>>> Florin Coras <fcoras.li...@gmail.com>
>>> wrote on Wed, Mar 22, 2023 at 02:12:
>>>> Hi, 
>>>> 
>>>> Okay, resetting of half-opens definitely not supported. I updated the 
>>>> patch to just clean them up on forced reset, without sending a reset to 
>>>> make sure session lookup table cleanup still happens. 
>>>> 
>>>> Regards,
>>>> Florin
>>>> 
>>>>> On Mar 20, 2023, at 9:13 PM, Zhang Dongya >>>> <mailto:fortitude.zh...@gmail.com>> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> After review my code, I found that I have add a flag to the 
>>>>> vnet_disconnect API which will call session_reset instead of 
>>>>> session_close, the reason I do this is to make intermediate firewall just 
>>>>> flush the state and reconstruct if I later reconnect.
>>>>> 
>>>>> It seems in session_reset logic, for half open session, it also missing 
>>>>> to remove the session from the lookup hash which may cause the issue too.
>>>>> 
>>>>> I change my code and will test with your patch along, will provide 
>>>>> feedback later.
>>>>> 
>>>>> I also noticed the bihash issue discussed in the list recently, I will 
>>>>> merge later.
>>>>> 
>>>>> Florin Coras <fcoras.li...@gmail.com>
>>>>> wrote on Tue, Mar 21, 2023 at 11:56:
>>>>>> Hi, 
>>>>>> 
>>>>>> That last thing is pretty interesting. It’s either the issue fixed by 
>>>>>> this patch [1] or sessions are somehow cleaned up multiple times. If 
>>>>>> it’s the latter, I’d really like to understand how that happens. 
>>>>>> 
>>>>>> Regards,
>>>>>> Florin
>>>>>> 
>>>>>> [1] https://gerrit.fd.io/r/c/vpp/+/38507 
>>>>>> 
>>>>>>> On Mar 20, 2023, at 6:52 PM, Zhang Dongya >>>>>> <mailto:fortitude.zh...@gmail.com>> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> After merge this patch and update the test environment, the issue still 
>>>>>>> persists.
>>>>>>> 
>>>>>>> Let me clear my client app config:
>>>>>>> 1. register a reset callback, which will call vnet_disconnect there and 
>>>>>>> also trigger reconnect by send event to the ctrl process.)
>>>>>>> 2. register a connected callback, which will handle connect err by 
>>>>>>> trigger reconnect, on success, it will record session handle and 
>>>>>>> extract tcp sequence for our app usage.
>>>>>>> 3. register a disconnect callback, which basically do same as reset 
>>>>>>> callback.
>>>>>>> 4. register a cleanup callback and accept callback, which basically 
>>>>>>> make the session layer happy without actually relevant work to do.
>>>>>>> 
>>>>>>> There is a ctrl process in mater, which will handle periodically 
>>>>>>> reconnect or triggered by event.
>>>>>>> 
>>>>>>> BTW, I also see frequently warning 'session %u hash delete rv -3' in 
>>>>>>> session_delete in my environment, hope this helps to investigate.
>>>>>>> 
>>>>>>> Florin Coras <fcoras.li...@gmail.com>
>>>>>>> wrote on Mar 20, 2023

Re: COMMERCIAL BULK: Re: [E] COMMERCIAL BULK: [vpp-dev] TLS app stuck after burst of traffic #vpp-hoststack

2023-03-07 Thread Florin Coras
Hi Kevin, 

Understood. If you manage to confirm things are stable on latest vpp, maybe try 
to backport patches that changed tls tx functions. Worst case, you’ll have to 
backport some session layer patches as well. 

Regards,
Florin

> On Mar 7, 2023, at 5:24 AM, Kevin Yan  wrote:
> 
> Hi Florin,
> For various reasons we need to stay on vpp 20.09 for some time; that's why I 
> hope I can fix the issue on this version. Actually, I did make some code 
> changes in the TLS layer to have the session node force-reschedule the TLS 
> session when the tx svm fifo is full or at least not empty. After the changes 
> the tx svm fifo recovers after stopping the traffic, but I think this is not 
> the correct way to solve the issue.
>  
> Anyway, let me see if I can test the same case with the latest vpp code.
>  
> BRs,
> Kevin
>  
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Florin Coras
> Sent: Tuesday, March 7, 2023 2:49 PM
> To: vpp-dev <vpp-dev@lists.fd.io>
> Cc: Olivia Dunham <theoliviadun...@gmail.com>
> Subject: COMMERCIAL BULK: Re: [E] COMMERCIAL BULK: [vpp-dev] TLS app stuck 
> after burst of traffic #vpp-hoststack
>  
> Hi Kevin,  
>  
> That’s a really old version of vpp. TLS has seen several improvements since 
> then in areas including scheduling after incomplete writes. If you get a 
> chance to test vpp latest or a more recent release, do let us know if the 
> issue still persists. 
>  
> Regards,
> Florin
> 
> 
> On Mar 6, 2023, at 5:42 PM, Kevin Yan via lists.fd.io
> <kevin.yan=mavenir@lists.fd.io> wrote:
>  
> Hi @Olivia Dunham <theoliviadun...@gmail.com>,
> Recently I hit the exact same issue: the TLS TX svm fifo gets full after a 
> burst of traffic and never resumes, while the TCP TX svm fifo is empty at 
> that time. I’m using VPP 20.09 and I believe there is some issue in the TLS 
> layer, so did you fix the issue later? If yes, can you share the solution?
>  
> BRs,
> Kevin
>  
> From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> On Behalf Of Olivia Dunham
> Sent: Tuesday, September 14, 2021 8:58 PM
> To: vpp-dev@lists.fd.io
> Subject: [E] COMMERCIAL BULK: [vpp-dev] TLS app stuck after burst of traffic 
> #vpp-hoststack
>  
> During sudden burst of traffic, the TCP fifo gets full. When this happens the 
> openssl TLS app de-schedules the transport. But once the TCP data is sent 
> out, the TLS is not resuming. VPP ends up in a state where TCP fifo is empty, 
> but the TLS fifo is full and no more Tx happens on TLS fifo.
> 
> VPP version: 21.01
> 
> We came across this commit - session tls: deq notifications for custom tx 
> <https://github.com/FDio/vpp/commit/1e6a0f64653c8142fa7032aba127ab4894bafc3c>
> Not sure what is the issue fixed by this commit, but It doesn't seem to fix 
> the above mentioned issue.

Re: [vpp-dev] Sigabrt in tcp46_input_inline for tcp_lookup_is_valid

2023-03-21 Thread Florin Coras
Hi Zhang, 

Awesome! Thanks!

Regards,
Florin

> On Mar 21, 2023, at 7:41 PM, Zhang Dongya  wrote:
> 
> Hi Florin,
> 
> Thanks a lot, the previous patch with reset disabled has been running for 1 
> day without issue.
> 
> I will enable reset with your new patch and provide feedback later.
> 
> Florin Coras <fcoras.li...@gmail.com>
> wrote on Wed, Mar 22, 2023 at 02:12:
>> Hi, 
>> 
>> Okay, resetting of half-opens definitely not supported. I updated the patch 
>> to just clean them up on forced reset, without sending a reset to make sure 
>> session lookup table cleanup still happens. 
>> 
>> Regards,
>> Florin
>> 
>>> On Mar 20, 2023, at 9:13 PM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> After review my code, I found that I have add a flag to the vnet_disconnect 
>>> API which will call session_reset instead of session_close, the reason I do 
>>> this is to make intermediate firewall just flush the state and reconstruct 
>>> if I later reconnect.
>>> 
>>> It seems in session_reset logic, for half open session, it also missing to 
>>> remove the session from the lookup hash which may cause the issue too.
>>> 
>>> I change my code and will test with your patch along, will provide feedback 
>>> later.
>>> 
>>> I also noticed the bihash issue discussed in the list recently, I will 
>>> merge later.
>>> 
>>> Florin Coras <fcoras.li...@gmail.com>
>>> wrote on Tue, Mar 21, 2023 at 11:56:
>>>> Hi, 
>>>> 
>>>> That last thing is pretty interesting. It’s either the issue fixed by this 
>>>> patch [1] or sessions are somehow cleaned up multiple times. If it’s the 
>>>> latter, I’d really like to understand how that happens. 
>>>> 
>>>> Regards,
>>>> Florin
>>>> 
>>>> [1] https://gerrit.fd.io/r/c/vpp/+/38507 
>>>> 
>>>>> On Mar 20, 2023, at 6:52 PM, Zhang Dongya >>>> <mailto:fortitude.zh...@gmail.com>> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> After merge this patch and update the test environment, the issue still 
>>>>> persists.
>>>>> 
>>>>> Let me clear my client app config:
>>>>> 1. register a reset callback, which will call vnet_disconnect there and 
>>>>> also trigger reconnect by send event to the ctrl process.)
>>>>> 2. register a connected callback, which will handle connect err by 
>>>>> trigger reconnect, on success, it will record session handle and extract 
>>>>> tcp sequence for our app usage.
>>>>> 3. register a disconnect callback, which basically do same as reset 
>>>>> callback.
>>>>> 4. register a cleanup callback and accept callback, which basically make 
>>>>> the session layer happy without actually relevant work to do.
>>>>> 
>>>>> There is a ctrl process in mater, which will handle periodically 
>>>>> reconnect or triggered by event.
>>>>> 
>>>>> BTW, I also see frequently warning 'session %u hash delete rv -3' in 
>>>>> session_delete in my environment, hope this helps to investigate.
>>>>> 
>>>>> Florin Coras <fcoras.li...@gmail.com>
>>>>> wrote on Mon, Mar 20, 2023 at 23:29:
>>>>>> Hi, 
>>>>>> 
>>>>>> Understood and yes, connect will synchronously fail if port is not 
>>>>>> available, so you should be able to retry it later. 
>>>>>> 
>>>>>> Regards, 
>>>>>> Florin
>>>>>> 
>>>>>>> On Mar 20, 2023, at 1:58 AM, Zhang Dongya >>>>>> <mailto:fortitude.zh...@gmail.com>> wrote:
>>>>>>> 
>>>>>>> Hi,
>>>>>>> 
>>>>>>> It seems the issue occurs when there are disconnect called because our 
>>>>>>> network can't guarantee a tcp can't be reset even when 3 ways handshake 
>>>>>>> is completed (firewall issue :( ).
>>>>>>> 
>>>>>>> When we find the app layer timeout, we will first disconnect (because 
>>>>>>> we record the session handle, this session might be a half open 
>>>>>>> session), does vnet session layer guarantee that if we reconn

Re: [vpp-dev] Sigabrt in tcp46_input_inline for tcp_lookup_is_valid

2023-03-21 Thread Florin Coras
Hi, 

Okay, resetting of half-opens is definitely not supported. I updated the patch 
to just clean them up on a forced reset, without sending a reset, so that the 
session lookup table cleanup still happens. 

Regards,
Florin

> On Mar 20, 2023, at 9:13 PM, Zhang Dongya  wrote:
> 
> Hi,
> 
> After reviewing my code, I found that I had added a flag to the vnet_disconnect 
> API which calls session_reset instead of session_close; the reason I do this 
> is to make an intermediate firewall just flush its state and rebuild it if I 
> later reconnect.
> 
> It seems that in the session_reset logic, for a half-open session, removing 
> the session from the lookup hash is also missing, which may cause the issue 
> too.
> 
> I changed my code and will test along with your patch; will provide feedback 
> later.
> 
> I also noticed the bihash issue discussed on the list recently; I will merge 
> that later.
> 
> Florin Coras <fcoras.li...@gmail.com>
> wrote on Tue, Mar 21, 2023 at 11:56:
>> Hi, 
>> 
>> That last thing is pretty interesting. It’s either the issue fixed by this 
>> patch [1] or sessions are somehow cleaned up multiple times. If it’s the 
>> latter, I’d really like to understand how that happens. 
>> 
>> Regards,
>> Florin
>> 
>> [1] https://gerrit.fd.io/r/c/vpp/+/38507 
>> 
>>> On Mar 20, 2023, at 6:52 PM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> After merge this patch and update the test environment, the issue still 
>>> persists.
>>> 
>>> Let me clear my client app config:
>>> 1. register a reset callback, which will call vnet_disconnect there and 
>>> also trigger reconnect by send event to the ctrl process.)
>>> 2. register a connected callback, which will handle connect err by trigger 
>>> reconnect, on success, it will record session handle and extract tcp 
>>> sequence for our app usage.
>>> 3. register a disconnect callback, which basically do same as reset 
>>> callback.
>>> 4. register a cleanup callback and accept callback, which basically make 
>>> the session layer happy without actually relevant work to do.
>>> 
>>> There is a ctrl process in mater, which will handle periodically reconnect 
>>> or triggered by event.
>>> 
>>> BTW, I also see frequently warning 'session %u hash delete rv -3' in 
>>> session_delete in my environment, hope this helps to investigate.
>>> 
>>> Florin Coras <fcoras.li...@gmail.com>
>>> wrote on Mon, Mar 20, 2023 at 23:29:
>>>> Hi, 
>>>> 
>>>> Understood and yes, connect will synchronously fail if port is not 
>>>> available, so you should be able to retry it later. 
>>>> 
>>>> Regards, 
>>>> Florin
>>>> 
>>>>> On Mar 20, 2023, at 1:58 AM, Zhang Dongya >>>> <mailto:fortitude.zh...@gmail.com>> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> It seems the issue occurs when there are disconnect called because our 
>>>>> network can't guarantee a tcp can't be reset even when 3 ways handshake 
>>>>> is completed (firewall issue :( ).
>>>>> 
>>>>> When we find the app layer timeout, we will first disconnect (because we 
>>>>> record the session handle, this session might be a half open session), 
>>>>> does vnet session layer guarantee that if we reconnect from master thread 
>>>>> when the half open session still not be released yet (due to asynchronous 
>>>>> logic) that the reconnect fail? if then we can retry connect later.
>>>>> 
>>>>> I prefer to not registered half open callback because I think it make app 
>>>>> complicated from a TCP programming prospective.
>>>>> 
>>>>> For your patch, I think it should be work because I can't delete the half 
>>>>> open session immediately because there is worker configured, so the half 
>>>>> open will be removed from bihash when syn retrans timeout. I have merged 
>>>>> the patch and will provide feedback later.
>>>>> 
>>>>> Florin Coras <fcoras.li...@gmail.com>
>>>>> wrote on Mon, Mar 20, 2023 at 13:09:
>>>>>> Hi, 
>>>>>> 
>>>>>> Inline.
>>>>>> 
>>>>>>> On Mar 19, 2023, at 6:47 PM, Zhang Dongya >>>>>> <mailto:fortitude.zh...@gmail.com>> wrote:
>>>>>&

Re: [vpp-dev] #vnet A bug which may cause assertion error in vnet/session

2023-03-21 Thread Florin Coras
Hi, 

The problem seems to be that you’re using a vmxnet3 interface, so I suspect 
this might be a vm configuration issue. Your current config should work but 
could end up being inefficient. 

With respect to your problem, I just built redis and ran redis-server and cli 
over LDP. Everything seems to be working fine so I’m assuming you’re doing some 
stress tests of redis? Could you provide more info about your client? 

Regards,
Florin

> On Mar 21, 2023, at 6:18 AM, Chen Weihao  wrote:
> 
> Thank you for your reply.
> I tried to change num-tx-queues from 2 to 5, but it got a SIGSEGV, the 
> backtrace is:
> #0  0x7fffb453ff89 in rte_write32_relaxed (addr=0x80007ef0, value=0)
> at ../src-dpdk/lib/eal/include/generic/rte_io.h:310
> #1  rte_write32 (addr=0x80007ef0, value=0)
> at ../src-dpdk/lib/eal/include/generic/rte_io.h:373
> #2  vmxnet3_enable_intr (hw=0xac03b0600, intr_idx=4294967262)
> at ../src-dpdk/drivers/net/vmxnet3/vmxnet3_ethdev.c:210
> #3  0x7fffb4544d35 in vmxnet3_dev_rx_queue_intr_enable (
> dev=0x7fffb5186980 , queue_id=0)
> at ../src-dpdk/drivers/net/vmxnet3/vmxnet3_ethdev.c:1815
> #4  0x7fffaff4bbf2 in rte_eth_dev_rx_intr_enable (port_id=0, queue_id=0)
> at ../src-dpdk/lib/ethdev/rte_ethdev.c:4740
> #5  0x7fffb49f4564 in dpdk_setup_interrupts (xd=0x7fffbdbb2940)
> at /home/chenweihao/vpp_dev/src/plugins/dpdk/device/common.c:336
> #6  0x7fffb49f4430 in dpdk_device_start (xd=0x7fffbdbb2940)
> at /home/chenweihao/vpp_dev/src/plugins/dpdk/device/common.c:411
> #7  0x7fffb49ff713 in dpdk_interface_admin_up_down (
> vnm=0x77e2b828 , hw_if_index=1, flags=1)
> at /home/chenweihao/vpp_dev/src/plugins/dpdk/device/device.c:476
> #8  0x770d60e8 in vnet_sw_interface_set_flags_helper (
> vnm=0x77e2b828 , sw_if_index=1, 
> flags=VNET_SW_INTERFACE_FLAG_ADMIN_UP, helper_flags=0)
> at /home/chenweihao/vpp_dev/src/vnet/interface.c:470
> #9  0x770d645a in vnet_sw_interface_set_flags (
> vnm=0x77e2b828 , sw_if_index=1, 
> flags=VNET_SW_INTERFACE_FLAG_ADMIN_UP)
> at /home/chenweihao/vpp_dev/src/vnet/interface.c:524
> #10 0x7710515f in set_state (vm=0x7fffb6a00740, input=0x7fffa9f84bb8, 
> cmd=0x7fffb7180850)
> at /home/chenweihao/vpp_dev/src/vnet/interface_cli.c:946
> #11 0x77e72257 in vlib_cli_dispatch_sub_commands (vm=0x7fffb6a00740, 
> cm=0x77f6a770 , input=0x7fffa9f84bb8, 
> parent_command_index=20) at /home/chenweihao/vpp_dev/src/vlib/cli.c:650
> #12 0x77e71fea in vlib_cli_dispatch_sub_commands (vm=0x7fffb6a00740, 
> cm=0x77f6a770 , input=0x7fffa9f84bb8, 
> parent_command_index=7) at /home/chenweihao/vpp_dev/src/vlib/cli.c:607
> #13 0x77e71fea in vlib_cli_dispatch_sub_commands (vm=0x7fffb6a00740, 
> cm=0x77f6a770 , input=0x7fffa9f84bb8, 
> parent_command_index=0) at /home/chenweihao/vpp_dev/src/vlib/cli.c:607
> #14 0x77e7122a in vlib_cli_input (vm=0x7fffb6a00740, 
> input=0x7fffa9f84bb8, function=0x0, function_arg=0)
> at /home/chenweihao/vpp_dev/src/vlib/cli.c:753
> #15 0x77ef7e23 in unix_cli_exec (vm=0x7fffb6a00740, 
> input=0x7fffa9f84f30, cmd=0x7fffb71815b8)
> at /home/chenweihao/vpp_dev/src/vlib/unix/cli.c:3431
> #16 0x77e72257 in vlib_cli_dispatch_sub_commands (vm=0x7fffb6a00740, 
> cm=0x77f6a770 , input=0x7fffa9f84f30, 
> --Type  for more, q to quit, c to continue without paging--
> parent_command_index=0) at /home/chenweihao/vpp_dev/src/vlib/cli.c:650
> #17 0x77e7122a in vlib_cli_input (vm=0x7fffb6a00740, 
> input=0x7fffa9f84f30, function=0x0, function_arg=0)
> at /home/chenweihao/vpp_dev/src/vlib/cli.c:753
> #18 0x77efdfc5 in startup_config_process (vm=0x7fffb6a00740, 
> rt=0x7fffb9194080, f=0x0)
> at /home/chenweihao/vpp_dev/src/vlib/unix/main.c:291
> #19 0x77ea2c5d in vlib_process_bootstrap (_a=140736084405176)
> at /home/chenweihao/vpp_dev/src/vlib/main.c:1221
> #20 0x76f1ffd8 in clib_calljmp ()
> at /home/chenweihao/vpp_dev/src/vppinfra/longjmp.S:123
> #21 0x7fffac516bb0 in ?? ()
> #22 0x77ea26f9 in vlib_process_startup (vm=0x8, 
> p=0x77ea53bb , f=0x7fffac516cc0)
> at /home/chenweihao/vpp_dev/src/vlib/main.c:1246
> #23 0x76f7aa1c in vec_mem_size (v=0x7fffb6a00740)
> at /home/chenweihao/vpp_dev/src/vppinfra/vec.c:15
> #24 0x0581655dfd1c in ?? ()
> #25 0x00330004 in ?? ()
> #26 0x0030 in ?? ()
> #27 0x7fffbdbc7240 in ?? ()
> #28 0x7fffbdbc7240 in ?? ()
> #29 0x7fffb80e5498 in ?? ()
> #30 0x0001 in ?? ()
> #31 0x in ?? ()
> 
> I tried changing num-rx-queues and num-tx-queues to 4, and then the SIGSEGV 
> did not happen.
> I applied the patch https://gerrit.fd.io/r/c/vpp/+/38529, and the problem with 
> redis 6.0 still seems to exist; the stack backtrace is the same as 
> 

Re: [vpp-dev] Sigabrt in tcp46_input_inline for tcp_lookup_is_valid

2023-03-20 Thread Florin Coras
Hi, 

Understood and yes, connect will synchronously fail if port is not available, 
so you should be able to retry it later. 
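Roughly, for a builtin app using the C apis, the retry flow could look like 
this (sketch only; schedule_reconnect () is an app-specific, hypothetical 
helper): 

static void
try_connect (vnet_connect_args_t *a)
{
  int rv = vnet_connect (a);
  if (rv == SESSION_E_PORTINUSE)
    {
      /* 5-tuple still held, e.g., by a not yet cleaned up half-open;
       * retry later from the ctrl process */
      schedule_reconnect (a);
      return;
    }
  /* success and other errors are reported via the connected callback */
}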

Regards, 
Florin

> On Mar 20, 2023, at 1:58 AM, Zhang Dongya  wrote:
> 
> Hi,
> 
> It seems the issue occurs when disconnect is called, because our network 
> can't guarantee a tcp connection won't be reset even after the 3-way 
> handshake completes (firewall issue :( ).
> 
> When we detect the app layer timeout, we first disconnect (because we record 
> the session handle, this session might be a half-open session). Does the vnet 
> session layer guarantee that, if we reconnect from the master thread while 
> the half-open session has not yet been released (due to asynchronous logic), 
> the reconnect fails? If so, we can retry the connect later.
> 
> I prefer not to register a half-open callback because I think it makes the 
> app complicated from a TCP programming perspective.
> 
> For your patch, I think it should work: I can't delete the half-open session 
> immediately because a worker is configured, so the half-open will only be 
> removed from the bihash when syn retransmission times out. I have merged the 
> patch and will provide feedback later.
> 
> Florin Coras <fcoras.li...@gmail.com>
> wrote on Mon, Mar 20, 2023 at 13:09:
>> Hi, 
>> 
>> Inline.
>> 
>>> On Mar 19, 2023, at 6:47 PM, Zhang Dongya <fortitude.zh...@gmail.com> wrote:
>>> 
>>> Hi,
>>> 
>>> It can be aborted in either the established or the half-open state because I 
>>> implement the timeout in our app layer. 
>> 
>> [fc] Okay! Is the issue present irrespective of the state of the session or 
>> does it happen only after a disconnect in half-open state? More below. 
>> 
>>> 
>>> Regarding your question,
>>> 
>>> - Yes, we added a builtin app that relies on the C apis, mainly using 
>>> vnet_connect/disconnect to connect or disconnect sessions.
>> 
>> [fc] Understood
>> 
>>> - We call these apis in a vpp ctrl process which should be running on the 
>>> master thread; we never do session setup/teardown on a worker thread. (The 
>>> environment that exhibits this issue is configured with a 1 master + 1 worker 
>>> setup.)
>> 
>> [fc] With vpp latest it’s possible to connect from the first worker. It’s an 
>> optimization meant to avoid 1) the worker barrier on syns and 2) entering poll 
>> mode on main (to consume less cpu)
>> 
>>> - We started to develop the app using 22.06 and I keep merging upstream 
>>> changes from latest vpp by cherry-picking. The reason for the line mismatch 
>>> is that I added some comments to the session layer code; it should be equal 
>>> to the master branch now.
>> 
>> [fc] Ack
>> 
>>> 
>>> When reading the code I understand that we mainly want to clean up the 
>>> half-open from the bihash in session_stream_connect_notify. However, in 
>>> syn-sent state I may choose to close the session, i.e., it might be closed 
>>> by my app due to a session setup timeout (on a seconds scale); in that case 
>>> the session is marked as half_open_done and the half-open session is freed 
>>> shortly afterwards in the ctrl thread (the 1st worker?).
>> 
>> [fc] Actually, this might be the issue. We did start to provide a half-open 
>> session handle to apps which if closed does clean up the session but 
>> apparently it is missing the cleanup of the session lookup table. Could you 
>> try this patch [1]? It might need additional work.
>> 
>> Having said that, forcing a close/cleanup will not free the port 
>> synchronously. So, if you’re using fixed ports, you’ll have to wait for the 
>> half-open cleanup notification.
>> 
>>> 
>>> Should I also register a half-open callback, or is there some other reason 
>>> that leads to this failure?
>>> 
>> 
>> [fc] Yes, see above.
>> 
>> Regards, 
>> Florin
>> 
>> [1] https://gerrit.fd.io/r/c/vpp/+/38526
>> 
>>> 
>>> Florin Coras <fcoras.li...@gmail.com>
>>> wrote on Mon, Mar 20, 2023 at 06:22:
>>>> Hi, 
>>>> 
>>>> When you abort the connection, is it fully established or half-open? 
>>>> Half-opens are cleaned up by the owner thread after a timeout, but the 
>>>> 5-tuple should be assigned to the fully established session by that point. 
>>>> tcp_half_open_connection_cleanup does not clean up the bihash; instead, 
>>>> session_stream_connect_notify does once tcp connect returns either success 
>>>> or failure. 
>>>

Re: [vpp-dev] Delete node process

2023-02-06 Thread Florin Coras
Hi, 

You can disable a node using vlib_node_set_state. There’s no api to unregister 
a node. 
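For example, something along these lines (the node name is just a 
placeholder): 

#include <vlib/vlib.h>

static clib_error_t *
disable_my_node (vlib_main_t *vm)
{
  /* look up the node by name and flip its state; it stays registered */
  vlib_node_t *n = vlib_get_node_by_name (vm, (u8 *) "my-process");
  if (!n)
    return clib_error_return (0, "node not found");
  vlib_node_set_state (vm, n->index, VLIB_NODE_STATE_DISABLED);
  return 0;
}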

Regards,
Florin

> On Feb 6, 2023, at 12:00 PM, amine belroul  wrote:
> 
> Hello, 
> How can I delete a process node from the vpp runtime?
> For now I can make it just done, but not deleted.
> 
> 
> Thank you. 
> 
> 
> 





Re: [vpp-dev] Issue related to VCL Session Migration

2023-02-07 Thread Florin Coras
Hi Vivek, Aslam, 

That’s an interesting use case. We typically recommend using VCL natively and 
only if that’s not possible use LDP, which implies VLS locking. We haven’t had 
many VLS native integration efforts.

Coming back to your problem, any particular reason why you’re not registering 
all your app’s pthreads as vcl workers and have them listen on the same 
ip:port? Then vpp would just distribute incoming connections to all workers 
which can in parallel handle tls establishment and io.
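Roughly, assuming the vppcom api names from vcl/vppcom.h and leaving the 
listener setup out, that model looks like: 

#include <vcl/vppcom.h>
#include <pthread.h>

static void *
worker_fn (void *arg)
{
  (void) arg;

  /* register this pthread as its own vcl worker */
  vppcom_worker_register ();

  /* each worker then creates its own listener on the shared ip:port
   * (vppcom_session_create / _bind / _listen) and runs its own epoll
   * loop; vpp distributes incoming connections across the workers */
  return 0;
}

int
main (int argc, char **argv)
{
  pthread_t workers[4];
  vppcom_app_create ("tls-app");
  for (int i = 0; i < 4; i++)
    pthread_create (&workers[i], 0, worker_fn, 0);
  for (int i = 0; i < 4; i++)
    pthread_join (workers[i], 0);
  vppcom_app_destroy ();
  return 0;
}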

If that’s not possible, you’re doing the right thing by using VLS. I’m assuming 
you’re creating those epoll sessions within the pthreads/workers? If that’s not 
the case, and you have sessions added to those epfds, you should get a "can't 
migrate nonempty epoll session” error. 

If on the other hand you accept sessions on only one worker, and then hand them 
off to other workers, you’ll need to change ownership/share those sessions in 
vpp as well, i.e., modify vls_clone_and_share_rpc_handler and have it call 
vls_init_share_session (which today it is not if mt workers is on). This is the 
least efficient option as it involves a lot of work per each session.  

Regards,
Florin

> On Feb 7, 2023, at 12:54 PM, Vivek Gupta  wrote:
> 
> Hi,
>  
> We have SSL based application using VCL. 
>  
> - Currently, we have one thread which does the epoll, TLS Session 
> establishment, Read/Write for multiple tunnels.
>  
> - Now we want to split the functionality in different threads, such that TLS 
> Session establishment happens in a separate thread,
>   read/write happens in another thread and Epoll is happening in separate 
> thread.
>   
> - To do this, we want migrate the VCL sessions from one VCL worker to another 
> VCL worker using the VLS wrapper.
>  
> - We notice that post the migration read/write is working fine from the 
> migrated thread, but EPOLL indications are not coming to either the
>   old thread or the new thread.
>  
> Since we are using VLS, we have set multi-thread-workers option to TRUE.
>  
> If we use the single VCL worker based VLS option, epoll is working fine for 
> us. But it will require lot of locks and hence trying to avoid that option.
>  
> Please let us know if epoll is supported for migrating the VCL sessions, with 
> multi-thread-workers option set to true. Also, any pointers on
> specific changes to be done for that will help a lot.
>  
>  
> Regards,
> Vivek





[vpp-dev] [FD.io Helpdesk #46540] [linuxfoundation.org #46540] Re: Growing build queue

2017-10-02 Thread Florin Coras via RT
Queue is back down to 1. 

Thanks a lot Vanessa!

Florin

> On Oct 2, 2017, at 11:59 AM, Vanessa Valderrama 
> <vvalderr...@linuxfoundation.org> wrote:
> 
> Florin,
> 
> I'm looking into the issue now.
> 
> Thank you,
> Vanessa
> 
> On 10/02/2017 11:49 AM, Florin Coras wrote:
>> Hi Vanessa, 
>> 
>> It would seem we’re running out of executors and the build queue keeps on 
>> growing. Could you take a look at it?
>> 
>> Thanks,
>> Florin
> 



[vpp-dev] [FD.io Helpdesk #45343] [linuxfoundation.org #45343] Re: More build timeouts for vpp-verify-master-ubuntu1604

2017-09-06 Thread Florin Coras via RT
Hi, 

Any news regarding this? We are 1 week away from API freeze and the infra makes 
it almost impossible to merge patches! 

Thanks, 
Florin

> On Sep 4, 2017, at 9:44 PM, Dave Wallace  wrote:
> 
> Dear helpd...@fd.io ,
> 
> There has been another string of build timeouts for 
> vpp-verify-master-ubuntu1604:
> https://jenkins.fd.io/job/vpp-verify-master-ubuntu1604/buildTimeTrend 
> 
> Please change the timeout for build failures from 360 minutes to 120 minutes 
> in addition to addressing the slow minion issue.
> 
> Thanks,
> -daw-
> ___
> vpp-dev mailing list
> vpp-dev@lists.fd.io
> https://lists.fd.io/mailman/listinfo/vpp-dev




[vpp-dev] [FD.io Helpdesk #46540] [linuxfoundation.org #46540] Fwd: Growing build queue

2017-10-02 Thread Florin Coras via RT
Unfortunately this completely blocks vpp development. Could someone please take 
a look at it. 

Thanks, 
Florin

> Begin forwarded message:
> 
> From: Florin Coras <fcoras.li...@gmail.com>
> Subject: Growing build queue
> Date: October 2, 2017 at 9:49:16 AM PDT
> To: Vanessa Valderrama <vvalderr...@linuxfoundation.org>, helpd...@fd.io
> Cc: vpp-dev <vpp-dev@lists.fd.io>
> 
> Hi Vanessa, 
> 
> It would seem we’re running out of executors and the build queue keeps on 
> growing. Could you take a look at it?
> 
> Thanks,
> Florin

-- 
You received this message because you are subscribed to the Google Groups 
"Emergency Admin Alert" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to emergency+unsubscr...@linuxfoundation.org.


Re: [vpp-dev] [csit-dev] make test python segfault in ubuntu 16.04

2017-08-28 Thread Florin Coras (fcoras)
Hi Dave,

Thanks a lot for the thorough analysis! I’d also really like to see (1) fixed as 
soon as possible.

Cheers,
Florin

From: Dave Wallace <dwallac...@gmail.com>
Date: Friday, August 25, 2017 at 10:56 AM
To: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Cc: "Florin Coras (fcoras)" <fco...@cisco.com>, "csit-...@lists.fd.io" 
<csit-...@lists.fd.io>
Subject: Re: [csit-dev] make test python segfault in ubuntu 16.04

vpp-dev, Florin,

Below is an analysis of the all of the failures that this patch encountered 
before finally passing. None of the failures were related in any way to the 
code changes in the patch.

In summary, there appear to be a number of different factors involved with 
these failures.
· Two failures appear to be caused by the run-time environment.
· An intermittent bug appears to exist in `L2BD Multi-instance test 5 - 
delete 5 BDs'
· The segfault shows lots of threads being run.  Are tests being 
executed in parallel?  If so, it would be interesting to serialize the tests to 
see if that fixes any of these issues.
I'm also seeing a variation in the order that the "make tests" are run (or at 
least in the order of the status reports).  My understanding of the 'make test' 
python infrastructure is insufficient to make an intelligent guess as to 
whether this has any bearing on any of these failures.

I get more predictable result output when running make test locally on my own 
server, but the order of test output is different than in the CI test runs.  
Locally, the order of tests appears to be the same between different runs of 
'make test'.  I have also not seen any of these errors on my server which is 
running Ubuntu 17.04, although I have not done an endurance test either.

My recommendation based on this analysis is as follows:
  1. The L2BD unit test issue be investigated by the appropriate 'make test' 
experts
  2. vpp-verify-master-centos7, vpp-verify-master-ubuntu1604, and 
vpp-test-debug-master-ubuntu1604 jobs should be run operationally in the 
Container PoC environment with the rest of the jjb jobs run in the cloud infra.

Thanks,
-daw-


 %< 
[ From https://gerrit.fd.io/r/#/c/8133 ]

=> Container PoC Aug 24 8:36 PM  Patch Set 9:  Build Successful
http://jenkins.ejkern.net:8080/job/vpp-docs-verify-master/1515/ : SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-make-test-docs-verify-master/1512/ : 
SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-verify-master-centos7/1983/ : SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-test-debug-master-ubuntu1604/1301/ : 
SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-verify-master-ubuntu1604/2022/ : SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-fake-csit-verify-master/1695/ : SUCCESS

=> fd.io JJB  Aug 24 9:19 PM  Patch Set 9:  Verified-1  Build Failed
https://jenkins.fd.io/job/vpp-verify-master-ubuntu1604/6775/ : FAILURE
Logs: 
https://logs.fd.io/production/vex-yul-rot-jenkins-1/vpp-verify-master-ubuntu1604/6775
Failure Signature:
  01:08:59  verify templates on IP6 datapath  Fatal Python error: 
Segmentation fault

Comment:
  Python bug or resource starvation?  Lots of threads running...
  Possibly due to bad environment/sick minion.
https://jenkins.fd.io/job/vpp-make-test-docs-verify-master/3098/ : SUCCESS
https://jenkins.fd.io/job/vpp-verify-master-centos7/6770/ : SUCCESS
https://jenkins.fd.io/job/vpp-csit-verify-virl-master/6781/ : SUCCESS
https://jenkins.fd.io/job/vpp-docs-verify-master/5370/ : SUCCESS

=> Container PoC  Aug 24 10:54 PM  Patch Set 9:  Build Successful
http://jenkins.ejkern.net:8080/job/vpp-docs-verify-master/1519/ : SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-make-test-docs-verify-master/1516/ : 
SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-verify-master-centos7/1987/ : SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-test-debug-master-ubuntu1604/1305/ : 
SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-verify-master-ubuntu1604/2027/ : SUCCESS
http://jenkins.ejkern.net:8080/job/vpp-fake-csit-verify-master/1699/ : SUCCESS

=> fd.io JJB  Aug 24 11:13 PM  Patch Set 9:  Verified-1  Build Failed
https://jenkins.fd.io/job/vpp-verify-master-centos7/6774/ : FAILURE
Logs: 
https://logs.fd.io/production/vex-yul-rot-jenkins-1/vpp-verify-master-centos7/6774
Failure Signature:
  00:23:17.198 CCLD vcl_test_client
  00:24:32.936 FATAL: command execution failed
  00:24:32.937 java.io.IOException

Comment:
  Bad environment/sick minion?
  There's no reason for compilation to kill the build.
https://jenkins.fd.io/job/vpp-verify-master-ubuntu1604/6779/ : FAILURE
Logs: 
https://logs.fd.io/production/vex-yul-rot-jenkins-1/vpp-verify-master-ubuntu1604/6779
Failure Signature:
  03:02:47  
==
  03:02:47  collect information on Ethernet, IP4 and IP6 datapath (no timers)
  03:02:47  
===

Re: [vpp-dev] [FD.io Helpdesk #56282] git.fd.io not updating

2018-05-22 Thread Florin Coras via RT
It is! Thank you, Vanessa!

Florin

> On May 22, 2018, at 11:41 AM, Vanessa Valderrama via RT 
> <fdio-helpd...@rt.linuxfoundation.org> wrote:
> 
> This issue should be resolved.
> 
> Thank you,
> Vanessa
> 
> On Tue May 22 02:59:02 2018, mvarl...@suse.de wrote:
>> Roughly a week ago, I noticed there was a DNS/IP change when cloning a
>> new VPP
>> repo... I wonder if what I saw is somehow connected to this issue.
>> On Mon, 2018-05-21 at 16:34 -0700, Florin Coras wrote:
>>> Hi,
>>> It would seem that git.fd.io [1] thinks that we last committed a
>>> patch to vpp
>>> almost 1 week ago. Any idea what might’ve triggered this?
>>> Thanks, Florin
>>> [1] https://git.fd.io/vpp/log/
>>> 
>>> 
>>> 
> 
> 
> 






Re: [vpp-dev] builtin UDP server broken in 18.01 ?

2018-01-26 Thread Florin Coras (fcoras)
Also, it should be noted that the patch changes the cli to run any of the 
builtin server/clients. To run a server/client one should do:

test echo server|client uri transport_proto://ip/port 

We now have support for tcp, udp and, thanks to Marco, sctp.
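For example, following that syntax (addresses are placeholders): 

  test echo server uri tcp://0.0.0.0/1234
  test echo client uri tcp://10.10.1.1/1234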

Florin

From: Andreas Schultz <andreas.schu...@travelping.com>
Date: Friday, January 26, 2018 at 8:09 AM
To: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Cc: "Florin Coras (fcoras)" <fco...@cisco.com>
Subject: Re: builtin UDP server broken in 18.01 ?

Andreas Schultz <andreas.schu...@travelping.com> wrote
on Fri, Jan 26, 2018 at 16:10:
Hi,

I used to be able to do a

   builtin uri bind uri udp://0.0.0.0/1234

After upgrading to 18.01 this now fails with:

Correction, it still works in 18.01, commit  
"b384b543313b6b47a277c903e9d4fcd4343054fa: session: add support for memfd 
segments" seems to have broken it.

Andreas


  builtin uri bind: bind_uri_server returned -1

Any hints on how to fix that?

Regards
Andreas

Re: [vpp-dev] builtin UDP server broken in 18.01 ?

2018-01-26 Thread Florin Coras (fcoras)
Sorry about that, it is fixed in this patch: https://gerrit.fd.io/r/#/c/10253/

Florin

From: Andreas Schultz <andreas.schu...@travelping.com>
Date: Friday, January 26, 2018 at 8:09 AM
To: "vpp-dev@lists.fd.io" <vpp-dev@lists.fd.io>
Cc: "Florin Coras (fcoras)" <fco...@cisco.com>
Subject: Re: builtin UDP server broken in 18.01 ?

Andreas Schultz <andreas.schu...@travelping.com> wrote
on Fri, Jan 26, 2018 at 16:10:
Hi,

I used to be able to do a

   builtin uri bind uri udp://0.0.0.0/1234

After upgrading to 18.01 this now fails with:

Correction, it still works in 18.01, commit  
"b384b543313b6b47a277c903e9d4fcd4343054fa: session: add support for memfd 
segments" seems to have broken it.

Andreas


  builtin uri bind: bind_uri_server returned -1

Any hints on how to fix that?

Regards
Andreas

Re: [vpp-dev] [FD.io Helpdesk #50221] [linuxfoundation.org #50221] RE: Wiki hacked?

2017-12-27 Thread Florin Coras via RT
Hi George, 

Thanks for the heads up!

Happy Holidays!
Florin

> On Dec 21, 2017, at 10:53 AM, George.Y.Zhao via RT 
> <fdio-helpd...@rt.linuxfoundation.org> wrote:
> 
> Hi Florin,
> It seems that you already fixed the issue; the best way I can think of is to 
> revert to the previous version from the wiki history.
> 
> We had similar hack on OPNFV/ODL IRC channel as well.
> 
> George
> 
> From: vpp-dev-boun...@lists.fd.io [mailto:vpp-dev-boun...@lists.fd.io] On 
> Behalf Of Florin Coras
> Sent: Wednesday, December 20, 2017 3:38 PM
> To: fdio-helpd...@rt.linuxfoundation.org
> Cc: vpp-dev
> Subject: [vpp-dev] Wiki hacked?
> 
> Apparently our wiki has been hacked (see attached)? What’s the best way 
> forward to solve this?
> 
> Florin
> 
> 

