Re: [vpp-dev] VPP 2005 is crashing on stopping the VCL applications #vpp-hoststack

2020-07-29 Thread Florin Coras
Hi Raj, 

In that case it should work. Just from the trace below, it’s hard to figure out 
what exactly happened. Also, keep in mind that VCL is not thread safe, so make 
sure you’re not trying to share sessions or allowing two workers to interact with 
the message queue(s) at the same time. 

Regards,
Florin
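
For reference, a minimal sketch of the shutdown pattern described above: one VCL
worker registration per thread, sessions closed by the thread that owns them, and
an explicit vppcom_app_destroy() before the process exits. It assumes the standard
vppcom API from vcl/vppcom.h (vppcom_app_create, vppcom_worker_register,
vppcom_session_create/close, vppcom_app_destroy); the thread count, names and
error handling are illustrative only:

#include <signal.h>
#include <pthread.h>
#include <vcl/vppcom.h>

static volatile sig_atomic_t stop_requested;

static void
on_signal (int signo)
{
  stop_requested = 1;            /* only set a flag inside the handler */
}

static void *
worker_fn (void *arg)
{
  /* every pthread that touches VCL registers its own worker; sessions and
     message queues are never shared between threads */
  vppcom_worker_register ();
  int s = vppcom_session_create (VPPCOM_PROTO_UDP, 0 /* blocking */);
  while (!stop_requested)
    {
      /* bind/connect and the usual rx/tx calls on this thread's session,
         e.g. vppcom_session_recvfrom (s, buf, sizeof (buf), 0, &peer); */
    }
  vppcom_session_close (s);      /* close this thread's sessions before it exits */
  return 0;
}

int
main (void)
{
  signal (SIGTERM, on_signal);
  signal (SIGINT, on_signal);
  if (vppcom_app_create ("udp-rx-app") != VPPCOM_OK)
    return 1;
  pthread_t t[4];
  for (int i = 0; i < 4; i++)
    pthread_create (&t[i], 0, worker_fn, 0);
  for (int i = 0; i < 4; i++)
    pthread_join (t[i], 0);
  vppcom_app_destroy ();         /* explicit teardown instead of relying on atexit() */
  return 0;
}

Whether any additional per-thread teardown is required on 20.05 is exactly the
question raised below, so treat this only as a starting point.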

> On Jul 29, 2020, at 8:17 PM, Raj Kumar  wrote:
> 
> Hi Florin,
> I am using kill  to stop the application, but the application has a signal 
> handler and, after receiving the signal, it exits gracefully.
> About vppcom_app_exit: I think this function is registered with atexit() 
> inside vppcom_app_create(), so it should be called when the application exits. 
> I also tried calling vppcom_app_exit() explicitly before exiting the 
> application, but I still see the same issue. 
> 
> My application is multithreaded. Can you please suggest the cleanup functions 
> (vppcom functions) that I should call before exiting a thread and before 
> exiting the main application, for a proper cleanup? 
> I also tried vppcom_app_destroy() before exiting the main application, but I 
> still see the same issue.
> 
> thanks,
> -Raj
> 
> On Wed, Jul 29, 2020 at 5:34 PM Florin Coras wrote:
> Hi Raj, 
> 
> Does stopping include a call to vppcom_app_exit or killing the applications? 
> If the latter, the apps might be killed with some mutexes/spinlocks held. For 
> now, we only support the former. 
> 
> Regards,
> Florin
> 
> > On Jul 29, 2020, at 1:49 PM, Raj Kumar wrote:
> > 
> > Hi,
> > In my UDP application, I am using the VPP host stack to receive packets and 
> > memif to transmit packets. There are a total of 6 applications connected to 
> > VPP. 
> > If I stop the application(s), VPP crashes. In the VPP configuration, 4 worker 
> > threads are configured; if no worker thread is configured, I do not see this 
> > crash.
> > Here is the VPP stack trace - 
> >  (gdb) bt
> > #0  0x751818df in raise () from /lib64/libc.so.6
> > #1  0x7516bcf5 in abort () from /lib64/libc.so.6
> > #2  0xc123 in os_panic () at 
> > /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vpp/vnet/main.c:366
> > #3  0x76b466bb in vlib_worker_thread_barrier_sync_int 
> > (vm=0x76d78200 , func_name=)
> > at 
> > /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlib/threads.c:1529
> > #4  0x77bc5ef0 in vl_msg_api_handler_with_vm_node 
> > (am=am@entry=0x77dd2ea0 ,
> > vlib_rp=vlib_rp@entry=0x7fee7c001000, the_msg=0x7fee7c02bbd8, 
> > vm=vm@entry=0x76d78200 ,
> > node=node@entry=0x7fffb6295000, is_private=is_private@entry=1 '\001')
> > at 
> > /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibapi/api_shared.c:596
> > #5  0x77bb000f in void_mem_api_handle_msg_i (is_private=1 '\001', 
> > node=0x7fffb6295000, vm=0x76d78200 ,
> > vlib_rp=0x7fee7c001000, am=0x77dd2ea0 )
> > at 
> > /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/memory_api.c:698
> > #6  vl_mem_api_handle_msg_private (vm=vm@entry=0x76d78200 
> > , node=node@entry=0x7fffb6295000, reg_index= > out>)
> > at 
> > /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/memory_api.c:762
> > #7  0x77bbe346 in vl_api_clnt_process (vm=, 
> > node=0x7fffb6295000, f=)
> > at 
> > /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/vlib_api.c:370
> > #8  0x76b161d6 in vlib_process_bootstrap (_a=)
> > at 
> > /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlib/main.c:1502
> > #9  0x7602ac0c in clib_calljmp () from /lib64/libvppinfra.so.20.05
> > #10 0x7fffb5e93dd0 in ?? ()
> > #11 0x76b19821 in dispatch_process (vm=0x76d78200 
> > , p=0x7fffb6295000, last_time_stamp=15931923011231136,
> > f=0x0) at 
> > /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vppinfra/types.h:133
> > #12 0x7f0f66009024 in ?? ()
> > 
> > 
> > Thanks,
> > -Raj
> >   
> 
> 


Re: [vpp-dev] VPP 2005 is crashing on stopping the VCL applications #vpp-hoststack

2020-07-29 Thread Raj Kumar
Hi Florin,
I am using kill  to stop the application, but the application has a signal
handler and, after receiving the signal, it exits gracefully.
About vppcom_app_exit: I think this function is registered with atexit()
inside vppcom_app_create(), so it should be called when the application exits.
I also tried calling vppcom_app_exit() explicitly before exiting the
application, but I still see the same issue.

My application is multithreaded. Can you please suggest the cleanup functions
(vppcom functions) that I should call before exiting a thread and before
exiting the main application, for a proper cleanup?
I also tried vppcom_app_destroy() before exiting the main application, but I
still see the same issue.

thanks,
-Raj

On Wed, Jul 29, 2020 at 5:34 PM Florin Coras  wrote:

> Hi Raj,
>
> Does stopping include a call to vppcom_app_exit or killing the
> applications? If the latter, the apps might be killed with some
> mutexes/spinlocks held. For now, we only support the former.
>
> Regards,
> Florin
>
> > On Jul 29, 2020, at 1:49 PM, Raj Kumar  wrote:
> >
> > Hi,
> > In my UDP application, I am using the VPP host stack to receive packets and
> > memif to transmit packets. There are a total of 6 applications connected to VPP.
> > If I stop the application(s), VPP crashes. In the VPP configuration, 4 worker
> > threads are configured; if no worker thread is configured, I do not see this crash.
> > Here is the VPP stack trace -
> >  (gdb) bt
> > #0  0x751818df in raise () from /lib64/libc.so.6
> > #1  0x7516bcf5 in abort () from /lib64/libc.so.6
> > #2  0xc123 in os_panic () at
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vpp/vnet/main.c:366
> > #3  0x76b466bb in vlib_worker_thread_barrier_sync_int
> (vm=0x76d78200 , func_name=)
> > at
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlib/threads.c:1529
> > #4  0x77bc5ef0 in vl_msg_api_handler_with_vm_node 
> > (am=am@entry=0x77dd2ea0
> ,
> > vlib_rp=vlib_rp@entry=0x7fee7c001000, the_msg=0x7fee7c02bbd8,
> vm=vm@entry=0x76d78200 ,
> > node=node@entry=0x7fffb6295000, is_private=is_private@entry=1
> '\001')
> > at
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibapi/api_shared.c:596
> > #5  0x77bb000f in void_mem_api_handle_msg_i (is_private=1
> '\001', node=0x7fffb6295000, vm=0x76d78200 ,
> > vlib_rp=0x7fee7c001000, am=0x77dd2ea0 )
> > at
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/memory_api.c:698
> > #6  vl_mem_api_handle_msg_private (vm=vm@entry=0x76d78200
> , node=node@entry=0x7fffb6295000, reg_index= out>)
> > at
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/memory_api.c:762
> > #7  0x77bbe346 in vl_api_clnt_process (vm=,
> node=0x7fffb6295000, f=)
> > at
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/vlib_api.c:370
> > #8  0x76b161d6 in vlib_process_bootstrap (_a=)
> > at
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlib/main.c:1502
> > #9  0x7602ac0c in clib_calljmp () from
> /lib64/libvppinfra.so.20.05
> > #10 0x7fffb5e93dd0 in ?? ()
> > #11 0x76b19821 in dispatch_process (vm=0x76d78200
> , p=0x7fffb6295000, last_time_stamp=15931923011231136,
> > f=0x0) at
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vppinfra/types.h:133
> > #12 0x7f0f66009024 in ?? ()
> >
> >
> > Thanks,
> > -Raj
> >
>
> 
>


Re: [vpp-dev] VPP 2005 is crashing on stopping the VCL applications #vpp-hoststack

2020-07-29 Thread Florin Coras
Hi Raj, 

Does stopping include a call to vppcom_app_exit or killing the applications? If 
the latter, the apps might be killed with some mutexes/spinlocks held. For now, 
we only support the former. 

Regards,
Florin

> On Jul 29, 2020, at 1:49 PM, Raj Kumar  wrote:
> 
> Hi,
> In my UDP application, I am using the VPP host stack to receive packets and 
> memif to transmit packets. There are a total of 6 applications connected to VPP. 
> If I stop the application(s), VPP crashes. In the VPP configuration, 4 worker 
> threads are configured; if no worker thread is configured, I do not see this crash.
> Here is the VPP stack trace - 
>  (gdb) bt
> #0  0x751818df in raise () from /lib64/libc.so.6
> #1  0x7516bcf5 in abort () from /lib64/libc.so.6
> #2  0xc123 in os_panic () at 
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vpp/vnet/main.c:366
> #3  0x76b466bb in vlib_worker_thread_barrier_sync_int 
> (vm=0x76d78200 , func_name=)
> at 
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlib/threads.c:1529
> #4  0x77bc5ef0 in vl_msg_api_handler_with_vm_node 
> (am=am@entry=0x77dd2ea0 ,
> vlib_rp=vlib_rp@entry=0x7fee7c001000, the_msg=0x7fee7c02bbd8, 
> vm=vm@entry=0x76d78200 ,
> node=node@entry=0x7fffb6295000, is_private=is_private@entry=1 '\001')
> at 
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibapi/api_shared.c:596
> #5  0x77bb000f in void_mem_api_handle_msg_i (is_private=1 '\001', 
> node=0x7fffb6295000, vm=0x76d78200 ,
> vlib_rp=0x7fee7c001000, am=0x77dd2ea0 )
> at 
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/memory_api.c:698
> #6  vl_mem_api_handle_msg_private (vm=vm@entry=0x76d78200 
> , node=node@entry=0x7fffb6295000, reg_index=)
> at 
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/memory_api.c:762
> #7  0x77bbe346 in vl_api_clnt_process (vm=, 
> node=0x7fffb6295000, f=)
> at 
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/vlib_api.c:370
> #8  0x76b161d6 in vlib_process_bootstrap (_a=)
> at /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlib/main.c:1502
> #9  0x7602ac0c in clib_calljmp () from /lib64/libvppinfra.so.20.05
> #10 0x7fffb5e93dd0 in ?? ()
> #11 0x76b19821 in dispatch_process (vm=0x76d78200 
> , p=0x7fffb6295000, last_time_stamp=15931923011231136,
> f=0x0) at 
> /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vppinfra/types.h:133
> #12 0x7f0f66009024 in ?? ()
> 
> 
> Thanks,
> -Raj
>   


[vpp-dev] VPP 2005 is crashing on stopping the VCL applications #vpp-hoststack

2020-07-29 Thread Raj Kumar
Hi,
In my UDP application, I am using the VPP host stack to receive packets and memif 
to transmit packets. There are a total of 6 applications connected to VPP.
If I stop the application(s), VPP crashes. In the VPP configuration, 4 worker 
threads are configured; if no worker thread is configured, I do not see this crash.
Here is the VPP stack trace -
(gdb) bt
#0  0x751818df in raise () from /lib64/libc.so.6
#1  0x7516bcf5 in abort () from /lib64/libc.so.6
#2  0xc123 in os_panic () at 
/usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vpp/vnet/main.c:366
#3  0x76b466bb in vlib_worker_thread_barrier_sync_int 
(vm=0x76d78200 , func_name=)
at /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlib/threads.c:1529
#4  0x77bc5ef0 in vl_msg_api_handler_with_vm_node 
(am=am@entry=0x77dd2ea0 ,
vlib_rp=vlib_rp@entry=0x7fee7c001000, the_msg=0x7fee7c02bbd8, 
vm=vm@entry=0x76d78200 ,
node=node@entry=0x7fffb6295000, is_private=is_private@entry=1 '\001')
at 
/usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibapi/api_shared.c:596
#5  0x77bb000f in void_mem_api_handle_msg_i (is_private=1 '\001', 
node=0x7fffb6295000, vm=0x76d78200 ,
vlib_rp=0x7fee7c001000, am=0x77dd2ea0 )
at 
/usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/memory_api.c:698
#6  vl_mem_api_handle_msg_private (vm=vm@entry=0x76d78200 
, node=node@entry=0x7fffb6295000, reg_index=)
at 
/usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/memory_api.c:762
#7  0x77bbe346 in vl_api_clnt_process (vm=, 
node=0x7fffb6295000, f=)
at 
/usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlibmemory/vlib_api.c:370
#8  0x76b161d6 in vlib_process_bootstrap (_a=)
at /usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vlib/main.c:1502
#9  0x7602ac0c in clib_calljmp () from /lib64/libvppinfra.so.20.05
#10 0x7fffb5e93dd0 in ?? ()
#11 0x76b19821 in dispatch_process (vm=0x76d78200 
, p=0x7fffb6295000, last_time_stamp=15931923011231136,
f=0x0) at 
/usr/src/debug/vpp-20.05-9~g0bf9c294c_dirty.x86_64/src/vppinfra/types.h:133
#12 0x7f0f66009024 in ?? ()

Thanks,
-Raj


Re: [vpp-dev] TCP timer race and another possible TCP issue

2020-07-29 Thread Florin Coras
Hi Ivan, 

Inline.

> On Jul 29, 2020, at 9:40 AM, Ivan Shvedunov  wrote:
> 
> Hi Florin,
> 
> while trying to fix the proxy cleanup issue, I've spotted another problem in 
> the TCP stack, namely RSTs being ignored in SYN_SENT (half-open) connection 
> state:
> https://gerrit.fd.io/r/c/vpp/+/28103 

That’s actually a nice catch! Thanks! It slipped in relatively recently; 
nonetheless, it shows we should be doing more “negative” testing … 

> 
> The following fix for handling failed active connections in the proxy has 
> worked for me, but please confirm that it's correct:
> https://gerrit.fd.io/r/c/vpp/+/28104 

LGTM. Merged.

> 
> After https://gerrit.fd.io/r/c/vpp/+/28096, I no longer see any "disp error" 
> diagnostics for CLOSE_WAIT but I still do see "disp errors" for SYNs in 
> TIME_WAIT and FIN_WAIT_2 states. I didn't enable reorder in netem, the config 
> for it that I now use is this:
> ip netns exec client tc qdisc add dev client root netem \
>loss 15% delay 100ms 300ms duplicate 10%

Are you sure about the SYNs in TIME_WAIT, because those should be processed in 
tcp_listen? 

Hm, probably because of the loss some fins are not properly delivered and the 
peer times out the connections faster than us. Either way, here’s another patch 
to squash the warnings in FIN_WAIT_2. Again, the only thing that will change is 
the fact that we’ll probably generate dupacks and the peer will be forced to 
reset, instead of ignoring the packet. 
 
> 
> I tried to abandon my older proxy patch that just removes ASSERT (0), but 
> somehow I've lost the ability to log into fd.io Gerrit with 
> my Linux Foundation login/password, although I can still submit patches and 
> also log into identity.linuxfoundation.org just fine. Cc'ing Vanessa again here, 
> could you please help with this?

Just as an FYI, in case you want to update patches instead of abandoning them: 
“git review” can be used to download a patch like “git review -d <gerrit-id>”, 
where the gerrit id is the number at the end of the URL. You can then modify the 
patches, amend them (e.g., git commit -a --amend) and “git review” them again. 
That will update the existing patch set in gerrit.

Regards,
Florin

> 
> 
> On Tue, Jul 28, 2020 at 9:27 PM Florin Coras wrote:
> Hi Ivan, 
> 
> Inline.
> 
>> On Jul 28, 2020, at 8:45 AM, Ivan Shvedunov wrote:
>> 
>> Hi Florin,
>> thanks, the fix has worked and http_static no longer crashes.
> 
> Perfect, thanks for confirming! 
> 
>> 
>> I still get a number of messages like this when using release build:
>> /usr/bin/vpp[39]: state_sent_ok:954: BUG: couldn't send response header!
>> Not sure if it's actually a bug or the queue being actually full because of 
>> the packet loss being caused by netem.
> 
> Not sure. Either that or there’s some issue with the send logic in some 
> corner case. If you have hsm->debug_level > 0 you should get a warning when 
> the enqueue fails because there’s no space left for sending. Note that we 
> probably need some logic to retry sending if enqueue fails (fill in usual 
> disclaimer that these are example test apps, unfortunately). I can try to 
> provide some guidance if anybody feels like tackling that specific problem. 
> 
>> 
>> There are also numerous messages like
>> /usr/bin/vpp[39]: tcp_input_dispatch_buffer:2817: tcp conn 13836 disp error 
>> state CLOSE_WAIT flags 0x02 SYN
>> but I guess that's caused by a large number of connections from the same 
>> client => port reuse and, again, the packet loss.
> 
> Are you also inducing reorder? 
> 
> It seems the client closes the connection cleanly (fin) and before that is 
> completed, which as a first step would require a close from the app and a 
> move to last-ack, it sends a syn. Alternatively, reordering makes it look 
> that way. Those warnings are not harmful per se, because tcp will just drop 
> the packets, but we could try processing the syns to maybe force resets. If 
> you have the time to do it, try this [1]
> 
>> 
>> I've spotted another crash in the proxy example for which I've submitted a 
>> patch:
>> https://gerrit.fd.io/r/c/vpp/+/28085 
>> 
>> One problem with this fix is that the proxy example is actually incomplete, 
>> as the client connection should be dropped in case if the proxy can't 
>> establish the corresponding outgoing connection (e.g. timeout or RST b/c of 
>> "connection refused"), whereas currently the client-side connection just 
>> hangs instead. I'll take a stab at fixing that as our proxy in UPF also has 
>> problems with handling similar situations and it would be easier for me to 
>> fix it in this simpler example first. But, in any case, crashing VPP is 
>> probably much worse than just letting the connection hang, so maybe we should 
>> remove that ASSERT after all.

Re: [vpp-dev] TCP timer race and another possible TCP issue

2020-07-29 Thread Ivan Shvedunov
Hi Florin,

while trying to fix the proxy cleanup issue, I've spotted another problem
in the TCP stack, namely RSTs being ignored in SYN_SENT (half-open)
connection state:
https://gerrit.fd.io/r/c/vpp/+/28103

The following fix for handling failed active connections in the proxy has
worked for me, but please confirm that it's correct:
https://gerrit.fd.io/r/c/vpp/+/28104

After https://gerrit.fd.io/r/c/vpp/+/28096 , I no longer see any "disp
error" diagnostics for CLOSE_WAIT but I still do see "disp errors" for SYNs
in TIME_WAIT and FIN_WAIT_2 states. I didn't enable reorder in netem, the
config for it that I now use is this:
ip netns exec client tc qdisc add dev client root netem \
   loss 15% delay 100ms 300ms duplicate 10%

I tried to abandon my older proxy patch that just removes ASSERT (0), but
somehow I've lost the ability to log into fd.io Gerrit with my Linux
Foundation login/password, although I can still submit patches and also log
into identity.linuxfoundation.org just fine. Cc'ing Vanessa again here,
could you please help with this?


On Tue, Jul 28, 2020 at 9:27 PM Florin Coras  wrote:

> Hi Ivan,
>
> Inline.
>
> On Jul 28, 2020, at 8:45 AM, Ivan Shvedunov  wrote:
>
> Hi Florin,
> thanks, the fix has worked and http_static no longer crashes.
>
>
> Perfect, thanks for confirming!
>
>
> I still get a number of messages like this when using release build:
> /usr/bin/vpp[39]: state_sent_ok:954: BUG: couldn't send response header!
> Not sure if it's actually a bug or the queue being actually full because
> of the packet loss being caused by netem.
>
>
> Not sure. Either that or there’s some issue with the send logic in some
> corner case. If you have hsm->debug_level > 0 you should get a warning when
> the enqueue fails because there’s no space left for sending. Note that we
> probably need some logic to retry sending if enqueue fails (fill in usual
> disclaimer that these are example test apps, unfortunately). I can try to
> provide some guidance if anybody feels like tackling that specific problem.
>
>
> There are also numerous messages like
> /usr/bin/vpp[39]: tcp_input_dispatch_buffer:2817: tcp conn 13836 disp
> error state CLOSE_WAIT flags 0x02 SYN
> but I guess that's caused by a large number of connections from the same
> client => port reuse and, again, the packet loss.
>
>
> Are you also inducing reorder?
>
> It seems the client closes the connection cleanly (fin) and before that is
> completed, which as a first step would require a close from the app and a
> move to last-ack, it sends a syn. Alternatively, reordering makes it look
> that way. Those warnings are not harmful per se, because tcp will just drop
> the packets, but we could try processing the syns to maybe force resets. If
> you have the time to do it, try this [1]
>
>
> I've spotted another crash in the proxy example for which I've submitted a
> patch:
> https://gerrit.fd.io/r/c/vpp/+/28085
>
> One problem with this fix is that the proxy example is actually
> incomplete, as the client connection should be dropped in case if the proxy
> can't establish the corresponding outgoing connection (e.g. timeout or RST
> b/c of "connection refused"), whereas currently the client-side connection
> just hangs instead. I'll take a stab at fixing that as our proxy in UPF
> also has problems with handling similar situations and it would be easier
> for me to fix it in this simpler example first. But, in any case, crashing
> VPP is probably much worse than just letting the connection hang, so maybe
> we should remove that ASSERT after all.
>
>
> Exactly. The assert there is just a reminder to eventually do the proper
> cleanup. Thanks for looking into it!
>
> Regards,
> Florin
>
> [1] https://gerrit.fd.io/r/c/vpp/+/28096
>
>
>
> On Mon, Jul 27, 2020 at 10:59 PM Florin Coras 
> wrote:
>
>> Hi Ivan,
>>
>> Took a look at the static http server and, as far as I can tell, it has
>> the same type of issue the proxy had, i.e., premature session
>> cleanup/reuse. Does this solve the problem for you [1]?
>>
>> Also, merged your elog fix patch. Thanks!
>>
>> Regards,
>> Florin
>>
>> [1]  https://gerrit.fd.io/r/c/vpp/+/28076
>>
>> On Jul 27, 2020, at 10:22 AM, Ivan Shvedunov  wrote:
>>
>> Hi.
>> I've debugged http server issue a bit more and here are my observations:
>> if I add an ASSERT(0) in the place of "No http session for thread 0
>> session_index 54",
>> I get stack trace along the lines of
>> Program received signal SIGABRT, Aborted.
>> 0x7470bf47 in raise () from /lib/x86_64-linux-gnu/libc.so.6
>> #0  0x7470bf47 in raise () from /lib/x86_64-linux-gnu/libc.so.6
>> #1  0x7470d8b1 in abort () from /lib/x86_64-linux-gnu/libc.so.6
>> #2  0x00407193 in os_panic () at /src/vpp/src/vpp/vnet/main.c:371
>> #3  0x755ec619 in debugger () at /src/vpp/src/vppinfra/error.c:84
>> #4  0x755ec397 in _clib_error (how_to_die=2, function_name=0x0,
>> line_number=0,
>> fmt=0x7fffb0e1324b "%s:%d (%s) assertion `%s' 

Re: [vpp-dev] Update to iOAM using latest IETF draft #vpp

2020-07-29 Thread Justin Iurman
Hi Mauricio,

CC'ing Shwetha, who implemented the IOAM plugin. Last time I checked, IOAM 
namespaces were not included, so it is probably based on the -03 version of 
draft-ietf-ippm-ioam-data. Actually, just to let you know, there is already 
someone who is going to rebase the implementation onto the latest draft version.

Justin

> Hi,
> 
> I noticed that the current iOAM plugin implementation is using the first IETF
> drafts, so I'm thinking about trying to update the iOAM implementation in VPP
> using the latest. I first just want to make sure that this update is not already
> in the immediate VPP release pipeline, since I do not wish to do work that has
> already been done. I'm also uncertain about the amount of work it will require
> to update the plugin, but I have already dug into the code and it doesn't seem
> "too" bad.
> 
> Regards,
> 
> Mauricio


[vpp-dev] Update to iOAM using latest IETF draft #vpp

2020-07-29 Thread mauricio.solisjr via lists.fd.io
Hi,

I noticed that the current iOAM plugin implementation is using the first IETF 
drafts, so I'm thinking about trying to update the iOAM implementation in VPP 
using the latest. I first just want to make sure that this update is not already 
in the immediate VPP release pipeline, since I do not wish to do work that has 
already been done. I'm also uncertain about the amount of work it will require 
to update the plugin, but I have already dug into the code and it doesn't seem 
"too" bad.

Regards,

Mauricio


Re: [vpp-dev] Create big tables on huge-page

2020-07-29 Thread Nitin Saxena
Hi Lijian,

+1 on the finding. It would be interesting to know how much the performance 
gain is.
Having said that (and correct me if I am wrong), I think the pmalloc module 
works only with a single hugepage size (pm->def_log2_page_sz), which means 
either 1G or 2M, not both.

Thanks,
Nitin

From: vpp-dev@lists.fd.io On Behalf Of Honnappa Nagarahalli
Sent: Thursday, July 23, 2020 10:53 PM
To: Damjan Marion
Cc: Lijian Zhang; vpp-dev; nd; Govindarajan Mohandoss; Jieqiang Wang; Honnappa Nagarahalli
Subject: [EXT] Re: [vpp-dev] Create big tables on huge-page

External Email

Sure. We will create a couple of patches (in the areas we are currently 
analyzing) and we can decide from there.
Thanks,
Honnappa

From: Damjan Marion
Sent: Thursday, July 23, 2020 12:17 PM
To: Honnappa Nagarahalli
Cc: Lijian Zhang; vpp-dev; nd; Govindarajan Mohandoss; Jieqiang Wang
Subject: Re: [vpp-dev] Create big tables on huge-page



Hard to say without seeing the patch. Can you summarize what the changes will 
be in each particular .c file?


On 23 Jul 2020, at 18:00, Honnappa Nagarahalli wrote:

Hi Damjan,
Thank you. Until your patch is ready, would you accept patches that enable 
creating these tables in 1G huge pages as a temporary solution?

Thanks,
Honnappa

From: Damjan Marion
Sent: Thursday, July 23, 2020 7:15 AM
To: Lijian Zhang
Cc: vpp-dev; nd; Honnappa Nagarahalli; Govindarajan Mohandoss; Jieqiang Wang
Subject: Re: [vpp-dev] Create big tables on huge-page


I started working on a patch which addresses most of these points a few weeks 
ago, but likely I will not have it completed for 20.09.
Even if it is completed, it is probably a bad idea to merge it so late in the 
release process…

—
Damjan



On 23 Jul 2020, at 10:45, Lijian Zhang wrote:

Hi Maintainers,
From the VPP source code, the ip4-mtrie table is created on huge pages only when 
the parameters below are set in the configuration file, while the adjacency 
table is always created on normal pages.
ip {
  heap-size 256M
  mtrie-hugetlb
}
In the 10K-flow testing, I configured 10K routing entries in the ip4-mtrie and 
10K entries in the adjacency table.
With the ip4-mtrie table created on a 1G huge page using the parameters above, 
and the adjacency table similarly created on a 1G huge page, I do not observe an 
obvious throughput improvement, but TLB misses are dramatically reduced.
Do you think 10K routing entries + 10K adjacency entries is a reasonable and 
realistic configuration, or would it normally be 10K routing entries + only a 
few adjacency entries? Does it make sense to create the adjacency table on 
huge pages?
Another problem is that although the assigned heap-size is 256M, on a 1G 
huge-page system it seems to occupy a huge page completely, and the remaining 
space within that huge page does not seem to be usable by other tables.

Similarly for the bihash-based tables, only 2M huge-page systems are supported. 
To support creating bihash-based tables on a 1G huge-page system, each table 
would occupy a 1G huge page completely, which wastes a lot of memory.
Would it be possible, just like the pmalloc module, to reserve a big memory 
region on 1G/2M huge pages at initialization and then allocate pieces on demand 
for the bihash, ip4-mtrie and adjacency tables, so that all tables could be 
created on huge pages and fully utilize them?
I tried creating the MAC table on a 1G huge page, and it does improve throughput 
performance.
vpp# show bihash
Name Actual Configured
GBP Endpoints - MAC/BD   1m 1m
b4s 64m 64m
b4s 64m 64m
in2out   10.12m 10.12m
in2out   10.12m 10.12m
ip4-dr   2m 2m
ip4-dr   2m 2m
ip6 FIB fwding table32m 32m
ip6 FIB non-fwding table32m 32m
ip6 mFIB table  32m 32m
l2fib mac table512m 512m
mapping_by_as4  64m 64m
out2in 128m 128m
out2in 128m 128m
out2in   10.12m 10.12m
out2in   10.12m 10.12m
pppoe link table 8m 8m
pppoe session table  8m 8m
static_mapping_by_external  64m 64m
static_mapping_by_local 64m 64m
stn addresses1m 1m
users