Hi Hanlin, 

Thanks for confirming!

Regards,
Florin

> On Jan 18, 2020, at 7:00 PM, wanghanlin <wanghan...@corp.netease.com> wrote:
> 
> Hi Florin,
> With the latest master code, the problem regarding 3) has been fixed.
> 
> Thanks & Regards,
> Hanlin
> 
> On 12/12/2019 14:53, wanghanlin <wanghan...@corp.netease.com> wrote: 
> That's great! 
> I'll apply and check it soon.
> 
> Thanks & Regards,
> Hanlin
> 
> On 12/12/2019 04:15, Florin Coras <fcoras.li...@gmail.com> wrote: 
> Hi Hanlin, 
> 
> Thanks to Dave, we can now have per-thread binary api connections to vpp. 
> I’ve updated the socket client and vcl to leverage this, so after [1] we have 
> per-vcl-worker-thread binary api sockets that are used to exchange fds. 
> 
> Let me know if you’re still hitting the issue. 
> 
> Regards,
> Florin
> 
> [1] https://gerrit.fd.io/r/c/vpp/+/23687
> 
>> On Nov 22, 2019, at 10:30 AM, Florin Coras <fcoras.li...@gmail.com> wrote:
>> 
>> Hi Hanlin, 
>> 
>> Okay, that’s a different issue. The expectation is that each vcl worker has 
>> a different binary api transport into vpp. This assumption holds for 
>> applications with multiple process workers (like nginx) but is not 
>> completely satisfied for applications with thread workers. 
>> 
>> Namely, for each vcl worker we connect over the socket api to vpp and 
>> initialize the shared memory transport (so binary api messages are delivered 
>> over shared memory instead of the socket). However, as you’ve noted, the 
>> socket client is currently not multi-thread capable; consequently, we have an 
>> overlap of socket client fds between the workers. The first segment is 
>> assigned properly but the subsequent ones will fail in this scenario. 
>> 
>> I wasn’t aware of this, so we’ll have to either fix the socket binary api 
>> client for multi-threaded apps, or change the session layer to use 
>> different fds for exchanging memfd fds. 
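>> 
>> Just to illustrate the mechanics (a minimal standalone sketch, not the 
>> actual vl_socket_client_recv_fd_msg code; the helper name is made up): fds 
>> are passed over the unix socket as SCM_RIGHTS ancillary data, and the kernel 
>> hands each queued message to whichever thread calls recvmsg() first, so with 
>> a single shared socket a segment fd meant for one worker can be dequeued by 
>> another. 
>> 
>> /* recv_one_fd.c - illustrative SCM_RIGHTS receive helper (not VPP code) */
>> #include <string.h>
>> #include <sys/socket.h>
>> #include <sys/uio.h>
>> 
>> /* Receive one fd over a connected unix socket; returns the fd or -1.
>>  * If several worker threads call this on the same socket, whichever
>>  * thread's recvmsg() runs first gets the next fd-carrying message, so
>>  * fds can land in the wrong worker unless each worker has its own socket. */
>> int
>> recv_one_fd (int sock)
>> {
>>   char data;
>>   struct iovec iov = { .iov_base = &data, .iov_len = 1 };
>>   union
>>   {
>>     struct cmsghdr align;
>>     char buf[CMSG_SPACE (sizeof (int))];
>>   } u;
>>   struct msghdr msg = { 0 };
>>   int fd;
>> 
>>   msg.msg_iov = &iov;
>>   msg.msg_iovlen = 1;
>>   msg.msg_control = u.buf;
>>   msg.msg_controllen = sizeof (u.buf);
>> 
>>   if (recvmsg (sock, &msg, 0) <= 0)
>>     return -1;
>> 
>>   struct cmsghdr *cmsg = CMSG_FIRSTHDR (&msg);
>>   if (!cmsg || cmsg->cmsg_level != SOL_SOCKET || cmsg->cmsg_type != SCM_RIGHTS)
>>     return -1;
>> 
>>   memcpy (&fd, CMSG_DATA (cmsg), sizeof (fd));
>>   return fd;
>> }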
>> 
>> Regards, 
>> Florin
>> 
>>> On Nov 21, 2019, at 11:47 PM, wanghanlin <wanghan...@corp.netease.com> wrote:
>>> 
>>> Hi Florin,
>>> Regarding 3), I think the main problem may be in the function 
>>> vl_socket_client_recv_fd_msg, called by vcl_session_app_add_segment_handler. 
>>> Multiple worker threads share the same scm->client_socket.fd, so B2 may 
>>> receive the segment memfd belonging to A1.
>>> 
>>>
>>> Regards,
>>> Hanlin
>>> 
>>> On 11/22/2019 01:44, Florin Coras <fcoras.li...@gmail.com> wrote: 
>>> Hi Hanlin, 
>>> 
>>> As Jon pointed out, you may want to register with gerrit. 
>>> 
>>> Your comments with respect to points 1) and 2) are spot on. I’ve updated the 
>>> patch to fix them. 
>>> 
>>> Regarding 3), if I understood your scenario correctly, it should not 
>>> happen. The ssvm infra forces applications to map segments at fixed 
>>> addresses. That is, for the scenario you’re describing below, if B2 is 
>>> processed first, ssvm_slave_init_memfd will map the segment at A2. Note how 
>>> we first map the segment to read the shared header (sh) and then use 
>>> sh->ssvm_va (which should be A2) to remap the segment at a fixed virtual 
>>> address (va). 
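>>> 
>>> As a rough sketch of that map-then-remap pattern (simplified, not the 
>>> actual ssvm code; the header type and field names below are invented 
>>> stand-ins for ssvm's shared header and sh->ssvm_va): 
>>> 
>>> /* Illustration only: map a memfd segment at the va chosen by the master. */
>>> #include <stdint.h>
>>> #include <sys/mman.h>
>>> 
>>> typedef struct
>>> {
>>>   uint64_t requested_va;   /* va the master mapped the segment at */
>>>   uint64_t size;           /* total segment size */
>>> } seg_header_t;
>>> 
>>> void *
>>> map_segment_at_master_va (int memfd)
>>> {
>>>   /* Map anywhere first, only to read the shared header. */
>>>   seg_header_t *sh = mmap (0, sizeof (*sh), PROT_READ, MAP_SHARED, memfd, 0);
>>>   if (sh == MAP_FAILED)
>>>     return 0;
>>> 
>>>   void *va = (void *) (uintptr_t) sh->requested_va;
>>>   uint64_t size = sh->size;
>>>   munmap (sh, sizeof (*sh));
>>> 
>>>   /* Remap the whole segment at the fixed address the master used, so
>>>    * pointers stored inside the segment are valid in both processes. */
>>>   void *p = mmap (va, size, PROT_READ | PROT_WRITE,
>>>                   MAP_SHARED | MAP_FIXED, memfd, 0);
>>>   return p == MAP_FAILED ? 0 : p;
>>> }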
>>> 
>>> Regards,
>>> Florin
>>> 
>>>> On Nov 21, 2019, at 2:49 AM, wanghanlin <wanghan...@corp.netease.com> wrote:
>>>> 
>>>> Hi Florin,
>>>> I have applied the patch and found some problems in my case. I don't have 
>>>> rights to post on gerrit, so I'll post them here.
>>>> 1) evt->event_type should be set to SESSION_CTRL_EVT_APP_DEL_SEGMENT 
>>>> rather than SESSION_CTRL_EVT_APP_ADD_SEGMENT. File: 
>>>> src/vnet/session/session_api.c, Line: 561, Function: mq_send_del_segment_cb
>>>> 2) session_send_fds should be called at the end of the function 
>>>> mq_send_add_segment_cb, otherwise the lock on app_mq can't be freed here. 
>>>> File: src/vnet/session/session_api.c, Line: 519, 
>>>> Function: mq_send_add_segment_cb
>>>> 3) When vcl_segment_attach is called in each worker thread, 
>>>> ssvm_slave_init_memfd can also be called in each worker thread, and 
>>>> ssvm_slave_init_memfd picks the map address sequentially by mapping the 
>>>> segment once in advance. That's fine with only one thread, but may go 
>>>> wrong with multiple worker threads. Suppose the following scenario: VPP 
>>>> allocates a segment at address A1 and notifies worker thread B1, expecting 
>>>> B1 to also map the segment at address A1, and simultaneously VPP allocates 
>>>> a segment at address A2 and notifies worker thread B2, expecting B2 to map 
>>>> the segment at address A2. If B2 processes its notify message first, 
>>>> ssvm_slave_init_memfd may map its segment at address A1. Maybe VPP could 
>>>> include the segment map address in the notify message, and then the worker 
>>>> thread would just map the segment at that address. 
>>>> 
>>>> Regards,
>>>> Hanlin
>>>> On 11/19/2019 09:50, wanghanlin <wanghan...@corp.netease.com> wrote: 
>>>> Hi  Florin,
>>>> VPP version is v19.08.
>>>> I'll apply this patch and check it. Thanks a lot!
>>>> 
>>>> Regards,
>>>> Hanlin
>>>> On 11/16/2019 00:50, Florin Coras <fcoras.li...@gmail.com> wrote: 
>>>> Hi Hanlin,
>>>> 
>>>> Just to make sure, are you running master or some older VPP?
>>>> 
>>>> Regarding the issue you could be hitting below, here’s [1] a patch that I 
>>>> have not yet pushed for merging because it leads to api changes for 
>>>> applications that directly use the session layer application interface 
>>>> instead of vcl. I haven’t tested it extensively, but the goal with it is 
>>>> to signal segment allocation/deallocation over the mq instead of the 
>>>> binary api.
>>>> 
>>>> Finally, I’ve never tested LDP with Envoy, so not sure if that works 
>>>> properly. There’s ongoing work to integrate Envoy with VCL, so you may 
>>>> want to get in touch with the authors. 
>>>> 
>>>> Regards,
>>>> Florin
>>>> 
>>>> [1] https://gerrit.fd.io/r/c/vpp/+/21497
>>>> 
>>>>> On Nov 15, 2019, at 2:26 AM, wanghanlin <wanghan...@corp.netease.com> wrote:
>>>>> 
>>>>> Hi all,
>>>>> I accidentally got the following crash stack when I used VCL with hoststack 
>>>>> and memfd. But the corresponding "invalid" rx_fifo address (0x2f42e2480) is 
>>>>> valid in the VPP process and can also be found in /proc/<pid>/maps. That is, 
>>>>> the shared memfd segment memory is not consistent between the hoststack app 
>>>>> and VPP.
>>>>> Generally, VPP allocates/deallocates a memfd segment and then notifies the 
>>>>> hoststack app to attach/detach it. But what happens if, just after VPP 
>>>>> deallocates a memfd segment and notifies the hoststack app, VPP allocates 
>>>>> the same memfd segment again right away because a session connected? 
>>>>> Because the hoststack app processes the dealloc message and the connected 
>>>>> message on different threads, rx_thread_fn may have just detached the memfd 
>>>>> segment and not yet re-attached it when, unfortunately, the worker thread 
>>>>> gets the connected message. 
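>>>>> 
>>>>> As a toy illustration of the suspected race (a standalone sketch, not VCL 
>>>>> code): one thread unmaps a shared segment while another thread still 
>>>>> dereferences a pointer into it, which faults just like the 
>>>>> vcl_session_connected_handler frame in the backtrace below. 
>>>>> 
>>>>> /* Not VCL code: toy reproduction of "segment detached by one thread while
>>>>>  * another thread still uses a pointer into it". Build with: cc -pthread */
>>>>> #include <pthread.h>
>>>>> #include <sys/mman.h>
>>>>> #include <unistd.h>
>>>>> 
>>>>> #define SEG_SIZE (1 << 20)
>>>>> 
>>>>> static void *seg;
>>>>> 
>>>>> static void *
>>>>> detach_thread (void *arg)
>>>>> {
>>>>>   (void) arg;
>>>>>   usleep (1000);            /* the "del segment" message is handled here */
>>>>>   munmap (seg, SEG_SIZE);   /* segment gone, but pointers to it remain */
>>>>>   return 0;
>>>>> }
>>>>> 
>>>>> int
>>>>> main (void)
>>>>> {
>>>>>   seg = mmap (0, SEG_SIZE, PROT_READ | PROT_WRITE,
>>>>>               MAP_SHARED | MAP_ANONYMOUS, -1, 0);
>>>>>   volatile int *rx_fifo = (int *) seg;  /* pointer taken from a queued event */
>>>>> 
>>>>>   pthread_t t;
>>>>>   pthread_create (&t, 0, detach_thread, 0);
>>>>> 
>>>>>   for (;;)
>>>>>     *rx_fifo = 1;   /* SIGSEGVs once the other thread unmaps the segment */
>>>>> }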
>>>>> 
>>>>> This is just my guess; maybe I misunderstand.
>>>>> 
>>>>> (gdb) bt
>>>>> #0  0x00007f7cde21ffbf in raise () from 
>>>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>>>> #1  0x0000000001190a64 in Envoy::SignalAction::sigHandler (sig=11, 
>>>>> info=<optimized out>, context=<optimized out>) at 
>>>>> source/common/signal/signal_action.cc:73
>>>>> #2  <signal handler called>
>>>>> #3  0x00007f7cddc2e85e in vcl_session_connected_handler 
>>>>> (wrk=0x7f7ccd4bad00, mp=0x224052f4a) at 
>>>>> /home/wanghanlin/vpp-new/src/vcl/vppcom.c:471
>>>>> #4  0x00007f7cddc37fec in vcl_epoll_wait_handle_mq_event 
>>>>> (wrk=0x7f7ccd4bad00, e=0x224052f48, events=0x395000c, 
>>>>> num_ev=0x7f7cca49e5e8)
>>>>>     at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2658
>>>>> #5  0x00007f7cddc3860d in vcl_epoll_wait_handle_mq (wrk=0x7f7ccd4bad00, 
>>>>> mq=0x224042480, events=0x395000c, maxevents=63, wait_for_time=0, 
>>>>> num_ev=0x7f7cca49e5e8)
>>>>>     at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2762
>>>>> #6  0x00007f7cddc38c74 in vppcom_epoll_wait_eventfd (wrk=0x7f7ccd4bad00, 
>>>>> events=0x395000c, maxevents=63, n_evts=0, wait_for_time=0)
>>>>>     at /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2823
>>>>> #7  0x00007f7cddc393a0 in vppcom_epoll_wait (vep_handle=33554435, 
>>>>> events=0x395000c, maxevents=63, wait_for_time=0) at 
>>>>> /home/wanghanlin/vpp-new/src/vcl/vppcom.c:2880
>>>>> #8  0x00007f7cddc5d659 in vls_epoll_wait (ep_vlsh=3, events=0x395000c, 
>>>>> maxevents=63, wait_for_time=0) at 
>>>>> /home/wanghanlin/vpp-new/src/vcl/vcl_locked.c:895
>>>>> #9  0x00007f7cdeb4c252 in ldp_epoll_pwait (epfd=67, events=0x3950000, 
>>>>> maxevents=64, timeout=32, sigmask=0x0) at 
>>>>> /home/wanghanlin/vpp-new/src/vcl/ldp.c:2334
>>>>> #10 0x00007f7cdeb4c334 in epoll_wait (epfd=67, events=0x3950000, 
>>>>> maxevents=64, timeout=32) at /home/wanghanlin/vpp-new/src/vcl/ldp.c:2389
>>>>> #11 0x0000000000fc9458 in epoll_dispatch ()
>>>>> #12 0x0000000000fc363c in event_base_loop ()
>>>>> #13 0x0000000000c09b1c in Envoy::Server::WorkerImpl::threadRoutine 
>>>>> (this=0x357d8c0, guard_dog=...) at source/server/worker_impl.cc:104
>>>>> #14 0x0000000001193485 in std::function<void ()>::operator()() const 
>>>>> (this=0x7f7ccd4b8544)
>>>>>     at 
>>>>> /usr/lib/gcc/x86_64-linux-gnu/7.4.0/../../../../include/c++/7.4.0/bits/std_function.h:706
>>>>> #15 Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::function<void 
>>>>> ()>)::$_0::operator()(void*) const (this=<optimized out>, arg=0x2f42e2480)
>>>>>     at source/common/common/posix/thread_impl.cc:33
>>>>> #16 Envoy::Thread::ThreadImplPosix::ThreadImplPosix(std::function<void 
>>>>> ()>)::$_0::__invoke(void*) (arg=0x2f42e2480) at 
>>>>> source/common/common/posix/thread_impl.cc:32
>>>>> #17 0x00007f7cde2164a4 in start_thread () from 
>>>>> /lib/x86_64-linux-gnu/libpthread.so.0
>>>>> #18 0x00007f7cddf58d0f in clone () from /lib/x86_64-linux-gnu/libc.so.6
>>>>> (gdb) f 3
>>>>> #3  0x00007f7cddc2e85e in vcl_session_connected_handler 
>>>>> (wrk=0x7f7ccd4bad00, mp=0x224052f4a) at 
>>>>> /home/wanghanlin/vpp-new/src/vcl/vppcom.c:471
>>>>> 471       rx_fifo->client_session_index = session_index;
>>>>> (gdb) p rx_fifo
>>>>> $1 = (svm_fifo_t *) 0x2f42e2480
>>>>> (gdb) p *rx_fifo
>>>>> Cannot access memory at address 0x2f42e2480
>>>>> (gdb)
>>>>> 
>>>>> 
>>>>> Regards,
>>>>> Hanlin
>>>> 
>>> 
>> 
