On 05/20/17 21:59, Bill Fischofer wrote:
>
> On Sat, May 20, 2017 at 1:38 PM, Maxim Uvarov <[email protected]> wrote:
>
>     On 05/20/17 01:07, Bill Fischofer wrote:
>     > Something is clearly going on as I'm seeing the pktio/pktio_ipc_run.sh
>     > script suddenly start failing for completely unrelated changes.
>     >
>
>     Can you reproduce it manually? I can not reproduce it locally
>     and here looks like it passed for the latest code (if not take in
>     account that check patch failed):
>
>     https://github.com/Linaro/odp/commits/api-next
>
> Yes, I can. I've been rebasing my packet reference patch which I'm
> trying to put into a pull request. You can see the failure in my local
> GitHub repo
> at https://travis-ci.org/Bill-Fischofer-Linaro/odp/jobs/234169324
>
> I can reproduce the failure on my local Linux system as well, but I
> don't understand the output. This is the output I get after I apply part
> 9 of my patch (things work normally for the patches before that):
>
> ==== run pktio_ipc1 then pktio_ipc2 ====
> _ishm.c:866:_odp_ishm_reserve():No huge pages, fall back to normal
> pages. check: /proc/sys/vm/nr_hugepages.
> _ishm.c:866:_odp_ishm_reserve():No huge pages, fall back to normal
> pages. check: /proc/sys/vm/nr_hugepages.
> PKTIO: initialized loop interface.
> PKTIO: initialized pcap interface.
> PKTIO: initialized ipc interface.
> PKTIO: initialized socket mmap, use export
> ODP_PKTIO_DISABLE_SOCKET_MMAP=1 to disable.
> PKTIO: initialized socket mmsg,use export
> ODP_PKTIO_DISABLE_SOCKET_MMSG=1 to disable.
>
> ODP system info
> ---------------
> ODP API version: 1.14.0
> CPU model: Intel(R) Core(TM) i7-4790K CPU
>
> Running ODP appl: "pktio_ipc1"
> -----------------
> Using IF: (null)
>
>
> pktio_ipc1.c:188:pktio_run_loop():head.seq 0 - cnt_recv 1 =
> 18446744073709551615
> pktio_ipc2.c:107:ipc_second_process():exit after 5 seconds
> PKTIO: initialized loop interface.
> PKTIO: initialized pcap interface.
> PKTIO: initialized ipc interface.
> PKTIO: initialized socket mmap, use export
> ODP_PKTIO_DISABLE_SOCKET_MMAP=1 to disable.
> PKTIO: initialized socket mmsg,use export
> ODP_PKTIO_DISABLE_SOCKET_MMSG=1 to disable.
> pid: 13402, create IPC pktio ipc:13401:ipktio
> normal exit of 2 application
That means that the second application exited normally, using the time-out
value set on the command line.

> srwxr-xr-x 1 bill bill 0 May 19 20:00 /tmp/odp-13401-fdserver
> -rw-r--r-- 1 bill bill 4096 May 19 20:00
> /tmp/odp-13401-ishm-ipc:ipktio_info
> -rw-r--r-- 1 bill bill 36864 May 19 20:00
> /tmp/odp-13401-ishm-ipc:ipktio_m_cons
> -rw-r--r-- 1 bill bill 36864 May 19 20:00
> /tmp/odp-13401-ishm-ipc:ipktio_m_prod
> -rw-r--r-- 1 bill bill 36864 May 19 20:00
> /tmp/odp-13401-ishm-ipc:ipktio_s_cons
> -rw-r--r-- 1 bill bill 36864 May 19 20:00
> /tmp/odp-13401-ishm-ipc:ipktio_s_prod
> -rw-r--r-- 1 bill bill 71827456 May 19 20:00
> /tmp/odp-13401-ishm-ipc_packet_pool
> -rw-r--r-- 1 bill bill 168 May 19 20:00
> /tmp/odp-13401-shm-ipc:ipktio_info
> -rw-r--r-- 1 bill bill 176 May 19 20:00
> /tmp/odp-13401-shm-ipc:ipktio_m_cons
> -rw-r--r-- 1 bill bill 176 May 19 20:00
> /tmp/odp-13401-shm-ipc:ipktio_m_prod
> -rw-r--r-- 1 bill bill 176 May 19 20:00
> /tmp/odp-13401-shm-ipc:ipktio_s_cons
> -rw-r--r-- 1 bill bill 176 May 19 20:00
> /tmp/odp-13401-shm-ipc:ipktio_s_prod
> -rw-r--r-- 1 bill bill 177 May 19 20:00
> /tmp/odp-13401-shm-ipc_packet_pool
> !!!First stage FAILED 255!!!

That means that the first application did not complete its termination
functions without errors, i.e. pool destroy (odp-13401-shm-ipc_packet_pool)
and shm destroy (odp-13401-shm-...) did not destroy their resources. Most
likely the pool still has a packet which was not freed.

From the email log I do not see the counters for packet exchange between
the apps increasing. That is strange.

>
> Any suggestions for how to debug this? The patch that causes the failure
> is some restructuring of the odp_packet_free() and
> odp_packet_free_multi() routines to deal with packet references, so I
> don't see why pktio_ipc should be affected by that (or by the changes
> you posted either) but obviously there is some sensitivity that's going on.
>

Manually run the same thing the shell script does, from 2 consoles.
Turn on debug prints:

./platform/linux-generic/pktio/ipc.c:#define IPC_ODP_DEBUG_PRINT 1

Check why the pool was not destroyed correctly. In a multi-process scenario,
if one pool is used by 2 applications, you never know whether the second
application is still alive and can process packets from that pool. If you
play with references, it might be a case where the first application waits
for packets to be freed by the second application, but because the second
application has quit, that never happens. So pool destroy might fail due to
an incremented packet reference count or something like that.

Maxim.

>
>
>     > On Fri, May 19, 2017 at 5:02 PM, Dmitry Eremin-Solenikov <
>     > [email protected]> wrote:
>     >
>     >> Since one of last updates I'm receiving timeouts from Travis CI because
>     >> some tests run silently taking too much time to complete.
>     >>
>
>     Dmitry, I saw that with your patch for recursive atomic changes. But I
>     think it's not applied.
>
>     Maxim.
>
>     >> --
>     >> With best wishes
>     >> Dmitry
>     >>
>     >
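
The pool-destroy scenario described above (one process hands packets to a peer, the peer quits before freeing them, and destroy then fails) can be sketched with a toy model. This is only an illustration under stated assumptions: it is not the ODP pool implementation, and every name in it (toy_pool, toy_alloc, toy_free, toy_destroy, simulate) is hypothetical. It simply assumes that a pool refuses to be destroyed while any buffer is still outstanding.

```c
#include <stdio.h>

/* Toy model only: NOT the ODP pool implementation or API.
 * All names here (toy_pool, toy_alloc, ...) are hypothetical. */
struct toy_pool {
	int outstanding; /* buffers handed out and not yet freed */
};

/* Hand out one buffer; the pool records that it is in use. */
static void toy_alloc(struct toy_pool *pool)
{
	pool->outstanding++;
}

/* Return one buffer to the pool. */
static void toy_free(struct toy_pool *pool)
{
	if (pool->outstanding > 0)
		pool->outstanding--;
}

/* Destroy succeeds (0) only when nothing is outstanding, else -1. */
static int toy_destroy(const struct toy_pool *pool)
{
	if (pool->outstanding != 0) {
		fprintf(stderr, "destroy failed: %d buffer(s) still in use\n",
			pool->outstanding);
		return -1;
	}
	return 0;
}

/* Simulate the two-process exchange: the first process allocates
 * `sent` packets, the peer frees `freed_by_peer` of them before it
 * exits, then the first process tries to destroy the pool. */
int simulate(int sent, int freed_by_peer)
{
	struct toy_pool pool = { 0 };
	int i;

	for (i = 0; i < sent; i++)
		toy_alloc(&pool);
	for (i = 0; i < freed_by_peer; i++)
		toy_free(&pool);

	return toy_destroy(&pool);
}
```

In this model, simulate(3, 3) returns 0, while simulate(3, 2) returns -1 because one buffer is never given back. If the real failure has the same shape, the debug prints should show which packet (or extra reference to it) is still held when the first application calls its termination functions.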
