Trying to reproduce this, I'm seeing sporadic failures in the scheduler validation test that don't seem to appear in the base api-next branch. The issue seems to be failures in the ordered queue tests:

Test: scheduler_test_multi_mq_mt_prio_n ...linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
passed
Test: scheduler_test_multi_mq_mt_prio_a ...linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
passed
Test: scheduler_test_multi_mq_mt_prio_o ...linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)
FAILED
    1. scheduler.c:871  - bctx->sequence == seq
    2. scheduler.c:871  - bctx->sequence == seq
Test: scheduler_test_multi_1q_mt_a_excl ...linux.c:273:odpthread_run_start_routine():helper: ODP worker thread started as linux pthread. (pid=6274)

We had seen these earlier but they were never consistently reproducible. Petri: are you able to recreate this on your local systems?

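For anyone reading the log without the test source at hand: the failing assertion text (bctx->sequence == seq) suggests that each event carries a sequence number which is compared against an expected counter. Below is a minimal sketch of that kind of ordering check. Only the field name "sequence" comes from the log; buf_ctx_t and count_order_violations() are hypothetical names for illustration and are not taken from scheduler.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-buffer context. Only the 'sequence' field name
 * is taken from the assertion text in the log. */
typedef struct {
	uint64_t sequence;
} buf_ctx_t;

/* Count events whose sequence number differs from the expected one.
 * 'seq' starts at first_seq and advances by one per event, so any
 * reordering shows up as at least one mismatch. */
static size_t count_order_violations(const buf_ctx_t ctx[], size_t num,
				     uint64_t first_seq)
{
	uint64_t seq = first_seq;
	size_t violations = 0;
	size_t i;

	assert(ctx != NULL || num == 0);

	for (i = 0; i < num; i++) {
		if (ctx[i].sequence != seq)
			violations++;
		seq++;
	}

	return violations;
}
```

With this kind of check, two adjacent events arriving in swapped order produce two mismatches, one for each displaced event.
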
On Wed, Nov 16, 2016 at 2:03 PM, Maxim Uvarov <maxim.uva...@linaro.org> wrote:

> I can not test patch by patch this series because it fails (one time it
> was TM, one time kernel died, other time OOM killer killed tests then hang
> kernel).
>
> And for all patches test/common_plat/validation/api/pktio/pktio_main
> hangs forever:
>
>
> Program received signal SIGINT, Interrupt.
> 0x00002afbe69ffb80 in __nanosleep_nocancel () at
>     ../sysdeps/unix/syscall-template.S:81
> 81      in ../sysdeps/unix/syscall-template.S
> (gdb) bt
> #0  0x00002afbe69ffb80 in __nanosleep_nocancel () at
>     ../sysdeps/unix/syscall-template.S:81
> #1  0x0000000000415ced in odp_pktin_recv_tmo (queue=...,
>     packets=packets@entry=0x7ffed64d8bd0, num=num@entry=1,
>     wait=wait@entry=18446744073709551615) at
>     ../../../platform/linux-generic/odp_packet_io.c:1584
> #2  0x00000000004047fa in recv_packets_tmo (pktio=pktio@entry=0x2,
>     pkt_tbl=pkt_tbl@entry=0x7ffed64d9500,
>     seq_tbl=seq_tbl@entry=0x7ffed64d94b0, num=num@entry=1,
>     mode=mode@entry=RECV_TMO,
>     tmo=tmo@entry=18446744073709551615, ns=ns@entry=0)
>     at ../../../../../../test/common_plat/validation/api/pktio/pktio.c:515
> #3  0x00000000004075f8 in test_recv_tmo (mode=RECV_TMO) at
>     ../../../../../../test/common_plat/validation/api/pktio/pktio.c:940
> #4  0x00002afbe61cc482 in run_single_test () from
>     /usr/local/lib/libcunit.so.1
> #5  0x00002afbe61cc0b2 in run_single_suite () from
>     /usr/local/lib/libcunit.so.1
> #6  0x00002afbe61c9d55 in CU_run_all_tests () from
>     /usr/local/lib/libcunit.so.1
> #7  0x00002afbe61ce245 in basic_run_all_tests () from
>     /usr/local/lib/libcunit.so.1
> #8  0x00002afbe61cdfe7 in CU_basic_run_tests () from
>     /usr/local/lib/libcunit.so.1
> #9  0x0000000000409361 in odp_cunit_run () at
>     ../../../../test/common_plat/common/odp_cunit_common.c:298
> #10 0x00002afbe6c2ff45 in __libc_start_main (main=0x403850 <main>, argc=1,
>     argv=0x7ffed64d9878, init=<optimized out>,
>     fini=<optimized out>, rtld_fini=<optimized out>,
>     stack_end=0x7ffed64d9868) at libc-start.c:287
> #11 0x000000000040387e in _start ()
> (gdb) up
> #1  0x0000000000415ced in odp_pktin_recv_tmo (queue=...,
>     packets=packets@entry=0x7ffed64d8bd0, num=num@entry=1,
>     wait=wait@entry=18446744073709551615) at
>     ../../../platform/linux-generic/odp_packet_io.c:1584
> 1584                    nanosleep(&ts, NULL);
> (gdb) p ts
> $1 = {tv_sec = 0, tv_nsec = 1000}
> (gdb) l
> 1579                            }
> 1580
> 1581                            wait--;
> 1582                    }
> 1583
> 1584                    nanosleep(&ts, NULL);
> 1585            }
> 1586    }
> 1587
> 1588    int odp_pktin_recv_mq_tmo(const odp_pktin_queue_t queues[],
>         unsigned num_q,
> (gdb) up
> #2  0x00000000004047fa in recv_packets_tmo (pktio=pktio@entry=0x2,
>     pkt_tbl=pkt_tbl@entry=0x7ffed64d9500,
>     seq_tbl=seq_tbl@entry=0x7ffed64d94b0, num=num@entry=1,
>     mode=mode@entry=RECV_TMO,
>     tmo=tmo@entry=18446744073709551615, ns=ns@entry=0)
>     at ../../../../../../test/common_plat/validation/api/pktio/pktio.c:515
> 515                     n = odp_pktin_recv_tmo(pktin[0], pkt_tmp, num - num_rx,
> (gdb) p num - num_rx
> $2 = 1
> (gdb) l
> 510             /** Multiple odp_pktin_recv_tmo()/odp_pktin_recv_mq_tmo()
>                 calls may be
> 511              * required to discard possible non-test packets. */
> 512             do {
> 513                     ts1 = odp_time_global();
> 514                     if (mode == RECV_TMO)
> 515                             n = odp_pktin_recv_tmo(pktin[0], pkt_tmp, num - num_rx,
> 516                                                    tmo);
> 517                     else
> 518                             n = odp_pktin_recv_mq_tmo(pktin, (unsigned)num_q,
> 519                                                       from, pkt_tmp,
> (gdb) p tmo
> $3 = 18446744073709551615
>
>
> I applied patches and following script under root:
> CLEANUP=0 GIT_URL=/opt/Linaro/odp3.git GIT_BRANCH=api-next ./build.sh
>
> Need more investigation into this issue... Not applied yet.
>
> Maxim.
>
> On 11/16/16 02:58, Bill Fischofer wrote:
>
>> Trying again as the repost doesn't seem to show up on the list either.
>>
>> For this series:
>>
>> Reviewed-and-tested-by: Bill Fischofer <bill.fischo...@linaro.org>
>>
>> On Tue, Nov 15, 2016 at 5:55 PM, Bill Fischofer <bill.fischo...@linaro.org> wrote:
>>
>>   Reposting this since it doesn't seem to have made it to the
>>   mailing list.
>>
>>   For this series:
>>
>>   Reviewed-and-tested-by: Bill Fischofer <bill.fischo...@linaro.org>
>>
>>   On Tue, Nov 15, 2016 at 8:41 AM, Bill Fischofer
>>   <bill.fischo...@linaro.org> wrote:
>>
>>     For this series:
>>
>>     Reviewed-and-tested-by: Bill Fischofer <bill.fischo...@linaro.org>
>>
>>     On Thu, Nov 10, 2016 at 5:07 AM, Petri Savolainen
>>     <petri.savolai...@nokia.com> wrote:
>>
>>       Pool performance is optimized by using a ring as the global
>>       buffer storage. IPC build is disabled, since it needs large
>>       modifications due to dependency to pool internals. Old pool
>>       implementation was based on locks and linked list of buffer
>>       headers. New implementation maintain a ring of buffer handles,
>>       which enable fast, burst based allocs and frees. Also ring
>>       scales better with number of cpus than a list (enq and deq
>>       operations update opposite ends of the pool).
>>
>>       L2fwd link rate (%), 2 x 40GE, 64 byte packets
>>
>>             direct-              parallel-            atomic-
>>       cpus   orig direct   diff   orig parall   diff   orig atomic   diff
>>          1    7 %    8 %    1 %    6 %    6 %    2 %  5.4 %  5.6 %    4 %
>>          2   14 %   15 %    7 %    9 %    9 %    5 %    8 %    9 %    8 %
>>          4   28 %   30 %    6 %   13 %   14 %   13 %   12 %   15 %   19 %
>>          6   42 %   44 %    6 %   16 %   19 %   19 %    8 %   20 %  150 %
>>          8   46 %   59 %   28 %   19 %   23 %   26 %   18 %   24 %   34 %
>>         10   55 %   57 %    3 %   20 %   27 %   37 %    8 %   28 %  264 %
>>         12   56 %   56 %   -1 %   22 %   31 %   43 %    7 %   32 %  357 %
>>
>>       Max packet rate of NICs are reached with 10-12 cpu on direct
>>       mode. Otherwise, all cases were improved. Especially, scheduler
>>       driven cases suffered on bad pool scalability.
>>
>>       changed in v3:
>>       * rebased
>>       * ipc disabled with #ifdef
>>       * added support for multi-segment packets
>>       * API: added explicit limits for packet length in alloc calls
>>       * Corrected validation test and example application bugs found
>>         during segmentation implementation
>>
>>       changed in v2:
>>       * rebased to api-next branch
>>       * added a comment that ring size must be larger than number of
>>         items in it
>>       * fixed clang build issue
>>       * added parens in align macro
>>
>>       v1 reviews:
>>       Reviewed-by: Brian Brooks <brian.bro...@linaro.org>
>>
>>
>>       Petri Savolainen (19):
>>         linux-gen: ipc: disable build of ipc pktio
>>         linux-gen: pktio: do not free zero packets
>>         linux-gen: ring: created common ring implementation
>>         linux-gen: align: added round up power of two
>>         linux-gen: pool: reimplement pool with ring
>>         linux-gen: ring: added multi enq and deq
>>         linux-gen: pool: use ring multi enq and deq operations
>>         linux-gen: pool: optimize buffer alloc
>>         linux-gen: pool: clean up pool inlines functions
>>         linux-gen: pool: ptr instead of hdl in buffer_alloc_multi
>>         test: validation: buf: test alignment
>>         test: performance: crypto: use capability to select max packet
>>         test: correctly initialize pool parameters
>>         test: validation: packet: fix bugs in tailroom and concat tests
>>         linux-gen: packet: added support for segmented packets
>>         test: validation: packet: improved multi-segment alloc test
>>         api: packet: added limits for packet len on alloc
>>         linux-gen: packet: remove zero len support from alloc
>>         linux-gen: packet: enable multi-segment packets
>>
>>        example/generator/odp_generator.c                  |    2 +-
>>        include/odp/api/spec/packet.h                      |    9 +-
>>        include/odp/api/spec/pool.h                        |    6 +
>>        platform/linux-generic/Makefile.am                 |    1 +
>>        .../include/odp/api/plat/packet_types.h            |    6 +-
>>        .../include/odp/api/plat/pool_types.h              |    6 -
>>        .../linux-generic/include/odp_align_internal.h     |   34 +-
>>        .../linux-generic/include/odp_buffer_inlines.h     |  167 +--
>>        .../linux-generic/include/odp_buffer_internal.h    |  120 +-
>>        .../include/odp_classification_datamodel.h         |    2 +-
>>        .../linux-generic/include/odp_config_internal.h    |   55 +-
>>        .../linux-generic/include/odp_packet_internal.h    |   87 +-
>>        platform/linux-generic/include/odp_pool_internal.h |  289 +---
>>        platform/linux-generic/include/odp_ring_internal.h |  176 +++
>>        .../linux-generic/include/odp_timer_internal.h     |    4 -
>>        platform/linux-generic/odp_buffer.c                |   22 +-
>>        platform/linux-generic/odp_classification.c        |   25 +-
>>        platform/linux-generic/odp_crypto.c                |   12 +-
>>        platform/linux-generic/odp_packet.c                |  717 ++++++++--
>>        platform/linux-generic/odp_packet_io.c             |    2 +-
>>        platform/linux-generic/odp_pool.c                  | 1440 ++++++++------------
>>        platform/linux-generic/odp_queue.c                 |    4 +-
>>        platform/linux-generic/odp_schedule.c              |  102 +-
>>        platform/linux-generic/odp_schedule_ordered.c      |    4 +-
>>        platform/linux-generic/odp_timer.c                 |    3 +-
>>        platform/linux-generic/pktio/dpdk.c                |   10 +-
>>        platform/linux-generic/pktio/ipc.c                 |    3 +-
>>        platform/linux-generic/pktio/loop.c                |    2 +-
>>        platform/linux-generic/pktio/netmap.c              |   14 +-
>>        platform/linux-generic/pktio/socket.c              |   17 +-
>>        platform/linux-generic/pktio/socket_mmap.c         |   10 +-
>>        test/common_plat/performance/odp_crypto.c          |   47 +-
>>        test/common_plat/performance/odp_pktio_perf.c      |    2 +-
>>        test/common_plat/performance/odp_scheduling.c      |    8 +-
>>        test/common_plat/validation/api/buffer/buffer.c    |  113 +-
>>        test/common_plat/validation/api/crypto/crypto.c    |    2 +-
>>        test/common_plat/validation/api/packet/packet.c    |   96 +-
>>        test/common_plat/validation/api/pktio/pktio.c      |   21 +-
>>        38 files changed, 1745 insertions(+), 1895 deletions(-)
>>        create mode 100644 platform/linux-generic/include/odp_ring_internal.h
>>
>>       --
>>       2.8.1
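
As background for the cover letter quoted above, which describes a ring of buffer handles with burst enqueue and dequeue, a power-of-two round-up helper, and the rule that the ring must be larger than the number of items stored in it: here is a minimal single-threaded sketch of those three ideas. It is not the code from odp_ring_internal.h, which is not shown in this thread. All names (round_up_pow2, ring_t, ring_enq_multi, ring_deq_multi, RING_SIZE) are hypothetical, and a real multi-threaded ring would need atomic operations that this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; not the code from odp_ring_internal.h. */

/* Round x up to the next power of two (x must be in 1..2^31). */
static uint32_t round_up_pow2(uint32_t x)
{
	assert(x >= 1 && x <= (UINT32_C(1) << 31));
	x--;
	x |= x >> 1;
	x |= x >> 2;
	x |= x >> 4;
	x |= x >> 8;
	x |= x >> 16;
	return x + 1;
}

/* Power of two, and strictly larger than the number of stored items */
#define RING_SIZE 8u
#define RING_MASK (RING_SIZE - 1u)

typedef struct {
	uint32_t head;            /* next slot to dequeue from (free running) */
	uint32_t tail;            /* next slot to enqueue to (free running) */
	uint32_t data[RING_SIZE]; /* buffer handles */
} ring_t;

static void ring_init(ring_t *ring)
{
	ring->head = 0;
	ring->tail = 0;
}

/* Enqueue a burst. The caller guarantees that the ring never fills up. */
static void ring_enq_multi(ring_t *ring, const uint32_t hdl[], uint32_t num)
{
	uint32_t i;

	assert(ring->tail - ring->head + num < RING_SIZE);

	for (i = 0; i < num; i++)
		ring->data[(ring->tail + i) & RING_MASK] = hdl[i];

	ring->tail += num;
}

/* Dequeue up to num handles; returns how many were actually dequeued. */
static uint32_t ring_deq_multi(ring_t *ring, uint32_t hdl[], uint32_t num)
{
	uint32_t avail = ring->tail - ring->head;
	uint32_t i;

	if (num > avail)
		num = avail;

	for (i = 0; i < num; i++)
		hdl[i] = ring->data[(ring->head + i) & RING_MASK];

	ring->head += num;
	return num;
}
```

In this sketch, enqueue only touches tail and dequeue only touches head, which is the "opposite ends" property the cover letter mentions. A pool holding N buffers could size its ring as round_up_pow2(N + 1), so the ring stays strictly larger than the item count and index wrapping reduces to a mask.
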