On Wed, Dec 21, 2022 at 11:49:41AM -0800, Stephen Hemminger wrote:
> On Wed, 21 Dec 2022 11:33:36 -0800
> Tyler Retzlaff <roret...@linux.microsoft.com> wrote:
> 
> > On Wed, Dec 21, 2022 at 11:03:33AM -0800, Stephen Hemminger wrote:
> > > On Wed, 21 Dec 2022 10:13:49 -0800
> > > Tyler Retzlaff <roret...@linux.microsoft.com> wrote:
> > >   
> > > > hi folks,
> > > > 
> > > > are there any additional requirements that may not be documented for
> > > > running the DPDK:fast-tests suite in a vm on linux?
> > > > 
> > > > if i run the suite as follows i end up getting intermittent failures.
> > > > are there known issues running in a vm?
> > > > 
> > > >   meson test -C u --test-args='--no-huge' --suite fast-tests
> > > >   
> > > > RTE>>fbarray_autotest    
> > > >  + ------------------------------------------------------- +
> > > >  + Test Suite : fbarray autotest
> > > > 
> > > > Thread 1 "dpdk-test" received signal SIGBUS, Bus error.
> > > > __memset_evex_unaligned_erms () at
> > > > ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:250
> > > > 250     ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S: No such
> > > > file or directory.
> > > > (gdb) bt
> > > > #0  __memset_evex_unaligned_erms () at
> > > > ../sysdeps/x86_64/multiarch/memset-vec-unaligned-erms.S:250
> > > > #1  0x0000555556022d5b in rte_fbarray_init (arr=0x55555e3566c0 <param>,
> > > >     name=0x55555bebdb28 "fbarray_autotest", len=256, elt_sz=4) at 
> > > > ../lib/eal/common/eal_common_fbarray.c:802
> > > > #2  0x00005555557b24af in autotest_setup () at 
> > > > ../app/test/test_fbarray.c:31
> > > > #3  0x0000555555636f03 in unit_test_suite_runner (suite=0x55555c846140 
> > > > <fbarray_test_suite>)
> > > >     at ../app/test/test.c:306
> > > > #4  0x00005555557b5ffc in test_fbarray () at 
> > > > ../app/test/test_fbarray.c:733
> > > > #5  0x00005555556292c4 in cmd_autotest_parsed 
> > > > (parsed_result=0x7fffffff8140, cl=0x55556a8b2fa0, data=0x0)
> > > >     at ../app/test/commands.c:68
> > > > #6  0x0000555555fb3294 in __cmdline_parse (cl=0x55556a8b2fa0, 
> > > > buf=0x55556a8b2fe8 "fbarray_autotest\n",
> > > >     call_fn=true) at ../lib/cmdline/cmdline_parse.c:294
> > > > #7  0x0000555555fb32dc in cmdline_parse (cl=0x55556a8b2fa0, 
> > > > buf=0x55556a8b2fe8 "fbarray_autotest\n")
> > > >     at ../lib/cmdline/cmdline_parse.c:302
> > > > #8  0x0000555555fb1577 in cmdline_valid_buffer (rdl=0x55556a8b2fb0, 
> > > > buf=0x55556a8b2fe8 "fbarray_autotest\n",
> > > >     size=18) at ../lib/cmdline/cmdline.c:24
> > > > #9  0x0000555555fb667d in rdline_char_in (rdl=0x55556a8b2fb0, c=10 
> > > > '\n') at ../lib/cmdline/cmdline_rdline.c:444
> > > > #10 0x0000555555fb19bd in cmdline_in (cl=0x55556a8b2fa0, 
> > > > buf=0x7fffffffe3f0 "fbarray_autotest\n", size=17)
> > > >     at ../lib/cmdline/cmdline.c:146
> > > > #11 0x0000555555636a34 in main (argc=2, argv=0x7fffffffe930) at 
> > > > ../app/test/test.c:208
> > > > 
> > > > thanks!  
> > > 
> > > Were hugepages setup before running the test? Was there enough memory 
> > > available?  
> > 
> > hugepages are not setup on the system, i explicitly pass --no-huge or
> > does that not do what i think it should?
> > 
> > the vm has fixed 16GB memory, stepping through the code i do not see
> > internal allocation calls fail prior to the failure.
> > 
> > thanks
> 
> Could be that the allocation gives bogus memory?

i'm not sure i understand the question. do you mean the allocator has a
bug and is giving me space that is in fact not allocated/mapped?

i'm kind of sensing that maybe this is related to some hugepage code
still running even when --no-huge is passed, but i'm not familiar enough
with the code paths to spot what doesn't belong.

a couple of side thoughts.

  this is very easy to reproduce. maybe someone who has a vm hanging around
  and more knowledge can verify my observation that there is a problem and
  it isn't just finger trouble on my part?

  we could / should perhaps expand the CI to also run tests with hugepages
  disabled both on the system and in the test runs (--no-huge).

in the absence of anyone else taking a look i'll see if i can find time
to walk through the code more carefully to understand what is actually
breaking.

thanks for the suggestions.
