On 2018-05-07 06:44 PM, Dylan Baker wrote:
> Quoting Tomi Sarvela (2018-05-07 01:20:46)
>> On 05/07/2018 10:17 AM, Tomi Sarvela wrote:
>>> On 05/04/2018 07:57 PM, Dylan Baker wrote:
>>>> Quoting Juan A. Suarez Romero (2018-05-04 04:50:27)
>>>>> On Fri, 2018-05-04 at 12:03 +0200, Juan A. Suarez Romero wrote:
>>>>>> On Wed, 2018-05-02 at 13:57 -0700, Dylan Baker wrote:
>>>>>>> Quoting Juan A. Suarez Romero (2018-05-02 09:49:08)
>>>>>>>> Hi, Dylan.
>>>>>>>>
>>>>>>>> I see you've pushed this series.
>>>>>>>>
>>>>>>>> Now, when I'm trying to run some profiles (mainly, tests/crucible and
>>>>>>>> tests/khr_gl* ), seems they are broken:
>>>>>>>>
>>>>>>>> [0000/7776]
>>>>>>>> Traceback (most recent call last):
>>>>>>>> File "./piglit", line 178, in <module>
>>>>>>>> main()
>>>>>>>> File "./piglit", line 174, in main
>>>>>>>> sys.exit(runner(args))
>>>>>>>> File "/home/igalia/jasuarez/piglit/framework/exceptions.py", line 51, in _inner
>>>>>>>> func(*args, **kwargs)
>>>>>>>> File "/home/igalia/jasuarez/piglit/framework/programs/run.py", line 370, in run
>>>>>>>> backend.finalize({'time_elapsed': time_elapsed.to_json()})
>>>>>>>> File "/home/igalia/jasuarez/piglit/framework/backends/json.py", line 163, in finalize
>>>>>>>> assert data['tests']
>>>>>>>> AssertionError
>>>>>>>>
>>>>>>>> J.A.
>>>>>>>>
>>>>>>>
>>>>>>> Dang.
>>>>>>>
>>>>>>> I can't reproduce any failures with crucible, though I did make it thread safe
>>>>>>> and fix the using a config file :)
>>>>>>>
>>>>>>> I can't get the glcts binary to run, no matter what target I build for I run
>>>>>>> into either EGL errors of GL errors.
>>>>>>>
>>>>>>
>>>>>> More info on this issue.
>>>>>>
>>>>>> It seems it happens with the profiles that requires to use an external runner
>>>>>> (crucible, vk-gl-cts, deqp, ...).
>>>>>>
>>>>>>
>>>>>> When executing, it tells it will run all the tests, but sometimes it just
>>>>>> execute one test, other times 2, and other times none. It is in the last case
>>>>>> when the error above is shown.
>>>>>>
>>>>>> Still don't know why.
>>>>>>
>>>>>
>>>>>
>>>>> Found the problem in this commit:
>>>>>
>>>>> commit 9461d92301e72807eba4776a16a05207e3a16477
>>>>> Author: Dylan Baker <dy...@pnwbakers.com>
>>>>> Date: Mon Mar 26 15:23:17 2018 -0700
>>>>>
>>>>> framework/profile: Add a __len__ method to TestProfile
>>>>> This exposes a standard interface for getting the number of tests in a
>>>>> profile, which is itself nice. It will also allow us to encapsulate the
>>>>> differences between the various profiles added in this series.
>>>>> Tested-by: Rafael Antognolli <rafael.antogno...@intel.com>
>>>>>
>>>>>
>>>>
>>>> I'm really having trouble reproducing this, the vulkan cts and crucible both run
>>>> fine for me, no matter how many times I stop and start them. I even tried with
>>>> python2 and couldn't reproduce. Can you give me some more information about your
>>>> system?
>>>
>>> I think I've hit this same issue on our CI.
>>>
>>> Symptoms match so that we sometimes run the whole 25k piglit gbm
>>> testset, sometimes we stop around the test 400-600. This behaviour can
>>> change with subsequent runs without rebooting the machine. Test where
>>> run is stopped is usually the same, and changes if filters change.
>>>
>>> I can reproduce this with -d / --dry-run so the tests themselves are not
>>> an issue. Filtering with large -x / --exclude-tests might play a part.
>>> The command line is max 25kB, so there shouldn't be cutoff point with
>>> partial regex, which then would match too much.
>>>
>>> I'm just starting to investigate where does the test list size drop so
>>> dramatically, probably by inserting testlist size debugs around to see
>>> where it takes me.
>>>
>>> Environment: Ubuntu 18.04 LTS with default mesa
>>> Kernel: DRM-Tip HEAD or Ubuntu default.
>>>
>>> Commandline is built with bash array from blacklist. This looks correct,
>>> and sometimes works correctly. Eg
>>>
>>> ./piglit run tests/gpu ~/results -d -o -l verbose "${OPTIONS[@]}"
>>>
>>> where $OPTIONS is an array of
>>> '-x', 'timestamp-get',
>>> '-x', 'glsl-routing', ...
>>>
>>> Successful CI runlog:
>>> http://gfx-ci.fi.intel.com/tree/drm-tip/CI_DRM_4148/pig-glk-j5005/run0.log
>>>
>>> Unsuccessful CI runlog:
>>> http://gfx-ci.fi.intel.com/tree/drm-tip/CI_DRM_4149/pig-glk-j5005/run0.log
>>>
>>> Between those two runs, only kernel has changed.
>>>
>>> The issue is easiest to reproduce with GLK. HSW seems to be somewhat
>>> affected too, so the host speed might play a part.
>>
>> Patch below makes the issue disappear for my GLK testrig.
>>
>> With multiprocessing.pool.imap I'm getting rougly 50% correct behaviour
>> and 50% early exists on dry-runs.
>>
>> With multiprocessing.pool.map I'm not getting early exists at all.
>>
>> Sample size is ~50 runs for both setups.
>>
>> With the testset of 26179 on GLK dry-run, the runtime difference is
>> negligible: pool.map 49s vs pool.imap 50s
>>
>>
>>
>> piglit/framework$ diff -c profile.py.orig profile.py
>> *** profile.py.orig 2018-05-07 19:11:37.649994643 +0300
>> --- profile.py 2018-05-07 19:11:46.880994608 +0300
>> ***************
>> *** 584,591 ****
>> # more code, and adding side-effects
>> test_list = (x for x in test_list if filterby(x))
>>
>> ! pool.imap(lambda pair: test(pair[0], pair[1], profile, pool),
>> ! test_list, chunksize)
>>
>> def run_profile(profile, test_list):
>> """Run an individual profile."""
>> --- 584,591 ----
>> # more code, and adding side-effects
>> test_list = (x for x in test_list if filterby(x))
>>
>> ! pool.map(lambda pair: test(pair[0], pair[1], profile, pool),
>> ! test_list, chunksize)
>>
>> def run_profile(profile, test_list):
>> """Run an individual profile."""
>>
>>
>> Tomi
>
> Juan, can you test this patch and see if it resolves your issue as well? I'm not
> sure why this is fixing things, but if it does I'm happy to merge it and deal
> with any performance problems it introduces later.
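
For reference, a minimal sketch of the documented difference between the two calls in the patch above: Pool.map() blocks until every item has been processed, while Pool.imap() returns a lazy iterator immediately. Whether that is really what bites piglit here is an assumption, not something established in this thread. The names run_one, run_with_map and run_with_imap are hypothetical stand-ins and not piglit code; the sketch uses the thread-backed multiprocessing.dummy.Pool so that the lambda does not need to be pickled.

```python
import time
from multiprocessing.dummy import Pool  # thread-backed pool, lambdas are fine


def run_one(pair, done):
    """Hypothetical stand-in for a test runner: sleep, then record the name."""
    name, delay = pair
    time.sleep(delay)
    done.append(name)  # list.append is atomic under the GIL


def run_with_map(tests, chunksize=2):
    """Pool.map() returns only after every item has been processed."""
    done = []
    pool = Pool(4)
    pool.map(lambda pair: run_one(pair, done), tests, chunksize)
    pool.close()
    pool.join()
    return done


def run_with_imap(tests, chunksize=2):
    """Pool.imap() returns a lazy iterator; draining it makes completion explicit."""
    done = []
    pool = Pool(4)
    for _ in pool.imap(lambda pair: run_one(pair, done), tests, chunksize):
        pass  # results are discarded; we only wait for each one to arrive
    pool.close()
    pool.join()
    return done


if __name__ == "__main__":
    names = ["test-%d" % i for i in range(50)]
    # Generators mirror the filtered test_list in the quoted patch.
    print(len(run_with_map((n, 0.001) for n in names)))
    print(len(run_with_imap((n, 0.001) for n in names)))
```

If imap's laziness is desirable, draining the returned iterator as in run_with_imap keeps it while still waiting for every result before the pool is closed.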

FWIW, this patch doesn't fix the gpu profile running a lot fewer tests now
than it did before 9461d92301e72807eba4776a16a05207e3a16477. I'm also using -x.

-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer
_______________________________________________
Piglit mailing list
Piglit@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/piglit