Re: [Piglit] [PATCH 00/35] Serialize profiles into XML at build time

Michel Dänzer Mon, 07 May 2018 09:50:35 -0700

On 2018-05-07 06:44 PM, Dylan Baker wrote:
> Quoting Tomi Sarvela (2018-05-07 01:20:46)
>> On 05/07/2018 10:17 AM, Tomi Sarvela wrote:
>>> On 05/04/2018 07:57 PM, Dylan Baker wrote:
>>>> Quoting Juan A. Suarez Romero (2018-05-04 04:50:27)
>>>>> On Fri, 2018-05-04 at 12:03 +0200, Juan A. Suarez Romero wrote:
>>>>>> On Wed, 2018-05-02 at 13:57 -0700, Dylan Baker wrote:
>>>>>>> Quoting Juan A. Suarez Romero (2018-05-02 09:49:08)
>>>>>>>> Hi, Dylan.
>>>>>>>>
>>>>>>>> I see you've pushed this series.
>>>>>>>>
>>>>>>>> Now, when I'm trying to run some profiles (mainly, tests/crucible and
>>>>>>>> tests/khr_gl* ), seems they are broken:
>>>>>>>>
>>>>>>>> [0000/7776]
>>>>>>>> Traceback (most recent call last):
>>>>>>>>    File "./piglit", line 178, in <module>
>>>>>>>>      main()
>>>>>>>>    File "./piglit", line 174, in main
>>>>>>>>      sys.exit(runner(args))
>>>>>>>>    File "/home/igalia/jasuarez/piglit/framework/exceptions.py", line 51, in _inner
>>>>>>>>      func(*args, **kwargs)
>>>>>>>>    File "/home/igalia/jasuarez/piglit/framework/programs/run.py", line 370, in run
>>>>>>>>      backend.finalize({'time_elapsed': time_elapsed.to_json()})
>>>>>>>>    File "/home/igalia/jasuarez/piglit/framework/backends/json.py", line 163, in finalize
>>>>>>>>      assert data['tests']
>>>>>>>> AssertionError
>>>>>>>>
>>>>>>>>          J.A.
>>>>>>>>
>>>>>>>
>>>>>>> Dang.
>>>>>>>
>>>>>>> I can't reproduce any failures with crucible, though I did make it
>>>>>>> thread safe and fix its use of a config file :)
>>>>>>>
>>>>>>> I can't get the glcts binary to run; no matter what target I build
>>>>>>> for, I run into either EGL errors or GL errors.
>>>>>>>
>>>>>>
>>>>>> More info on this issue.
>>>>>>
>>>>>> It seems to happen with the profiles that require an external runner
>>>>>> (crucible, vk-gl-cts, deqp, ...).
>>>>>>
>>>>>>
>>>>>> When executing, it reports that it will run all the tests, but
>>>>>> sometimes it executes just one test, other times two, and other times
>>>>>> none. The error above is shown in the last case.
>>>>>>
>>>>>> Still don't know why.
>>>>>>
>>>>>
>>>>>
>>>>> Found the problem in this commit:
>>>>>
>>>>> commit 9461d92301e72807eba4776a16a05207e3a16477
>>>>> Author: Dylan Baker <dy...@pnwbakers.com>
>>>>> Date:   Mon Mar 26 15:23:17 2018 -0700
>>>>>
>>>>>      framework/profile: Add a __len__ method to TestProfile
>>>>>      This exposes a standard interface for getting the number of tests in a
>>>>>      profile, which is itself nice. It will also allow us to encapsulate the
>>>>>      differences between the various profiles added in this series.
>>>>>      Tested-by: Rafael Antognolli <rafael.antogno...@intel.com>
>>>>>
>>>>>
>>>>
>>>> I'm really having trouble reproducing this. The vulkan cts and crucible
>>>> both run fine for me, no matter how many times I stop and start them. I
>>>> even tried with python2 and couldn't reproduce it. Can you give me some
>>>> more information about your system?
>>>
>>> I think I've hit this same issue on our CI.
>>>
>>> The symptoms match: sometimes we run the whole 25k piglit gbm testset,
>>> sometimes we stop around test 400-600. This behaviour can change between
>>> subsequent runs without rebooting the machine. The test at which the run
>>> stops is usually the same, and it changes if the filters change.
>>>
>>> I can reproduce this with -d / --dry-run, so the tests themselves are not
>>> the issue. Filtering with a large number of -x / --exclude-tests options
>>> might play a part. The command line is at most 25kB, so it shouldn't be
>>> cut off in a way that leaves a partial regex, which would then match too
>>> much.
>>>
>>> I'm just starting to investigate where the test list size drops so
>>> dramatically, probably by inserting test list size debug output in
>>> various places to see where it takes me.
>>>
>>> Environment: Ubuntu 18.04 LTS with default mesa
>>> Kernel: DRM-Tip HEAD or Ubuntu default.
>>>
>>> The command line is built with a bash array from a blacklist. This looks
>>> correct, and sometimes works correctly. E.g.
>>>
>>> ./piglit run tests/gpu ~/results -d -o -l verbose "${OPTIONS[@]}"
>>>
>>> where $OPTIONS is an array of
>>> '-x', 'timestamp-get',
>>> '-x', 'glsl-routing', ...
>>>
>>> Successful CI runlog:
>>> http://gfx-ci.fi.intel.com/tree/drm-tip/CI_DRM_4148/pig-glk-j5005/run0.log
>>>
>>> Unsuccessful CI runlog:
>>> http://gfx-ci.fi.intel.com/tree/drm-tip/CI_DRM_4149/pig-glk-j5005/run0.log
>>>
>>> Between those two runs, only the kernel has changed.
>>>
>>> The issue is easiest to reproduce with GLK. HSW seems to be somewhat 
>>> affected too, so the host speed might play a part.
>>
>> Patch below makes the issue disappear for my GLK testrig.
>>
>> With multiprocessing.pool.imap I'm getting roughly 50% correct behaviour
>> and 50% early exits on dry-runs.
>>
>> With multiprocessing.pool.map I'm not getting early exits at all.
>>
>> Sample size is ~50 runs for both setups.
>>
>> With the testset of 26179 on GLK dry-run, the runtime difference is 
>> negligible: pool.map 49s vs pool.imap 50s
>>
>>
>>
>> piglit/framework$ diff -c profile.py.orig profile.py
>> *** profile.py.orig     2018-05-07 19:11:37.649994643 +0300
>> --- profile.py  2018-05-07 19:11:46.880994608 +0300
>> ***************
>> *** 584,591 ****
>>                # more code, and adding side-effects
>>                test_list = (x for x in test_list if filterby(x))
>>
>> !         pool.imap(lambda pair: test(pair[0], pair[1], profile, pool),
>> !                   test_list, chunksize)
>>
>>        def run_profile(profile, test_list):
>>            """Run an individual profile."""
>> --- 584,591 ----
>>                # more code, and adding side-effects
>>                test_list = (x for x in test_list if filterby(x))
>>
>> !         pool.map(lambda pair: test(pair[0], pair[1], profile, pool),
>> !                  test_list, chunksize)
>>
>>        def run_profile(profile, test_list):
>>            """Run an individual profile."""
>>
>>
>> Tomi
> 
> Juan, can you test this patch and see if it resolves your issue as well? I'm
> not sure why this is fixing things, but if it does I'm happy to merge it and
> deal with any performance problems it introduces later.


FWIW, this patch doesn't fix the gpu profile running a lot fewer tests
now than it did before 9461d92301e72807eba4776a16a05207e3a16477. I'm
also using -x.
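
For reference, the two calls in the quoted patch differ in when they
return: pool.map blocks until every item has been processed and returns a
list, while pool.imap returns a lazy iterator immediately. Below is a
minimal sketch of that difference. It assumes a thread-backed pool from
multiprocessing.dummy (an assumption; the quoted diff doesn't show which
pool type is in use), and the names square, run_with_map and run_with_imap
are hypothetical. It only illustrates the documented behaviour of the two
calls and is not a diagnosis of the problem above:

```python
from multiprocessing.dummy import Pool  # thread-backed pool (assumption)


def square(x):
    """Stand-in for the per-test work; hypothetical."""
    return x * x


def run_with_map(items, chunksize=2):
    """pool.map blocks until every item is processed, then returns a list."""
    pool = Pool(4)
    try:
        return pool.map(square, items, chunksize)
    finally:
        pool.close()
        pool.join()


def run_with_imap(items, chunksize=2):
    """pool.imap returns a lazy iterator right away.

    Results only become available to the caller as the iterator is
    consumed, which list() does here before the pool is closed.
    """
    pool = Pool(4)
    try:
        return list(pool.imap(square, items, chunksize))
    finally:
        pool.close()
        pool.join()


if __name__ == "__main__":
    # Both accept a generator, like the filtered test_list in the diff.
    print(run_with_map(x for x in range(10)))
    print(run_with_imap(x for x in range(10)))
```

Both helpers return the same list for the same input. The only difference
shown is that the imap results have to be pulled from the iterator, whereas
map hands back the finished list itself.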


-- 
Earthling Michel Dänzer               |               http://www.amd.com
Libre software enthusiast             |             Mesa and X developer


