On Mon, Mar 01, 2021 at 08:37:46AM +0000, Richard Purdie wrote:
> On Mon, 2021-03-01 at 07:59 +0000, [email protected] wrote:
> > Hi,
> > 
> > On Fri, Feb 26, 2021 at 05:47:52PM +0000, Richard Purdie wrote:
> > > Calling sync between each file compare is horrible performance wise
> > > as we compare thousands of files. We don't care about IO latency here
> > > so disable.
> > > 
> > > Signed-off-by: Richard Purdie <[email protected]>
> > > ---
> > >  meta/lib/oeqa/selftest/cases/reproducible.py | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/meta/lib/oeqa/selftest/cases/reproducible.py 
> > > b/meta/lib/oeqa/selftest/cases/reproducible.py
> > > index ce4e8ebe06b..05fc4b7fa0a 100644
> > > --- a/meta/lib/oeqa/selftest/cases/reproducible.py
> > > +++ b/meta/lib/oeqa/selftest/cases/reproducible.py
> > > @@ -123,7 +123,7 @@ def compare_file(reference, test, diffutils_sysroot):
> > >          result.status = MISSING
> > >          return result
> > >  
> > > 
> > > -    r = runCmd(['cmp', '--quiet', reference, test], 
> > > native_sysroot=diffutils_sysroot, ignore_status=True)
> > > +    r = runCmd(['cmp', '--quiet', reference, test], 
> > > native_sysroot=diffutils_sysroot, ignore_status=True, sync=False)
> > 
> > Uh, sync is horrible for performance. Why don't we change it to false
> > by default:
> > 
> > --- a/meta/lib/oeqa/utils/commands.py
> > +++ b/meta/lib/oeqa/utils/commands.py
> > @@ -167,7 +167,7 @@ class Result(object):
> >      pass
> >  
> > 
> >  
> > 
> > -def runCmd(command, ignore_status=False, timeout=None, assert_error=True, 
> > sync=True,
> > +def runCmd(command, ignore_status=False, timeout=None, assert_error=True, 
> > sync=False,
> >            native_sysroot=None, limit_exc_output=0, output_log=None, 
> > **options):
> >      result = Result()
> >  
> > 
> > 
> > Based on comment:
> > 
> >     # tests can be heavy on IO and if bitbake can't write out its caches, 
> > we see timeouts.
> >     # call sync around the tests to ensure the IO queue doesn't get too 
> > large, taking any IO
> >     # hit here rather than in bitbake shutdown.
> >     if sync:
> >         p = os.environ['PATH']
> >         os.environ['PATH'] = "/usr/bin:/bin:/usr/sbin:/sbin:" + p
> >         os.system("sync")
> >         os.environ['PATH'] = p
> > 
> > this looks like a workaround for some other bug.
> 
> The wider issue is the randomly failing ptests and other issues with the 
> automated
> testing, particularly things running under qemu system mode. Those seemed 
> partly to 
> be caused by huge IO queues building up. The idea of the sync calls was to 
> keep the 
> UI backlog manageable.
> 
> In most cases we execute occasional commands so this should work/help. In the 
> case of
> the reproducible ptest, we execute a cmp on every generated package which 
> really didn't
> work well hence it makes sense to disable it there.
> 
> We don't see this issue with any other tests as far as I've observed. I 
> really don't
> want to make the runtime testing results worse, anyone who's looked at swat 
> or attended
> bug triage will know how much trouble we're having with the issues this code 
> helps
> mitigate :/.

Ok, understood. You need to keep the system and tests running.

But IO flushes can take any amount of time. So sounds like timeouts are
too short, or should not exists at all. Also, systems are doing too much
parallel work or are actually short on RAM if slow IO to disk starts
failing things.

-Mikko
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#148786): 
https://lists.openembedded.org/g/openembedded-core/message/148786
Mute This Topic: https://lists.openembedded.org/mt/80933358/21656
Group Owner: [email protected]
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to