On Tue, 2023-12-19 at 16:44 +0000, Alex Bennée wrote:
> Richard Purdie <[email protected]> writes:
> 
> > On Tue, 2023-12-19 at 15:10 +0000, Alex Bennée wrote:
> > > "Richard Purdie" <[email protected]> writes:
> > > 
> > > > Similar to stdout in the previous commit, we need to ensure serial 
> > > > output
> > > > if written is read and put somewhere, else qemu might block on writes to
> > > > the serial port leading to hangs in the kernel. Use our existing logging
> > > > thread to log data when run_serial is not in use.
> > > > 
> > > > Signed-off-by: Richard Purdie <[email protected]>
> > > 
> > > I'm still seeing hanging with my test case:
> > > 
> > >   o regression_1010.5 Invalid memory access 5
> > >     regression_1010.5 OK
> > >   o regression_1010.6 Invalid memory access 6
> > > 
> > >   Stdout:
> > >   Tried to dump info from target but serial console failed
> > >   Failed CMD: top -bn1
> > >   Tried to dump info from target but serial console failed
> > >   Failed CMD: ps
> > >   Tried to dump info from target but serial console failed
> > >   Failed CMD: free
> > > 
> > >   NOTE: test_soafee (soafeetestsuite.SoafeeTestSuite.test_soafee)
> > >   DEBUG: Checking if at least one of soafee-test-suite is installed
> > >   DEBUG: Setting up a 400 second(s) timeout
> > >   DEBUG: [Running]$ ssh -l root -o ServerAliveCountMax=2 -o 
> > > ServerAliveInterval=30 -o UserKnownHostsFile=/dev/null -o 
> > > StrictHostKeyChecking=no -o LogLevel=ERROR -p 2222 127.0.0.1 export 
> > > PATH=/usr/sbin:/sbin:/usr/bin:/bin; sudo -i -u trs soafee-test-suite run 
> > > -r -t -T ; rm -rf /tmp/soafee-test-suite
> > >   DEBUG: Waiting for process output: time: 1702998407.0291, endtime: 
> > > 1702998707.0285847
> > >   DEBUG: Waiting for process output: time: 1702998412.0338314, endtime: 
> > > 1702998707.0285847
> > >   DEBUG: Waiting for process output: time: 1702998417.0378022, endtime: 
> > > 1702998707.0285847
> > >   DEBUG: Waiting for process output: time: 1702998422.0417945, endtime: 
> > > 1702998707.0285847
> > >   <snip>
> > > 
> > > The regression tests are hanging because that has triggered output on
> > > the serial port as the firmware attempted to service the test and the
> > > port isn't being drained. Then when that times out we start spinning
> > > waiting for more output.
> > 
> > I don't really understand where it might be hanging unfortunately as in
> > my local tests it is draining stdout and the serial ports. Does it
> > login to the serial console correctly? Are there any WARNING messages?
> > 
> > I did use this debug patch:
> > 
> > https://git.yoctoproject.org/poky-contrib/commit/?h=rpurdie/t222&id=0d9029bad610736c114e0770d2433b53133fa620
> > 
> > to prove that it can interleave serial commands with left over output
> > successfully. A similar patch may help work out where your setup may be
> > hanging?
> 
> Yeah I'm only seeing A10/A11:
> 
>   o regression_1010.5 Invalid memory access 5
>     regression_1010.5 OK
>   o regression_1010.6 Invalid memory access 6
> 
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   WARNING: here A10
>   WARNING: here A11
>   NOTE:  ... FAIL
>   DEBUG: Removed SIGALRM handler
>   Traceback (most recent call last):
>     File 
> "/home/alex/lsrc/tests/trs/build/../poky/meta/lib/oeqa/core/decorator/__init__.py",
>  line 35, in wrapped_f
>       return func(*args, **kwargs)
>              ^^^^^^^^^^^^^^^^^^^^^
>     File 
> "/home/alex/lsrc/tests/trs/build/../poky/meta/lib/oeqa/core/decorator/__init__.py",
>  line 35, in wrapped_f
>       return func(*args, **kwargs)
>              ^^^^^^^^^^^^^^^^^^^^^
>     File 
> "/home/alex/lsrc/tests/trs/meta-trs/lib/oeqa/runtime/cases/opteetest.py", 
> line 27, in test_opteetest_xtest
>       self.assertEqual(status, 0, msg='\n'.join([cmd, output]))
>   AssertionError: 255 != 0 :  xtest               
>   Run test suite with level=0

How long does it take for the test data to "back up" and timeout? The
logging thread should take a maximum of 2s to notice data not being
read/handled on the serial port.

What the code should be doing is noticing the run_serial function isn't
using the port and if there is data there, reading it in the logging
thread. You could put further warnings around the code there to see if
it ever detects data to read on that port. A warning when it exits the
run_serial command may help too to see if it is waiting for data there
or not.

I also ended up using:

https://git.yoctoproject.org/poky-contrib/commit/?h=rpurdie/t222&id=febb0d892e4c3648cd88b8ecb9c0260edc45b68e

to show what was going over the serial interface when I found it wasn't
what I expected.

I suspect/hope my patch is a step in the right direction but it looks
like there may be cases it doesn't catch yet :/.

Cheers,

Richard






-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#192758): 
https://lists.openembedded.org/g/openembedded-core/message/192758
Mute This Topic: https://lists.openembedded.org/mt/103244415/21656
Group Owner: [email protected]
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to