Re: [OE-Core][RFC][PATCH 3/3] testimage: implement test artifacts retriever for failing tests

Mikko Rapeli Wed, 07 Jun 2023 00:46:17 -0700

Hi,

On Wed, Jun 07, 2023 at 09:36:07AM +0200, Alexis Lothoré wrote:
> Hi Mikko, sorry for late reply, and thanks for the additional feedback
> 
> On 6/2/23 15:07, Mikko Rapeli wrote:
> > Hi,
> > 
> > These changes are an improvement, but based on my experience in product
> > test automation, instead of collecting logs after testing is completed,
> > it is better to capture logs from the target device while tests are
> > being executed. Serial console, systemd journal etc logs can be captured
> > as streams from the device under test (DUT), and time stamps added from
> > the test controller so that aligning events like test command execution
> > and log messages is possible. It is also a good idea to capture core
> > dumps via this kind of stream(s). Problems with HW and BSP SW may not be
> > visible after testing completes (device has rebooted, reset) or
> > capturing data is not possible at all (due to system, kernel, userspace
> > hang, oom or too much load).
> > 
> > This capturing of logs could be implemented by adding some configurable
> > variables to execute commands on the test controller and inside oeqa
> > environment at certain stages of testing, for example after serial
> > console login prompt has been detected. Command to execute could be a
> > simple "ssh -c 'journalctl -b -a'" to capture boot logs and
> > "ssh 'journalctl -a -f'" and log the output data with additional time
> > stamps to bitbake task output or a separate file.
> > 
> > Just thinking out loud here, these changes are an improvement over
> > current situation already.
> > Thanks for sending these patches!
> 
> While I tend to agree with what you are suggesting, I feel like there are some
> new constraints, because of targets variety: is the system a real target or a
> qemu image ? Does it run systemd/journald ? Are coredumps enabled by default ?
> Of course those constraints may be quite easy to circumvent, moreover we can
> think of a "best effort" solution which gathers "what it can" from running
> target. Also, there may be some corner cases that would need some specific
> handling: some tests can be quite long (hours), what about broken pipes during
> those ? This would definitely need some re-connection strategies for example.


Agreed. For example, our qemu reboot tests are currently failing due to
some kind of reconnection issue. I'll try to find time to fix this...
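
To make the streaming idea a bit more concrete, here is a minimal Python
sketch of what I have in mind. It is only an illustration, not part of the
series and not existing oeqa code: the function name, its parameters and the
retry policy are all made up for this example. It runs a command on the test
controller, prefixes every output line with a controller-side UTC time stamp,
and restarts the command when it exits with an error, which is roughly what a
re-connection strategy for a dropped ssh session would need to do.

```python
import subprocess
import time
from datetime import datetime, timezone


def stream_with_timestamps(cmd, sink, max_retries=3, retry_delay=1.0):
    """Copy the output of cmd to sink, one time-stamped line at a time.

    Hypothetical helper for illustration only. cmd is an argument list
    for subprocess, sink is any writable text file object. If the command
    exits with a non-zero code (for example ssh losing its connection),
    it is started again, at most max_retries times.

    Returns the exit code of the last attempt.
    """
    attempts = 0
    while True:
        with subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # keep error output in the same log
            text=True,
            errors="replace",          # do not fail on undecodable bytes
        ) as proc:
            for line in proc.stdout:
                # Time stamp comes from the controller, not from the DUT.
                stamp = datetime.now(timezone.utc).isoformat(
                    timespec="milliseconds")
                sink.write(f"{stamp} {line.rstrip()}\n")
                sink.flush()
        rc = proc.returncode
        if rc == 0 or attempts >= max_retries:
            return rc
        attempts += 1
        # Leave a marker in the log so gaps in the stream are visible.
        sink.write(f"# stream ended with exit code {rc}, "
                   f"reconnect attempt {attempts}/{max_retries}\n")
        sink.flush()
        time.sleep(retry_delay)
```

With a reachable target this could be called as
stream_with_timestamps(["ssh", target, "journalctl -a -f"], logfile), where
target and logfile are placeholders for the DUT address and an open log file.
Note that the remote command is passed to ssh as a plain argument after the
host name. A real implementation would also have to handle hangs where ssh
neither exits nor produces output, which this sketch does not cover.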

> Since current proposed solution is not very invasive/has not much requirements
> except a valid ssh access, I plan to update the series and let some real
> testing show if it is relevant. If not (or not enough), we may give a try to a
> "runtime" version like the one you are suggesting ?

Yes, the series is a clear improvement over the current state.

Cheers,

-Mikko
