Re: [OE-Core][RFC][PATCH 3/3] testimage: implement test artifacts retriever for failing tests

Alexis Lothoré via lists . openembedded . org Wed, 07 Jun 2023 00:35:53 -0700

Hi Mikko, sorry for the late reply, and thanks for the additional feedback.

On 6/2/23 15:07, Mikko Rapeli wrote:
> Hi,
> 
> These changes are an improvement, but based on my experience in
> product test automation, instead of collecting logs after testing is
> completed, it is better to capture logs from the target device while
> tests are being executed. Serial console, systemd journal etc logs can
> be captured as streams from the device under test (DUT), and time
> stamps added from the test controller so that aligning events like
> test command execution and log messages is possible. It is also a good
> idea to capture core dumps via this kind of stream(s). Problems with
> HW and BSP SW may not be visible after testing completes (device has
> rebooted, reset) or capturing data is not possible at all (due to
> system, kernel, userspace hang, oom or too much load).
> 
> This capturing of logs could be implemented by adding some
> configurable variables to execute commands on the test controller and
> inside oeqa environment at certain stages of testing, for example
> after serial console login prompt has been detected. Command to
> execute could be a simple "ssh -c 'journalctl -b -a'" to capture boot
> logs and "ssh 'journalctl -a -f'" and log the output data with
> additional time stamps to bitbake task output or a separate file.
> 
> Just thinking out loud here, these changes are an improvement over
> current situation already.
> Thanks for sending these patches!


While I tend to agree with what you are suggesting, I feel that it brings some
new constraints because of the variety of targets: is the system a real target
or a qemu image? Does it run systemd/journald? Are coredumps enabled by default?
Of course those constraints may be quite easy to work around, and we can
think of a "best effort" solution which gathers "what it can" from the running
target. There may also be some corner cases that need specific handling: some
tests can be quite long (hours), so what about broken pipes during those? This
would definitely need some reconnection strategy, for example.
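
For illustration only, a "best effort" runtime collector with a naive
reconnection strategy could look roughly like the Python sketch below. Nothing
here is existing oeqa code: stamp(), follow(), max_attempts and delay are
made-up names, and the command to run is left to the caller because the right
one depends on the target (journald or not, real board or qemu).

```python
import subprocess
import time
from datetime import datetime, timezone


def stamp(line, now=None):
    """Prefix one log line with a controller-side UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    return "[%s] %s" % (now.isoformat(timespec="milliseconds"),
                        line.rstrip("\n"))


def follow(cmd, sink, max_attempts=3, delay=1.0):
    """Run cmd and copy its output, line by line, to sink with timestamps.

    If cmd exits with a non-zero status (for example a dropped ssh
    connection), it is started again, up to max_attempts runs in total
    (max_attempts must be >= 1). Returns the number of runs performed.
    """
    for attempt in range(1, max_attempts + 1):
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.STDOUT, text=True)
        for line in proc.stdout:
            sink.write(stamp(line) + "\n")
        rc = proc.wait()
        if rc == 0:
            break  # the command ended cleanly, nothing to reconnect to
        # Hypothetical marker line, so gaps are visible in the captured log.
        sink.write(stamp("-- stream ended with exit code %d --" % rc) + "\n")
        if attempt < max_attempts:
            time.sleep(delay)
    return attempt
```

On a journald-based target this might be called with something like
follow(["ssh", "root@<target>", "journalctl", "-a", "-f"], logfile), where the
user and address are placeholders. A fixed number of retries with a fixed delay
is the simplest possible policy; hours-long tests would probably need something
smarter, and lines emitted while the connection is down are simply lost here.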

Since the currently proposed solution is not very invasive and has few
requirements beyond valid ssh access, I plan to update the series and let some
real testing show whether it is relevant. If it is not (or not enough), we
could then try a "runtime" version like the one you are suggesting?

Thanks,
Alexis
> 
> Cheers,
> 
> -Mikko

-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

