Hi Mikko,

Sorry for the late reply, and thanks for the additional feedback.

On 6/2/23 15:07, Mikko Rapeli wrote:
> Hi,
>
> These changes are an improvement, but based on my experience in product test
> automation, instead of collecting logs after testing is completed, it is
> better to capture logs from the target device while tests are being executed.
> Serial console, systemd journal etc logs can be captured as streams from the
> device under test (DUT), and time stamps added from the test controller so
> that aligning events like test command execution and log messages is
> possible. It is also a good idea to capture core dumps via this kind of
> stream(s). Problems with HW and BSP SW may not be visible after testing
> completes (device has rebooted, reset) or capturing data is not possible at
> all (due to system, kernel, userspace hang, oom or too much load).
>
> This capturing of logs could be implemented by adding some configurable
> variables to execute commands on the test controller and inside oeqa
> environment at certain stages of testing, for example after serial console
> login prompt has been detected. Command to execute could be a simple
> "ssh -c 'journalctl -b -a'" to capture boot logs and "ssh 'journalctl -a -f'"
> and log the output data with additional time stamps to bitbake task output
> or a separate file.
>
> Just thinking out loud here, these changes are an improvement over current
> situation already.
> Thanks for sending these patches!
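To check that I understand the idea, here is a rough sketch of what such a controller-side capture could look like. It is only an illustration, not something taken from the series: the names stamp() and stream_logs(), the retry parameters and the example address are all hypothetical, and it simply restarts the command whenever it exits with an error.

```python
import subprocess
import sys
import time
from datetime import datetime, timezone


def stamp(line, now=None):
    """Prefix one log line with a timestamp taken on the test controller."""
    now = now or datetime.now(timezone.utc)
    return f"[{now.isoformat(timespec='milliseconds')}] {line.rstrip()}"


def stream_logs(cmd, sink, max_retries=3, retry_delay=1.0):
    """Run cmd, pass each timestamped output line to sink(), return the line count.

    The command is restarted when it exits with a non-zero status (for
    example after a dropped connection), at most max_retries times.
    """
    captured = 0
    for attempt in range(max_retries + 1):
        with subprocess.Popen(cmd, stdout=subprocess.PIPE,
                              stderr=subprocess.STDOUT, text=True) as proc:
            for line in proc.stdout:
                sink(stamp(line))
                captured += 1
        if proc.returncode == 0:
            break  # the command ended normally, nothing to reconnect to
        if attempt < max_retries:
            time.sleep(retry_delay)
    return captured


# Hypothetical usage (target address is an assumption):
#   stream_logs(["ssh", "root@192.168.7.2", "journalctl", "-a", "-f"], print)
```

A real version would still have to decide what to do on targets without journald, and how to avoid duplicated lines after a restart.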
While I tend to agree with what you are suggesting, I feel it brings some new
constraints because of the variety of targets: is the system a real target or
a qemu image? Does it run systemd/journald? Are coredumps enabled by default?
Of course, those constraints may be fairly easy to work around; moreover, we
can think of a "best effort" solution which gathers whatever it can from the
running target.

Also, some corner cases would need specific handling: some tests can be quite
long (hours), so what about broken pipes during those? This would definitely
need some reconnection strategy, for example.

Since the currently proposed solution is not very invasive and has few
requirements beyond valid ssh access, I plan to update the series and let some
real testing show whether it is relevant. If it is not (or not enough), we may
give a try to a "runtime" version like the one you are suggesting.

Thanks,
Alexis

> Cheers,
>
> -Mikko

-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com