On 2016-10-17, Karel Gardas <gard...@gmail.com> wrote:
> 1) use machine with proper ECC support
> 2) man sendbug -- and following it report your OpenBSD kernel misbehavior

This can be a hard thing to report.

When the machine totally locks up, it is very difficult to get the information
needed to make a bug report. Often it is not known exactly how to trigger it,
or whether it's a software bug, a bit flip, or a hardware fault.

Sometimes you can get useful information from monitoring the machine in the
run-up to a failure - symon (in ports) can be useful for logging things to a
remote machine at an interval which is often fast enough to give clues into
what might be happening. But unless you have a reproducible case, or something
which happens randomly but fairly often, you can be watching for a long time
and not really know exactly what to be monitoring.
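
To illustrate the idea of logging to a remote machine at an interval, here is
a minimal sketch in Python. It is not symon and does far less: it only sends
the load averages over UDP, so the last samples before a lockup survive on the
other machine. The function names, the line format and the default interval
are all made up for illustration.

```python
import os
import socket
import time


def format_sample(timestamp, loadavg):
    """Render one sample as a text line: epoch seconds, then the load averages."""
    one, five, fifteen = loadavg
    return "%d load=%.2f,%.2f,%.2f" % (timestamp, one, five, fifteen)


def send_sample(sock, addr, line):
    """Send one line as a single UDP datagram; returns the number of bytes sent."""
    return sock.sendto(line.encode("ascii"), addr)


def monitor(addr, interval=5.0, count=None):
    """Send a sample every `interval` seconds.

    Runs until interrupted when count is None, otherwise stops after
    `count` samples. Returns the number of samples sent.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sent = 0
    try:
        while count is None or sent < count:
            # os.getloadavg() is a hypothetical choice of metric; anything
            # cheap to read locally could be sent the same way.
            send_sample(sock, addr, format_sample(time.time(), os.getloadavg()))
            sent += 1
            if count is None or sent < count:
                time.sleep(interval)
    finally:
        sock.close()
    return sent
```

Calling monitor(("192.0.2.10", 9999)) would send a sample every five seconds
until interrupted; the address is a placeholder for whatever log host you use,
which needs something listening on that UDP port and writing the datagrams to
a file. UDP is used here so that a dead or slow log host never blocks the
sender, at the cost of possibly losing samples.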

On the other hand if you do have a *reproducible* way to trigger such a bug,
that's of great interest.

> On Mon, Oct 17, 2016 at 3:48 PM, Tinker <ti...@openmailbox.org> wrote:
>> Sometimes a machine goes unresponsive. In this case, a non-ECC RAM machine.
>> The reason could be that something in the hardware or kernel failed, e.g. a
>> bit flip error [1].
>> In this case (for a non-kernel developer), tough luck, and the proper thing
>> would be to reboot, and keep statistics over failures on that machine and
>> replace the hardware should the crashes go above some frequency threshold.

If you're not running an up-to-date release, please update: stefan@'s work on
amap in the 5.9-6.0 timeframe certainly helps some cases - one of the post-6.0
errata might also apply with very large allocations, so 6.0-stable or -current
would be advisable.
