Re: [atomic-devel] How to handle crashes

2016-10-24 Thread Jakub Filak
Derek, see my answers inline:

On 10/24/2016 08:02 PM, Derek Carr wrote:
> Hi Jakub,
>
> I subscribed to your issue upstream, apologies for missing your earlier
> notes.
>
> Is there an exhaustive list of things that ABRT can detect that is documented?

We have this list:

Re: [atomic-devel] How to handle crashes

2016-10-24 Thread Derek Carr
Hi Jakub, I subscribed to your issue upstream; apologies for missing your earlier notes. Is there an exhaustive list of things that ABRT can detect that is documented? The document shows Linux kernel items, but they do overlap with what NodeProblemDetector has at this point. I am not sure if I

Re: [atomic-devel] How to handle crashes

2016-10-24 Thread Jakub Filak
I've asked node-problem-detector upstream for help with integrating ABRT into node-problem-detector:
https://github.com/kubernetes/node-problem-detector/issues/35

On 10/21/2016 09:35 AM, Jakub Filak wrote:
> I've created a Docker file that produces an image with ABRT configured to
> detect Kernel

Re: [atomic-devel] How to handle crashes

2016-10-21 Thread Jakub Filak
I've created a Dockerfile that produces an image with ABRT configured to detect kernel oopses in the systemd journal and vmcores on the host, and to register itself in /proc/sys/kernel/core_pattern to detect core files: https://github.com/jfilak/docker-abrt/tree/atomic_minimal Detecting those problems is not a rocket
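
As a rough illustration of the core_pattern registration mentioned above: the kernel treats a core_pattern value that starts with "|" as a command to pipe core dumps to, and anything else as a file name template. The sketch below only inspects the current setting; it is not taken from the docker-abrt repository, and the function name parse_core_pattern is a hypothetical helper chosen for this example.

```python
from pathlib import Path

# Kernel setting that decides where core dumps go.
CORE_PATTERN = Path("/proc/sys/kernel/core_pattern")


def parse_core_pattern(value: str) -> dict:
    """Classify a core_pattern value (hypothetical helper, for illustration).

    A leading "|" means the kernel pipes each core dump to a user-space
    program; otherwise the value is a template for the core file name.
    """
    value = value.strip()
    if value.startswith("|"):
        parts = value[1:].split()
        return {
            "piped": True,
            "handler": parts[0] if parts else "",
            "args": parts[1:],
        }
    return {"piped": False, "handler": None, "args": [], "template": value}


if __name__ == "__main__":
    if CORE_PATTERN.exists():
        info = parse_core_pattern(CORE_PATTERN.read_text())
        if info["piped"]:
            print("core dumps are piped to:", info["handler"])
        else:
            print("core dumps are written using template:", info["template"])
    else:
        print("core_pattern not available on this system")
```

Running this on a host (or in a privileged container that shares the host's /proc/sys) shows whether a crash handler is currently registered there.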

Re: [atomic-devel] How to handle crashes

2016-10-17 Thread Jakub Filak
Creating an ABRT image is definitely possible. ABRT provides a D-Bus API for accessing detected problems, and it's possible to set up ABRT on nodes to report detected problems to a central ABRT daemon. The main difference between ABRT and node-problem-detector is that node-problem-detector is tuned

Re: [atomic-devel] How to handle crashes

2016-09-14 Thread Derek Carr
Dominika has been looking into node problem detector on our team. The issue we have found is that, while we like how it can report NodeConditions back into cluster state, its current kernel monitoring support is insufficient until https://github.com/kubernetes/node-problem-detector/issues/14 It would

Re: [atomic-devel] How to handle crashes

2016-09-14 Thread Jeremy Eder
Anyone know? There's a node-problem-detector proposed in Kubernetes but ... abrt is far more comprehensive. https://github.com/kubernetes/node-problem-detector The difference is that node-problem-detector has hooks to call back to the Kubernetes control plane to inform it that a node has