I confirmed in /var/log/messages that it was the OOM killer. We don't have any particularly large or long-running queries, but we do have a lot of writes. Any tips on narrowing down which query or write may have caused it?

/var/log/messages:Dec 29 10:44:12 influxdb1 kernel: influxd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
/var/log/messages:Dec 29 10:44:12 influxdb1 kernel: [<ffffffff8116cdee>] oom_kill_process+0x24e/0x3b0
/var/log/messages:Dec 29 10:44:12 influxdb1 kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
/var/log/messages:Dec 29 10:56:51 influxdb1 kernel: influxd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
/var/log/messages:Dec 29 10:56:51 influxdb1 kernel: [<ffffffff8116cdee>] oom_kill_process+0x24e/0x3b0
/var/log/messages:Dec 29 10:56:51 influxdb1 kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
/var/log/messages:Dec 29 11:05:55 influxdb1 kernel: influxd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
/var/log/messages:Dec 29 11:05:55 influxdb1 kernel: [<ffffffff8116cdee>] oom_kill_process+0x24e/0x3b0
/var/log/messages:Dec 29 11:05:55 influxdb1 kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
/var/log/messages:Dec 29 11:18:03 influxdb1 kernel: influxd invoked oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0
/var/log/messages:Dec 29 11:18:03 influxdb1 kernel: [<ffffffff8116cdee>] oom_kill_process+0x24e/0x3b0

On Thursday, December 29, 2016 at 2:37:25 PM UTC-5, Mark Rushakoff wrote:
>
> It was most likely the OOM killer kicking in during an out-of-control
> query. Confirming whether it was the OOM killer varies by distribution [1].
> It's normal for systemd to restart a service that dies.
>
> Besides simply avoiding problematic unbounded queries (such as `SELECT *
> FROM /.*/ GROUP BY *`), there are some configuration options within the
> coordinator section that you can set [2] to prevent queries from
> overconsuming resources.
>
> [1] https://unix.stackexchange.com/questions/128642/debug-out-of-memory-with-var-log-messages?rq=1
> [2] https://docs.influxdata.com/influxdb/v1.1/administration/config/#coordinator
>
> On Thu, Dec 29, 2016 at 11:05 AM, Jeffery K <[email protected]> wrote:
>
>> I had an instance this morning where, after about 20-25 minutes under
>> high load, the influxd process exited and was launched again
>> automatically. Is this a feature?
>>
>> Looking at journalctl, all I saw in the log at the time it happened
>> was this. We are using version 1.1.
>> Dec 29 11:18:04 influxdb1 systemd[1]: influxdb.service: main process exited, code=killed, status=9/KILL
>> Dec 29 11:18:04 influxdb1 systemd[1]: Unit influxdb.service entered failed state.
>> Dec 29 11:18:04 influxdb1 systemd[1]: influxdb.service failed.
>>
>> I've confirmed with others that no one was on the Linux system, and no
>> one manually restarted or killed the process. Does this mean it crashed?
>> If so, how could I confirm that?
>> We had been getting sporadic timeouts on the write API endpoint leading
>> up to this restart.
>>
>> --
>> Remember to include the version number!
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "InfluxData" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> Visit this group at https://groups.google.com/group/influxdb.
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/influxdb/ef6f65b4-22f1-4a45-af3d-469a12be9821%40googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
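
For lining up the kills against whatever was running at the time, here is a minimal sketch that pulls the oom-killer invocations out of syslog-style lines like the ones above. The names `OOM_RE` and `find_oom_events` and the `year` parameter are made up for illustration, and it assumes the classic syslog timestamp format (no year, English month abbreviations). Note that the process named in "invoked oom-killer" is the one whose allocation triggered the killer, which is not necessarily the one that got killed.

```python
import re
from datetime import datetime

# Matches syslog lines such as:
#   Dec 29 10:44:12 influxdb1 kernel: influxd invoked oom-killer: gfp_mask=...
# A leading "path:" prefix (as printed by grep) is tolerated because we search,
# not match from the start of the line.
OOM_RE = re.compile(
    r"(?P<ts>[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) kernel: (?P<proc>\S+) invoked oom-killer"
)


def find_oom_events(lines, year):
    """Return a list of (timestamp, host, process) per oom-killer invocation."""
    events = []
    for line in lines:
        m = OOM_RE.search(line)
        if m is None:
            continue
        # Classic syslog timestamps carry no year, so the caller supplies it.
        ts = datetime.strptime(f"{year} {m['ts']}", "%Y %b %d %H:%M:%S")
        events.append((ts, m["host"], m["proc"]))
    return events


if __name__ == "__main__":
    # Two events copied from the log excerpt in this thread.
    sample = [
        "/var/log/messages:Dec 29 10:44:12 influxdb1 kernel: influxd invoked "
        "oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0",
        "/var/log/messages:Dec 29 10:56:51 influxdb1 kernel: influxd invoked "
        "oom-killer: gfp_mask=0x200da, order=0, oom_score_adj=0",
    ]
    previous = None
    for ts, host, proc in find_oom_events(sample, 2016):
        gap = "" if previous is None else f" (+{ts - previous} since last)"
        print(f"{ts} {host}: triggered by {proc}{gap}")
        previous = ts
```

This only gives the times and spacing of the events. It does not identify a query or write by itself; the timestamps would still have to be compared against the influxd log and client activity around each event.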
