Thank you for responding, Mark. I started a new topic in the community. I do not have a way to reliably reproduce this yet. I deleted all the InfluxDB data from the host and restarted the workload. It's been running for the last 7 hours and I haven't seen any issues so far.

On Sat, Jul 29, 2017 at 2:14 PM, Mark Rushakoff <[email protected]> wrote:

> >> Jul 28 21:20:40 ip-172-31-2-161 kernel: [242909.768230] blk_update_request: I/O error, dev nvme0n1, sector 435160504
>
> This looks like a hardware error. Can you move the discussion over to
> https://community.influxdata.com/ ? Bonus points if you can provide a
> script to reproduce the problem.
>
> On Friday, July 28, 2017 at 5:40:56 PM UTC-7, [email protected] wrote:
>>
>> Hi there
>>
>> I am running InfluxDB on a single instance 8 core VM with 60GB RAM. The
>> VM is running Ubuntu 16.04 Xenial and InfluxDB 1.3.1.
>>
>> I am generating a write workload of about 2800/s. Each point in the
>> timeseries is a separate request and has between 50 and 300 fields. There
>> are about 10 measurements in the database.
>>
>> A couple of hours after the workload started, the db crashed and I see
>> the following error in the logs. Once the compaction started, CPU usage
>> shot up and after a little while, the database shutdown with a sigpanic. I
>> tried restarting the database but the tsm files seem to have gotten
>> corrupted. I could only bring the database back up after deleting the tsm
>> files.
>>
>> I am using the default influxdb configuration and I am currently not
>> batching the requests and will try to make a change to see if that helps.
>> Aside from this, I was wondering if there is any setting I could tweak to
>> prevent this from happening again or is the non-batch writes that I am
>> doing the main problem ?
>>
>> Any insight into this would be greatly appreciated
>>
>> Jul 28 20:40:06 ip-172-31-2-161 influxd[21528]: [I] 2017-07-28T20:40:06Z beginning full compaction of group 0, 2 TSM files engine=tsm1
>> Jul 28 20:40:06 ip-172-31-2-161 influxd[21528]: [I] 2017-07-28T20:40:06Z compacting full group (0) /mnt/influxdb/data/blueshift/autogen/2/000000615-000000005.tsm (#0) engine=tsm1
>> Jul 28 20:40:06 ip-172-31-2-161 influxd[21528]: [I] 2017-07-28T20:40:06Z compacting full group (0) /mnt/influxdb/data/blueshift/autogen/2/000000637-000000004.tsm (#1) engine=tsm1
>> Jul 28 20:40:07 ip-172-31-2-161 influxd[21528]: [I] 2017-07-28T20:40:07Z compacted level 1 group (0) into /mnt/influxdb/data/blueshift/autogen/2/000000646-000000002.tsm.tmp (#0) engine=tsm1
>> Jul 28 20:40:07 ip-172-31-2-161 influxd[21528]: [I] 2017-07-28T20:40:07Z compacted level 1 3 files into 1 files in 12.705450299s engine=tsm1
>> Jul 28 20:40:29 ip-172-31-2-161 influxd[21528]: [I] 2017-07-28T20:40:29Z compacted full group (0) into /mnt/influxdb/data/blueshift/autogen/2/000000637-000000005.tsm.tmp (#0) engine=tsm1
>> Jul 28 20:40:29 ip-172-31-2-161 influxd[21528]: [I] 2017-07-28T20:40:29Z compacted full 2 files into 1 files in 22.466184429s engine=tsm1
>> Jul 28 21:20:40 ip-172-31-2-161 kernel: [242909.768230] blk_update_request: I/O error, dev nvme0n1, sector 435160504
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: unexpected fault address 0x7fa963c0a092
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: fatal error: fault
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: [signal SIGBUS: bus error code=0x2 addr=0x7fa963c0a092 pc=0x9e2014]
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: goroutine 38 [running]:
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: runtime.throw(0xb82f57, 0x5)
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: #011/usr/local/go/src/runtime/panic.go:596 +0x95 fp=0xc4203dfce8 sp=0xc4203dfcc8
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: runtime.sigpanic()
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: #011/usr/local/go/src/runtime/signal_unix.go:287 +0xf4 fp=0xc4203dfd38 sp=0xc4203dfce8
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*indirectIndex).UnmarshalBinary(0xc4203ce280, 0x7fa96272ae26, 0x58d4743, 0x58d474b, 0x0, 0x0)
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: #011/root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/reader.go:955 +0x1b4 fp=0xc4203dfdb0 sp=0xc4203dfd38
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*mmapAccessor).init(0xc420463000, 0x0, 0x0, 0x0)
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: #011/root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/reader.go:1053 +0x2d9 fp=0xc4203dfe30 sp=0xc4203dfdb0
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: github.com/influxdata/influxdb/tsdb/engine/tsm1.NewTSMReader(0xc4204fe098, 0x27de2649, 0xef1580, 0x0)
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: #011/root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/reader.go:201 +0x136 fp=0xc4203dfea0 sp=0xc4203dfe30
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: github.com/influxdata/influxdb/tsdb/engine/tsm1.(*FileStore).Open.func1(0xc42036e180, 0xc4204f86c0, 0x3, 0xc4204fe098)
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: #011/root/go/src/github.com/influxdata/influxdb/tsdb/engine/tsm1/file_store.go:394 +0x63 fp=0xc4203dffc0 sp=0xc4203dfea0
>> Jul 28 21:20:40 ip-172-31-2-161 influxd[23411]: runtime.goexit()
>>
>> Thank you
>> Vinesh
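
The original message says each point is currently sent as its own request and that batching is the next thing to try. A minimal sketch of what that change could look like is below, assuming the InfluxDB 1.x HTTP `/write` endpoint, which accepts several newline-separated line protocol entries in one request body. The helper names (`to_line`, `batches`, `send`), the URL, and the batch size are all hypothetical and chosen for illustration. The formatter only handles float fields and does no escaping, so it is a simplification rather than a complete line protocol encoder. Batching reduces per-request overhead; it may not have any bearing on the I/O error in the kernel log, which Mark attributes to hardware.

```python
from itertools import islice
from urllib.request import Request, urlopen


def to_line(measurement, fields, timestamp_ns):
    """Format one point as a line protocol entry.

    Simplification: float fields only, no tags, no escaping of special characters.
    """
    field_set = ",".join(
        f"{key}={float(value)}" for key, value in sorted(fields.items())
    )
    return f"{measurement} {field_set} {timestamp_ns}"


def batches(lines, batch_size=1000):
    """Yield newline-joined request bodies with at most batch_size lines each.

    batch_size=1000 is an arbitrary starting value to tune, not a recommendation.
    """
    iterator = iter(lines)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield "\n".join(chunk)


def send(body, url="http://localhost:8086/write?db=blueshift&precision=ns"):
    """POST one batch to the write endpoint (the URL is an assumed local default)."""
    request = Request(url, data=body.encode("utf-8"), method="POST")
    with urlopen(request) as response:
        return response.status
```

With helpers like these, a writer would collect formatted lines and call `send(body)` once per batch from `batches(lines)`, instead of issuing one request per point.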
