No, this is not a normal upgrade issue. Maybe the box was hard reset, or the etcd daemon got killed, or something like that.
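If it happens again on a single-host setup, a slightly more cautious variant of the steps quoted below is to confirm the WAL error in the etcd log first, and then move the etcd files aside rather than deleting them. The following is only a sketch: the function names are made up for illustration, and the log path mentioned afterwards is an assumption about the appliance layout, not something confirmed in this thread.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, paths are passed in by the caller.

# Succeeds (exit 0) if the given etcd log contains the WAL read error
# reported in this thread ("etcdserver: read wal error: unexpected EOF").
etcd_wal_corrupt() {
    grep -q 'read wal error' "$1"
}

# Moves every entry of the etcd data directory ($1) into a timestamped
# sibling directory and prints that directory's path.
# Nothing is deleted, so the old files can still be inspected afterwards.
etcd_move_aside() {
    data_dir=$1
    [ -d "$data_dir" ] || return 1
    backup_dir="${data_dir}.bak.$(date +%Y%m%d%H%M%S)"
    mkdir -p "$backup_dir" || return 1
    # -mindepth/-maxdepth limit the move to the top-level entries (dotfiles included).
    find "$data_dir" -mindepth 1 -maxdepth 1 -exec mv {} "$backup_dir"/ \;
    echo "$backup_dir"
}
```

On the appliance this would run as root between `sudo graylog-ctl stop etcd` and `sudo graylog-ctl start etcd`, for example `etcd_wal_corrupt /var/log/graylog/etcd/current && etcd_move_aside /var/opt/graylog/data/etcd`, followed by `sudo graylog-ctl reconfigure` as before. The log path `/var/log/graylog/etcd/current` is a guess based on the other log directories shown in the quoted output, so check where the etcd log actually lives on your box.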

On 15 April 2015 at 12:48, Pete Storey <[email protected]> wrote:

> Thanks for that. Ended up rebuilding the machine anyway - was a test box
> and have now moved to a cluster setup. Is there any reason the upgrade
> should corrupt etcd - is it something to be aware of in the future?
>
>
> On Wednesday, 15 April 2015 10:03:39 UTC+1, Marius Sturm wrote:
>>
>> Hi Pete,
>> mmh looks like the database files of etcd got somehow corrupted. etcd is
>> used on the appliances for distributing configuration settings in a cluster
>> installation. So if you run a single host installation you can simply delete
>> the files and rerun the reconfigure command. etcd data will then be
>> reinitialized.
>>
>> sudo graylog-ctl stop etcd
>> sudo rm -r /var/opt/graylog/data/etcd/*
>> sudo graylog-ctl start etcd
>> sudo graylog-ctl reconfigure
>>
>>
>> \Marius
>>
>> On 13 April 2015 at 15:21, Pete Storey <[email protected]> wrote:
>>
>>> I've tried to upgrade a Graylog instance (running everything) on AWS
>>> using the instructions from here:
>>>
>>> https://github.com/Graylog2/graylog2-images/tree/master/ova
>>>
>>> Which are to run the commands:
>>>
>>> wget https://packages.graylog2.org/releases/graylog2-omnibus/ubuntu/graylog_latest.deb
>>> sudo rm /var/lib/dpkg/info/graylog.postrm
>>> sudo graylog-ctl stop
>>> sudo dpkg -G -i graylog_latest.deb
>>> sudo graylog-ctl reconfigure
>>>
>>>
>>> However this breaks at the reconfigure stage.
>>>
>>>   * execute[/opt/graylog/embedded/bin/graylog-ctl start graylog-server] action run
>>>     - execute /opt/graylog/embedded/bin/graylog-ctl start graylog-server
>>>   * ruby_block[add node to server list] action run
>>>
>>> ================================================================================
>>> Error executing action `run` on resource 'ruby_block[add node to server list]'
>>> ================================================================================
>>>
>>> Errno::ECONNREFUSED
>>> -------------------
>>> Connection refused - connect(2) for "127.0.0.1" port 4001
>>>
>>> Which is I think because the etcd server isn't started:
>>>
>>> ubuntu@graylog:~$ sudo graylog-ctl status
>>> run: elasticsearch: (pid 1019) 629s; run: log: (pid 1012) 629s
>>> down: etcd: 1s, normally up, want up; run: log: (pid 1011) 629s
>>> run: graylog-server: (pid 1786) 588s; run: log: (pid 1008) 629s
>>> run: graylog-web: (pid 1021) 629s; run: log: (pid 1018) 629s
>>> run: mongodb: (pid 1014) 629s; run: log: (pid 1009) 629s
>>> run: nginx: (pid 4562) 1s; run: log: (pid 1010) 629s
>>>
>>> It tries to start it in the Chef recipe:
>>>
>>>   * execute[/opt/graylog/embedded/bin/graylog-ctl start etcd] action run
>>>     - execute /opt/graylog/embedded/bin/graylog-ctl start etcd
>>> Recipe: graylog::elasticsearch
>>>   * directory[/var/log/graylog/elasticsearch] action create (up to date)
>>>
>>> which doesn't moan, but looking in the etcd log, it's reporting this
>>> every second:
>>>
>>> 2015-04-13_13:13:37.96364 2015/04/13 13:13:37 etcd: listening for peers on http://localhost:2380
>>> 2015-04-13_13:13:37.96367 2015/04/13 13:13:37 etcd: listening for peers on http://localhost:7001
>>> 2015-04-13_13:13:37.96368 2015/04/13 13:13:37 etcd: listening for client requests on http://0.0.0.0:2379
>>> 2015-04-13_13:13:37.96370 2015/04/13 13:13:37 etcd: listening for client requests on http://0.0.0.0:4001
>>> 2015-04-13_13:13:37.96511 2015/04/13 13:13:37 etcdserver: recovered store from snapshot at index 1310131
>>> 2015-04-13_13:13:37.96522 2015/04/13 13:13:37 etcdserver: name = default
>>> 2015-04-13_13:13:37.96522 2015/04/13 13:13:37 etcdserver: data dir = /var/opt/graylog/data/etcd
>>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: heartbeat = 100ms
>>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: election = 1000ms
>>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: snapshot count = 10000
>>> 2015-04-13_13:13:37.96524 2015/04/13 13:13:37 etcdserver: advertise client URLs = http://localhost:2379,http://localhost:4001
>>> 2015-04-13_13:13:37.96524 2015/04/13 13:13:37 etcdserver: loaded cluster information from store: default=http://localhost:2380,default=http://localhost:7001
>>> 2015-04-13_13:13:37.98912 2015/04/13 13:13:37 etcdserver: read wal error: unexpected EOF
>>>
>>> Not sure what the error means though, nor can I work out how to fix it.
>>> Any ideas?
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "graylog2" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>> --
>> Developer
>>
>> Tel.: +49 (0)40 609 452 077
>> Fax.: +49 (0)40 609 452 078
>>
>> TORCH GmbH - A Graylog Company
>> Steckelhörn 11
>> 20457 Hamburg
>> Germany
>>
>> https://www.graylog.com <https://www.torch.sh/>
>>
>> Commercial Reg. (Registergericht): Amtsgericht Hamburg, HRB 125175
>> Managing Director (Geschäftsführer): Lennart Koopmann (CEO)
>>

--
Developer

Tel.: +49 (0)40 609 452 077
Fax.: +49 (0)40 609 452 078

TORCH GmbH - A Graylog Company
Steckelhörn 11
20457 Hamburg
Germany

https://www.graylog.com <https://www.torch.sh/>

Commercial Reg. (Registergericht): Amtsgericht Hamburg, HRB 125175
Managing Director (Geschäftsführer): Lennart Koopmann (CEO)
