Re: [graylog2] Upgrading Graylog 1.0.0 to 1.0.1 on AWS has broken it

Marius Sturm Wed, 15 Apr 2015 05:40:05 -0700

No, this is not a normal upgrade issue. Maybe the box was hard reset, or the
daemon got killed, or something like that.


On 15 April 2015 at 12:48, Pete Storey <[email protected]> wrote:

> Thanks for that.  Ended up rebuilding the machine anyway - was a test box
> and have now moved to a cluster setup.  Is there any reason the upgrade
> should corrupt etcd - is it something to be aware of in the future?
>
>
> On Wednesday, 15 April 2015 10:03:39 UTC+1, Marius Sturm wrote:
>>
>> Hi Pete,
>> Hmm, it looks like the etcd database files somehow got corrupted. etcd is
>> used on the appliances to distribute configuration settings in a cluster
>> installation. So if you run a single-host installation, you can simply delete
>> the files and rerun the reconfigure command. The etcd data will then be
>> reinitialized.
>>
>> sudo graylog-ctl stop etcd
>> sudo rm -r /var/opt/graylog/data/etcd/*
>> sudo graylog-ctl start etcd
>> sudo graylog-ctl reconfigure
>>
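The steps quoted above delete the etcd data outright. A more cautious variant, sketched here, moves the files into a timestamped sibling directory first, so the corrupted data can still be inspected later. The helper name `backup_etcd_data` and the `.bak.<timestamp>` naming are assumptions made for illustration and are not part of graylog-ctl. The `-mindepth`/`-maxdepth` options assume GNU or BSD find.

```shell
# Hypothetical helper (not part of graylog-ctl): move the contents of the
# etcd data directory aside instead of deleting them.
# $1 = etcd data directory (the thread uses /var/opt/graylog/data/etcd)
# Prints the path of the backup directory on success.
backup_etcd_data() {
    data_dir=$1
    # Strip a trailing slash, then append a timestamped suffix.
    backup_dir="${data_dir%/}.bak.$(date +%Y%m%d%H%M%S)"
    mkdir -p "$backup_dir" || return 1
    # Move every top-level entry, including dotfiles, out of the data directory.
    find "$data_dir" -mindepth 1 -maxdepth 1 -exec mv {} "$backup_dir"/ \; || return 1
    printf '%s\n' "$backup_dir"
}
```

It would be run from a root shell in place of the `rm -r` step, that is, between `graylog-ctl stop etcd` and `graylog-ctl start etcd`, followed by `graylog-ctl reconfigure` as in the original steps.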
>>
>> \Marius
>>
>> On 13 April 2015 at 15:21, Pete Storey <[email protected]> wrote:
>>
>>> I've tried to upgrade a Graylog instance (running everything) on AWS
>>> using the instructions from here:
>>>
>>> https://github.com/Graylog2/graylog2-images/tree/master/ova
>>>
>>> Which are to run the commands:
>>>
>>> wget https://packages.graylog2.org/releases/graylog2-omnibus/ubuntu/graylog_latest.deb
>>> sudo rm /var/lib/dpkg/info/graylog.postrm
>>> sudo graylog-ctl stop
>>> sudo dpkg -G -i graylog_latest.deb
>>> sudo graylog-ctl reconfigure
>>>
>>>
>>> However this breaks at the reconfigure stage.
>>>
>>>  * execute[/opt/graylog/embedded/bin/graylog-ctl start graylog-server] action run
>>>     - execute /opt/graylog/embedded/bin/graylog-ctl start graylog-server
>>>   * ruby_block[add node to server list] action run
>>>
>>>     ================================================================================
>>>     Error executing action `run` on resource 'ruby_block[add node to server list]'
>>>     ================================================================================
>>>
>>>     Errno::ECONNREFUSED
>>>     -------------------
>>>     Connection refused - connect(2) for "127.0.0.1" port 4001
>>>
>>> Which I think is because the etcd server isn't started:
>>>
>>> ubuntu@graylog:~$ sudo graylog-ctl status
>>> run: elasticsearch: (pid 1019) 629s; run: log: (pid 1012) 629s
>>> down: etcd: 1s, normally up, want up; run: log: (pid 1011) 629s
>>> run: graylog-server: (pid 1786) 588s; run: log: (pid 1008) 629s
>>> run: graylog-web: (pid 1021) 629s; run: log: (pid 1018) 629s
>>> run: mongodb: (pid 1014) 629s; run: log: (pid 1009) 629s
>>> run: nginx: (pid 4562) 1s; run: log: (pid 1010) 629s
>>>
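The status listing quoted above can also be filtered mechanically. As a minimal sketch, assuming the line format shown there (state, then service name, separated by `: `), the following function prints only the services reported as down. The name `down_services` is a hypothetical helper, not a graylog-ctl subcommand.

```shell
# Hypothetical helper: print the name of every service reported as "down".
# Reads `graylog-ctl status` output on stdin, in the format quoted above,
# e.g. "down: etcd: 1s, normally up, want up; run: log: (pid 1011) 629s".
down_services() {
    # Split on ": " so that field 1 is the state and field 2 the service name.
    awk -F': ' '$1 == "down" { print $2 }'
}
```

Typical use would be `sudo graylog-ctl status | down_services`, which for the listing above would print only `etcd`.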
>>> It tries to start it in the Chef recipe:
>>>
>>>   * execute[/opt/graylog/embedded/bin/graylog-ctl start etcd] action run
>>>     - execute /opt/graylog/embedded/bin/graylog-ctl start etcd
>>> Recipe: graylog::elasticsearch
>>>   * directory[/var/log/graylog/elasticsearch] action create (up to date)
>>>
>>> which doesn't moan, but looking in the etcd log, it's reporting this
>>> every second:
>>>
>>> 2015-04-13_13:13:37.96364 2015/04/13 13:13:37 etcd: listening for peers on http://localhost:2380
>>> 2015-04-13_13:13:37.96367 2015/04/13 13:13:37 etcd: listening for peers on http://localhost:7001
>>> 2015-04-13_13:13:37.96368 2015/04/13 13:13:37 etcd: listening for client requests on http://0.0.0.0:2379
>>> 2015-04-13_13:13:37.96370 2015/04/13 13:13:37 etcd: listening for client requests on http://0.0.0.0:4001
>>> 2015-04-13_13:13:37.96511 2015/04/13 13:13:37 etcdserver: recovered store from snapshot at index 1310131
>>> 2015-04-13_13:13:37.96522 2015/04/13 13:13:37 etcdserver: name = default
>>> 2015-04-13_13:13:37.96522 2015/04/13 13:13:37 etcdserver: data dir = /var/opt/graylog/data/etcd
>>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: heartbeat = 100ms
>>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: election = 1000ms
>>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: snapshot count = 10000
>>> 2015-04-13_13:13:37.96524 2015/04/13 13:13:37 etcdserver: advertise client URLs = http://localhost:2379,http://localhost:4001
>>> 2015-04-13_13:13:37.96524 2015/04/13 13:13:37 etcdserver: loaded cluster information from store: default=http://localhost:2380,default=http://localhost:7001
>>> 2015-04-13_13:13:37.98912 2015/04/13 13:13:37 etcdserver: read wal error: unexpected EOF
>>>
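The last line of the log quoted above is the one that matters: etcd fails while reading its write-ahead log (WAL). A minimal sketch for spotting that condition in a log stream, assuming the message text stays exactly as quoted, could look like this. The name `has_wal_error` is a hypothetical helper.

```shell
# Hypothetical helper: exit status 0 if the log on stdin contains the
# WAL read error seen in the thread, non-zero otherwise.
has_wal_error() {
    grep -q 'etcdserver: read wal error'
}
```

It would be fed the etcd log on stdin, for example `has_wal_error < "$ETCD_LOG" && echo "etcd WAL looks corrupted"`, where `ETCD_LOG` is a placeholder for the log file's location, which the thread does not state.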
>>> Not sure what the error means though, nor can I work out how to fix it.
>>> Any ideas?
>>>
>>>  --
>>> You received this message because you are subscribed to the Google
>>> Groups "graylog2" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>
>>
>>
>> --
>> Developer
>>
>> Tel.: +49 (0)40 609 452 077
>> Fax.: +49 (0)40 609 452 078
>>
>> TORCH GmbH - A Graylog Company
>> Steckelhörn 11
>> 20457 Hamburg
>> Germany
>>
>> https://www.graylog.com <https://www.torch.sh/>
>>
>> Commercial Reg. (Registergericht): Amtsgericht Hamburg, HRB 125175
>> Geschäftsführer: Lennart Koopmann (CEO)
>>


