Re: [graylog2] Upgrading Graylog 1.0.0 to 1.0.1 on AWS has broken it

Pete Storey Wed, 15 Apr 2015 03:49:01 -0700

Thanks for that.  Ended up rebuilding the machine anyway - was a test box 
and have now moved to a cluster setup.  Is there any reason the upgrade 
should corrupt etcd - is it something to be aware of in the future?



On Wednesday, 15 April 2015 10:03:39 UTC+1, Marius Sturm wrote:
>
> Hi Pete,
> mmh looks like the database files of etcd got somehow corrupted. etcd is 
> used on the appliances for distributing configuration settings in a cluster 
> installation. So if you run a single-host installation you can simply delete 
> the files and rerun the reconfigure command. etcd data will then be 
> reinitialized.
>
> sudo graylog-ctl stop etcd
> sudo rm -r /var/opt/graylog/data/etcd/*
> sudo graylog-ctl start etcd
> sudo graylog-ctl reconfigure
>
>
> \Marius
>
> On 13 April 2015 at 15:21, Pete Storey <[email protected]> wrote:
>
>> I've tried to upgrade a Graylog instance (running everything) on AWS 
>> using the instructions from here:
>>
>> https://github.com/Graylog2/graylog2-images/tree/master/ova 
>>
>> Which are to run the commands:
>>
>> wget https://packages.graylog2.org/releases/graylog2-omnibus/ubuntu/graylog_latest.deb
>> sudo rm /var/lib/dpkg/info/graylog.postrm
>> sudo graylog-ctl stop
>> sudo dpkg -G -i graylog_latest.deb
>> sudo graylog-ctl reconfigure
>>
>>
>> However this breaks at the reconfigure stage.  
>>
>>  * execute[/opt/graylog/embedded/bin/graylog-ctl start graylog-server] action run
>>     - execute /opt/graylog/embedded/bin/graylog-ctl start graylog-server
>>   * ruby_block[add node to server list] action run
>>
>>     ================================================================================
>>     Error executing action `run` on resource 'ruby_block[add node to server list]'
>>     ================================================================================
>>
>>     Errno::ECONNREFUSED
>>     -------------------
>>     Connection refused - connect(2) for "127.0.0.1" port 4001
>>
>> Which is I think because the etcd server isn't started:
>>
>> ubuntu@graylog:~$ sudo graylog-ctl status
>> run: elasticsearch: (pid 1019) 629s; run: log: (pid 1012) 629s
>> down: etcd: 1s, normally up, want up; run: log: (pid 1011) 629s
>> run: graylog-server: (pid 1786) 588s; run: log: (pid 1008) 629s
>> run: graylog-web: (pid 1021) 629s; run: log: (pid 1018) 629s
>> run: mongodb: (pid 1014) 629s; run: log: (pid 1009) 629s
>> run: nginx: (pid 4562) 1s; run: log: (pid 1010) 629s
>>
>> It tries to start it in the Chef recipe:
>>
>>   * execute[/opt/graylog/embedded/bin/graylog-ctl start etcd] action run
>>     - execute /opt/graylog/embedded/bin/graylog-ctl start etcd
>> Recipe: graylog::elasticsearch
>>   * directory[/var/log/graylog/elasticsearch] action create (up to date)
>>
>> which doesn't moan, but looking in the etcd log, it's reporting this 
>> every second:
>>
>> 2015-04-13_13:13:37.96364 2015/04/13 13:13:37 etcd: listening for peers on http://localhost:2380
>> 2015-04-13_13:13:37.96367 2015/04/13 13:13:37 etcd: listening for peers on http://localhost:7001
>> 2015-04-13_13:13:37.96368 2015/04/13 13:13:37 etcd: listening for client requests on http://0.0.0.0:2379
>> 2015-04-13_13:13:37.96370 2015/04/13 13:13:37 etcd: listening for client requests on http://0.0.0.0:4001
>> 2015-04-13_13:13:37.96511 2015/04/13 13:13:37 etcdserver: recovered store from snapshot at index 1310131
>> 2015-04-13_13:13:37.96522 2015/04/13 13:13:37 etcdserver: name = default
>> 2015-04-13_13:13:37.96522 2015/04/13 13:13:37 etcdserver: data dir = /var/opt/graylog/data/etcd
>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: heartbeat = 100ms
>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: election = 1000ms
>> 2015-04-13_13:13:37.96523 2015/04/13 13:13:37 etcdserver: snapshot count = 10000
>> 2015-04-13_13:13:37.96524 2015/04/13 13:13:37 etcdserver: advertise client URLs = http://localhost:2379,http://localhost:4001
>> 2015-04-13_13:13:37.96524 2015/04/13 13:13:37 etcdserver: loaded cluster information from store: default=http://localhost:2380,default=http://localhost:7001
>> 2015-04-13_13:13:37.98912 2015/04/13 13:13:37 etcdserver: read wal error: unexpected EOF
>>
>> Not sure what the error means though, nor can I work out how to fix it.  
>> Any ideas?
>>
>>  -- 
>> You received this message because you are subscribed to the Google Groups 
>> "graylog2" group.
>> To unsubscribe from this group and stop receiving emails from it, send an 
>> email to [email protected].
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> -- 
> Developer
>
> Tel.: +49 (0)40 609 452 077
> Fax.: +49 (0)40 609 452 078
>
> TORCH GmbH - A Graylog Company
> Steckelhörn 11
> 20457 Hamburg
> Germany
>
> https://www.graylog.com <https://www.torch.sh/>
>
> Commercial Reg. (Registergericht): Amtsgericht Hamburg, HRB 125175
> Geschäftsführer: Lennart Koopmann (CEO)
>  
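
For anyone hitting the same "read wal error: unexpected EOF" on a single-host 
install, the reset steps quoted above could be wrapped so that the old etcd 
files are archived before they are deleted. This is only a rough sketch, not 
something from the Graylog docs: the data directory and the graylog-ctl 
subcommands are the ones from this thread, while reset_etcd, BACKUP_DIR, CTL 
and SUDO are made-up names for illustration. It also assumes GNU find and tar, 
as on the Ubuntu-based appliance.

```shell
#!/bin/sh
# Sketch only: archive the etcd data directory, wipe it, then restart etcd.
# Intended for single-host setups, as in the advice quoted in this thread.
# reset_etcd, BACKUP_DIR, CTL and SUDO are hypothetical names.

ETCD_DIR="${ETCD_DIR:-/var/opt/graylog/data/etcd}"   # path from the thread
BACKUP_DIR="${BACKUP_DIR:-/var/tmp}"                 # assumption: any writable dir
CTL="${CTL:-graylog-ctl}"
SUDO="${SUDO-sudo}"                                  # set SUDO="" to run without sudo

reset_etcd() {
    # Refuse to continue if the data directory is unset or missing.
    if [ -z "$ETCD_DIR" ] || [ ! -d "$ETCD_DIR" ]; then
        echo "etcd data dir not found: $ETCD_DIR" >&2
        return 1
    fi

    backup="$BACKUP_DIR/etcd-$(date +%Y%m%d-%H%M%S).tar.gz"

    $SUDO $CTL stop etcd || return 1

    # Keep a copy of the (possibly corrupted) files for later inspection.
    $SUDO tar -czf "$backup" -C "$ETCD_DIR" . || return 1
    echo "backup written to $backup"

    # Delete everything below the data directory, but keep the directory itself.
    $SUDO find "$ETCD_DIR" -mindepth 1 -delete || return 1

    # etcd starts with an empty store; reconfigure repopulates it.
    $SUDO $CTL start etcd && $SUDO $CTL reconfigure
}
```

Calling reset_etcd after sourcing the file runs the four steps in order and 
stops at the first one that fails. On a cluster setup the data is shared 
configuration, so deleting it this way is probably not appropriate.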

