Last night we ran into an interesting issue. We pushed out a change to our
hosts via Puppet that installed Oracle's Java 7 as the default JRE/JDK on all
of our hosts -- previously it had been the default only on a small subset
of our systems. When this happened, our ElasticSearch hosts broke in a
fairly spectacular way. The basic problem seems to be that swapping out the
Java binary caused the /etc/init.d/elasticsearch init script to believe the
app was not running (though it was), so Puppet started it a second time. The
Puppet agent log looked like this:

puppet-agent[7069]: (/Stage[main]/Java::Jdk/Exec[set-licence-selected]/returns) executed successfully
puppet-agent[7069]: (/Stage[main]/Java::Jdk/Apt::Source[oracle_java]/Apt::Key[Add key: EEA14886 from Apt::Source oracle_java]/Exec[164487e6b8d5245829c02e964fe69ec79110cb81]/returns) executed successfully
puppet-agent[7069]: (/Stage[main]/Java::Jdk/Apt::Source[oracle_java]/File[oracle_java.list]/ensure) created
puppet-agent[7069]: (/Stage[main]/Flume/Apt::Source[cdh4]/Apt::Key[Add key: 02A818DD from Apt::Source cdh4]/Exec[a8c3d5690bde3d926f373000d0a4b28ac782829e]/returns) executed successfully
puppet-agent[7069]: (/Stage[main]/Flume/Apt::Source[cdh4]/File[cdh4.list]/ensure) created
puppet-agent[7069]: (/Stage[main]/Apt::Update/Exec[apt_update]) Triggered 'refresh' from 2 events
puppet-agent[7069]: (/Stage[main]/Java::Jdk/Package[oracle-java7-installer]/ensure) ensure changed 'purged' to 'present'
puppet-agent[7069]: (/Stage[main]/Java::Jdk/Package[oracle-java7-set-default]/ensure) ensure changed 'purged' to 'present'
*puppet-agent[7069]: (/Stage[main]/Elasticsearch::Service/Service[elasticsearch]/ensure) ensure changed 'stopped' to 'running'*


I want to stress that ElasticSearch was *already running*, but the Java
change seems to have tripped up the init script so that its 'status'
command returned a non-zero exit code, which made Puppet think it needed to
start ElasticSearch. As a result, we ended up running two ES daemons on
each of our nodes, and a great deal of "reshuffling" occurred.
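
To make the suspected failure concrete, here is a minimal Python sketch of one way a 'status' check could misreport a live daemon. It is a guess at the mechanism, not the actual logic of /etc/init.d/elasticsearch: it assumes the check reads a pid file and compares the running process's executable path against the Java binary it expects. The function names, the pid, and the JVM paths in the usage note are all hypothetical. The return values follow the LSB init script convention (0 = running, 1 = dead but pid file exists, 3 = not running).

```python
import os


def proc_exe(pid):
    """Return the executable path of a live process via /proc, or None."""
    try:
        return os.readlink(f"/proc/{pid}/exe")
    except OSError:
        return None


def status_exit_code(pid_file, expected_exe, resolve_exe=proc_exe):
    """Hypothetical 'status' check that matches the daemon by binary path.

    Returns an LSB-style code: 0 if the pid in pid_file belongs to a process
    running expected_exe, 1 if the pid file exists but no such process
    matches, and 3 if there is no usable pid file.
    """
    try:
        with open(pid_file) as handle:
            pid = int(handle.read().strip())
    except (OSError, ValueError):
        return 3  # no usable pid file: report "not running"

    # The daemon may still be alive here. If the default java path changed
    # underneath it, the comparison fails and the check reports a dead
    # service even though the old JVM process is still running.
    if resolve_exe(pid) == expected_exe:
        return 0
    return 1
```

With a pid file pointing at a JVM started from a hypothetical /usr/lib/jvm/java-6-openjdk/bin/java, calling status_exit_code with that same path returns 0. Calling it with a hypothetical /usr/lib/jvm/java-7-oracle/bin/java returns 1, even though the process never stopped. A config management tool that treats any non-zero status as "stopped" would then start a second copy, which matches the symptom in the log.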

Is this a design feature? Bug? Thoughts?

Matt Wise
Sr. Systems Architect
Nextdoor.com
