Last night we ran into an interesting issue. We pushed out a change to our hosts via Puppet that installed Oracle's Java 7 as the default JRE/JDK on all of our hosts; previously it had been the default only on a small subset of our systems. When this happened, our ElasticSearch hosts broke in a fairly spectacular way. The basic problem seems to be that swapping out the Java binary caused the /etc/init.d/elasticsearch init script to believe the app was not running (though it was), and so Puppet started it up a second time. It looked like this:

puppet-agent[7069]: (/Stage[main]/Java::Jdk/Exec[set-licence-selected]/returns) executed successfully
puppet-agent[7069]: (/Stage[main]/Java::Jdk/Apt::Source[oracle_java]/Apt::Key[Add key: EEA14886 from Apt::Source oracle_java]/Exec[164487e6b8d5245829c02e964fe69ec79110cb81]/returns) executed successfully
puppet-agent[7069]: (/Stage[main]/Java::Jdk/Apt::Source[oracle_java]/File[oracle_java.list]/ensure) created
puppet-agent[7069]: (/Stage[main]/Flume/Apt::Source[cdh4]/Apt::Key[Add key: 02A818DD from Apt::Source cdh4]/Exec[a8c3d5690bde3d926f373000d0a4b28ac782829e]/returns) executed successfully
puppet-agent[7069]: (/Stage[main]/Flume/Apt::Source[cdh4]/File[cdh4.list]/ensure) created
puppet-agent[7069]: (/Stage[main]/Apt::Update/Exec[apt_update]) Triggered 'refresh' from 2 events
puppet-agent[7069]: (/Stage[main]/Java::Jdk/Package[oracle-java7-installer]/ensure) ensure changed 'purged' to 'present'
puppet-agent[7069]: (/Stage[main]/Java::Jdk/Package[oracle-java7-set-default]/ensure) ensure changed 'purged' to 'present'
*puppet-agent[7069]: (/Stage[main]/Elasticsearch::Service/Service[elasticsearch]/ensure) ensure changed 'stopped' to 'running'*

I want to stress that ElasticSearch was *already running*, but the Java change seems to have tripped up the init script so that its 'status' command returned a non-zero exit code, causing Puppet to think it needed to start ElasticSearch. As a result, we ended up running two ES daemons on each of our nodes, and a whole ton of "reshuffling" occurred.

Is this a design feature? Bug? Thoughts?

Matt Wise
Sr. Systems Architect
Nextdoor.com
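
To make the failure mode concrete, here is a rough Python sketch of a 'status' check that depends only on the PID file and on whether that PID is alive, not on the path of the Java binary. It is not the logic of the actual /etc/init.d/elasticsearch script, which is not reproduced here. The function names and the PID file path in the usage note are hypothetical, and the return values follow the LSB init-script status convention (0 running, 1 dead but PID file exists, 3 not running, 4 unknown). It assumes a Linux/POSIX host.

```python
import errno
import os


def pid_is_running(pid: int) -> bool:
    """Return True if a process with this PID currently exists."""
    try:
        # Signal 0 delivers nothing; it only performs the existence/permission check.
        os.kill(pid, 0)
    except OSError as exc:
        # EPERM: the process exists but is owned by another user.
        return exc.errno == errno.EPERM
    return True


def status_from_pidfile(pidfile: str) -> int:
    """Return an LSB-style status code based only on the PID file.

    0 = running, 1 = dead but PID file exists, 3 = not running, 4 = unknown.
    """
    try:
        with open(pidfile) as fh:
            pid = int(fh.read().strip())
    except FileNotFoundError:
        return 3
    except ValueError:
        return 4  # PID file exists but does not contain a number
    if pid <= 0:
        return 4  # 0 and negative values address process groups, not one process
    return 0 if pid_is_running(pid) else 1
```

For example, `status_from_pidfile("/var/run/elasticsearch.pid")` (path assumed for illustration) would keep returning 0 for a daemon that is still alive even after the default java alternative has been switched, so a Puppet `ensure => running` check built on it would have no reason to start a second copy.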
