Hey, the init scripts actually check the pid files, so I'm wondering what happened here. Can you reproduce it reliably? If so, please open an issue.
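For reference, a pid-file based status check of the kind mentioned above can be sketched roughly as follows. This is only an illustration, not the actual Elasticsearch init script: the function names and the Python rendering are assumptions, and the return values follow the LSB convention for a `status` action (0 running, 1 dead but pid file exists, 3 not running, 4 unknown). It assumes a POSIX system.

```python
import os

# LSB-style exit codes for an init script's `status` action.
RUNNING = 0
DEAD_WITH_PIDFILE = 1
NOT_RUNNING = 3
UNKNOWN = 4


def pid_is_alive(pid: int) -> bool:
    """Return True if a process with this pid exists.

    Signal 0 performs the existence and permission checks
    without actually delivering a signal.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but is owned by another user.
        return True
    return True


def status(pidfile: str) -> int:
    """Hypothetical status check: read the pid file, then probe the pid."""
    try:
        with open(pidfile) as fh:
            pid = int(fh.read().strip())
    except FileNotFoundError:
        return NOT_RUNNING
    except ValueError:
        # The pid file exists but does not contain a number.
        return UNKNOWN
    if pid <= 0:
        # Never probe pid 0 or negative pids: they address process groups.
        return UNKNOWN
    return RUNNING if pid_is_alive(pid) else DEAD_WITH_PIDFILE
```

A caller such as a configuration management tool would typically treat any non-zero result as "not running" and try to start the service. Note that this sketch only tests whether the pid is alive; a status check that also compares the process against an expected executable could behave differently after the binary is swapped, and this thread does not establish which of the two the real script does.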
--Alex

On Tue, Mar 25, 2014 at 10:19 PM, Mark Walkom <[email protected]> wrote:

> I'd say it was the java swap that caused it, as ES will not start another
> process if it can see one running;
>
>> markw@es00-fv:~$ ps -ef|grep java
>> 106 20801 1 5 Feb25 ? 1-14:27:46 /usr/bin/java -Xms4g
>> -Xmx4g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError
>> -Delasticsearch -Des.pidfile=/var/run/elasticsearch.pid
>> -Des.path.home=/usr/share/elasticsearch -cp
>> :/usr/share/elasticsearch/lib/elasticsearch-1.0.0.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/*
>> -Des.default.config=/etc/elasticsearch/elasticsearch.yml
>> -Des.default.path.home=/usr/share/elasticsearch
>> -Des.default.path.logs=/var/log/elasticsearch
>> -Des.default.path.data=/var/lib/elasticsearch
>> -Des.default.path.work=/tmp/elasticsearch
>> -Des.default.path.conf=/etc/elasticsearch
>> org.elasticsearch.bootstrap.Elasticsearch
>> markw 24590 24487 0 08:18 pts/0 00:00:00 grep java
>> markw@es00-fv:~$ sservice elasticsearch status
>> [sudo] password for markw:
>> * elasticsearch is running
>> markw@es00-fv:~$ sservice elasticsearch start
>> * Starting Elasticsearch Server
>> * Already running.
>> [ OK ]
>> markw@es00-fv:~$ ps -ef|grep java
>> 106 20801 1 5 Feb25 ? 1-14:27:48 /usr/bin/java -Xms4g
>> -Xmx4g -Xss256k -Djava.awt.headless=true -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75
>> -XX:+UseCMSInitiatingOccupancyOnly -XX:+HeapDumpOnOutOfMemoryError
>> -Delasticsearch -Des.pidfile=/var/run/elasticsearch.pid
>> -Des.path.home=/usr/share/elasticsearch -cp
>> :/usr/share/elasticsearch/lib/elasticsearch-1.0.0.jar:/usr/share/elasticsearch/lib/*:/usr/share/elasticsearch/lib/sigar/*
>> -Des.default.config=/etc/elasticsearch/elasticsearch.yml
>> -Des.default.path.home=/usr/share/elasticsearch
>> -Des.default.path.logs=/var/log/elasticsearch
>> -Des.default.path.data=/var/lib/elasticsearch
>> -Des.default.path.work=/tmp/elasticsearch
>> -Des.default.path.conf=/etc/elasticsearch
>> org.elasticsearch.bootstrap.Elasticsearch
>> markw 24626 24487 0 08:18 pts/0 00:00:00 grep java
>> markw@es00-fv:~$
>
> Regards,
> Mark Walkom
>
> Infrastructure Engineer
> Campaign Monitor
> email: [email protected]
> web: www.campaignmonitor.com
>
> On 26 March 2014 03:51, Matt Wise <[email protected]> wrote:
>
>> Last night we ran into an interesting issue. We pushed out a change to
>> our hosts via Puppet that installed Oracles Java7 as the default JRE/JDK on
>> all of our hosts -- previously it had been the default only on a small
>> subset of our systems. When this happened, our ElasticSearch hosts broke in
>> a fairly spectacular way. The basic problem seems to be that changing out
>> the Java binary caused the /etc/init.d/elasticsearch init script to believe
>> the app was not running (though it was), and therefore Puppet started it
>> up. It looked like this:
>>
>>> puppet-agent[7069]:
>>> (/Stage[main]/Java::Jdk/Exec[set-licence-selected]/returns) executed
>>> successfully
>>> puppet-agent[7069]:
>>> (/Stage[main]/Java::Jdk/Apt::Source[oracle_java]/Apt::Key[Add key: EEA14886
>>> from Apt::Source
>>> oracle_java]/Exec[164487e6b8d5245829c02e964fe69ec79110cb81]/returns)
>>> executed successfully
>>> puppet-agent[7069]:
>>> (/Stage[main]/Java::Jdk/Apt::Source[oracle_java]/File[oracle_java.list]/ensure)
>>> created
>>> puppet-agent[7069]:
>>> (/Stage[main]/Flume/Apt::Source[cdh4]/Apt::Key[Add key: 02A818DD from
>>> Apt::Source cdh4]/Exec[a8c3d5690bde3d926f373000d0a4b28ac782829e]/returns)
>>> executed successfully
>>> puppet-agent[7069]:
>>> (/Stage[main]/Flume/Apt::Source[cdh4]/File[cdh4.list]/ensure) created
>>> puppet-agent[7069]: (/Stage[main]/Apt::Update/Exec[apt_update])
>>> Triggered 'refresh' from 2 events
>>> puppet-agent[7069]:
>>> (/Stage[main]/Java::Jdk/Package[oracle-java7-installer]/ensure) ensure
>>> changed 'purged' to 'present'
>>> puppet-agent[7069]:
>>> (/Stage[main]/Java::Jdk/Package[oracle-java7-set-default]/ensure) ensure
>>> changed 'purged' to 'present'
>>> *puppet-agent[7069]:
>>> (/Stage[main]/Elasticsearch::Service/Service[elasticsearch]/ensure) ensure
>>> changed 'stopped' to 'running'*
>>
>> I want to stress, ElasticSearch was *already running* ... but the Java
>> change seems to have tripped up the init script so that its 'status'
>> command returned a >0 exit code, causing Puppet to think it needed to start
>> up ElasticSearch. When this happened, we ended up running two ES daemons on
>> each of our nodes, and a whole ton of "reshuffling" occurred.
>>
>> Is this a design feature? Bug? Thoughts?
>>
>> Matt Wise
>> Sr. Systems Architect
>> Nextdoor.com
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "elasticsearch" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/elasticsearch/CAOHkZxO-0%3DA0QDSHmek18GauQsjNqLCnju1FHfhN8NW1MLfNqQ%40mail.gmail.com.
>> For more options, visit https://groups.google.com/d/optout.
