Faidon Liambotis has submitted this change and it was merged.
Change subject: Update elasticsearch settings
..
Update elasticsearch settings
1. Turn off some logging that I _think_ is causing some cluster actions
to really slow down.
2. Make permanent some settings that we've configured via the api to
speed up recovery after a node crashes.
3. Make elastic1007 not master eligible because it is down. elastic1008
takes its place.
Change-Id: Id40b0c9bececa8b89ebb9228418957a53ada44c9
---
M manifests/role/elasticsearch.pp
M modules/elasticsearch/templates/elasticsearch.yml.erb
M modules/elasticsearch/templates/logging.yml.erb
3 files changed, 9 insertions(+), 14 deletions(-)
Approvals:
Chad: Looks good to me, but someone else must approve
Faidon Liambotis: Looks good to me, approved
jenkins-bot: Verified
diff --git a/manifests/role/elasticsearch.pp b/manifests/role/elasticsearch.pp
index 3aaae6f..f0a8be6 100644
--- a/manifests/role/elasticsearch.pp
+++ b/manifests/role/elasticsearch.pp
@@ -36,7 +36,7 @@
}
$master_eligible = $::hostname ? {
/^elastic1001/ => true, # Rack A3
-/^elastic1007/ => true, # Rack C5
+/^elastic1008/ => true, # Rack C5
# TODO Move this when we get machines on another row/rack
/^elastic1012/ => true, # Rack C5
default => false,
diff --git a/modules/elasticsearch/templates/elasticsearch.yml.erb b/modules/elasticsearch/templates/elasticsearch.yml.erb
index f05759f..d0ccdb2 100644
--- a/modules/elasticsearch/templates/elasticsearch.yml.erb
+++ b/modules/elasticsearch/templates/elasticsearch.yml.erb
@@ -273,22 +273,18 @@
# Set the number of concurrent recoveries happening on a node:
#
# 1. During the initial recovery
-#
-# cluster.routing.allocation.node_initial_primaries_recoveries: 4
-#
+cluster.routing.allocation.node_initial_primaries_recoveries: 4
+
# 2. During adding/removing nodes, rebalancing, etc
-#
-# cluster.routing.allocation.node_concurrent_recoveries: 2
+cluster.routing.allocation.node_concurrent_recoveries: 15
# Set to throttle throughput when recovering (eg. 100mb, by default 20mb):
-#
-# indices.recovery.max_bytes_per_sec: 20mb
+indices.recovery.max_bytes_per_sec: 100mb
# Set to limit the number of open concurrent streams when
# recovering a shard from a peer:
-#
-# indices.recovery.concurrent_streams: 5
+indices.recovery.concurrent_streams: 5
## Discovery ##
diff --git a/modules/elasticsearch/templates/logging.yml.erb b/modules/elasticsearch/templates/logging.yml.erb
index a0e2e22..55f2374 100644
--- a/modules/elasticsearch/templates/logging.yml.erb
+++ b/modules/elasticsearch/templates/logging.yml.erb
@@ -7,10 +7,9 @@
# https://github.com/elasticsearch/elasticsearch/issues/4203
action.admin.cluster.node.stats: INFO
- # We'd like to know more about shard allocation and management because we've
- # had an outage due to unknown allocation problems. This will cause more
- # logging but it should be worth it.
- cluster: TRACE
+ # If you need to know more about shard allocation you can set this to debug.
+ # Trace seems to generate enough logs to slow down the process.
+ # cluster: DEBUG
# reduce the logging for aws, too much is logged under the default INFO
com.amazonaws: WARN
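
Item 2 of the commit message says the recovery settings were first configured via the API and are now made permanent in elasticsearch.yml. As a rough illustration only, the sketch below builds the request that such a runtime change might have used, assuming the cluster update settings endpoint (PUT /_cluster/settings) and assuming all four settings are dynamically updatable in the Elasticsearch version in use. The helper name build_settings_request is hypothetical; the values are copied from the elasticsearch.yml.erb hunk. Nothing is sent over the network.

```python
import json

# Values copied from the elasticsearch.yml.erb hunk in this change.
RECOVERY_SETTINGS = {
    "cluster.routing.allocation.node_initial_primaries_recoveries": 4,
    "cluster.routing.allocation.node_concurrent_recoveries": 15,
    "indices.recovery.max_bytes_per_sec": "100mb",
    "indices.recovery.concurrent_streams": 5,
}


def build_settings_request(settings, scope="transient"):
    """Return (method, path, body) for a cluster settings update.

    Hypothetical helper: it only assembles the request, it does not send it.
    "transient" settings are lost on a full cluster restart, which is why
    the change writes the same values into elasticsearch.yml.
    """
    if scope not in ("transient", "persistent"):
        raise ValueError("scope must be 'transient' or 'persistent'")
    body = json.dumps({scope: settings}, sort_keys=True, indent=2)
    return "PUT", "/_cluster/settings", body


if __name__ == "__main__":
    method, path, body = build_settings_request(RECOVERY_SETTINGS)
    print(method, path)
    print(body)
```

The printed body could be passed to any HTTP client pointed at a cluster node; which host and port to use is deployment-specific and is not stated in this change.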
--
To view, visit https://gerrit.wikimedia.org/r/102048
Gerrit-MessageType: merged
Gerrit-Change-Id: Id40b0c9bececa8b89ebb9228418957a53ada44c9
Gerrit-PatchSet: 2
Gerrit-Project: operations/puppet
Gerrit-Branch: production
Gerrit-Owner: Manybubbles never...@wikimedia.org
Gerrit-Reviewer: Chad ch...@wikimedia.org
Gerrit-Reviewer: Faidon Liambotis fai...@wikimedia.org
Gerrit-Reviewer: Ottomata o...@wikimedia.org
Gerrit-Reviewer: jenkins-bot