Hello everyone, I can't get god to start Thinking Sphinx (TS) and it's
starting to drive me a bit mad (madder). I'm following Tim's article here,
http://openmonkey.com/articles/2008/09/configuring-god-to-monitor-sphinxs-searchd
and having some problems.
I've included all my config files, and if anyone is nice (mad) enough to
have a look through this for me I'd be eternally grateful. I'm
emailing Tim as well to see if he can help. Tim, sorry mate.
Symptoms
I run a cap deploy, TS restarts and reindexes.
I want god to monitor TS and restart it if it goes outside its memory
limits or if something 'mental' happens and searchd stops for some
reason.
After deploying, I run sudo god log gaol-1 and it says that it's up.
I check ps aux | grep searchd, and it is.
To test if god will restart I've tried two options.
1. sudo killall searchd
searchd stops and god notices that it's stopped, but it fails to restart
it, with the error "start command exited with non-zero code = 1".
2. sudo god stop gaol-1
searchd stops and god is no longer monitoring it, so I run
sudo god start gaol-1. It starts monitoring again but never starts
searchd, with the same error: "start command exited with non-zero code = 1".
The command it says it is trying to run,
/usr/local/bin/searchd --pidfile --config /var/www/rails/gaol/current/config/staging_production.sphinx.conf
can be run from anywhere on the system when I'm logged into the system
as the same user that is in the uid setting.
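One way to narrow this down might be to run the same command with a stripped-down environment and look at the exit code and output, since a login shell usually has a fuller PATH and more variables set than a process started by a daemon. This is only a diagnostic sketch: the helper name run_with_minimal_env and the default PATH are assumptions, and it does not reproduce exactly how god spawns the start command.

```ruby
require "open3"

# Run a command with only PATH set (everything else is removed from the
# environment) and return [combined stdout/stderr, exit code].
# The helper name and the default PATH are illustrative assumptions.
def run_with_minimal_env(command, path = "/usr/bin:/bin")
  env = { "PATH" => path }
  output, status = Open3.capture2e(env, command, unsetenv_others: true)
  [output, status.exitstatus]
rescue Errno::ENOENT => e
  # The executable could not be found at all; 127 mirrors the shell convention.
  [e.message, 127]
end
```

Calling run_with_minimal_env with the searchd command above and printing both return values should show whether the command still exits with 1 outside a login shell, and what searchd prints when it does.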
Here's the god watch
God.watch do |w|
  w.group = "gaol-sphinx"
  w.name = "gaol-1"
  w.interval = 30.seconds
  w.uid = "username"
  w.gid = "groupname"
  w.start = "searchd --config /var/www/rails/gaol/current/config/staging_production.sphinx.conf"
  w.start_grace = 10.seconds
  w.stop = "searchd --config /var/www/rails/gaol/current/config/staging_production.sphinx.conf --stop"
  w.stop_grace = 10.seconds
  w.restart = w.stop + " && " + w.start
  w.restart_grace = 15.seconds
  w.pid_file = "/var/www/rails/gaol/shared/log/searchd.production.pid"
  w.behavior(:clean_pid_file)
  w.start_if do |start|
    start.condition(:process_running) do |c|
      c.interval = 5.seconds
      c.running = false
    end
  end
  w.restart_if do |restart|
    restart.condition(:memory_usage) do |c|
      c.above = 100.megabytes
      c.times = [3, 5] # 3 out of 5 intervals
    end
  end
  w.lifecycle do |on|
    on.condition(:flapping) do |c|
      c.to_state = [:start, :restart]
      c.times = 5
      c.within = 5.minutes
      c.transition = :unmonitored
      c.retry_in = 10.minutes
      c.retry_times = 5
      c.retry_within = 2.hours
    end
  end
end
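The watch above spells out the config path twice and relies on searchd being found on the PATH. A minimal sketch of building the three command strings in one place, with an absolute path to the binary, could look like the following. The helper name searchd_commands is hypothetical; the bin path and config path would be the ones from this post, and whether an absolute path changes the behaviour here is untested.

```ruby
# Build start/stop/restart command strings from a bin directory and a
# Sphinx config file, so the paths are only written once.
# searchd_commands is an illustrative helper, not part of god or Sphinx.
def searchd_commands(bin_path, config_file)
  searchd = File.join(bin_path, "searchd")
  start = "#{searchd} --config #{config_file}"
  stop = "#{start} --stop"
  { :start => start, :stop => stop, :restart => "#{stop} && #{start}" }
end
```

Inside the watch this could be used as cmds = searchd_commands("/usr/local/bin", "/var/www/rails/gaol/current/config/staging_production.sphinx.conf"), followed by w.start = cmds[:start], w.stop = cmds[:stop] and w.restart = cmds[:restart].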
Here's config/sphinx.yml
production:
bin_path: "/usr/local/bin"
searchd_file_path: "/var/www/rails/gaol/shared/sphinx"
config_file: "/var/www/rails/gaol/current/config/staging_production.sphinx.conf"
searchd_log_file: "/var/www/rails/gaol/shared/log/searchd.log"
query_log_file: "/var/www/rails/gaol/shared/log/searchd.query.log"
pid_file: "/var/www/rails/gaol/shared/log/searchd.production.pid"
Here's the static production.sphinx.conf file (we run the index
command with INDEX_ONLY=true)
indexer
{
}
searchd
{
address = 127.0.0.1
port = 3312
pid_file = /var/www/rails/gaol/shared/log/searchd.production.pid
}
source app_core_0
{
type = mysql
sql_host = sql database server
sql_user = user
sql_pass = password
sql_db = database_name
sql_sock = /var/run/mysqld/mysqld.sock
sql_query_pre = SET NAMES utf8
sql_query = SELECT `apps`.`id` * 1 + 0 AS `id`, \
  CAST(`apps`.`given_names` AS CHAR) AS `given_names`, \
  CAST(`apps`.`surname` AS CHAR) AS `surname`, \
  CAST(`apps`.`date_of_birth` AS CHAR) AS `dob`, \
  CAST(`apps`.`personalid` AS CHAR) AS `personalid`, \
  CAST(GROUP_CONCAT(DISTINCT `choices`.`decision` SEPARATOR ' ') AS CHAR) AS `decision`, \
  CAST(GROUP_CONCAT(DISTINCT `courseinstances`.`entry_year` SEPARATOR ' ') AS CHAR) AS `app_year`, \
  CAST(GROUP_CONCAT(DISTINCT `courses`.`facultycode_id` SEPARATOR ' ') AS CHAR) AS `faculty`, \
  CAST(GROUP_CONCAT(DISTINCT `courses`.`coursetitle` SEPARATOR ' ') AS CHAR) AS `Coursetitle`, \
  CAST(`courses`.`campus` AS CHAR) AS `campus`, \
  `apps`.`id` AS `sphinx_internal_id`, \
  4045616687 AS `class_crc`, \
  '4045616687' AS `subclass_crcs`, \
  0 AS `sphinx_deleted`, \
  IFNULL(`apps`.`given_names`, '') AS `given_names_sort` \
  FROM `apps` \
  LEFT OUTER JOIN `choices` ON choices.app_id = apps.id \
  LEFT OUTER JOIN `courseinstances` ON choices.courseinstance_id = courseinstances.id \
  LEFT OUTER JOIN `courses` ON courses.id = courseinstances.course_id \
  WHERE `apps`.`id` >= $start AND `apps`.`id` <= $end \
  GROUP BY `apps`.`id` ORDER BY NULL
sql_query_range = SELECT IFNULL(MIN(`id`), 1), IFNULL(MAX(`id`), 1) FROM `apps`
sql_attr_uint = sphinx_internal_id
sql_attr_uint = class_crc
sql_attr_uint = sphinx_deleted
sql_attr_str2ordinal = given_names_sort
sql_attr_multi = uint subclass_crcs from field
sql_query_info = SELECT * FROM `apps` WHERE `id` = (($id - 0) / 1)
}
index app_core
{
source = app_core_0
path = /var/www/rails/gaol/shared/sphinx/app_core
min_infix_len = 3
enable_star = 1
}
index app
{
type = distributed
local = app_core
}
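Since god decides whether searchd is running from w.pid_file, it may be worth confirming that the pid_file in the searchd block of the config god actually passes to searchd matches the one in the watch. A rough sketch of pulling that value out of a Sphinx config follows. The helper name searchd_pid_file is hypothetical, and the parsing is deliberately naive: it assumes the block's closing brace sits at the start of a line, as in the listing above.

```ruby
# Return the pid_file value from the searchd { ... } block of a Sphinx
# config given as a string, or nil if the block or the setting is missing.
# searchd_pid_file is an illustrative helper with deliberately simple parsing.
def searchd_pid_file(conf_text)
  block = conf_text[/^searchd\s*\{(.*?)^\}/m, 1]
  return nil unless block
  block[/^\s*pid_file\s*=\s*(\S+)/, 1]
end
```

For example, searchd_pid_file(File.read(path_to_conf)) could be compared against the string assigned to w.pid_file, where path_to_conf stands for whichever config file the start command names.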
Here's the section of the cap deploy script that restarts sphinx on
deploy
run "cd /var/www/rails/gaol/current; RAILS_ENV=production rake thinking_sphinx:stop"
run "cd /var/www/rails/gaol/current; RAILS_ENV=production rake thinking_sphinx:index INDEX_ONLY=true"
run "cd /var/www/rails/gaol/current; RAILS_ENV=production rake thinking_sphinx:start"