Re: [boinc_dev] MultiHost
Hi David!

On 15.12.12 21:09, David Anderson wrote:
> Bernd: Thanks; I checked in these changes. However, I don't really understand them (e.g. the need for a project_dir element in config.xml). When you get a chance, can you please update the documentation on this?
> http://boinc.berkeley.edu/trac/wiki/MultiHost#Runserverdaemonsandtasksonmultiplehosts

Assume the following setup:

- The main project server myserver keeps the project directory in /nfs/export/boinc/projects/MyProject and exports /nfs/export/boinc.
- Secondary project servers (auto)mount that directory at the local path /auto/myserver/boinc.
- You would usually put symlinks to these paths on the servers, like /boinc -> /nfs/export/boinc or /boinc -> /auto/myserver/boinc respectively, to get a consistent path on all servers.

The problem with the previous code was that the Python start script and the server status page determined the project (or bin directory) path with getcwd() or similar. On myserver above this yielded /nfs/export/boinc/projects/MyProject, a path that doesn't exist on the secondary servers. However, this path was used to construct e.g. the command line for the ssh command executed on the remote servers, which therefore didn't work.

You can now add project_dir to the config file to force both the server status page and the start script to use a common path that exists on all servers (here /boinc/projects/MyProject), even if the physical path is not identical.

I'm not sure how this would blend into that page, though.

Best,
Bernd

___
boinc_dev mailing list
boinc_dev@ssl.berkeley.edu
http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
To unsubscribe, visit the above URL and (near bottom of page) enter your email address.
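For readers who want to see the path problem concretely, here is a minimal sketch of how a start script could pick the directory and ssh binary for a remote invocation. It is not the code from the patch series: the function name remote_start_command, the plain dict standing in for the parsed config.xml, and the '&&' separator are assumptions made for illustration. Only project_dir, ssh_exec and the '/usr/bin/ssh' default come from this thread.

```python
import os

DEFAULT_SSH = "/usr/bin/ssh"  # default mentioned in the thread


def remote_start_command(config, host, program_name, args, cwd=None):
    """Build the argv for running a project script on a remote host.

    `config` is a plain dict standing in for the parsed config.xml
    (hypothetical; the real start script uses its own config classes).
    """
    # Use the configured ssh binary if there is one.
    ssh = config.get("ssh_exec", DEFAULT_SSH)

    if "project_dir" in config:
        # Common path that exists on every server, e.g. via a symlink.
        bin_dir = config["project_dir"].rstrip("/") + "/bin"
        cmd = "./" + program_name
    else:
        # Old behaviour: the physical path as seen on the local host,
        # which may not exist on the remote host.
        bin_dir = cwd if cwd is not None else os.getcwd()
        cmd = program_name

    # ssh joins the trailing arguments with spaces and hands them to the
    # remote shell, so "cd DIR && CMD" runs CMD inside DIR (assumed layout).
    return [ssh, host, "cd", bin_dir, "&&", cmd] + list(args)
```

With project_dir set to /boinc/projects/MyProject, the remote command changes into /boinc/projects/MyProject/bin on every host, regardless of where the directory physically lives on the machine that issues the command.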
Re: [boinc_dev] MultiHost
Bernd: Thanks; I checked in these changes. However, I don't really understand them (e.g. the need for a project_dir element in config.xml). When you get a chance, can you please update the documentation on this?
http://boinc.berkeley.edu/trac/wiki/MultiHost#Runserverdaemonsandtasksonmultiplehosts

-- David
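As an aid to understanding the daemon status change discussed in this thread (look up the pid on the host where the daemon runs, then check whether that process is alive, and skip both steps for disabled daemons), here is a rough sketch of the logic such a helper might run locally. It is not the pshelper script from the patches, whose interface is not shown here. The function names, the dict-based daemon entry and the "<name>.pid" file naming are hypothetical, and the liveness check uses signal 0 via os.kill on a POSIX system instead of running ps.

```python
import os


def read_pid(pid_file):
    """Return the pid stored in pid_file, or None if missing or malformed."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def pid_is_running(pid):
    """Check for a live process; signal 0 performs the check without
    delivering a signal (stand-in for the ps call in the thread)."""
    if pid is None or pid <= 0:
        return False
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def daemon_status(daemon, pid_dir):
    """Return 'disabled', 'running' or 'not running' for one daemon entry."""
    if daemon.get("disabled"):
        # Return early: no pid lookup and no process check are needed.
        return "disabled"
    # Hypothetical naming scheme for pid files.
    pid_file = os.path.join(pid_dir, daemon["name"] + ".pid")
    return "running" if pid_is_running(read_pid(pid_file)) else "not running"
```

Because everything happens on the host that owns the pid directory, a caller on the main server would only need one remote command per daemon to obtain the result, which matches the trade-off described in the patch summary.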
[boinc_dev] MultiHost
Hi!

I wonder whether any project has gotten a multi-server setup to work based on the current code and documentation (http://boinc.berkeley.edu/trac/wiki/MultiHost), and how.

What I found is that the current code silently assumes that there is an (NFS) shared project directory mounted on all project servers at the very same physical path, which at least for us isn't the case. We do have a project directory path that is common on all servers, but it is set up with symlinks. The physical path varies (for good reasons), and the project directory (including subdirectories like the pid directories) isn't shared (because some remote servers are far away).

In addition, the hardcoded ssh command used in the start/stop/status script is completely independent of the ssh configuration for the server status page (SSP), which is at least confusing.

The attached series of patches is meant to fix that:

- The project path to use on all servers can now be configured as project_dir. For backwards compatibility the defaults in server_status.php and start are chosen such that the old behavior is unchanged (../.. in SSP, os.getcwd() in start).
- The start script now uses the ssh_exec path for ssh if configured. The default ssh in the start script is now '/usr/bin/ssh' (as already in the SSP).
- The pid of a daemon is now looked up in the pid directory on the _remote_ host via ssh, thus not requiring a shared project directory. Determining the pid and running ps to find out whether the daemon is running is done by a script (pshelper) executed on the remote host, so only one command has to be executed remotely via ssh. Still, one ssh connection is required for every daemon on a remote host, which could be a significant slowdown. I'd rather handle all daemons running on one host in a single connection, but I couldn't get this finished now. If my current solution is to be used, pshelper must be put into the bin/ directory of the project on the remote server (make_project should be updated to do this). The 'ps' command used on remote hosts must be edited there.
- If a daemon is disabled, daemons_status() returns immediately without looking up the PID and checking whether the daemon is actually running (which wouldn't change the return value anyway).
- The ssh command that is executed on the remote host by the start script is only printed when the start script is run in verbose mode. In particular this avoids unnecessary output, and thus mails, when run by cron (start --cron).

Best,
Bernd

From 281b9ef3c3d29a0cb6c2a26d725557ffb1af17a3 Mon Sep 17 00:00:00 2001
From: Bernd Machenschalk bernd.machensch...@aei.mpg.de
Date: Mon, 3 Dec 2012 10:07:54 +
Subject: [PATCH 1/5] only print remote command in verbose mode (in particular not when ran with --cron)

---
 sched/start | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sched/start b/sched/start
index fb01cc1..98ffc93 100755
--- a/sched/start
+++ b/sched/start
@@ -745,7 +745,7 @@ if is_main_host:
         remote_cmd = [ 'ssh', host, 'cd', cwd, ' ' ] + sys.argv
         if verbose:
             remote_cmd += [ '-v' ]
-        print 'running ', ' '.join(remote_cmd)
+            print 'running ', ' '.join(remote_cmd)
         os.spawnvp(wait_mode, remote_cmd[0], remote_cmd)
 
 os.unlink(start_lockfile)
-- 
1.7.12.2

From dca59f85a1aa07b85b046bd40bda344b76700577 Mon Sep 17 00:00:00 2001
From: Bernd Machenschalk bernd.machensch...@aei.mpg.de
Date: Mon, 3 Dec 2012 10:09:54 +
Subject: [PATCH 2/5] sync configuration of remote server management with PHP (server status page)

- configure ssh executable to use with ssh_exec
- configure a project directory common to all hosts with project_dir
---
 sched/start | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/sched/start b/sched/start
index 98ffc93..962b385 100755
--- a/sched/start
+++ b/sched/start
@@ -611,7 +611,6 @@ def command_show_config():
 local_hostname = socket.gethostname()
 local_hostname = local_hostname.split('.')[0]
 # print 'local hostname: ', local_hostname
-cwd = os.getcwd()
 program_name = os.path.basename(sys.argv[0])
 if program_name == 'start':
     command = command_enable_start
@@ -709,6 +708,18 @@ if not command:
 config = configxml.ConfigFile(config_filename).read()
 run_state = configxml.RunStateFile(run_state_filename).read(failopen_ok = True)
 
+if 'ssh_exec' in config.config.__dict__:
+    ssh = config.config.ssh_exec
+else:
+    ssh = '/usr/bin/ssh'
+
+if 'project_dir' in config.config.__dict__:
+    cwd = config.config.project_dir + '/bin'
+    cmd = './' + program_name
+else:
+    cwd = os.getcwd()
+    cmd = sys.argv[0]
+
 os.chdir(boinc_project_path.project_path())
 bin_dir = get_dir('bin')
 cgi_bin_dir = get_dir('cgi_bin')
@@ -742,7 +753,7 @@ if is_main_host:
     for host in other_hosts:
         if host == local_hostname:
             continue
-        remote_cmd = [ 'ssh', host, 'cd', cwd, ' ' ] + sys.argv
+