Re: [boinc_dev] MultiHost

2013-01-07 Thread Bernd Machenschalk
Hi David!

On 15.12.12 21:09, David Anderson wrote:
 Bernd:
 Thanks; I checked in these changes.
 However, I don't really understand them
 (e.g. the need for a project_dir element in config.xml).
 When you get a chance, can you please update the documentation on this?

 http://boinc.berkeley.edu/trac/wiki/MultiHost#Runserverdaemonsandtasksonmultiplehosts

Assume the following setup:

The main project server myserver keeps the project directory in
/nfs/export/boinc/projects/MyProject
and exports
/nfs/export/boinc

Secondary project servers (auto)mount that directory to the local path
/auto/myserver/boinc

You would usually put symlinks to these paths on the servers, like
/boinc -> /nfs/export/boinc
or
/boinc -> /auto/myserver/boinc
respectively, to get a consistent path on all servers.

The problem with the previous code was that the Python start script and the
server status page determined the project (or bin) directory path with
getcwd() or similar. On myserver above this yielded
/nfs/export/boinc/projects/MyProject, a path that doesn't exist on the
secondary servers. That path was nevertheless used to construct, e.g., the
command line of the ssh command executed on the remote servers, which
therefore didn't work.

You can now add project_dir to the config file to force both the server
status page and the start script to use a common path that exists on all
servers (here /boinc/projects/MyProject), even if the physical path is not
identical.
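
As a rough illustration of the above (a sketch, not the code from the patches):
the following Python snippet reads project_dir and ssh_exec from a config
fragment and builds the ssh command line for a secondary server, falling back
to the old defaults when the elements are missing. The config layout, the host
name "secondary1", the helper name remote_command and the '&&' separator are
assumptions made for the example.

```python
import os
import xml.etree.ElementTree as ET

# Hypothetical config.xml fragment. Only project_dir and ssh_exec are named
# in this thread; the surrounding layout is an assumption for illustration.
CONFIG_XML = """
<boinc>
  <config>
    <project_dir>/boinc/projects/MyProject</project_dir>
    <ssh_exec>/usr/bin/ssh</ssh_exec>
  </config>
</boinc>
"""


def remote_command(config_xml, host, program_name, args):
    """Build the argv list that runs program_name on host via ssh."""
    config = ET.fromstring(config_xml).find("config")
    # Default ssh path as described in the thread.
    ssh = config.findtext("ssh_exec", default="/usr/bin/ssh")
    project_dir = config.findtext("project_dir")
    if project_dir is not None:
        # Common path that exists on every server (e.g. via symlinks).
        cwd = project_dir + "/bin"
        cmd = "./" + program_name
    else:
        # Old behaviour: the physical path of the local working directory.
        cwd = os.getcwd()
        cmd = program_name
    # '&&' as the separator is an assumption of this sketch.
    return [ssh, host, "cd", cwd, "&&", cmd] + list(args)


print(" ".join(remote_command(CONFIG_XML, "secondary1", "start", ["--cron"])))
# prints: /usr/bin/ssh secondary1 cd /boinc/projects/MyProject/bin && ./start --cron
```

With project_dir set, the remote command no longer depends on where the
project directory physically lives on the main server.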

I'm not sure how this would blend into that page, though.

Best,
Bernd


 -- David

 On 03-Dec-2012 4:20 AM, Bernd Machenschalk wrote:
 Hi!

 I wonder whether any project has gotten a multi-server setup to work based on
 the current code and documentation
 (http://boinc.berkeley.edu/trac/wiki/MultiHost), and how.

 What I found is that the current stuff silently assumes that there is a (NFS)
 shared project directory mounted on all project servers on the very same
 physical path, which at least for us isn't the case. We do have a project
 directory path that is common on all servers, but set up with symlinks. The
 physical path varies (for good reasons), and the project directory (including
 subdirectories like the pid directories) isn't shared (because some remote
 servers are far away).

 In addition the hardcoded ssh command used in the start/stop/status script is
 completely independent of the ssh configuration for the server status page
 (SSP), which is at least confusing.

 The attached series of patches is meant to fix that:

 - the project path to use on all servers can now be configured as project_dir.
 For backwards compatibility the defaults in server_status.php and start are
 chosen in a way that the old behavior is unchanged (../.. in SSP, os.getcwd()
 in start).

 - the start script now uses the ssh_exec path for ssh if configured. The
 default ssh in the start script is now '/usr/bin/ssh' (as already in the SSP)

 - the pid of a daemon is now looked up in the pid directory on the _remote_
 host via ssh, thus not requiring a shared project directory. Actually
 determining the pid and running ps to find out whether the daemon is running
 is done by a script (pshelper) executed on the remote host, requiring only one
 command to be executed remotely via ssh. Still one ssh connection is required
 for every daemon on a remote host, which could be a significant slowdown. I'd
 rather handle all daemons running on one host in a single connection, but I
 couldn't get this finished now. If my current solution is to be used, pshelper
 must be put into the bin/ directory of the project on the remote server
 (make_project should be updated to do this). The 'ps' command used on remote
 hosts must be edited there.

 - if a daemon is disabled, daemons_status() returns immediately without
 looking up the PID and checking whether the daemon is actually running (which
 wouldn't change the return value anyway).

 - the ssh command that is executed on the remote host by the start script is
 only printed when the start script is run in verbose mode. In particular this
 avoids unnecessary output and thus mails when run by cron (start --cron).

 Best,
 Bernd



 ___
 boinc_dev mailing list
 boinc_dev@ssl.berkeley.edu
 http://lists.ssl.berkeley.edu/mailman/listinfo/boinc_dev
 To unsubscribe, visit the above URL and
 (near bottom of page) enter your email address.



Re: [boinc_dev] MultiHost

2012-12-15 Thread David Anderson
Bernd:
Thanks; I checked in these changes.
However, I don't really understand them
(e.g. the need for a project_dir element in config.xml).
When you get a chance, can you please update the documentation on this?

http://boinc.berkeley.edu/trac/wiki/MultiHost#Runserverdaemonsandtasksonmultiplehosts

-- David

On 03-Dec-2012 4:20 AM, Bernd Machenschalk wrote:
 Hi!

 I wonder whether any project has gotten a multi-server setup to work based on
 the current code and documentation
 (http://boinc.berkeley.edu/trac/wiki/MultiHost), and how.

 What I found is that the current stuff silently assumes that there is a (NFS)
 shared project directory mounted on all project servers on the very same
 physical path, which at least for us isn't the case. We do have a project
 directory path that is common on all servers, but set up with symlinks. The
 physical path varies (for good reasons), and the project directory (including
 subdirectories like the pid directories) isn't shared (because some remote
 servers are far away).

 In addition the hardcoded ssh command used in the start/stop/status script is
 completely independent of the ssh configuration for the server status page
 (SSP), which is at least confusing.

 The attached series of patches is meant to fix that:

 - the project path to use on all servers can now be configured as project_dir.
 For backwards compatibility the defaults in server_status.php and start are
 chosen in a way that the old behavior is unchanged (../.. in SSP, os.getcwd()
 in start).

 - the start script now uses the ssh_exec path for ssh if configured. The
 default ssh in the start script is now '/usr/bin/ssh' (as already in the SSP)

 - the pid of a daemon is now looked up in the pid directory on the _remote_
 host via ssh, thus not requiring a shared project directory. Actually
 determining the pid and running ps to find out whether the daemon is running
 is done by a script (pshelper) executed on the remote host, requiring only one
 command to be executed remotely via ssh. Still one ssh connection is required
 for every daemon on a remote host, which could be a significant slowdown. I'd
 rather handle all daemons running on one host in a single connection, but I
 couldn't get this finished now. If my current solution is to be used, pshelper
 must be put into the bin/ directory of the project on the remote server
 (make_project should be updated to do this). The 'ps' command used on remote
 hosts must be edited there.

 - if a daemon is disabled, daemons_status() returns immediately without
 looking up the PID and checking whether the daemon is actually running (which
 wouldn't change the return value anyway).

 - the ssh command that is executed on the remote host by the start script is
 only printed when the start script is run in verbose mode. In particular this
 avoids unnecessary output and thus mails when run by cron (start --cron).

 Best,
 Bernd





[boinc_dev] MultiHost

2012-12-03 Thread Bernd Machenschalk

Hi!

I wonder whether any project has gotten a multi-server setup to work based on the current code and documentation 
(http://boinc.berkeley.edu/trac/wiki/MultiHost), and how.


What I found is that the current stuff silently assumes that there is a (NFS) shared project directory mounted on all project servers on the very same 
physical path, which at least for us isn't the case. We do have a project directory path that is common on all servers, but set up with symlinks. The 
physical path varies (for good reasons), and the project directory (including subdirectories like the pid directories) isn't shared (because some 
remote servers are far away).


In addition the hardcoded ssh command used in the start/stop/status script is completely independent of the ssh configuration for the server status 
page (SSP), which is at least confusing.


The attached series of patches is meant to fix that:

- the project path to use on all servers can now be configured as project_dir. For backwards compatibility the defaults in server_status.php and 
start are chosen in a way that the old behavior is unchanged (../.. in SSP, os.getcwd() in start).


- the start script now uses the ssh_exec path for ssh if configured. The 
default ssh in the start script is now '/usr/bin/ssh' (as already in the SSP)

- the pid of a daemon is now looked up in the pid directory on the _remote_ host via ssh, thus not requiring a shared project directory. Actually 
determining the pid and running ps to find out whether the daemon is running is done by a script (pshelper) executed on the remote host, requiring only 
one command to be executed remotely via ssh. Still one ssh connection is required for every daemon on a remote host, which could be a significant 
slowdown. I'd rather handle all daemons running on one host in a single connection, but I couldn't get this finished now. If my current solution is to 
be used, pshelper must be put into the bin/ directory of the project on the remote server (make_project should be updated to do this). The 'ps' 
command used on remote hosts must be edited there.


- if a daemon is disabled, daemons_status() returns immediately without looking up the PID and checking whether the daemon is actually running (which 
wouldn't change the return value anyway).


- the ssh command that is executed on the remote host by the start script is only printed when the start script is run in verbose mode. In particular 
this avoids unnecessary output and thus mails when run by cron (start --cron).
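
The pid lookup and the "one connection per host" idea from the list above can
be sketched in Python as follows. This is not the pshelper script from the
patches (which is not shown in this thread and uses ps); the sketch swaps in
os.kill(pid, 0) as the liveness check, assumes a POSIX host, and assumes pid
files are named <daemon>.pid. The function names are hypothetical.

```python
import os


def daemon_running(pid_dir, daemon_name):
    """Return True if the daemon's pid file names a process that exists."""
    # Assumed naming scheme: <pid_dir>/<daemon_name>.pid
    pid_file = os.path.join(pid_dir, daemon_name + ".pid")
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except (OSError, ValueError):
        return False  # no pid file, or contents are not a number
    try:
        os.kill(pid, 0)  # signal 0 only checks existence, nothing is delivered
    except ProcessLookupError:
        return False  # stale pid file: no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def group_by_host(daemons):
    """Map each host to its daemon names, so one ssh call per host suffices."""
    by_host = {}
    for host, name in daemons:
        by_host.setdefault(host, []).append(name)
    return by_host
```

Run on the remote host itself, a check like daemon_running() needs no shared
project directory; group_by_host() shows how the per-daemon ssh connections
could be collapsed into one per host.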


Best,
Bernd

From 281b9ef3c3d29a0cb6c2a26d725557ffb1af17a3 Mon Sep 17 00:00:00 2001
From: Bernd Machenschalk bernd.machensch...@aei.mpg.de
Date: Mon, 3 Dec 2012 10:07:54 +
Subject: [PATCH 1/5] only print remote command in verbose mode

(in particular not when ran with --cron)
---
 sched/start | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sched/start b/sched/start
index fb01cc1..98ffc93 100755
--- a/sched/start
+++ b/sched/start
@@ -745,7 +745,7 @@ if is_main_host:
         remote_cmd = [ 'ssh', host, 'cd', cwd, '  ' ] + sys.argv
         if verbose:
             remote_cmd += [ '-v' ]
-        print 'running ', ' '.join(remote_cmd)
+            print 'running ', ' '.join(remote_cmd)
         os.spawnvp(wait_mode, remote_cmd[0], remote_cmd)
 
 os.unlink(start_lockfile)
-- 
1.7.12.2

From dca59f85a1aa07b85b046bd40bda344b76700577 Mon Sep 17 00:00:00 2001
From: Bernd Machenschalk bernd.machensch...@aei.mpg.de
Date: Mon, 3 Dec 2012 10:09:54 +
Subject: [PATCH 2/5] sync configuration of remote server management with PHP
 (server status page)

- configure ssh executable to use with ssh_exec

- configure a project directory common to all hosts with project_dir
---
 sched/start | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/sched/start b/sched/start
index 98ffc93..962b385 100755
--- a/sched/start
+++ b/sched/start
@@ -611,7 +611,6 @@ def command_show_config():
 local_hostname = socket.gethostname()
 local_hostname = local_hostname.split('.')[0]
 # print 'local hostname: ', local_hostname
-cwd = os.getcwd()
 program_name = os.path.basename(sys.argv[0])
 if program_name == 'start':
     command = command_enable_start
@@ -709,6 +708,18 @@ if not command:
 config = configxml.ConfigFile(config_filename).read()
 run_state = configxml.RunStateFile(run_state_filename).read(failopen_ok = True)
 
+if 'ssh_exec' in config.config.__dict__:
+    ssh = config.config.ssh_exec
+else:
+    ssh = '/usr/bin/ssh'
+
+if 'project_dir' in config.config.__dict__:
+    cwd = config.config.project_dir + '/bin'
+    cmd = './' + program_name
+else:
+    cwd = os.getcwd()
+    cmd = sys.argv[0]
+
 os.chdir(boinc_project_path.project_path())
 bin_dir = get_dir('bin')
 cgi_bin_dir = get_dir('cgi_bin')
@@ -742,7 +753,7 @@ if is_main_host:
     for host in other_hosts:
         if host == local_hostname:
             continue
-        remote_cmd = [ 'ssh', host, 'cd', cwd, '  ' ] + sys.argv
+