Kenji Okamoto wrote:

term% xcpusrv -sxcpu
....
term% mount -ac /srv/xcpu /mnt/xcpu
term% openclone /mnt/xcpu
(then on a newly created rio window)
term% cat /mnt/xcpu/clone
65537term%    <=============here! not 65536!
term% ls -l /mnt/xcpu
--rw-rw-rw- M 66 ssh ssh 0 Jan  1  1970 /mnt/xcpu/65536  <========= not 65537!
--rw-rw-rw- M 66 ssh ssh 0 Jan  1  1970 /mnt/xcpu/clone

weird. I will have to try this on Plan 9. I think I know what I am doing wrong, but I want to verify it.


On the point of why there is only one copy of the executable file, I
hadn't noticed the danger in your example of executing /bin/sh.  Now I
think I see why you chose that way.  So, in xcpu we should only run
batch jobs, right?

no, if you look at xsh you can see it can be used for interactive jobs (in the future).

The real issues are these:
Suppose I
cp /bin/date exec
echo exec >ctl
cp /bin/uname exec
echo exec > ctl
cat stdout

OK, what does the output of stdout mean in this case? How do I distinguish the two things in a reasonable way? If I want to control the process, which one am I controlling? If I get an EOF on the stdout file, which process did I get it for? Should stdout deliver more than one EOF, one for each process that ends? The whole situation turns into a confusing mess if you let more than one process run for each 'exec' file.
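
To make the one-process-per-session rule concrete, here is a minimal sketch in C. It is not taken from xcpusrv; the struct, the state names and both functions are hypothetical, and it only assumes that a session refuses a second exec and that stdout sees exactly one EOF.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-session state; not taken from xcpusrv. */
enum state { IDLE, RUNNING, DONE };

struct session {
	enum state st;
	int eofs;	/* number of EOFs delivered on stdout */
};

/* Start the session's single process; refuse any later exec. */
int
session_exec(struct session *s)
{
	assert(s != NULL);
	if (s->st != IDLE)
		return -1;	/* already ran (or is running) a process */
	s->st = RUNNING;
	return 0;
}

/* The process ended: stdout gets exactly one EOF. */
void
session_exit(struct session *s)
{
	assert(s != NULL && s->st == RUNNING);
	s->st = DONE;
	s->eofs++;
}
```

With this shape, stdout, ctl and the EOF all refer to the same single process, so none of the questions above come up.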

On xsh.c, I still have some problems.
Isn't this a program to be used in a cluster environment?

yes, but I make no guarantees. I am still fighting what might be a p9p issue on Linux, so xsh is not quite ready yet. I am trying to get this thing ready for SC '05, but I am running out of time :-)


In the file, there is this line:
        dirno[nodeno++] = smprint("/%s/%s/xcpu/%s", base, s, buf);

Here, I suppose we should name our cluster's cpu server by s,
such as
/mnt/xcpu/"s"/xcpu/"number", right?

yes, exactly. I had no idea how to name things, so this is what I came up with. It is done this way so that it fits Linux as well. On Linux, we will have
/mnt/xcpu/"s"/xcpu ==> xcpu server for "s"
/mnt/xcpu/"s"/fs ==> u9fs server for that node, so we can access files on the node "s"

note that in this model, the node is the server, and it exports both the xcpu service and its own file system.
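
As a rough illustration of that layout, here is a sketch in portable C. It uses snprintf rather than the smprint call from xsh.c, and the helper names session_path and fs_path are hypothetical. It also assumes base is passed without a leading slash (e.g. "mnt/xcpu"), since the format string supplies one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: path of one xcpu session on a node.
 * Mirrors the "/%s/%s/xcpu/%s" format quoted from xsh.c.
 * Returns 0 on success, -1 on an empty node name or truncation. */
int
session_path(char *buf, size_t len, const char *base,
	const char *node, const char *session)
{
	int n;

	assert(buf != NULL && len > 0);
	if (strlen(node) == 0)
		return -1;
	n = snprintf(buf, len, "/%s/%s/xcpu/%s", base, node, session);
	return (n >= 0 && (size_t)n < len) ? 0 : -1;
}

/* Hypothetical helper: path of the node's exported file system. */
int
fs_path(char *buf, size_t len, const char *base, const char *node)
{
	int n;

	assert(buf != NULL && len > 0);
	if (strlen(node) == 0)
		return -1;
	n = snprintf(buf, len, "/%s/%s/fs", base, node);
	return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

For node "0" and session "65536" this gives /mnt/xcpu/0/xcpu/65536 and /mnt/xcpu/0/fs.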

So, if we have only one cpu server, then we should use the xsh command
like term% xsh 0 ?

yes.


Or if we have many cpu servers, then

xsh 0 1 2 3 4 5 6 7
?

yes. so to run date on all those nodes:
xsh 0 1 2 3 4 5 6 7 -- /bin/date
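
One way to read such a command line is to split the arguments at the "--" separator: node names before it, the command after it. The sketch below is only an illustration and is not the parsing code from xsh.c; the function name split_at_dashes is hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: find the "--" that separates node names from
 * the command. argv[0] is the program name. Sets *nnodes to the number
 * of node names and returns the index of the first command word,
 * or -1 if there is no "--" at all. */
int
split_at_dashes(int argc, char **argv, int *nnodes)
{
	int i;

	assert(argv != NULL && nnodes != NULL);
	for (i = 1; i < argc; i++) {
		if (strcmp(argv[i], "--") == 0) {
			*nnodes = i - 1;
			return i + 1;
		}
	}
	*nnodes = argc - 1;	/* no separator: everything is a node name */
	return -1;
}
```

The caller still has to check that the returned index is less than argc, since a trailing "--" leaves the command empty.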


By the way, in your mkfile-plan9

$O.xsh: xsh.$O P9pshell.$O
        $LD -o $target $prereq $LDFLAGS

should be
$O.xsh: xsh.$O Plan9shell.$O    <============
        $LD -o $target $prereq $LDFLAGS

thanks for that fix, I just installed it.

thanks again

ron
