On Wed, Aug 11, 2010 at 11:51:43AM +0530, Vipul Agrawal wrote:
> >On Sat, Aug 07, 2010 at 05:31:24PM +1100, Lutaev D. A. wrote:
> >> We used parallel-2.0.2 and we have problems with such code:
> >>
> >> clear;
> >>
> >> hosts = [];
> >>
> >> for i = 1:nargin
> >>   hosts = [hosts; argv(){i, 1}];
> >> end
> >>
> >> hosts
> >>
> >> sockets = connect(hosts)
> >>
> >> x = rand(50, 1000);
> >>
> >> send(x, sockets(2, :));
> >> reval("x = recv(sockets(1, :))", sockets(2, :));
> >> scloseall(sockets);
> >>
> >> The program gets stuck when it tries to send x from sockets(1, :)
> >> (master) to the slave (sockets(2, :)).
> >
> >As I said, I'm unable to reproduce the problem. Maybe it won't help,
> >but why don't you send a real session transcript (cut-and-paste from
> >your terminal running Octave) and indicate exactly the command which
> >gets stuck? Commands which you only intended to give are of no use to
> >me. Since I have no notion as yet what the cause of the problem is,
> >the contents of the variables "hosts" and "sockets" may be important;
> >why don't you show them? Of course you should hide the real hostnames,
> >but I have to see whether they are different, whether the local
> >machine is among the servers, and what the length of the hostnames is.
> >
> >Do you use Octave-3.2.4 and parallel-2.0.2 on _all_ machines?
> >
> >What you still can do is to check whether the server process and child
> >process are running before and after the command that gets stuck (on
> >each server machine, run "ps ax | grep octave" and post the output,
> >replacing hostnames, of course).
> >
> >Olaf
> >
>
> I am using octave-3.2.4 from the maverick repo and parallel-2.0.2 built
> from source. I am also getting the same issue with big matrices.
> I could not send more than 32767 elements (2^15-1) of type double
> (8 bytes each), i.e. 262136 bytes.
> The reason may be an incorrect buffer size:
> the bufsize in pserver.cc in line 507:
>   int bufsize = 262144;
> A possible solution is to change it to
>   int bufsize = BUFF_SIZE;
>
> Now the number of elements increases to about 46k, which interestingly
> comes out to be a magic number equal to 2^15 * sqrt(2). Quite amazing!
> I think there is still some other issue which stalls sending matrices
> larger than this size.
>
> -Vipul

"send" does not return until the whole value is written to the
socket. If the value's length exceeds the socket's buffer size, a
process at the far end of the connection must read data before "send"
can return. So before calling "send" in the master process, one must
first start "recv" on the other end, e.g.:

octave:13> reval ("send (recv (sockets(1, :)), sockets(1, :))", sockets(2, :))
octave:14> send (ones (100, 1000000), sockets(2, :))
octave:15> size (recv (sockets(2, :)))
ans =

       100   1000000

octave:16>

I don't know why the socket buffer size for outgoing connections has
been set to a lower value than for incoming connections in pserver.cc;
this probably should be corrected, since send.cc takes BUFF_SIZE (the
higher value) into account. But this should not be essential (it is
only a matter of efficiency).

Thanks for the report.

Olaf

_______________________________________________
Octave-dev mailing list
Octave-dev@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/octave-dev
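The blocking behaviour of "send" described above is not specific to the parallel package. As a rough illustration only, the following Python sketch uses a plain local socket pair rather than Octave's send/recv. The helper names (drain, probe_without_reader, send_with_reader) and the 8 MiB payload are assumptions made up for this example and are not part of parallel-2.0.2. It shows that without a reader only about one buffer's worth of data is accepted, while the full payload goes through once the receiving side has been started first.

```python
import socket
import threading

# Assumed to be far larger than any default socket buffer.
PAYLOAD_SIZE = 8 * 1024 * 1024


def drain(sock, expected, result):
    """Read up to `expected` bytes from `sock` and record the total."""
    received = 0
    while received < expected:
        chunk = sock.recv(65536)
        if not chunk:  # peer closed early
            break
        received += len(chunk)
    result.append(received)


def probe_without_reader(payload):
    """Return how many bytes one non-blocking send accepts when nobody reads."""
    a, b = socket.socketpair()
    try:
        a.setblocking(False)
        # Only as much as the kernel buffers can hold is accepted.
        return a.send(payload)
    finally:
        a.close()
        b.close()


def send_with_reader(payload):
    """Start the receiver first, then send; return the bytes received."""
    a, b = socket.socketpair()
    result = []
    reader = threading.Thread(target=drain, args=(b, len(payload), result))
    reader.start()       # the "recv" side is running before "send" begins
    a.sendall(payload)   # can complete because the reader keeps draining
    reader.join()
    a.close()
    b.close()
    return result[0]


payload = bytes(PAYLOAD_SIZE)
accepted = probe_without_reader(payload)
received = send_with_reader(payload)
print("accepted without a reader:", accepted, "of", len(payload))
print("received with a reader started first:", received)
```

The number of bytes accepted without a reader depends on the operating system's buffer settings, so no particular value should be expected. A blocking send in that situation would simply wait, which matches the hang reported in this thread.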