Dear R-users

I need help understanding an error message from a furrr function.
I am trying to build a parallel computing system that combines two
desktop computers. One is the host, which runs Ubuntu under WSL2,
and the other is a slave, which runs Ubuntu as its OS.
They are connected to each other over a LAN.

The host computer has 8 physical cores (16 logical cores), and the
slave has 4 physical cores (8 logical cores).

I wrote the following code chunk:

> library(furrr)  # also attaches the future package
> nodes <- c(rep("localhost", 7), rep("192.168.1.11", 4))
> plan(list(tweak(cluster, workers = nodes), tweak(multicore, workers = 2)))
> system.time(VCtransfrm("typeIII"))

Here VCtransfrm() is the target function, inside which future_pmap() and
future_map() are called in a nested (topological) fashion. The argument
"typeIII" names the file that is passed to VCtransfrm(). The typeIII file
is the largest, with 165 MB of data, while a typeII file is smaller, with
only 7 MB of data.

The chunk runs just fine when the typeII data is fed. However, when the
typeIII data was fed, it gave the following error message and returned to
the R prompt. Oddly, multiple R sessions were still running on the host
computer when I observed its behaviour with the top command in Ubuntu.
The error message is:

Error in unserialize(node$con) :
  ClusterFuture (<none>) failed to receive results from cluster RichSOCKnode #10 (PID 47955 on localhost ‘localhost’). The reason reported was ‘error reading from connection’. Post-mortem diagnostic: No process exists with this PID, i.e. the localhost worker is no longer alive. Detected a non-exportable reference (‘externalptr’) in one of the globals (‘...furrr_fn’ of class ‘function’) used in the future expression. The total size of the 8 globals exported is 3.77 MiB. The three largest globals are ‘...furrr_chunk_args’ (3.30 MiB of class ‘list’), ‘...furrr_fn’ (456.55 KiB of class ‘function’) and ‘...furrr_map_fn’ (11.91 KiB of class ‘function’)
Timing stopped at: 2.285 4.291 37.53

I hasten to add that when "multicore" in the chunk is changed to "multisession",
the chunk runs without a problem even when the typeIII file is fed.
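
For clarity, the variant that works differs from the chunk above only in the
inner level of the plan. The node list is the same as before; nothing else is
changed:

```r
library(furrr)  # also attaches the future package

# Same node layout as before: 7 workers on the host, 4 on the second machine
nodes <- c(rep("localhost", 7), rep("192.168.1.11", 4))

# Outer level: cluster across both machines (unchanged)
# Inner level: multisession instead of multicore
plan(list(tweak(cluster, workers = nodes), tweak(multisession, workers = 2)))
```

VCtransfrm("typeIII") is then called exactly as before.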

I need to understand what this message means and how to fix the problem. Since
the chunk runs just fine for the smaller data, I reasoned that the problem is
unlikely to be a logic error in the code.
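
The message also mentions a non-exportable reference ('externalptr') in one of
the globals. In case it helps with the diagnosis, below is a minimal sketch of
how I understand such a reference can be made to fail early, assuming the
future.globals.onReference option of the future package. The file connection
is only a stand-in for illustration and is not an object from my own code; I
do not know yet which object inside VCtransfrm() holds the pointer.

```r
library(future)

# Assumption: with this option set, future stops as soon as a global
# contains a non-exportable reference instead of only reporting it later
options(future.globals.onReference = "error")

plan(multisession, workers = 2)

# Stand-in object: a file connection carries an external pointer
con <- file(tempfile(), open = "w")

# Try to create a future that uses the connection as a global,
# and keep the error message if one is raised
msg <- tryCatch({
  f <- future({ cat("hello\n", file = con); 42 })
  "no reference detected"
}, error = function(e) conditionMessage(e))

print(msg)

# Clean up
close(con)
plan(sequential)
```

If the same option is set before running my chunk, I would expect the error
to name the offending global at the point where the future is created, but I
have not confirmed this on the two-machine setup.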

Please direct me to the solution of the problem.
Any suggestion will be greatly appreciated.

Sincerely,

Hiroto

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
