I found the calls to fork/exec in orte/mca/ess/singleton and orte/mca/filem/rsh. Since rsh is the only component for filem, I wonder whether I can also omit orte/mca/filem/rsh?
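To make the fork/exec question concrete, here is a rough sketch of the kind of wrapper a cloned launch component could use. It is only an illustration under stated assumptions: launch_local_child is a hypothetical name (not an existing Open MPI function), MCS_VXWORKS is the guard macro used in the quoted issue list below, and the rtpSpawn() priority, stack size and option values are placeholders that I have not checked against a real target.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef MCS_VXWORKS
#include <rtpLib.h>      /* rtpSpawn(), RTP_ID, RTP_ID_ERROR */
#else
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#endif

/*
 * Hypothetical helper: start the program at `path` with argument
 * vector `argv` as a local child. Returns 0 if the child was started,
 * -1 otherwise. It does not wait for the child to finish.
 */
int launch_local_child(const char *path, char *const argv[])
{
#ifdef MCS_VXWORKS
    /* Assumed placeholder values: priority 100, 64 KiB stack,
     * no RTP options, no task options, inherited environment. */
    RTP_ID id = rtpSpawn(path, (const char **)argv, NULL,
                         100, 0x10000, 0, 0);
    return (id == RTP_ID_ERROR) ? -1 : 0;
#else
    pid_t pid = fork();
    if (pid < 0) {
        return -1;               /* fork failed, no child created */
    }
    if (pid == 0) {
        execv(path, argv);       /* only returns on failure */
        _exit(127);              /* child: exec failed */
    }
    return 0;                    /* parent: child is running */
#endif
}
```

The point of the shape is that callers never see fork() or exec() directly, so a VxWorks build only has to supply the other branch. On the POSIX side the caller is still responsible for reaping the child with wait() or waitpid().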
2010/6/4 Ralph Castain <r...@open-mpi.org>:
> Jeff is correct - create an orte/odls/vxworks and do whatever you need for
> that platform to launch a local child process.
>
> I believe you will also find calls to fork/exec in the
> orte/mca/ess/singleton area. You may want to add a configure.m4 to that
> component to tell it not to build for vxworks.
>
>
> 2010/6/4 Jeff Squyres <jsquy...@cisco.com>
>>
>> Maybe gettimeofday() could be replaced with opal_gettimeofday(), which
>> could do the Right Thing on different platforms...?
>>
>> Also, for fork/exec, I think that should be mostly limited to
>> orte/odls/default, right? If so, perhaps the right thing to do is to clone
>> that plugin and adapt it for your platform.
>>
>>
>> On Jun 4, 2010, at 1:43 AM, 张晶 wrote:
>>
>> > Hi Castain,
>> >
>> > Your last mail to me was really helpful. I met most of the issues
>> > listed and fixed them using either the off-list solutions or my own.
>> > Also, as the Open MPI code has changed, there are some other issues
>> > (mostly missing functions) that were not reported. For example, the
>> > gettimeofday POSIX function is not implemented by the VxWorks library,
>> > so I just wrote a small library for those functions. So far I have
>> > successfully compiled libopen-rte.a and libopen-pal.a, but now I am
>> > stuck at the problem of fork and exec, which are not available in
>> > VxWorks. It is not possible to implement fork and exec myself. I
>> > have to read through the code that uses fork, then substitute
>> > rtpSpawn() for it. It is challenging work. I really want to know how
>> > Brian Barrett dealt with fork() and exec().
>> >
>> > Thanks
>> >
>> > Jing
>> >
>> > 2010/3/18 Ralph Castain <r...@open-mpi.org>:
>> > > Hi Jing
>> > > Someone else took a look at this off-list a few years ago. It was
>> > > mostly a problem with the build system (some flags are different)
>> > > and header file names. I don't believe the port was ever completed
>> > > though.
>> > > I have appended the results of that conversation - the last message
>> > > contained a list of the issues. You would need to update that to the
>> > > trunk, of course, as the code has changed considerably since that
>> > > discussion took place. Brian Barrett subsequently created a first cut
>> > > at fixing some of these, but that appears to have been lost in the
>> > > years since it was done - and wouldn't really be current anyway.
>> > > I would be happy to assist as I can.
>> > > Ralph
>> > >
>> > > 1. configure issues with "checking prefix for global symbol labels"
>> > >
>> > > 1a. VxWorks assembler (CCAS=asppc) generates a.out by default (vs.
>> > > conftest.o that we need subsequently)
>> > >
>> > > there is this fragment to determine the way to assemble conftest.s:
>> > >
>> > > if test "$CC" = "$CCAS" ; then
>> > >     ompi_assemble="$CCAS $CCASFLAGS -c conftest.s >conftest.out 2>&1"
>> > > else
>> > >     ompi_assemble="$CCAS $CCASFLAGS conftest.s >conftest.out 2>&1"
>> > > fi
>> > >
>> > > The subsequent link fails because conftest.o does not exist:
>> > >
>> > > ompi_link="$CC $CFLAGS conftest_c.$OBJEXT conftest.$OBJEXT -o conftest > conftest.link 2>&1"
>> > >
>> > > To work around the problem, I did not set CCAS. This gives me the
>> > > first invocation that includes the -c argument to CC=ccppc,
>> > > generating conftest.o output.
>> > >
>> > > 1b. linker fails because LDFLAGS are not passed
>> > >
>> > > The same linker command line caused problems because $CFLAGS were
>> > > passed to the linker
>> > >
>> > > ompi_link="$CC $CFLAGS conftest_c.$OBJEXT conftest.$OBJEXT -o conftest > conftest.link 2>&1"
>> > >
>> > > In my environment, I set CC/CFLAGS/LDFLAGS as follows:
>> > >
>> > > CC=ccppc
>> > > CFLAGS=-ggdb3 -std=c99 -pedantic -mrtp -msoft-float -mstrict-align -mregnames -fno-builtin -fexceptions
>> > > LDFLAGS=-mrtp -msoft-float -Wl,--start-group -Wl,--end-group -L/amd/raptor/root/opt/WindRiver/vxworks-6.3/target/usr/lib/ppc/PPC32/sfcommon
>> > >
>> > > The linker flags are not passed because the ompi_link
>> > >
>> > > [xp-kcain1:build_vxworks] ccppc -ggdb3 -std=c99 -pedantic -mrtp -msoft-float -mstrict-align -mregnames -fno-builtin -fexceptions -o hello hello.c
>> > > /amd/raptor/root/opt/WindRiver/gnu/3.4.4-vxworks-6.3/x86-linux2/bin/../lib/gcc/powerpc-wrs-vxworks/3.4.4/../../../../powerpc-wrs-vxworks/bin/ld: cannot find -lc_internal
>> > > collect2: ld returned 1 exit status
>> > >
>> > > 2. OPAL atomics asm.c:
>> > > int versus int32_t (refer to email with Brian Barrett)
>> > >
>> > > 3. OPAL event/event.c: sys/time.h and timercmp() macros not defined by
>> > > VxWorks
>> > > refer to workaround in event.c using #ifdef MCS_VXWORKS
>> > >
>> > > 4. OPAL event/event.c: pipe() syscall not found
>> > > workaround:
>> > >
>> > > #ifdef HAVE_UNISTD_H
>> > > #include <unistd.h>
>> > > #ifdef MCS_VXWORKS
>> > > #include <ioLib.h> /* for pipe() */
>> > > #endif
>> > > #endif
>> > >
>> > > 5. OPAL event/signal.c
>> > >
>> > > static sig_atomic_t opal_evsigcaught[NSIG];
>> > >
>> > > NSIG is not defined, but _NSIGS is.
>> > > In Linux, NSIG is defined with -D__USE_MISC
>> > > So I added this code fragment to signal.c:
>> > >
>> > > /* VxWorks signal.h defines _NSIGS, not NSIG */
>> > > #ifdef MCS_VXWORKS
>> > > #define NSIG (_NSIGS+1)
>> > > #endif
>> > >
>> > > 6. OPAL event/signal.c: no socketpair()
>> > > workaround: use pipe():
>> > >
>> > > #ifdef HAVE_UNISTD_H
>> > > #include <unistd.h>
>> > > #ifdef MCS_VXWORKS
>> > > #include <ioLib.h> /* for pipe() */
>> > > #endif
>> > > #endif
>> > >
>> > > and later in void opal_evsignal_init(sigset_t *evsigmask)
>> > >
>> > > #ifdef MCS_VXWORKS
>> > >     if (pipe(ev_signal_pair) == -1)
>> > >         event_err(1, "%s: pipe", __func__);
>> > > #else
>> > >     if (socketpair(AF_UNIX, SOCK_STREAM, 0, ev_signal_pair) == -1)
>> > >         event_err(1, "%s: socketpair", __func__);
>> > > #endif
>> > >
>> > > 7. OPAL util/basename.c: #if HAVE_DIRNAME problem
>> > >
>> > > ../../../opal/util/basename.c:23:5: warning: "HAVE_DIRNAME" is not defined
>> > > ../../../opal/util/basename.c: In function `opal_dirname':
>> > >
>> > > problem: HAVE_DIRNAME is not defined in opal_config.h so the #if
>> > > HAVE_DIRNAME will fail at preprocessor/compile time
>> > > workaround:
>> > > change #if HAVE_DIRNAME to #if defined(HAVE_DIRNAME)
>> > >
>> > > 8. OPAL util/basename.c: strncpy_s and _strdup
>> > >
>> > > ../../../opal/util/basename.c: In function `opal_dirname':
>> > > ../../../opal/util/basename.c:153: error: implicit declaration of function `strncpy_s'
>> > > ../../../opal/util/basename.c:160: error: implicit declaration of function `_strdup'
>> > >
>> > > #ifdef MCS_VXWORKS
>> > >     strncpy( ret, filename, p - filename);
>> > > #else
>> > >     strncpy_s( ret, (p - filename + 1), filename, p - filename );
>> > > #endif
>> > >
>> > > #ifdef MCS_VXWORKS
>> > >     return strdup(".");
>> > > #else
>> > >     return _strdup(".");
>> > > #endif
>> > >
>> > > 9. opal/util/if.c: socket() prototype not found in vxworks headers
>> > >
>> > > #ifdef HAVE_SYS_SOCKET_H
>> > > #include <sys/socket.h>
>> > > #ifdef MCS_VXWORKS
>> > > #include <sockLib.h>
>> > > #endif
>> > > #endif
>> > >
>> > > 10. opal/util/if.c: ioctl()
>> > >
>> > > #ifdef HAVE_SYS_IOCTL_H
>> > > #include <sys/ioctl.h>
>> > > #ifdef MCS_VXWORKS
>> > > #include <ioLib.h>
>> > > #endif
>> > > #endif
>> > >
>> > > 11. opal/util/os_path.c: MAXPATHLEN change to PATH_MAX
>> > >
>> > > #ifdef MCS_VXWORKS
>> > >     if (total_length > PATH_MAX) { /* path length is too long - reject it */
>> > >         return(NULL);
>> > > #else
>> > >     if (total_length > MAXPATHLEN) { /* path length is too long - reject it */
>> > >         return(NULL);
>> > > #endif
>> > >
>> > > 12. opal/util/output.c: gethostname()
>> > > include <hostLib.h>
>> > >
>> > > 13. opal/util/output.c: MAXPATHLEN
>> > > same fix as os_path.c above
>> > >
>> > > 14. opal/util/output.c: closelog/openlog/syslog
>> > > manually turned off HAVE_SYSLOG_H in opal_config.h
>> > > then got a patch from Jeff Squyres that avoids syslog
>> > >
>> > > 15. opal/util/opal_pty.c
>> > > complains about mismatched prototype of opal_openpty() between this
>> > > source file and opal_pty.h
>> > > workaround: manually edit build_vxworks_ppc/opal/include/opal_config.h,
>> > > use the following line (change 1 to 0):
>> > >
>> > > #define OMPI_ENABLE_PTY_SUPPORT 0
>> > >
>> > > 16. opal/util/stacktrace.c
>> > > FPE_FLTINV not present in signal.h
>> > > workaround: edit opal_config.h to turn off
>> > > OMPI_WANT_PRETTY_PRINT_STACKTRACE (this can be explicitly configured
>> > > out but I don't want to reconfigure because I hacked #15 above)
>> > >
>> > > 17. opal/mca/base/mca_base_open.c
>> > > gethostname() -- same as opal/util/output.c, must include hostLib.h
>> > >
>> > > 18. opal_progress.c
>> > > from opal/event/event.h (that I modified earlier)
>> > > cannot find #include <sys/_timeradd.h>
>> > > It is in opal/event/compat/sys
>> > > workaround: change event.h to include the definitions that are present
>> > > in _timeradd.h instead of including it.
>> > >
>> > > 19. Link errors for opal_wrapper
>> > > strcasecmp
>> > > strncasecmp
>> > > I rolled my own in mca_base_open.c (temporary fix, since we may come
>> > > across this problem elsewhere in the code).
>> > >
>> > > 20. dss_internal.h uses a type 'uint'
>> > > Not sure if it's depending on something in the headers, or something
>> > > it defined on its own.
>> > > I changed it to be just like the header I found somewhere under Linux
>> > > /usr/include:
>> > >
>> > > #ifdef MCS_VXWORKS
>> > > typedef unsigned int uint;
>> > > #endif
>> > >
>> > > 21. struct iovec definition needed
>> > >
>> > > orte/mca/iof/base/iof_base_fragment.h:45: warning: array type has incomplete element type
>> > >
>> > > #ifdef MCS_VXWORKS
>> > > #include <net/uio.h>
>> > > #endif
>> > >
>> > > not sure if this is right, or if I should include something like
>> > > <netBufLib.h> or <ioLib.h>
>> > >
>> > > 22. iof_base_setup.c
>> > > struct termios not understood
>> > > can only find termios.h header in 'diab' area and I'm not using that
>> > > compiler.
>> > > a variable usepty is set to 0 already when OMPI_ENABLE_PTY_SUPPORT is 0.
>> > > So, why are we compiling this fragment of code at all? I hacked the
>> > > file so that the struct termios code will not get compiled.
>> > >
>> > > 23. oob_base_send/recv.c, oob_base_send/recv_nb.c. struct iovec not known.
>> > >
>> > > #ifdef MCS_VXWORKS
>> > > #include <net/uio.h>
>> > > #endif
>> > >
>> > > 24. orte/mca/rmgr/base/rmgr_base_check_context.c:58: error:
>> > > `MAXHOSTNAMELEN' undeclared (first use in this function)
>> > >
>> > > #ifdef MCS_VXWORKS
>> > > #define MAXHOSTNAMELEN 64
>> > > #endif
>> > >
>> > > 25. orte/mca/rmgr/base/rmgr_base_check_context.c:58:
>> > > gethostname()
>> > >
>> > > #ifdef MCS_VXWORKS
>> > > #include <hostLib.h>
>> > > #endif
>> > >
>> > > 26. orte/mca/iof/proxy/iof_proxy.h:135: warning: array type has incomplete element type
>> > > ../../../../../orte/mca/iof/proxy/iof_proxy.h:135: error: field `proxy_iov' has incomplete type
>> > >
>> > > #ifdef MCS_VXWORKS
>> > > #include <net/uio.h>
>> > > #endif
>> > >
>> > > 27. /orte/mca/iof/svc/iof_svc.h:147: warning: array type has incomplete element type
>> > > ../../../../../orte/mca/iof/svc/iof_svc.h:147: error: field `svc_iov' has incomplete type
>> > >
>> > > #ifdef MCS_VXWORKS
>> > > #include <net/uio.h>
>> > > #endif
>> > >
>> > > 28. ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:66: warning: array type has incomplete element type
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:66: error: field `msg_iov' has incomplete type
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h: In function `mca_oob_tcp_msg_iov_alloc':
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:196: error: invalid application of `sizeof' to incomplete type `iovec'
>> > >
>> > > 29. ../../../../../orte/mca/oob/tcp/oob_tcp.c:344: error: implicit declaration of function `accept'
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_create_listen':
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c:383: error: implicit declaration of function `socket'
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c:399: error: implicit declaration of function `bind'
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c:407: error: implicit declaration of function `getsockname'
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c:415: error: implicit declaration of function `listen'
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_listen_thread':
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c:459: error: implicit declaration of function `bzero'
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_recv_probe':
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c:696: error: implicit declaration of function `send'
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_recv_handler':
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c:795: error: implicit declaration of function `recv'
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function `mca_oob_tcp_init':
>> > > ../../../../../orte/mca/oob/tcp/oob_tcp.c:1087: error: implicit declaration of function `usleep'
>> > >
>> > > This gets rid of most (except bzero and usleep)
>> > >
>> > > #ifdef MCS_VXWORKS
>> > > #include <sockLib.h>
>> > > #endif
>> > >
>> > > Trying to reconfigure the package so CFLAGS will not include -pedantic.
>> > > This is because $WIND_HOME/vxworks-6.3/target/h/string.h has protos
>> > > for bzero, but only when #if _EXTENSION_WRS is true. So turn off
>> > > -ansi/-pedantic gets this? In my dreams?
>> > >
>> > > On Mar 17, 2010, at 9:54 PM, 张晶 wrote:
>> > >
>> > > Hello all,
>> > >
>> > > In order to add some real-time features to Open MPI for some
>> > > research, I need an Open MPI version running on VxWorks. But after
>> > > going through the Open MPI website, I can't find any indication that
>> > > it supports VxWorks.
>> > >
>> > > Following the thread posted by Ralph Castain,
>> > > http://www.open-mpi.org/community/lists/users/2006/06/1371.php ,
>> > > I read some papers about OpenRTE, such as "Creating a transparent,
>> > > distributed, and resilient computing environment: the OpenRTE project"
>> > > and "The Open Run-Time Environment (OpenRTE): A Transparent
>> > > Multi-cluster Environment for High-Performance Computing", which were
>> > > written by Ralph H. Castain, Jeffrey M. Squyres and others.
>> > >
>> > > Now I have a basic understanding of OpenRTE; however, there is too
>> > > little documentation describing its implementation. I don't know
>> > > where and how to begin the migration. Any advice will be appreciated.
>> > >
>> > > Thanks
>> > >
>> > > Jing Zhang
>> > >
>> > > _______________________________________________
>> > > devel mailing list
>> > > de...@open-mpi.org
>> > > http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> >
>> > --
>> > 张晶
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>
--
张晶
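P.S. On the gettimeofday() point in the quoted thread: below is a minimal sketch of what an opal_gettimeofday()-style wrapper could look like. It assumes the target provides POSIX clock_gettime() with CLOCK_REALTIME, which I have not verified for every VxWorks configuration. The names compat_timeval and compat_gettimeofday are hypothetical and exist only for this illustration; a private struct is used so the sketch does not depend on sys/time.h, which item 3 of the list reports as problematic.

```c
#define _POSIX_C_SOURCE 199309L  /* expose clock_gettime() under strict -std=c99 */
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for struct timeval, so that <sys/time.h>
 * is not required on the target. */
struct compat_timeval {
    long tv_sec;   /* whole seconds since the epoch */
    long tv_usec;  /* microseconds, 0..999999 */
};

/*
 * Hypothetical wrapper: fill `tv` with the current wall-clock time.
 * Returns 0 on success, -1 on failure (NULL argument or clock error).
 */
int compat_gettimeofday(struct compat_timeval *tv)
{
    struct timespec ts;

    if (tv == NULL || clock_gettime(CLOCK_REALTIME, &ts) != 0) {
        return -1;
    }
    tv->tv_sec  = (long)ts.tv_sec;
    tv->tv_usec = (long)(ts.tv_nsec / 1000);  /* nanoseconds to microseconds */
    return 0;
}
```

Keeping the platform difference behind one function means the callers do not need their own #ifdef MCS_VXWORKS blocks for timekeeping.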