Hi Ralph,

Thank you for your prompt and useful help. I will try out what you have
posted to see whether the port can succeed.

Regards,

Jing Zhang

2010/3/18 Ralph Castain <r...@open-mpi.org>

> Hi Jing
>
> Someone else took a look at this off-list a few years ago. It was mostly a
> problem with the build system (some flags are different) and header file
> names. I don't believe the port was ever completed though.
>
> I have appended the results of that conversation - the last message
> contained a list of the issues. You would need to update that to the trunk
> of course as the code has changed considerably since that discussion took
> place. Brian Barrett subsequently created a first-cut at fixing some of
> these, but that appears to have been lost in the years since it was done -
> and wouldn't really be current anyway.
>
> I would be happy to assist as I can.
> Ralph
>
>  1. configure issues with "checking prefix for global symbol labels"
>
>
>  1a. VxWorks assembler (CCAS=asppc) generates a.out by default (vs.
>  conftest.o that we need subsequently)
>
>  There is this fragment to determine the way to assemble conftest.s:
>
>  if test "$CC" = "$CCAS" ; then
>     ompi_assemble="$CCAS $CCASFLAGS -c conftest.s >conftest.out 2>&1"
>  else
>     ompi_assemble="$CCAS $CCASFLAGS conftest.s >conftest.out 2>&1"
>  fi
>
>  The subsequent link fails because conftest.o does not exist:
>
>    ompi_link="$CC $CFLAGS conftest_c.$OBJEXT conftest.$OBJEXT -o conftest > conftest.link 2>&1"
>
>  To work around the problem, I did not set CCAS. This gives me the first
>  invocation that includes the -c argument to CC=ccppc, generating
>  conftest.o output.
>
>
>
>  1b. linker fails because LDFLAGS are not passed
>
>  The same linker command line caused problems because $CFLAGS were passed
>  to the linker:
>
>    ompi_link="$CC $CFLAGS conftest_c.$OBJEXT conftest.$OBJEXT -o conftest > conftest.link 2>&1"
>
>  In my environment, I set CC/CFLAGS/LDFLAGS as follows:
>
>  CC=ccppc
>
>  CFLAGS=-ggdb3 -std=c99 -pedantic -mrtp -msoft-float -mstrict-align
>  -mregnames -fno-builtin -fexceptions
>
>  LDFLAGS=-mrtp -msoft-float -Wl,--start-group -Wl,--end-group
>  -L/amd/raptor/root/opt/WindRiver/vxworks-6.3/target/usr/lib/ppc/PPC32/sfcommon
>
>  The linker flags are not passed because the ompi_link command line uses
>  $CFLAGS but not $LDFLAGS. Compiling with the same flags by hand fails
>  the same way:
>
>  [xp-kcain1:build_vxworks]  ccppc -ggdb3 -std=c99 -pedantic -mrtp
>  -msoft-float -mstrict-align -mregnames -fno-builtin -fexceptions -o
>  hello hello.c
>  /amd/raptor/root/opt/WindRiver/gnu/3.4.4-vxworks-6.3/x86-linux2/bin/../lib/gcc/powerpc-wrs-vxworks/3.4.4/../../../../powerpc-wrs-vxworks/bin/ld:
>  cannot find -lc_internal
>  collect2: ld returned 1 exit status
>
>
>
>  2. OPAL atomics asm.c:
>  int versus int32_t (refer to email with Brian Barrett)
>
>  3. OPAL event/event.c: sys/time.h and timercmp() macros not defined by
>  VxWorks
>  refer to workaround in event.c using #ifdef MCS_VXWORKS
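
A minimal sketch of the kind of fallback such a workaround could contain. The actual code in event.c is not shown in this thread, so this is only an illustration. It assumes struct timeval is available (the header that declares it on VxWorks may differ from sys/time.h) and follows the conventional BSD definitions of the macros:

```c
#include <sys/time.h>   /* assumed to provide struct timeval; header may differ on VxWorks */

/* Fallback definitions, used only when the platform headers lack them. */
#ifndef timercmp
#define timercmp(a, b, CMP)                          \
    (((a)->tv_sec == (b)->tv_sec)                    \
         ? ((a)->tv_usec CMP (b)->tv_usec)           \
         : ((a)->tv_sec CMP (b)->tv_sec))
#endif

#ifndef timeradd
#define timeradd(a, b, result)                               \
    do {                                                     \
        (result)->tv_sec = (a)->tv_sec + (b)->tv_sec;        \
        (result)->tv_usec = (a)->tv_usec + (b)->tv_usec;     \
        if ((result)->tv_usec >= 1000000) {                  \
            /* carry one second out of the microseconds */   \
            ++(result)->tv_sec;                              \
            (result)->tv_usec -= 1000000;                    \
        }                                                    \
    } while (0)
#endif
```

Guarding each macro with #ifndef rather than #ifdef MCS_VXWORKS keeps the fallback inert on platforms that already define it.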
>
>
>  4. OPAL event/event.c: pipe() syscall not found
>  workaround:
>
>  #ifdef HAVE_UNISTD_H
>  #include <unistd.h>
>  #ifdef MCS_VXWORKS
>  #include <ioLib.h>        /* for pipe() */
>  #endif
>  #endif
>
>  5. OPAL event/signal.c
>  static sig_atomic_t opal_evsigcaught[NSIG];
>  NSIG is not defined
>  but _NSIGS is
>
>  In Linux, NSIG is defined with -D__USE_MISC
>
>  So I added this code fragment to signal.c:
>
>  /* VxWorks signal.h defines _NSIGS, not NSIG */
>  #ifdef MCS_VXWORKS
>  #define NSIG (_NSIGS+1)
>  #endif
>
>
>
>  6. OPAL event/signal.c: no socketpair()
>
>  workaround: use pipe():
>
>  #ifdef HAVE_UNISTD_H
>  #include <unistd.h>
>  #ifdef MCS_VXWORKS
>  #include <ioLib.h>        /* for pipe() */
>  #endif
>  #endif
>
>  and later in void opal_evsignal_init(sigset_t *evsigmask)
>
>  #ifdef MCS_VXWORKS
>     if (pipe(ev_signal_pair) == -1)
>         event_err(1, "%s: pipe", __func__);
>  #else
>     if (socketpair(AF_UNIX, SOCK_STREAM, 0, ev_signal_pair) == -1)
>         event_err(1, "%s: socketpair", __func__);
>  #endif
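
One difference worth checking with this workaround: socketpair() returns two bidirectional descriptors, while pipe() returns a read end in element 0 and a write end in element 1. If the surrounding code writes to ev_signal_pair[0] and reads from ev_signal_pair[1] (an assumption to verify against event/signal.c), the pipe() branch would need the ends swapped. A minimal sketch that names the two ends explicitly; signal_pair_open() is a hypothetical helper, not Open MPI or libevent code:

```c
#include <unistd.h>
#ifdef MCS_VXWORKS
#include <ioLib.h>        /* for pipe(), per the workaround above */
#else
#include <sys/socket.h>
#endif

/*
 * Hypothetical helper: opens a notification channel and stores the
 * descriptor to read from in *read_fd and the one to write to in
 * *write_fd. Returns 0 on success, -1 on failure.
 */
static int signal_pair_open(int *read_fd, int *write_fd)
{
    int fds[2];

#ifdef MCS_VXWORKS
    if (pipe(fds) == -1)
        return -1;
    *read_fd = fds[0];    /* a pipe is one-way: element 0 is the read end */
    *write_fd = fds[1];   /* element 1 is the write end */
#else
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) == -1)
        return -1;
    *read_fd = fds[1];    /* both ends are bidirectional; either mapping works */
    *write_fd = fds[0];
#endif
    return 0;
}
```

Callers then use the named ends instead of indexing the array, so the same code works with either primitive.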
>
>
>  7. OPAL util/basename.c: #if HAVE_DIRNAME problem
>
>  ../../../opal/util/basename.c:23:5: warning: "HAVE_DIRNAME" is not defined
>  ../../../opal/util/basename.c: In function `opal_dirname':
>
>  problem: HAVE_DIRNAME is not defined in opal_config.h so the #if
>  HAVE_DIRNAME will fail at preprocessor/compile time
>
>  workaround:
>  change #if HAVE_DIRNAME to #if defined(HAVE_DIRNAME)
>
>
>
>  8. OPAL util/basename.c: strncpy_s and _strdup
>
>  ../../../opal/util/basename.c: In function `opal_dirname':
>  ../../../opal/util/basename.c:153: error: implicit declaration of
>  function `strncpy_s'
>  ../../../opal/util/basename.c:160: error: implicit declaration of
>  function `_strdup'
>
>  #ifdef MCS_VXWORKS
>     strncpy( ret, filename, p - filename);
>  #else
>     strncpy_s( ret, (p - filename + 1), filename, p - filename );
>  #endif
>
>  #ifdef MCS_VXWORKS
>     return strdup(".");
>  #else
>     return _strdup(".");
>  #endif
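
One thing to check in the strncpy() branch: unlike strncpy_s(), strncpy() writes no terminating NUL when the source is longer than the count, so the result is a valid string only if ret was zero-filled beforehand or is terminated afterwards. A minimal sketch of a terminating copy; copy_prefix() is a hypothetical helper name, not part of Open MPI:

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a newly allocated, NUL-terminated copy of
 * the first len bytes of src, or NULL if allocation fails. The caller
 * must ensure src holds at least len bytes and must free() the result.
 */
static char *copy_prefix(const char *src, size_t len)
{
    char *ret = malloc(len + 1);

    if (ret == NULL)
        return NULL;
    memcpy(ret, src, len);
    ret[len] = '\0';      /* explicit terminator, which strncpy() would omit */
    return ret;
}
```

In the dirname case the call would be along the lines of copy_prefix(filename, p - filename).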
>
>
>
>
>  9. opal/util/if.c: socket() prototype not found in vxworks headers
>
>  #ifdef HAVE_SYS_SOCKET_H
>  #include <sys/socket.h>
>  #ifdef MCS_VXWORKS
>  #include <sockLib.h>
>  #endif
>  #endif
>
>  10. opal/util/if.c: ioctl()
>
>  #ifdef HAVE_SYS_IOCTL_H
>  #include <sys/ioctl.h>
>  #ifdef MCS_VXWORKS
>  #include <ioLib.h>
>  #endif
>  #endif
>
>
>  11. opal/util/os_path.c: MAXPATHLEN change to PATH_MAX
>
>  #ifdef MCS_VXWORKS
>     if (total_length > PATH_MAX) {  /* path length is too long - reject it */
>         return(NULL);
>  #else
>     if (total_length > MAXPATHLEN) {  /* path length is too long - reject it */
>         return(NULL);
>  #endif
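
The duplicated if statement could be avoided by choosing the limit once. A minimal sketch, assuming PATH_MAX comes from <limits.h> on the target; PORT_PATH_MAX and path_length_ok() are hypothetical names used only for illustration, and the 256 fallback is an arbitrary placeholder:

```c
#include <limits.h>   /* PATH_MAX, where the platform provides it */
#include <stddef.h>

/* Pick whichever limit the platform defines; names here are hypothetical. */
#if defined(PATH_MAX)
#define PORT_PATH_MAX PATH_MAX
#elif defined(MAXPATHLEN)
#define PORT_PATH_MAX MAXPATHLEN
#else
#define PORT_PATH_MAX 256   /* arbitrary fallback, an assumption */
#endif

/* Returns nonzero if a path of total_length bytes is within the limit. */
static int path_length_ok(size_t total_length)
{
    return total_length <= (size_t)PORT_PATH_MAX;
}
```

The same macro would also cover the MAXPATHLEN use in output.c.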
>
>
>
>  12. opal/util/output.c: gethostname()
>  include <hostLib.h>
>
>  13. opal/util/output.c: MAXPATHLEN
>  same fix as os_path.c above
>
>  14. opal/util/output.c: closelog/openlog/syslog
>  manually turned off HAVE_SYSLOG_H in opal_config.h
>  then got a patch from Jeff Squyres that avoids syslog
>
>  15. opal/util/opal_pty.c
>  complains about mismatched prototype of opal_openpty() between this
>  source file and opal_pty.h
>
>  workaround: manually edit build_vxworks_ppc/opal/include/opal_config.h,
>  use the following line (change 1 to 0):
>
>  #define OMPI_ENABLE_PTY_SUPPORT 0
>
>  16. opal/util/stacktrace.c
>  FPE_FLTINV not present in signal.h
>
>  workaround: edit opal_config.h to turn off
>  OMPI_WANT_PRETTY_PRINT_STACKTRACE (this can be explicitly configured out
>  but I don't want to reconfigure because I hacked #15 above)
>
>
>  17. opal/mca/base/mca_base_open.c
>  gethostname() -- same as opal/util/output.c, must include hostLib.h
>
>  18. opal_progress.c
>  from opal/event/event.h (that I modified earlier)
>  cannot find #include <sys/_timeradd.h>
>  It is in opal/event/compat/sys
>
>  workaround: change event.h to include the definitions that are present
>  in _timeradd.h instead of including it.
>
>  19. Link errors for opal_wrapper
>  strcasecmp
>  strncasecmp
>
>  I rolled my own in mca_base_open.c (temporary fix, since we may come
>  across this problem elsewhere in the code).
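
A minimal sketch of what such replacements could look like, written here for illustration; these are not the functions that were actually added to mca_base_open.c. The port_ prefix is a hypothetical naming choice that avoids clashing with a platform that does provide strcasecmp()/strncasecmp():

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Case-insensitive comparison of at most n characters.
 * Returns <0, 0 or >0, like strncmp().
 */
static int port_strncasecmp(const char *a, const char *b, size_t n)
{
    for (; n > 0; ++a, ++b, --n) {
        /* cast to unsigned char: tolower() is undefined for negative values */
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);

        if (ca != cb)
            return ca - cb;
        if (ca == '\0')
            return 0;     /* both strings ended together */
    }
    return 0;             /* first n characters matched */
}

/* Case-insensitive comparison of two whole strings. */
static int port_strcasecmp(const char *a, const char *b)
{
    return port_strncasecmp(a, b, (size_t)-1);
}
```

Putting these in one shared compatibility file, rather than in mca_base_open.c, would make them available wherever else the link error shows up.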
>
>
>  20. dss_internal.h uses a type 'uint'
>  Not sure if it's depending on something in the headers, or something it
>  defined on its own.
>
>  I changed it to be just like the header I found somewhere under Linux
>  /usr/include:
>
>  #ifdef MCS_VXWORKS
>  typedef unsigned int uint;
>  #endif
>
>
>  21. struct iovec definition needed
>  orte/mca/iof/base/iof_base_fragment.h:45: warning: array type has
>  incomplete element type
>
>  #ifdef MCS_VXWORKS
>  #include <net/uio.h>
>  #endif
>
>  not sure if this is right, or if I should include something like
>  <netBufLib.h> or <ioLib.h>
>
>  22. iof_base_setup.c
>  struct termios not understood
>  can only find termios.h header in 'diab' area and I'm not using that
>  compiler.
>
>  a variable usepty is set to 0 already when OMPI_ENABLE_PTY_SUPPORT is 0.
>  So, why are we compiling this fragment of code at all? I hacked the file
>  so that the struct termios code will not get compiled.
>
>
>  23. oob_base_send/recv.c, oob_base_send/recv_nb.c. struct iovec not
>  known.
>
>  #ifdef MCS_VXWORKS
>  #include <net/uio.h>
>  #endif
>
>
>  24. orte/mca/rmgr/base/rmgr_base_check_context.c:58: error:
>  `MAXHOSTNAMELEN' undeclared (first use in this function)
>
>  #ifdef MCS_VXWORKS
>  #define MAXHOSTNAMELEN 64
>  #endif
>
>  25. orte/mca/rmgr/base/rmgr_base_check_context.c:58:
>  gethostname()
>
>  #ifdef MCS_VXWORKS
>  #include <hostLib.h>
>  #endif
>
>  26. orte/mca/iof/proxy/iof_proxy.h:135: warning: array type has
>  incomplete element type
>  ../../../../../orte/mca/iof/proxy/iof_proxy.h:135: error: field
>  `proxy_iov' has incomplete type
>
>  #ifdef MCS_VXWORKS
>  #include <net/uio.h>
>  #endif
>
>
>  27. /orte/mca/iof/svc/iof_svc.h:147: warning: array type has incomplete
>  element type
>  ../../../../../orte/mca/iof/svc/iof_svc.h:147: error: field `svc_iov'
>  has incomplete type
>
>  #ifdef MCS_VXWORKS
>  #include <net/uio.h>
>  #endif
>
>  28. ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:66: warning: array
>  type has incomplete element type
>  ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:66: error: field `msg_iov'
>  has incomplete type
>  ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h: In function
>  `mca_oob_tcp_msg_iov_alloc':
>  ../../../../../orte/mca/oob/tcp/oob_tcp_msg.h:196: error: invalid
>  application of `sizeof' to incomplete type `iovec'
>
>
>
>  29. ../../../../../orte/mca/oob/tcp/oob_tcp.c:344: error: implicit
>  declaration of function `accept'
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
>  `mca_oob_tcp_create_listen':
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c:383: error: implicit
>  declaration of function `socket'
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c:399: error: implicit
>  declaration of function `bind'
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c:407: error: implicit
>  declaration of function `getsockname'
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c:415: error: implicit
>  declaration of function `listen'
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
>  `mca_oob_tcp_listen_thread':
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c:459: error: implicit
>  declaration of function `bzero'
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
>  `mca_oob_tcp_recv_probe':
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c:696: error: implicit
>  declaration of function `send'
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
>  `mca_oob_tcp_recv_handler':
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c:795: error: implicit
>  declaration of function `recv'
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c: In function
>  `mca_oob_tcp_init':
>  ../../../../../orte/mca/oob/tcp/oob_tcp.c:1087: error: implicit
>  declaration of function `usleep'
>
>  This gets rid of most (except bzero and usleep):
>
>  #ifdef MCS_VXWORKS
>  #include <sockLib.h>
>  #endif
>
>
>  Trying to reconfigure the package so CFLAGS will not include -pedantic.
>  This is because $WIND_HOME/vxworks-6.3/target/h/string.h has protos for
>  bzero, but only when #if _EXTENSION_WRS is true. So would turning off
>  -ansi/-pedantic get this? In my dreams?
>
>    On Mar 17, 2010, at 9:54 PM, 张晶 wrote:
>
>   Hello all,
>
> In order to add some real-time features to Open MPI for research, I need
> a version of Open MPI that runs on VxWorks. But after going through the
> Open MPI website, I can't find any indication that it supports VxWorks.
>
> Following the thread posted by Ralph Castain,
> http://www.open-mpi.org/community/lists/users/2006/06/1371.php ,
> I read some papers about OpenRTE, such as “Creating a transparent,
> distributed, and resilient computing environment: the OpenRTE project” and
> “The Open Run-Time Environment (OpenRTE): A Transparent Multi-cluster
> Environment for High-Performance Computing”, written by Ralph H.
> Castain, Jeffrey M. Squyres, and others.
>
> Now I have a basic understanding of OpenRTE; however, there is too little
> documentation describing its implementation. I don't know where or how to
> begin the migration. Any advice will be appreciated.
>
>
>
>
> Thanks
>
>
> Jing Zhang
>
> _______________________________________________
>
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
>
>
>
>



-- 
张晶
