I see the issue - there are no "cores" on this topology, only "pu's", so "bind-to core" is going to fail even though binding is supported. Will adjust.
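For anyone following along, here is a minimal sketch of the kind of fallback I have in mind. It is not the Open MPI code and may not match the eventual fix: `TopoObj`, `objects_of_type`, and `pick_bind_targets` are hypothetical names, and falling back to PUs when no cores exist is an assumed policy. The topology is the one from the verbose output quoted below (one Machine, two PUs, no Core objects).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TopoObj:
    """Hypothetical stand-in for one object in a hardware topology tree."""
    kind: str                      # e.g. "Machine", "Core", "PU"
    cpuset: int = 0                # bitmask of logical CPUs under this object
    children: List["TopoObj"] = field(default_factory=list)


def objects_of_type(root: TopoObj, wanted: str) -> List[TopoObj]:
    """Collect every object of the given kind, depth first."""
    found = [root] if root.kind == wanted else []
    for child in root.children:
        found.extend(objects_of_type(child, wanted))
    return found


def pick_bind_targets(root: TopoObj, requested: str = "Core") -> List[TopoObj]:
    """Return the objects to bind to.

    Assumption: if the requested level does not exist in the topology,
    fall back to PUs instead of reporting "no available cpus".
    """
    targets = objects_of_type(root, requested)
    if not targets:
        targets = objects_of_type(root, "PU")
    return targets


# Topology as reported on pcp-j-17: Machine (0x3) with two PU children.
machine = TopoObj("Machine", 0x3, [TopoObj("PU", 0x1), TopoObj("PU", 0x2)])

targets = pick_bind_targets(machine)
print([(t.kind, hex(t.cpuset)) for t in targets])
```

On this topology a lookup for "Core" comes back empty, which is the condition that trips the "no available cpus" error; the sketch binds to the two PUs instead.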
Thanks!

On Jan 8, 2014, at 9:06 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:

> Requested verbose output below.
> -Paul
>
> -bash-4.2$ mpirun -mca ess_base_verbose 10 -np 1 examples/ring_c
> [pcp-j-17:02150] mca: base: components_register: registering ess components
> [pcp-j-17:02150] mca: base: components_register: found loaded component env
> [pcp-j-17:02150] mca: base: components_register: component env has no register or open function
> [pcp-j-17:02150] mca: base: components_register: found loaded component hnp
> [pcp-j-17:02150] mca: base: components_register: component hnp has no register or open function
> [pcp-j-17:02150] mca: base: components_register: found loaded component singleton
> [pcp-j-17:02150] mca: base: components_register: component singleton register function successful
> [pcp-j-17:02150] mca: base: components_register: found loaded component tool
> [pcp-j-17:02150] mca: base: components_register: component tool has no register or open function
> [pcp-j-17:02150] mca: base: components_open: opening ess components
> [pcp-j-17:02150] mca: base: components_open: found loaded component env
> [pcp-j-17:02150] mca: base: components_open: component env open function successful
> [pcp-j-17:02150] mca: base: components_open: found loaded component hnp
> [pcp-j-17:02150] mca: base: components_open: component hnp open function successful
> [pcp-j-17:02150] mca: base: components_open: found loaded component singleton
> [pcp-j-17:02150] mca: base: components_open: component singleton open function successful
> [pcp-j-17:02150] mca: base: components_open: found loaded component tool
> [pcp-j-17:02150] mca: base: components_open: component tool open function successful
> [pcp-j-17:02150] mca:base:select: Auto-selecting ess components
> [pcp-j-17:02150] mca:base:select:( ess) Querying component [env]
> [pcp-j-17:02150] mca:base:select:( ess) Skipping component [env]. Query failed to return a module
> [pcp-j-17:02150] mca:base:select:( ess) Querying component [hnp]
> [pcp-j-17:02150] mca:base:select:( ess) Query of component [hnp] set priority to 100
> [pcp-j-17:02150] mca:base:select:( ess) Querying component [singleton]
> [pcp-j-17:02150] mca:base:select:( ess) Skipping component [singleton]. Query failed to return a module
> [pcp-j-17:02150] mca:base:select:( ess) Querying component [tool]
> [pcp-j-17:02150] mca:base:select:( ess) Skipping component [tool]. Query failed to return a module
> [pcp-j-17:02150] mca:base:select:( ess) Selected component [hnp]
> [pcp-j-17:02150] mca: base: close: component env closed
> [pcp-j-17:02150] mca: base: close: unloading component env
> [pcp-j-17:02150] mca: base: close: component singleton closed
> [pcp-j-17:02150] mca: base: close: unloading component singleton
> [pcp-j-17:02150] mca: base: close: component tool closed
> [pcp-j-17:02150] mca: base: close: unloading component tool
> [pcp-j-17:02150] [[INVALID],INVALID] Topology Info:
> [pcp-j-17:02150] Type: Machine Number of child objects: 2
>     Name=NULL
>     Backend=NetBSD
>     OSName=NetBSD
>     OSRelease=6.1
>     OSVersion="NetBSD 6.1 (CUSTOM) #0: Fri Sep 20 13:19:58 PDT 2013 phargrov@pcp-j-17:/home/phargrov/CUSTOM"
>     Architecture=i386
>     Backend=x86
>     Cpuset: 0x00000003
>     Online: 0x00000003
>     Allowed: 0x00000003
>     Bind CPU proc: TRUE
>     Bind CPU thread: TRUE
>     Bind MEM proc: FALSE
>     Bind MEM thread: FALSE
>     Type: PU Number of child objects: 0
>         Name=NULL
>         Cpuset: 0x00000001
>         Online: 0x00000001
>         Allowed: 0x00000001
>     Type: PU Number of child objects: 0
>         Name=NULL
>         Cpuset: 0x00000002
>         Online: 0x00000002
>         Allowed: 0x00000002
> --------------------------------------------------------------------------
> While computing bindings, we found no available cpus on
> the following node:
>
>   Node: pcp-j-17
>
> Please check your allocation.
> --------------------------------------------------------------------------
> [pcp-j-17:02150] mca: base: close: component hnp closed
> [pcp-j-17:02150] mca: base: close: unloading component hnp
>
>
> On Wed, Jan 8, 2014 at 8:50 PM, Ralph Castain <r...@open-mpi.org> wrote:
> Hmmm...looks to me like the code should protect against this - unless the
> system isn't correctly reporting binding support. Could you run this with
> "-mca ess_base_verbose 10"? This will output the topology we found, including
> the binding support (which isn't in the usual output).
>
> On Jan 8, 2014, at 8:14 PM, Ralph Castain <r...@open-mpi.org> wrote:
>
>> Hmmm...I see the problem. Looks like binding isn't supported on that system
>> for some reason, so we need to turn "off" our auto-binding when we hit that
>> condition. I'll check to see why that isn't happening (was supposed to do so)
>>
>>
>> On Jan 8, 2014, at 3:43 PM, Paul Hargrove <phhargr...@lbl.gov> wrote:
>>
>>> While I have yet to get a working build on NetBSD for x86-64 h/w, I *have*
>>> successfully built Open MPI's current 1.7.4rc tarball on NetBSD-6 for x86.
>>> However, I can't *run* anything:
>>>
>>> Attempting the ring_c example on 2 cores:
>>> -bash-4.2$ mpirun -mca btl sm,self -np 2 examples/ring_c
>>> --------------------------------------------------------------------------
>>> While computing bindings, we found no available cpus on
>>> the following node:
>>>
>>>   Node: pcp-j-17
>>>
>>> Please check your allocation.
>>> --------------------------------------------------------------------------
>>>
>>> The failure is the same w/o "-mca btl sm,self"
>>> Singleton runs fail just as the np=2 run did.
>>>
>>> I've attached compressed output from "ompi_info --all".
>>>
>>> Since this is probably an hwloc-related issue, I also build hwloc-1.7.2
>>> from pristine sources.
>>> I have attached compressed output of lstopo which NOTABLY indicates a
>>> failure to bind to both of the CPUs.
>>>
>>> For now, an explicit "--bind-to none" is working for me.
>>> Please let me know what additional info may be required.
>>>
>>> -Paul
>>>
>>> --
>>> Paul H. Hargrove                          phhargr...@lbl.gov
>>> Future Technologies Group
>>> Computer and Data Sciences Department     Tel: +1-510-495-2352
>>> Lawrence Berkeley National Laboratory     Fax: +1-510-486-6900
>>> <ompi_info-netbsd-x86.txt.bz2><lstopo172-netbsd-x86.txt.bz2>
>>> _______________________________________________
>>> devel mailing list
>>> de...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel