Yes Tim, I'm aware of it - this just needed to be fixed quickly so LANL could
operate.

On Aug 14, 2012, at 1:00 PM, Tim Mattox <timat...@open-mpi.org> wrote:

> Is a linear search actually necessary?  Is there some order to the
> vpid's in the array?
> I would hope you could do a binary search, or if the vpid's are unordered,
> then hopefully this is a rarely invoked code path.  Just thinking of
> scalability.
> 
> On Tue, Aug 14, 2012 at 2:18 PM,  <svn-commit-mai...@open-mpi.org> wrote:
>> Author: rhc (Ralph Castain)
>> Date: 2012-08-14 14:17:59 EDT (Tue, 14 Aug 2012)
>> New Revision: 27035
>> URL: https://svn.open-mpi.org/trac/ompi/changeset/27035
>> 
>> Log:
>> We can't just lookup the node in the node pool by daemon vpid as the daemons 
>> aren't stored that way - this was done because when holes exist in daemon 
>> vpids, we can generate huge orte_node_pool arrays even when only a few 
>> daemons actually exist. So we have to search for the vpid in the array
>> 
>> Text files modified:
>>   trunk/orte/util/nidmap.c |    42 +++++++++++++++++++++++++++++++++++++--
>>   1 files changed, 39 insertions(+), 3 deletions(-)
>> 
>> Modified: trunk/orte/util/nidmap.c
>> ==============================================================================
>> --- trunk/orte/util/nidmap.c    Tue Aug 14 14:11:09 2012        (r27034)
>> +++ trunk/orte/util/nidmap.c    2012-08-14 14:17:59 EDT (Tue, 14 Aug 2012)    (r27035)
>> @@ -1045,7 +1045,7 @@
>>     orte_std_cntr_t n;
>>     opal_buffer_t buf;
>>     int rc, j, k;
>> -    orte_job_t *jdata;
>> +    orte_job_t *jdata, *daemons;
>>     orte_proc_t *proc, *pptr;
>>     orte_node_t *node, *nptr;
>>     orte_proc_state_t *states=NULL;
>> @@ -1061,6 +1061,8 @@
>>         goto cleanup;
>>     }
>> 
>> +    daemons = orte_get_job_data_object(ORTE_PROC_MY_NAME->jobid);
>> +
>>     n = 1;
>>     /* cycle through the buffer */
>>     while (ORTE_SUCCESS == (rc = opal_dss.unpack(&buf, &jobid, &n, ORTE_JOBID))) {
>> @@ -1167,10 +1169,44 @@
>>                 proc->name.vpid = i;
>>                 opal_pointer_array_set_item(jdata->procs, i, proc);
>>             }
>> -            if (NULL == (node = (orte_node_t*)opal_pointer_array_get_item(orte_node_pool, nodes[i]))) {
>> +            /* we can't just lookup the node in the node pool by daemon vpid
>> +             * as the daemons aren't stored that way - this was done because
>> +             * when holes exist in daemon vpids, we can generate huge orte_node_pool
>> +             * arrays even when only a few daemons actually exist. So we have to
>> +             * search for the vpid in the array
>> +             */
>> +            node = NULL;
>> +            for (j=0; j < orte_node_pool->size; j++) {
>> +                if (NULL == (nptr = (orte_node_t*)opal_pointer_array_get_item(orte_node_pool, j))) {
>> +                    continue;
>> +                }
>> +                if (nptr->daemon->name.vpid == nodes[i]) {
>> +                    node = nptr;
>> +                    break;
>> +                }
>> +            }
>> +            if (NULL == node) {
>>                 /* this should never happen, but protect ourselves anyway */
>>                 node = OBJ_NEW(orte_node_t);
>> -                opal_pointer_array_set_item(orte_node_pool, nodes[i], node);
>> +                /* find the daemon */
>> +                found = false;
>> +                for (j=0; j < daemons->procs->size; j++) {
>> +                    if (NULL == (pptr = (orte_proc_t*)opal_pointer_array_get_item(daemons->procs, j))) {
>> +                        continue;
>> +                    }
>> +                    if (pptr->name.vpid == nodes[i]) {
>> +                        found = true;
>> +                        break;
>> +                    }
>> +                }
>> +                if (!found) {
>> +                    pptr = OBJ_NEW(orte_proc_t);
>> +                    pptr->name.jobid = ORTE_PROC_MY_NAME->jobid;
>> +                    pptr->name.vpid = nodes[i];
>> +                    opal_pointer_array_set_item(daemons->procs, nodes[i], pptr);
>> +                }
>> +                node->daemon = pptr;
>> +                opal_pointer_array_add(orte_node_pool, node);
>>             }
>>             if (NULL != proc->node) {
>>                 if (node != proc->node) {
>> _______________________________________________
>> svn-full mailing list
>> svn-f...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/svn-full
> 
> 
> 
> -- 
> Tim Mattox, Ph.D. - I'm a bright... http://www.the-brights.net/
> timat...@open-mpi.org || tmat...@gmail.com
> 
> _______________________________________________
> devel mailing list
> de...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/devel
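
To make the trade-off in this thread concrete, here is a minimal sketch of the two lookup strategies being discussed: a linear scan over a densely packed array (what r27035 does), and a binary search that would only work if the entries were kept sorted by daemon vpid. None of this is the actual ORTE code. The node_rec_t type and the find_linear, sort_by_vpid and find_sorted helpers are hypothetical names invented for illustration; only qsort and bsearch from the C standard library are real.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a node record; NOT the real orte_node_t. */
typedef struct {
    uint32_t daemon_vpid;   /* vpid of the daemon that hosts this node */
    const char *name;       /* illustrative payload */
} node_rec_t;

/* Linear scan over a densely packed array: O(n) per lookup.
 * The array stays small even when the vpids have large holes,
 * because entries are appended rather than indexed by vpid. */
static node_rec_t *find_linear(node_rec_t *nodes, size_t n, uint32_t vpid)
{
    for (size_t i = 0; i < n; i++) {
        if (nodes[i].daemon_vpid == vpid) {
            return &nodes[i];
        }
    }
    return NULL;
}

/* Three-way comparison on daemon_vpid, shared by qsort and bsearch. */
static int cmp_vpid(const void *a, const void *b)
{
    uint32_t x = ((const node_rec_t *)a)->daemon_vpid;
    uint32_t y = ((const node_rec_t *)b)->daemon_vpid;
    return (x > y) - (x < y);
}

/* Assumption: the caller is free to reorder the array. */
static void sort_by_vpid(node_rec_t *nodes, size_t n)
{
    qsort(nodes, n, sizeof(nodes[0]), cmp_vpid);
}

/* Binary search: O(log n) per lookup, valid only on a sorted array. */
static node_rec_t *find_sorted(node_rec_t *nodes, size_t n, uint32_t vpid)
{
    node_rec_t key = { vpid, NULL };
    return (node_rec_t *)bsearch(&key, nodes, n, sizeof(nodes[0]), cmp_vpid);
}
```

The sorted variant keeps the compact storage that motivated the change, but it assumes the ordering is maintained whenever a node is added. Whether that is worth it depends on how often this path is invoked, which is the open question raised above.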

