Hi folks, I am Laurent Broto, a postdoc working with Rich Graham. I'm currently working on the BTL extraction with Greg Koenig and Rainer Keller.
At the moment, I want to group all the *_bitmap functions into a single layer. Now you know who I am :)

So, just one question. I had in mind:
- adding a max_size field to the opal_bitmap_t structure,
- at init time, setting this field to INT_MAX (or whatever the type's _MAX is),
- adding a set_max_size function to change max_size,
- in each function that needs this test, just doing if( new_size < param->max_size) ...

I guess this is more efficient than Jeff's approach, which is supposed to
- first test whether the max size has been set,
- then ensure that the bitmap never grows beyond that size.

In the first approach we always do exactly one test; in the second, always one and sometimes two. But I may be missing something...

What do you think about this?

--
Laurent

-----Original Message-----
From: devel-boun...@open-mpi.org on behalf of Jeff Squyres
Sent: Sun 2/1/2009 7:37 AM
To: Open MPI Developers
Subject: Re: [OMPI devel] RFC: Move of ompi_bitmap_t

I just looked through both opal_bitmap_t and ompi_bitmap_t and I think that the only real difference is that in the ompi version, we check (in various places) that the size of the bitmap never grows beyond OMPI_FORTRAN_HANDLE_MAX; the opal version doesn't do these kinds of size checks.

I think it would be fairly straightforward to:
- add generic checks into the opal version, perhaps by adding a new API call (opal_bitmap_set_max_size())
- if the max size has been set, then ensure that the bitmap never grows beyond that size, otherwise let it have the same behavior as today (grow without bound -- presumably until malloc() fails)

It'll take a little care to merge the functionality correctly, but it is possible. Once that is done, you can:
- remove the ompi_bitmap_t class
- s/ompi_bitmap/opal_bitmap/g in the OMPI layer
- add new calls to opal_bitmap_set_max_size(&bitmap, OMPI_FORTRAN_HANDLE_MAX) in the OMPI layer (should only be in a few places -- probably one for each MPI handle type...?
It's been so long since I've looked at that code that I don't remember offhand)

I'd generally be in favor of this because, although this is not a lot of repeated code, it *is* repeated code -- so cleaning it up and consolidating the non-Fortran stuff down in opal is not a Bad Thing.

On Jan 30, 2009, at 4:59 PM, Ralph Castain wrote:

> The history is simple. Originally, there was one bitmap_t in orte that was also used in ompi. Then the folks working on Fortran found that they had to put a limit in the bitmap code to avoid getting values outside of Fortran's range. However, this introduced a problem - if we had the limit in the orte version, then we limited ourselves unnecessarily, and introduced some abstraction questions since orte knows nothing about Fortran.
>
> So two were created. Then the orte_bitmap_t was blown away at a later time when we removed the GPR as George felt it wasn't necessary (which was true). It was later reborn when we needed it in the routed system, but this time it was done in opal as others indicated a potential more general use for that capability.
>
> The problem with uniting the two is that you either have to introduce Fortran-based limits into opal (which messes up the non-ompi uses), or deal with the Fortran limits in some other fashion. Neither is particularly pleasant, though it could be done.
>
> I think it primarily is a question for the Fortran folks to address - can they deal with Fortran limits in some other manner without making the code unmanageable and/or taking a performance hit?
>
> Ralph
>
> On Jan 30, 2009, at 2:40 PM, Richard Graham wrote:
>
>> This should really be viewed as a code maintenance RFC. The reason this came up in the first place is because we are investigating the btl move, but these are really two very distinct issues. There are two bits of code that have virtually the same functionality - they do have the same interface I am told.
>> The question is, is there a good reason to keep two different versions in the repository? Not knowing the history of why a second version was created, this is an inquiry. Is there some performance advantage, or some other advantage, to having these two versions?
>>
>> Rich
>>
>> On 1/30/09 3:23 PM, "Terry D. Dontje" <terry.don...@sun.com> wrote:
>>
>>> I second Brian's concern. So unless this is just an announcement that this is being done on a tmp branch only until everything is in order I think we need further discussions.
>>>
>>> --td
>>>
>>> Brian Barrett wrote:
>>>> So once again, I bring up my objection of this entire line of moving until such time as the entire process is properly mapped out. I believe it's premature to begin moving around code in preparation for a move that hasn't been proven viable yet. Until there is concrete evidence that such a move is possible, won't degrade application performance, and does not make the code totally unmaintainable, I believe that any related code changes should not be brought into the trunk.
>>>>
>>>> Brian
>>>>
>>>> On Jan 30, 2009, at 12:30 PM, Rainer Keller wrote:
>>>>
>>>>> On behalf of Laurent Broto
>>>>>
>>>>> RFC: Move of ompi_bitmap_t
>>>>>
>>>>> WHAT: Move ompi_bitmap_t into opal or onet-layer
>>>>>
>>>>> WHY: Remove dependency on ompi-layer.
>>>>>
>>>>> WHERE: ompi/class
>>>>>
>>>>> WHEN: Open MPI-1.4
>>>>>
>>>>> TIMEOUT: February 3, 2009.
>>>>>
>>>>> -------------------------------------
>>>>> Details:
>>>>> WHY:
>>>>> The ompi_bitmap_t is being used in various places within opal/orte/ompi. With the proposed splitting of BTLs into a separate library, we are currently investigating several of the differences between ompi/class/* and opal/class/*
>>>>>
>>>>> One of the items is the ompi_bitmap_t which is quite similar to the opal_bitmap_t.
>>>>> The question is whether we can remove one of them, favoring a solution just in opal.
>>>>>
>>>>> WHAT:
>>>>> The data structures in the opal-version are the same, so is the interface, the implementation is *almost* the same....
>>>>>
>>>>> The difference is the Fortran handles ;-]!
>>>>>
>>>>> Maybe we're missing something but could we have a discussion on why Fortran sizes are playing a role here, and if this is a hard requirement, how we could settle that into the current interface (possibly without a notion of Fortran, but rather, set some upper limit that the bitmap may grow to?)
>>>>>
>>>>> With best regards,
>>>>> Laurent and Rainer
>>>>> --
>>>>> ------------------------------------------------------------------------
>>>>> Rainer Keller, PhD                  Tel: (865) 241-6293
>>>>> Oak Ridge National Lab              Fax: (865) 241-4811
>>>>> PO Box 2008 MS 6164                 Email: kel...@ornl.gov
>>>>> Oak Ridge, TN 37831-2008            AIM/Skype: rusraink
>>>>>
>>>>> _______________________________________________
>>>>> devel mailing list
>>>>> de...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/devel

--
Jeff Squyres
Cisco Systems
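
For reference, the single-test max_size idea from the top of this thread could look roughly like the sketch below. This is only an illustration under assumptions: sketch_bitmap_t and every sketch_* function are hypothetical stand-ins, not the real opal_bitmap_t or any existing Open MPI API, and the growth policy (grow to exactly the needed size) is chosen for brevity. The point it shows is that initializing max_size to INT_MAX lets every growth path do one comparison, with no separate "has a limit been set?" test.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_OK   0
#define SKETCH_ERR (-1)

/* Hypothetical stand-in for a bitmap class; NOT the real opal_bitmap_t. */
typedef struct {
    unsigned char *bits;  /* storage, one bit per index */
    int array_size;       /* number of bytes currently allocated */
    int max_size;         /* bit indices must stay below this; INT_MAX = no practical limit */
} sketch_bitmap_t;

/* Allocate room for nbits bits and start with no practical limit. */
int sketch_bitmap_init(sketch_bitmap_t *bm, int nbits)
{
    assert(bm != NULL);
    if (nbits <= 0) {
        return SKETCH_ERR;
    }
    bm->array_size = (nbits - 1) / CHAR_BIT + 1;
    bm->bits = calloc((size_t)bm->array_size, 1);
    if (bm->bits == NULL) {
        return SKETCH_ERR;
    }
    bm->max_size = INT_MAX;  /* default: the limit test below effectively always passes */
    return SKETCH_OK;
}

/* Lower the limit, e.g. to a Fortran handle maximum in an upper layer.
 * Bits already set at or above the new limit are not touched in this sketch. */
int sketch_bitmap_set_max_size(sketch_bitmap_t *bm, int max_size)
{
    assert(bm != NULL);
    if (max_size <= 0) {
        return SKETCH_ERR;
    }
    bm->max_size = max_size;
    return SKETCH_OK;
}

/* Set a bit, growing the storage if needed. The limit costs one comparison. */
int sketch_bitmap_set_bit(sketch_bitmap_t *bm, int bit)
{
    assert(bm != NULL);
    if (bit < 0 || bit >= bm->max_size) {  /* the single max_size test */
        return SKETCH_ERR;
    }
    int needed = bit / CHAR_BIT + 1;
    if (needed > bm->array_size) {
        unsigned char *p = realloc(bm->bits, (size_t)needed);
        if (p == NULL) {
            return SKETCH_ERR;
        }
        /* zero only the newly added bytes */
        memset(p + bm->array_size, 0, (size_t)(needed - bm->array_size));
        bm->bits = p;
        bm->array_size = needed;
    }
    bm->bits[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
    return SKETCH_OK;
}

/* Return 1 if the bit is set, 0 otherwise (including out-of-range indices). */
int sketch_bitmap_is_set(const sketch_bitmap_t *bm, int bit)
{
    assert(bm != NULL);
    if (bit < 0 || bit / CHAR_BIT >= bm->array_size) {
        return 0;
    }
    return (bm->bits[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1;
}

void sketch_bitmap_free(sketch_bitmap_t *bm)
{
    assert(bm != NULL);
    free(bm->bits);
    bm->bits = NULL;
    bm->array_size = 0;
}
```

A caller that needs a bound would call sketch_bitmap_set_max_size() once after init; callers that never set a limit keep the grow-until-allocation-fails behavior, apart from index INT_MAX itself being rejected by the strict comparison.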