Re: [Plplot-devel] Do we want to move plAlloc2dGrid(), plFree2dGrid, and plMinMax2dGrid

Phil Rosenberg Mon, 26 Jan 2015 02:27:05 -0800

Having thought about this a bit I just wanted to put some arguments
against such a change too. I think there are some good reasons to not
to change the way we do things. This is going to be a bit of a long
and rambling email, so sorry in advance.


The most important reason to stay as we are is the principle of least
astonishment (see Wikipedia). In general when people see a 2d array
(i.e. an array of pointers) they expect that they will each be from a
separate malloc. Certainly if I were to see this and then I needed to
resize the array I would attempt to free and re-malloc each row. This
would probably lead to a potentially difficult to track down segfault
or worse.

If you want to do this then it needs to be done properly. There needs
to be a framework for allocating and freeing memory so that people
don't accidentally make the mistake above. I would recommend having
creator, resizer and destructor functions (which I guess already
exist) for the memory at the minimum, and typedef arrays as something
like PL_FLT_ARRAY2D to make it clearer that the coder shouldn't tamper
with them too much. Or as I'm a C++ person I would even advocate
encapsulating the arrays in a struct with function pointers that deal
with a lot of this stuff and you could have an array factory that
creates such structs and populates the function pointers. Then you
could hide code in different translational units so coders can't
tinker with the raw pointers and break things. This is a similar idea
to the COM model used by windows, although COM is more strict about
the interface design. In fact I started to do this already - see my
branch on github.
I reality however I realised that really the best way to do this would
be to convert the library to C++ with a C frontend. This way all
resources could be encapsulated in classes and dealt with
automatically in constructors and destructors. However I'm not sure
everyone would feel the same way ;-)
So then we return to the fact that if we are going to have to rely on
people to make sure memory management is dealt with properly maybe the
best thing is KISS and we have done a full circle :-)

>I ran a benchmark and the one_allocate approach was almost 17 times faster (I 
>allocated and freed a 20 x 20 array).
To be honest that looks like a premature optimisation to me. I would
imagine that more often than not the memory of multiple adjacent
allocations would be put in contiguous memory anyway. That is
certainly my experience when using c++ std::vectors of std::vectors.
Also was the benchmark for one allocation, a small amount of access
and one deallocation? In this case and for such a small array the
bottleneck is probably the memory allocation. However for a large
array that exists for a significant fraction of the code lifetime, it
is the access (and cache misses in particular) that will be the
bottleneck.
Also I assume this was all done with compiler optimisations on and not
in debug code (sorry Jim if I'm not giving you due credit but it's
amazing how often people try to optimise debug code). I feel that if
you profile the plplot library as a whole then you will find that the
major bottlenecks are more likely to be deriving contours and the
drawing by the drivers and particularly blitting to screen.

Anyway after all this if you still want to go ahead and you are happy
to check the existing code for bad frees and mallocs that might break
it then feel free as Alan said. :-)

Phil

On 26 January 2015 at 01:35, Alan W. Irwin <ir...@beluga.phys.uvic.ca> wrote:
> On 2015-01-25 14:59-0500 Jim Dishaw wrote:
>
>>
>> On Jan 25, 2015, at 1:26 PM, "Alan W. Irwin" <ir...@beluga.phys.uvic.ca> 
>> wrote:
>>
>>> On 2015-01-24 22:34-0500 Jim Dishaw wrote:
>>>
>>>> Also, I would like to rewrite plAlloc2dGrid to use only one alloc
>>> because the multiple alloc algorithm that is currently used is
>>> unnecessary, slower, and on some OSes results in a fragmented heap. Also, 
>>> by using one alloc call, plFree2dGrid needs only one free.
>>>
>>> I think not.  I am pretty sure from code comments I have read in the
>>> past, the point of the two-dimensional mallocs (and frees) is to make
>>> dealing with dynamical two-dimensional arrays exactly the same as
>>> static two-dimensional arrays.  That is a big benefit of the current
>>> code I would not want to give up.
>>>
>> The behavior of the two dimensional arrays would be exactly the same.  
>> Instead of doing a malloc for the row pointers and then a series of mallocs 
>> for each row, one malloc call is made for all the required space.  I use 
>> this technique all the time and here is some example code:
>>
>> void one_allocate(float ***array, int m, int n)
>> {
>>  void *mem;
>>  float **ptr;
>>  int i;
>>
>>  // Allocate one chunk of memory to store the array data
>>  mem = malloc(
>>      // Allocate for the row pointer
>>      sizeof(float *) * m
>>      // Allocate for the rows
>>      + sizeof(float) * m * n);
>>
>>  // Assign top of the memory chunk to
>>  ptr = (float **)mem;
>>
>>  // Advance the memory pointer past the row pointer
>>  mem = (void *)((float **)mem + m);
>>
>>  // Point each row pointer to a chunk of memory
>>  // for storing one rows worth of data
>>  for(i = 0; i < m; i++) {
>>    ptr[i] = (float *)mem;
>>    mem = (void *)((float *)mem + n);
>>  }
>>
>>  (*array) = ptr;
>> }
>>
>> I ran a benchmark and the one_allocate approach was almost 17 times faster 
>> (I allocated and freed a 20 x 20 array).
>
> Hi Jim:
>
> The above is an obviously better approach for mallocing 2D arrays, and
> I am not surprised it is a lot faster.  So my apologies for objecting
> before seeing your code.
>
> Please make a separate commit for the change you mentioned before of
> splitting off plAlloc2dGrid and related functions to their own file.
> Also, please make a separate commit where plAlloc2dGrid is rewritten
> as above (with corresponding changes to plFree2dGrid).
>
> If the rest of your plbuf changes are ready, you could just make those
> two commits part of a larger patch series you prepare with git
> format-patch and attach to your post to this list.  Or you could send
> those changes separately as a two-commit patch series (also generated with
> git format-patch).  Anyhow, once I see those two commits (on their
> own or part of a larger patch series) I will follow up
> by testing and pushing those two commits to master, and I will leave
> all the rest of your plbuf changes for Phil to evaluate.
>
>>
>>
>>>> [From a separate post] Does anyone build plplot with BUFFERED_FILE turned 
>>>> on
>>>
>>> I have looked through the entire source tree with the following
>>> command for mentions of BUFFERED_FILE.
>>>
>>> software@raven> find -type f |grep -v .git |xargs grep -l BUFFERED_FILE
>>> ./src/plcore.c
>>> ./src/plbuf.c
>>> ./include/plstrm.h
>>>
>>> And if you drop the -l option so you see all the instances, it is
>>> clear the BUFFERED_FILE macro is only used and not actually set.
>>> Actually there is substantial danger of a name clash here so
>>> "BUFFERED_FILE" could be set in some system header somewhere.  So at
>>> minimum you should change this to PL_BUFFERED_FILE.  However, I agree
>>> with your conclusion that this is just code clutter since it is
>>> definitely not set in our current CMake-based build system (in place
>>> for almost a decade now) and likely not set in our previous build
>>> system either. So I think (unless someone steps forward now to say
>>> this is experimental code they meant to get back to working on some
>>> day) you should go ahead and completely remove the code (from all 3
>>> files above) that would be compiled if BUFFERED_FILE was ever #defined
>>> or specified as a compiler option.
>>>
>> I added the BUFFERED_FILE ifdef blocks to turnoff the original temporary 
>> file buffers when I implemented memory buffers.  I did not want to delete 
>> the original code in case the memory buffers was a horrible idea or if 
>> someone had a low memory machine.  I don't think anyone uses it because, as 
>> you pointed out, it is not in the documentation.
>
> So your BUFFERED_FILE ifdef blocks have replaced temporary file
> buffers with memory buffers for a long time now, and nobody has had
> issues with that change.  So I agree with you that it is time to
> remove the temporary file buffer approach completely.
>
> Alan
> __________________________
> Alan W. Irwin
>
> Astronomical research affiliation with Department of Physics and Astronomy,
> University of Victoria (astrowww.phys.uvic.ca).
>
> Programming affiliations with the FreeEOS equation-of-state
> implementation for stellar interiors (freeeos.sf.net); the Time
> Ephemerides project (timeephem.sf.net); PLplot scientific plotting
> software package (plplot.sf.net); the libLASi project
> (unifont.org/lasi); the Loads of Linux Links project (loll.sf.net);
> and the Linux Brochure Project (lbproject.sf.net).
> __________________________
>
> Linux-powered Science
> __________________________
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming. The Go Parallel Website,
> sponsored by Intel and developed in partnership with Slashdot Media, is your
> hub for all things parallel software development, from weekly thought
> leadership blogs to news, videos, case studies, tutorials and more. Take a
> look and join the conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> Plplot-devel mailing list
> Plplot-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/plplot-devel

------------------------------------------------------------------------------
Dive into the World of Parallel Programming. The Go Parallel Website,
sponsored by Intel and developed in partnership with Slashdot Media, is your
hub for all things parallel software development, from weekly thought
leadership blogs to news, videos, case studies, tutorials and more. Take a
look and join the conversation now. http://goparallel.sourceforge.net/
_______________________________________________
Plplot-devel mailing list
Plplot-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/plplot-devel

Re: [Plplot-devel] Do we want to move plAlloc2dGrid(), plFree2dGrid, and plMinMax2dGrid

Reply via email to