Re: [Rd] Suggestion for serialization performance improvement on Windows

2010-07-20 Thread Prof Brian Ripley

On Fri, 9 Jul 2010, Bryan W. Lewis wrote:


Dear R developers,

 The slow performance of serializing to a raw vector on Windows is an
issue that has appeared in this list before. It appears to be due to


References?


the frequent use of realloc from the resize_buffer method in
serialize.c.

I suggest a more granular, but still incremental, re-allocation of
memory. For example change near the top of resize_buffer to:

R_size_t newsize = needed + 65536 - (needed % 65536);

or some other similar small multiple of a typical system page size.


for some definition of 'small multiple'
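
The rounding in the proposed one-line change can be sketched in isolation as follows. This is only an illustration: `round_up_to_chunk` and its `chunk` parameter are hypothetical names, not code from serialize.c, and plain `size_t` stands in for `R_size_t`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not the actual code in serialize.c): round a
   requested size up to a multiple of `chunk` bytes, using the same
   arithmetic as the one-line change proposed above. */
static size_t round_up_to_chunk(size_t needed, size_t chunk)
{
    assert(chunk > 0);
    /* If `needed` is already an exact multiple of `chunk`, this adds one
       whole extra chunk, so the result is always strictly greater than
       `needed`. */
    return needed + chunk - (needed % chunk);
}
```

With `chunk` set to 65536, a request for 65537 bytes becomes 131072 bytes, so many small appends share one allocation instead of each triggering its own realloc.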


I have found this to dramatically improve performance of serialization
to raw vectors on Windows.


However, I didn't and you presented no evidence.  On HB's 2008 example 
your idea achieved for me a speedup of about 3x.  A much better 
speedup (15x) was achieved by switching serialize.c to use the 
alternative malloc used by memory.c, and using a much larger page size 
(e.g. 1 MB) was better still.  But changing the re-allocation strategy 
resulted in a 150x speedup, to levels comparable to decent operating 
systems like Linux and Solaris with the existing code.
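
The message does not say which re-allocation strategy was adopted. Purely as an assumed illustration, one common alternative to growing by a fixed increment is geometric growth, where the capacity doubles whenever it runs out, so the number of realloc calls grows only logarithmically with the final size. The `bytebuf` type, its functions, and the 8192-byte starting capacity below are all hypothetical and are not taken from serialize.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable byte buffer; all names are illustrative only. */
typedef struct {
    unsigned char *data;
    size_t size;      /* allocated capacity in bytes */
    size_t count;     /* bytes currently in use */
    size_t reallocs;  /* number of realloc calls, kept for illustration */
} bytebuf;

/* Ensure capacity for at least `needed` bytes by doubling the capacity
   (assumed strategy). Returns 0 on success, -1 on allocation failure. */
static int bytebuf_reserve(bytebuf *b, size_t needed)
{
    if (needed <= b->size)
        return 0;
    size_t newsize = b->size ? b->size : 8192;  /* assumed initial capacity */
    while (newsize < needed) {
        if (newsize > (size_t)-1 / 2) {  /* doubling would overflow */
            newsize = needed;
            break;
        }
        newsize *= 2;
    }
    unsigned char *p = realloc(b->data, newsize);
    if (p == NULL)
        return -1;  /* the old block is still valid and still owned by b */
    b->data = p;
    b->size = newsize;
    b->reallocs++;
    return 0;
}

/* Append `len` bytes from `src`. Returns 0 on success, -1 on failure. */
static int bytebuf_append(bytebuf *b, const void *src, size_t len)
{
    assert(b != NULL);
    if (len > (size_t)-1 - b->count)
        return -1;  /* count + len would overflow */
    if (bytebuf_reserve(b, b->count + len) != 0)
        return -1;
    memcpy(b->data + b->count, src, len);
    b->count += len;
    return 0;
}
```

In this sketch, appending 1000 bytes a thousand times to a zero-initialized `bytebuf` ends with a capacity of 1048576 bytes after 8 calls to realloc, rather than one call per append.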


(In case it matters, I was using x64 Windows 7.)

Ideally you would have

- given references for your claims

- given examples for why this was too slow for you

- specified an exact patch with performance comparisons for your examples

- given your credentials (see the comment about 'good manners' in the 
R posting guide).  It is very likely that we would not have been able 
to use any patch you supplied without such credentials.


So please test R-devel, and if there is still a problem reply with all 
the details omitted here.


--
Brian D. Ripley,  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UK  Fax:  +44 1865 272595

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


Re: [Rd] Suggestion for serialization performance improvement on Windows

2010-07-13 Thread Henrik Bengtsson
On Fri, Jul 9, 2010 at 6:49 AM, Bryan W. Lewis bwaynele...@gmail.com wrote:
 Dear R developers,

  The slow performance of serializing to a raw vector on Windows is an
 issue that has appeared in this list before.

My guess is that you are referring to:

[Rd] serialize() to via temporary file is heaps faster than doing it
directly (on Windows), 2008-07-24
http://tolstoy.newcastle.edu.au/R/e4/devel/08/07/2355.html

If so, that thread shows how unnecessarily slow (5 mins instead of 5
secs) it is on Windows.

 It appears to be due to
 the frequent use of realloc from the resize_buffer method in
 serialize.c.

 I suggest a more granular, but still incremental, re-allocation of
 memory. For example change near the top of resize_buffer to:

 R_size_t newsize = needed + 65536 - (needed % 65536);

 or some other similar small multiple of a typical system page size.

 I have found this to dramatically improve performance of serialization
 to raw vectors on Windows.

I second this update, which seems to make serialize(...,
connection=NULL) useful in Windows.

Thxs,

Henrik


 Best,

 Bryan
