We're looking for some advice and/or comments.

During performance and scalability testing with 8.1.4 and also 8.2Beta1,
OProfile reported that the strncpy () C library call was taking a large
amount of CPU time while we were running one of our benchmarks.

We traced a partial benchmark run using ltrace and collected statistics
on all the strncpy () calls that were being made. We noticed that a
number of the calls were similar to the following:

        strncpy (0x80808080, "BIND", 2500);

This usage of strncpy () can be traced back to the set_ps_display ()
function call. We could probably turn off set_ps_display (), but it's a
useful tool.

The specification for strncpy () indicates that when the length of the
source string (4 bytes) is less than the length of the number of bytes
to copy (2500 bytes), the remainder of the destination string will be
padded with NULL bytes (2496). Based on the GLIBC source code, strncpy
() is written to pad the destination string a byte at a time and this is
where all the time was taken, according to OProfile. 

We only have access to SLES 9 SP3 and RHEL 4 U3 based environments. To
assess the performance improvement, we have replaced the strncpy () call
in set_ps_display () with a strcpy () call, as this seemed safe for our
environments. We're seeing ~3% performance improvement. However, we
realize that our environments do not represent all the environments
where Postgres is available.

The NULL padding side effect of strncpy () does mean that the process
status buffer would be cleared of any previous output. We are not sure
if there might be any security concerns when replacing the strncpy ()
with strcpy (). 

A strncpy () call could still be used with a calculated length for the
PS activity, rather than the environment size. Although, this might
incur a penalty for using strlen () against the activity and perhaps a
bounds check to make sure it would fit into the process status buffer.

Are there any implications using PS_USE_CHANGE_ARGV on Linux rather than
PS_USE_CLOBBER_ARGV, as this seems to only use a 256 byte buffer for the
process status buffer?

We're wondering if a patch worth pursuing? There could be issues with
different platforms and the performance gain is very narrow. However, we
thought we would pass this information on.



---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?


Reply via email to