Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-12-03 Thread Florian Weimer
* Tom Lane:

 Yeah.  You certainly don't want to do the division sequence twice,
 and a log() call wouldn't be cheap either, and there don't seem to
 be many other alternatives.

What about a sequence of comparisons, and unrolling the loop?  That
could avoid the final division, too.  It might also be helpful to
break down the dependency chain for large input values.

The int8 version should probably work in 1e9 chunks and use a
zero-padding variant of the 32-bit code.

-- 
Florian Weimerfwei...@bfk.de
BFK edv-consulting GmbH   http://www.bfk.de/
Kriegsstraße 100  tel: +49-721-96201-1
D-76133 Karlsruhe fax: +49-721-96201-99

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Robert Haas
On Fri, Nov 19, 2010 at 10:18 PM, Robert Haas robertmh...@gmail.com wrote:
 Sure thing.  Thanks for taking time to do this - very nice speedup.
 This part now committed, too.

It occurs to me belatedly that there might be a better way to do this.
 Instead of flipping value from negative to positive, with a special
case for the smallest possible integer, we could do it the other
round.  And actually, I think we can rid of neg, too.

if (value  0)
*a++ = '-';
else
value = -value;
start = a;

Then we could just adjust the calculation of the actual digit.

*a++ = '0' + (-remainder);

Good idea?  Bad idea?  Seems cleaner to me, assuming it'll actually work...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 It occurs to me belatedly that there might be a better way to do this.
  Instead of flipping value from negative to positive, with a special
 case for the smallest possible integer, we could do it the other
 round.  And actually, I think we can rid of neg, too.

The trouble with that approach is that you have to depend on the
direction of rounding for negative quotients.  Which was unspecified
before C99, and it's precisely pre-C99 compilers that are posing a
hazard to the current coding.

FWIW, I find the code still pretty darn unsightly.  I think this change
is just wrong:

 * Avoid problems with the most negative integer not being representable
 * as a positive integer.
 */
-   if (value == INT32_MIN)
+   if (value == INT_MIN)
{
memcpy(a, -2147483648, 12);

and even with INT32_MIN it was pretty silly, because there is exactly 0
hope of the code behaving sanely for some other value of the symbolic
constant.  I think it'd be much better to abandon the macros altogether
and write

if (value == (-2147483647-1))
{
memcpy(a, -2147483648, 12);

Likewise for the int64 case, which BTW is no safer for pre-C99 compilers
than it was yesterday: LL is not the portable way to write int64
constants.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Robert Haas
On Sat, Nov 20, 2010 at 10:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The trouble with that approach is that you have to depend on the
 direction of rounding for negative quotients.  Which was unspecified
 before C99, and it's precisely pre-C99 compilers that are posing a
 hazard to the current coding.

Interesting.  I wondered whether there might be compilers out there
that handled that inconsistently, but then I thought I was probably
being paranoid.

 Likewise for the int64 case, which BTW is no safer for pre-C99 compilers
 than it was yesterday: LL is not the portable way to write int64
 constants.

Gah.  I wish we had some documentation of this stuff.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Tom Lane
BTW, while we're thinking about marginal improvements: instead of
constructing the string backwards and then reversing it in-place,
what about building it working backwards from the end of the buffer
and then memmove'ing it down to the start of the buffer?

I haven't tested this but it seems likely to be roughly a wash
speed-wise.  The reason I find the idea attractive is that it will
immediately expose any caller that is providing a buffer shorter
than the required length, whereas now such callers will appear to
work fine if they're only tested on small values.

A small downside is that pg_itoa would then need its own implementation
instead of just punting to pg_ltoa.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Robert Haas
On Sat, Nov 20, 2010 at 12:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 BTW, while we're thinking about marginal improvements: instead of
 constructing the string backwards and then reversing it in-place,
 what about building it working backwards from the end of the buffer
 and then memmove'ing it down to the start of the buffer?

 I haven't tested this but it seems likely to be roughly a wash
 speed-wise.  The reason I find the idea attractive is that it will
 immediately expose any caller that is providing a buffer shorter
 than the required length, whereas now such callers will appear to
 work fine if they're only tested on small values.

 A small downside is that pg_itoa would then need its own implementation
 instead of just punting to pg_ltoa.

I think that might be more clever than is really warranted.  I get
your point about buffer overrun, but I don't think it's that hard for
callers to do the right thing, so I'm inclined to think that's not
worth much in this case.  Of course, if memmove() can be implemented
as a single assembler instruction or something, that might be
appealing from a speed standpoint, but otherwise I think we may as
well stick with this.  There's less chance of needlessly touching an
extra cache line, less chance of being confused by leftover garbage in
memory after the end of the output string, and less duplicate code.

I had given some thought to whether it might make sense to try to
figure out how long the string will be before we actually start
generating it, so that we can just start in the exactly right space
and have to clean up afterward.  But the obvious implementation seems
like it could be more expensive than just doing the copy.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sat, Nov 20, 2010 at 12:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 what about building it working backwards from the end of the buffer
 and then memmove'ing it down to the start of the buffer?

 I think that might be more clever than is really warranted.  I get
 your point about buffer overrun, but I don't think it's that hard for
 callers to do the right thing, so I'm inclined to think that's not
 worth much in this case.

Fair enough --- it was just a passing thought.

 I had given some thought to whether it might make sense to try to
 figure out how long the string will be before we actually start
 generating it, so that we can just start in the exactly right space
 and have to clean up afterward.  But the obvious implementation seems
 like it could be more expensive than just doing the copy.

Yeah.  You certainly don't want to do the division sequence twice,
and a log() call wouldn't be cheap either, and there don't seem to
be many other alternatives.  If we were going to get picky about
avoiding the reverse step, I'd go with Andres' idea of changing
the API to pass back an address instead of guaranteeing that the
result begins at the start of the buffer.  But I think that's much
more complicated for callers than it's worth.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Andres Freund
On Saturday 20 November 2010 18:34:04 Tom Lane wrote:
 BTW, while we're thinking about marginal improvements: instead of
 constructing the string backwards and then reversing it in-place,
 what about building it working backwards from the end of the buffer
 and then memmove'ing it down to the start of the buffer?
 
 I haven't tested this but it seems likely to be roughly a wash
 speed-wise.  The reason I find the idea attractive is that it will
 immediately expose any caller that is providing a buffer shorter
 than the required length, whereas now such callers will appear to
 work fine if they're only tested on small values.
Tried that, the cost was measurable although not big (~3-5%)...

Greetings,

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Andres Freund
On Saturday 20 November 2010 18:18:32 Robert Haas wrote:
  Likewise for the int64 case, which BTW is no safer for pre-C99 compilers
  than it was yesterday: LL is not the portable way to write int64
  constants.
 Gah.  I wish we had some documentation of this stuff.
Dito. I started doing Cish stuff quite a bit *after* C99 was mostly available 
in gcc...

Sorry btw, for not realizing those points (and the regression-expectation file) 
myself...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-20 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Sat, Nov 20, 2010 at 6:31 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I had given some thought to whether it might make sense to try to
 figure out how long the string will be before we actually start
 generating it, so that we can just start in the exactly right space
 and have to clean up afterward.  But the obvious implementation seems
 like it could be more expensive than just doing the copy.

 Yeah.  You certainly don't want to do the division sequence twice,
 and a log() call wouldn't be cheap either, and there don't seem to
 be many other alternatives.

 There are bittwiddling hacks for computing log based 2. I'm not sure
 it's worth worrying about to this degree though.

I think converting log2 to log10 *exactly* would end up being not so
cheap, anyhow.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-19 Thread Andres Freund
On Monday 15 November 2010 17:12:25 Robert Haas wrote: I notice that int8out 
isn't terribly consistent with int2out and
 int4out, in that it does an extra copy.   Maybe that's justified given
 the greater potential memory wastage, but I'm not certain.  One
 approach might be to pick some threshold value and allocate a buffer
 in one of two sizes based on how large the value is relative to that
 cutoff.  But that might also be a stupid idea, not sure.
I removed the extra buffer - its actually a tiny bit faster without it  (I 
guess the allocation pattern is a bit nicer during copy as it will always take 
the same paths and eventually the same address).
I couldn't measure any difference memory-usage wise.

The code was that way before btw.

 It would speed things up for me if you or someone else could take a
 quick pass over what remains here and fix the formatting and
 whitespace to be consistent with our general project style, and make
 the comment headers more consistent among the functions being
 added/modified.
I think I did most of those - the function comments in numutils weren't 
consistent before - now its consistent with the unchanged pg_atoi. 

Thanks for reviewing/applying the first part,

Andres
From 55acfa4f971f5a0e33eb8b9e66d621c16be96d42 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Fri, 19 Nov 2010 21:44:29 +0100
Subject: [PATCH] Implement custom int[248]-string conversion routines out of speed reasons.

---
 src/backend/utils/adt/int8.c   |   10 +--
 src/backend/utils/adt/numutils.c   |  130 
 src/include/utils/builtins.h   |1 +
 src/test/regress/expected/int2.out |   13 
 src/test/regress/expected/int4.out |   13 
 src/test/regress/expected/int8.out |   13 
 src/test/regress/sql/int2.sql  |4 +
 src/test/regress/sql/int4.sql  |4 +
 src/test/regress/sql/int8.sql  |4 +
 9 files changed, 172 insertions(+), 20 deletions(-)

diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 894110d..8f4ef5a 100644
*** a/src/backend/utils/adt/int8.c
--- b/src/backend/utils/adt/int8.c
***
*** 20,25 
--- 20,26 
  #include funcapi.h
  #include libpq/pqformat.h
  #include utils/int8.h
+ #include utils/builtins.h
  
  
  #define MAXINT8LEN		25
*** Datum
*** 157,170 
  int8out(PG_FUNCTION_ARGS)
  {
  	int64		val = PG_GETARG_INT64(0);
! 	char	   *result;
! 	int			len;
! 	char		buf[MAXINT8LEN + 1];
! 
! 	if ((len = snprintf(buf, MAXINT8LEN, INT64_FORMAT, val))  0)
! 		elog(ERROR, could not format int8);
  
! 	result = pstrdup(buf);
  	PG_RETURN_CSTRING(result);
  }
  
--- 158,166 
  int8out(PG_FUNCTION_ARGS)
  {
  	int64		val = PG_GETARG_INT64(0);
! 	char	   *result = palloc(MAXINT8LEN + 1);
  
! 	pg_lltoa(val, result);
  	PG_RETURN_CSTRING(result);
  }
  
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 5f8083f..7b50549 100644
*** a/src/backend/utils/adt/numutils.c
--- b/src/backend/utils/adt/numutils.c
***
*** 3,10 
   * numutils.c
   *	  utility functions for I/O of built-in numeric types.
   *
-  *		integer:pg_atoi, pg_itoa, pg_ltoa
-  *
   * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
--- 3,8 
*** pg_atoi(char *s, int size, int c)
*** 109,135 
  }
  
  /*
!  *		pg_itoa			- converts a short int to its string represention
   *
!  *		Note:
!  *previously based on ~ingres/source/gutil/atoi.c
!  *now uses vendor's sprintf conversion
   */
  void
  pg_itoa(int16 i, char *a)
  {
! 	sprintf(a, %hd, (short) i);
  }
  
  /*
!  *		pg_ltoa			- converts a long int to its string represention
   *
!  *		Note:
!  *previously based on ~ingres/source/gutil/atoi.c
!  *now uses vendor's sprintf conversion
   */
  void
! pg_ltoa(int32 l, char *a)
  {
! 	sprintf(a, %d, l);
  }
--- 107,239 
  }
  
  /*
!  * pg_ltoa - convert a signed 16bit integer to its string representation
   *
!  * It doesnt seem worth implementing this separately.
   */
  void
  pg_itoa(int16 i, char *a)
  {
! 	pg_ltoa((int32)i, a);
  }
  
+ 
  /*
!  * pg_ltoa: convert a signed 32bit integer to its string representation
   *
!  * 'buf' has to be 12 bytes long to fit the result of any 32bit integer.
!  *
!  * Its unfortunate to have this function twice - once for 32bit, once
!  * for 64bit, but incurring the cost of 64bit computation to 32bit
!  * platforms doesn't seem to be acceptable.
   */
  void
! pg_ltoa(int32 value, char *buf)
  {
! 	char *bufstart = buf;
! 	bool neg = false;
! 
! 	/*
! 	 * Avoid problems with the most negative not being representable
! 	 * as a positive integer
! 	 */
! 	if (value == INT32_MIN)
! 	{
! 		memcpy(buf, -2147483648, 12);
! 		return;
! 	}
! 	else if (value  0)
! 	{
! 		value = -value;
! 		neg = true;
! 	}
! 
! 	/* Build the string by computing the wanted 

Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-19 Thread Robert Haas
On Fri, Nov 19, 2010 at 4:16 PM, Andres Freund and...@anarazel.de wrote:
 On Monday 15 November 2010 17:12:25 Robert Haas wrote: I notice that int8out
 isn't terribly consistent with int2out and
 int4out, in that it does an extra copy.   Maybe that's justified given
 the greater potential memory wastage, but I'm not certain.  One
 approach might be to pick some threshold value and allocate a buffer
 in one of two sizes based on how large the value is relative to that
 cutoff.  But that might also be a stupid idea, not sure.
 I removed the extra buffer - its actually a tiny bit faster without it  (I
 guess the allocation pattern is a bit nicer during copy as it will always take
 the same paths and eventually the same address).
 I couldn't measure any difference memory-usage wise.

 The code was that way before btw.

Yeah, I know.  After further thought I decided not to commit this
part, because using 32 bytes when you only need 8 is sort of sucky.
I'm not sure if it matters in real life, but if it's only a tiny
speedup I guess I might as well play it safe.

 It would speed things up for me if you or someone else could take a
 quick pass over what remains here and fix the formatting and
 whitespace to be consistent with our general project style, and make
 the comment headers more consistent among the functions being
 added/modified.
 I think I did most of those - the function comments in numutils weren't
 consistent before - now its consistent with the unchanged pg_atoi.

 Thanks for reviewing/applying the first part,

Sure thing.  Thanks for taking time to do this - very nice speedup.
This part now committed, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-15 Thread Robert Haas
On Sun, Oct 31, 2010 at 5:41 PM, Andres Freund and...@anarazel.de wrote:
 While at it:

These words always make me a bit frightened when reviewing a patch,
since it's generally simpler if a single patch only does one thing.
However, in this case...

 * I remove the outdated
 -- NOTE: int[24] operators never check for over/underflow!
 -- Some of these answers are consequently numerically incorrect.
 warnings in the regressions tests.

...this part looks obviously OK, so I have committed it.

The rest is attached as a residual patch, except that I reverted this change:

 * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names confusing. Not
 sure if its worth it.

I notice that int8out isn't terribly consistent with int2out and
int4out, in that it does an extra copy.   Maybe that's justified given
the greater potential memory wastage, but I'm not certain.  One
approach might be to pick some threshold value and allocate a buffer
in one of two sizes based on how large the value is relative to that
cutoff.  But that might also be a stupid idea, not sure.

It would speed things up for me if you or someone else could take a
quick pass over what remains here and fix the formatting and
whitespace to be consistent with our general project style, and make
the comment headers more consistent among the functions being
added/modified.

I think the new regression tests look good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


custom-int248-string-conversion-routines.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-15 Thread Andres Freund
On Monday 15 November 2010 17:12:25 Robert Haas wrote:
 It would speed things up for me if you or someone else could take a
 quick pass over what remains here and fix the formatting and
 whitespace to be consistent with our general project style, and make
 the comment headers more consistent among the functions being
 added/modified.
will do.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-02 Thread Peter Eisentraut
On sön, 2010-10-31 at 22:41 +0100, Andres Freund wrote:
 * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names
 confusing. Not sure if its worth it.

Given that there are widely established functions atoi() and atol(),
naming the reverse itoa() and ltoa() makes a lot of sense.  The changed
versions read like string to ASCII.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-02 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On sön, 2010-10-31 at 22:41 +0100, Andres Freund wrote:
 * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names
 confusing. Not sure if its worth it.

 Given that there are widely established functions atoi() and atol(),
 naming the reverse itoa() and ltoa() makes a lot of sense.  The changed
 versions read like string to ASCII.

Yeah, and s32 makes no sense at all.  I think we should either leave
well enough alone (to avoid introducing a cross-version backpatch
hazard) or use pg_i32toa etc.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-01 Thread Andres Freund
On Monday 01 November 2010 04:04:51 Itagaki Takahiro wrote:
 On Mon, Nov 1, 2010 at 6:41 AM, Andres Freund and...@anarazel.de wrote:
  While looking at binary COPY performance I forgot to add BINARY and was a
  bit shocked to see printf that high in the profile...
  
  A change from 9192.476ms 5309.928ms seems to be pretty good indication
  that a change in that area is waranted given integer columns are quite
  ubiquous...
 
 Good optimization. Here is the result on my machine:
 * before: 13057.190 ms, 12429.092 ms, 12622.374 ms
 * after: 8261.688 ms, 8427.024 ms, 8622.370 ms
Thanks.

  * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names
  confusing. Not sure if its worth it.
 
 Agreed, but how about pg_i(16|32|64)toa? 'i' might be more popular than
 's'. See also
 http://msdn.microsoft.com/en-US/library/yakksftt(VS.100).aspx
I find itoa not as clear about signedness as stoa, but if you insist, I dont 
feel strongly about it.

 I have a couple of questions and comments:
 
 * Why did you change MAXINT8LEN + 1 to + 2 ?
   Are there possibility of buffer overflow in the current code?
 @@ -158,12 +159,9 @@ int8out(PG_FUNCTION_ARGS)
 - charbuf[MAXINT8LEN + 1];
 + charbuf[MAXINT8LEN + 2];
Argh. That should have never gotten into the patch. I was playing around with 
another optimization which would have needed more buffer space (but was quite a 
 
bit slower).

 * The buffer reordering seems a bit messy.
 //have to reorder the string, but not 0byte.
 I'd suggest to fill a fixed-size local buffer from right to left
 and copy it to the actual output.
Hm. 

while(bufstart  buf){
char swap = *bufstart;
*bufstart++ = *buf;
*buf-- = swap;
}

Is a bit cleaner maybe, but I dont see much point in putting it into its own 
function... But again, I don't feel strongly.


 * C++-style comments should be cleaned up.
Will clean up.

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-01 Thread Andres Freund
Hi, 
On Monday 01 November 2010 10:15:01 Andres Freund wrote:
 On Monday 01 November 2010 04:04:51 Itagaki Takahiro wrote:
  On Mon, Nov 1, 2010 at 6:41 AM, Andres Freund and...@anarazel.de wrote:
   While looking at binary COPY performance I forgot to add BINARY and was
   a bit shocked to see printf that high in the profile...
   
   A change from 9192.476ms 5309.928ms seems to be pretty good indication
   that a change in that area is waranted given integer columns are quite
   ubiquous...
   * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names
   confusing. Not sure if its worth it.
  Agreed, but how about pg_i(16|32|64)toa? 'i' might be more popular than
  's'. See also
  http://msdn.microsoft.com/en-US/library/yakksftt(VS.100).aspx
 I find itoa not as clear about signedness as stoa, but if you insist, I
 dont feel strongly about it.
Let whover commits it decide...

  * The buffer reordering seems a bit messy.
  //have to reorder the string, but not 0byte.
  I'd suggest to fill a fixed-size local buffer from right to left
  and copy it to the actual output.
 Is a bit cleaner maybe, but I dont see much point in putting it into its
 own function... But again, I don't feel strongly.
Using a seperate buffer cost nearly 500ms... So I only changed the comments 
there.

The only way I could think of to make it faster was to fill the buffer from the 
end and then return a pointer to the starting point in the buffer. The speed 
benefits are small (around 80ms) and it makes the interface more cumbersome...


Revised version attached - I will submit this to the next comittfest now.

Andres
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index c450333..5340052 100644
--- a/src/backend/utils/adt/int.c
+++ b/src/backend/utils/adt/int.c
@@ -74,7 +74,7 @@ int2out(PG_FUNCTION_ARGS)
 	int16		arg1 = PG_GETARG_INT16(0);
 	char	   *result = (char *) palloc(7);	/* sign, 5 digits, '\0' */
 
-	pg_itoa(arg1, result);
+	pg_s16toa(arg1, result);
 	PG_RETURN_CSTRING(result);
 }
 
@@ -189,7 +189,7 @@ int2vectorout(PG_FUNCTION_ARGS)
 	{
 		if (num != 0)
 			*rp++ = ' ';
-		pg_itoa(int2Array-values[num], rp);
+		pg_s16toa(int2Array-values[num], rp);
 		while (*++rp != '\0')
 			;
 	}
@@ -293,7 +293,7 @@ int4out(PG_FUNCTION_ARGS)
 	int32		arg1 = PG_GETARG_INT32(0);
 	char	   *result = (char *) palloc(12);	/* sign, 10 digits, '\0' */
 
-	pg_ltoa(arg1, result);
+	pg_s32toa(arg1, result);
 	PG_RETURN_CSTRING(result);
 }
 
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 894110d..4de2dfc 100644
--- a/src/backend/utils/adt/int8.c
+++ b/src/backend/utils/adt/int8.c
@@ -20,6 +20,7 @@
 #include funcapi.h
 #include libpq/pqformat.h
 #include utils/int8.h
+#include utils/builtins.h
 
 
 #define MAXINT8LEN		25
@@ -158,12 +159,9 @@ int8out(PG_FUNCTION_ARGS)
 {
 	int64		val = PG_GETARG_INT64(0);
 	char	   *result;
-	int			len;
 	char		buf[MAXINT8LEN + 1];
 
-	if ((len = snprintf(buf, MAXINT8LEN, INT64_FORMAT, val))  0)
-		elog(ERROR, could not format int8);
-
+	pg_s64toa(val, buf);
 	result = pstrdup(buf);
 	PG_RETURN_CSTRING(result);
 }
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 5f8083f..61b4728 100644
--- a/src/backend/utils/adt/numutils.c
+++ b/src/backend/utils/adt/numutils.c
@@ -3,8 +3,6 @@
  * numutils.c
  *	  utility functions for I/O of built-in numeric types.
  *
- *		integer:pg_atoi, pg_itoa, pg_ltoa
- *
  * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
  * Portions Copyright (c) 1994, Regents of the University of California
  *
@@ -109,27 +107,126 @@ pg_atoi(char *s, int size, int c)
 }
 
 /*
- *		pg_itoa			- converts a short int to its string represention
+ * pg_s32toa - convert a signed 16bit integer to a string representation
  *
- *		Note:
- *previously based on ~ingres/source/gutil/atoi.c
- *now uses vendor's sprintf conversion
+ * It doesnt seem worth implementing this separately.
  */
 void
-pg_itoa(int16 i, char *a)
+pg_s16toa(int16 i, char *a)
 {
-	sprintf(a, %hd, (short) i);
+	pg_s32toa((int32)i, a);
 }
 
+
 /*
- *		pg_ltoa			- converts a long int to its string represention
+ * pg_s32toa - convert a signed 32bit integer to a string representation
  *
- *		Note:
- *previously based on ~ingres/source/gutil/atoi.c
- *now uses vendor's sprintf conversion
+ * Its unfortunate to have this function twice - once for 32bit, once
+ * for 64bit, but incurring the cost of 64bit computation to 32bit
+ * platforms doesn't seem to be acceptable.
  */
 void
-pg_ltoa(int32 l, char *a)
-{
-	sprintf(a, %d, l);
+pg_s32toa(int32 value, char *buf){
+	char *bufstart = buf;
+	bool neg = false;
+
+	/*
+	 * Avoid problems with the most negative not being representable
+	 * as a positive number
+	 */
+	if(value == INT32_MIN)
+	{
+		memcpy(buf, -2147483648, 12);
+		return;
+	}
+	else if(value  0)
+	{
+		value = -value;
+		neg = true;
+	}
+
+	/* Build the string by computing the wanted string backwards. 

Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-11-01 Thread Andres Freund
On Tuesday 02 November 2010 01:37:43 Andres Freund wrote:
 Revised version attached - I will submit this to the next comittfest now.
Context diff attached this time...
diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index c450333..5340052 100644
*** a/src/backend/utils/adt/int.c
--- b/src/backend/utils/adt/int.c
*** int2out(PG_FUNCTION_ARGS)
*** 74,80 
  	int16		arg1 = PG_GETARG_INT16(0);
  	char	   *result = (char *) palloc(7);	/* sign, 5 digits, '\0' */
  
! 	pg_itoa(arg1, result);
  	PG_RETURN_CSTRING(result);
  }
  
--- 74,80 
  	int16		arg1 = PG_GETARG_INT16(0);
  	char	   *result = (char *) palloc(7);	/* sign, 5 digits, '\0' */
  
! 	pg_s16toa(arg1, result);
  	PG_RETURN_CSTRING(result);
  }
  
*** int2vectorout(PG_FUNCTION_ARGS)
*** 189,195 
  	{
  		if (num != 0)
  			*rp++ = ' ';
! 		pg_itoa(int2Array-values[num], rp);
  		while (*++rp != '\0')
  			;
  	}
--- 189,195 
  	{
  		if (num != 0)
  			*rp++ = ' ';
! 		pg_s16toa(int2Array-values[num], rp);
  		while (*++rp != '\0')
  			;
  	}
*** int4out(PG_FUNCTION_ARGS)
*** 293,299 
  	int32		arg1 = PG_GETARG_INT32(0);
  	char	   *result = (char *) palloc(12);	/* sign, 10 digits, '\0' */
  
! 	pg_ltoa(arg1, result);
  	PG_RETURN_CSTRING(result);
  }
  
--- 293,299 
  	int32		arg1 = PG_GETARG_INT32(0);
  	char	   *result = (char *) palloc(12);	/* sign, 10 digits, '\0' */
  
! 	pg_s32toa(arg1, result);
  	PG_RETURN_CSTRING(result);
  }
  
diff --git a/src/backend/utils/adt/int8.c b/src/backend/utils/adt/int8.c
index 894110d..4de2dfc 100644
*** a/src/backend/utils/adt/int8.c
--- b/src/backend/utils/adt/int8.c
***
*** 20,25 
--- 20,26 
  #include funcapi.h
  #include libpq/pqformat.h
  #include utils/int8.h
+ #include utils/builtins.h
  
  
  #define MAXINT8LEN		25
*** int8out(PG_FUNCTION_ARGS)
*** 158,169 
  {
  	int64		val = PG_GETARG_INT64(0);
  	char	   *result;
- 	int			len;
  	char		buf[MAXINT8LEN + 1];
  
! 	if ((len = snprintf(buf, MAXINT8LEN, INT64_FORMAT, val))  0)
! 		elog(ERROR, could not format int8);
! 
  	result = pstrdup(buf);
  	PG_RETURN_CSTRING(result);
  }
--- 159,167 
  {
  	int64		val = PG_GETARG_INT64(0);
  	char	   *result;
  	char		buf[MAXINT8LEN + 1];
  
! 	pg_s64toa(val, buf);
  	result = pstrdup(buf);
  	PG_RETURN_CSTRING(result);
  }
diff --git a/src/backend/utils/adt/numutils.c b/src/backend/utils/adt/numutils.c
index 5f8083f..61b4728 100644
*** a/src/backend/utils/adt/numutils.c
--- b/src/backend/utils/adt/numutils.c
***
*** 3,10 
   * numutils.c
   *	  utility functions for I/O of built-in numeric types.
   *
-  *		integer:pg_atoi, pg_itoa, pg_ltoa
-  *
   * Portions Copyright (c) 1996-2010, PostgreSQL Global Development Group
   * Portions Copyright (c) 1994, Regents of the University of California
   *
--- 3,8 
*** pg_atoi(char *s, int size, int c)
*** 109,135 
  }
  
  /*
!  *		pg_itoa			- converts a short int to its string represention
   *
!  *		Note:
!  *previously based on ~ingres/source/gutil/atoi.c
!  *now uses vendor's sprintf conversion
   */
  void
! pg_itoa(int16 i, char *a)
  {
! 	sprintf(a, %hd, (short) i);
  }
  
  /*
!  *		pg_ltoa			- converts a long int to its string represention
   *
!  *		Note:
!  *previously based on ~ingres/source/gutil/atoi.c
!  *now uses vendor's sprintf conversion
   */
  void
! pg_ltoa(int32 l, char *a)
! {
! 	sprintf(a, %d, l);
  }
--- 107,232 
  }
  
  /*
!  * pg_s32toa - convert a signed 16bit integer to a string representation
   *
!  * It doesnt seem worth implementing this separately.
   */
  void
! pg_s16toa(int16 i, char *a)
  {
! 	pg_s32toa((int32)i, a);
  }
  
+ 
  /*
!  * pg_s32toa - convert a signed 32bit integer to a string representation
   *
!  * Its unfortunate to have this function twice - once for 32bit, once
!  * for 64bit, but incurring the cost of 64bit computation to 32bit
!  * platforms doesn't seem to be acceptable.
   */
  void
! pg_s32toa(int32 value, char *buf){
! 	char *bufstart = buf;
! 	bool neg = false;
! 
! 	/*
! 	 * Avoid problems with the most negative not being representable
! 	 * as a positive number
! 	 */
! 	if(value == INT32_MIN)
! 	{
! 		memcpy(buf, -2147483648, 12);
! 		return;
! 	}
! 	else if(value  0)
! 	{
! 		value = -value;
! 		neg = true;
! 	}
! 
! 	/* Build the string by computing the wanted string backwards. */
! 	do
! 	{
! 		int32 remainder;
! 		int32 oldval = value;
! 		/*
! 		 * division by constants can be optimized by some modern
! 		 * compilers (including gcc). We could add the concrete,
! 		 * optimized, calculatation here to be fast at -O0 and/or
! 		 * other compilers... Not sure if its worth doing.
! 		 */
! 		value /= 10;
! 		remainder = oldval - value * 10;
! 		*buf++ = '0' + remainder;
! 	}
! 	while(value != 0);
! 
! 	if(neg)
! 		*buf++ = '-';
! 
! 	/* have to reorder the string, but not 0 byte */
! 	*buf-- = 0;
! 
! 	/* reverse string */

[HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-10-31 Thread Andres Freund
Hi,

While looking at binary COPY performance I forgot to add BINARY and was a bit 
shocked to see printf that high in the profile...

Setup:

CREATE TABLE convtest AS SELECT a.i ai, b.i bi, a.i*b.i aibi, (a.i*b.i)::text 
aibit FROM generate_series(1,1000) a(i), generate_series(1, 1) b(i);

Profile with an unmodified pg:

speedtest=# COPY convtest(ai,bi,aibi) TO '/dev/null';
COPY 1000
Time: 9192.476 ms

Profile:
 # Events: 9K cycles
 #
 # Overhead  Command  Shared ObjectSymbol
 #   ...  .  
 #
 18.24%  postgres_oldint  libc-2.12.1.so [.] __GI_vfprintf
  8.90%  postgres_oldint  libc-2.12.1.so [.] _itoa_word
  8.77%  postgres_oldint  postgres_oldint[.] CopyOneRowTo
  8.19%  postgres_oldint  libc-2.12.1.so [.] 
_IO_default_xsputn_internal
  3.67%  postgres_oldint  postgres_oldint[.] AllocSetAlloc
  3.38%  postgres_oldint  libc-2.12.1.so [.] __strchrnul
  3.24%  postgres_oldint  libc-2.12.1.so [.] __GI___vsprintf_chk
  2.87%  postgres_oldint  postgres_oldint[.] heap_deform_tuple
  2.49%  postgres_oldint  libc-2.12.1.so [.] _IO_old_init
  2.25%  postgres_oldint  libc-2.12.1.so [.] _IO_new_file_xsputn
  2.03%  postgres_oldint  postgres_oldint[.] appendBinaryStringInfo
  1.89%  postgres_oldint  postgres_oldint[.] heapgettup_pagemode
  1.86%  postgres_oldint  postgres_oldint[.] FunctionCall1
  1.85%  postgres_oldint  postgres_oldint[.] AllocSetCheck
  1.79%  postgres_oldint  postgres_oldint[.] enlargeStringInfo



Timing after replacing those sprintf(%li, ...) calls with a quickly coded 
handrolled itoa:

speedtest=# COPY convtest(ai,bi,aibi) TO '/dev/null';
COPY 1000
Time: 5309.928 ms

Profile:
 # Events: 5K cycles
 #
 # Overhead   Command  Shared Object   Symbol
 #     .  ...
 #
 14.96%  postgres  postgres   [.] pg_s32toa
 14.75%  postgres  postgres   [.] CopyOneRowTo
  5.97%  postgres  postgres   [.] AllocSetAlloc
  4.73%  postgres  postgres   [.] heap_deform_tuple
  4.54%  postgres  postgres   [.] AllocSetCheck
  4.01%  postgres  libc-2.12.1.so [.] _IO_new_file_xsputn
  3.59%  postgres  postgres   [.] heapgettup_pagemode
  3.32%  postgres  postgres   [.] enlargeStringInfo
  3.25%  postgres  postgres   [.] appendBinaryStringInfo
  2.87%  postgres  postgres   [.] CopySendChar
  2.65%  postgres  postgres   [.] FunctionCall1
  2.44%  postgres  postgres   [.] int4out
  2.38%  postgres  [kernel.kallsyms]  [k] copy_user_generic_string
  2.30%  postgres  postgres   [.] AllocSetReset
  2.06%  postgres  postgres   [.] pg_server_to_client
  1.89%  postgres  libc-2.12.1.so [.] __GI_memset
  1.87%  postgres  libc-2.12.1.so [.] memcpy



A change from 9192.476ms 5309.928ms seems to be pretty good indication that a 
change in that area is waranted given integer columns are quite ubiquous...

While at it:

* I remove the outdated
-- NOTE: int[24] operators never check for over/underflow!
-- Some of these answers are consequently numerically incorrect.
warnings in the regressions tests.

* I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names confusing. Not 
sure if its worth it.

* I added some tests for the border cases of 2^31-1 / -2^31

The 'after' profile shows obvious room for furhter improvement, but on a quick 
look I couldn't think of anything. Any Ideas?



Andres


PS: Oh, thats with assertions, but the results are comparable without them 
(8765.796ms vs 4561.673ms)
From 328ae1e35988f8670323b67167256e00cb5cfde7 Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Sun, 31 Oct 2010 21:52:08 +0100
Subject: [PATCH] Implement custom int[248]-string conversion routines out of speed reasons.

While at it:
* Add a few tests for int[248]out
* remove some old comments about int[24] ops not checking for overflow
* rename pg_[il]toa to pg_s(16|32)toa for clarities sake.
---
 src/backend/utils/adt/int.c|6 +-
 src/backend/utils/adt/int8.c   |8 +--
 src/backend/utils/adt/numutils.c   |  113 +++-
 src/include/utils/builtins.h   |6 +-
 src/test/regress/expected/int2.out |   15 -
 src/test/regress/expected/int4.out |   15 -
 src/test/regress/expected/int8.out |   13 
 src/test/regress/regress.c |2 +-
 src/test/regress/sql/int2.sql  |6 +-
 src/test/regress/sql/int4.sql  |6 +-
 src/test/regress/sql/int8.sql  |4 +
 11 files changed, 160 insertions(+), 34 deletions(-)

diff --git a/src/backend/utils/adt/int.c b/src/backend/utils/adt/int.c
index c450333..5340052 100644
--- a/src/backend/utils/adt/int.c
+++ 

Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-10-31 Thread Itagaki Takahiro
On Mon, Nov 1, 2010 at 6:41 AM, Andres Freund and...@anarazel.de wrote:
 While looking at binary COPY performance I forgot to add BINARY and was a bit
 shocked to see printf that high in the profile...

 A change from 9192.476ms 5309.928ms seems to be pretty good indication that a
 change in that area is waranted given integer columns are quite ubiquous...

Good optimization. Here is the result on my machine:
* before: 13057.190 ms, 12429.092 ms, 12622.374 ms
* after: 8261.688 ms, 8427.024 ms, 8622.370 ms

 * I renamed pg_[il]toa to pg_s(16|32|64)toa - I found the names confusing. Not
 sure if its worth it.

Agreed, but how about pg_i(16|32|64)toa? 'i' might be more popular than 's'.
See also http://msdn.microsoft.com/en-US/library/yakksftt(VS.100).aspx

I have a couple of questions and comments:

* Why did you change MAXINT8LEN + 1 to + 2 ?
  Are there possibility of buffer overflow in the current code?
@@ -158,12 +159,9 @@ int8out(PG_FUNCTION_ARGS)
-   charbuf[MAXINT8LEN + 1];
+   charbuf[MAXINT8LEN + 2];

* The buffer reordering seems a bit messy.
//have to reorder the string, but not 0byte.
I'd suggest to fill a fixed-size local buffer from right to left
and copy it to the actual output.

* C++-style comments should be cleaned up.

-- 
Itagaki Takahiro

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Custom code int(32|64) = text conversions out of performance reasons

2010-10-31 Thread Robert Haas
On Sun, Oct 31, 2010 at 11:04 PM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 On Mon, Nov 1, 2010 at 6:41 AM, Andres Freund and...@anarazel.de wrote:
 While looking at binary COPY performance I forgot to add BINARY and was a bit
 shocked to see printf that high in the profile...

 A change from 9192.476ms 5309.928ms seems to be pretty good indication that a
 change in that area is waranted given integer columns are quite ubiquous...

 Good optimization. Here is the result on my machine:
 * before: 13057.190 ms, 12429.092 ms, 12622.374 ms
 * after: 8261.688 ms, 8427.024 ms, 8622.370 ms

Wow.  Nice stuff, Andres!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers