Re: [PATCHES] optimize md5_text

2005-02-23 Thread Neil Conway
Neil Conway wrote:
This patch optimizes the md5_text() function (which is used to implement 
the md5() SQL-level function).
Applied.
-Neil
---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
 subscribe-nomail command to [EMAIL PROTECTED] so that your
 message can get through to the mailing list cleanly


Re: [PATCHES] optimize md5_text

2005-02-23 Thread John Hansen
Will this be backpatched to 8.0? 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Neil Conway
 Sent: Thursday, February 24, 2005 9:46 AM
 To: pgsql-patches
 Subject: Re: [PATCHES] optimize md5_text
 
 Neil Conway wrote:
  This patch optimizes the md5_text() function (which is used to 
  implement the md5() SQL-level function).
 
 Applied.
 
 -Neil
 
 ---(end of 
 broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so 
 that your
   message can get through to the mailing list cleanly
 
 

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] optimize md5_text

2005-02-23 Thread Neil Conway
John Hansen wrote:
Will this be backpatched to 8.0?
Since it's just a performance optimization, I wasn't planning to 
backpatch it, no. I suppose the OOM fix might be worth backporting, 
although that is more of a theoretical problem than something I would 
expect to actually occur in practice.

-Neil
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


[PATCHES] optimize md5_text

2005-02-22 Thread Neil Conway
This patch optimizes the md5_text() function (which is used to implement 
the md5() SQL-level function). The old code did the following:

1. de-toast the datum
2. convert it to a cstring via textout()
3. get the length of the cstring via strlen()
Since we are treating the datum context as a blob of binary data, the 
latter two steps are unnecessary. Once the data has been detoasted, we 
can just use it as-is, and derive its length from the varlena metadata.

This patch improves some run-of-the-mill md5() computations by just 
under 10% in my limited tests, and passes the regression tests.

I also noticed that md5_text() wasn't checking the return value of 
md5_hash(); encountering OOM at precisely the right moment could result 
in returning a random md5 hash. This patch corrects that. A better fix 
would be to make md5_hash() only return on success (and/or allocate via 
palloc()), but since it's used in the frontend as well I don't see an 
easy way to do that.

Barring any objections, I'll apply this to HEAD tomorrow.
-Neil
Index: src/backend/libpq/md5.c
===
RCS file: /var/lib/cvs/pgsql/src/backend/libpq/md5.c,v
retrieving revision 1.27
diff -c -r1.27 md5.c
*** src/backend/libpq/md5.c	31 Dec 2004 21:59:50 -	1.27
--- src/backend/libpq/md5.c	23 Feb 2005 07:11:43 -
***
*** 289,296 
   *		  characters.  you thus need to provide an array
   *		  of 33 characters, including the trailing '\0'.
   *
!  *	RETURNS		  0 on failure (out of memory for internal buffers) or
!  *  non-zero on success.
   *
   *	STANDARDS	  MD5 is described in RFC 1321.
   *
--- 289,296 
   *		  characters.  you thus need to provide an array
   *		  of 33 characters, including the trailing '\0'.
   *
!  *	RETURNS		  false on failure (out of memory for internal buffers) or
!  *  true on success.
   *
   *	STANDARDS	  MD5 is described in RFC 1321.
   *
Index: src/backend/utils/adt/varlena.c
===
RCS file: /var/lib/cvs/pgsql/src/backend/utils/adt/varlena.c,v
retrieving revision 1.118
diff -c -r1.118 varlena.c
*** src/backend/utils/adt/varlena.c	31 Dec 2004 22:01:22 -	1.118
--- src/backend/utils/adt/varlena.c	23 Feb 2005 07:06:44 -
***
*** 2310,2325 
  Datum
  md5_text(PG_FUNCTION_ARGS)
  {
! 	char	   *buff = PG_TEXT_GET_STR(PG_GETARG_TEXT_P(0));
! 	size_t		len = strlen(buff);
  	char	   *hexsum;
  	text	   *result_text;
  
  	/* leave room for the terminating '\0' */
  	hexsum = (char *) palloc(MD5_HASH_LEN + 1);
  
  	/* get the hash result */
! 	md5_hash((void *) buff, len, hexsum);
  
  	/* convert to text and return it */
  	result_text = PG_STR_GET_TEXT(hexsum);
--- 2310,2331 
  Datum
  md5_text(PG_FUNCTION_ARGS)
  {
! 	text	   *in_text = PG_GETARG_TEXT_P(0);
! 	size_t		len;
  	char	   *hexsum;
  	text	   *result_text;
  
+ 	/* Calculate the length of the buffer using varlena metadata */
+ 	len = VARSIZE(in_text) - VARHDRSZ;
+ 
  	/* leave room for the terminating '\0' */
  	hexsum = (char *) palloc(MD5_HASH_LEN + 1);
  
  	/* get the hash result */
! 	if (md5_hash(VARDATA(in_text), len, hexsum) == false)
! 		ereport(ERROR,
! (errcode(ERRCODE_OUT_OF_MEMORY),
!  errmsg(out of memory)));
  
  	/* convert to text and return it */
  	result_text = PG_STR_GET_TEXT(hexsum);

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])