Re: [HACKERS] 64-bit pgbench V2

2011-02-06 Thread Bruce Momjian

What happened to this idea/patch?

---

Greg Smith wrote:
 Tom Lane wrote:
  Please choose a way that doesn't introduce new portability assumptions.
  The backend gets along fine without strtoll, and I don't see why pgbench
  should have to require it.

 
 Funny you should mention this...it turns out there is some code already 
 there, I just didn't notice it before because it's only the unsigned 
 64-bit strtoul used, not the signed one I was looking for, and it's only 
 called in one place I didn't previously check.  
 src/interfaces/ecpg/ecpglib/data.c does this:
 
 *((unsigned long long int *) (var + offset * act_tuple)) = 
 strtoull(pval, scan_length, 10);
 
 The appropriate autoconf magic was in the code all along for both 
 versions, so my bad not noticing it until now.  It even transparently 
 remaps the BSD-ism of calling it strtoq.
 
 I suspect that this alone isn't sufficient to make the code I'm trying 
 to wedge into pgbench to always work on the platforms I consider must 
 haves, because of the weird names like _strtoi64 that Windows uses:  
 http://msdn.microsoft.com/en-us/library/h80404d3(v=VS.80).aspx  In fact, 
 I wouldn't be surprised to discover the ECPG code above doesn't do the 
 right thing if compiled with a 64-bit MSVC version.  Don't expect that's 
 a popular combination to explicitly test in a way that hits the code 
 path where this line is at.
 
 The untested (I need to setup for building Windows to really confirm 
 this works) next patch attempt I've attached does what I think is the 
 right general sort of thing here.  It extends the autoconf remapping 
 that was already being done to include the second variation on how the 
 function needed can be named in a MSVC build.  This might improve the 
 ECPG compatibility issue I theorize could be there on that platform.  
 Given the autoconf stuff and use of the unsigned version was already a 
 dependency, I'd rather improve that code (so it's more obvious when it 
 is broken) than do the refactoring work suggested to re-use the server's 
 internal 64-bit parsing method instead.  I could split this into two 
 patches instead--add 64-bit strtoull/strtoll support for MSVC on the 
 presumption it's actually broken now (possibly wrong on my part) and 
 make pgbench use 64-bit values--but it's not so complicated as one.
 
 I expect there is almost zero overlap between needs pgbench setshell to 
 return 32 bit return values and not on a platform with a working 
 64-bit strtoull variation.  What I did to hedge against that was add a 
 little check to pgbench that lets you confirm whether setshell lines are 
 limited to 32 bits or not, depending on whether the appropriate function 
 was found.  It tries to fall back to the existing strtol in that case, 
 and I've put a note when that happens (and matching documentation to 
 look for it) into the debug output of the program.
 
 I'll continue with testing work here, but what's attached is now the 
 first form I think this could potentially be committed in once it's 
 known to be free of obvious bugs (testing at this database scale takes 
 forever).  I can revisit not using the library function instead if Tom 
 or someone else really opposes this new approach.  Given most of the 
 autoconf bits are already there and the limited number of platforms 
 where this is a problem, I think there's little gain for doing that work 
 though.
 
 Style/functional suggestions appreciated.
 
 -- 
 Greg Smith  2ndQuadrant US  Baltimore, MD
 PostgreSQL Training, Services and Support
 g...@2ndquadrant.com   www.2ndQuadrant.us
 


 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit pgbench V2

2011-02-06 Thread Euler Taveira de Oliveira

Em 06-02-2011 13:09, Bruce Momjian escreveu:


What happened to this idea/patch?


I refactored the patch [1] to not depend on strtoll.


[1] http://archives.postgresql.org/message-id/4d2cccd9@timbira.com


--
  Euler Taveira de Oliveira
  http://www.timbira.com/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit pgbench V2

2010-07-12 Thread Greg Smith

Tom Lane wrote:

Please choose a way that doesn't introduce new portability assumptions.
The backend gets along fine without strtoll, and I don't see why pgbench
should have to require it.
  


Funny you should mention this...it turns out there is some code already 
there, I just didn't notice it before because it's only the unsigned 
64-bit strtoul used, not the signed one I was looking for, and it's only 
called in one place I didn't previously check.  
src/interfaces/ecpg/ecpglib/data.c does this:


*((unsigned long long int *) (var + offset * act_tuple)) = 
strtoull(pval, scan_length, 10);


The appropriate autoconf magic was in the code all along for both 
versions, so my bad not noticing it until now.  It even transparently 
remaps the BSD-ism of calling it strtoq.


I suspect that this alone isn't sufficient to make the code I'm trying 
to wedge into pgbench to always work on the platforms I consider must 
haves, because of the weird names like _strtoi64 that Windows uses:  
http://msdn.microsoft.com/en-us/library/h80404d3(v=VS.80).aspx  In fact, 
I wouldn't be surprised to discover the ECPG code above doesn't do the 
right thing if compiled with a 64-bit MSVC version.  Don't expect that's 
a popular combination to explicitly test in a way that hits the code 
path where this line is at.


The untested (I need to setup for building Windows to really confirm 
this works) next patch attempt I've attached does what I think is the 
right general sort of thing here.  It extends the autoconf remapping 
that was already being done to include the second variation on how the 
function needed can be named in a MSVC build.  This might improve the 
ECPG compatibility issue I theorize could be there on that platform.  
Given the autoconf stuff and use of the unsigned version was already a 
dependency, I'd rather improve that code (so it's more obvious when it 
is broken) than do the refactoring work suggested to re-use the server's 
internal 64-bit parsing method instead.  I could split this into two 
patches instead--add 64-bit strtoull/strtoll support for MSVC on the 
presumption it's actually broken now (possibly wrong on my part) and 
make pgbench use 64-bit values--but it's not so complicated as one.


I expect there is almost zero overlap between needs pgbench setshell to 
return 32 bit return values and not on a platform with a working 
64-bit strtoull variation.  What I did to hedge against that was add a 
little check to pgbench that lets you confirm whether setshell lines are 
limited to 32 bits or not, depending on whether the appropriate function 
was found.  It tries to fall back to the existing strtol in that case, 
and I've put a note when that happens (and matching documentation to 
look for it) into the debug output of the program.


I'll continue with testing work here, but what's attached is now the 
first form I think this could potentially be committed in once it's 
known to be free of obvious bugs (testing at this database scale takes 
forever).  I can revisit not using the library function instead if Tom 
or someone else really opposes this new approach.  Given most of the 
autoconf bits are already there and the limited number of platforms 
where this is a problem, I think there's little gain for doing that work 
though.


Style/functional suggestions appreciated.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us

diff --git a/configure b/configure
index f6b891e..a5371ba 100755
--- a/configure
+++ b/configure
@@ -21624,7 +21624,8 @@ fi
 
 
 
-for ac_func in strtoll strtoq
+
+for ac_func in strtoll strtoq _strtoi64
 do
 as_ac_var=`$as_echo ac_cv_func_$ac_func | $as_tr_sh`
 { $as_echo $as_me:$LINENO: checking for $ac_func 5
@@ -21726,7 +21727,8 @@ done
 
 
 
-for ac_func in strtoull strtouq
+
+for ac_func in strtoull strtouq _strtoui64
 do
 as_ac_var=`$as_echo ac_cv_func_$ac_func | $as_tr_sh`
 { $as_echo $as_me:$LINENO: checking for $ac_func 5
diff --git a/configure.in b/configure.in
index 0a529fa..cca6453 100644
--- a/configure.in
+++ b/configure.in
@@ -1385,8 +1385,8 @@ if test x$pgac_cv_var_int_optreset = xyes; then
   AC_DEFINE(HAVE_INT_OPTRESET, 1, [Define to 1 if you have the global variable 'int optreset'.])
 fi
 
-AC_CHECK_FUNCS([strtoll strtoq], [break])
-AC_CHECK_FUNCS([strtoull strtouq], [break])
+AC_CHECK_FUNCS([strtoll strtoq _strtoi64], [break])
+AC_CHECK_FUNCS([strtoull strtouq _strtoui64], [break])
 
 # Check for one of atexit() or on_exit()
 AC_CHECK_FUNCS(atexit, [],
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index c830dee..541510b 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -56,6 +56,15 @@
 #include sys/resource.h		/* for getrlimit */
 #endif
 
+/*
+ * If this platform doesn't have a 64-bit strtoll, fall back to
+ * using the 32-bit version.
+ */
+#ifndef HAVE_STRTOLL
+#define strtoll strtol
+#define LIMITED_STRTOLL
+#endif
+
 #ifndef 

Re: [HACKERS] 64-bit pgbench V2

2010-07-06 Thread Greg Smith

Robert Haas wrote:

It doesn't seem very palatable to have multiple handwritten integer
parsers floating around the code base either.  Maybe we should try to
standardize something and ship it in src/port, or somesuch


I was considering at one point making two trips through strtol, each 
allowed to gobble 10 characters, then combining the two--just to cut 
down a little bit on the roll your own parser aspects here.  I hadn't 
really considered how the main server does this job though.  If there's 
something reasonable to expose by refactoring some code that's already 
there, I could take a stab at that.  I'm not exactly sure where the 
integer parsing code in the server that would be appropriate is to break 
out is at though.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit pgbench V2

2010-07-06 Thread Robert Haas
On Tue, Jul 6, 2010 at 11:01 AM, Greg Smith g...@2ndquadrant.com wrote:
 Robert Haas wrote:

 It doesn't seem very palatable to have multiple handwritten integer
 parsers floating around the code base either.  Maybe we should try to
 standardize something and ship it in src/port, or somesuch

 I was considering at one point making two trips through strtol, each allowed
 to gobble 10 characters, then combining the two--just to cut down a little
 bit on the roll your own parser aspects here.  I hadn't really considered
 how the main server does this job though.  If there's something reasonable
 to expose by refactoring some code that's already there, I could take a stab
 at that.  I'm not exactly sure where the integer parsing code in the server
 that would be appropriate is to break out is at though.

Take a look at int8in.  It's got some backend-specific stuff in it ATM
but maybe it would be reasonable to try to fact that out somehow.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] 64-bit pgbench V2

2010-07-05 Thread Greg Smith
Attached is an updated second rev of the patch I sent a few months ago, 
to expand pgbench to support database scales larger than around 
4,294--where the 32-bit integer for the account number overflows in the 
current version.  The current limit makes for about a 60GB database.  
Last week I ran this on a system with 72GB of RAM, which are already 
quite common, and wasn't able to get a test that didn't fit in RAM.  
Without a bug fix here I am concerned that pgbench will ship in 9.0 
already obsolete for the generation of hardware is it going to be 
deployed on.


The main tricky part was figuring how to convert the \setshell 
implementation.  That uses strtol to parse the number that should have 
been returned by the shell call.  It turns out there are a stack of ways 
to do something similar but return 64 bits instead:


* strtoll is defined by ISO C99
* strtoq was used on some earlier BSD systems
* MSVC has _strtoi64 for signed and _strtoui64 for unsigned 64bit integers

According to the glib docs at 
http://www.gnu.org/software/gnulib/manual/html_node/strtoll.html , 
stroll is missing on HP-UX 11, OSF/1 5.1, Interix 3.5, so one of the 
HP-UX boxes might be a useful testbed for what works on a trickier platform.


For prototype purposes, I wrote the patch to include some minimal logic 
to map the facility available to strtoint64, falling back to the 32-bit 
strtol if that's the best available.  There are three ways I could 
forsee this going:


1) Keep this ugly bit of code isolated to pgbench
2) Move it to src/include/c.h where the other 64-bit int abstraction is done
3) Push the problem toward autoconf

I don't have a clear argument for or against those individual options, 
they all seem reasonable from some perspectives.


The only open issue I'm not sure about is whether the situation where 
the code falls back to 32-bits should be documented, or even a warning 
produced if you create something at a scale without some strtoll 
available.  Given that it only impacts the \setrandom case, it's not 
really a disaster that it might not work, so long as there's 
documentation explaining the potential limitations.  I'll write those if 
necessary, but I think that some testing on known tricky platforms that 
I don't have setup here is the best next step, so I'm looking for 
feedback on that.


--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
g...@2ndquadrant.com   www.2ndQuadrant.us

diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index c830dee..e6621e2 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -60,6 +60,20 @@
 #define INT64_MAX	INT64CONST(0x7FFF)
 #endif
 
+/* Try to find a string to 64-bit integer implementation */
+#ifndef strtoint64
+#ifdef strtoll
+#define strtoint64	strtoll
+#elif defined(_strtoi64)
+#define strtoi64	_strtoi64
+#elif defined(strtoq)
+#define strtoi64	strtoq
+#else
+/* Fall back to 32 bit version if no 64-bit version is available */
+#define strtoi64	strtol
+#endif
+#endif
+
 /*
  * Multi-platform pthread implementations
  */
@@ -310,14 +324,14 @@ usage(const char *progname)
 }
 
 /* random number generator: uniform distribution from min to max inclusive */
-static int
-getrand(int min, int max)
+static int64
+getrand(int64 min, int64 max)
 {
 	/*
 	 * Odd coding is so that min and max have approximately the same chance of
 	 * being selected as do numbers between them.
 	 */
-	return min + (int) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
+	return min + (int64) (((max - min + 1) * (double) random()) / (MAX_RANDOM_VALUE + 1.0));
 }
 
 /* call PQexec() and exit() on failure */
@@ -627,7 +641,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
 	FILE	   *fp;
 	char		res[64];
 	char	   *endptr;
-	int			retval;
+	int64			retval;
 
 	/*
 	 * Join arguments with whilespace separaters. Arguments starting with
@@ -700,7 +714,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
 	}
 
 	/* Check whether the result is an integer and assign it to the variable */
-	retval = (int) strtol(res, endptr, 10);
+	retval = strtoll(res, endptr, 19);
 	while (*endptr != '\0'  isspace((unsigned char) *endptr))
 		endptr++;
 	if (*res == '\0' || *endptr != '\0')
@@ -708,7 +722,7 @@ runShellCommand(CState *st, char *variable, char **argv, int argc)
 		fprintf(stderr, %s: must return an integer ('%s' returned)\n, argv[0], res);
 		return false;
 	}
-	snprintf(res, sizeof(res), %d, retval);
+	snprintf(res, sizeof(res), INT64_FORMAT, retval);
 	if (!putVariable(st, setshell, variable, res))
 		return false;
 
@@ -956,8 +970,9 @@ top:
 		if (pg_strcasecmp(argv[0], setrandom) == 0)
 		{
 			char	   *var;
-			int			min,
-		max;
+			int64		min,
+		max,
+		rand;
 			char		res[64];
 
 			if (*argv[2] == ':')
@@ -997,15 +1012,16 @@ top:
 
 			if (max  min || max  MAX_RANDOM_VALUE)
 			{
-fprintf(stderr, %s: invalid maximum number %d\n, 

Re: [HACKERS] 64-bit pgbench V2

2010-07-05 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 The main tricky part was figuring how to convert the \setshell 
 implementation.  That uses strtol to parse the number that should have 
 been returned by the shell call.  It turns out there are a stack of ways 
 to do something similar but return 64 bits instead:

Please choose a way that doesn't introduce new portability assumptions.
The backend gets along fine without strtoll, and I don't see why pgbench
should have to require it.

(BTW, I don't actually believe that the proposed code works at all,
since in general strtoll or other variants aren't going to be macros,
but plain functions.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 64-bit pgbench V2

2010-07-05 Thread Robert Haas
On Mon, Jul 5, 2010 at 8:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Smith g...@2ndquadrant.com writes:
 The main tricky part was figuring how to convert the \setshell
 implementation.  That uses strtol to parse the number that should have
 been returned by the shell call.  It turns out there are a stack of ways
 to do something similar but return 64 bits instead:

 Please choose a way that doesn't introduce new portability assumptions.
 The backend gets along fine without strtoll, and I don't see why pgbench
 should have to require it.

It doesn't seem very palatable to have multiple handwritten integer
parsers floating around the code base either.  Maybe we should try to
standardize something and ship it in src/port, or somesuch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers