Re: [HACKERS] gaussian distribution pgbench -- splits Bv6

2014-07-25 Thread Mitsumasa KONDO
Thank you for revising the patch! I have confirmed that it looks fine.

I think our latest patch addresses all of the community's comments,
so it is now really ready for committer.

Best regards,
--
Mitsumasa KONDO


Re: [HACKERS] gaussian distribution pgbench -- splits Bv6

2014-07-24 Thread Fabien COELHO



> Thank you for your great documentation and fixes!
> They are very helpful for understanding our feature.


Hopefully it will help make it, or part of it, pass through.


> I added two features in gauss_B_4.patch.
>
> 1) Add gaussianProbability() function
> It mirrors exponentialProbability(), and the feature behaves the same as
> before.


Ok, that is better for readability and easy reuse.


> 2) Add the max/min percent of the range to the output
> This is almost the same as the --exponential option's output. However, the
> max percent of the range is at the center of the distribution
> and the min percent of the range is at its outermost edge.
> Here is an output example:


Ok, good, that makes it homogeneous with the exponential case.


> + pgbench_account's aid selected with a truncated gaussian distribution
> + standard deviation threshold: 5.0
> + decile percents: 0.0% 0.1% 2.1% 13.6% 34.1% 34.1% 13.6% 2.1% 0.1% 0.0%
> + probability of max/min percent of the range: 4.0% 0.0%



> And I added an explanation of this to the documentation.


This is a definite improvement. I tested these minor changes and 
everything seems ok.


Attached is a very small update. One word removed from the doc, and one 
redundant declaration removed from the code.


I also have a problem with assert & Assert.  I finally figured out that 
Assert is not compiled in by default, thus it is generally ignored. So it 
is more for debugging purposes when activated than for guarding against 
some unexpected user errors.


--
Fabien.
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index e07206a..0247a05 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -41,6 +41,7 @@
 #include <math.h>
 #include <signal.h>
 #include <sys/time.h>
+#include <assert.h>
 #ifdef HAVE_SYS_SELECT_H
 #include <sys/select.h>
 #endif
@@ -173,6 +174,11 @@ bool		is_connect;			/* establish connection for each transaction */
 bool		is_latencies;		/* report per-command latencies */
 int			main_pid;			/* main process id used in log filename */
 
+/* gaussian/exponential distribution tests */
+double		threshold;  /* threshold for gaussian or exponential */
+bool		use_gaussian = false;
+bool		use_exponential = false;
+
 char	   *pghost = "";
 char	   *pgport = "";
 char	   *login = NULL;
@@ -294,11 +300,11 @@ static int	num_commands = 0;	/* total number of Command structs */
 static int	debug = 0;			/* debug flag */
 
 /* default scenario */
-static char *tpc_b = {
+static char *tpc_b_fmt = {
 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
-	"\\setrandom aid 1 :naccounts\n"
+	"\\setrandom aid 1 :naccounts%s\n"
 	"\\setrandom bid 1 :nbranches\n"
 	"\\setrandom tid 1 :ntellers\n"
 	"\\setrandom delta -5000 5000\n"
@@ -312,11 +318,11 @@ static char *tpc_b = {
 };
 
 /* -N case */
-static char *simple_update = {
+static char *simple_update_fmt = {
 	"\\set nbranches " CppAsString2(nbranches) " * :scale\n"
 	"\\set ntellers " CppAsString2(ntellers) " * :scale\n"
 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
-	"\\setrandom aid 1 :naccounts\n"
+	"\\setrandom aid 1 :naccounts%s\n"
 	"\\setrandom bid 1 :nbranches\n"
 	"\\setrandom tid 1 :ntellers\n"
 	"\\setrandom delta -5000 5000\n"
@@ -328,9 +334,9 @@ static char *simple_update = {
 };
 
 /* -S case */
-static char *select_only = {
+static char *select_only_fmt = {
 	"\\set naccounts " CppAsString2(naccounts) " * :scale\n"
-	"\\setrandom aid 1 :naccounts\n"
+	"\\setrandom aid 1 :naccounts%s\n"
 	"SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n"
 };
 
@@ -377,6 +383,8 @@ usage(void)
 		 "  -v, --vacuum-all         vacuum all four standard tables before tests\n"
 		 "  --aggregate-interval=NUM aggregate data over NUM seconds\n"
 		 "  --sampling-rate=NUM      fraction of transactions to log (e.g. 0.01 for 1%%)\n"
+		 "  --exponential=NUM        exponential distribution with NUM threshold parameter\n"
+		 "  --gaussian=NUM           gaussian distribution with NUM threshold parameter\n"
 		   "\nCommon options:\n"
 		 "  -d, --debug              print debugging output\n"
 	"  -h, --host=HOSTNAME      database server host or socket directory\n"
@@ -2329,6 +2337,30 @@ process_builtin(char *tb)
 	return my_commands;
 }
 
+/*
+ * compute the probability of the truncated gaussian random generation
+ * to draw values in the i-th slot of the range.
+ */
+static double gaussianProbability(int i, int slots, double threshold)
+{
+	assert(1 <= i && i <= slots);
+	return (0.50 * (erf (threshold * (1.0 - 1.0 / slots * (2.0 * i - 2.0)) / sqrt(2.0)) -
+		erf (threshold * (1.0 - 1.0 / slots * 2.0 * i) / sqrt(2.0))) /
+		erf (threshold / sqrt(2.0)));
+}
+
+/*
+ * compute the probability of the truncated exponential random generation
+ * to draw values in the i-th slot of the range.
+ */
+static double exponentialProbability(int i, int slots, double threshold)
+{
+	assert(1 <= i && i <= slots);
+	return (exp(- threshold * (i - 1) / slots) - exp(- threshold * i / slots)) /
+		

Re: [HACKERS] gaussian distribution pgbench -- splits Bv6

2014-07-24 Thread Alvaro Herrera
Fabien COELHO wrote:

> I also have a problem with assert & Assert.  I finally figured out
> that Assert is not compiled in by default, thus it is generally
> ignored. So it is more for debugging purposes when activated than
> for guarding against some unexpected user errors.

Yes, Assert() is for debugging during development.  If you need to deal
with user error, use regular if () and exit() as appropriate (ereport()
in the backend).  We mostly avoid assert() in our own code.
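
To illustrate the advice above, here is a rough sketch in C of checking a user-supplied threshold with a regular if () and exit() rather than an assertion. Both function names are hypothetical and do not come from the patch, and the "must be positive" rule is an assumption made for this example only. Standard C assert() is compiled out when NDEBUG is defined, whereas the check below always runs.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse a threshold given on the command line.
 * Returns false for malformed, out-of-range, or non-positive input.
 * The "must be positive" rule is an assumption made for this sketch.
 */
bool
parse_threshold(const char *arg, double *result)
{
	char	   *end;
	double		value;

	errno = 0;
	value = strtod(arg, &end);
	if (end == arg || *end != '\0' || errno == ERANGE)
		return false;			/* empty, trailing junk, or overflow */
	if (!(value > 0.0))
		return false;			/* zero, negative, or NaN */
	*result = value;
	return true;
}

/* User error is reported and the program exits; no assertion involved. */
double
threshold_or_exit(const char *arg)
{
	double		threshold;

	if (!parse_threshold(arg, &threshold))
	{
		fprintf(stderr, "invalid threshold: \"%s\"\n", arg);
		exit(1);
	}
	return threshold;
}
```

A frontend program could call threshold_or_exit(optarg) while processing its options. In the backend, the same check would report the error with ereport() instead of exiting.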

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers