[HACKERS] About the portal in postgres

2014-03-14 Thread Tanmay Deshpande
My doubt is: when the query enters into a portal, does it stay in the portal
till the final execution? i.e., do the further function calls such as
DefineRelation, create_heap_with_catalog, etc. for a CREATE query occur
inside the portal?


Re: [HACKERS] gaussian distribution pgbench

2014-03-14 Thread Fabien COELHO


Well, when we set '--gaussian=NUM' or '--exponential=NUM' on the command line,
we can see the access probability of the top N records in the final output.
This output looks like the following,


Indeed. I had forgotten this point. This is significant information that 
I would not like to lose.


This feature helps the user understand the bias of the distribution when
tuning the threshold parameter.
Without this feature, it is difficult to understand the distribution of the
access pattern, and it cannot be realized in a custom script, because the
range of the distribution (min, max, and SQL pattern) is unknown in a custom
script. So I think the present UI is not bad and should not change.


Ok. I agree with this argument.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] About the portal in postgres

2014-03-14 Thread Atri Sharma
On Fri, Mar 14, 2014 at 12:29 PM, Tanmay Deshpande tp.deshpand...@gmail.com
 wrote:

 My doubt is: when the query enters into a portal, does it stay in the
 portal till the final execution? i.e., do the further function calls such
 as DefineRelation, create_heap_with_catalog, etc. for a CREATE query occur
 inside the portal?



What do you mean by 'Portal'?

Regards,

Atri


Re: [HACKERS] About the portal in postgres

2014-03-14 Thread Amit Kapila
On Fri, Mar 14, 2014 at 12:29 PM, Tanmay Deshpande
tp.deshpand...@gmail.com wrote:
 My doubt is: when the query enters into a portal, does it stay in the portal
 till the final execution?

Yes, a portal represents the execution state of a query; after the
optimizer creates the plan, a portal is used in the execution of the query.
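
For illustration, every SQL-level cursor is implemented as a portal, so the
lifecycle is visible from plain SQL (a minimal sketch; the table is arbitrary):

BEGIN;
DECLARE c CURSOR FOR SELECT relname FROM pg_class;  -- creates a portal named "c"
FETCH 5 FROM c;   -- runs the portal's executor state for five rows
FETCH 5 FROM c;   -- resumes the same portal where it left off
CLOSE c;          -- drops the portal
COMMIT;

Utility statements like CREATE TABLE are also run through a portal (PortalRun
ends up calling ProcessUtility), so calls such as DefineRelation do happen
while the portal is active.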


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql.warn_shadow

2014-03-14 Thread Simon Riggs
On 3 February 2014 20:17, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 I am not happy with warnings_as_error

 what about stop_on_warning instead?

 second question: should these errors be catchable or uncatchable?

 When I worked on a large project, where I had to use an error handler of
 EXCEPTION WHEN OTHERS, I found it very strange and not useful that all syntax
 errors were caught by this handler. Any debugging was terribly difficult, and
 I had to write plpgsql_lint as a solution.

The patch looks fine, apart from some non-guideline code formatting issues.

Having looked at gcc and clang, I have a proposal for naming/API

We just have two variables

  plpgsql.compile_warnings = 'list'    default = 'none'
  plpgsql.compile_errors = 'list'      default = 'none'

Only possible values in 9.4 are 'shadow', 'all', 'none'

If we can agree on something quickly, then we can commit this for 9.4
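
For concreteness, a minimal sketch of usage, assuming the proposed GUC name
and the 'shadow' check (the function itself is hypothetical):

SET plpgsql.compile_warnings = 'shadow';

CREATE FUNCTION f(a int) RETURNS int AS $$
DECLARE
    a int;   -- shadows the parameter; would draw a compile-time warning
BEGIN
    RETURN a;
END;
$$ LANGUAGE plpgsql;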

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Archive recovery won't be completed on some situation.

2014-03-14 Thread Kyotaro HORIGUCHI
Hello, we found that postgresql will never complete archive recovery
in some situations. This occurs on HEAD, 9.3.3, 9.2.7, and 9.1.12.

Restarting the server with archive recovery fails as follows just
after it was killed with SIGKILL after pg_start_backup and some
WAL writes but before pg_stop_backup.

| FATAL:  WAL ends before end of online backup
| HINT: Online backup started with pg_start_backup() must be
|  ended with pg_stop_backup(), and all WAL up to that point must
|  be available at recovery.

The mess is that once this situation is entered, I could find no
formal operation to exit from it.

In this situation, 'Backup start location' in the control data has
some valid location, but the corresponding 'end of backup' WAL record
will never come.

But I think PG cannot distinguish whether the 'end of backup'
record does not exist at all or whether it will come later,
especially when the server starts as a streaming-replication
hot standby.

One solution for this would be a new parameter in recovery.conf
which tells the server to start as if there had never been a backup
label when this situation arises. It looks ugly and somewhat
dangerous, but seems necessary.

The first attached file is the script to replay the problem, and
the second is the patch trying to do what is described above.

After applying this patch on HEAD and uncommenting the
'cancel_backup_label_on_failure = true' line in test.sh, the test
script runs as follows,

| LOG:  record with zero length at 0/2010F40
| WARNING:  backup_label was canceled.
| HINT:  server might have crashed during backup mode.
| LOG:  consistent recovery state reached at 0/2010F40
| LOG:  redo done at 0/2010DA0

What do you think about this?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
#! /bin/sh

killall postgres
rm -rf $PGDATA/*
initdb

cat >> $PGDATA/postgresql.conf <<EOF
wal_level = hot_standby
EOF

pg_ctl start -w
sleep 1
psql postgres -c "select pg_start_backup('hoge');"
psql postgres -c "create table t (a int);"
killall -9 postgres
# pg_ctl stop -m f

cat > $PGDATA/recovery.conf <<EOF
# standby_mode = on
restore_command = '/bin/true'
recovery_target_timeline = 'latest'
# cancel_backup_label_on_failure = true
EOF

pg_ctl start
sleep 5
pg_ctl stop -w

pg_ctl start
sleep 5
pg_ctl stop -w

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index cdbe305..d1f93bb 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -230,6 +230,7 @@ static TimestampTz recoveryDelayUntilTime;
 static bool StandbyModeRequested = false;
 static char *PrimaryConnInfo = NULL;
 static char *PrimarySlotName = NULL;
+static bool cancelBackupLabelOnFailure = false;
 static char *TriggerFile = NULL;
 
 /* are we currently in standby mode? */
@@ -5569,6 +5570,16 @@ readRecoveryCommandFile(void)
 			ereport(DEBUG2,
 					(errmsg("min_recovery_apply_delay = '%s'", item->value)));
 		}
+		else if (strcmp(item->name, "cancel_backup_label_on_failure") == 0)
+		{
+			if (!parse_bool(item->value, &cancelBackupLabelOnFailure))
+				ereport(ERROR,
+						(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+						 errmsg("parameter \"%s\" requires a Boolean value",
+								"cancel_backup_label_on_failure")));
+			ereport(DEBUG2,
+					(errmsg_internal("cancel_backup_label_on_failure = '%s'",
+									 item->value)));
+		}
 		else
 			ereport(FATAL,
 					(errmsg("unrecognized recovery parameter \"%s\"",
@@ -7111,6 +7122,21 @@ StartupXLOG(void)
 record = ReadRecord(xlogreader, InvalidXLogRecPtr, LOG, false);
 			} while (record != NULL);
 
+			if (cancelBackupLabelOnFailure &&
+				ControlFile->backupStartPoint != InvalidXLogRecPtr)
+			{
+				/*
+				 * Try to force complete recovery when backup_label was
+				 * found but the end-of-backup record has not been found.
+				 */
+
+				ControlFile->backupStartPoint = InvalidXLogRecPtr;
+
+				ereport(WARNING,
+						(errmsg("backup_label was canceled."),
+						 errhint("server might have crashed during backup mode.")));
+				CheckRecoveryConsistency();
+			}
 			/*
 			 * end of main redo apply loop
 			 */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Providing catalog view to pg_hba.conf file - Patch submission

2014-03-14 Thread Magnus Hagander
On Fri, Mar 14, 2014 at 6:30 AM, Prabakaran, Vaishnavi 
vaishna...@fast.au.fujitsu.com wrote:

 Hi,



 In connection to my previous proposal about providing catalog view to
 pg_hba.conf file contents , I have developed the attached patch .



 [Current situation]

 Currently, to view the pg_hba.conf file contents, the DB admin has to
 access the file on the database server to read the settings.  In the case
 of huge or multiple hba files, finding the appropriate hba rules that are
 loaded can be difficult and time-consuming.



 [What this Patch does]

 The functionality of the attached patch is that it provides a new view
 pg_hba_settings to admin users. Public access to the view is restricted.
 This view displays basic information about the HBA settings of the
 PostgreSQL cluster.  The information shown is taken from the parsed hba
 lines, not read directly from the pg_hba.conf files. Documentation files are
 also updated to include details of this new view under Chapter 47, System
 Catalogs. Also, a new note is added in Chapter 19.1, The pg_hba.conf File.



 [Advantage]

 The advantage of having this pg_hba_settings view is that the admin can
 check which hba rules are loaded at runtime via a database connection
 itself.  Thereby it will be easy and useful for the admin to check all the
 users with their privileges in a single view to manage them.



This looks like a useful feature, so make sure you register it on
https://commitfest.postgresql.org/action/commitfest_view?id=22.

I haven't looked at the actual code yet, but I did notice one thing in a
very quick look over the docs - it seems to be completely ignoring the
key/value parameters given on a row, and stops reporting after the auth
method? That seems bad. And also, host/mask should probably be using the
inet-style datatypes and not text?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-14 Thread Heikki Linnakangas

On 03/14/2014 12:32 PM, Kyotaro HORIGUCHI wrote:

Restarting the server with archive recovery fails as follows just
after it was killed with SIGKILL after pg_start_backup and some
WAL writes but before pg_stop_backup.

| FATAL:  WAL ends before end of online backup
| HINT: Online backup started with pg_start_backup() must be
|  ended with pg_stop_backup(), and all WAL up to that point must
|  be available at recovery.

The mess is that once this situation is entered, I could find no
formal operation to exit from it.

In this situation, 'Backup start location' in the control data has
some valid location, but the corresponding 'end of backup' WAL record
will never come.

But I think PG cannot distinguish whether the 'end of backup'
record does not exist at all or whether it will come later,
especially when the server starts as a streaming-replication
hot standby.


If you kill the server while a backup is in progress, the backup is 
broken. It's correct that the server refuses to start up from the broken 
backup. So basically, don't do that.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql.warn_shadow

2014-03-14 Thread Marko Tiikkaja

On 3/14/14 10:56 AM, Simon Riggs wrote:

The patch looks fine, apart from some non-guideline code formatting issues.


I'm not sure what you're referring to.  I thought it looked fine.


Having looked at gcc and clang, I have a proposal for naming/API

We just have two variables

   plpgsql.compile_warnings = 'list'    default = 'none'
   plpgsql.compile_errors = 'list'      default = 'none'

Only possible values in 9.4 are 'shadow', 'all', 'none'


I'm fine with this.  I'm starting to think that runtime warnings are a 
bad idea anyway.



Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Failure while inserting parent tuple to B-tree is not fun

2014-03-14 Thread Peter Geoghegan
Ping?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Tomas Vondra
On 13 March 2014, 23:39, Peter Geoghegan wrote:
 On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark st...@mit.edu wrote:
 It does sound like the main question here is which opclass should be
 the default. From the discussion there's a jsonb_hash_ops which works
 on all input values but supports fewer operators and a jsonb_ops which
 supports more operators but can't handle json with larger individual
 elements. Perhaps it's better to make jsonb_hash_ops the default so at
 least it's always safe to create a default gin index?

 Personally, I don't think it's a good idea to change the default. I
 have yet to be convinced that if you hit the GIN limitation it's an
 indication of anything other than that you need to reconsider your
 indexing choices (how often have we heard that complaint of GIN before
 in practice?). Even if you don't hit the limitation directly, with

I've never used GIN with anything other than the built-in full-text values
(tsvector), pg_trgm or points, and I suspect that's the case with most
other users. All those types have a naturally limited size (e.g. words
tend to have very limited length, unless you're Maori, but even there the
longest name is just 85 characters [1]).

The only place in (core|contrib) where I'd expect this kind of issue is
probably intarray, but it's arguably less frequently used than
tsvector/pg_trgm, for example.

So ISTM this is the main reason why we don't see more complaints about the
GIN size limit. I expect that to change with the json "index all" approach.

 something like jsonb_hash_ops you're still hashing a large nested
 structure, very probably uselessly. Are you really going to look for
 an exact match to an elaborate nested structure? I would think,
 probably not.

What I find (very) useful is queries that look like this:

   SELECT id FROM json_table WHERE json_value @> '{"a" : {"b" : {"c" : 3}}}';

or (without the @> operator) like this:

   SELECT id FROM json_table WHERE json_value #> ARRAY['a', 'b', 'c'] = '3';

or something like that ...

 Now, as Alexander says, there might be a role for another
 (jsonb_hash_ops) opclass that separately indexes values only. I still
 think that by far the simplest solution is to use expressional
 indexes, because we index key values and array element values
 indifferently. Of course, nothing we have here precludes the
 development of such an opclass.

Maybe. I don't have much insight into how GIN works / what is possible. But
I think we should avoid having a large number of opclasses, each supporting
a small fraction of use cases. If we could keep the two we have right now,
that'd be nice.

regards
Tomas

[1] http://en.wikipedia.org/wiki/List_of_long_place_names




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql.warn_shadow

2014-03-14 Thread Pavel Stehule
2014-03-14 12:02 GMT+01:00 Marko Tiikkaja ma...@joh.to:

 On 3/14/14 10:56 AM, Simon Riggs wrote:

 The patch looks fine, apart from some non-guideline code formatting
 issues.


 I'm not sure what you're referring to.  I thought it looked fine.


  Having looked at gcc and clang, I have a proposal for naming/API

 We just have two variables

plpgsql.compile_warnings = 'list'    default = 'none'


+1


 plpgsql.compile_errors = 'list'      default = 'none'

 Only possible values in 9.4 are 'shadow', 'all', 'none'


what is compile_errors? We don't allow ignoring any errors now.




 I'm fine with this.  I'm starting to think that runtime warnings are a bad
 idea anyway.


+1

Pavel




 Regards,
 Marko Tiikkaja



Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-14 Thread Kyotaro HORIGUCHI
Thank you.

2014/03/14 19:42 Heikki Linnakangas hlinnakan...@vmware.com:

 On 03/14/2014 12:32 PM, Kyotaro HORIGUCHI wrote:

 Restarting the server with archive recovery fails as follows just
 after it was killed with SIGKILL after pg_start_backup and some
 WAL writes but before pg_stop_backup.

 | FATAL:  WAL ends before end of online backup
 | HINT: Online backup started with pg_start_backup() must be
 |  ended with pg_stop_backup(), and all WAL up to that point must
 |  be available at recovery.

 The mess is that once this situation is entered, I could find no
 formal operation to exit from it.


 If you kill the server while a backup is in progress, the backup is
broken. It's correct that the server refuses to start up from the broken
backup. So basically, don't do that.

Hmm.. What I did was simply restarting the server just after a crash, but
the server was accidentally in backup mode. No backup copy is used.
Basically, the server is in the same situation as a simple restart after a
crash. The difference here is that the restart made the database completely
useless while it had not been. I wish to save the database for the case and
I suppose it so acceptable.

regards,
-- 
Kyotaro Horiguchi
NTT Opensource Software Center


Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-14 Thread Kyotaro HORIGUCHI
Sorry, I wrote that a little wrong.

2014/03/14 20:24 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp:
 I wish to save the database for the case and I suppose it so acceptable.

and I don't suppose it so unacceptable.

regards,
-- 
Kyotaro Horiguchi
NTT Opensource Software Center


Re: [HACKERS] gaussian distribution pgbench

2014-03-14 Thread Heikki Linnakangas

On 03/13/2014 04:00 PM, Fujii Masao wrote:

On Thu, Mar 13, 2014 at 10:51 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

IMHO we should just implement the \setrandom changes, and not add any of
these options to modify the standard test workload. If someone wants to run
the TPC-B workload with a gaussian or exponential distribution, they can
implement it as a custom script. The docs include the script for the standard
TPC-B workload; just copy-paste that and modify the \setrandom lines.


Yeah, I'm OK with this.


So I took a look at the \setrandom parts of this patch to see if that's 
ready for commit, without any of the changes to modify the standard 
TPC-B workload. Attached is a patch with just those parts; everyone 
please focus on this.


A couple of comments:

* There should be an explicit \setrandom ... uniform option too, even 
though you get that implicitly if you don't specify the distribution


* What exactly does the threshold mean? The docs informally explain 
that the larger the threshold, the more frequently values close to the 
middle of the interval are drawn, but that's pretty vague.


* Do min and max really make sense for gaussian and exponential 
distributions? For gaussian, I would expect mean and standard deviation 
as the parameters, not min/max/threshold.


* How about setting the variable as a float instead of integer? Would 
seem more natural to me. At least as an option.
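
For reference, the \setrandom syntax under review would look like this in a
custom script (a sketch based on the patch, reusing the standard TPC-B
variable names):

\set naccounts 100000 * :scale
\setrandom aid 1 :naccounts gaussian 5.0
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;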


- Heikki
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 7c1e59e..a7713af 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -98,6 +98,9 @@ static int	pthread_join(pthread_t th, void **thread_return);
 #define LOG_STEP_SECONDS	5	/* seconds between log messages */
 #define DEFAULT_NXACTS	10		/* default nxacts */
 
+#define MIN_GAUSSIAN_THRESHOLD		2.0	/* minimum threshold for gauss */
+#define MIN_EXPONENTIAL_THRESHOLD	2.0	/* minimum threshold for exp */
+
 int			nxacts = 0;			/* number of transactions per client */
 int			duration = 0;		/* duration in seconds */
 
@@ -469,6 +472,79 @@ getrand(TState *thread, int64 min, int64 max)
 	return min + (int64) ((max - min + 1) * pg_erand48(thread->random_state));
 }
 
+/* random number generator: exponential distribution from min to max inclusive */
+static int64
+getExponentialrand(TState *thread, int64 min, int64 max, double exp_threshold)
+{
+	double		rand;
+
+	/*
+	 * Get user specified random number in this loop. This loop is executed
+	 * until the number is in the expected range. As the minimum threshold is
+	 * 2.0, the probability of a retry is at worst 13.5%, as -ln(0.135) ~ 2.0;
+	 * for a 5.0 threshold, it is about e^{-5} ~ 0.7%.
+	 */
+	do
+	{
+		/* as pg_erand48 is in [0, 1), uniform is in (0, 1] */
+		double uniform = 1.0 - pg_erand48(thread->random_state);
+		/* rand is in [0, LARGE) */
+		rand = - log(uniform);
+	} while (rand >= exp_threshold);
+
+	/* rand is in [0, exp_threshold), normalized to [0,1) */
+	rand /= exp_threshold;
+
+	/* return an int64 random number between min and max */
+	return min + (int64)((max - min + 1) * rand);
+}
+
+/* random number generator: gaussian distribution from min to max inclusive */
+static int64
+getGaussianrand(TState *thread, int64 min, int64 max, double stdev_threshold)
+{
+	double		stdev;
+	double		rand;
+
+	/*
+	 * Get user specified random number from this loop, with
+	 * -stdev_threshold < stdev <= stdev_threshold
+	 *
+	 * This loop is executed until the number is in the expected range.
+	 *
+	 * As the minimum threshold is 2.0, the probability of looping is low:
+	 * sqrt(-2 ln(r)) >= 2 => r <= e^{-2} ~ 0.135, then when taking the average
+	 * sine multiplier as 2/pi, we have a 8.6% looping probability in the
+	 * worst case. For a 5.0 threshold value, the looping probability
+	 * is about e^{-5} * 2 / pi ~ 0.43%.
+	 */
+	do
+	{
+		/*
+		 * pg_erand48 generates [0,1), but for the basic version of the
+		 * Box-Muller transform the two uniformly distributed random numbers
+		 * are expected in (0, 1] (see http://en.wikipedia.org/wiki/Box_muller)
+		 */
+		double rand1 = 1.0 - pg_erand48(thread->random_state);
+		double rand2 = 1.0 - pg_erand48(thread->random_state);
+
+		/* Box-Muller basic form transform */
+		double var_sqrt = sqrt(-2.0 * log(rand1));
+		stdev = var_sqrt * sin(2.0 * M_PI * rand2);
+
+		/*
+		 * We may try with cos, but there may be a bias induced if the
+		 * previous value fails the test. To be on the safe side, let us
+		 * try over.
+		 */
+	}
+	while (stdev < -stdev_threshold || stdev >= stdev_threshold);
+
+	/* stdev is in [-threshold, threshold), normalized to [0,1) */
+	rand = (stdev + stdev_threshold) / (stdev_threshold * 2.0);
+
+	/* return an int64 random number between min and max */
+	return min + (int64)((max - min + 1) * rand);
+}
+
 /* call PQexec() and exit() on failure */
 static void
 executeStatement(PGconn *con, const char *sql)
@@ -1312,6 +1388,7 @@ top:
 			char	   *var;
 			int64		min,
 		max;
+			double		threshold = 0;
 			char		

Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-14 Thread Heikki Linnakangas

On 03/14/2014 01:24 PM, Kyotaro HORIGUCHI wrote:

Hmm.. What I did is simplly restarting server  just after a crash but the
server was accidentially in backup mode. No backup copy is used. Basically,
the server is in the same situation with the simple restart after crash.


You created recovery.conf on the master server after the crash. Just don't 
do that.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Oleg Bartunov
VODKA index will have no length limitation.

On Fri, Mar 14, 2014 at 3:07 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 13 March 2014, 23:39, Peter Geoghegan wrote:
 On Thu, Mar 13, 2014 at 2:21 AM, Greg Stark st...@mit.edu wrote:
 It does sound like the main question here is which opclass should be
 the default. From the discussion there's a jsonb_hash_ops which works
 on all input values but supports fewer operators and a jsonb_ops which
 supports more operators but can't handle json with larger individual
 elements. Perhaps it's better to make jsonb_hash_ops the default so at
 least it's always safe to create a default gin index?

 Personally, I don't think it's a good idea to change the default. I
 have yet to be convinced that if you hit the GIN limitation it's an
 indication of anything other than that you need to reconsider your
 indexing choices (how often have we heard that complaint of GIN before
 in practice?). Even if you don't hit the limitation directly, with

 I've never used GIN with anything other than the built-in full-text values
 (tsvector), pg_trgm or points, and I suspect that's the case with most
 other users. All those types have a naturally limited size (e.g. words
 tend to have very limited length, unless you're Maori, but even there the
 longest name is just 85 characters [1]).

 The only place in (core|contrib) where I'd expect this kind of issue is
 probably intarray, but it's arguably less frequently used than
 tsvector/pg_trgm, for example.

 So ISTM this is the main reason why we don't see more complaints about the
 GIN size limit. I expect that to change with the json "index all" approach.

 something like jsonb_hash_ops you're still hashing a large nested
 structure, very probably uselessly. Are you really going to look for
 an exact match to an elaborate nested structure? I would think,
 probably not.

 What I find (very) useful is queries that look like this:

SELECT id FROM json_table WHERE json_value @> '{"a" : {"b" : {"c" : 3}}}';

 or (without the @> operator) like this:

SELECT id FROM json_table WHERE json_value #> ARRAY['a', 'b', 'c'] = '3';

 or something like that ...

 Now, as Alexander says, there might be a role for another
 (jsonb_hash_ops) opclass that separately indexes values only. I still
 think that by far the simplest solution is to use expressional
 indexes, because we index key values and array element values
 indifferently. Of course, nothing we have here precludes the
 development of such an opclass.

 Maybe. I don't have much insight into how GIN works / what is possible. But
 I think we should avoid having a large number of opclasses, each supporting
 a small fraction of use cases. If we could keep the two we have right now,
 that'd be nice.

 regards
 Tomas

 [1] http://en.wikipedia.org/wiki/List_of_long_place_names




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-14 Thread Kyotaro HORIGUCHI
Hello,

2014/03/14 20:51 Heikki Linnakangas hlinnakan...@vmware.com:
 You created recovery.conf on the master server after the crash. Just don't do
that.

Ah, ok. I understood what you meant.
Sorry that I can't confirm right now; the original issue should occur on
the standby. I might've oversimplified.

regards,
-- 
Kyotaro Horiguchi
NTT Opensource Software Center


Re: [HACKERS] plpgsql.warn_shadow

2014-03-14 Thread Simon Riggs
On 14 March 2014 11:10, Pavel Stehule pavel.steh...@gmail.com wrote:



 2014-03-14 12:02 GMT+01:00 Marko Tiikkaja ma...@joh.to:

 On 3/14/14 10:56 AM, Simon Riggs wrote:

 The patch looks fine, apart from some non-guideline code formatting
 issues.


 I'm not sure what you're referring to.  I thought it looked fine.


 Having looked at gcc and clang, I have a proposal for naming/API

 We just have two variables

plpgsql.compile_warnings = 'list'    default = 'none'


 +1


plpgsql.compile_errors = 'list'      default = 'none'

 Only possible values in 9.4 are 'shadow', 'all', 'none'


 what is compile_errors? We don't allow ignoring any errors now.

How about

plpgsql.additional_warnings = 'list'
plpgsql.additional_errors = 'list'

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Archive recovery won't be completed on some situation.

2014-03-14 Thread Kyotaro HORIGUCHI
Umm.. Sorry for the repeated correction.

2014/03/14 21:12 Kyotaro HORIGUCHI kyota.horigu...@gmail.com:

 Ah, ok. I understood what you meant.
 Sorry that I can't confirm right now; the original issue should occur on
the standby.

The original issue should have occurred on the standby.

 I might've oversimplified.

 regards,
 --
 Kyotaro Horiguchi
 NTT Opensource Software Center


Re: [HACKERS] plpgsql.warn_shadow

2014-03-14 Thread Pavel Stehule
2014-03-14 13:12 GMT+01:00 Simon Riggs si...@2ndquadrant.com:

 On 14 March 2014 11:10, Pavel Stehule pavel.steh...@gmail.com wrote:
 
 
 
  2014-03-14 12:02 GMT+01:00 Marko Tiikkaja ma...@joh.to:
 
  On 3/14/14 10:56 AM, Simon Riggs wrote:
 
  The patch looks fine, apart from some non-guideline code formatting
  issues.
 
 
  I'm not sure what you're referring to.  I thought it looked fine.
 
 
  Having looked at gcc and clang, I have a proposal for naming/API
 
  We just have two variables
 
plpgsql.compile_warnings = 'list'    default = 'none'
 
 
  +1
 
 
plpgsql.compile_errors = 'list'      default = 'none'
 
  Only possible values in 9.4 are 'shadow', 'all', 'none'
 
 
  what is compile_errors? We don't allow ignoring any errors now.

 How about

 plpgsql.additional_warnings = 'list'
 plpgsql.additional_errors = 'list'


I understand.

+1

Pavel



 --
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Add CREATE support to event triggers

2014-03-14 Thread Robert Haas
On Thu, Mar 13, 2014 at 5:06 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Alvaro Herrera wrote:

 I also fixed the sequence OWNED BY problem simply by adding support for
 ALTER SEQUENCE.  Of course, the intention is that all forms of CREATE
 and ALTER are supported, but this one seems reasonable standalone
 because CREATE TABLE uses it internally.

 I have been hacking on this on and off.  This afternoon I discovered
 that interval typmod output can also be pretty unusual.  Example:

 create table a (a interval year to month);

 For the column, we get this type spec (note the typmod):

 "coltype": {
     "is_array": false,
     "schemaname": "pg_catalog",
     "typename": "interval",
     "typmod": " year to month"
 },

 so the whole command output ends up being this:

 NOTICE:  expanded: CREATE  TABLE  public.a (a pg_catalog.interval year to 
 month   ) WITH (oids=OFF)

 However, this is not accepted on input:

 alvherre=# CREATE  TABLE  public.a (a pg_catalog.interval year to month   ) 
WITH (oids=OFF);
 ERROR:  syntax error at or near "year"
 LÍNEA 1: CREATE  TABLE  public.a (a pg_catalog.interval year to mon...
   ^

 I'm not too sure what to do about this yet.  I checked the catalogs and
 gram.y, and it seems that interval is the only type that allows such
 strange games to be played.  I would hate to be forced to add a kludge
 specific to type interval, but that seems to be the only option.  (This
 would involve checking the OID of the type in deparse_utility.c, and if
 it's INTERVALOID, then omit the schema qualification and quoting on the
 type name).

 I have also been working on adding ALTER TABLE support.  So far it's
 pretty simple; here is an example.  Note I run a single command which
 includes a SERIAL column, and on output I get three commands (just like
 a serial column on create table).

 alvherre=# alter table tt add column b numeric, add column c serial, alter 
 column a set default extract(epoch from now());
 NOTICE:  JSON blob: {
     "definition": [
         {
             "clause": "cache",
             "fmt": "CACHE %{value}s",
             "value": 1
         },
         {
             "clause": "cycle",
             "fmt": "%{no}s CYCLE",
             "no": "NO"
         },
         {
             "clause": "increment_by",
             "fmt": "INCREMENT BY %{value}s",
             "value": 1
         },
         {
             "clause": "minvalue",
             "fmt": "MINVALUE %{value}s",
             "value": 1
         },
         {
             "clause": "maxvalue",
             "fmt": "MAXVALUE %{value}s",
             "value": 9223372036854775807
         },
         {
             "clause": "start",
             "fmt": "START WITH %{value}s",
             "value": 1
         },
         {
             "clause": "restart",
             "fmt": "RESTART %{value}s",
             "value": 1
         }
     ],
     "fmt": "CREATE %{persistence}s SEQUENCE %{identity}D %{definition: }s",
     "identity": {
         "objname": "tt_c_seq",
         "schemaname": "public"
     },
     "persistence": ""
 }

What does the colon-space in %{definition: }s mean?

In general, it seems like you're making good progress here, and I'm
definitely happier with this than with previous approaches, but I'm
still concerned about how maintainable it's going to be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Upcoming back branch releases

2014-03-14 Thread Tom Lane
After some discussion, the core committee has concluded that the
WAL-replay bug fixed in commit 6bfa88acd3df830a5f7e8677c13512b1b50ae813
is indeed bad enough to justify near-term update releases.  Since
there seems no point in being slow about it, tarballs will be wrapped
Monday (3/17) for public announcement Thursday (3/20).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Disk usage for intermediate results in join

2014-03-14 Thread Marti Raudsepp
On Tue, Mar 11, 2014 at 1:24 PM, Parul Lakkad parul.lak...@gmail.com wrote:
 I am trying to figure out when disk is used to store intermediate results
 while performing joins in postgres.

Joins can also produce a Nested Loop + Materialize plan, which spills to
disk if the materialized result set is too large for work_mem.
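
For example, a sketch with hypothetical tables (the Materialize node sits on
the inner side of the loop and is what can spill):

SET work_mem = '1MB';
EXPLAIN SELECT *
  FROM big_a a
  JOIN big_b b ON a.x < b.x;   -- non-equijoin, so no hash or merge join
-- Nested Loop
--   Join Filter: (a.x < b.x)
--   ->  Seq Scan on big_a a
--   ->  Materialize
--         ->  Seq Scan on big_b b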

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add CREATE support to event triggers

2014-03-14 Thread Alvaro Herrera

Robert Haas wrote:

 What does the colon-space in %{definition: }s mean?

It means it expects the "definition" element to be a JSON array, and
that it will format the elements by separating them with a space.  In
other DDL commands there are things like %{table_element:, }s, which
means to separate them with comma-space.  (In CREATE TABLE, these
table_elements might be column definitions or constraints.)  It's a
pretty handy way to format various things.  The separator is arbitrary.
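
For illustration, a made-up fragment (not actual deparse output): given

    "fmt": "(%{table_elements:, }s)",
    "table_elements": ["a integer", "b text"]

the expansion produces "(a integer, b text)", while %{table_elements: }s
would produce "(a integer b text)".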

 In general, it seems like you're making good progress here, and I'm
 definitely happier with this than with previous approaches, but I'm
 still concerned about how maintainable it's going to be.

Thanks.

There are three parts to this code.  Two of them are infrastructure to
make it all work: one is the code to support creation of the JSON
values, that is, functions to add new elements to the tree that's
eventually going to become JSON (this is the approximately 640 lines at
the top of deparse_utility.c).  The second one is the expand
functionality, i.e. what turns the JSON back into text (bottom 700 lines
in event_trigger.c).  Both are fairly static now; while I was writing it
initially there was a lot of churn until I found an interface that made
the most sense.  I don't think these parts are going to cause much
trouble.

The third part is the bits that take a parse node and determine what
elements to put into JSON.  This is the part that is going to show the
most churn as DDL is modified.  But it's not a lot of code; for
instance, deparsing a CREATE SCHEMA node takes 30 lines.
90 lines to deparse CREATE RULE.
80 lines for CREATE INDEX.
280 lines of shared code for ALTER SEQUENCE and CREATE SEQUENCE.

I don't think we should be worried about there being a lot of extra code
to write as DDL is added or modified.  I do share your concern that
we're going to *forget* to write these things in the first place, unless
we do something to avoid that problem specifically.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add CREATE support to event triggers

2014-03-14 Thread Robert Haas
On Fri, Mar 14, 2014 at 12:00 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I don't think we should be worried about there being a lot of extra code
 to write as DDL is added or modified.  I do share your concern that
 we're going to *forget* to write these things in the first place, unless
 we do something to avoid that problem specifically.

That mirrors my concern.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Failure while inserting parent tuple to B-tree is not fun

2014-03-14 Thread Heikki Linnakangas

On 03/14/2014 01:03 PM, Peter Geoghegan wrote:

Ping?


I committed the other patch this depends on now. I'll take another stab 
at this one next.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Josh Berkus
On 03/14/2014 04:52 AM, Oleg Bartunov wrote:
 VODKA index will have no length limitation.

Yeah, so I think we go with what we have, and tell people if you're
hitting these length issues, wait for 9.5, where they will be fixed.


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Oleg Bartunov
9.5 may be too optimistic :)

On Fri, Mar 14, 2014 at 11:18 PM, Josh Berkus j...@agliodbs.com wrote:
 On 03/14/2014 04:52 AM, Oleg Bartunov wrote:
 VODKA index will have no length limitation.

 Yeah, so I think we go with what we have, and tell people if you're
 hitting these length issues, wait for 9.5, where they will be fixed.


 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6

2014-03-14 Thread Josh Berkus
Alvaro, All:

Can someone help me with what we should tell users about this issue?

1. What users are especially likely to encounter it?  All replication
users, or do they have to do something else?

2. What error messages will affected users get?  A link to the reports
of this issue on pgsql lists would tell me this, but I'm not sure
exactly which error reports are associated.

3. If users have already encountered corruption due to the fixed issue,
what do they need to do after updating?  re-basebackup?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem

2014-03-14 Thread Jeff Janes
On Mon, Mar 10, 2014 at 3:16 PM, Kevin Grittner kgri...@ymail.com wrote:

 Andres Freund and...@2ndquadrant.com wrote:
  On 2014-02-16 21:26:47 -0500, Robert Haas wrote:

  I don't really know about cpu_tuple_cost.  Kevin's often
  advocated raising it, but I haven't heard anyone else advocate
  for that. I think we need data points from more people to know
  whether or not that's a good idea in general.
 
  FWIW It's a good idea in my experience.

 This is more about the balance among the various cpu_* costs than
 the balance between cpu_* costs and the *_page costs.  I usually
 need to adjust the page costs, too; and given how heavily cached
 many machines are, I'm usually moving them down.  But if you think
 about the work involved in moving to a new tuple, do you really
 think it's only twice the cost of moving to a new index entry on an
 index scan?  Or only four times as expensive as executing an
 average operator function?


If the next tuple is already hinted and not compressed or toasted, I would
completely believe that.  In fact, unless the operator is integer or dp, I
would say it is less than 2 times as expensive.  If it is a text operator
and the collation is not C or POSIX, then moving to the next tuple is
likely less expensive than a single operator evaluation.

If your tuples are updated nearly as often as queried, the hint resolution
could be a big cost.  But in that case, probably the contention would be a
bigger issue than the pure CPU cost.

I don't know how compression and toast would affect the times.  Are your
tables heavily toasted?

If top down measurements and bottom up measurements aren't giving the same
results, then what is going on?  We know and document how caching needs to
be baked into the page costs parameters.  What unknown thing is throwing
off the cpu costs?



 In my experience setting cpu_tuple_cost
 higher tends to better model costs, and prevent CPU-sucking scans
 of large numbers of rows.

 I only have anecdotal evidence, though.  I have seen it help dozens
 of times, and have yet to see it hurt.  That said, most people on
 this list are probably capable of engineering a benchmark which
 will show whichever result they would prefer.  I would prefer to
 hear about other data points based on field experience with
 production systems.  I haven't offered the trivial patch because
 when I've raised the point before, there didn't seem to be anyone
 else who had the same experience.  It's good to hear that Andres
 has seen this, too.

 FWIW, even though I'm repeating something I've mentioned before,
 whenever raising this setting did help, 0.03 was high enough to see
 the benefit.  Several times I have also tried 0.05 just to test
 whether I was wandering near a tipping point for a bad choice from
 this.  I have never had 0.05 produce plans noticeably better or
 worse than 0.03.



Have you ever tried lowering the other two cpu cost parameters instead?
That would be the more definitive indication that the benefit is not
coming just by moving the io costs closer to the cpu costs.
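
For concreteness, a sketch of the two experiments, starting from the default
values cpu_tuple_cost = 0.01, cpu_index_tuple_cost = 0.005 and
cpu_operator_cost = 0.0025:

SET cpu_tuple_cost = 0.03;            -- raising it, as Kevin does

-- versus keeping cpu_tuple_cost at 0.01 and lowering the other two, which
-- yields the same cpu_* ratios without moving the cpu costs closer to the
-- io costs:
SET cpu_index_tuple_cost = 0.001667;
SET cpu_operator_cost = 0.000833;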

Cheers,

Jeff


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Gavin Flower

On 15/03/14 08:45, Oleg Bartunov wrote:

9.5 may be too optimistic :)

On Fri, Mar 14, 2014 at 11:18 PM, Josh Berkus j...@agliodbs.com wrote:

On 03/14/2014 04:52 AM, Oleg Bartunov wrote:

VODKA index will have no length limitation.

Yeah, so I think we go with what we have, and tell people if you're
hitting these length issues, wait for 9.5, where they will be fixed.


--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


No, tell them to wait for Postgres 12.3.42 - the version that is totally 
bug free & implements parallel processing of individual queries!  :-)


(With apologies to Douglas Adams)


Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Josh Berkus
On 03/14/2014 12:45 PM, Oleg Bartunov wrote:
 9.5 may be too optimistic :)

Nonsense, you, Teodor and Alexander are geniuses.  It can't possibly
take you more than a year.  ;-)

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Portability issues in shm_mq

2014-03-14 Thread Tom Lane
Whilst setting up a buildfarm member on an old, now-spare Mac, I was
somewhat astonished to discover that contrib/test_shm_mq crashes thus:
TRAP: FailedAssertion("!(rb >= sizeof(uint64))", File: "shm_mq.c", Line: 429)
but only in UTF8 locales, not in C locale.  You'd have bet your last
dollar that that code was locale-independent, right?

The reason appears to be that in the payload string generated with
(select string_agg(chr(32+(random()*96)::int), '') from generate_series(1,400))
the chr() argument rounds up to 128 every so often.  In UTF8 encoding,
that causes chr() to return a multibyte character instead of a single
byte.  So, instead of always having a fixed payload string length of
400 bytes, the payload length moves around a bit --- in a few trials
I see anywhere from 400 to 409 bytes.

How is that leading to a crash?  Well, this machine is 32-bit, so MAXALIGN
is only 4.  This means it is possible for an odd-length message cum
message length word to not exactly divide the size of the shared memory
ring buffer, resulting in cases where an 8-byte message length word is
wrapped around the end of the buffer.  shm_mq_receive_bytes makes no
attempt to hide that situation from its caller, and happily returns just
4 bytes with SHM_MQ_SUCCESS.  shm_mq_receive, on the other hand, is so
confident that it will always get an indivisible length word that it just
Asserts that that's the case.
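
To make the wraparound concrete (illustrative numbers only): suppose the ring
buffer is 828 bytes, and each queued message occupies 412 bytes including its
8-byte length word -- a multiple of 4 but not of 8, which MAXALIGN 4 permits.
Successive messages then start at offsets 0, 412 and 824, and the third
length word has only 4 bytes left before the end of the buffer, so it is
split across the wrap point. With MAXALIGN 8, every message size and buffer
offset stays a multiple of 8, and an 8-byte length word can never straddle
the end.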

Recommendations:

1. Reduce the random() multiplier from 96 to 95.  In multibyte encodings
other than UTF8, chr() would flat out reject values of 128, so this test
case is unportable.

2. Why in the world is the test case testing exactly one message length
that happens to be a multiple of 8?  Put some randomness into that,
instead.

3. Either you need to work a bit harder at forcing alignment, or you need
to fix shm_mq_receive to cope with split message length words.

4. The header comment for shm_mq_receive_bytes may once have described its
API accurately, but that appears to have been a long long time ago in a
galaxy far far away.  Please fix.
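
For recommendations 1 and 2, a sketch of what the payload generator might
become (illustrative only, not the exact fix):

-- multiplier 95 keeps chr() below 128, i.e. a single byte in any encoding,
-- and the randomized length avoids always being a multiple of 8
(select string_agg(chr(32+(random()*95)::int), '')
   from generate_series(1, 400 + (random()*7)::int))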


Also, while this is not directly your problem, it's becoming clear that we
don't have enough buildfarm coverage of not-64-bit platforms; this problem
would have been spotted awhile ago if we did.  I'm going to spin up a
couple of critters on old machines lying around my office.  We should
probably also encourage owners of existing critters to expand their test
coverage a bit, eg try locales other than C.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6

2014-03-14 Thread Alvaro Herrera
Josh Berkus wrote:
 Alvaro, All:
 
 Can someone help me with what we should tell users about this issue?
 
 1. What users are especially likely to encounter it?  All replication
 users, or do they have to do something else?

Replication users are more likely to get it on replicas, of course,
because that's running the recovery code continuously; however, anyone
that suffers a crash of a standalone system might also be affected.
(And it'd be worse, even, because that corrupts your main source of
data, not just a replicated copy of it.)  Obviously, if you have a
corrupted replica and fail over to it, you're similarly screwed.

Basically you might be affected if you have tables that are referenced
in primary keys and to which you also apply UPDATEs that are
HOT-enabled.
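
For illustration, a sketch of the affected pattern (hypothetical table):

CREATE TABLE t (id int PRIMARY KEY, counter int);
-- no indexed column changes, so this UPDATE can be HOT
UPDATE t SET counter = counter + 1 WHERE id = 1;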

 2. What error messages will affected users get?  A link to the reports
 of this issue on pgsql lists would tell me this, but I'm not sure
 exactly which error reports are associated.

Not sure about error messages.  Essentially some rows would be visible
to seqscans but not to index scans.
These are the threads:
http://www.postgresql.org/message-id/CAM3SWZTMQiCi5PV5OWHb+bYkUcnCk=o67w0csswpvv7xfuc...@mail.gmail.com
http://www.postgresql.org/message-id/cam-w4hptoemt4kp0ojk+mggzgctotlrtvfzyvd0o4ah-7dx...@mail.gmail.com

 3. If users have already encountered corruption due to the fixed issue,
 what do they need to do after updating?  re-basebackup?

Replicas can be fixed by recloning, yeah.  I haven't stopped to think
how to fix the masters.  Greg, Peter, any clues there?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Tomas Vondra
On 14.3.2014 20:18, Josh Berkus wrote:
 On 03/14/2014 04:52 AM, Oleg Bartunov wrote:
 VODKA index will have no length limitation.
 
 Yeah, so I think we go with what we have, and tell people if you're 
 hitting these length issues, wait for 9.5, where they will be
 fixed.

VODKA may be great, but I haven't seen a single line of code for it
yet. And given the response from Oleg, 9.5 seems ambitious.

I'm not awfully familiar with the GIN code, but based on Alexander's
feedback I presume fixing the GIN length limit (or rather removing it,
as it's a feature, not a bug) is quite straightforward. Why not at
least consider that for 9.4, unless it turns out more complex than expected?

Don't get me wrong - I'm aware it's quite late in the last commitfest,
and if it's deemed unacceptable / endangering the 9.4 release, I'm not going
to say a word. But if it's a simple patch ...

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Peter Geoghegan
On Fri, Mar 14, 2014 at 2:21 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I'm not awfully familiar with the GIN code, but based on Alexander's
 feedback I presume fixing the GIN length limit (or rather removing it,
 as it's a feature, not a bug) is quite straightforward. Why not at
 least consider that for 9.4, unless it turns out more complex than expected?

Alexander said nothing about removing that limitation, or if he did I
missed it. Which, as I said, I don't consider to be much of a
limitation, because indexing the whole nested value doesn't mean it
can satisfy a query on some more nested subset of an indexed value
datum (i.e. a value in the sense of a value in a key/value pair).

Alexander mentioned just indexing keys (object keys, or equivalently
array elements at the jsonb level), which is a reasonable thing, but
can be worked on later. I don't have much interest in working on
making it possible to index elaborate nested values in key/value
pairs, which is what you're suggesting if I've understood correctly.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Peter Geoghegan
For the benefit of anyone that would like to try the patch out, I've made
available a custom-format dump of some delicious sample data. I can
query the sample data as follows on my local installation:

[local]/jsondata=# select count(*) from delicious ;
  count
-
 1079399
(1 row)

[local]/jsondata=# \dt+ delicious
 List of relations
 Schema |   Name| Type  | Owner |  Size   | Description
+---+---+---+-+-
 public | delicious | table | pg| 1174 MB |
(1 row)

It's available from:
http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/jsondata.dump

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Tomas Vondra
On 14.3.2014 22:54, Peter Geoghegan wrote:
 On Fri, Mar 14, 2014 at 2:21 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I'm not awfully familiar with the GIN code, but based on Alexander's
 feedback I presume fixing the GIN length limit (or rather removing it,
 as it's a feature, not a bug) is quite straightforward. Why not at
 least consider that for 9.4, unless it turns out more complex than expected?
 
 Alexander said nothing about removing that limitation, or if he did I
 missed it. Which, as I said, I don't consider to be much of a

Sure he did, see this:

http://www.postgresql.org/message-id/capphfds4xmg5zop+1ctrrqnm6wxhh2a7j11nnjeosa76uow...@mail.gmail.com

Although it doesn't mention how complex change it would be.

 limitation, because indexing the whole nested value doesn't mean it
 can satisfy a query on some more nested subset of an indexed value
 datum (i.e. a value in the sense of a value in a key/value pair).

OK, I'm getting lost in the nested stuff. The trouble I'm running into
is rather unrelated to nesting. For example, indexing this fails if the
string is sufficiently long (~1350B if random, more if compressible).

  {"key" : "... string ..."}

How's that related to nesting?

Anyway, I'm not talking about exact matches on subtrees. I'm talking
about queries like this:

   SELECT doc FROM delicious
  WHERE doc @> '{"title_detail" : {"value" : "TheaterMania"}}';

which does exactly the same thing as this query:

   SELECT doc FROM delicious
  WHERE doc->'title_detail'->'value' = '"TheaterMania"';

Except that the first query can use a GIN index created like this:

  CREATE INDEX delicious_idx ON delicious USING GIN (doc);

while the latter does a sequential scan. It can use a GiST index too, but
it takes 140ms with GiST and only ~0.3ms with GIN. Big difference.

 Alexander mentioned just indexing keys (object keys, or equivalently
 array elements at the jsonb level), which is a reasonable thing, but
 can be worked on later. I don't have much interest in working on
 making it possible to index elaborate nested values in key/value
 pairs, which is what you're suggesting if I've understood correctly.

I never asked for indexing elaborate nested values in key/value pairs.
All I'm asking for is indexing of json values containing long strings.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Tomas Vondra
On 14.3.2014 23:06, Peter Geoghegan wrote:
 For the benefit of anyone that would like to try the patch out, I make
 available a custom format dump of some delicious sample data. I can
 query the sample data as follows on my local installation:
 
 [local]/jsondata=# select count(*) from delicious ;
   count
 -
  1079399
 (1 row)
 
 [local]/jsondata=# \dt+ delicious
  List of relations
  Schema |   Name| Type  | Owner |  Size   | Description
 +---+---+---+-+-
  public | delicious | table | pg| 1174 MB |
 (1 row)
 
 It's available from:
 http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/jsondata.dump

Thanks.

I've been doing some simple queries on this dataset and ISTM there's a
memory leak somewhere in the json code (i.e. something is probably using
a wrong memory context), because this query:

  SELECT doc->'title_detail'->'value', COUNT(*)
    FROM delicious GROUP BY 1;

results in this:

  PID USER     PR  NI    VIRT    RES   SHR S  %CPU %MEM    TIME+ COMMAND
 8231 tomas    20   0 5987520 4,645g  6136 R  95,4 60,4  0:37.54 postgres: tomas delicious [local]

I have shared_buffers=1GB and work_mem=64MB, so 4.6GB seems a bit too
much. Actually it grows even further, and then the OOM killer jumps in and
kills the backend like this:

[ 9227.318998] Out of memory: Kill process 8159 (postgres) score 595
   or sacrifice child
[ 9227.319000] Killed process 8159 (postgres) total-vm:5920272kB,
   anon-rss:4791568kB, file-rss:6192kB

I'm on commit a3115f0d, which is just 2 days old, so I suppose this was
not fixed yet.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Andres Freund
On 2014-03-14 22:21:18 +0100, Tomas Vondra wrote:
 Don't get me wrong - I'm aware it's quite late in the last commitfest,
 and if it's deemed unacceptable / endangering the 9.4 release, I'm not going
 to say a word. But if it's a simple patch ...

IMNSHO there's no bloody chance for such an addition at this point of
the cycle.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Greg Stark
On Fri, Mar 14, 2014 at 9:21 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I'm not awfully familiar with the GIN code, but based on Alexander's
 feedback I presume fixing the GIN length limit (or rather removing it,
 as it's a feature, not a bug) is quite straightforward. Why not at
 least consider that for 9.4, unless it turns out more complex than expected?

 Don't get me wrong - I'm aware it's quite late in the last commitfest,
 and if it's deemed unacceptable / endangering the 9.4 release, I'm not going
 to say a word. But if it's a simple patch ...

Well, I think the bigger picture is that the cases where we're getting
this error are ones where we're expecting too much from the GIN opclass.
It's trying to index entire json objects as individual values, which
isn't really very useful. We're unlikely to go querying for rows where
the value of a given key is a specific json object.

As I understand it Peter's right that in its current form the GIN
opclass is only useful if you use it on an expression index on
specific pieces of your json which are traditional non-nested hash
tables. Or I suppose if you're really only concerned with the ?
operator which looks for keys, which is pretty common too.
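
To illustrate both patterns (a sketch; index and key names are invented):

  -- (1) An expressional index over one well-known piece of the document.
  --     Queries have to repeat the expression for the index to apply.
  CREATE INDEX delicious_title_idx ON delicious
      USING GIN ((doc -> 'title_detail'));

  SELECT doc FROM delicious
   WHERE doc -> 'title_detail' @> '{"value" : "TheaterMania"}';

  -- (2) The ? operator merely asks whether a top-level key exists.
  SELECT count(*) FROM delicious WHERE doc ? 'tags';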

I had in mind that the GIN opclass would do something clever like
decompose the json into all the path-value tuples so I could do
arbitrary path lookups for values. That might be possible in the
future but it's not what we have today and what we have today is
already better than hstore. I think we're better off committing this
and moving forward with the contrib hstore2 wrapper which uses this
infrastructure so people have a migration path.

I don't think Josh is right to say it'll be fixed in 9.5. It'll be
better in 9.5 because we have ambitious plans to continue improving
in this direction. But it'll be even better in 9.6 and better again in
9.7. It'll never be fixed.


-- 
greg




Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Peter Geoghegan
On Fri, Mar 14, 2014 at 5:10 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I'm on commit a3115f0d, which is just 2 days old, so I suppose this was
 not fixed yet.

Try merging the feature branch now, which will get you commit 16923d,
which you're missing. That was an open item for a while, which I only
got around to fixing a few days ago.


-- 
Peter Geoghegan




Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Tomas Vondra
On 15.3.2014 02:03, Greg Stark wrote:
 On Fri, Mar 14, 2014 at 9:21 PM, Tomas Vondra t...@fuzzy.cz wrote:
 I'm not awfully familiar with the GIN code, but based on Alexander's
 feedback I presume fixing the GIN length limit (or rather removing it,
 as it's a feature, not a bug) is quite straightforward. Why not to at
 least consider that for 9.4, unless it turns more complex than expected?

 Don't get me wrong - I'm aware it's quite late in the last commitfest,
 and if it's deemed unacceptable / endangering the 9.4 release, I'm not going
 to say a word. But if it's a simple patch ...
 
 Well, I think the bigger picture is that the cases where we're getting
 this error are ones where we're expecting too much from the GIN
 opclass. It's trying to index entire json objects as individual
 values, which isn't really very useful. We're unlikely to go querying
 for rows where the value of a given key is a specific json object.

Stupid question - so if I have a json like this:

  { "a" : { "b" : "c"}}

the GIN code indexes {"b" : "c"} as a single value? And then takes "c"
and indexes it as a single value too?

Because otherwise I don't understand how the index could be used for
queries with @> '{"a" : {"b" : "c"}}' conditions (i.e. path [a,b] with
value "c").

Hmmm, if that's how it works, removing the size limit would certainly be
more difficult than I thought.


 As I understand it Peter's right that in its current form the GIN
 opclass is only useful if you use it on an expression index on
 specific pieces of your json which are traditional non-nested hash
 tables. Or I suppose if you're really only concerned with the ?
 operator which looks for keys, which is pretty common too.

Well, depends on how you define "useful". With the sample dataset
'delicious' (see Peter's post) I can do this:

   SELECT doc FROM delicious
  WHERE doc @> '{"title_detail" : {"value" : "TheaterMania"}}';

with arbitrary paths, and I may create a GIN index to support such
queries. And yes, it's much faster than GiST for example (by a factor of
1000).

Yes, the GIN index is quite large (~560MB for a ~1.2GB table).
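
Sizes checked with something like this (a sketch, not the exact commands):

  SELECT pg_size_pretty(pg_relation_size('delicious'))     AS heap,
         pg_size_pretty(pg_relation_size('delicious_idx')) AS gin_index;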


 I had in mind that the GIN opclass would do something clever like
 decompose the json into all the path-value tuples so I could do
 arbitrary path lookups for values. That might be possible in the
 future but it's not what we have today and what we have today is
 already better than hstore. I think we're better off committing this
 and moving forward with the contrib hstore2 wrapper which uses this
 infrastructure so people have a migration path.

Yes, it's better than hstore - no doubt about that. The hierarchy and
data types are great, and hstore has the same size limitation.

 I don't think Josh is right to say it'll be fixed in 9.5. It'll be 
 better in 9.5 because we have ambitious plans to continue
 improving in this direction. But it'll be even better in 9.6 and
 better again in 9.7. It'll never be fixed.

I don't dare to say what will be in 9.5 (not even thinking about the
following versions).

Assuming the GIN will remain for 9.4 as it is now (both opclasses), it
would be nice if we could improve this in 9.5. I can live with custom
opclasses in an extension, if there are some ...

regards
Tomas





Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Josh Berkus
On 03/14/2014 06:44 PM, Tomas Vondra wrote:
 Stupid question - so if I have a json like this:

Not a stupid question, actually.  In fact, I expect to answer it 400 or
500 times over the lifespan of 9.4.

   { "a" : { "b" : "c"}}
 
 the GIN code indexes {"b" : "c"} as a single value? And then takes "c"
 and indexes it as a single value too?

I don't know that "c" is indexed separately.

 Because otherwise I don't understand how the index could be used for
 queries with @> '{"a" : {"b" : "c"}}' conditions (i.e. path [a,b] with
 value "c").
 
 H, if that's how it works, removing the size limit would be
 certainly more difficult than I thought.

Precisely.  Hence, the Russian plans for VODKA.

 Well, depends on how you define "useful". With the sample dataset
 'delicious' (see Peter's post) I can do this:
 
SELECT doc FROM delicious
   WHERE doc @> '{"title_detail" : {"value" : "TheaterMania"}}';
 
 with arbitrary paths, and I may create a GIN index to support such
 queries. And yes, it's much faster than GiST for example (by a factor of
 1000).
 
 Yes, the GIN index is quite large (~560MB for a ~1.2GB table).

State of the art, actually.  In MongoDB, the indexes are frequently
several times larger than the raw data.  So if ours are 50% the size,
we're doing pretty good.

On 15.3.2014 02:03, Greg Stark wrote:
 I don't think Josh is right to say it'll be fixed in 9.5. It'll be 
 better in 9.5 because we have ambitious plans to continue
 improving in this direction. But it'll be even better in 9.6 and
 better again in 9.7. It'll never be fixed.

Oh, no doubt.  The important thing is that 9.4 will significantly
broaden the class of applications for which our JSON support is useful,
and allow us to remain relevant to an increasingly NoSQLish developer
base.  We're both showing progress and delivering features which are
actually useful, even if they still have major limitations.

Plus, you know, those features are useful to *me*, so I'm keen on them
personally.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] jsonb and nested hstore

2014-03-14 Thread Peter Geoghegan
On Fri, Mar 14, 2014 at 6:44 PM, Tomas Vondra t...@fuzzy.cz wrote:
 Well, depends on how you define "useful". With the sample dataset
 'delicious' (see Peter's post) I can do this:

SELECT doc FROM delicious
   WHERE doc @> '{"title_detail" : {"value" : "TheaterMania"}}';

 with arbitrary paths, and I may create a GIN index to support such
 queries. And yes, it's much faster than GiST for example (by a factor of
 1000).

If you know ahead of time the entire nested value, you can. So, if you
attach some other data to the TheaterMania document, you had better
know that too if you hope to write a query like this. You also have to
index the entire table, where presumably with a little thought you
could get away with a much smaller index. That strikes me as not very
useful.
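
For instance, something like this sketch, assuming the query pattern is
known up front (names invented):

  -- Index only the sub-document that actually gets queried.
  CREATE INDEX delicious_title_detail_idx ON delicious
      USING GIN ((doc -> 'title_detail'));

  -- Its footprint should come out far smaller than a whole-document
  -- index like delicious_idx.
  SELECT pg_size_pretty(pg_relation_size('delicious_title_detail_idx')),
         pg_size_pretty(pg_relation_size('delicious_idx'));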

 Yes, the GIN index is quite large (~560MB for a ~1.2GB table).

With the default opclass, without an expressional index, 100% of the
data from the table appears in the index. Why do you think that's
quite large?

-- 
Peter Geoghegan

