Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
Tom Lane-2 wrote:
> Unfortunately, while testing it I noticed that there's a potentially fatal
> backwards-compatibility problem, namely that the COPY n status gets printed
> on stdout, which is the same place that COPY OUT data is going. While this
> isn't such a big problem for interactive use, usages like this one are
> pretty popular:
>
>     psql -c 'copy mytable to stdout' mydatabase | some-program
>
> With the patch, COPY n gets included in the data sent to some-program,
> which never happened before and is surely not what the user wants. The
> same if the -c string uses \copy.
>
> There are several things we could do about this:
>
> 1. Treat this as a non-backwards-compatible change, and document that
> people have to use -q if they don't want the COPY tag in the output. I'm
> not sure this is acceptable.

I've mostly used COPY TO with files, and so wouldn't mind if STDOUT had the COPY n sent to it, as long as the target file contains just the copy contents.

> 2. Kluge ProcessResult so that it continues to not pass back a PGresult
> for the COPY TO STDOUT case, or does so only in limited circumstances
> (perhaps only if isatty(stdout), for instance). The main problem with this
> is that people will test by sending output to a TTY and see the COPY n.

Although if it can be done consistently, then you minimize backward incompatibility and encourage people to enforce quiet mode while the command runs...

> 3. Modify PrintQueryStatus so that command status goes to stderr not
> stdout. While this is probably how it should've been done in the first
> place, this would be a far more severe compatibility break than #1. (For
> one thing, there are probably scripts out there that think that any output
> to stderr is an error message.) I'm afraid this one is definitely not
> acceptable, though it would be by far the cleanest solution were it not
> for compatibility concerns.

Yes, it's a moot point, but I'm not sure it would be best anyway.

> 4. As #3, but print the command status to stderr only if it's COPY n,
> otherwise to stdout. This is a smaller compatibility break than #3, but
> still a break, since COPY status was formerly issued to stdout in non
> TO STDOUT/FROM STDIN cases. (Note that PrintQueryStatus can't tell whether
> it was COPY TO STDOUT rather than any other kind of COPY; if we want that
> to factor into the behavior, we need ProcessResult to do it.)

Since we are considering stderr, my (admittedly inexperienced) gut says that using stderr for this is generally undesirable, especially given our existing precedent. stdout is typically the correct target, and the existing quiet-mode toggle provides sufficient control for typical needs.

> 5. Give up on the print-the-tag aspect of the change, and just fix the
> wrong-line-number issue (so we'd still introduce the copyStream variable,
> but not change how PGresults are passed around).
>
> I'm inclined to think #2 is the best answer if we can't stomach #1. But
> the exact rule for when to print a COPY OUT result probably still requires
> some debate. Or maybe someone has another idea?
>
> Also, I'm thinking we should back-patch the aspects of the patch needed to
> fix the wrong-line-number issue. That appears to have been introduced in
> 9.2; older versions of PG get the above example right.
>
> Comments?

I'd like COPY TO anything but STDOUT to emit a COPY n on STDOUT - unless suppressed by -q(uiet). Document that COPY TO STDOUT does not emit COPY n, because STDOUT is already assigned for data and so is not available for notifications. Since COPY is more typically used for ETL than a bare SELECT, in addition to the backward-compatibility concerns, this default behavior seems reasonable.

Would it be possible to store the n somewhere and provide a command - like GET DIAGNOSTICS in pl/pgsql - if the user really wants to know how many rows were sent to STDOUT? I doubt this is even useful in the typical use-case for COPY TO STDOUT, but figured I'd toss the idea out there.

Is there anything, besides a desire for consistency, that anyone has or can put forth as a use-case for COPY TO STDOUT emitting COPY n on STDOUT as well? If you are going to view the content inline, and also want a quick count, ISTM you would be more likely to use SELECT to take advantage of all its pretty-print features. If we really need to cater to this use, then maybe a --loud-copy-to-stdout switch can be provided to override its default quiet mode.

David J.

--
View this message in context: http://postgresql.1045698.n5.nabble.com/COPY-table-FROM-STDIN-doesn-t-show-count-tag-tp5775018p5795611.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
2014-03-12 7:10 GMT+01:00 David Johnston pol...@yahoo.com:

> [... quoted message trimmed; it repeats the message above in full ...]
>
> I'd like COPY TO anything but STDOUT to emit a COPY n on STDOUT - unless
> suppressed by -q(uiet).

+1

This information can be really interesting, and sometimes important, when people have no idea how long their tables are.

Regards
Pavel
Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)
Thanks for your efforts!

>                   Head     patched   Diff
> Select - 500K     772ms    2659ms    -200%
> Insert - 400K     3429ms   1948ms    43%   (I am not sure how it improved in this case)
> delete - 200K     2066ms   3978ms    -92%
> update - 200K     3915ms   5899ms    -50%
>
> This patch shows how the custom scan can be used very well, but the patch
> itself has some performance problems which need to be investigated. I
> attached the test script file used for the performance test.

First of all, it seems to me your test case has too small a data set, one that allows all the data to be held in memory - roughly, 500K records of 200 bytes will consume about 100MB. Your configuration allocates 512MB of shared_buffers, and about 3GB of OS-level page cache is available. (Note that Linux uses free memory as disk cache adaptively.) This cache is designed to hide the latency of disk accesses, so this test case does not fit its intention. (Also, the primary purpose of this module is a demonstration of heap_page_prune_hook to hook vacuuming, so simple code was preferred over a more complicated implementation with better performance.)

I could reproduce the overall trend: the non-cached scan is faster than the cached scan if the buffer is in memory. Probably, it comes from the cost of walking down the T-tree index using the ctid per reference. The performance penalty around UPDATE and DELETE likely comes from trigger invocation per row. I could observe a small performance gain on INSERT. It's strange for me, also. :-(

On the other hand, the discussion around the custom-plan interface affects this module because it uses that API as its foundation. Please wait a few days for me to rebase the cache_scan module onto the newer custom-plan interface that I submitted just a moment ago.

Also, is it really necessary to tune the performance in this example module of the heap_page_prune_hook? Even though I have a few ideas to improve the cache performance, like insertion of multiple rows at once or a local chunk copy instead of a T-tree walk-down, I'm not sure whether it is productive in the current v9.4 timeframe. ;-(

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com

-----Original Message-----
From: Haribabu Kommi
Sent: Wednesday, March 12, 2014 1:14 PM
To: Kohei KaiGai
Subject: Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

> I ran the performance test on the cache scan patch; the readings are
> quoted at the top of this message.
>
> Configuration:
>   shared_buffers          - 512MB
>   cache_scan.num_blocks   - 600
>   checkpoint_segments     - 255
>
> Machine:
>   OS     - CentOS 6.4
>   CPU    - 4 core, 2.5 GHz
>   Memory - 4GB
>
> [... remainder trimmed; it repeats the table and summary quoted above ...]

Regards,
Hari Babu
Fujitsu Australia
Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)
On Wed, Mar 12, 2014 at 5:26 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
> First of all, it seems to me your test case has too small a data set, one
> that allows all the data to be held in memory - roughly, 500K records of
> 200 bytes will consume about 100MB. Your configuration allocates 512MB of
> shared_buffers, and about 3GB of OS-level page cache is available. (Note
> that Linux uses free memory as disk cache adaptively.)
> [...]

Thanks for the information, and a small correction: the total number of records is 5 million. The select operation selects 500K records. The total table size is around 1GB.

Once I get your new patch re-based on the custom scan patch, I will test the performance again by increasing my database size beyond the RAM size. I will also make sure that the memory available for disk cache is smaller.

Regards,
Hari Babu
Fujitsu Australia
Re: [HACKERS] issue log message to suggest VACUUM FULL if a table is nearly empty
On Tue, Mar 11, 2014 at 2:59 PM, Amit Kapila amit.kapil...@gmail.com wrote:
> On Mon, Mar 10, 2014 at 1:13 PM, Haribabu Kommi kommi.harib...@gmail.com wrote:
>> On Mon, Mar 10, 2014 at 4:24 PM, Amit Kapila amit.kapil...@gmail.com wrote:
>>> On Mon, Mar 10, 2014 at 5:58 AM, Wang, Jing ji...@fast.au.fujitsu.com wrote:
>>>> Enclosed is the patch to implement the requirement to issue a log
>>>> message suggesting VACUUM FULL if a table is nearly empty. The
>>>> requirement comes from the PostgreSQL TODO list.
>>>
>>> I think it would be better if we can use some existing stats to issue the
>>> warning message rather than traversing the FSM for all pages. For
>>> example, after vacuuming a page in lazy_scan_heap(), we update the free
>>> space for the page. You can refer to the below line in lazy_scan_heap():
>>>
>>>     freespace = PageGetHeapFreeSpace(page);
>>>
>>> Now it might be possible that we might not get the freespace info easily,
>>> as it is not accumulated for previous vacuums. In case there is no viable
>>> way to get it through vacuum stats, we are already updating the FSM after
>>> vacuum by FreeSpaceMapVacuum(), where I think it should be possible to
>>> get the freespace.
>>
>> Yes, this way it works without extra penalty. But the problem is how to
>> calculate the free space which is left in the skipped pages because of the
>> visibility bit.
>
> One way could be by extrapolating (vac_estimate_reltuples) like we do for
> some other stats, but I am not sure if we can get correct estimates. The
> main reason is that if you observe that code path, all the decisions are
> mainly done on the basis of vacrelstats. I have not checked in detail
> whether this purpose can be achieved by using any other stats; maybe you
> can look into it once.

I checked the vac_estimate_reltuples() function, but was not able to find a proper way to identify the free space.

> By the way, have you checked whether FreeSpaceMapVacuum() can serve your
> purpose? This call already traverses the FSM in depth-first order to
> update the freespace. So maybe this call, or a wrapper on it that also
> returns the total freespace apart from updating it, can serve the need.

Thanks for the information. We can get the table free space by writing some wrapper or by modifying the FreeSpaceMapVacuum() function a little. This way it will not add any extra overhead in identifying whether the table is almost empty or not.

Regards,
Hari Babu
Fujitsu Australia
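For a rough idea of the statistic being discussed, the same "how empty is this table" number can be prototyped from SQL with the stock pg_freespacemap contrib module. This is only a sketch of the concept (the table name my_table is a placeholder), not the mechanism the patch itself would use, which would live inside vacuum:

```sql
-- Estimate how empty a table is by summing per-page free space from the FSM.
-- Requires the pg_freespacemap contrib extension; 'my_table' is a placeholder.
CREATE EXTENSION IF NOT EXISTS pg_freespacemap;

SELECT sum(avail)                   AS fsm_free_bytes,
       pg_relation_size('my_table') AS table_bytes,
       round(100.0 * sum(avail) / nullif(pg_relation_size('my_table'), 0), 1)
           AS pct_free
FROM pg_freespace('my_table');
```

Note that, as discussed above, the FSM only reflects pages vacuum actually visited, so pages skipped via the visibility map are not accounted for.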
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On Tue, Mar 11, 2014 at 2:32 PM, Christian Kruse christ...@2ndquadrant.com wrote:
> On 11/03/14 13:23, Amit Kapila wrote:
>> Could you please once check (if you are comfortable doing so) wherever
>> this patch is passing the tuple, whether it is okay to pass it based on
>> visibility rules; else I will verify it once again.
>
> I think I got all places, but it would be nice to have a confirmation.

How about in IndexBuildHeapScan()? This also uses SnapshotAny to scan the heap data. Normally in this function, passing the tuple to the lock routine should not be a problem, as an exclusive lock on the relation is ensured before reaching this stage, but the below comment in this function leads me to think that there can be a problem during a system catalog scan.

/*
 * Since caller should hold ShareLock or better, normally
 * the only way to see this is if it was inserted earlier
 * in our own transaction. However, it can happen in
 * system catalogs, since we tend to release write lock
 * before commit there. Give a warning if neither case
 * applies.
 */

I could not immediately think of a test case which can validate it, but this is certainly a point to ponder. Do you think it is safe to pass the tuple to XactLockTableWaitWithInfo() in this function?

I think the other things in your patch are now good; just these tuple visibility validations are tricky, and it is taking time to validate the paths, because these get called in nested paths where in a few cases even dirty or SnapshotAny scans seem to be safe w.r.t. displaying the tuple. Anyway, today I have checked most paths; maybe I will take one more look with a fresh mind and then pass it on to a committer.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] db_user_namespace a temporary measure
On Mar 12, 2014 1:46 AM, Josh Berkus j...@agliodbs.com wrote:
> On 03/11/2014 06:57 AM, Tom Lane wrote:
>> Mind you, I wouldn't be unhappy to see it go away; it's a kluge and
>> always has been. I'm just expecting lots of push-back if we try. And it's
>> kind of hard to resist push-back when you don't have a substitute to
>> offer.
>
> Yeah, what we really need is encapsulated per-DB users and local
> superusers. I think everyone agrees that this is the goal, but nobody
> wants to put in the work to implement a generalized solution.

Encapsulated would probably be the doable part. But local superuser? Given that a superuser can load and run binaries, how would you propose to restrict that superuser from doing anything they want? And if you don't need that functionality, then how is it really different from being the database owner?

/Magnus
Re: [HACKERS] The case against multixact GUCs
Josh Berkus wrote:
> What makes these GUCs worse is that nobody knows how to set them; nobody
> on this list and nobody in the field. Heck, I doubt 1 in 1000 of our users
> (or 1 in 10 people on this list) know what a multixact *is*.

I won't contend your first statement, but multixacts are explained in the documentation:
http://www.postgresql.org/docs/9.3/static/routine-vacuuming.html#VACUUM-FOR-MULTIXACT-WRAPAROUND

Yours,
Laurenz Albe
Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()
On 03/10/2014 09:44 PM, Robert Haas wrote:
> On Mon, Apr 30, 2012 at 1:34 PM, Noah Misch n...@leadboat.com wrote:
>> When GIN changes a metapage, we WAL-log its ex-header content and never
>> use a backup block. This reduces WAL volume since the vast majority of
>> the metapage is unused. However, ginRedoUpdateMetapage() only restores
>> the WAL-logged content if the metapage LSN predates the WAL record LSN.
>> If a metapage write tore and updated the LSN but not the other content,
>> we would fail to complete the update. Instead, unconditionally
>> reinitialize the metapage similar to how _bt_restore_meta() handles the
>> situation.
>>
>> I found this problem by code reading and did not attempt to build a test
>> case illustrating its practical consequences. It's possible that there's
>> no problem in practice on account of some reason I haven't contemplated.
>
> The attached patch doesn't apply any more, but it looks like this issue
> still exists.

Fixed.

- Heikki
Re: [HACKERS] [bug fix] pg_ctl always uses the same event source
From: Alvaro Herrera alvhe...@2ndquadrant.com
> MauMau escribió:
>> Do you know how I can reply to an email which was deleted locally? I
>> thought I could download an old mail by clicking the raw link and
>> importing it into the mailer. However, it requires username/password
>> input, and it seems to be different from the one for editing the
>> CommitFest. I couldn't find how to authenticate myself.
>
> The box that asks for the password tells you what the user/password is. I
> think it's something like archives/archives or similar. The password is
> there only to keep spammers out, not to have any real auth.

Thank you, the user/password was certainly displayed in the box -- archives/antispam.

The raw link only gave the mail in text format. I hoped to import the mail into Windows Mail on Windows Vista, but I couldn't.

Regards
MauMau
Re: [HACKERS] Is SPI safe to use in multi-threaded PL/Java?
From: Tom Lane t...@sss.pgh.pa.us
> That, more or less. There is exactly zero provision in the Postgres code
> for multiple threads to exist inside a backend process. It's possible
> that PL/Java manages to completely insulate the Java world from the C
> world, so that the C code never sees more than one thread. But any
> leakage at all in that abstraction is probably going to cause bugs; and
> as I said, we (PG hackers) are not going to consider such bugs to be our
> problem.
>
> On platforms where the standard libc supports threading (which is most,
> these days), I'd be particularly worried about leakage along the path
> java -> libc -> postgres. If libc becomes aware that there are multiple
> threads executing inside the process, it's likely to change behaviors.

I see... even Tom-san is suspicious about PL/Java's design, or the use of SPI from code linked with libpthread.so. I'll communicate this to the PL/Java community.

Regards
MauMau
Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()
On Wed, Mar 12, 2014 at 4:23 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote:
>> The attached patch doesn't apply any more, but it looks like this issue
>> still exists.
>
> Fixed.

Did you forget to push?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Torn page hazard in ginRedoUpdateMetapage()
On 03/12/2014 02:05 PM, Robert Haas wrote:
> On Wed, Mar 12, 2014 at 4:23 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote:
>>> The attached patch doesn't apply any more, but it looks like this issue
>>> still exists.
>>
>> Fixed.
>
> Did you forget to push?

Yep. Pushed now.

- Heikki
Re: [HACKERS] [PATCH] Store Extension Options
On Tue, Mar 11, 2014 at 8:42 PM, Simon Riggs si...@2ndquadrant.com wrote:
> On 11 March 2014 18:33, Tom Lane t...@sss.pgh.pa.us wrote:
>> Simon Riggs si...@2ndquadrant.com writes:
>>> -1 to *requiring* validation for table-level options for exactly the
>>> same reasons we no longer validate custom GUCs.
>>
>> Well, that is an interesting analogy, but I'm not sure how much it
>> applies here. In the case of a GUC, you can fairly easily validate it
>> once the module does get loaded (and before the module actually tries to
>> do anything with it). I don't see how that's going to work for table
>> options. I trust nobody is seriously proposing that on module load,
>> we're going to scan the whole of pg_class looking to see if there are
>> incorrect settings. (Even if we did, what would we do about it? Not try
>> to force a pg_class update, for sure. And what if the module is loading
>> into the postmaster thanks to a preload spec?)
>
> Thank goodness for that. Strict validation does seem scary.
>
>> I don't really think partial validation makes sense. We could just
>> remove the whole topic, and tell extension authors that it's up to them
>> to defend themselves against bizarre values stored for their table
>> options. But I'm wondering if there's really so much use-case for a
>> feature like that. DBAs are fairly used to the idea that if you put crap
>> data in the database then bad things happen.
>
> We provide the table, they provide the data. Validation is possible, but
> not enforced as essential. (Except in terms of the datatype - but then we
> are also validating data to specific types here.) So I think that DBAs
> will also cope rather well with table-level options without us nannying
> them.
>
> There is nothing more annoying than needing to run scripts in a specific
> sequence to make them work, or dumps that fail because certain modules
> aren't loaded yet (or cannot ever be so). And maybe the DBA wants to
> annotate tables based on a design and then later move to implement
> modules that take advantage of the annotation.
>
> Having an option be set and yet be unvalidated and/or unused is no more
> annoying than having a column in a table that is known incorrect and/or
> not accessed. Searching for badly set options needs to be possible, even
> easy, but hard validation can cause problems. And if we try and force it,
> what's to stop people from using a dummy validator just to circumvent the
> strictness?

Then I think my patch is more in line with these conclusions, apart from the adjustments suggested by Tom Lane and mentioned by Alvaro Herrera [1]. Am I correct?

[1] http://www.postgresql.org/message-id/20140307205649.gf4...@eldon.alvh.no-ip.org

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
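For concreteness, the kind of usage being debated looks roughly like this under the proposed patch. The syntax, extension namespace, and option name here are purely illustrative assumptions, not committed behavior:

```sql
-- Illustrative only: a table-level option in an extension's namespace,
-- stored even though nothing validates it until the extension is loaded.
-- Both 'my_ext' and 'compression_level' are hypothetical.
ALTER TABLE measurements SET (my_ext.compression_level = '4');

-- A DBA can later search for badly set options via pg_class.reloptions:
SELECT relname, reloptions
FROM pg_class
WHERE reloptions::text LIKE '%my_ext.%';
```

This also shows why "searching for badly set options needs to be possible": the stored options are visible in the catalog regardless of whether any module ever validates them.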
Re: [HACKERS] db_user_namespace a temporary measure
Tom Lane wrote:
> Andrew Dunstan and...@dunslane.net writes:
>> On 03/11/2014 09:37 PM, Tom Lane wrote:
>>> In particular, I'd like to see an exclusion that prevents local users
>>> from having the same name as any global user, so that we don't have
>>> ambiguity in GRANT and similar commands. This doesn't seem simple to
>>> enforce (if we supported partial indexes on system catalogs, it would
>>> be ...) but surely this representation is more amenable to enforcing it
>>> than the existing one.
>>
>> Should be workable if you're creating a local name - just check against
>> the list of global roles. Concurrent creations won't be safe without
>> some sort of locking scheme.
>
> A unique index would be a lot better way of plugging that hole than a
> system-wide lock on user creation. But not sure how to define a unique
> index that allows (joe, db1) to coexist with (joe, db2) but not with
> (joe, 0).

Isn't this just a case of creating a suitable operator and an exclusion constraint? Defining the constraint in BKI might require extra infrastructure, but it should be possible.

--
Álvaro Herrera    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
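One way to get the (joe, db1)/(joe, db2)/(joe, 0) semantics on an ordinary table - purely a sketch of the exclusion-constraint idea, not something that works on pg_authid - is to model the database scope as a range, so the built-in range overlap operator does the conflicting. It needs the btree_gist contrib module for the equality part:

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Sketch: a per-database user is a single-point range, a global user is the
-- unbounded range. Overlapping ranges then conflict exactly as desired.
CREATE TABLE app_role (
    rolname text NOT NULL,
    dbids   int4range NOT NULL,  -- [db,db] for local, '(,)' for global
    EXCLUDE USING gist (rolname WITH =, dbids WITH &&)
);

INSERT INTO app_role VALUES ('joe', int4range(1, 1, '[]'));  -- joe in db1
INSERT INTO app_role VALUES ('joe', int4range(2, 2, '[]'));  -- joe in db2: allowed
INSERT INTO app_role VALUES ('joe', '(,)');                  -- global joe: rejected
```

A custom "conflicts" operator with its own GiST opclass, as Alvaro suggests, would achieve the same thing without the range encoding; the catalog/BKI side is the harder part either way.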
[HACKERS] git-review: linking commits to review discussion in git
From: Murtuza Mukadam murtuza.i.muka...@gmail.com Date: Sat, Mar 8, 2014 at 1:56 PM Subject: Re: [HACKERS] git-review: linking commits to review discussion in git To: Heikki Linnakangas hlinnakan...@vmware.com Cc: pgsql-hackers@postgresql.org Hi Heikki We have linked git commits and reviews in a web interface. If you enter a commit hash, you will be redirected to the email archive of the peer review discussion: http://cesel.encs.concordia.ca/git-reviewed-tracker.php This work is part of my thesis, so feedback is much appreciated. If you have another git repo and mailing lists that you'd like linked, please let us know. Cheers, Murtuza How do we do the linking? We take each email patch, eliminate white space and hash each line. We then compare the lines with those in commits to the same files. The commit that changes the same files and has the largest number of matching lines is considered to be the reviewed commit. On Tue, Jan 28, 2014 at 2:10 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 01/27/2014 11:36 PM, Murtuza Mukadam wrote: Hello All, We have linked peer review discussions on 'pgsql-hackers' to their respective commits within the main postgresql.git repository. You can view the linked reviews from 2012 until present in the GitHub repo at https://github.com/mmukadam/postgres/tree/review If you want to work with these reviews locally, you can use our git-review tool. It allows you to create reviews and attach them to commits in git. We didn't modify git, instead we added some scripts that use standard git commands. git-review is beta, but since it only adds a detached 'review' branch and modifies the contents of this branch, it has minimal impact and can easily be removed by deleting the 'review' branch and scripts. 
The online man-page is here: http://users.encs.concordia.ca/~m_mukada/git-review.html In order to install git-review, you need to clone the repository: https://github.com/mmukadam/git-review.git The online tutorial is available here: http://users.encs.concordia.ca/~m_mukada/git-review-tutorial.html The clone of postgresql.git with linked review discussion is here (new review discussion are linked nightly) https://github.com/mmukadam/postgres This work is part of my Master's thesis. If you'd like us to change the tool to better suit your review process, have another git repo you'd like us to link commits with review discussion, or have other feedback, please let us know. I don't understand what this does. The repository at https://github.com/mmukadam/postgres looks like just a clone of the main PostgreSQL repository, with no extra links anywhere. And the repository at https://github.com/mmukadam/postgres/tree/review looks like a mailing list archive turned into a git repository, but I don't see any links to the commits in the main repository there. Am I missing something? - Heikki
[HACKERS] Rowtype column and domain subfield with DEFAULT and NOT NULL constraint
Hi, A customer has reported to us a strange behaviour regarding a rowtype column with a domain subfield:

test=# CREATE DOMAIN my_int_not_null_1 AS INTEGER DEFAULT 1 NOT NULL;
CREATE DOMAIN
test=# CREATE TYPE my_int_rowtype AS (
test(#   f1 INTEGER,
test(#   f2 my_int_not_null_1
test(# );
CREATE TYPE
test=# CREATE TABLE test (id SERIAL, col1 my_int_rowtype);
CREATE TABLE
test=# INSERT INTO test (col1.f1) VALUES (1);
INSERT 0 1
test=# INSERT INTO test (id, col1.f1) VALUES (2, 1);
INSERT 0 1
test=# INSERT INTO test (col1) VALUES ((1,NULL));
ERROR:  domain my_int_not_null_1 does not allow null values
test=# SELECT * FROM test;
 id | col1
----+------
  1 | (1,)
  2 | (1,)
(2 rows)

It seems:
- the DEFAULT value (from the domain) is not inserted
- the NOT NULL constraint is not applied, except if we explicitly set the value to NULL

It looks like the same issue as before: when there is no DEFAULT, the parser/rewriter should set the column/subfield to NULL. Is build_column_default() the right place to handle that case? -- Julien Tachoires http://dalibo.com - http://dalibo.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12 Březen 2014, 0:41, Peter Geoghegan wrote: On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote: ERROR: index row size 1416 exceeds maximum 1352 for index gin_idx All index AMs have similar restrictions.

Yes, I know, and I have no problem with restrictions in general. You may run into similar issues with btree indexes on text columns with long text, for example. The thing is that people don't generally index long text directly, because it usually does not make much sense, but index it via tsvector etc. But with jsonb it's more likely, because indexing is one of the goodies (at least for me). And the discussions I had recently with several people interested in storing json data often went like this:

me: It seems we'll have a better json datatype in 9.4.
them: Nice!
me: And it will be possible to do searches on arbitrary keys.
them: Yay!
me: And we actually got pretty significant improvements in GIN indexes.
them: Awesome!
me: But the values you may index need to be less than ~1500B.
them: Bummer :-(
me: Well, you can use GiST then.

A good example of such a header is dkim-signature, which basically contains the whole message digitally signed with DKIM. The signature tends to be long and, being a signature, non-compressible. I'm wondering what's the best way around this, because I suspect many new users (especially those attracted by jsonb and the GIN improvements) will run into this. Maybe not immediately, but eventually they'll try to insert a jsonb with a long value, and it will fail ...

The jsonb_hash_ops operator class just stores a 32-bit integer hash value (it always sets the recheck flag, which only some of the other default GIN opclass' strategies do). It only supports containment, and not the full variety of operators that the default opclass supports, which is why it isn't the default. I think that in practice the general recommendation will be that when indexing at the top level, use jsonb_hash_ops. 
When indexing nested items, use the more flexible default GIN opclass. That seems like a pretty smart trade-off to me.

OK, I'll look into jsonb_hash_ops - that sounds more or less like what I was thinking about (and sure, storing hashes makes some operations impossible to support). The other thing I was thinking about is introducing some kind of upper limit on the value length - e.g. index just the first 1kB, or something like that. My experience is that most values are way shorter, or actually differ within the first 1kB, so this should allow most decisions to be made. But I'm not really that familiar with how GIN works, so maybe this is nonsense.

The more I think about it, the more inclined I am to lose GiST support entirely for the time being. It lets us throw out about 700 lines of C code, which is a very significant fraction of the total, removes the one open bug, and removes the least understood part of the code. The GiST opclass is not particularly compelling for this.

I disagree with that. I see GiST as a simple fallback option for the cases I described. I wasn't able to create a GIN index because of exceeding the max item length, but the GiST index was created just fine. It was considerably slower, but it worked.

Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] db_user_namespace a temporary measure
On 2014-03-12 10:03:42 -0300, Alvaro Herrera wrote: Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 03/11/2014 09:37 PM, Tom Lane wrote: In particular, I'd like to see an exclusion that prevents local users from having the same name as any global user, so that we don't have ambiguity in GRANT and similar commands. This doesn't seem simple to enforce (if we supported partial indexes on system catalogs, it would be ...) but surely this representation is more amenable to enforcing it than the existing one.

Should be workable if you're creating a local name - just check against the list of global roles. Concurrent creations won't be safe without some sort of locking scheme.

A unique index would be a lot better way of plugging that hole than a system-wide lock on user creation. But I'm not sure how to define a unique index that allows (joe, db1) to coexist with (joe, db2) but not with (joe, 0).

Isn't this just a case of creating a suitable operator and an exclusion constraint? Defining the constraint in BKI might require extra infrastructure, but it should be possible.

Except that we don't have the infrastructure to perform such checks (neither partial nor expression indexes, nor exclusion constraints) on system tables atm. So it's not an entirely trivial thing to do.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
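For what it's worth, the conflict rule under discussion - a (name, db) pair collides with the same name in the same database, or with a global user of that name (represented here as db = 0) - is easy to state outside SQL. Here is a toy Python model, with invented names, just to pin down the semantics such an exclusion constraint would have to enforce; it is not a proposal for the catalog implementation.

```python
def conflicts(a, b):
    """a, b are (name, db) pairs; db == 0 denotes a global role.
    Two roles conflict when they share a name and either one is
    global, or both live in the same database."""
    name_a, db_a = a
    name_b, db_b = b
    if name_a != name_b:
        return False
    return db_a == 0 or db_b == 0 or db_a == db_b
```

Under this rule (joe, db1) and (joe, db2) coexist, but neither can coexist with a global (joe, 0) - exactly the behavior Tom asked for, and the part that a plain unique index cannot express.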
Re: [HACKERS] jsonb and nested hstore
On 12 Březen 2014, 0:51, Peter Geoghegan wrote: On Tue, Mar 11, 2014 at 4:41 PM, Peter Geoghegan p...@heroku.com wrote: I think that in practice the general recommendation will be that when indexing at the top level, use jsonb_hash_ops. When indexing nested items, use the more flexible default GIN opclass. That seems like a pretty smart trade-off to me. By which I mean: index nested items using an expressional GIN index.

I'm still not sure how that would look. Does that mean I'd have to create multiple GIN indexes - one for each possible key, or something like that? Can you give an example?

regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Postgresql XML parsing
Hi, I am working on adding functionality to PostgreSQL. I need to parse the XML-format query plan (produced by PostgreSQL v9.3) and save it in a simple data structure (say a C structure). I was wondering if PostgreSQL already had any parsing functions implemented that I can use to do the XML parsing. This is getting difficult, as I was not able to find any DTD or XML Schema for the XML files generated by PostgreSQL. I found the files xpath.c/xslt_proc.c that contain parsing-related code, but none of the functions are being called for any xml-related query I issue to the database, and some of the functions in those files are marked as deprecated. It would be greatly helpful if someone could guide me on this. Thank you. -- Regards, Ashoke
Re: [HACKERS] Postgresql XML parsing
On 03/12/2014 09:36 AM, Ashoke wrote: Hi, I am working on adding a functionality to PostgreSQL. I need to parse the XML format query plan (produced by PostgreSQL v9.3) and save it in a simple data structure (say C structure). I was wondering if PostgreSQL already had any parsing functions implemented that I can use to do the XML parsing. This is getting difficult as I was not able to get any DTD or XML Schema for the XML files generated by PostgreSQL. I found the files xpath.c/xslt_proc.c files that contain parsing related code, but none of the functions are being called for any xml related query I issue to the database and some of the functions in those files mention as deprecated. It would be greatly helpful if someone could guide me on this. The only XML parsing we have is where Postgres is built with libxml, in which case we use its parser. But query plan XML is delivered to a client (or a log file, which means more or less the same thing here). If you want to parse it then it should be parsed in the client - that's why we provide it. Inside postgres I don't see a point in parsing the XML rather than handling the query plan directly. The worst possible option would be to make a hand-cut XML parser, either in the client or the server - XML parsing has all sorts of wrinkles that can bite you badly. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
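To illustrate the "parse it in the client" advice: the plan emitted by EXPLAIN (FORMAT XML) is an ordinary XML document in the http://www.postgresql.org/2009/explain namespace, so any stock XML library can load it into a native structure. A minimal Python sketch using only the standard library follows; the embedded sample document is hand-written for illustration (real output comes from the server and contains many more fields), and parse_plan is an invented helper, not a PostgreSQL API.

```python
import xml.etree.ElementTree as ET

NS = {"e": "http://www.postgresql.org/2009/explain"}

# Hand-written miniature of what EXPLAIN (FORMAT XML) emits.
plan_xml = """<explain xmlns="http://www.postgresql.org/2009/explain">
  <Query>
    <Plan>
      <Node-Type>Seq Scan</Node-Type>
      <Relation-Name>test</Relation-Name>
      <Total-Cost>458.00</Total-Cost>
    </Plan>
  </Query>
</explain>"""

def parse_plan(xml_text):
    """Pull a few fields of the top plan node into a plain dict."""
    root = ET.fromstring(xml_text)
    plan = root.find("./e:Query/e:Plan", NS)
    return {
        "node_type": plan.findtext("e:Node-Type", namespaces=NS),
        "relation": plan.findtext("e:Relation-Name", namespaces=NS),
        "total_cost": float(plan.findtext("e:Total-Cost", namespaces=NS)),
    }
```

The same approach extends to nested <Plans> elements by recursing into each child <Plan>, which maps naturally onto a C struct with a child-node list.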
[HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
It is another topic from the main thread. I noticed the following messages under the test case that takes a heavy INSERT workload, provided by Haribabu.

[kaigai@iwashi ~]$ createdb mytest
[kaigai@iwashi ~]$ psql -af ~/cache_scan.sql mytest
\timing
Timing is on.
--cache scan select 5 million
create table test(f1 int, f2 char(70), f3 float, f4 char(100));
CREATE TABLE
Time: 22.373 ms
truncate table test;
TRUNCATE TABLE
Time: 17.705 ms
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia software tech pvt ltd');
WARNING: pgstat wait timeout
WARNING: pgstat wait timeout
WARNING: pgstat wait timeout
WARNING: pgstat wait timeout
 :

Once I got the above messages, write performance is dramatically degraded, even though I didn't make a detailed investigation. I could reproduce it on the latest master branch without my enhancement, so I guess it is not a problem specific to me. One other strangeness is that, right now, this problem only happens in my virtual machine environment - VMware ESXi 5.5.0. I couldn't reproduce the problem in my physical environment (Fedora 20, Core i5-4570S). Any ideas?

Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Kouhei Kaigai
Sent: Wednesday, March 12, 2014 3:26 PM
To: Haribabu Kommi; Kohei KaiGai
Cc: Tom Lane; PgHacker; Robert Haas
Subject: Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

Thanks for your efforts!

               Head     patched   Diff
Select - 500K   772ms   2659ms   -200%
Insert - 400K  3429ms   1948ms     43%  (I am not sure how it improved in this case)
delete - 200K  2066ms   3978ms    -92%
update - 200K  3915ms   5899ms    -50%

This patch shows how the custom scan can be used very well, but the patch as it is has some performance problems which need to be investigated. I attached the test script file used for the performance test. 
First of all, it seems to me your test case has too small a data set, which allows all the data to be held in memory - briefly, 500K records of 200 bytes will consume about 100MB. Your configuration allocates 512MB of shared_buffers, and about 3GB of OS-level page cache is available. (Note that Linux uses free memory as disk cache adaptively.) This cache is designed to hide the latency of disk accesses, so this test case does not fit its intention. (Also, the primary purpose of this module is a demonstration of heap_page_prune_hook to hook vacuuming, so simple code was preferred over a complicated implementation with better performance.)

I could reproduce the overall trend: a scan without the cache is faster than the cached scan if the buffers are in memory. Probably, it comes from the cost of walking down the T-tree index using the ctid for each reference. The performance penalty around UPDATE and DELETE likely comes from trigger invocation per row. I could observe a small performance gain on INSERT. It's strange for me, also. :-(

On the other hand, the discussion around the custom-plan interface affects this module, because it uses that API as its foundation. Please wait for a few days to rebase the cache_scan module onto the newer custom-plan interface that I submitted just a moment before. Also, is it really necessary to tune the performance stuff in this example module of the heap_page_prune_hook? Even though I have a few ideas to improve the cache performance, like insertion of multiple rows at once or a local chunk copy instead of the T-tree walk-down, I'm not sure whether it is productive in the current v9.4 timeframe. 
;-( Thanks, -- NEC OSS Promotion Center / PG-Strom Project KaiGai Kohei kai...@ak.jp.nec.com

-----Original Message-----
From: pgsql-hackers-ow...@postgresql.org [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Haribabu Kommi
Sent: Wednesday, March 12, 2014 1:14 PM
To: Kohei KaiGai
Cc: Kaigai Kouhei(海外 浩平); Tom Lane; PgHacker; Robert Haas
Subject: Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

On Thu, Mar 6, 2014 at 10:15 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote: 2014-03-06 18:17 GMT+09:00 Haribabu Kommi kommi.harib...@gmail.com: I will update you later regarding the performance test results.

I ran the performance test on the cache scan patch and below are the readings.

Configuration:
Shared_buffers - 512MB
cache_scan.num_blocks - 600
checkpoint_segments - 255

Machine:
OS - centos - 6.4
CPU - 4 core 2.5 GHZ
Memory - 4GB

               Head     patched   Diff
Select - 500K   772ms   2659ms   -200%
Insert - 400K  3429ms   1948ms     43%  (I am not sure how it
Re: [HACKERS] Rowtype column and domain subfield with DEFAULT and NOT NULL constraint
Julien Tachoires julien.tachoi...@dalibo.com writes: A customer has reported us a strange behaviour regarding a rowtype column with a domain subfield:

Rowtypes in general do not support defaults for component fields.

Is build_column_default() the right place to handle that case?

It's unlikely that this is simple to change. As an example, should the default be inserted during a cast to the rowtype? How about plpgsql variable initialization? What are you going to do about scalar-NULL values of the rowtype (note that forbidding them is likely to cause all sorts of collateral damage)? But in any case, none of the examples you showed have anything to do with build_column_default(). That would only get applied if the INSERT's targetlist didn't mention col1 at all.

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
Kouhei Kaigai kai...@ak.jp.nec.com writes: WARNING: pgstat wait timeout WARNING: pgstat wait timeout WARNING: pgstat wait timeout WARNING: pgstat wait timeout Once I got above messages, write performance is dramatically degraded, even though I didn't take detailed investigation. I could reproduce it on the latest master branch without my enhancement, so I guess it is not a problem something special to me. One other strangeness is, right now, this problem is only happen on my virtual machine environment - VMware ESXi 5.5.0. I couldn't reproduce the problem on my physical environment (Fedora20, core i5-4570S). We've seen sporadic reports of that sort of behavior for years, but no developer has ever been able to reproduce it reliably. Now that you've got a reproducible case, do you want to poke into it and see what's going on? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] db_user_namespace a temporary measure
Jaime Casanova ja...@2ndquadrant.com writes: On Tue, Mar 11, 2014 at 10:06 PM, Tom Lane t...@sss.pgh.pa.us wrote: But not sure how to define a unique index that allows (joe, db1) to coexist with (joe, db2) but not with (joe, 0). And why do you want that restriction?

So that if I say GRANT SELECT ON mytable TO joe, it's unambiguous which user I'm granting to. There should be at most one joe that can access any given database. If we don't have such a restriction, we'll need syntax kluges in GRANT, ALTER OWNER, and probably other commands to disambiguate whether a local or global user is meant. Right now with the db_user_namespace mechanism, you have to say GRANT ... TO joe@db1 if you were granting to a local user in db1. That's unambiguous all right, but it has little else to recommend it.

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] db_user_namespace a temporary measure
Magnus Hagander mag...@hagander.net writes: Yeah, what we really need is encapsulated per-DB users and local superusers. I think everyone agrees that this is the goal, but nobody wants to put in the work to implement a generalized solution. Encapsulated would probably be the doable part. But local superuser? Given that a superuser can load and run binaries, how would you propose you restrict that superuser from doing anything they want? And if you don't need that functionality, then how's it really different from being the database owner?

A local user with the superuser privilege would not be able to log into another database, because superuser doesn't give you any extra privilege until you've logged in. Yeah, as superuser you could still break things as much as you pleased, but not through SQL. I share your doubts as to how useful such a concept actually is, but it'd work if we had real local users.

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] db_user_namespace a temporary measure
On Wed, Mar 12, 2014 at 3:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: Magnus Hagander mag...@hagander.net writes: Yeah, what we really need is encapsulated per-DB users and local superusers. I think everyone agrees that this is the goal, but nobody wants to put in the work to implement a generalized solution. Encapsulated would probably be the doable part. But local superuser? Given that a superuser can load and run binaries, how would you propose you restrict that superuser from doing anything they want? And if you don't need that functionality, then how's it really different from being the database owner? A local user with the superuser privilege would not be able to log into another database, because superuser doesn't give you any extra privilege until you've logged in. Yeah, as superuser you could still break things as much as you pleased, but not through SQL.

You could COPY over the hba file or something like that :) Or just pg_read_binary_file() on the files in another database, which is accessible through SQL as well.

I share your doubts as to how useful such a concept actually is, but it'd work if we had real local users.

It can also do interesting things like ALTER SYSTEM, replication, backups, etc. All of which could be used to escalate privileges beyond the local database. So you'd have to somehow restrict those, at which point what's the point of the property in the first place? -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
Re: [HACKERS] db_user_namespace a temporary measure
* Magnus Hagander (mag...@hagander.net) wrote: On Wed, Mar 12, 2014 at 3:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: I share your doubts as to how useful such a concept actually is, but it'd work if we had real local users. It can also do interesting things like ALTER SYSTEM, replication, backups, etc. All of which could be used to escalate privileges beyond the local database.

Probably DROP ROLE for global users too.

So you'd have to somehow restrict those, at which point what's the point of the property in the first place?

We've been asked quite often for a not-quite-superuser, as in, one which can bypass the normal GRANT-based permission system but which can't do things like create untrusted functions or do other particularly bad activities. I can certainly see value in that. Another oft-requested option is a read-only role which pg_dump or an auditor could use. Anyway, this is getting a bit far afield from the original discussion, which looked like it might actually be heading somewhere interesting.

Thanks, Stephen
Re: [HACKERS] db_user_namespace a temporary measure
Magnus Hagander mag...@hagander.net writes: On Wed, Mar 12, 2014 at 3:52 PM, Tom Lane t...@sss.pgh.pa.us wrote: A local user with the superuser privilege would not be able to log into another database, because superuser doesn't give you any extra privilege until you've logged in. Yeah, as superuser you could still break things as much as you pleased, but not through SQL. You could COPY over the hba file or sometihng like that :) Or just pg_read_binary_file() on the files in another database, which is accessible through SQL as well. More directly, he could alter pg_authid to make himself a not-local user. But I don't see that it's our responsibility to prevent that. As long as the combination of features works in a straightforward way, I'm happy with it --- and it would, AFAICS. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] db_user_namespace a temporary measure
* Tom Lane (t...@sss.pgh.pa.us) wrote: Magnus Hagander mag...@hagander.net writes: You could COPY over the hba file or something like that :) Or just pg_read_binary_file() on the files in another database, which is accessible through SQL as well. More directly, he could alter pg_authid to make himself a not-local user. But I don't see that it's our responsibility to prevent that. As long as the combination of features works in a straightforward way, I'm happy with it --- and it would, AFAICS.

That depends on exactly what you mean by 'those features'. There's quite a difference between "you can set the superuser flag on a local user and then that user will be a superuser" and "a local user with the superuser flag will only be able to impact the database they are local to". I agree that there's nothing stopping us from having a local user which is marked as a superuser at a technical level. What Magnus and I are worried about is what the *implication* of such a configuration is and what the user will think it means. Specifically, there will be an assumption that local users can only access or impact the databases which they have access to, which wouldn't be accurate for a local user who is a superuser. Certainly, documenting this would help with that, but with as many warnings as we'd have to put up about that being dangerous, and that it isn't actually going to prevent that superuser from accessing the other databases if they really wanted to, or prevent them from making a global superuser account, etc, I'm just not convinced that it's worth it for the feature of allowing a local account to be set as superuser - I don't see a huge use-case there.

Thanks, Stephen
Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
On 12 Březen 2014, 14:54, Kouhei Kaigai wrote: It is another topic from the main thread. I noticed the following messages under the test case that takes a heavy INSERT workload, provided by Haribabu.

[kaigai@iwashi ~]$ createdb mytest
[kaigai@iwashi ~]$ psql -af ~/cache_scan.sql mytest
\timing
Timing is on.
--cache scan select 5 million
create table test(f1 int, f2 char(70), f3 float, f4 char(100));
CREATE TABLE
Time: 22.373 ms
truncate table test;
TRUNCATE TABLE
Time: 17.705 ms
insert into test values (generate_series(1,500), 'fujitsu', 1.1, 'Australia software tech pvt ltd');
WARNING: pgstat wait timeout
WARNING: pgstat wait timeout
WARNING: pgstat wait timeout
WARNING: pgstat wait timeout
 :

Once I got the above messages, write performance is dramatically degraded, even though I didn't make a detailed investigation. I could reproduce it on the latest master branch without my enhancement, so I guess it is not a problem specific to me. One other strangeness is that, right now, this problem only happens in my virtual machine environment - VMware ESXi 5.5.0. I couldn't reproduce the problem in my physical environment (Fedora 20, Core i5-4570S). Any ideas?

I've seen this happening in cases when it was impossible to write the stat file for some reason. IIRC there were a few basic causes I've seen in the past:

(1) writing the stat copy failed - for example when the temporary stat directory was placed in tmpfs, but it was too small

(2) writing the stat copy took too long - e.g. with tmpfs and memory pressure, forcing the system to swap to free space for the stat copy

(3) the inquiry (backend -> postmaster) to write the file is sent using UDP, which may be dropped in some cases (e.g. when the system is overloaded), so the postmaster does not even know it should write the file

I'm not familiar with VMware ESXi virtualization, but I suppose it might be relevant to all three causes. 
regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
Rajeev rastogi rajeev.rast...@huawei.com writes: On 11 March 2014 19:52, Tom Lane wrote: After sleeping on it, I'm inclined to think we should continue to not print status for COPY TO STDOUT. Aside from the risk of breaking scripts, there's a decent analogy to be made to SELECT: we don't print a status tag for that either.

It is correct that SELECT does not print a status tag in the conventional way, but it still prints the number of rows selected (e.g. (2 rows)) along with the actual row values, which can very well be considered a kind of status. The user can make out from this result how many rows have been selected. But in case of COPY TO STDOUT, if we don't print anything, then the user does not have any direct way of finding out how many rows were copied from the table to STDOUT, which might have been very useful.

Uh, you mean other than the data rows that were just printed? I fail to see how this is much different from the SELECT case:

regression=# \copy int8_tbl to stdout
123	456
123	4567890123456789
4567890123456789	123
4567890123456789	4567890123456789
4567890123456789	-4567890123456789
regression=#

(Note that I'm defining TO STDOUT from psql's perspective, ie the rows are going to the queryFout file, which is the same place the COPY status would get printed to.)

regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] GIN improvements part2: fast scan
On 02/26/2014 11:25 PM, Alexander Korotkov wrote: On Thu, Feb 27, 2014 at 1:07 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Feb 20, 2014 at 1:48 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/09/2014 12:11 PM, Alexander Korotkov wrote: I've rebased catalog changes with last master. Patch is attached. I've rerun my test suite with both last master ('committed') and attached patch ('ternary-consistent'). Thanks!

 method                 |     sum
------------------------+--------------
 committed              | 143491.71501
 fast-scan-11           | 126916.11199
 fast-scan-light        | 137321.211
 fast-scan-light-heikki | 138168.02801
 master                 | 446976.288
 ternary-consistent     | 125923.514

I explain the regression in the last master by the change of MAX_MAYBE_ENTRIES from 8 to 4. Yeah, probably. I set MAX_MAYBE_ENTRIES to 8 in initial versions to make sure we get similar behavior in Tomas' tests that used 6 search terms. But I always felt that it was too large for real queries once we have the catalog changes; that's why I lowered it to 4 when committing. If an opclass benefits greatly from fast scan, it should provide the ternary consistent function, and not rely on the shim implementation.

I'm not sure about the decision to reserve a separate procedure number for the ternary consistent function. Probably, it would be better to add a ginConfig method. It would be useful for post-9.4 improvements.

Hmm, it might be useful for an opclass to provide both a boolean and a ternary consistent function, if the boolean version is significantly more efficient when all the arguments are TRUE/FALSE. OTOH, you could also do a quick check through the array to see if there are any MAYBE arguments, within the consistent function. But I'm inclined to keep the possibility to provide both versions. As long as we support the boolean version at all, there's not much difference in terms of the amount of code to support having them both for the same opclass. A ginConfig could be useful for many other things, but I don't think it's worth adding it now. 
What's the difference between returning GIN_MAYBE and GIN_TRUE+recheck? We discussed that earlier, but didn't reach any conclusion. That needs to be clarified in the docs. One possibility is to document that they're equivalent. Another is to forbid one of them. Yet another is to assign a different meaning to each.

I've been thinking that it might be useful to define them so that a MAYBE result from the tri-consistent function means that it cannot decide if you have a match or not, because some of the inputs were MAYBE. And TRUE+recheck means that even if all the MAYBE inputs were passed as TRUE or FALSE, the result would be the same, TRUE+recheck. The practical difference would be that if the tri-consistent function returns TRUE+recheck, ginget.c wouldn't need to bother fetching the other entries; it could just return the entry with recheck=true immediately. While with a MAYBE result, it would fetch the other entries and call tri-consistent again. ginget.c doesn't currently use the tri-consistent function that way - it always fetches all the entries for a potential match before calling tri-consistent, but it could. I had it do that in some of the patch versions, but Tomas' testing showed that it was a big loss on some queries, because the consistent function was called much more often. Still, something like that might be sensible in the future, so it might be good to distinguish those cases in the API now. Note that ginarrayproc is already using the return values like that: in GinContainedStrategy, it always returns TRUE+recheck regardless of the inputs, but in other cases it uses GIN_MAYBE.

Next revision of the patch is attached. In this version, an opclass should provide at least one consistent function: binary or ternary. It's expected to achieve the best performance when the opclass provides both of them. However, tests show the opposite :( I have to recheck it carefully. However, it's not! This revision of the patch completes my test case in 123330 ms. 
This is slightly faster than the previous revision. Great. Committed, I finally got around to it. Some minor changes: I reworded the docs and also updated the table of support functions in xindex.sgml. I refactored the query in opr_sanity.sql, to make it easier to read, and to check more carefully that the required support functions are present. I also added a runtime check to avoid crashing if neither is present.

A few things we ought to still discuss:

* I just noticed that the dummy trueTriConsistentFn returns GIN_MAYBE, rather than GIN_TRUE. The equivalent boolean version returns 'true' without recheck. Is that a typo, or was there some reason for the discrepancy?

* Come to think of it, I'm not too happy with the name GinLogicValue. It's too vague. It ought to include ternary or tri-value or something like that. How about renaming it to
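For readers following the ternary-logic discussion: a ternary consistent result can be derived from a boolean consistent function by trying every TRUE/FALSE assignment of the MAYBE inputs and seeing whether the answers agree. The following is a toy Python model of that idea only - it is not the actual C shim in the patch, and all names are invented - but it captures why a definite answer can sometimes be returned even when some inputs are MAYBE.

```python
from itertools import product

GIN_FALSE, GIN_TRUE, GIN_MAYBE = 0, 1, 2

def shim_tri_consistent(bool_consistent, args):
    """Derive a ternary answer from a boolean consistent function:
    try every TRUE/FALSE substitution for the MAYBE inputs; if all
    substitutions agree, the answer is definite, otherwise MAYBE."""
    maybe_idx = [i for i, a in enumerate(args) if a == GIN_MAYBE]
    results = set()
    for combo in product((False, True), repeat=len(maybe_idx)):
        trial = [a == GIN_TRUE for a in args]  # MAYBE slots get overwritten
        for i, value in zip(maybe_idx, combo):
            trial[i] = value
        results.add(bool_consistent(trial))
    if results == {True}:
        return GIN_TRUE
    if results == {False}:
        return GIN_FALSE
    return GIN_MAYBE
```

With an AND-style consistent function, [FALSE, MAYBE] collapses to a definite FALSE without fetching the MAYBE entry; with an OR-style one, [TRUE, MAYBE] collapses to TRUE. The cost is exponential in the number of MAYBE inputs, which is why capping them (the MAX_MAYBE_ENTRIES discussion above) matters.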
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
David Johnston pol...@yahoo.com writes: Tom Lane-2 wrote 1. Treat this as a non-backwards-compatible change, and document that people have to use -q if they don't want the COPY tag in the output. I'm not sure this is acceptable. I've mostly used copy to with files and so wouldn't mind if STDOUT had the COPY n sent to it as long as the target file is just the copy contents. I think you're missing the point: the case I'm concerned about is exactly that the target file is psql's stdout, or more specifically the same place that the COPY status would get printed to. 2. Kluge ProcessResult so that it continues to not pass back a PGresult for the COPY TO STDOUT case, or does so only in limited circumstances (perhaps only if isatty(stdout), for instance). The main problem with this is that people will test by sending output to a TTY and see the COPY n. Although if it can be done consistently then you minimize backward incompatibility and encourage people to enforce quiet mode while the command runs... Yeah, the inconsistency of behavior that this solution would cause is not a good thing. My inclination now (see later traffic) is to suppress the status report when the COPY destination is the same as pset.queryFout (ie, a simple test whether the FILE pointers are equal). This would suppress the status report for \copy to stdout and COPY TO STDOUT cases, and also for \copy to pstdout if you'd not redirected queryFout with \o. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] The case against multixact GUCs
On Tue, Mar 11, 2014 at 3:14 PM, Josh Berkus j...@agliodbs.com wrote: In the 9.3.3 updates, we added three new GUCs to control multixact freezing. This was an unprecedented move in my memory -- I can't recall ever adding a GUC to a minor release which wasn't backwards compatibility for a security fix. This was a mistake. I disagree. I think it was the right decision. I think it was a mistake not including all of that stuff in the first place, and I think it's good that we've now corrected that oversight. What makes these GUCs worse is that nobody knows how to set them; nobody on this list and nobody in the field. Heck, I doubt 1 in 1000 of our users (or 1 in 10 people on this list) know what a multixact *is*. Yeah, and that's a problem. See, it turns out that we love periodic full-table scans to freeze xmin so much that, in 9.3, we committed to a design that requires us to make periodic full-table scans to freeze xmax, too. That may or may not have been a good decision, but at this point we're stuck with it. People are going to have to come to understand the requirements there just as they do for freezing xmin. Denying the user the ability to adjust the thresholds is not going to accelerate the process of figuring out how they should be set. Further, there's no clear justification why these cannot be set to be the same as our other freeze ages (which our users also don't understand), or a constant calculated portion of them, or just a constant. On most systems, mxid consumption will be much slower than xid consumption because most users won't use tuple locks all that heavily. If we made all the defaults the same, then a full-table scan for xid freezing would likely conclude that many or all of the mxids weren't old enough to be frozen yet. 
To the greatest extent possible, we want full-table vacuums for either XID freezing or MXID freezing to advance both relfrozenxid and relminmxid so that we don't go through and freeze for one reason and then have to come back and freeze for the other reason shortly thereafter. Nobody knows exactly how to set the settings to make that happen just yet, so we need settings at least until people can determine what values work well in practice - and probably permanently, because unfortunately I think the answer is likely workload-dependent. Since nobody anticipated someone adding a GUC in a minor release, there was no discussion of this topic that I can find; the new GUCs were added as a side effect of fixing the multixact vacuum issue. Certainly I would have raised a red flag if the discussion of the new GUCs hadn't been buried deep inside really long emails. Alvaro did explicitly ask if anyone wanted to oppose back-patching. I don't think you can really blame him if you didn't see/read that email. Adding new GUCs which nobody has any idea how to set, or can even explain to new users, is not a service to our users. These should be removed. The need for these GUCs is an outgrowth of the fkey locking stuff. Unless we rip that out again or rewrite it completely, the need for them doesn't seem likely to go away - so we're going to need to learn to live with it, not pretend like it isn't a problem. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GIN improvements part2: fast scan
On 03/12/2014 12:09 AM, Tomas Vondra wrote: Hi all, a quick question that just occurred to me - do you plan to tweak the cost estimation for GIN indexes, in this patch? IMHO it would be appropriate, given the improvements and gains, but it seems to me gincostestimate() was not touched by this patch. Good point. We have made two major changes to GIN in this release cycle: changed the data page format and made it possible to skip items without fetching all the keys (fast scan). gincostestimate doesn't know about either change. Adjusting gincostestimate for the more compact data page format seems easy. When I hacked on that, I assumed all along that gincostestimate doesn't need to be changed as the index will just be smaller, which will be taken into account automatically. But now that I look at gincostestimate, it assumes that the size of one item on a posting tree page is a constant 6 bytes (SizeOfIptrData), which is no longer true. I'll go fix that. Adjusting for the effects of skipping is harder. gincostestimate needs to do the same preparation steps as startScanKey: sort the query keys by frequency, and call the consistent function to split the keys into required and additional sets. And then model that the additional entries only need to be fetched when the other keys match. That's doable in principle, but requires a bunch of extra code. Alexander, any thoughts on that? It's getting awfully late to add new code for that, but it sure would be nice to somehow take fast scan into account. - Heikki
Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
On Wed, Mar 12, 2014 at 7:42 AM, Tom Lane t...@sss.pgh.pa.us wrote: Kouhei Kaigai kai...@ak.jp.nec.com writes: WARNING: pgstat wait timeout WARNING: pgstat wait timeout WARNING: pgstat wait timeout WARNING: pgstat wait timeout Once I get the above messages, write performance is dramatically degraded, though I haven't investigated in detail. I could reproduce it on the latest master branch without my enhancement, so I guess it is not a problem specific to my changes. One other oddity: right now, this problem only happens on my virtual machine environment - VMware ESXi 5.5.0. I couldn't reproduce the problem on my physical environment (Fedora20, core i5-4570S). We've seen sporadic reports of that sort of behavior for years, but no developer has ever been able to reproduce it reliably. Now that you've got a reproducible case, do you want to poke into it and see what's going on? I didn't know we were trying to reproduce it, nor that it was a mystery. Do anything that causes serious IO constipation, and you will probably see that message. For example, turn off synchronous_commit and run the default pgbench transaction at a scale that is large but still comfortably fits in RAM, and wait for a checkpoint sync phase to kick in. The pgstat wait timeout is a symptom, not the cause. Cheers, Jeff
Re: [HACKERS] The case against multixact GUCs
On 03/12/2014 06:26 PM, Robert Haas wrote: On Tue, Mar 11, 2014 at 3:14 PM, Josh Berkus j...@agliodbs.com wrote: In the 9.3.3 updates, we added three new GUCs to control multixact freezing. This was an unprecedented move in my memory -- I can't recall ever adding a GUC to a minor release which wasn't backwards compatibility for a security fix. This was a mistake. I disagree. I think it was the right decision. I think it was a mistake not including all of that stuff in the first place, and I think it's good that we've now corrected that oversight. In hindsight, I think permanent multixids in their current form were a mistake. Before 9.3, the thing that made multixids special was that they could just be thrown away at a restart. They didn't need freezing. Now that they do, why not just use regular XIDs for them? We had to duplicate much of the wraparound and freezing logic for multixids that simply would not have been an issue if we had used regular XIDs instead. We could've perhaps kept the old multixids for their original purpose, as transient xids that can be forgotten about after all the old snapshots are gone. But for the permanent ones, it would've been simpler if we handled them more like subxids; make them part of the same XID space as regular XIDs. This is pretty hand-wavy of course, and it's too late now. - Heikki
Re: [HACKERS] pgstat wait timeout (RE: contrib/cache_scan)
Jeff Janes jeff.ja...@gmail.com writes: On Wed, Mar 12, 2014 at 7:42 AM, Tom Lane t...@sss.pgh.pa.us wrote: We've seen sporadic reports of that sort of behavior for years, but no developer has ever been able to reproduce it reliably. Now that you've got a reproducible case, do you want to poke into it and see what's going on? I didn't know we were trying to reproduce it, nor that it was a mystery. Do anything that causes serious IO constipation, and you will probably see that message. The cases that are a mystery to me are where there's no reason to believe that I/O is particularly overloaded. But perhaps Kaigai-san's example is only that ... regards, tom lane
Re: [HACKERS] The case against multixact GUCs
On Wed, Mar 12, 2014 at 12:45 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 03/12/2014 06:26 PM, Robert Haas wrote: On Tue, Mar 11, 2014 at 3:14 PM, Josh Berkus j...@agliodbs.com wrote: In the 9.3.3 updates, we added three new GUCs to control multixact freezing. This was an unprecedented move in my memory -- I can't recall ever adding a GUC to a minor release which wasn't backwards compatibility for a security fix. This was a mistake. I disagree. I think it was the right decision. I think it was a mistake not including all of that stuff in the first place, and I think it's good that we've now corrected that oversight. In hindsight, I think permanent multixids in their current form were a mistake. Before 9.3, the thing that made multixids special was that they could just be thrown away at a restart. They didn't need freezing. Now that they do, why not just use regular XIDs for them? Well, the numbering of MXIDs is closely bound up with their storage format. To do what you're proposing, we'd need to invent some new way of associating an XID-used-as-MXID with the update XID, list of lockers, and lock modes. Which is certainly possible, but it's not obvious that it's a good idea. I *am* concerned that we didn't adequately weigh the costs of adding another thing that has to be frozen before we did it. Clearly, the feature has a lot of benefit, or will once we've flushed out most of the bugs. But it's hard to say at this point how much the cost is going to be, and I do think that's cause for concern. But I'm not convinced that unifying the XID and MXID spaces would have addressed that concern to any measurable degree. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GIN improvements part2: fast scan
On Wed, Mar 12, 2014 at 8:29 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 03/12/2014 12:09 AM, Tomas Vondra wrote: Hi all, a quick question that just occurred to me - do you plan to tweak the cost estimation for GIN indexes, in this patch? IMHO it would be appropriate, given the improvements and gains, but it seems to me gincostestimate() was not touched by this patch. Good point. We have made two major changes to GIN in this release cycle: changed the data page format and made it possible to skip items without fetching all the keys (fast scan). gincostestimate doesn't know about either change. Adjusting gincostestimate for the more compact data page format seems easy. When I hacked on that, I assumed all along that gincostestimate doesn't need to be changed as the index will just be smaller, which will be taken into account automatically. But now that I look at gincostestimate, it assumes that the size of one item on a posting tree page is a constant 6 bytes (SizeOfIptrData), which is no longer true. I'll go fix that. Adjusting for the effects of skipping is harder. gincostestimate needs to do the same preparation steps as startScanKey: sort the query keys by frequency, and call the consistent function to split the keys into required and additional sets. And then model that the additional entries only need to be fetched when the other keys match. That's doable in principle, but requires a bunch of extra code. Alexander, any thoughts on that? It's getting awfully late to add new code for that, but it sure would be nice to somehow take fast scan into account. The preparation we do in startScanKey requires knowledge of the estimated size of the posting lists/trees. We make this estimate by traversing down to the leaf pages. I think gincostestimate is expected to be way cheaper. So, we probably need a rougher estimate there, don't we? -- With best regards, Alexander Korotkov.
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
On Wed, Mar 12, 2014 at 12:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: David Johnston pol...@yahoo.com writes: Tom Lane-2 wrote 1. Treat this as a non-backwards-compatible change, and document that people have to use -q if they don't want the COPY tag in the output. I'm not sure this is acceptable. I've mostly used copy to with files and so wouldn't mind if STDOUT had the COPY n sent to it as long as the target file is just the copy contents. I think you're missing the point: the case I'm concerned about is exactly that the target file is psql's stdout, or more specifically the same place that the COPY status would get printed to. 2. Kluge ProcessResult so that it continues to not pass back a PGresult for the COPY TO STDOUT case, or does so only in limited circumstances (perhaps only if isatty(stdout), for instance). The main problem with this is that people will test by sending output to a TTY and see the COPY n. Although if it can be done consistently then you minimize backward incompatibility and encourage people to enforce quiet mode while the command runs... Yeah, the inconsistency of behavior that this solution would cause is not a good thing. My inclination now (see later traffic) is to suppress the status report when the COPY destination is the same as pset.queryFout (ie, a simple test whether the FILE pointers are equal). This would suppress the status report for \copy to stdout and COPY TO STDOUT cases, and also for \copy to pstdout if you'd not redirected queryFout with \o. This is reasonably similar to what we already do for SELECT, isn't it? I mean, the server always sends back a command tag, but psql sometimes opts not to print it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] GIN improvements part2: fast scan
On Wed, Mar 12, 2014 at 8:02 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/26/2014 11:25 PM, Alexander Korotkov wrote: On Thu, Feb 27, 2014 at 1:07 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Thu, Feb 20, 2014 at 1:48 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/09/2014 12:11 PM, Alexander Korotkov wrote: I've rebased the catalog changes with the latest master. Patch is attached. I've rerun my test suite with both the last master ('committed') and the attached patch ('ternary-consistent'). Thanks!

method                 | sum
-----------------------+--------------
committed              | 143491.71501
fast-scan-11           | 126916.11199
fast-scan-light        | 137321.211
fast-scan-light-heikki | 138168.02801
master                 | 446976.288
ternary-consistent     | 125923.514

I explain the regression in the last master by the change of MAX_MAYBE_ENTRIES from 8 to 4. Yeah, probably. I set MAX_MAYBE_ENTRIES to 8 in initial versions to make sure we get similar behavior in Tomas' tests that used 6 search terms. But I always felt that it was too large for real queries once we have the catalog changes; that's why I lowered it to 4 when committing. If an opclass benefits greatly from fast scan, it should provide the ternary consistent function, and not rely on the shim implementation. I'm not sure about the decision to reserve a separate procedure number for the ternary consistent function. Probably, it would be better to add a ginConfig method. It would be useful for post-9.4 improvements. Hmm, it might be useful for an opclass to provide both a boolean and a ternary consistent function, if the boolean version is significantly more efficient when all the arguments are TRUE/FALSE. OTOH, you could also do a quick check through the array to see if there are any MAYBE arguments, within the consistent function. But I'm inclined to keep the possibility to provide both versions. As long as we support the boolean version at all, there's not much difference in terms of the amount of code to support having them both for the same opclass. 
A ginConfig could be useful for many other things, but I don't think it's worth adding it now. What's the difference between returning GIN_MAYBE and GIN_TRUE+recheck? We discussed that earlier, but didn't reach any conclusion. That needs to be clarified in the docs. One possibility is to document that they're equivalent. Another is to forbid one of them. Yet another is to assign a different meaning to each. I've been thinking that it might be useful to define them so that a MAYBE result from the tri-consistent function means that it cannot decide if you have a match or not, because some of the inputs were MAYBE. And TRUE+recheck means that even if all the MAYBE inputs were passed as TRUE or FALSE, the result would be the same, TRUE+recheck. The practical difference would be that if the tri-consistent function returns TRUE+recheck, ginget.c wouldn't need to bother fetching the other entries, it could just return the entry with recheck=true immediately. While with a MAYBE result, it would fetch the other entries and call tri-consistent again. ginget.c doesn't currently use the tri-consistent function that way - it always fetches all the entries for a potential match before calling tri-consistent, but it could. I had it do that in some of the patch versions, but Tomas' testing showed that it was a big loss on some queries, because the consistent function was called much more often. Still, something like that might be sensible in the future, so it might be good to distinguish those cases in the API now. Note that ginarrayproc is already using the return values like that: in GinContainedStrategy, it always returns TRUE+recheck regardless of the inputs, but in other cases it uses GIN_MAYBE. Next revision of the patch is attached. In this version the opclass should provide at least one consistent function: binary or ternary. It's expected to achieve the best performance when the opclass provides both of them. However, tests showed the opposite :( I had to recheck it carefully. However, it turns out it's not so! 
This revision of the patch completes my test case in 123330 ms. This is slightly faster than the previous revision. Great. Committed, I finally got around to it. Some minor changes: I reworded the docs and also updated the table of support functions in xindex.sgml. I refactored the query in opr_sanity.sql, to make it easier to read, and to check more carefully that the required support functions are present. I also added a runtime check to avoid crashing if neither is present. A few things we still ought to discuss: * I just noticed that the dummy trueTriConsistentFn returns GIN_MAYBE, rather than GIN_TRUE. The equivalent boolean version returns 'true' without recheck. Is that a typo, or was there some reason for the discrepancy? Actually,
Re: [HACKERS] GIN improvements part2: fast scan
On Wed, Mar 12, 2014 at 1:52 PM, Alexander Korotkov aekorot...@gmail.com wrote: * This patch added a triConsistent function for array and tsvector opclasses. Were you planning to submit a patch to do that for the rest of the opclasses, like pg_trgm? (it's getting awfully late for that...) Yes. I can try to get it into 9.4 if it's possible. It seems to me that we'd be wise to push that to 9.5 at this point. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] logical decoding documentation?
On Tue, Mar 11, 2014 at 4:16 PM, Andres Freund and...@2ndquadrant.com wrote: Could you perhaps commit the attached patch fixing the issues you mentioned? I committed this. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] db_user_namespace a temporary measure
On 03/12/2014 12:22 AM, Magnus Hagander wrote: On Mar 12, 2014 1:46 AM, Josh Berkus j...@agliodbs.com wrote: Yeah, what we really need is encapsulated per-DB users and local superusers. I think everyone agrees that this is the goal, but nobody wants to put in the work to implement a generalized solution. Encapsulated would probably be the doable part. But local superuser? Given that a superuser can load and run binaries, how would you propose you restrict that superuser from doing anything they want? And if you don't need that functionality, then how's it really different from being the database owner? Well, if you really want my I want a pony list: Local superusers (maybe this concept needs another name) would be able to do the following things in a *single* database:
1. change permissions for other users on that database and its objects
2. load extensions from a predefined .so directory / list
3. create/modify untrusted language functions
4. create per-database users and change their settings
5. change database settings (SET stuff)
6. NOT change their own user settings
7. NOT change any global users
8. NOT run SET PERSISTENT or other commands with global effect
The above is fairly similar to what Amazon currently offers for Postgres RDS, except that they only have 1 database per instance, so clearly they haven't worked out some of the security issues. Now, obviously permission (3) could be used to escalate a local superuser to global superuser permissions, so local superusers aren't really a secure concept, unless you don't add any untrusted languages to the list of allowed extensions. Alternately, we could drop (3) from the list of features. Hmmm. On the other foot, though: all of 1, 2, 4 and 5 could conceivably be done via a set of Security Definer functions loaded into the database, with a lot less complexity and security risk. 
So if we're sacrificing untrusted languages, then we really don't need a local superuser at all; just a bunch of variadic SD functions which can handle user creation and permissions issues. It would limit the permissions syntax we can express, but not critically so in my opinion. This could all be done as an extension, and would probably be the better for it. Assuming we have db-local users in the first place, of course. On 03/11/2014 09:39 PM, David Johnston wrote: So if dave is already a user in db1 only that specific dave can be made a global user - any other dave would be disallowed. Correct. Well, unless the other dave was promoted first. However, I personally don't see any reason why we should even support promoting users from local to global. It adds complexity to the concept, and the value of it eludes me. Would user - password be a better PK? Even with the obvious issue that the password part of the key can change. User-password to database is properly a many-to-many relationship. Or see next for something simpler. I have no idea where you're going with this. A simple implementation would simply have the global users copied into each database as it is constructed. There would also be a link between each of the database-specific users and the global master so that a password change issued against the global user propagates to all the database-specific versions. That's called massive backwards-compatibility failure, and I don't think we'll go any further with your concept. Be nice if all users could be global and there would be some way to give them permissions on databases. Um, there is: http://www.postgresql.org/docs/9.3/static/sql-grant.html -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] GIN improvements part2: fast scan
On 03/12/2014 07:42 PM, Alexander Korotkov wrote: The preparation we do in startScanKey requires knowledge of the estimated size of the posting lists/trees. We make this estimate by traversing down to the leaf pages. I think gincostestimate is expected to be way cheaper. So, we probably need a rougher estimate there, don't we? Yeah, maybe. We do something similar for b-tree MIN/MAX currently, but with a lot of keys, it could be a lot more expensive to traverse down to all of them. I wonder if we could easily at least catch the common skewed cases, where e.g. the logic of the consistent function is to AND all the keys. The opclass would know that, but we would somehow need to pass down the information to gincostestimate. - Heikki
Re: [HACKERS] db_user_namespace a temporary measure
On 03/12/2014 02:09 PM, Josh Berkus wrote: On 03/12/2014 12:22 AM, Magnus Hagander wrote: On Mar 12, 2014 1:46 AM, Josh Berkus j...@agliodbs.com wrote: Yeah, what we really need is encapsulated per-DB users and local superusers. I think everyone agrees that this is the goal, but nobody wants to put in the work to implement a generalized solution. Encapsulated would probably be the doable part. But local superuser? Given that a superuser can load and run binaries, how would you propose you restrict that superuser from doing anything they want? And if you don't need that functionality, then how's it really different from being the database owner? Well, if you really want my I want a pony list: Local superusers (maybe this concept needs another name) would be able to do the following things in a *single* database: 1 change permissions for other users on that database and its objects 2 load extensions from a predefined .so directory / list 3 create/modify untrusted language functions 4 create per-database users and change their settings 5 change database settings (SET stuff) 6 NOT change their own user settings 7 NOT change any global users 8 NOT run SET PERSISTENT or other commands with global effect Item 3 gives away the store. AFAIK Amazon doesn't load untrusted languages, period. cheers andrew
Re: [HACKERS] [bug fix] pg_ctl always uses the same event source
MauMau escribió: The raw link only gave the mail in text format. I hoped to import the mail into Windows Mail on Windows Vista, but I couldn't. You might need to run a conversion process by which you transform the raw file (in mbox format) into EML format or whatever it is that Windows Mail uses. I vaguely recall there are tools for this. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] db_user_namespace a temporary measure
* Josh Berkus (j...@agliodbs.com) wrote: Local superusers (maybe this concept needs another name) would be able to do the following things in a *single* database: 1 change permissions for other users on that database and its objects What about bypass permissions, a la what superuser does today? Or are you saying we'd only need to allow this new kind of role to bypass the checks in the GRANT/REVOKE system? 2 load extensions from a predefined .so directory / list This would obviously have to be a curated list that avoids things like 'adminpack'... 3 create/modify untrusted language functions Uhh, I don't believe RDS allows you to do this..? 4 create per-database users and change their settings Presumably just for the 'local' DB? 5 change database settings (SET stuff) This can be done by the database owner already, no? 6 NOT change their own user settings Don't think this is quite that simple (passwords?). 7 NOT change any global users What about role membership, wrt local vs. global roles? 8 NOT run SET PERSISTENT or other commands with global effect Indeed, or use 'COPY'.. Now, obviously permission (3) could be used to escalate a local superuser to global superuser permissions, so local superusers aren't really a secure concept, unless you don't add any untrusted languages to the list of allowed extensions. Alternately, we could drop (3) from the list of features. That'd certainly be the main issue that I see with this proposal. Doing the rest but allowing untrusted languages would just get the naive in trouble and not help those of us who want this, as we wouldn't be able to use it. Hmmm. On the other foot, though: all of 1, 2, 4 and 5 could conceivably be done via a set of Security Definer functions loaded into the database, with a lot less complexity and security risk. For my part, I don't see having everyone write their own set of SECURITY DEFINER functions as being either less complex or less risk. They're also a lot less convenient to use. 
That's not what RDS did, is it? No, and I agree with them on that part. On 03/11/2014 09:39 PM, David Johnston wrote: So if dave is already a user in db1 only that specific dave can be made a global user - any other dave would be disallowed. Correct. Well, unless the other dave was promoted first. However, I personally don't see any reason why we should even support promoting users from local to global. It adds complexity to the concept, and the value of it eludes me. Agreed. Thanks, Stephen
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
On Tue, Feb 25, 2014 at 1:03 PM, Rajeev rastogi rajeev.rast...@huawei.com wrote: On 04 February 2014 14:38, Myself wrote: On 4th February 2014, Christian Kruse wrote: On 04/02/14 12:38, Fujii Masao wrote: ISTM that the phrase Request queue is not used much around the lock. Using the phrase wait queue or Simon's suggestion sounds better to at least me. Thought? Sounds reasonable to me. Attached patch changes messages to the following: Process holding the lock: A. Wait queue: B. Processes holding the lock: A, B. Wait queue: C. This looks good to me also. I have tested the revised patch and found it ready to be committed. I am marking this as Ready for Committer. Committed! Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] db_user_namespace a temporary measure
On 03/12/2014 11:25 AM, Stephen Frost wrote: * Josh Berkus (j...@agliodbs.com) wrote: Local superusers (maybe this concept needs another name) would be able to do the following things in a *single* database: 1 change permissions for other users on that database and its objects What about bypass permissions, ala what superuser does today? Or are you saying we'd only need to allow this new kind of role to bypass the checks in the GRANT/REVOKE system? More like what we have for the database owner role today. 2 load extensions from a predefined .so directory / list This would obviously have to be a curated list that avoids things like 'adminpack'... It would need to be a list created by the global superuser. By default, nothing would be on it. 4 create per-database users and change their settings Presumably just for the 'local' DB? Right. 5 change database settings (SET stuff) This can be done by the database-owner already, no? Oh, point. 6 NOT change their own user settings Don't think this is quite that simple (passwords?). Well, we already limit what things users can change about themselves; they can't promote themselves to superuser, for example. 7 NOT change any global users What about role membership, wrt local vs. global roles? Allowing global users to join local ROLEs is its own can'o'worms that would merit an entire other thread. Assuming, of course, that we had local users in the first place. 8 NOT run SET PERSISTENT or other commands with global effect Indeed, or use 'COPY'.. Yeah. In theory, we should allow the local superuser to use COPY; in practice, nobody will care because they'll be using client-side COPY since the entire use-case for this is cloud-hosted DBs anyway. For my part- I don't see having everyone write their own set of SECURITY DEFINER functions as being either less complex or less risk. They're also a lot less convenient to use. That's not what RDS did, is it? No, and I agree with them on that part. 
I was thinking of having an extension in contrib, actually. That is, a canonical set of security definer functions. But if you think it's easier to actually implement the permissions restrictions in the actual utility functions, I wouldn't argue. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
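A minimal sketch of what one function in such a canonical contrib extension might look like (all names here are hypothetical; a production version would need more hardening, and CREATE ROLE remains cluster-wide under the covers):

```sql
-- Hypothetical SECURITY DEFINER wrapper, installed by a real superuser,
-- with EXECUTE granted to the delegated "local admin" role.  It creates
-- a login role and confines it to the current database via GRANT CONNECT
-- (assuming CONNECT was revoked from PUBLIC on that database).
CREATE FUNCTION local_create_user(username text, pwd text)
RETURNS void AS $$
BEGIN
    -- format() with %I/%L quotes identifiers and literals, guarding
    -- the dynamically built DDL against SQL injection
    EXECUTE format('CREATE ROLE %I LOGIN PASSWORD %L', username, pwd);
    EXECUTE format('GRANT CONNECT ON DATABASE %I TO %I',
                   current_database(), username);
END;
$$ LANGUAGE plpgsql SECURITY DEFINER SET search_path = pg_catalog;
```

Pinning search_path on the function is the usual defense against a caller hijacking unqualified names inside a SECURITY DEFINER body.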
Re: [HACKERS] COPY table FROM STDIN doesn't show count tag
Robert Haas robertmh...@gmail.com writes: On Wed, Mar 12, 2014 at 12:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: My inclination now (see later traffic) is to suppress the status report when the COPY destination is the same as pset.queryFout (ie, a simple test whether the FILE pointers are equal). This would suppress the status report for \copy to stdout and COPY TO STDOUT cases, and also for \copy to pstdout if you'd not redirected queryFout with \o. This is reasonably similar to what we already do for SELECT, isn't it? I mean, the server always sends back a command tag, but psql sometimes opts not to print it. Right, the analogy to SELECT gives some comfort that this is reasonable. regards, tom lane
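A sketch of how the proposed rule plays out in a psql session (hedged; exact behavior depends on the patch as finally committed):

```sql
-- Cases where the copy destination equals pset.queryFout, so the
-- "COPY n" status report would be suppressed under the proposal
-- (keeping piped usage like  psql -c 'copy t to stdout' db | prog  clean):
COPY mytable TO STDOUT;
\copy mytable to stdout

-- \copy ... to pstdout writes to psql's own stdout; once queryFout has
-- been redirected with \o, the two FILE pointers differ, so the
-- "COPY n" tag would be printed again:
\o /tmp/query_output.txt
\copy mytable to pstdout
\o
```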
Re: [HACKERS] db_user_namespace a temporary measure
Andrew Dunstan and...@dunslane.net writes: On 03/12/2014 02:09 PM, Josh Berkus wrote: Well, if you really want my I want a pony list: Local superusers (maybe this concept needs another name) would be able to do the following things in a *single* database: 1 change permissions for other users on that database and its objects 2 load extensions from a predefined .so directory / list 3 create/modify untrusted language functions 4 create per-database users and change their settings 5 change database settings (SET stuff) 6 NOT change their own user settings 7 NOT change any global users 8 NOT run SET PERSISTENT or other commands with global effect Item 3 gives away the store. Indeed. If you can do (3), you can break out of any of the other constraints. I suspect even (1) and/or (5) would be enough to mount trojan-horse attacks against real superusers who visit your database. I do not put any stock in the notion of constrained superuser. regards, tom lane
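To make concrete why item 3 "gives away the store" (illustrative sketch; plpythonu is just one example of an untrusted language, and the file path is hypothetical): an untrusted-language function body runs with the privileges of the server's OS user, entirely outside any SQL-level fence.

```sql
-- Any untrusted-language function escapes SQL-level restrictions,
-- because its body executes as the postgres OS user:
CREATE FUNCTION read_server_file(path text) RETURNS text AS $$
    return open(path).read()
$$ LANGUAGE plpythonu;

-- e.g. read (or, with a little more code, rewrite) cluster-wide files:
SELECT read_server_file('/var/lib/postgresql/data/pg_hba.conf');
```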
[HACKERS] GSoC 2014
Hello all, I am Ashutosh Dhundhara from Thapar University, Patiala, India, presently pursuing a Bachelor's degree in Computer Science and Engineering. This year I wish to work for PostgreSQL under the flagship of GSoC 2014, so please help me get started. I have a few questions: 1) Do I have to choose all ideas from the GSoC wiki page, or any one of them? 2) What is the deadline for fixing bugs, which will count toward the selection procedure? Please guide me on how to proceed. Regards, Ashutosh Dhundhara
Re: [HACKERS] GSoC 2014
On Wed, Mar 12, 2014 at 8:05 PM, Ashutosh Dhundhara ashutoshdhundh...@yahoo.com wrote: Hello all, I am Ashutosh Dhundhara from Thapar University, Patiala, India, presently pursuing a Bachelor's degree in Computer Science and Engineering. This year I wish to work for PostgreSQL under the flagship of GSoC 2014, so please help me get started. I have a few questions: 1) Do I have to choose all ideas from the GSoC wiki page, or any one of them? 2) What is the deadline for fixing bugs, which will count toward the selection procedure? Please guide me on how to proceed. You can propose your own ideas as well. You can pick any number of ideas from the GSoC 2014 wiki page and send proposals for them. The deadline for proposals is next Friday, I believe. Regards, Atri -- Atri *l'apprenant*
Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire
On Tue, Mar 11, 2014 at 3:53 AM, Amit Kapila amit.kapil...@gmail.com wrote: Places where tuple info not available LOG: process 5788 still waiting for ShareLock on transaction 679 after 1014.000 ms CONTEXT: while attempting to operate in relation public.idx_t1 of database postgres The way the context message is assembled piecemeal in XactLockTableWaitErrorContextCallback violates translation guidelines. You need to have completely separate strings for each variant. While attempting to operate in? That seems like unhelpful weasel-wording. I wonder if we ought to have separate messages for each possibility, like delete tuple (X,Y) when called from heap_delete(), update tuple (X,Y), check exclusion constraint on tuple (X,Y) when called from check_exclusion_constraint, etc. That seems like it would be handy information to have. Why can't check_exclusion_constraint, for example, pass the TID, so that at least that much information is available? I'm not very happy with the idea of including the tuple details only when the level is less than ERROR. For one thing, to do that in a way that respects translatability guidelines will require two versions of every string that would otherwise require only one. For another thing, it seems like it's punting a pretty important case. If we're gonna add context detail to lots of cases (instead only the still waiting case that people probably mostly care about) then we should actually print the details more-or-less consistently in all of those cases, not pretend like a solution that only works in the narrow case is more general than it really is. I think we should really try hard to make the amount of detail provided as uniform as possible across all the cases, even if that means removing information from some cases where it might have been available. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] jsonb and nested hstore
Also, GiST index is faster for create/update operations. I really hope we will improve jsonb indexing in the next one-two releases. For now I'd suggest people use expression indexes to index just the interesting keys, or use GiST. On Wed, Mar 12, 2014 at 5:15 PM, Tomas Vondra t...@fuzzy.cz wrote: On 12 March 2014, 0:41, Peter Geoghegan wrote: On Tue, Mar 11, 2014 at 3:58 PM, Tomas Vondra t...@fuzzy.cz wrote: ERROR: index row size 1416 exceeds maximum 1352 for index gin_idx All index AMs have similar restrictions. Yes, I know and I have no problem with restrictions in general. You may run into similar issues with btree indexes on text columns with long text, for example. The thing is that people don't generally index text directly, because it usually does not make much sense, but using tsvector etc. But with jsonb it's more likely because indexing is one of the goodies (at least for me). And the discussions with several people interested in storing json data I had recently went often like this: me: It seems we'll have a better json datatype in 9.4. them: Nice! me: And it will be possible to do searches on arbitrary keys. them: Yay! me: And we actually got pretty significant improvements in GIN indexes. them: Awesome! me: But the values you may index need to be less than ~1500B. them: Bummer :-( me: Well, you can use GIST then. A good example of such header is dkim-signature, which basically contains the whole message digitally signed with DKIM. The signature tends to be long and non-compressible. I'm wondering what's the best way around this, because I suspect many new users (especially those attracted by jsonb and GIN improvements) will run into this. Maybe not immediately, but eventually they'll try to insert a jsonb with a long value, and it will fail ... The jsonb_hash_ops operator class just stores a 32-bit integer hash value (it always sets the recheck flag, which only some of the other default GIN opclass' strategies do).
It only supports containment, and not the full variety of operators that the default opclass supports, which is why it isn't the default. I think that in practice the general recommendation will be that when indexing at the top level, use jsonb_hash_ops. When indexing nested items, use the more flexible default GIN opclass. That seems like a pretty smart trade-off to me. OK, I'll look into the jsonb_hash_ops - that sounds more or less like what I was thinking about (and sure, storing hashes makes some operations impossible to support). The other thing I was thinking about is introducing some kind of upper limit for the value length - e.g. index just the first 1kB, or something like that. My experience is most values are way shorter, or actually differ in the first 1kB, so this should allow most decisions to be made. But I'm not really that familiar with how GIN works, so maybe this is nonsense. The more I think about it, the more inclined I am to lose GiST support entirely for the time being. It lets us throw out about 700 lines of C code, which is a very significant fraction of the total, removes the one open bug, and removes the least understood part of the code. The GiST opclass is not particularly compelling for this. I disagree with that. I see GiST as a simple fallback option for the cases I described. I wasn't able to create a GIN index because of exceeding the max item length, but GiST created just fine. It was considerably slower, but it worked. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
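The indexing options weighed above, side by side (sketch; table/index names are hypothetical, and the jsonb_hash_ops and GiST opclasses are as proposed in the patch under discussion):

```sql
-- Default jsonb GIN opclass: full operator support, but an individual
-- indexed value can exceed the maximum index row size (1352 bytes in
-- the error shown above).
CREATE INDEX docs_gin ON docs USING gin (data);

-- jsonb_hash_ops: each entry is a 32-bit hash, so value length is a
-- non-issue, but only containment is supported and a heap recheck is
-- always required.
CREATE INDEX docs_gin_hash ON docs USING gin (data jsonb_hash_ops);

-- GiST (per the patch): slower to query but faster to build/update,
-- and it did not hit the item-size limit in the test above.
CREATE INDEX docs_gist ON docs USING gist (data);
```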
[HACKERS] Replication slots and footguns
All: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? If I'm not, that seems like something to fix before 9.4 release. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] db_user_namespace a temporary measure
On 03/12/2014 11:40 AM, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: On 03/12/2014 02:09 PM, Josh Berkus wrote: Well, if you really want my I want a pony list: Local superusers (maybe this concept needs another name) would be able to do the following things in a *single* database: 1 change permissions for other users on that database and its objects 2 load extensions from a predefined .so directory / list 3 create/modify untrusted language functions 4 create per-database users and change their settings 5 change database settings (SET stuff) 6 NOT change their own user settings 7 NOT change any global users 8 NOT run SET PERSISTENT or other commands with global effect Item 3 gives away the store. Indeed. If you can do (3), you can break out of any of the other constraints. I suspect even (1) and/or (5) would be enough to mount trojan-horse attacks against real superusers who visit your database. ... nobody reads my whole post, except Stephen. :-( -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Re: [HACKERS] Replication slots and footguns
Hi, On 2014-03-12 12:00:25 -0700, Josh Berkus wrote: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? One with a connected walsender. It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? It's sufficient to terminate the walsender and then drop the slot. That seems ok for now? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] db_user_namespace a temporary measure
* Josh Berkus (j...@agliodbs.com) wrote: On 03/12/2014 11:25 AM, Stephen Frost wrote: * Josh Berkus (j...@agliodbs.com) wrote: Local superusers (maybe this concept needs another name) would be able to do the following things in a *single* database: 1 change permissions for other users on that database and its objects What about bypass permissions, ala what superuser does today? Or are you saying we'd only need to allow this new kind of role to bypass the checks in the GRANT/REVOKE system? More like what we have for the database owner role today. eh? Can you clarify what you think the database owner role can do today that actually matches what you're asking for above, or even what I'm suggesting? 2 load extensions from a predefined .so directory / list This would obviously have to be a curated list that avoids things like 'adminpack'... It would need to be a list created by the global superuser. By default, nothing would be on it. Hmm, so this would be independent from what's installed on the system at a file level. That's an interesting idea. 4 create per-database users and change their settings Presumably just for the 'local' DB? Right. 5 change database settings (SET stuff) This can be done by the database-owner already, no? Oh, point. 6 NOT change their own user settings Don't think this is quite that simple (passwords?). Well, we already limit what things users can change about themselves; they can't promote themselves to superuser, for example. Just saying that we need to be more specific on this point. 7 NOT change any global users What about role membership, wrt local vs. global roles? Allowing global users to join local ROLEs is its own can'o'worms that would merit an entire other thread. What would the caveats here be then..? Local roles can join global ones, but global roles can't join local ones? I'm not sure how much of an issue this will really be if they're all sharing one catalog with different namespaces (eg: the '@db' stuff). 
Assuming, of course, that we had local users in the first place. That's what this thread is about, isn't it..? 8 NOT run SET PERSISTENT or other commands with global effect Indeed, or use 'COPY'.. Yeah. In theory, we should allow the local superuser to use COPY; in practice, nobody will care because they'll be using client-side COPY since the entire use-case for this is cloud-hosted DBs anyway. How could we allow the local superuser to use COPY? They'd be able to overwrite files in other databases, or possibly modify postgresql.conf, etc, depending on the overall system environment. For my part- I don't see having everyone write their own set of SECURITY DEFINER functions as being either less complex or less risk. They're also a lot less convenient to use. That's not what RDS did, is it? No, and I agree with them on that part. I was thinking of having an extension in contrib, actually. That is, a canonical set of security definer functions. But if you think it's easier to actually implement the permissions restrictions in the actual utility functions, I wouldn't argue. Having it in contrib would be better than nothing, but I'd advocate putting this in as part of our actual permissions model, along with the read-only/auditor-type option. Having this not-quite-superuser which is able to read all tables, possibly modify any of them, without having to change the permissions system could be extremely useful and would be rather painful to implement through SD functions. Thanks, Stephen
Re: [HACKERS] Replication slots and footguns
On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2014-03-12 12:00:25 -0700, Josh Berkus wrote: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? One with a connected walsender. It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? It's sufficient to terminate the walsender and then drop the slot. That seems ok for now? Urgh. That error message looks susceptible to improvement. How about: replication slot %s cannot be dropped because it is currently in use -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication slots and footguns
On 03/12/2014 12:03 PM, Andres Freund wrote: Hi, On 2014-03-12 12:00:25 -0700, Josh Berkus wrote: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? One with a connected walsender. In a world of network proxies, a walsender could be connected for hours after the replica has ceased to exist. Fortunately, wal_sender_timeout is changeable on a reload. We check for actual standby feedback for the timeout, yes? It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? It's sufficient to terminate the walsender and then drop the slot. That seems ok for now? We have no safe way to terminate the walsender that I know of; pg_terminate_backend() doesn't include walsenders last I checked. So the procedure for this would be: 1) set wal_sender_timeout to some low value (1); 2) reload 3) call pg_drop_replication_slot('slotname') Clumsy, but it will do for a first pass; we can make it better (for example, by adding a force boolean to pg_drop_replication_slot) in 9.5. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication slots and footguns
On 2014-03-12 15:18:04 -0400, Robert Haas wrote: On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2014-03-12 12:00:25 -0700, Josh Berkus wrote: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? One with a connected walsender. It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? It's sufficient to terminate the walsender and then drop the slot. That seems ok for now? Urgh. That error message looks susceptible to improvement. How about: replication slot %s cannot be dropped because it is currently in use I think that'd require duplicating some code between acquire and drop, but how about replication slot %s is in use by another backend? Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication slots and footguns
On 2014-03-12 12:23:01 -0700, Josh Berkus wrote: On 03/12/2014 12:03 PM, Andres Freund wrote: Hi, On 2014-03-12 12:00:25 -0700, Josh Berkus wrote: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? One with a connected walsender. In a world of network proxies, a walsender could be connected for hours after the replica has ceased to exist. Fortunately, wal_sender_timeout is changeable on a reload. We check for actual standby feedback for the timeout, yes? Yep. It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? It's sufficient to terminate the walsender and then drop the slot. That seems ok for now? We have no safe way to terminate the walsender that I know of; pg_terminate_backend() doesn't include walsenders last I checked. SELECT pg_terminate_backend(pid) FROM pg_stat_replication; works. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
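Putting Andres's answer together with Josh's earlier steps, the whole workaround might look like this (sketch; the slot name and timeout value are hypothetical):

```sql
-- Optionally lower wal_sender_timeout (reload-able) so walsenders
-- stuck behind dead proxy connections get reaped quickly:
--   wal_sender_timeout = '5s' in postgresql.conf, then
SELECT pg_reload_conf();

-- Terminate the walsender(s) keeping the slot "active":
SELECT pg_terminate_backend(pid) FROM pg_stat_replication;

-- The slot is no longer active and can now be dropped:
SELECT pg_drop_replication_slot('slot_1');
```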
Re: [HACKERS] Memory ordering issue in LWLockRelease, WakeupWaiters, WALInsertSlotRelease
On 2014-03-07 17:54:32 +0200, Heikki Linnakangas wrote: So there are some unexplained differences there, but based on these results, I'm still OK with committing the patch. So, I am looking at this right now. I think there are some minor things I'd like to see addressed: 1) I think there needs to be a good-sized comment explaining why WaitXLogInsertionsToFinish() isn't racy due to the unlocked read at the beginning of LWLockWait(). I think it's safe because we're reading Insert-CurrBytePos inside a spinlock, and it will only ever increment. As SpinLockAcquire() has to be a read barrier we can assume that every skewed read in LWLockWait() will be for a lock protecting a newer insertingAt? 2) I am not particularly happy about the LWLockWait() LWLockWakeup() function names. They sound too much like a part of the normal lwlock implementation to me. But admittedly I don't have a great idea for a better naming scheme. Maybe LWLockWaitForVar(), LWLockWakeupVarWaiter()? 3) I am the wrong one to complain, I know, but the comments above struct WALInsertLock are pretty hard to read from the sentence structure. 4) WALInsertLockAcquire() needs to comment on acquiring/waking all but the last slot. Generally the trick of exclusive xlog insertion lock acquisition only really using the last lock could use a bit more docs. 5) WALInsertLockRelease() comments on the reset of insertingAt being optional, but I am not convinced that that's true anymore. If an exclusive acquisition isn't seen as 0 or INT64CONST(0x) by another backend we're in trouble, right? Absolutely not sure without thinking on it for longer than I can concentrate right now. 6) Pretty minor, but from a style POV it seems nicer to separate exclusive/nonexclusive out of WALInsertLockAcquire(). The cases don't share any code now. A patch containing some trivial changes is attached...
Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 484b9c5..8a55c6b 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -1628,8 +1628,6 @@ WALInsertLockRelease(void) static void WALInsertLockWakeup(XLogRecPtr insertingAt) { - int i; - if (holdingAllLocks) { /* diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c index f88bf76..2695128 100644 --- a/src/backend/storage/lmgr/lwlock.c +++ b/src/backend/storage/lmgr/lwlock.c @@ -873,6 +873,9 @@ LWLockWait(LWLock *l, uint64 *valptr, uint64 oldval, uint64 *newval) int extraWaits = 0; bool result = false; + /* can't be used with shared locks for now */ + Assert(lock-shared == 0); + /* * Quick test first to see if it the slot is free right now. * @@ -905,6 +908,8 @@ LWLockWait(LWLock *l, uint64 *valptr, uint64 oldval, uint64 *newval) SpinLockAcquire(lock-mutex); #endif + Assert(lock-shared == 0); + /* Is the lock now free, and if not, does the value match? */ if (lock-exclusive == 0) { @@ -1022,6 +1027,7 @@ LWLockWakeup(LWLock *l, uint64 *valptr, uint64 val) SpinLockAcquire(lock-mutex); /* we should hold the lock */ + LWLockHeldByMe(l); Assert(lock-exclusive == 1); /* Update the lock's value */ -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication slots and footguns
On Wed, Mar 12, 2014 at 3:25 PM, Andres Freund and...@2ndquadrant.com wrote: On 2014-03-12 15:18:04 -0400, Robert Haas wrote: On Wed, Mar 12, 2014 at 3:03 PM, Andres Freund and...@2ndquadrant.com wrote: Hi, On 2014-03-12 12:00:25 -0700, Josh Berkus wrote: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? One with a connected walsender. It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? It's sufficient to terminate the walsender and then drop the slot. That seems ok for now? Urgh. That error message looks susceptible to improvement. How about: replication slot %s cannot be dropped because it is currently in use I think that'd require duplicating some code between acquire and drop, but how about replication slot %s is in use by another backend? Sold. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 6:20 AM, Tomas Vondra t...@fuzzy.cz wrote: I'm still not sure how would that look. Does that mean I'd have to create multiple GIN indexes - one for each possible key or something like that? Can you give an example? It could mean that you're obliged to create multiple indexes, yes. For an example, and to get a better sense of what I mean, look at the documentation in the patch. The idea that you're going to create one index on a jsonb, and it's going to be able to usefully index a lot of different queries doesn't seem practical for most use-cases. Mostly, people will have fairly homogeneous json documents, and they'll want to index certain nested fields common to all or at least a large majority of those documents. By indexing entire jsonb datums, do you hope to get much benefit out of the indexed values (as opposed to keys) being stored (in serialized form) in the GIN index? Because you *are* indexing a large nested structure as a value. Is that large nested structure going to appear in your query predicate, or are you just going to subscript the jsonb to get to the level that's of interest to query that? I'm pretty sure that people want the latter. Are you sure that your complaint isn't just that the default GIN opclass indexes values (as distinct from keys) that are large and unwieldy, and not terribly useful? I don't think expressional indexes are some kind of unfortunate work around for a jsonb limitation. I think that they're the natural way to approach indexing a nested structure in Postgres. MongoDB, for example, does not magically index everything. You're still required to make choices about indexing that consider the access patterns. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
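An example of the pattern recommended above (hedged sketch; the table and key names are hypothetical, and the operator syntax follows the jsonb patch): index the one nested field your queries actually touch, rather than whole documents.

```sql
-- Expression index on a single nested field common to most documents:
CREATE INDEX docs_tags_idx ON docs USING gin ((data -> 'tags'));

-- A query predicate that subscripts down to that level can use it:
SELECT * FROM docs WHERE data -> 'tags' @> '["postgresql"]';
```

Documents without that key simply contribute nothing useful to the index, which is why homogeneous document sets fit this approach well.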
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote: Also, GiST index is faster for create/update operations. I really hope we will improve jsonb indexing in the next one-two releases. For now I'd suggest people index expressional indexes to index just interesting keys or use GiST. When do you ever want to index non-interesting keys? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication slots and footguns
On 03/12/2014 12:26 PM, Andres Freund wrote: On 2014-03-12 12:23:01 -0700, Josh Berkus wrote: On 03/12/2014 12:03 PM, Andres Freund wrote: It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? It's sufficient to terminate the walsender and then drop the slot. That seems ok for now? We have no safe way to terminate the walsender that I know of; pg_terminate_backend() doesn't include walsenders last I checked. SELECT pg_terminate_backend(pid) FROM pg_stat_replication; Aha! Ok, I'll work on some documentation. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
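[Editor's note: putting the two steps together, the recipe Andres and Josh converge on looks like this, using the slot name from Josh's example; in practice you would add a WHERE clause to terminate only the walsender in question.]

```sql
-- Terminate the walsender holding the slot, making the slot inactive...
SELECT pg_terminate_backend(pid) FROM pg_stat_replication;
-- ...after which the drop succeeds.
SELECT pg_drop_replication_slot('slot_1');
```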
Re: [HACKERS] Replication slots and footguns
On 03/12/2014 12:34 PM, Robert Haas wrote: Urgh. That error message looks susceptible to improvement. How about: replication slot %s cannot be dropped because it is currently in use I think that'd require duplicating some code between acquire and drop, but how about replication slot %s is in use by another backend? Sold. Wait ... before you go further ... I object to dropping the word active from the error message. The column is called active, and that's where a DBA should look; that word needs to stay in the error message. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Thu, Mar 13, 2014 at 12:10 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote: Also, GiST index is faster for create/update operations. I really hope we will improve jsonb indexing in the next one-two releases. For now I'd suggest people index expressional indexes to index just interesting keys or use GiST. When do you ever want to index non-interesting keys? A regular user may just index all keys. I mean, the json can contain keys which are never searched, so there is no need to index them, and skipping them saves index size. We could probably provide an option in CREATE INDEX to specify what to index and what not to index, but that would require the planner to know that information. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/12/2014 04:10 PM, Peter Geoghegan wrote: On Wed, Mar 12, 2014 at 11:57 AM, Oleg Bartunov obartu...@gmail.com wrote: Also, GiST index is faster for create/update operations. I really hope we will improve jsonb indexing in the next one-two releases. For now I'd suggest people index expressional indexes to index just interesting keys or use GiST. When do you ever want to index non-interesting keys? The problem is when do you know they are interesting? One major use case for using treeish data types in the first place is that you don't know when you're designing the database exactly what shape the data will be. If you don't know that, then how are you supposed to know what in it will be interesting? It's somewhat analogous to full text indexing, where we don't know in advance what phrases or words will be interesting. Here, a key is the equivalent of a word and a key path or subpath is the equivalent of a phrase. Maybe I'm dreaming, since I have no idea how to go about this sort of indexing, but it's where I'd like to see lots of effort. I agree with Oleg that we need to be very creative about jsonb indexing. One of my hopes is that by going down the road we are on, we'll get much wider interest in this, and that both ideas and money might flow towards addressing it in a way that we probably wouldn't have seen otherwise. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication slots and footguns
On 12 March 2014 19:00, Josh Berkus j...@agliodbs.com wrote: All: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? I'm not clear on why dropping an active replication slot would solve disk space problems related to WAL. I thought it was inactive slots that were the problem in this regard? -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
Andrew, Peter: Just so I'm clear on the limits here, lemme make sure I understand this: a) GIN indexing is limited to ~1500 chars b) The value, which includes everything other than the top level set of keys, is one item as far as GIN is concerned. Therefore: we are limited to indexing JSON where nothing below a top-level key is more than 1500 bytes? I'm asking for documentation purposes. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 1:37 PM, Andrew Dunstan and...@dunslane.net wrote: One major use case for using treeish data types in the first place is that you don't know when you're designing the database exactly what shape the data will be. If you don't know that, then how are you supposed to know what in it will be interesting? It's somewhat analogous to full text indexing, where we don't know in advance what phrases or words will be interesting. Here, a key is the equivalent of a word and a key path or subpath is the equivalent of a phrase. You don't know exactly how, but you have some idea. The major benefit is that you can add new things to new documents as the need arises, and that's not a big deal, nor does it require a migration with DDL. If we continue to take MongoDB as representative of how people will use jsonb, they pretty strongly encourage the idea that you have to have some structure or design. Google mongodb schema design to see what I mean - you'll find plenty. It has more to do with making querying the data possible than anything else. There is a limited amount you can do with a bunch of documents that share little in common in terms of their structure - what does a query (that can use an index just in principle) even look like there? The use case you describe here doesn't sound like something similar to full text search. It sounds like something identical. In any case, let's focus on what we have right now. I think that the indexing facilities proposed here are solid. In any case they do not preclude working on better indexing strategies as the need emerges. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 03/12/2014 04:58 PM, Peter Geoghegan wrote: In any case, let's focus on what we have right now. I think that the indexing facilities proposed here are solid. In any case they do not preclude working on better indexing strategies as the need emerges. I quite agree, didn't mean to suggest otherwise. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..
Hi, Here's an updated patch. I had to push the LIMIT processing into ModifyTable to make the behaviour sane in parallel scenarios. As usual, please ignore if you're busy with 9.4. I will work on better docs and more tests from now on and am preparing to make a solid case for adding this. update_delete_order_by_limit_v1.diff Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12.3.2014 20:40, Peter Geoghegan wrote: On Wed, Mar 12, 2014 at 6:20 AM, Tomas Vondra t...@fuzzy.cz wrote: I'm still not sure how would that look. Does that mean I'd have to create multiple GIN indexes - one for each possible key or something like that? Can you give an example? It could mean that you're obliged to create multiple indexes, yes. For an example, and to get a better sense of what I mean, look at the documentation in the patch. OK, will do. The idea that you're going to create one index on a jsonb, and it's going to be able to usefully index a lot of different queries doesn't seem practical for most use-cases. Mostly, people will have fairly homogeneous json documents, and they'll want to index certain nested fields common to all or at least a large majority of those documents. I think that's an unfounded assumption. Many users actually have very little control over the documents or queries - a nice example may be the mail archive, with headers stored in a hstore/jsonb. I have absolutely no control over the headers or queries. But I think this is a feedback loop too - what if many users actually want that functionality, but realize that expression indexes are not sufficient for their needs and thus don't even try (and so we don't hear about them)? And my experience is that this is actually one of the very cool hstore features - being able to index the whole structure and then do arbitrary queries over that. The only reason I'm looking at jsonb is its improved support for data types (especially arrays). So I have my doubts about the claims that users have homogeneous documents and only want to index some fields with expression indexes. By indexing entire jsonb datums, do you hope to get much benefit out of the indexed values (as opposed to keys) being stored (in serialized form) in the GIN index? Because you *are* indexing a large nested structure as a value. 
Is that large nested structure going to appear in your query predicate, or are you just going to subscript the jsonb to get to the level that's of interest to query that? I'm pretty sure that people want the latter. Are you sure that your complaint isn't just that the default GIN opclass indexes values (as distinct from keys) that are large and unwieldy, and not terribly useful? No, I don't expect a large nested structure to appear in the query. And I expect most people won't need that, although I can imagine queries using @> doing that (not sure if that checks for equality or 'subset'). But I'm not sure I understand how this is related to my original post? All I was asking was whether it wouldn't be enough to store a hash instead of the original value, i.e. instead of this: {"from": "j...@example.com", "to": "j...@example.com", "content-type": "text/plain; charset=us-ascii", "dkim-signature": "very long value"} this: {129812: 29382, 459821: 1029381, 21083: 102941, 111390: 129010292} which would solve the issues with long values and might still support the queries (with recheck, of course). I don't know if that's what jsonb_hash_ops does or if it's even possible / compatible with GIN. I don't think expressional indexes are some kind of unfortunate work around for a jsonb limitation. I think that they're the natural way to approach indexing a nested structure in Postgres. MongoDB, for example, does not magically index everything. You're still required to make choices about indexing that consider the access patterns. For many use cases, expressional indexes are the right tool. But not for all of them, and I see no reason to just throw some tools away. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On 03/04/2014 01:58 PM, Amit Kapila wrote: On Mon, Mar 3, 2014 at 7:57 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 02/16/2014 01:51 PM, Amit Kapila wrote: On Wed, Feb 5, 2014 at 5:29 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Thanks. I have to agree with Robert though that using the pglz encoding when we're just checking for a common prefix/suffix is a pretty crappy way of going about it [1]. As the patch stands, it includes the NULL bitmap when checking for a common prefix. That's probably not a good idea, because it defeats the prefix detection in the common case that you update a field from NULL to not-NULL or vice versa. Attached is a rewritten version, which does the prefix/suffix tests directly in heapam.c, and adds the prefix/suffix lengths directly as fields in the WAL record. If you could take one more look at this version, to check if I've missed anything. I verified the patch and found a few minor points: ... Fixed those. One question: + rdata[1].data = (char *) xlrec; Earlier it seems to store the record header as the first segment rdata[0]; what's the reason for changing it? I found the code easier to read that way. The order of rdata entries used to be: 0: xl_heap_update struct 1: full-page reference to oldbuf (no data) 2: xl_heap_header_len struct for the new tuple 3-7: logical decoding stuff The prefix/suffix fields made that order a bit awkward, IMHO. They are logically part of the header, even though they're not part of the struct (they are documented in comments inside the struct). So they ought to stay together with the xl_heap_update struct. Another option would've been to move it after the xl_heap_header_len struct. Note that this doesn't affect the on-disk format of the WAL record, because the moved rdata entry is just a full-page reference, with no payload of its own. I have verified the patch by doing crash recovery for the below scenarios and it worked fine: a. no change in old and new tuple b. all changed in new tuple c. 
half changed (update half of the values to NULLS) in new tuple d. only prefix same in new tuple e. only suffix same in new tuple f. prefix-suffix same, other columns values changed in new tuple. Thanks! Conclusion is that patch shows good WAL reduction and performance improvement for favourable cases without CPU overhead for non-favourable cases. Ok, great. Committed! I left out the regression tests. It was good to have them while developing this, but I don't think there's a lot of value in including them permanently in the regression suite. Low-level things like the alignment-sensitive test are fragile, and can easily stop testing the thing it's supposed to test, depending on the platform and future changes in the code. And the current algorithm doesn't care much about alignment anyway. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12.3.2014 21:58, Peter Geoghegan wrote: The use case you describe here doesn't sound like something similar to full text search. It sounds like something identical. I think this very much depends on the definition of full text search. In any case, let's focus on what we have right now. I think that the indexing facilities proposed here are solid. In any case they do not preclude working on better indexing strategies as the need emerges. +1 Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On Wed, Mar 12, 2014 at 2:30 PM, Tomas Vondra t...@fuzzy.cz wrote: I think that's unfounded assumption. Many users actually have very little control over the documents or queries - a nice example may be the mail archive, with headers stored in a hstore/jsonb. I have absolutely no control over the headers or queries. Maybe, but what do you want me to do to help them? Indexing a typical jsonb field is a bad idea, unless you really do want something essentially equivalent to full text search (which could be justified), or unless you know ahead of time that your documents are not going to be heavily nested. The whole basis of your complaints seems to be that people won't know that at all. For many usecases, expressional indexes are the right tool. But not for all and I see no reason to just throw some tools away. If the tool you're talking about throwing away is the GiST opclass, I do not propose to throw that away. I don't think it's important enough to justify inclusion in our first cut at this, especially given the fact that the code has bugs, and is quite a bit more complex than GIN. What's wrong with those reasons? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
On 12.3.2014 21:55, Josh Berkus wrote: Andrew, Peter: Just so I'm clear on the limits here, lemme make sure I understand this: a) GIN indexing is limited to ~~1500chars The exact message I get is this: ERROR: index row size 1944 exceeds maximum 1352 for index tmp_idx so it's 1352B. But IIRC this is closely related to block size, so with larger block sizes you'll get different limits. Also, this is a limit on the compressed value, which makes it less user-friendly as it's difficult to predict whether a row is OK or not :-( And I just discovered this: create table tmp (val jsonb); create index tmp_gin_idx on tmp using gin (val); insert into tmp select ('{"z": "' || repeat('z', 1000000) || '"}')::jsonb; which tries to insert a well-compressible string ('z' repeated 1e6-times), and fails with this: ERROR: index row requires 11472 bytes, maximum size is 8191 So I think it's quite difficult to give a simple and exact explanation in the docs, other than there are limits, but it's difficult to say when you hit them. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb and nested hstore
* Tomas Vondra (t...@fuzzy.cz) wrote: So I think it's quite difficult to give simple and exact explanation in the docs, other than there are limits, but it's difficult to say when you hit them. Arrays have more-or-less the same issue... Thanks, Stephen
Re: [HACKERS] jsonb and nested hstore
On 12.3.2014 22:43, Peter Geoghegan wrote: On Wed, Mar 12, 2014 at 2:30 PM, Tomas Vondra t...@fuzzy.cz wrote: I think that's unfounded assumption. Many users actually have very little control over the documents or queries - a nice example may be the mail archive, with headers stored in a hstore/jsonb. I have absolutely no control over the headers or queries. Maybe, but what do you want me to do to help them? Indexing a typical jsonb field is a bad idea, unless you really do want something essentially equivalent to full text search (which could be justified), or unless you know ahead of time that your documents are not going to be heavily nested. The whole basis of your complaints seems to be that people won't know that at all. Well, I would be quite happy with the GIN indexing without the limit I ran into. I don't think we need to invent something entirely new. You're right that the index is pretty futile with a condition matching field/value combination. But what if I'm doing a query with multiple such conditions, and the combination matches just a small fraction of rows? GIN index works with that (and the patches from Alexander improve this case tremendously, IIRC). I still don't understand how's this similar to fulltext - that seems pretty unsuitable for a treeish structure, assuming you can't flatten it. Which you can't, if the queries use paths to access just parts of the json value. For many usecases, expressional indexes are the right tool. But not for all and I see no reason to just throw some tools away. If the tool you're talking about throwing away is the GiST opclass, I do not propose to throw that away. I don't think it's important enough to justify inclusion in our first cut at this, especially given the fact that the code has bugs, and is quite a bit more complex than GIN. What's wrong with those reasons? Meh, I accidentally mixed two responses :-/ I have no problem with expression indexes, but it's not a good solution to all problems. 
I certainly can't use them to achieve what I'd like, and I disagree with your assumptions that it doesn't make sense to index everything / non-interesting keys, or that the documents have a well-defined structure. I can live with larger / less efficient indexes on all fields. Regarding GiST - I understand your concerns about complexity, and you may be right that not shipping it now is preferable to shipping it with bugs. The thing is, it doesn't have the issues with value lengths that prevent me from using GIN, and although GiST is slower, it's at least some indexing. But maybe jsonb_hash_ops will work, I haven't tried yet. regards Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
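[Editor's note: for reference, trying the opclass Tomas mentions would look roughly like this; jsonb_hash_ops was the opclass name in the patch at the time. It stores hashes rather than the values themselves, so long values should not run into the GIN index-row size limit, at the price of an index recheck.]

```sql
-- Same table as in Tomas's earlier example.
CREATE INDEX tmp_hash_idx ON tmp USING gin (val jsonb_hash_ops);
-- Containment queries can use the index; values are rechecked on heap fetch.
SELECT * FROM tmp WHERE val @> '{"z": "zzz"}';
```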
Re: [HACKERS] 9.5: UPDATE/DELETE .. ORDER BY .. LIMIT ..
Oops. Of course I shouldn't try to change how INSERT works. Latest version attached. update_delete_order_by_limit_v2.diff Description: Binary data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [PATCH] Store Extension Options
On Mon, Mar 10, 2014 at 9:33 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote: I haven't touched pg_dump yet, but if this proposed design sits well with everyone, my intention is that the dump output will contain the pg_register_option_namespace() calls necessary so that a table definition will be able to do the SET calls to set the values the original table has, and succeed. In other words, restoring a dump will preserve the values you had, without a need of having the module loaded in the new server. I think this is what was discussed. Robert, do you agree? No, I wasn't imagining anything like pg_register_option_namespace(). My thought was that you'd need to have any relevant modules loaded at restore time. In essence, patching in a new option via an extension module would work about like adding one by patching the core code: you need a server version that supports that option in order to set it. I don't like the idea of using reloptions to let people attach arbitrary unvalidated settings to tables. I consider the way things work with GUCs to be a bug, not a feature, and definitely not something I want to propagate into every other area of the system where the underlying storage format happens to allow it. I also kind of think that what you're going to find if you try to press forward with the pg_register_option_namespace() idea is that what you really want is CREATE RELOPTION NAMESPACE, ALTER RELOPTION NAMESPACE, DROP RELOPTION NAMESPACE. Short of that, you're going to end up with a bunch of kludges, I suspect. And some kind of real DDL syntax (with better naming) is OK with me, but as you observed elsewhere on the thread, now you're looking at a new catalog and a bunch more complexity. I kind of think that this is too half-baked for 9.4 and we ought to punt it to 9.5. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation
On Wed, Mar 12, 2014 at 5:30 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Ok, great. Committed! Awesome. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Replication slots and footguns
On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown t...@linux.com wrote: On 12 March 2014 19:00, Josh Berkus j...@agliodbs.com wrote: All: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? I'm not clear on why would dropping an active replication slot would solve disk space problems related to WAL. I thought it was inactive slots that were the problem in this regard? You could still have an active slot with a standby that is not able to catch up AFAIK. -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
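[Editor's note: to see which slots exist, whether they are active, and how far back they are pinning WAL, the new catalog view can be queried directly; column names as in the 9.4 patch.]

```sql
SELECT slot_name, active, restart_lsn
FROM pg_replication_slots;
```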
Re: [HACKERS] Rowtype column and domain subfield with DEFAULT and NOT NULL constraint
On Wed, Mar 12, 2014 at 11:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Julien Tachoires julien.tachoi...@dalibo.com writes: A customer has reported us a strange behaviour regarding a rowtype column with a domain subfield: Rowtypes in general do not support defaults for component fields. And what about adding a TODO item? Support default values for component fields of rowtypes In the meantime, we could also improve the documentation to mention this with some examples, for instance some more content on the INSERT page. Regards, -- Michael -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire
Hi, On 13/03/14 03:27, Fujii Masao wrote: Committed! Thank you very much! Best regards, -- Christian Kruse http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] db_user_namespace a temporary measure
On Wed, Mar 12, 2014 at 9:19 AM, Andres Freund and...@2ndquadrant.com wrote: Isn't this just a case of creating a suitable operator and an exclusion constraint? Defining the constraint in BKI might require extra infrastructure, but it should be possible. Except that we don't have the infrastructure to perform such checks (neither partial, nor expression indexes, no exclusion constraints) on system tables atm. So it's not a entirely trivial thing to do. I'm probably woefully underinformed here, but it seems like getting exclusion constraints working might be simpler than partial indexes or expression indexes, because both of those involve being able to evaluate arbitrary predicates, whereas exclusion constraints just involve invoking index access methods to look for conflicting rows via smarts built into your index AM. The latter seems to involve less risk of circularity (but I might be wrong). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
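[Editor's note: for readers following along, the user-level form of the exclusion constraints being discussed looks like this textbook example, not system-catalog code; btree_gist is needed for the = operator under GiST.]

```sql
CREATE EXTENSION IF NOT EXISTS btree_gist;
CREATE TABLE reservation (
    room   int,
    during tsrange,
    -- no two rows may share a room with overlapping time ranges
    EXCLUDE USING gist (room WITH =, during WITH &&)
);
```

The question in the thread is whether this machinery could ever run against system catalogs, which currently support none of it.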
Re: [HACKERS] Replication slots and footguns
On 12 March 2014 23:17, Michael Paquier michael.paqu...@gmail.com wrote: On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown t...@linux.com wrote: On 12 March 2014 19:00, Josh Berkus j...@agliodbs.com wrote: All: I was just reading Michael's explanation of replication slots (http://michael.otacoo.com/postgresql-2/postgres-9-4-feature-highlight-replication-slots/) and realized there was something which had completely escaped me in the pre-commit discussion: select pg_drop_replication_slot('slot_1'); ERROR: 55006: replication slot slot_1 is already active LOCATION: ReplicationSlotAcquire, slot.c:339 What defines an active slot? It seems like there's no way for a DBA to drop slots from the master if it's rapidly running out of disk WAL space without doing a restart, and there's no way to drop the slot for a replica which the DBA knows is permanently offline but was connected earlier. Am I missing something? I'm not clear on why would dropping an active replication slot would solve disk space problems related to WAL. I thought it was inactive slots that were the problem in this regard? You could still have an active slot with a standby that is not able to catch up AFAIK. In that scenario, why would one wish to drop the replication slot? If it can't keep up, dropping the replication slot would likely mean you'd orphan the standby due to the primary no longer holding on to the necessary WAL, and the standby is then useless. In which case, if the standby is causing such problems, why not shut down that standby, thereby effectively decommissioning it, then delete the slot? -- Thom -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] 9a57858f1103b89a5674f0d50c5fe1f756411df6
On the pgsql-packagers list, there has been some (OT for that list) discussion of whether commit 9a57858f1103b89a5674f0d50c5fe1f756411df6 is sufficiently serious to justify yet another immediate minor release of 9.3.x. The relevant questions seem to be: 1. Is it really bad? 2. Does it affect a lot of people or only a few? 3. Are there more, equally bad bugs that are unfixed, or perhaps even unreported, yet? Obviously, we don't want to leave serious bugs unpatched. On the other hand, as Tom pointed out in that discussion, releases are a lot of work, and we can't do them for every commit. Discuss. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgresql XML parsing
Hello,

On 03/12/2014 09:36 AM, Ashoke wrote:

Hi, I am working on adding a functionality to PostgreSQL. I need to parse the XML format query plan (produced by PostgreSQL v9.3) and save it in a simple data structure (say a C structure). I was wondering if ...

The only XML parsing we have is where Postgres is built with libxml, in which case we use its parser. But query plan XML is delivered to a client (or a log file, which means more or less the same thing here).

For what it's worth, on the backend side the explain output can be obtained in any format via ExplainPrintPlan(), though I don't know whether that applies to this case.

If you want to parse it then it should be parsed in the client - that's why we provide it. Inside postgres I don't see a point in parsing the XML rather than handling the query plan directly. The worst possible option would be to make a hand-cut XML parser, either in the client or the server - XML parsing has all sorts of wrinkles that can bite you badly.

I agree. If XML input is not essential, the JSON format would be easier to parse than XML, and 9.3 already has a JSON parser infrastructure available for the purpose.

regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
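As context for the JSON suggestion: 9.3 can already emit plans in JSON, which nearly every client language can parse with its standard library. A sketch (the exact node tree depends on the query, version, and plan chosen):

```sql
-- Ask the server for a machine-readable plan instead of XML
EXPLAIN (FORMAT JSON) SELECT 1;

-- The result is a JSON array of plan objects, roughly of the shape
--   [ { "Plan": { "Node Type": "...", ... } } ]
-- which a client can walk with any off-the-shelf JSON parser,
-- avoiding the hand-cut XML parser warned against above.
```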
Re: [HACKERS] [PATCH] Store Extension Options
On 03/12/2014 03:58 PM, Robert Haas wrote:

I don't like the idea of using reloptions to let people attach arbitrary unvalidated settings to tables. I consider the way things work with GUCs to be a bug, not a feature, and definitely not something I want to propagate into every other area of the system where the underlying storage format happens to allow it.

+1. Relopts are one of the uglier warts we have.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Re: [HACKERS] Replication slots and footguns
On 03/12/2014 04:52 PM, Thom Brown wrote:

On 12 March 2014 23:17, Michael Paquier michael.paqu...@gmail.com wrote:

On Thu, Mar 13, 2014 at 5:45 AM, Thom Brown t...@linux.com wrote:

I'm not clear on why dropping an active replication slot would solve disk space problems related to WAL. I thought it was inactive slots that were the problem in this regard?

You could still have an active slot with a standby that is not able to catch up, AFAIK.

In that scenario, why would one wish to drop the replication slot? If it can't keep up, dropping the replication slot would likely mean you'd orphan the standby due to the primary no longer holding on to the necessary WAL, and the standby is then useless. In which case, if the standby is causing such problems, why not shut down that standby, thereby effectively decommissioning it, then delete the slot?

The problem I'm anticipating is that the replica server is actually offline, but the master doesn't know it yet. So here's the situation:

1. The replica with a slot dies.
2. WAL segments start piling up and the master is running low on disk space.
3. The replica is still marked active because we're waiting for the default TCP timeout (3+ hours), or for the proxy to kill the connection (forever).

But as Andres has shown, there are two ways to fix the above. So we're in good shape.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
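One way to defuse step 3 above, independent of the slot machinery itself: the master can be made to notice a vanished standby long before the kernel's multi-hour default TCP timeout by tightening the walsender timeout and TCP keepalive settings. A postgresql.conf sketch with purely illustrative values (these are real 9.3+ GUCs, but the right numbers depend on the environment):

```
# Walsenders terminate connections idle longer than this (default 60s),
# after which the slot is released and can be dropped
wal_sender_timeout = 30s

# Have the kernel probe idle connections aggressively so a dead peer
# is detected rather than waited out
tcp_keepalives_idle = 30      # seconds before the first keepalive probe
tcp_keepalives_interval = 10  # seconds between probes
tcp_keepalives_count = 3      # lost probes before declaring the peer dead
```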
Re: [HACKERS] db_user_namespace a temporary measure
Robert Haas robertmh...@gmail.com writes:

On Wed, Mar 12, 2014 at 9:19 AM, Andres Freund and...@2ndquadrant.com wrote:

Except that we don't have the infrastructure to perform such checks (neither partial indexes, nor expression indexes, nor exclusion constraints) on system tables atm. So it's not an entirely trivial thing to do.

I'm probably woefully underinformed here, but it seems like getting exclusion constraints working might be simpler than partial indexes or expression indexes, because both of those involve being able to evaluate arbitrary predicates, whereas exclusion constraints just involve invoking index access methods to look for conflicting rows via smarts built into your index AM. The latter seems to involve less risk of circularity (but I might be wrong).

You might be right. I don't think anyone's ever looked at what it would take to support that particular case. We have looked at the other cases and run away screaming ... but I think that was before exclusion constraints existed.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
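For readers unfamiliar with the feature under discussion: an exclusion constraint forbids any two rows from conflicting under a set of operators, enforced through the index access method rather than by evaluating arbitrary predicates. The standard user-table example (the thread is about the much harder problem of supporting this on system catalogs, which is not possible today):

```sql
-- btree_gist supplies GiST support for plain equality on scalar types
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- No two reservations for the same room may overlap in time:
-- conflict means equal room AND overlapping (&&) time range
CREATE TABLE reservation (
    room   int,
    during tsrange,
    EXCLUDE USING gist (room WITH =, during WITH &&)
);
```

A second insert for the same room with an overlapping `during` range is rejected by the GiST index itself, which is the "smarts built into your index AM" point above.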