Re: [HACKERS] Hashable custom types

2014-04-25 Thread Peter Geoghegan
On Fri, Apr 25, 2014 at 4:47 PM, Paul Ramsey  wrote:
> Is it possible to make custom types hashable? There's no hook in the
> CREATE TYPE call for a hash function, but can one be hooked up
> somewhere else? In an operator?

See 35.14.6., System Dependencies on Operator Classes


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hashable custom types

2014-04-25 Thread Paul Ramsey
When trying to write a recursive CTE using the PostGIS geometry type,
I was told this:

ERROR:  could not implement recursive UNION
DETAIL:  All column datatypes must be hashable.

Is it possible to make custom types hashable? There's no hook in the
CREATE TYPE call for a hash function, but can one be hooked up
somewhere else? In an operator?

Thanks,

P

-- 
Paul Ramsey
http://cleverelephant.ca
http://postgis.net


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Decrease MAX_BACKENDS to 2^16

2014-04-25 Thread Andres Freund
Hi,

Currently the maximum for max_connections (+ bgworkers + autovacuum) is
defined by
#define MAX_BACKENDS0x7f
which unfortunately means that some things like buffer reference counts
need a full integer to store references.

Since there's absolutely no sensible scenario for setting
max_connections that high, I'd like to change the limit to 2^16, so we
can use a uint16 in BufferDesc->refcount.

Does anyone disagree? This clearly is 9.5 material, but I wanted to
raise it early, since I plan to develop some stuff for 9.5 that'd depend
on lowering it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] slow startup due to LWLockAssign() spinlock

2014-04-25 Thread Andres Freund
On 2014-04-24 23:28:14 +0200, Andres Freund wrote:
> On 2014-04-24 12:43:13 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2014-04-24 11:02:44 -0400, Tom Lane wrote:
> > >> FWIW, I like the LWLockAssignBatch idea a lot better than the currently
> > >> proposed patch.  LWLockAssign is a low-level function that has no 
> > >> business
> > >> making risky assumptions about the context it's invoked in.
> > 
> > > I don't think LWLockAssignBatch() is that easy without introducing
> > > layering violations. It can't just return a pointer out of the main
> > > lwlock array that then can be ++ed clientside because MainLWLockArray's
> > > stride isn't sizeof(LWLock).
> > 
> > Meh.  I knew this business of using pointers instead of indexes would
> > have some downsides.
> > 
> > We could return the array stride ... kinda ugly, but since there's
> > probably only one consumer for this API, it's not *that* bad.  Could
> > wrap the stride-increment in a macro, perhaps.
> 
> I think I am just going to wait for 9.5 where I sure hope we can
> allocate the buffer lwlocks outside the main array...

For reference (and backup), here's my current patch for that.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
>From 24cd57b86a5ad4dc625daa61b62663bb796a5e38 Mon Sep 17 00:00:00 2001
From: Andres Freund 
Date: Tue, 28 Jan 2014 17:06:01 +0100
Subject: [PATCH] lwlock-inline

---
 src/backend/storage/buffer/buf_init.c | 26 +--
 src/backend/storage/buffer/bufmgr.c   | 48 +--
 src/backend/storage/lmgr/lwlock.c | 11 +---
 src/include/storage/buf_internals.h   | 10 +---
 src/include/storage/lwlock.h  | 10 
 5 files changed, 66 insertions(+), 39 deletions(-)

diff --git a/src/backend/storage/buffer/buf_init.c b/src/backend/storage/buffer/buf_init.c
index e187242..27689d9 100644
--- a/src/backend/storage/buffer/buf_init.c
+++ b/src/backend/storage/buffer/buf_init.c
@@ -21,6 +21,7 @@
 BufferDesc *BufferDescriptors;
 char	   *BufferBlocks;
 int32	   *PrivateRefCount;
+LWLock		*BufferIOLocks;
 
 
 /*
@@ -62,6 +63,8 @@ int32	   *PrivateRefCount;
  *		backend.
  */
 
+static int			content_tranche_id, progress_tranche_id;
+static LWLockTranche content_tranche, progress_tranche;
 
 /*
  * Initialize shared buffer pool
@@ -79,10 +82,26 @@ InitBufferPool(void)
 		ShmemInitStruct("Buffer Descriptors",
 		NBuffers * sizeof(BufferDesc), &foundDescs);
 
+	BufferIOLocks = (LWLock *)
+		ShmemInitStruct("Buffer IO Locks",
+		NBuffers * sizeof(LWLock), &foundBufs);
+
 	BufferBlocks = (char *)
 		ShmemInitStruct("Buffer Blocks",
 		NBuffers * (Size) BLCKSZ, &foundBufs);
 
+	content_tranche_id = 1;
+	content_tranche.name = "Buffer Content Locks";
+	content_tranche.array_base = &BufferDescriptors->content_lock;
+	content_tranche.array_stride = sizeof(BufferDesc);
+	LWLockRegisterTranche(content_tranche_id, &content_tranche);
+
+	progress_tranche_id = 2;
+	progress_tranche.name = "Buffer IO Locks";
+	progress_tranche.array_base = BufferIOLocks;
+	progress_tranche.array_stride = sizeof(LWLock);
+	LWLockRegisterTranche(progress_tranche_id, &progress_tranche);
+
 	if (foundDescs || foundBufs)
 	{
 		/* both should be present or neither */
@@ -117,8 +136,8 @@ InitBufferPool(void)
 			 */
 			buf->freeNext = i + 1;
 
-			buf->io_in_progress_lock = LWLockAssign();
-			buf->content_lock = LWLockAssign();
+			LWLockInitialize(&buf->content_lock, content_tranche_id);
+			LWLockInitialize(&BufferIOLocks[i], progress_tranche_id);
 		}
 
 		/* Correct last entry of linked list */
@@ -168,6 +187,9 @@ BufferShmemSize(void)
 	/* size of buffer descriptors */
 	size = add_size(size, mul_size(NBuffers, sizeof(BufferDesc)));
 
+	/* size of io progress locks */
+	size = add_size(size, mul_size(NBuffers, sizeof(LWLock)));
+
 	/* size of data pages */
 	size = add_size(size, mul_size(NBuffers, BLCKSZ));
 
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index 4e46ddb..28357c9 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -651,7 +651,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 			 * happens to be trying to split the page the first one got from
 			 * StrategyGetBuffer.)
 			 */
-			if (LWLockConditionalAcquire(buf->content_lock, LW_SHARED))
+			if (LWLockConditionalAcquire((LWLock *) &buf->content_lock, LW_SHARED))
 			{
 /*
  * If using a nondefault strategy, and writing the buffer
@@ -673,7 +673,7 @@ BufferAlloc(SMgrRelation smgr, char relpersistence, ForkNumber forkNum,
 		StrategyRejectBuffer(strategy, buf))
 	{
 		/* Drop lock/pin and loop around for another buffer */
-		LWLockRelease(buf->content_lock);
+		LWLockRelease((LWLock *) &buf->content_lock);
 		UnpinBuffer(buf, true);
 		continue;
 	}
@@ -686,7 +686,7 @@ Bu

Re: [HACKERS] Expression indexes ignore typmod of expression

2014-04-25 Thread Andres Freund
On 2014-04-25 17:19:00 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-04-25 17:05:26 -0400, Tom Lane wrote:
> >> I think this is just a hangover from
> >> before we paid much attention to expression typmods at all, and
> >> propose the attached patch.
> 
> > Any chance it could cause problems with stored trees being different
> > from newly generated ones due to it? I.e. is it something that can be
> > done without a catversion bump?
> 
> Not sure.  I wasn't proposing this as a back-patch, just 9.4 only.

Then a clear +1 for me. It's rather confusing to see bpchar, a type very
rarely used explicitly in explain output.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

2014-04-25 Thread Tom Lane
Tomas Vondra  writes:
> On 23.4.2014 16:07, Tom Lane wrote:
>> To be concrete: let's add a new boolean parameter with the semantics
>> of "final function takes extra dummy arguments" (default false).
>> There would need to be one for the separate moving-aggregate final
>> function too, of course.

> Do we really need a separate parameter for this? Couldn't this be
> decided simply using the signature of the final function? Either it has
> a single parameter (current behavior), or it has the same parameters as
> the state transition function (new behavior).

The problem is that the CREATE AGGREGATE syntax only specifies the name of
the final function, not its argument list, so you have to make an
assumption about the argument list in order to look up the final function
in the first place.

I did consider the idea of looking for both signatures and using whatever
we find, but that seems fairly dangerous: the same CREATE AGGREGATE
command could give different results depending on what versions of the
final function happen to exist.  This would create an ordering hazard that
pg_dump could not reliably cope with, for example.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Expression indexes ignore typmod of expression

2014-04-25 Thread Tom Lane
Andres Freund  writes:
> On 2014-04-25 17:05:26 -0400, Tom Lane wrote:
>> I think this is just a hangover from
>> before we paid much attention to expression typmods at all, and
>> propose the attached patch.

> Any chance it could cause problems with stored trees being different
> from newly generated ones due to it? I.e. is it something that can be
> done without a catversion bump?

Not sure.  I wasn't proposing this as a back-patch, just 9.4 only.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] aggregate returning anyarray and 'cannot determine result data type'

2014-04-25 Thread Tomas Vondra
On 23.4.2014 16:07, Tom Lane wrote:
>
> To be concrete: let's add a new boolean parameter with the semantics
> of "final function takes extra dummy arguments" (default false).
> There would need to be one for the separate moving-aggregate final
> function too, of course.
> 
> The best naming idea I've got right now is "finalfunc_extra" and 
> "mfinalfunc_extra", but maybe somebody can do better?

Do we really need a separate parameter for this? Couldn't this be
decided simply using the signature of the final function? Either it has
a single parameter (current behavior), or it has the same parameters as
the state transition function (new behavior).

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Expression indexes ignore typmod of expression

2014-04-25 Thread Andres Freund
Hi,

On 2014-04-25 17:05:26 -0400, Tom Lane wrote:
> A Salesforce colleague asked me why, for something like
> 
> the index column ends up as "bpchar" and not "char(15)".  The CASE
> expression does get resolved as char(15), but it turns out that
> index.c just ignores that.

I've seen that before but never looked what's the origin. +1 for fixing
it.

>  I think this is just a hangover from
> before we paid much attention to expression typmods at all, and
> propose the attached patch.
> 
> Comments?

Any chance it could cause problems with stored trees being different
from newly generated ones due to it? I.e. is it something that can be
done without a catversion bump?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Expression indexes ignore typmod of expression

2014-04-25 Thread Tom Lane
A Salesforce colleague asked me why, for something like

regression=# create table foo1 (f1 char(15), f2 char(15));
CREATE TABLE
regression=# create index on foo1((case when f1>'z' then f1 else f2 end));
CREATE INDEX
regression=# \d foo1_f2_idx
 Index "public.foo1_f2_idx"
 Column |  Type  |Definition 
++---
 f2 | bpchar | (+
|| CASE +
|| WHEN f1 > 'z'::bpchar THEN f1+
|| ELSE f2  +
|| END)
btree, for table "public.foo1"

the index column ends up as "bpchar" and not "char(15)".  The CASE
expression does get resolved as char(15), but it turns out that
index.c just ignores that.  I think this is just a hangover from
before we paid much attention to expression typmods at all, and
propose the attached patch.

Comments?

regards, tom lane

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 47f0647..c932c83 100644
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
*** ConstructTupleDescriptor(Relation heapRe
*** 389,395 
  			to->attalign = typeTup->typalign;
  			to->attstattarget = -1;
  			to->attcacheoff = -1;
! 			to->atttypmod = -1;
  			to->attislocal = true;
  			to->attcollation = collationObjectId[i];
  
--- 389,395 
  			to->attalign = typeTup->typalign;
  			to->attstattarget = -1;
  			to->attcacheoff = -1;
! 			to->atttypmod = exprTypmod(indexkey);
  			to->attislocal = true;
  			to->attcollation = collationObjectId[i];
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Planned downtime @ Rackspace

2014-04-25 Thread Stephen Frost
Greetings,

  As some may be aware, we are currently working with Rackspace to
  upgrade the PostgreSQL infrastructure systems which they graciously
  host for us.  As part of these upgrades there will be downtime for
  systems hosted there.

  Our first planned downtime will be for ~ 2 hours on Monday, April
  28th, between 1500-2000 UTC (11am-4pm US/Eastern).  We have requested
  that the outage be towards the end of that window (1800-2000 UTC), but
  there is no guarantee that they'll be able to support that request.
  Clearly, we hope the actual downtime will be less than 2 hours.

  This downtime will impact all systems hosted @ Rackspace as they are
  upgrading the switch for us.  End-user services which will be impacted
  include:

yum.postgresql.org
wiki.postgresql.org
git master server (Committers only)
planet.postgresql.org
www.pgadmin.org, ftp.pgadmin.org
media.postgresql.org
commitfest.postgresql.org
developer.postgresql.org (Developer personal homepages)
redmine.postgresql.org
jdbc.postgresql.org
postgresopen.org
developer.pgadmin.org (sandbox, Jenkins)
babel.postgresql.org (Translation services)

  Redundant services (minimal impact expected):
ns2.postgresql.org (other nameservers will still work)
.us inbound mail relay (other MXs will still work)
.us website front-end (should be removed from pool)
FTP mirror (should be removed from pool)

  We anticipate this being the only whole-environment outage during
  this migration.  Future outages will happen for individual systems
  as we migrate them to the new hardware. 

Thanks!

Stephen


signature.asc
Description: Digital signature


[HACKERS] Two minor bugs in GIN fast insertion WAL-logging

2014-04-25 Thread Heikki Linnakangas

(more fruit from my little page-image-comparison hack)

WAL replay of filling a GIN fast list page is a bit funky. I believe 
there are two bugs in it, but they're both harmless:


writeListPage function initializes the page, and adds a bunch of tuples 
to it. All the tuples are included in the WAL record, and marked as 
belonging to the buffer. So far so good. But since the page is 
re-initialized from scratch, its LSN is always 0, so XLogInsert will 
always create a full-page-image of it. That's harmless, but usually we 
don't do a full-page image when we have all the information to re-create 
the page from scratch anyway. It does cause some unnecessary bloating of 
the WAL, because the information to recreate the page from scratch takes 
less space than the full page image.


The second bug is in the redo routine of that, ginRedoInsertListPage. 
The loop that inserts all the tuples to the page is missing an "off++", 
so it inserts all the tuples to offset 1. So the tuples end up on the 
page in reverse order. Not just reverse physical order, like in the 
B-tree split code, but reverse itemno order. That happens to work 
because the order of the tuples doesn't matter to scans, and we also 
never remove individual tuples from fast list pages. You only see this 
happening if you run with full_page_images=off, because of the first bug.


Although these are harmless, the second bug in particular is a bit 
scary. I think we should fix and backpatch these.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: ECPG FETCH readahead

2014-04-25 Thread Boszormenyi Zoltan

2014-04-24 15:19 keltezéssel, Boszormenyi Zoltan írta:

2014-04-24 14:50 keltezéssel, Michael Meskes írta:

Thanks an awful lot Antonin.


Committer availability might well be the issue, but missing review
probably too.
Yes, you're right. If my taks is mostly one last glance and a commit I will make time 
for that.



Whether this review is enough to move the patch to "ready for committer"
- I tend to let the next CFM decide. (I don't find it productive to
ignite another round of discussion about kinds of reviews - already saw
some.)

I saw some remarks in your review that Zoltan wants to address. Once I got the
updated version I'll have a look at it.

Zoltan, could you send a new version by end of day tomorrow? I'll be sitting on
a plane for a longer time again on Saturday. :)


I will try to.


Unfortunately, I won't make the deadline because of life
(I had to attend a funeral today) and because Antonin has
opened a can of worms with this comment:


* How about a regression test for the new ECPGcursor_dml() function?


There are some aspects that may need a new discussion.

The SQL standard wants an "updatable cursor" for positioned DML
(i.e. UPDATE/DELETE with the WHERE CURRENT OF clause)
This means passing FOR UPDATE in the query.

FOR UPDATE + SCROLL cursor is an impossible combination,
ERROR is thrown when DECLARE is executed. This combination can
(and should?) be detected in the ECPG preprocessor and it would
prevent runtime errors. It's not implemented at the moment.

Fortunately, a previous discussion resulted in explicitly passing
NO SCROLL for cursors where SCROLL is not specified, it's in 25.patch

I intend to extend it a little for SQL standard compliance with
embedded SQL. FOR UPDATE should also implicitly mean NO SCROLL.
Both the FOR UPDATE + explicit SCROLL and the explicit SCROLL +
usage of positioned DML can be detected in the preprocessor and
they should throw an error. Then the regression test would really make
sense.

But these checks in ECPG would still leave a big hole, and it's the other
DECLARE variant with the query passed in a prepared statement with
"EXEC SQL PREPARE prep_stmt FROM :query_string;"

Plugging this hole would require adding a simplified syntax checker to
libecpg that only checks the SelectStmt or re-adding the backend code
to tell the application the cursor's scrollable (and perhaps the updatable)
property.

I must have forgotten but surely this was the reason for changing the
DECLARE command tag in the first place which was shot down already.
So, only the other choice remains, the syntax checker in ecpglib.

I think implementing it would make the caching code miss 9.4, since
it adds a whole new set of code but the Perl magic for the ECPG syntax
checker may be mostly reusable here.

Best regards,
Zoltán Böszörményi



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-25 Thread David Fetter
On Fri, Apr 25, 2014 at 10:58:29AM -0700, Josh Berkus wrote:
> On 04/24/2014 05:23 PM, Marti Raudsepp wrote:
> > On Thu, Apr 24, 2014 at 8:40 PM, Josh Berkus  wrote:
> >> A pseudo-random UUID is frankly pretty
> >> useless to me because (a) it's not really unique
> > 
> > This is FUD. A pseudorandom UUID contains 122 bits of randomness. As
> > long as you can trust the random number generator, the chances of a
> > value occurring twice can be estimated using the birthday paradox:
> > there's a 50% chance of having *one* collision in a set of 2^61 items.
> > Storing this amount of UUIDs alone requires 32 exabytes of storage.
> > Factor in the tuple and indexing overheads and you'd be needing close
> > to all the hard disk space ever manufactured in the world.
> 
> Well, I've already had collisions with UUID-OSSP, in production, with
> only around 20 billion values.  So clearly there aren't 122bits of true
> randomness in OSSP.  I can't speak for other implementations because I
> haven't tried them.
> 
> >> (b) it doesn't help me route data at all.
> > 
> > That's really out of scope for UUIDs. They're about generating
> > identifiers, not describing what the identifier means. UUIDs also
> > don't happen to cure cancer.
> 
> http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327
> 
> On the contrary, I would argue that an object identifier which is
> completely random is possibly the worst way to form an ID of all
> possible concepts; there's no relationship whatsoever between the ID,
> the application stack, and the application data; you don't even get the
> pseudo-time indexing you get with Serials.   The only reason to do it is
> because you're too lazy do implement a better way.
> 
> Or to put it another way: a value which is truly random is no identifier
> at all.

Not exactly.  It's at least potentially hiding information an attacker
could use, with all the caveats that carries.

> Compare this with a composite identifier which carries information about
> the node, table, and schema of origin for the tuple.  Not only does this
> help ensure uniqueness, but it also supports intelligent sharding and
> multi-master replication systems.  I don't speak hypothetically; we've
> done this in the past and will do it again in the future.

This is an excellent idea, but I don't think it's in scope for UUIDs.

> I would love to have some machinery inside PostgreSQL to make this
> easier (for example, a useful unique database ID), but I suspect that
> acutal implementation will always remain application-specific.
> 
> You may say "oh, that's not the job of the identifer", but if it's not,
> WTF is the identifer for, then?

Frequently, it's to provide some kind of opacity in the sense of not
have an obvious predecessor or successor.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-25 Thread Josh Berkus
On 04/24/2014 05:23 PM, Marti Raudsepp wrote:
> On Thu, Apr 24, 2014 at 8:40 PM, Josh Berkus  wrote:
>> A pseudo-random UUID is frankly pretty
>> useless to me because (a) it's not really unique
> 
> This is FUD. A pseudorandom UUID contains 122 bits of randomness. As
> long as you can trust the random number generator, the chances of a
> value occurring twice can be estimated using the birthday paradox:
> there's a 50% chance of having *one* collision in a set of 2^61 items.
> Storing this amount of UUIDs alone requires 32 exabytes of storage.
> Factor in the tuple and indexing overheads and you'd be needing close
> to all the hard disk space ever manufactured in the world.

Well, I've already had collisions with UUID-OSSP, in production, with
only around 20 billion values.  So clearly there aren't 122bits of true
randomness in OSSP.  I can't speak for other implementations because I
haven't tried them.

>> (b) it doesn't help me route data at all.
> 
> That's really out of scope for UUIDs. They're about generating
> identifiers, not describing what the identifier means. UUIDs also
> don't happen to cure cancer.

http://it.toolbox.com/blogs/database-soup/primary-keyvil-part-i-7327

On the contrary, I would argue that an object identifier which is
completely random is possibly the worst way to form an ID of all
possible concepts; there's no relationship whatsoever between the ID,
the application stack, and the application data; you don't even get the
pseudo-time indexing you get with Serials.   The only reason to do it is
because you're too lazy do implement a better way.

Or to put it another way: a value which is truly random is no identifier
at all.

Compare this with a composite identifier which carries information about
the node, table, and schema of origin for the tuple.  Not only does this
help ensure uniqueness, but it also supports intelligent sharding and
multi-master replication systems.  I don't speak hypothetically; we've
done this in the past and will do it again in the future.

I would love to have some machinery inside PostgreSQL to make this
easier (for example, a useful unique database ID), but I suspect that
acutal implementation will always remain application-specific.

You may say "oh, that's not the job of the identifer", but if it's not,
WTF is the identifer for, then?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clock sweep not caching enough B-Tree leaf pages?

2014-04-25 Thread Peter Geoghegan
I've now done a non-limited comparative benchmark of master against
the patch (once again, with usage_count starting at 6, and
BM_MAX_USAGE_COUNT at 30) with a Gaussian distribution. Once again,
the distribution threshold used was consistently 5.0, causing the
patched pgbench to report for each test:

transaction type: Custom query
scaling factor: 5000
standard deviation threshold: 5.0
access probability of top 20%, 10% and 5% records: 0.68269 0.38293 0.19741

Results are available from:

http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/3-sec-delay-gauss/

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-25 Thread Andres Freund
On 2014-04-25 18:25:44 +0200, Andres Freund wrote:
> On 2014-04-25 12:05:17 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > The case I am worried most about is queries like:
> > > SELECT a, b FROM f WHERE f > ROW(38, 'whatever') ORDER BY f;
> > > I've seen such generated by a some query generators for paging. But I
> > > guess that's something we're going to have to accept.
> > 
> > Meh ... is it likely that the columns involved in an ordering comparison
> > would be so wide as to be toasted out-of-line?  Such a query would only be
> > fast if the row value were indexed, which would pretty much preclude use
> > of wide columns.
> 
> In the cases I've seen it it was usually used in addition to a indexable
> condition, just for paging across different http requests.
> 
> As completely ridiculous example:

> before:
> postgres=# EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM pg_rewrite r WHERE r > 
> ('x'::name, '11854'::oid, NULL, NULL, NULL, NULL);
> QUERY PLAN
> 

Just for some clarity, that also happens with expressions like:
WHERE
ROW(ev_class, rulename, ev_action) >= ROW('pg_rewrite'::regclass, 
'_RETURN', NULL)
ORDER BY ROW(ev_class, rulename, ev_action);

which is what is generated by such query generators - where the leading
columns *are* indexed but not necessarily unique.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-25 Thread Andres Freund
On 2014-04-25 12:05:17 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > The case I am worried most about is queries like:
> > SELECT a, b FROM f WHERE f > ROW(38, 'whatever') ORDER BY f;
> > I've seen such generated by a some query generators for paging. But I
> > guess that's something we're going to have to accept.
> 
> Meh ... is it likely that the columns involved in an ordering comparison
> would be so wide as to be toasted out-of-line?  Such a query would only be
> fast if the row value were indexed, which would pretty much preclude use
> of wide columns.

In the cases I've seen it it was usually used in addition to a indexable
condition, just for paging across different http requests.

As completely ridiculous example:
before:
postgres=# EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM pg_rewrite r WHERE r > 
('x'::name, '11854'::oid, NULL, NULL, NULL, NULL);
QUERY PLAN  
  
--
 Seq Scan on pg_rewrite r  (cost=0.00..12.36 rows=36 width=720) (actual 
time=0.425..0.425 rows=0 loops=1)
   Filter: (r.* > ROW('x'::name, 11854::oid, NULL::unknown, NULL::unknown, 
NULL::unknown, NULL::unknown))
   Rows Removed by Filter: 109
   Buffers: shared hit=11
 Planning time: 0.141 ms
 Execution time: 0.485 ms

after:
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM pg_rewrite r WHERE r > ('x'::name, 
'11854'::oid, NULL, NULL, NULL, NULL);
 QUERY PLAN 


 Seq Scan on pg_rewrite r  (cost=0.00..12.36 rows=36 width=720) (actual 
time=14.257..14.257 rows=0 loops=1)
   Filter: (r.* > ROW('x'::name, 11854::oid, NULL::unknown, NULL::unknown, 
NULL::unknown, NULL::unknown))
   Rows Removed by Filter: 109
   Buffers: shared hit=152
 Planning time: 0.139 ms
 Execution time: 14.310 ms
(6 rows)


> I'm actually more worried about the function-returning-tuple case, as that
> might bite people who thought they'd use some cute functional notation or
> other and it wouldn't cost 'em anything.

Right, that's not actually all that infrequent :/.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-25 Thread Tom Lane
Andres Freund  writes:
> The case I am worried most about is queries like:
> SELECT a, b FROM f WHERE f > ROW(38, 'whatever') ORDER BY f;
> I've seen such generated by a some query generators for paging. But I
> guess that's something we're going to have to accept.

Meh ... is it likely that the columns involved in an ordering comparison
would be so wide as to be toasted out-of-line?  Such a query would only be
fast if the row value were indexed, which would pretty much preclude use
of wide columns.

I'm actually more worried about the function-returning-tuple case, as that
might bite people who thought they'd use some cute functional notation or
other and it wouldn't cost 'em anything.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-25 Thread Andres Freund
On 2014-04-25 11:22:09 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-04-24 19:40:30 -0400, Tom Lane wrote:
> >> * Because HeapTupleGetDatum might allocate a new tuple, the wrong thing
> >> might happen if the caller changes CurrentMemoryContext between
> >> heap_form_tuple and HeapTupleGetDatum.
> 
> > It's fscking ugly to allocate memory in a PG_RETURN_... But I don't
> > really have a better backward compatible idea :(
> 
> It's hardly without precedent; see PG_RETURN_INT64 or PG_RETURN_FLOAT8 on
> a 32-bit machine, for starters.  There's never been an assumption that
> these macros couldn't do that.

There's a fair bit of difference between allocating 8 bytes and
allocation of nearly unbounded size... But as I said, I don't really
have a better idea.

I agree that the risk from this patch seems more manageable than your
previous approach.

The case I am worried most about is queries like:
SELECT a, b FROM f WHERE f > ROW(38, 'whatever') ORDER BY f;
I've seen such generated by a some query generators for paging. But I
guess that's something we're going to have to accept.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-25 Thread Tom Lane
Andres Freund  writes:
> On 2014-04-24 19:40:30 -0400, Tom Lane wrote:
>> * Because HeapTupleGetDatum might allocate a new tuple, the wrong thing
>> might happen if the caller changes CurrentMemoryContext between
>> heap_form_tuple and HeapTupleGetDatum.

> It's fscking ugly to allocate memory in a PG_RETURN_... But I don't
> really have a better backward compatible idea :(

It's hardly without precedent; see PG_RETURN_INT64 or PG_RETURN_FLOAT8 on
a 32-bit machine, for starters.  There's never been an assumption that
these macros couldn't do that.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UUIDs in core WAS: 9.4 Proposal: Initdb creates a single table

2014-04-25 Thread Greg Stark
On Fri, Apr 25, 2014 at 1:43 AM, Marti Raudsepp  wrote:
> Obviously you can't use random(). That's why I talked about
> cryptographic PRNGs, crypto libraries do proper seeding and generate
> reliably random numbers all the time.


The difficulty lies not really in the PRNG implementation (which is
hard but well enough understood that it's not much of an issue these
days). The difficulty lies in obtaining enough entropy. There are ways
of obtaining enough entropy and they are available. But they're not
free.

Obtaining enough entropy requires access to hardware devices which
means a kernel system call. Kernel system calls are relatively slow
when you're talking about generating sequential IDs. They also deplete
the available entropy pool for other sources which may means they have
security consequences.

Which isn't to say they're a bad idea but like everything else in
engineering there are tradeoffs and no such thing as a free lunch.
You can avoid depleting the entropy pool by including data you expect
to be unique as a kind of fake entropy -- which quickly gets you back
to looking for things like MAC address to avoid duplicates across
systems.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-25 Thread Andres Freund
Hi,

On 2014-04-24 19:40:30 -0400, Tom Lane wrote:
> * Because HeapTupleGetDatum might allocate a new tuple, the wrong thing
> might happen if the caller changes CurrentMemoryContext between
> heap_form_tuple and HeapTupleGetDatum.

It's fscking ugly to allocate memory in a PG_RETURN_... But I don't
really have a better backward compatible idea :(

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Composite Datums containing toasted fields are a bad idea(?)

2014-04-25 Thread Tom Lane
Heikki Linnakangas  writes:
> On 04/25/2014 02:40 AM, Tom Lane wrote:
>> * The patch changes HeapTupleGetDatum from a simple inline macro into
>> a function call.  This means that third-party extensions will not get
>> protection against creation of toast-pointer-containing composite Datums
>> until they recompile.

> One consequence of that is that an extension compiled with headers from 
> new minor version won't work with binaries from an older minor version. 
> Packagers beware.

Yeah, that's a possible issue, though I think we've done such things
before.  In any case, alternative approaches to fixing this would likely
also involve extensions needing to call core functions that don't exist
today; though maybe the number of affected extensions would be smaller.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A question about code in DefineRelation()

2014-04-25 Thread Hadi Moshayedi
>
> On second thought I noticed that that makes CREATE FOREIGN TABLE include
> an OID column in newly-created foreign tables wrongly, when the
> default_with_oids parameter is set to on.  Please find attached a patch.
>
>
The fix makes sense to me, since in ALTER TABLE SET WITH OIDS we check that
the relation is a table and not a foreign table:

3160 case AT_AddOids: /* SET WITH OIDS */
3161 ATSimplePermissions(rel, ATT_TABLE);

So, I think we should be consistent between DefineRelation() and alter
table.

-- Hadi.


Re: [HACKERS] Json(b) extension

2014-04-25 Thread Dmitry Dolgov
>>> If your goal is to make this functionality available as soon as
possible to users, an external extension is the way to go.
No, my primary goal is to create the useful contrib extension and help to
the community =)
So, I also want to do it as a pgxn extension.


On 25 April 2014 00:11, Josh Berkus  wrote:

> On 04/24/2014 03:46 AM, Dmitry Dolgov wrote:
> > Hi all,
> >
> > As you know, PostgreSQL introduced Json(b) support at the 9.4 version
> [1],
> > and hstore v2.0 saved in separate repository [2]. But although PostgreSQL
> > has this support at the core level, there are many useful functions,
> which
> > wasn't ported to Json(b) from hstore v2.0 and json. Here [3], I've made a
> > review of the missing Json(b) functions, which can be implemented. This
> > list can be updated in the future, of course. I want to implement the
> > missing functions in the form of extension (e.g. contrib/jsonx).
>
> Thanks for making this list!
>
> >
> > What do you think about this?
>
> I think you should do it as a pgxn extension, for now.  For it to be in
> contrib/ or core, we'd need to argue extensively about the names of
> operators and functions, which will take a while.If your goal is to
> make this functionality available as soon as possible to users, an
> external extension is the way to go.
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>