[HACKERS] Question: CREATE EXTENSION and create schema permission?

2011-08-21 Thread Kohei KaiGai
CreateExtension() possibly creates a new schema when the supplied
extension was not relocatable and the target schema was given by
control file of the extension.
However, it allows users to create a new schema with his ownership,
even if current user does not have permission to create a new schema.

Oid extowner = GetUserId();
  :
else if (control-schema != NULL)
{
/*
 * The extension is not relocatable and the author gave us a schema
 * for it.  We create the schema here if it does not already exist.
 */
schemaName = control-schema;
schemaOid = get_namespace_oid(schemaName, true);

if (schemaOid == InvalidOid)
{
schemaOid = NamespaceCreate(schemaName, extowner);
/* Advance cmd counter to make the namespace visible */
CommandCounterIncrement();
}
}

It seems to me that we should inject permission checks here like as
CreateSchemaCommand() doing.

/*
 * To create a schema, must have schema-create privilege on the current
 * database and must be able to become the target role (this does not
 * imply that the target role itself must have create-schema privilege).
 * The latter provision guards against giveaway attacks.  Note that a
 * superuser will always have both of these privileges a fortiori.
 */
aclresult = pg_database_aclcheck(MyDatabaseId, saved_uid, ACL_CREATE);
if (aclresult != ACLCHECK_OK)
aclcheck_error(aclresult, ACL_KIND_DATABASE,
   get_database_name(MyDatabaseId));

I didn't follow the discussion about extension so much when it got merged.
Please tell me, if it was a topic already discussed before.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-08-21 Thread Heikki Linnakangas

On 21.08.2011 07:10, Gokulakannan Somasundaram wrote:

d) In addition, currently there is no WAL Logging, while the bit is

cleared,

which would not be the case in future and hence the exclusive lock held

on

the visibility map is going to be held for a longer time.



This is false and has been false since the visibility map was first

implemented.

I can't understand this. If you are not doing this, then it would cause
consistency issues. Are you saying, we have a crash safe visibility map, but
you don't follow log the change before changing the contents/ WAL
principle. If so, please explain in detail. If you are doing it in the
normal way, then you should be logging the changes before making the changes
to the buffer and during that timeframe, you should be holding the lock on
the buffer. Heikki specifically pointed out, that you have brought in the
WAL Logging of visibility map, within the critical section.


I think you two are talking slightly past each other. There is no extra 
WAL record written when a bit is cleared in the visibility map, there is 
just a flag in the WAL record of the heap insert/update/delete. That is 
what Robert was trying to explain, that part hasn't changed since 8.4. 
What *did* change, however, in master, when the visibility map was made 
crash-safe, is the duration the lock on the visibility map page is held. 
Before that, the visibility map page was locked only briefly *after* the 
changes to the heap page were already applied and WAL record written. 
Now, the VM page lock is acquired and released at the same time as the 
lock on the heap page. It's held while the heap page changes are made 
and WAL record is written. I believe that's what Gokulakannan was trying 
to point out, and is worried that you might get contention on the VM 
page lock now because it's held for a much longer duration.


Gokulakannan, if you could come up with a test case that demonstrates 
that contention (or the lack thereof), that would be good. Otherwise 
we're just speculating.


If it's an issue, perhaps we could release the VM page lock early. We're 
not updating the LSN on it, so we don't need to wait for the WAL record 
to be written, I think. It's a bit out of the ordinary, though, so I 
wouldn't like to do that without an actual test case that shows it's an 
issue.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-08-21 Thread Heikki Linnakangas

On 21.08.2011 07:41, Gokulakannan Somasundaram wrote:

On Sat, Aug 20, 2011 at 4:57 AM, Gokulakannan Somasundaram


gokul...@gmail.com  wrote:

by your argument, if WALInserLock is held for 't' seconds, you should
definitely be holding visibility map lock for more than time frame 't'.


Nope, that's not how it works.  Perhaps you should read the code.
See, e.g., heap_update().

--


OK. I took a look at the patch you have supplied in
http://postgresql.1045698.n5.nabble.com/crash-safe-visibility-map-take-five-td4377235.html


Here's the patch as it was committed: 
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=503c7305a1e379f95649eef1a694d0c1dbdc674a



There is a code like this.

  {
  all_visible_cleared = true;
  PageClearAllVisible(BufferGetPage(buffer));
+visibilitymap_clear(relation,
+ItemPointerGetBlockNumber((heaptup-t_self)),
+vmbuffer);
  }

Here, you are not making an entry into the WAL. then there is an assumption
that the two bits will be in sync without any WAL entry. There is a chance
that the visibility map might be affected by partial page writes, where
clearing of a particular bit might not have been changed. And i am thinking
a lot of such issues. Can you just explain the background logic behind
ignoring the principle of WAL logging? What are the implemented principles,
that protect the Visibility map pages??


The all_visible_cleared flag is included in the WAL record of the insert 
(or update or delete). Partial page writes are not a problem, because we 
always fetch the VM page and clear the bit, regardless of the LSN on the 
VM page.


PS. Robert, the LOCKING section in the header comment of visibilitymap.c 
is out-of-date: it claims that the VM bit is cleared after releasing the 
lock on the heap page.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-08-21 Thread Gokulakannan Somasundaram
 The all_visible_cleared flag is included in the WAL record of the insert
(or update or delete). Partial page writes are not a problem, because we
 always fetch the VM page and clear the bit, regardless of the LSN on the
VM page.


Two things
a) First, my understanding of checkpoint is that it flushes all the pages,
that got changed below a particular LSN. If we are not having a LSN in the
visibility map, how we will be sure, that a visibility map page is getting
flushed/not? With the following approach, i can see only one issue. If the
heap page gets written and checkpointed and the visibility map doesn't get
synced during the checkpoint, then there is an issue. Can you please explain
me, how we get the assurance?

b) Even if we have a contention issue, Visibility map is a solution for a
considerable number of database scenarios. But it should not become a
default package. A table, with no chance of index-only scans should not
incur the extra overhead of crash safe visibility maps. Those tables should
be sparred from this extra overhead, as they don't have index only scans.

Gokul.


Re: [HACKERS] the big picture for index-only scans

2011-08-21 Thread Gokulakannan Somasundaram

 a) First, my understanding of checkpoint is that it flushes all the pages,
 that got changed below a particular LSN. If we are not having a LSN in the
 visibility map, how we will be sure, that a visibility map page is getting
 flushed/not?


Please ignore this statement.  I confused between the checkpoints
implemented in Postgres and some other database.

Gokul.


Re: [HACKERS] Question: CREATE EXTENSION and create schema permission?

2011-08-21 Thread Dimitri Fontaine
Kohei KaiGai kai...@kaigai.gr.jp writes:
 However, it allows users to create a new schema with his ownership,
 even if current user does not have permission to create a new schema.
[...]
 It seems to me that we should inject permission checks here like as
 CreateSchemaCommand() doing.

It seems to me the code has been written this way before we relaxed the
superuser only check in CREATE EXTENSION.  I'm not enough into security
to convince myself there's harm to protect against here, but I would
agree there's a sound logic into refusing to create the schema if the
current role isn't granted that operation.

Please note, though, that you're effectively forbidding the role to
create the extension.  As it's not relocatable, the role will not be
able to install it into another schema.  Which could be exactly what you
wanted to achieve.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question: CREATE EXTENSION and create schema permission?

2011-08-21 Thread Kohei KaiGai
2011/8/21 Dimitri Fontaine dimi...@2ndquadrant.fr:
 Kohei KaiGai kai...@kaigai.gr.jp writes:
 However, it allows users to create a new schema with his ownership,
 even if current user does not have permission to create a new schema.
 [...]
 It seems to me that we should inject permission checks here like as
 CreateSchemaCommand() doing.

 It seems to me the code has been written this way before we relaxed the
 superuser only check in CREATE EXTENSION.  I'm not enough into security
 to convince myself there's harm to protect against here, but I would
 agree there's a sound logic into refusing to create the schema if the
 current role isn't granted that operation.

 Please note, though, that you're effectively forbidding the role to
 create the extension.  As it's not relocatable, the role will not be
 able to install it into another schema.  Which could be exactly what you
 wanted to achieve.

The current implementation set the current user as owner of the new schema.
The default permission check of schema allows owner to create several kinds
of underlying objects.

In the result, we may consider a scenario that a user without permissions to
create new objects possibly get a schema created by CREATE EXTENSION
that allows him to create new objects (such as table, function, ...).

I don't think it is a desirable behavior. :-(

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Question: CREATE EXTENSION and create schema permission?

2011-08-21 Thread Dimitri Fontaine
Kohei KaiGai kai...@kaigai.gr.jp writes:
 The current implementation set the current user as owner of the new schema.
 The default permission check of schema allows owner to create several kinds
 of underlying objects.

 In the result, we may consider a scenario that a user without permissions to
 create new objects possibly get a schema created by CREATE EXTENSION
 that allows him to create new objects (such as table, function, ...).

 I don't think it is a desirable behavior. :-(

Agreed,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PushActiveSnapshot(GetTransactionSnapshot())

2011-08-21 Thread Simon Riggs
In common cases of snapshot use we run GetSnapshotData() into a
statically allocated snapshot, then immediately copy the static struct
into a dynamically allocated copy.

The static allocation was designed to remove the overhead of dynamic
allocation, but then we do it anyway.

The snapmgr code does this explicitly, but the reason isn't
documented, it just says we must do this.

Any one say why?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the big picture for index-only scans

2011-08-21 Thread Robert Haas
On Sun, Aug 21, 2011 at 12:10 AM, Gokulakannan Somasundaram
gokul...@gmail.com wrote:
 Consider the TPC-C benchmark. Currently on a four core box, Oracle does
 29 transactions per minute. Say we are doing something around half of
 this. So a page should get quickly filled. If a vacuum runs and the
 transactions are not touched by the MakePayment transaction, then it will be
 marked as AllVisible. When the MakePayment transaction updates, it should
 clear that bit. If we test it out, with too little warehouses, we may not
 see the effect. Of Course the PGPROC infrastructure for generating
 transaction ids might be the biggest culprit, but if you ignore that this
 should be visible.

Actual testing does not appear to show a bottleneck in either of these areas.

 Maybe.  Of course, we're only talking about cases where the index-only
 scan optimization is in use (which is not all of them).

 But are you saying that, the optimization of looking at visibility map will
 happen only for Index-only scans and will not use the visibility map
 infrastructure for the normal index scans? That's definitely a good idea and
 improvement.

Well, yes.  And not only that, but the index-only scans patch has been
posted on this very mailing list, along with detailed submission
notes.  So you could go read, or try it, before jumping to the
conclusion that it doesn't work.

 d) In addition, currently there is no WAL Logging, while the bit is
 cleared,
 which would not be the case in future and hence the exclusive lock held
 on
 the visibility map is going to be held for a longer time.

 This is false and has been false since the visibility map was first
 implemented.

 I can't understand this. If you are not doing this, then it would cause
 consistency issues. Are you saying, we have a crash safe visibility map, but
 you don't follow log the change before changing the contents/ WAL
 principle. If so, please explain in detail. If you are doing it in the
 normal way, then you should be logging the changes before making the changes
 to the buffer and during that timeframe, you should be holding the lock on
 the buffer. Heikki specifically pointed out, that you have brought in the
 WAL Logging of visibility map, within the critical section.

 Heikki's comments, i am quoting:
  I believe Robert's changes to make the visibility map crash-safe covers
 that. Clearing the bit in the visibility map now happens within the same
 critical section as clearing the flag on the heap page and writing the WAL
 record.

 I would be able to respond to your other statements, once we are clear here.

There are extensive comments on this in visibilitymap.c and, in
heapam.c, heap_xlog_visible().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rethinking sinval callback hook API

2011-08-21 Thread Robert Haas
On Fri, Aug 19, 2011 at 2:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Currently, we have two types of callbacks that can be registered to get
 control when an invalidation message is received: syscache callbacks and
 relcache callbacks.  It strikes me that we might be better advised to
 unify these into a single type of callback that gets a
 SharedInvalidationMessage struct pointer (we could pass NULL to signify
 a cache reset event).  That would avoid having to add another
 registration list every time we decide that there's a reason for
 callbacks to see another type of inval message.  There are a couple of
 reasonably near-term reasons why we might want to do this:

 1. Robert was speculating the other day about wanting to be able to
 snoop the inval traffic.  Right now, callbacks can only snoop a fairly
 small subset of it.

Is that true?  It appears to me that the events that aren't exposed at
all are smgr and relmap invalidations, which don't seem terribly
important, and presumably not a majority of the traffic.

 2. In conjunction with what I'm doing with plancache, it struck me that
 it might be nice to subdivide relcache inval messages into two types,
 one indicating a schema (DDL) change and one that just indicates that
 statistics changed; this would allow us to avoid redoing parse analysis
 and rewrite for a cached query as a consequence of autovacuum stats
 updates.  With the current scheme, plancache.c would need to register
 two different kinds of callbacks to handle that, or we'd need some other
 API change for relcache callbacks so they could distinguish.

Would this be enough for us to find a noticeable performance improvement?

 A small disadvantage of this is that callbacks would have to know about
 struct SharedInvalidationMessage, which right now isn't tremendously
 widely known, but that doesn't seem like a fatal objection to me.
 In practice that struct definition has been pretty stable.

I'm not opposed to trying to create a better/more universal API, but I
find that a bit grotty.  We've already resorted to some ugly
bit-space-preserving hacks in that structure, and I'm not sure we
won't have more in the future.  In particular, exposing the
backend_lo/backend_hi thing seems like a recipe for distributed
confusion.  Would it be too expensive to expose an opaque struct with
accessor functions?  Or pass the extracted values as separate
arguments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Rethinking sinval callback hook API

2011-08-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Aug 19, 2011 at 2:13 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 1. Robert was speculating the other day about wanting to be able to
 snoop the inval traffic.  Right now, callbacks can only snoop a fairly
 small subset of it.

 Is that true?  It appears to me that the events that aren't exposed at
 all are smgr and relmap invalidations, which don't seem terribly
 important, and presumably not a majority of the traffic.

Well, important is in the eye of the beholder here --- if you did need
to see one of those, you're flat outta luck.  It's also the case that
you can only snoop one catcache per registration, so there aren't enough
slots available in the fixed-size list to watch all the catcache
traffic.

 2. In conjunction with what I'm doing with plancache, it struck me that
 it might be nice to subdivide relcache inval messages into two types,
 one indicating a schema (DDL) change and one that just indicates that
 statistics changed; this would allow us to avoid redoing parse analysis
 and rewrite for a cached query as a consequence of autovacuum stats
 updates.  With the current scheme, plancache.c would need to register
 two different kinds of callbacks to handle that, or we'd need some other
 API change for relcache callbacks so they could distinguish.

 Would this be enough for us to find a noticeable performance improvement?

Impossible to say without coding it up and trying it ...

 A small disadvantage of this is that callbacks would have to know about
 struct SharedInvalidationMessage, which right now isn't tremendously
 widely known, but that doesn't seem like a fatal objection to me.
 In practice that struct definition has been pretty stable.

 I'm not opposed to trying to create a better/more universal API, but I
 find that a bit grotty.  We've already resorted to some ugly
 bit-space-preserving hacks in that structure, and I'm not sure we
 won't have more in the future.  In particular, exposing the
 backend_lo/backend_hi thing seems like a recipe for distributed
 confusion.  Would it be too expensive to expose an opaque struct with
 accessor functions?  Or pass the extracted values as separate
 arguments?

Well, that's exactly the reasoning that led to the current API.  But
we've not done well at extending that API to handle newly-added sinval
message types, and it's also designed firmly around the notion that
individual callbacks only want to see a small part of the traffic.
If I wanted to write a callback that saw *all* the traffic, the current
API is just not supportive of that.

Exposing SharedInvalidationMessage could be going too far in the other
direction, but I'm thinking we really ought to do something.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers