Re: [HACKERS] psycopg and two phase commit

2010-11-04 Thread Pavel Stehule
2010/11/4 Daniele Varrazzo :
> On Sat, Sep 18, 2010 at 5:01 PM, Pavel Stehule  
> wrote:
>> Hello
>>
>> who is psycopg maintainer, please? Can somebody explains to me, why
>> psycopg still doesn't support two-phase commit, although some
>> implementation was done in summer 2008?
>
> Hello Pavel,
>
> Just wanted to warn you that I have implemented the 2pc protocol in psycopg.
>
> I have this and other patches ready to be merged in the trunk: more
> details are available in
> http://initd.org/psycopg/articles/2010/11/02/new-features-upcoming-psycopg-release/
>
> I will try to have the patches released soon: definitely before the
> PyCon. If you fancy some testing you are welcome.
>

Hello,

that is very good news. A lot of people are looking for it.

Regards

Pavel

I read the notice, but I didn't find a download link. Where is it, please?

Thank you

Pavel

> Best regards,
>
> -- Daniele
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-11-04 Thread Tom Lane
[ for the archives' sake ]

I wrote:
> I had a thought about how to make get_tabstat_entry() faster without
> adding overhead: what if we just plain remove the search, and always
> assume that a new entry has to be added to the tabstat array?

I spent some time looking into this idea.  It doesn't really work,
because there are places that will break if a transaction has more than
one tabstat entry for the same relation.  The one that seems most
difficult to fix is that pgstat_recv_tabstat() clamps its n_live_tuples
and n_dead_tuples values to be nonnegative after adding in each delta
received from a backend.  That is a good idea because it prevents insane
results if some messages get lost --- but if a transaction's updates get
randomly spread into several tabstat items, the intermediate counts
might get clamped, resulting in a wrong final answer even though nothing
was lost.
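The clamping hazard can be seen with a toy model (an illustrative sketch, not the actual pgstat code; the function name and numbers are invented):

```python
# Sketch (not PostgreSQL code): why clamping intermediate sums to zero
# breaks when one transaction's deltas are split across several tabstat items.
def apply_deltas(deltas, clamp_each_step):
    total = 0
    for d in deltas:
        total += d
        if clamp_each_step and total < 0:
            total = 0  # mimic the nonnegative clamp applied per message
    return total

# One transaction deletes 5 rows, then inserts 3: net live-tuple delta is -2.
# If the deltas arrive as two separate items against a count of 0:
split = [-5, +3]
print(apply_deltas(split, clamp_each_step=False))  # -2: correct net effect
print(apply_deltas(split, clamp_each_step=True))   # 3: the clamp loses the -5
```

With a single combined tabstat item per relation the clamp only fires when messages were actually lost, which is the behavior the current code relies on.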

I also added some instrumentation printouts and found that in our
regression tests:
* about 10% of get_tabstat_entry() calls find an existing entry
  for the relation OID.  This seems to happen only when a
  relcache entry gets flushed mid-transaction, but that does
  happen, and not so infrequently either.
* about half of the transactions use as many as 20 tabstats,
  and 10% use 50 or more; but it drops off fast after that.
  Almost no transactions use as many as 100 tabstats.
It's not clear that these numbers are representative of typical
database applications, but they're something to start with anyway.

I also did some testing to compare the cost of get_tabstat_entry's
linear search against a dynahash.c table with OID key.  As I suspected,
a hash table would make this code a *lot* slower for small numbers of
tabstat entries: about a factor of 10 slower.  You need upwards of 100
tabstats touched in a transaction before the hash table begins to pay
for itself.  This is largely because dynahash doesn't have any cheap way
to reset a hashtable to empty, so you have to initialize and destroy the
table for each transaction.  By the time you've eaten that overhead,
you've already expended as many cycles as the linear search takes to
handle several dozen entries.

I conclude that if we wanted to do something about this, the most
practical solution would be the one of executing linear searches until
we get to 100+ tabstat entries in a transaction, and then building a
hashtable for subsequent searches.  However, it's exceedingly unclear
that it will ever be worth the effort or code space to do that.
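The hybrid strategy described above might look roughly like this (a Python sketch for illustration only, not backend C; the class, threshold, and field names are all invented):

```python
# Sketch of the proposed hybrid lookup: scan the tabstat list linearly while
# it is small, then build a hash index once it grows past a threshold.
# The threshold of 100 follows the crossover estimate above.
THRESHOLD = 100

class TabStats:
    def __init__(self):
        self.entries = []      # (rel_oid, counters) in insertion order
        self.index = None      # built lazily once the list gets long

    def get_entry(self, rel_oid):
        if self.index is not None:
            if rel_oid not in self.index:
                self.index[rel_oid] = len(self.entries)
                self.entries.append((rel_oid, {}))
            return self.entries[self.index[rel_oid]][1]
        for oid, counters in self.entries:   # cheap for small transactions
            if oid == rel_oid:
                return counters
        self.entries.append((rel_oid, {}))
        if len(self.entries) > THRESHOLD:    # pay the hashing cost only now
            self.index = {oid: i for i, (oid, _) in enumerate(self.entries)}
        return self.entries[-1][1]
```

Short transactions never pay the table setup/teardown overhead; only the rare transaction touching 100+ relations builds the index.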

regards, tom lane



Re: [HACKERS] timestamp of the last replayed transaction

2010-11-04 Thread Fujii Masao
On Thu, Nov 4, 2010 at 10:27 AM, Fujii Masao  wrote:
> On Tue, Nov 2, 2010 at 10:38 PM, Dimitri Fontaine
>  wrote:
>> Fujii Masao  writes:
>>> After 9.0 release, I've often heard that some people want to know
>>> how far transactions have been replayed in the standby in timestamp
>>> rather than LSN. So I'm thinking to include the function which returns
>>> the timestamp of the last applied transaction (i.e., commit/abort WAL
>>> record) in the core.
>>>
>>> Name: pg_last_replay_xact_timestamp (better name?)
>>> Return Type: timestamp with time zone
>>>
>>> Thought?
>>
>> How do you want to implement the tracking?
>
> I'm thinking to just expose GetLatestXTime(), i.e., 
> XLogCtl->recoveryLastXTime.

I attached the patch.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


pg_last_replay_xact_timestamp_v1.patch
Description: Binary data



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of jue nov 04 16:42:53 -0300 2010:
> Alvaro Herrera  writes:
> > 2. I think the guts of AlterExtensionNamespace (the large switch block)
> > should be elsewhere, probably in alter.c
> 
> That's implemented in the alter_extension patch v2, and that's much
> better, thanks for your continued input. Please note that it depends on
> the new set_schema.6.patch.

Hmm, seeing the amount of new includes in extension.c, I wonder if it'd
be better to move AlterExtensionNamespace to alter.c.

> (The problem with smaller patches is indeed the dependencies)

You can't please them all, I guess ...

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] lazy snapshots?

2010-11-04 Thread Robert Haas
On Wed, Oct 20, 2010 at 8:24 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Oct 20, 2010 at 10:07 PM, Tom Lane  wrote:
>>> I'm less than convinced by the hypothesis that most transactions would
>>> avoid taking snapshots in this regime, anyway.  It would only hold up
>>> if there's little locality of reference in terms of which tuples are
>>> getting examined/modified by concurrent transactions, and that's a
>>> theory that seems much more likely to be wrong than right.
>
>> There will certainly be workloads where most transactions acquire a
>> snapshot, but just to take one obvious example, suppose we have a data
>> warehouse where every night we bulk load the day's data, and then we
>> run reporting queries all day.  Except during the overnight bulk
>> loads, there's no concurrent write activity at all, and thus no need
>> for snapshots.
>
> Well, yeah, but in this scenario there's also no contention involved in
> taking snapshots --- there are only readers of ProcArray and (IIRC) they
> only need shared locks on the array.  If you want to make any meaningful
> improvement in this area, you need something that solves the ProcArray
> access contention caused by a heavy mixed read/write transaction load.

There is certainly some improvement to be had even in eliminating
shared ProcArrayLocks, and the work that must be done while holding
them.  For example, you have to hold an exclusive lock to end a
transaction, so that's going to compete with any shared lockers.

But suppose we do have a heavy mixed read/write transaction load.  Who
is to say that requires a snapshot?  It will require a snapshot if one
transactions reads data that has been recently written by another
transaction, but I'm not convinced that'll necessarily happen that
often.  For example, suppose you have a table with a million records
and you have 200 concurrent database connections.  Each connection
repeatedly starts a transaction where it reads 10 records and then
writes 10 records.  When a new transaction starts, it overlaps 199
other transactions; if all of those transactions have done all of
their reads and writes already, there are 3,980 records in the table
for which we'll need a snapshot to determine tuple visibility;
assuming we look at no other tuples (all the accesses using index
scans rather than sequential scans), the chances that we'll need to
take a snapshot are only 1-((1-(3980/1000000))^20) = ~7.7%.  That's
pretty good, and of course if you have fewer overlapping transactions
or fewer operations per transaction it gets better very quickly.
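The arithmetic can be checked directly from the example's figures (199 overlapping transactions, each touching 20 of 1,000,000 records):

```python
# Recompute the snapshot-needed probability from the worked example:
# 199 overlapping transactions x 20 rows each = 3,980 "recent" rows,
# and a new transaction touches 20 of 1,000,000 rows at random.
table_rows = 1_000_000
recent_rows = 199 * 20            # 3,980 rows touched by overlapping xacts
touched = 20                      # rows read+written by the new transaction

p_miss_one = 1 - recent_rows / table_rows
p_need_snapshot = 1 - p_miss_one ** touched
print(f"{p_need_snapshot:.1%}")   # roughly 7.7%
```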

Now, of course, if you have a lot of locality of reference, things
aren't going to be nearly so good.  If we assume that the accesses are
spread across only 100,000 records instead of 1,000,000, then each
transaction has a better-than-even chance of needing a snapshot.
However, in that situation, you're going to have other contention
problems, too: there's very significant probability that two backends
will actually try to update the same tuple, and one will sleep until
the other commits.   So maybe the cost of taking snapshots won't be
the biggest problem in that case anyway (he said hopefully).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of jue nov 04 16:39:31 -0300 2010:
> Tom Lane  writes:
> > Not having read the patch, but ... the idea that was in the back of
> > my mind was to have a generic AlterObjectNamespace function that
> > would take parameters approximately like the following:
> 
> Please find attached what I came up with, that's the set_schema patch
> version 6.

Neat.

The has_privs_of_role() call has the wrong ACL_KIND argument in the
error report.

(Nitpick: don't use "e.g." at the end of the phrase.  It seems strange
to me.)

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] lazy snapshots?

2010-11-04 Thread Robert Haas
On Thu, Nov 4, 2010 at 2:00 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Wed, Oct 20, 2010 at 8:11 PM, Robert Haas  wrote:
 I'm imagining that the kernel of a
 snapshot is just a WAL position, ie the end of WAL as of the time you
 take the snapshot (easy to get in O(1) time).  Visibility tests then
 reduce to "did this transaction commit with a WAL record located before
 the specified position?".
>
>> I spent a bunch of time thinking about this, and I don't see any way
>> to get the memory usage requirements down to something reasonable.
>> The problem is that RecentGlobalXmin might be arbitrarily far back in
>> XID space, and you'll need to know the LSN of every commit from that
>> point forward; whereas the ProcArray requires only constant space.
>
> That's like arguing that clog is no good because it doesn't fit in
> constant space.  ISTM it would be entirely practical to remember the
> commit LSN positions of every transaction back to RecentGlobalXmin,
> using a data structure similar to pg_subtrans --- in fact, it'd require
> exactly twice as much working space as pg_subtrans, ie 64 bits per XID
> instead of 32.  Now, it might be that access contention problems would
> make this unworkable (I think pg_subtrans works largely because we don't
> have to access it often) but it's not something that can be dismissed
> on space grounds.

Maybe I didn't explain that very well.  The point is not so much how
much memory you're using in an absolute sense as how much of it you
have to look at to construct a snapshot.  If you store a giant array
indexed by XID whose value is an LSN, you have to read a potentially
unbounded number of entries from that array.  You can either make a
single read through the relevant portion of the array (snapshot xmin
to snapshot xmax) or you can check each XID as you see it and try to
build up a local cache, but either way there's no fixed limit on how
many bytes must be read from the shared data structure.  That compares
unfavorably with the current design, where you do a one-time read of a
bounded amount of data and you're done.  I suspect your theory about
pg_subtrans is correct.

> [ thinks for a bit... ]  But actually this probably ends up being a
> wash or a loss as far as contention goes.  We're talking about a data
> structure that has to be updated during each commit, and read pretty
> frequently, and it's not obvious how that's any better than getting
> commit info from the ProcArray.  Although neither commit nor reading
> would require a *global* lock, so maybe there's a way ...

Well, if we're talking about the "giant XID array" design, or some
variant of that, I would expect that nearly all of the contention
would be on the last page or two, so I don't think it would be much
better than a global lock.  You might be able to get around that by
not using an LWLock, and instead using something like lock xchg or
LL/SC to atomically update entries, but I'm not sure there's a
portable way to do such operations on anything larger than a 4-byte
word.  At any rate, I think the problem described in the preceding
paragraph is the more serious one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] lazy snapshots?

2010-11-04 Thread Tom Lane
Robert Haas  writes:
> On Wed, Oct 20, 2010 at 8:11 PM, Robert Haas  wrote:
>>> I'm imagining that the kernel of a
>>> snapshot is just a WAL position, ie the end of WAL as of the time you
>>> take the snapshot (easy to get in O(1) time).  Visibility tests then
>>> reduce to "did this transaction commit with a WAL record located before
>>> the specified position?".

> I spent a bunch of time thinking about this, and I don't see any way
> to get the memory usage requirements down to something reasonable.
> The problem is that RecentGlobalXmin might be arbitrarily far back in
> XID space, and you'll need to know the LSN of every commit from that
> point forward; whereas the ProcArray requires only constant space.

That's like arguing that clog is no good because it doesn't fit in
constant space.  ISTM it would be entirely practical to remember the
commit LSN positions of every transaction back to RecentGlobalXmin,
using a data structure similar to pg_subtrans --- in fact, it'd require
exactly twice as much working space as pg_subtrans, ie 64 bits per XID
instead of 32.  Now, it might be that access contention problems would
make this unworkable (I think pg_subtrans works largely because we don't
have to access it often) but it's not something that can be dismissed
on space grounds.

[ thinks for a bit... ]  But actually this probably ends up being a
wash or a loss as far as contention goes.  We're talking about a data
structure that has to be updated during each commit, and read pretty
frequently, and it's not obvious how that's any better than getting
commit info from the ProcArray.  Although neither commit nor reading
would require a *global* lock, so maybe there's a way ...

regards, tom lane



Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread Alex Hunsaker
On Thu, Nov 4, 2010 at 14:29, Alex Hunsaker  wrote:
> On Thu, Nov 4, 2010 at 13:43, Hannu Krosing  wrote:
>> So your plan was to have some savepoint before each execute ?
>>
>> How would one rollback the latest transaction ?
>
> It is always rolled back.  It's how plperl works today:
> create or replace function foo() returns int as $$
> eval {
>    spi_exec_query('create table uniq (num int primary key)');
>    spi_exec_query('insert into uniq (num) values (1), (1);', 1);
> };

To be clear, there is no reason to have both in an eval {}.  There is
no magic savepoint there.  if 'insert into' fails, the table will
still be created (assuming the transaction is not aborted later of
course).



Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread Alex Hunsaker
On Thu, Nov 4, 2010 at 13:43, Hannu Krosing  wrote:
> So your plan was to have some savepoint before each execute ?
>
> How would one rollback the latest transaction ?

It is always rolled back.  It's how plperl works today:
create or replace function foo() returns int as $$
eval {
spi_exec_query('create table uniq (num int primary key)');
spi_exec_query('insert into uniq (num) values (1), (1);', 1);
};

if($@) {
 # do something ... $@ == "duplicate key value violates unique
constraint "uniq_pkey" at line 2."
  warn $@;
}

# oh well do something else
# note the transaction is _not_ aborted
spi_exec_query('select 1;', 1);
return 1;
$$ language plperl;

=# begin;
=# select foo();
=# select 1;
=# commit;

It does not matter if you use eval or not; it's always in a subtransaction.

> I see. "exception when unique violation" in plpgsql  does automatic
> rollback to block start (matching BEGIN) so I assumed that your
> try/except sample is designed to do something similar

Basically, minus the rollback to start.  It's per query.



Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread David E. Wheeler
On Nov 4, 2010, at 4:20 AM, Peter Eisentraut wrote:

> On ons, 2010-11-03 at 14:15 -0700, David E. Wheeler wrote:
>> /me wants a global $dbh that mimics the DBI interface but just uses
>> SPI under the hood. Not volunteering, either…
> 
> Already exists: DBD::PgSPI.  Probably needs lots of updating, though.

Funny I never noticed that before. I couldn't get it to build, of course. And 
it looks a bit heavy, relying on DBD::Pg. Seems like it'd be easier to write 
something that just uses SPI and emulates the DBI interface, IMHO.

Best,

David





Re: [HACKERS] lazy snapshots?

2010-11-04 Thread Robert Haas
On Wed, Oct 20, 2010 at 8:11 PM, Robert Haas  wrote:
>> so it's
>> not exploiting that resource at all.  I'm imagining that the kernel of a
>> snapshot is just a WAL position, ie the end of WAL as of the time you
>> take the snapshot (easy to get in O(1) time).  Visibility tests then
>> reduce to "did this transaction commit with a WAL record located before
>> the specified position?".  You'd need some index datastructure that made
>> it reasonably cheap to find out the commit locations of recently
>> committed transactions, where "recent" means "back to recentGlobalXmin".
>> That seems possibly do-able, though I don't have a concrete design in
>> mind.
>
> Interesting.  O(1) snapshots would be great.  I need to think about
> this more before commenting on it, though.

I spent a bunch of time thinking about this, and I don't see any way
to get the memory usage requirements down to something reasonable.
The problem is that RecentGlobalXmin might be arbitrarily far back in
XID space, and you'll need to know the LSN of every commit from that
point forward; whereas the ProcArray requires only constant space.  To
put this another way, for any given snapshot, the "interesting" XIDs
are the top-level XIDs between the its xmin and xmax; the number of
those that are invisible to the snapshot is bounded, but the total
number is unbounded.  To make this work, you'd need an inexpensive way
of finding the set of transactions which committed after a given LSN
with XIDs less than the snapshot's xmax.  Another idea for making that
work is to keep a table of the (LSN, xmax) pairs for every
snapshot in the system and only track exactly those commits that are
relevant, but it seems hard to think about how to maintain that data
structure, and hard also to prune the table of tracked commits as
snapshots are released.
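The quoted WAL-position idea reduces a snapshot to a single LSN; a toy model of the resulting visibility test (a sketch with invented names, the dict standing in for the pg_subtrans-like structure under discussion):

```python
# Sketch: a snapshot is just an end-of-WAL position, and an XID is visible
# iff it committed at an LSN before the snapshot was taken.
commit_lsn = {}          # xid -> LSN of its commit record

def record_commit(xid, lsn):
    commit_lsn[xid] = lsn

def take_snapshot(current_wal_end):
    return current_wal_end              # O(1): just the WAL position

def xid_visible(xid, snapshot_lsn):
    lsn = commit_lsn.get(xid)
    return lsn is not None and lsn < snapshot_lsn

record_commit(100, lsn=500)
snap = take_snapshot(600)
record_commit(101, lsn=700)             # commits after the snapshot was taken
assert xid_visible(100, snap) and not xid_visible(101, snap)
```

The memory objection above is about `commit_lsn` itself: it must cover every XID back to RecentGlobalXmin, and constructing or consulting it is not bounded the way reading the ProcArray is.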

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread Hannu Krosing
On Thu, 2010-11-04 at 11:07 -0600, Alex Hunsaker wrote:
> On Thu, Nov 4, 2010 at 03:54, Hannu Krosing  wrote:
> >> > try:
> >> > plpy.execute("insert into foo values(1)")
> >> > except plpy.UniqueViolation, e:
> >> > plpy.notice("Ooops, you got yourself a SQLSTATE %d", e.sqlstate)
> >>
> >> Are you sure that having each try/except use a subtransaction is the
> >> right way to do it ?
> 
> I assumed the try was purely so you could 'catch' things.  And did not
> mean run it in a subtransaction (without the try block it still runs
> in one).

So your plan was to have some savepoint before each execute ?

How would one rollback the latest transaction ?

Or is it something else you mean by "subtransaction" ?

> Personally, I was looking more at:
> 
> >> > except plpy.UniqueViolation, e:
> >> > plpy.notice("Ooops, you got yourself a SQLSTATE %d", e.sqlstate)
> 
> Which to me says if SPI has an error we get a nice error object back,
> that also lets you do the normal exception catching dance (if thats
> what you are in to...) and translates IMO better to how plpgsql works
> ("exception when unique_violation").

I see. "exception when unique violation" in plpgsql  does automatic
rollback to block start (matching BEGIN) so I assumed that your
try/except sample is designed to do something similar 

> > Another objection
> >
> >> I'd like to make it more explicit and use
> >>
> >> with plpy.subtransaction():
> >> do your stuff
> 
> Sounds more like a savepoint?

Yeah. SAVEPOINT command is how you start a "subtransaction" in plain
SQL.
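The savepoint semantics being compared can be sketched outside the database (a toy Python model, not the actual PL/Python or plperl machinery; every name here is invented):

```python
from contextlib import contextmanager

# Toy "session" whose state is just a list of inserted rows.
class FakeSession:
    def __init__(self):
        self.rows = []

    def insert(self, row, fail=False):
        if fail:
            raise RuntimeError("unique violation")
        self.rows.append(row)

@contextmanager
def subtransaction(session):
    saved = list(session.rows)      # SAVEPOINT: remember current state
    try:
        yield
    except Exception:
        session.rows = saved        # ROLLBACK TO SAVEPOINT on error
        raise

db = FakeSession()
db.insert(1)
try:
    with subtransaction(db):
        db.insert(2)
        db.insert(3, fail=True)     # error undoes only the subtransaction
except RuntimeError:
    pass
db.insert(4)                        # outer transaction continues
print(db.rows)                      # [1, 4]: row 2 was rolled back
```

This is the behavior of the proposed explicit `with plpy.subtransaction():` block: the failure rolls back to the savepoint taken at block entry without aborting the outer transaction.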

-- 
---
Hannu Krosing
PostgreSQL Infinite Scalability and Performance Consultant
PG Admin Book: http://www.2ndQuadrant.com/books/





Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Dimitri Fontaine
Alvaro Herrera  writes:
> 2. I think the guts of AlterExtensionNamespace (the large switch block)
> should be elsewhere, probably in alter.c

That's implemented in the alter_extension patch v2, and that's much
better, thanks for your continued input. Please note that it depends on
the new set_schema.6.patch.

(The problem with smaller patches is indeed the dependencies)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

*** a/src/backend/catalog/pg_depend.c
--- b/src/backend/catalog/pg_depend.c
***
*** 20,25 
--- 20,27 
  #include "catalog/indexing.h"
  #include "catalog/pg_constraint.h"
  #include "catalog/pg_depend.h"
+ #include "catalog/pg_extension.h"
+ #include "catalog/pg_namespace.h"
  #include "miscadmin.h"
  #include "utils/fmgroids.h"
  #include "utils/lsyscache.h"
***
*** 643,645  get_index_constraint(Oid indexId)
--- 645,699 
  
  	return constraintId;
  }
+ 
+ /*
+  * get_extension_namespace
+  *		Given the OID of an extension, return the OID of the schema it
+  *		depends on, or InvalidOid when not found
+  */
+ Oid
+ get_extension_namespace(Oid extensionId)
+ {
+ 	Oid			nspId = InvalidOid;
+ 	Relation	depRel;
+ 	ScanKeyData key[3];
+ 	SysScanDesc scan;
+ 	HeapTuple	tup;
+ 
+ 	/* Search the dependency table for the index */
+ 	depRel = heap_open(DependRelationId, AccessShareLock);
+ 
+ 	ScanKeyInit(&key[0],
+ Anum_pg_depend_classid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(ExtensionRelationId));
+ 	ScanKeyInit(&key[1],
+ Anum_pg_depend_objid,
+ BTEqualStrategyNumber, F_OIDEQ,
+ ObjectIdGetDatum(extensionId));
+ 	ScanKeyInit(&key[2],
+ Anum_pg_depend_objsubid,
+ BTEqualStrategyNumber, F_INT4EQ,
+ Int32GetDatum(0));
+ 
+ 	scan = systable_beginscan(depRel, DependDependerIndexId, true,
+ 			  SnapshotNow, 3, key);
+ 
+ 	while (HeapTupleIsValid(tup = systable_getnext(scan)))
+ 	{
+ 		Form_pg_depend deprec = (Form_pg_depend) GETSTRUCT(tup);
+ 
+ 		if (deprec->refclassid == NamespaceRelationId &&
+ 			deprec->refobjsubid == 0 &&
+ 			deprec->deptype == DEPENDENCY_NORMAL)
+ 		{
+ 			nspId = deprec->refobjid;
+ 			break;
+ 		}
+ 	}
+ 
+ 	systable_endscan(scan);
+ 	heap_close(depRel, AccessShareLock);
+ 
+ 	return nspId;
+ }
*** a/src/backend/catalog/pg_namespace.c
--- b/src/backend/catalog/pg_namespace.c
***
*** 76,80  NamespaceCreate(const char *nspName, Oid ownerId)
--- 76,92 
  	/* Record dependency on owner */
  	recordDependencyOnOwner(NamespaceRelationId, nspoid, ownerId);
  
+ 	/* Record dependency on extension, if we're in a CREATE EXTENSION */
+ 	if (create_extension)
+ 	{
+ 		ObjectAddress myself;
+ 
+ 		myself.classId = NamespaceRelationId;
+ 		myself.objectId = nspoid;
+ 		myself.objectSubId = 0;
+ 
+ 		recordDependencyOn(&myself, &extension, DEPENDENCY_INTERNAL);
+ 	}
+ 
  	return nspoid;
  }
*** a/src/backend/commands/alter.c
--- b/src/backend/commands/alter.c
***
*** 22,27 
--- 22,28 
  #include "commands/conversioncmds.h"
  #include "commands/dbcommands.h"
  #include "commands/defrem.h"
+ #include "commands/extension.h"
  #include "commands/proclang.h"
  #include "commands/schemacmds.h"
  #include "commands/tablecmds.h"
***
*** 189,194  ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt)
--- 190,199 
  			AlterConversionNamespace(stmt->object, stmt->newschema);
  			break;
  
+ 		case OBJECT_EXTENSION:
+ 			AlterExtensionNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
  		case OBJECT_FUNCTION:
  			AlterFunctionNamespace(stmt->object, stmt->objarg, false,
     stmt->newschema);
***
*** 334,339  AlterObjectNamespace(Relation rel, int cacheId,
--- 339,436 
  		NamespaceRelationId, oldNspOid, nspOid);
  }
  
+ /*
+  * Do the SET SCHEMA depending on the object class.
+  *
+  * We only consider objects that have a namespace and that can exist
+  * without depending on another object (like a table) which will
+  * have its dependencies follow the SET SCHEMA operation.
+  */
+ void
+ AlterObjectNamespace_internal(ObjectAddress *thisobj, Oid nspOid)
+ {
+ 	switch (getObjectClass(thisobj))
+ 	{
+ 		case OCLASS_CLASS:
+ 		{
+ 			Relation classRel;
+ 			Relation rel = relation_open(thisobj->objectId, RowExclusiveLock);
+ 
+ 			switch (rel->rd_rel->relkind)
+ 			{
+ case RELKIND_COMPOSITE_TYPE:
+ 	/*
+ 	 * just skip the pg_class entry, we have a pg_type
+ 	 * entry too
+ 	 */
+ 	break;
+ 
+ default:
+ 	classRel = heap_open(RelationRelationId, RowExclusiveLock);
+ 	AlterRelationNamespaceInternal(classRel,
+    RelationGetRelid(rel),
+    RelationGetNamespace(rel),
+    nspOid,
+    true);
+ 	heap_close(classRel, RowExclusiveLock);
+ 	break;
+ 			}
+ 			relation_close(rel, RowExclusiveLock);
+ 			break;
+ 		}
+ 
+ 		case OCLASS_PROC:
+ 			AlterFunctionNamespace_oid(

Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Dimitri Fontaine
Tom Lane  writes:
> Not having read the patch, but ... the idea that was in the back of
> my mind was to have a generic AlterObjectNamespace function that
> would take parameters approximately like the following:

Please find attached what I came up with, that's the set_schema patch
version 6.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

*** a/src/backend/catalog/dependency.c
--- b/src/backend/catalog/dependency.c
***
*** 2706,2711  getObjectDescription(const ObjectAddress *object)
--- 2706,2726 
  }
  
  /*
+  * getObjectDescriptionOids: as above, except the object is specified by Oids
+  */
+ char *
+ getObjectDescriptionOids(Oid classid, Oid objid)
+ {
+ 	ObjectAddress	address;
+ 
+ 	address.classId = classid;
+ 	address.objectId = objid;
+ 	address.objectSubId = 0;
+ 
+ 	return getObjectDescription(&address);
+ }
+ 
+ /*
   * subroutine for getObjectDescription: describe a relation
   */
  static void
*** a/src/backend/catalog/namespace.c
--- b/src/backend/catalog/namespace.c
***
*** 2340,2345  LookupCreationNamespace(const char *nspname)
--- 2340,2372 
  }
  
  /*
+  * Check new namespace validity in ALTER OBJECT ... SET SCHEMA ... and
+  * ereport(ERROR, ...) in case of any problem.
+  */
+ void
+ CheckSetNamespace(Oid oldNspOid, Oid nspOid, Oid classid, Oid objid)
+ {
+ 	if (oldNspOid == nspOid)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+  errmsg("%s already exists in schema \"%s\"",
+ 		getObjectDescriptionOids(classid, objid),
+ 		get_namespace_name(nspOid))));
+ 
+ 	/* disallow renaming into or out of temp schemas */
+ 	if (isAnyTempNamespace(nspOid) || isAnyTempNamespace(oldNspOid))
+ 		ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+  errmsg("cannot move objects into or out of temporary schemas")));
+ 
+ 	/* same for TOAST schema */
+ 	if (nspOid == PG_TOAST_NAMESPACE || oldNspOid == PG_TOAST_NAMESPACE)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+  errmsg("cannot move objects into or out of TOAST schema")));
+ }
+ 
+ /*
   * QualifiedNameGetCreationNamespace
   *		Given a possibly-qualified name for an object (in List-of-Values
   *		format), determine what namespace the object should be created in.
*** a/src/backend/commands/alter.c
--- b/src/backend/commands/alter.c
***
*** 14,21 
--- 14,23 
   */
  #include "postgres.h"
  
+ #include "catalog/indexing.h"
  #include "catalog/namespace.h"
  #include "catalog/pg_largeobject.h"
+ #include "catalog/pg_namespace.h"
  #include "commands/alter.h"
  #include "commands/conversioncmds.h"
  #include "commands/dbcommands.h"
***
*** 33,38 
--- 35,41 
  #include "utils/acl.h"
  #include "utils/builtins.h"
  #include "utils/lsyscache.h"
+ #include "utils/syscache.h"
  
  
  /*
***
*** 182,192  ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt)
--- 185,211 
     stmt->newschema);
  			break;
  
+ 		case OBJECT_CONVERSION:
+ 			AlterConversionNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
  		case OBJECT_FUNCTION:
  			AlterFunctionNamespace(stmt->object, stmt->objarg, false,
     stmt->newschema);
  			break;
  
+ 		case OBJECT_OPERATOR:
+ 			AlterOperatorNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_OPCLASS:
+ 			AlterOpClassNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_OPFAMILY:
+ 			AlterOpFamilyNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
  		case OBJECT_SEQUENCE:
  		case OBJECT_TABLE:
  		case OBJECT_VIEW:
***
*** 195,200  ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt)
--- 214,235 
  stmt->objectType, AccessExclusiveLock);
  			break;
  
+ 		case OBJECT_TSPARSER:
+ 			AlterTSParserNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_TSDICTIONARY:
+ 			AlterTSDictionaryNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_TSTEMPLATE:
+ 			AlterTSTemplateNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_TSCONFIGURATION:
+ 			AlterTSConfigurationNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
  		case OBJECT_TYPE:
  		case OBJECT_DOMAIN:
  			AlterTypeNamespace(stmt->object, stmt->newschema);
***
*** 207,212  ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt)
--- 242,341 
  }
  
  /*
+  * Generic function to change the namespace of a given object, for simple
+  * cases (won't work for tables or functions, e.g.)
+  *
+  * The AlterFooNamespace() calls just above will call a function whose job
+  * is to lookup the arguments for the generic function here.
+  *
+  * Relation must already be open; it's the responsibility of the caller to
+  * close it.
+  */
+ void
+ AlterObjectNamespace(Relation rel, int cacheId,
+ 	 Oid classId, Oid objid, Oid nspOid,
+ 	

Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread Alex Hunsaker
On Thu, Nov 4, 2010 at 03:54, Hannu Krosing  wrote:
>> > try:
>> >     plpy.execute("insert into foo values(1)")
>> > except plpy.UniqueViolation, e:
>> >     plpy.notice("Ooops, you got yourself a SQLSTATE %d", e.sqlstate)
>>
>> Are you sure that having each try/except use a subtransaction is the
>> right way to do it?

I assumed the try was purely so you could 'catch' things.  And did not
mean run it in a subtransaction (without the try block it still runs
in one).

Personally, I was looking more at:

>> > except plpy.UniqueViolation, e:
>> > plpy.notice("Ooops, you got yourself a SQLSTATE %d", e.sqlstate)

Which to me says if SPI has an error we get a nice error object back,
that also lets you do the normal exception-catching dance (if that's
what you're into...) and translates IMO better to how plpgsql works
("exception when unique_violation").
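The interface Alex is describing can be sketched in a few lines of Python (class and attribute names here are hypothetical, not the eventual plpy API): SPI errors come back as exception subclasses keyed by SQLSTATE, so callers can catch specific conditions much like plpgsql's `EXCEPTION WHEN unique_violation`:

```python
# Hypothetical sketch of SQLSTATE-keyed exception classes; none of
# these names are the real plpy API.

class SPIError(Exception):
    sqlstate = None

class UniqueViolation(SPIError):
    sqlstate = "23505"

class DivisionByZero(SPIError):
    sqlstate = "22012"

# Map SQLSTATE codes to their dedicated exception classes.
_BY_SQLSTATE = {cls.sqlstate: cls for cls in (UniqueViolation, DivisionByZero)}

def raise_for_sqlstate(sqlstate, message):
    # Unknown SQLSTATEs fall back to the generic SPIError.
    raise _BY_SQLSTATE.get(sqlstate, SPIError)(message)

try:
    raise_for_sqlstate("23505", "duplicate key value")
except UniqueViolation as e:
    print(e.sqlstate)  # 23505
```

A caller that only cares about one condition catches the specific subclass and lets everything else propagate, which is the plpgsql-like behavior being argued for.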

> Another objection
>
>> I'd like to make it more explicit and use
>>
>> with plpy.subtransaction():
>>     do your stuff

Sounds more like a savepoint?

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psycopg and two phase commit

2010-11-04 Thread Daniele Varrazzo
On Sat, Sep 18, 2010 at 5:01 PM, Pavel Stehule  wrote:
> Hello
>
> who is the psycopg maintainer, please? Can somebody explain to me why
> psycopg still doesn't support two-phase commit, although some
> implementation was done in summer 2008?

Hello Pavel,

Just wanted to let you know that I have implemented the 2PC protocol in psycopg.

I have this and other patches ready to be merged in the trunk: more
details are available in
http://initd.org/psycopg/articles/2010/11/02/new-features-upcoming-psycopg-release/

I will try to have the patches released soon: definitely before the
PyCon. If you fancy some testing you are welcome.

Best regards,

-- Daniele



Re: [HACKERS] Alter column to type serial

2010-11-04 Thread Alvaro Herrera
Excerpts from Thom Brown's message of jue nov 04 09:05:01 -0300 2010:

> This would be instead of having to do:
> 
> CREATE SEQUENCE id_stuff_seq;
> 
> SELECT setval('id_stuff_seq', (SELECT max(id) FROM stuff))
> 
> ALTER TABLE stuff ALTER COLUMN id SET DEFAULT
> nextval('id_stuff_seq'::regclass);
> 
> Which would also mean the sequence would not get dropped with the table.

You can fix that with an ALTER SEQUENCE OWNED BY.

> Abhorrent idea, or acceptable?

I think the problem is in locking the table against further insertions
while you do the setval.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of jue nov 04 11:52:53 -0300 2010:
> Alvaro Herrera  writes:

> > 3. Not this patch, but I think using "extension" as a global variable
> > name is a bad idea.
> 
> What about create_extension_extension instead? I can't think of
> anything better; bikeshedding is open.

CreateExtensionAddress?  (I like CamelCase for this because it makes
these variables stand out more against local ones, named in
stuffed_lower_case).

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of jue nov 04 11:37:37 -0300 2010:
> Alvaro Herrera  writes:
> >> /* check for duplicate name (more friendly than unique-index failure) 
> >> */
> >> if (SearchSysCacheExists2(TYPENAMENSP,
> >>   CStringGetDatum(name),
> >>   ObjectIdGetDatum(nspOid)))
> >
> > Hmm, this check is wrong anyway, because you're looking in the pg_type
> > syscache for objects from an arbitrary catalog.  That needs to be fixed
> > somehow, but perhaps it needs to be handled by the callers, not in this
> > routine.  Otherwise you're going to need to pass the syscache ID, as
> > well as Datums identifying the object, and the number of Datums.
> 
> How embarrassing. I wonder why this works, too:
> 
> dim=# alter operator utils.@>(utils.ltree, utils.ltree) set schema public;
> ALTER OPERATOR

Well, I guess the operator doesn't exist in the pg_type syscache, so it
doesn't ereport(ERROR).

> Well, I'll go fix as you say, putting the check back into the
> callers. That won't help a bit with the code duplication feeling we have
> when reading the patch, though. Any idea on this front?

I'm not really sure about the code duplication bits.  There are plenty
of things you cannot factor into common routines because of the need to
handle different GETSTRUCT args, different number of syscache arguments,
etc.  The ALTER OWNER implementation is already "duplicated" for each
database object.  Your patch is already reducing duplication by moving
some common checks into namespace.c.

If there are more things that you could do by specifying a syscache ID
and datums to be passed to it, perhaps it would make sense to use them
as parameters to a function that does the whole bunch together.  This
probably doesn't belong into namespace.c though.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Robert Haas
On Thu, Nov 4, 2010 at 8:18 AM, Dimitri Fontaine  wrote:
> Robert Haas  writes:
>> On Thu, Nov 4, 2010 at 7:52 AM, Dimitri Fontaine  
>> wrote:
>>> What about create_extension_extension instead? I can't think of
>>> anything better; bikeshedding is open.
>>
>> That doesn't seem very clear... I'm always suspicious of names that
>> use the same word twice, and in this case I have no idea what this
>> variable would supposedly refer to.
>
> The ObjectAddress of the extension currently being installed by the
> CREATE EXTENSION command we're "in" (executing the script). The variable
> create_extension is already a boolean that is only set to true in that
> code path.

How about calling it CurrentExtensionObjectAddress or something like
that?  And maybe you don't need a boolean: if
(OidIsValid(CurrentExtensionObjectAddress.objid)) ...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Dimitri Fontaine
Robert Haas  writes:
> On Thu, Nov 4, 2010 at 7:52 AM, Dimitri Fontaine  
> wrote:
>> What about create_extension_extension instead? I can't think of
>> anything better; bikeshedding is open.
>
> That doesn't seem very clear... I'm always suspicious of names that
> use the same word twice, and in this case I have no idea what this
> variable would supposedly refer to.

The ObjectAddress of the extension currently being installed by the
CREATE EXTENSION command we're "in" (executing the script). The variable
create_extension is already a boolean that is only set to true in that
code path.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Dimitri Fontaine
Tom Lane  writes:
> Not having read the patch, but ... the idea that was in the back of
> my mind was to have a generic AlterObjectNamespace function that
> would take parameters approximately like the following:
>
>   OID of catalog containing object
>   Column number of catalog's namespace column (Anum_xxx constant)
>   OID of intended new namespace

Ah, the trick is to use the Anum_xxx, of course. I couldn't stop
thinking about how to access the column dynamically by name... I'll
have a try at that, thanks for the idea.

> You could do a generic heap_open() on the catalog using the OID,
> and then use heap_modify_tuple to apply the namespace column update.

Thanks for pointing me to the right APIs: finding them is where the time
is mostly spent as far as I'm concerned.

> It might be nice to include the "object already exists" check here
> too, which could probably be done if in addition the column number
> of the name column were passed in.  Permission checks too, if the
> owner column number were passed in.  Etc.

Well, it seems that depending on the object, sometimes only superusers
are allowed to edit things, and sometimes the owner too. I will add a
boolean superuser_only to the prototype.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Robert Haas
On Thu, Nov 4, 2010 at 7:52 AM, Dimitri Fontaine  wrote:
> What about create_extension_extension instead? I can't think of
> anything better; bikeshedding is open.

That doesn't seem very clear... I'm always suspicious of names that
use the same word twice, and in this case I have no idea what this
variable would supposedly refer to.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Dimitri Fontaine
Alvaro Herrera  writes:
> 1. wouldn't it make more sense to save the extension namespace in the
> extension catalog?

I don't think so, because the extension itself is not schema
qualified. What lives in the namespace the extension depends on is not
the extension itself, but its objects.

> 2. I think the guts of AlterExtensionNamespace (the large switch block)
> should be elsewhere, probably in alter.c

Makes sense, will move there.

> 3. Not this patch, but I think using "extension" as a global variable
> name is a bad idea.

What about create_extension_extension instead? I can't think of
anything better; bikeshedding is open.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Tom Lane
Dimitri Fontaine  writes:
> Well, I'll go fix as you say, putting the check back into the
> callers. That won't help a bit with the code duplication feeling we have
> when reading the patch, though. Any idea on this front?

Not having read the patch, but ... the idea that was in the back of
my mind was to have a generic AlterObjectNamespace function that
would take parameters approximately like the following:

OID of catalog containing object
Column number of catalog's namespace column (Anum_xxx constant)
OID of intended new namespace

You could do a generic heap_open() on the catalog using the OID,
and then use heap_modify_tuple to apply the namespace column update.

It might be nice to include the "object already exists" check here
too, which could probably be done if in addition the column number
of the name column were passed in.  Permission checks too, if the
owner column number were passed in.  Etc.
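The shape of the generic function Tom sketches can be illustrated with a small Python analogy (this is not PostgreSQL internals: catalogs are modeled as dicts, and all names below are invented for the illustration). One routine, parameterized by which "catalog" to open and which fields hold the name and namespace, performs both the collision check and the update:

```python
# Toy "catalog": oid -> row, standing in for a system catalog that
# would really be accessed via heap_open()/heap_modify_tuple().
pg_conversion = {
    1: {"name": "myconv", "namespace": "public"},
}

def alter_object_namespace(catalog, oid, name_field, nsp_field, new_nsp):
    """Move object `oid` to namespace `new_nsp`, generically."""
    row = catalog[oid]
    # "object already exists" check: same name in the target namespace.
    for other_oid, other in catalog.items():
        if (other_oid != oid
                and other[name_field] == row[name_field]
                and other[nsp_field] == new_nsp):
            raise ValueError("%s already exists in schema %s"
                             % (row[name_field], new_nsp))
    # The heap_modify_tuple step: update only the namespace column.
    row[nsp_field] = new_nsp

alter_object_namespace(pg_conversion, 1, "name", "namespace", "utils")
print(pg_conversion[1]["namespace"])  # utils
```

The point of the parameterization is exactly Tom's: one code path serves every object type whose move is "update one namespace column plus a simple name/namespace collision check", while tables, functions, and operators keep their special-cased paths.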

Obviously this doesn't work for tables, but they're sufficiently
complex beasts that it's not unusual for them to need a different
code path.  Doesn't help for functions/operators either, since their
collision check isn't just name and namespace.  But that's OK IMO.
I'd be happy if we could unify the code paths for objects that have a
single catalog entry to update and a simple name/namespace collision
check to make.

regards, tom lane



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of jue nov 04 11:06:48 -0300 2010:

> Also attached, please find the complete version of ALTER EXTENSION ext
> SET SCHEMA name; with support for all contrib extensions. That's the
> example that lets you see the API (AlterFooNamespace_oid and _internal
> functions) in action: that should help in devising the best refactoring.

Three comments,

1. wouldn't it make more sense to save the extension namespace in the
extension catalog?

2. I think the guts of AlterExtensionNamespace (the large switch block)
should be elsewhere, probably in alter.c

3. Not this patch, but I think using "extension" as a global variable
name is a bad idea.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Alter column to type serial

2010-11-04 Thread Tom Lane
Thom Brown  writes:
> I suspected this may have been discussed previously, I just failed to find
> it.  And yes, it's purely for simplification, and to auto-clean sequences
> when tables are dropped.  I didn't think it would be straightforward, but
> clearly show-stoppers abound.

The latest thread I can find on the matter is
http://archives.postgresql.org/pgsql-general/2008-11/msg00825.php

although I seem to remember others.

regards, tom lane



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Dimitri Fontaine
Alvaro Herrera  writes:
>> /* check for duplicate name (more friendly than unique-index failure) */
>> if (SearchSysCacheExists2(TYPENAMENSP,
>>   CStringGetDatum(name),
>>   ObjectIdGetDatum(nspOid)))
>
> Hmm, this check is wrong anyway, because you're looking in the pg_type
> syscache for objects from an arbitrary catalog.  That needs to be fixed
> somehow, but perhaps it needs to be handled by the callers, not in this
> routine.  Otherwise you're going to need to pass the syscache ID, as
> well as Datums identifying the object, and the number of Datums.

How embarrassing. I wonder why this works, too:

dim=# alter operator utils.@>(utils.ltree, utils.ltree) set schema public;
ALTER OPERATOR
dim=# alter operator @>(utils.ltree, utils.ltree) set schema utils;
ALTER OPERATOR

We have :

static void
AlterOperatorNamespace_internal(Relation rel, Oid operOid, Oid nspOid)
{
...
	CheckSetNamespace(oldNspOid, nspOid,
					  OperatorRelationId, operOid,
					  NameStr(oprForm->oprname));

Well, I'll go fix as you say, putting the check back into the
callers. That won't help a bit with the code duplication feeling we have
when reading the patch, though. Any idea on this front?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of jue nov 04 11:06:48 -0300 2010:
> Alvaro Herrera  writes:
> > FWIW I think you should use getObjectDescription, as in the attached
> > patch.  (Note the patch is incomplete and does not compile because only
> > one caller to CheckSetNamespace has been fixed).
> 
> I had to re-add the object name to the CheckSetNamespace prototype to
> handle this particular check:
> 
> /* check for duplicate name (more friendly than unique-index failure) */
> if (SearchSysCacheExists2(TYPENAMENSP,
>   CStringGetDatum(name),
>   ObjectIdGetDatum(nspOid)))

Hmm, this check is wrong anyway, because you're looking in the pg_type
syscache for objects from an arbitrary catalog.  That needs to be fixed
somehow, but perhaps it needs to be handled by the callers, not in this
routine.  Otherwise you're going to need to pass the syscache ID, as
well as Datums identifying the object, and the number of Datums.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Alter column to type serial

2010-11-04 Thread Thom Brown
On 4 November 2010 14:04, Tom Lane  wrote:

> Thom Brown  writes:
> > Would it be possible (or reasonable) to add support for changing the type
> of
> > a column to serial or bigserial (yes, yes, I know they're not actual
> > types)?
>
> We've looked at that in the past and decided there were enough corner
> cases that it wasn't clearly a good idea.  In particular, what do you do
> with the existing data in the column?  What do you do if there's already
> a DEFAULT expression for the column, throw it away?  In particular, what
> of the special case that the column is in fact already a serial, so the
> default is pointing at an existing sequence?
>
> It is possible to accomplish everything that such a command would do
> manually, so the argument for having it boils down to wanting it to
> be a bit easier.  But unless the command can always do the right thing
> automatically, I'm not sure "easy" is a good argument.
>
> There's also the objection that such an operation would actually have
> very little to do with ALTER COLUMN TYPE --- most of the things it would
> do are not that.  The fact that serial was bolted on as a fake type is a
> wart that maybe we shouldn't extend in this particular fashion.
>
>
I suspected this may have been discussed previously, I just failed to find
it.  And yes, it's purely for simplification, and to auto-clean sequences
when tables are dropped.  I didn't think it would be straightforward, but
clearly show-stoppers abound.

Thanks for the reply though. :)

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-04 Thread Dimitri Fontaine
Alvaro Herrera  writes:
> FWIW I think you should use getObjectDescription, as in the attached
> patch.  (Note the patch is incomplete and does not compile because only
> one caller to CheckSetNamespace has been fixed).

I had to re-add the object name to the CheckSetNamespace prototype to
handle this particular check:

/* check for duplicate name (more friendly than unique-index failure) */
if (SearchSysCacheExists2(TYPENAMENSP,
						  CStringGetDatum(name),
						  ObjectIdGetDatum(nspOid)))

If you know how, in C, to get some struct attribute given a char *
holding its name, I would adapt the patch and work on the refactoring
Tom asked for.

Apart from that, it was just about adapting the call sites, which is
done in the attached set_schema.5.patch. Thanks!

Also attached, please find the complete version of ALTER EXTENSION ext
SET SCHEMA name; with support for all contrib extensions. That's the
example that lets you see the API (AlterFooNamespace_oid and _internal
functions) in action: that should help in devising the best refactoring.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

*** a/src/backend/catalog/dependency.c
--- b/src/backend/catalog/dependency.c
***
*** 2706,2711  getObjectDescription(const ObjectAddress *object)
--- 2706,2726 
  }
  
  /*
+  * getObjectDescriptionOids: as above, except the object is specified by Oids
+  */
+ char *
+ getObjectDescriptionOids(Oid classid, Oid objid)
+ {
+ 	ObjectAddress	address;
+ 
+ 	address.classId = classid;
+ 	address.objectId = objid;
+ 	address.objectSubId = 0;
+ 
+ 	return getObjectDescription(&address);
+ }
+ 
+ /*
   * subroutine for getObjectDescription: describe a relation
   */
  static void
*** a/src/backend/catalog/namespace.c
--- b/src/backend/catalog/namespace.c
***
*** 2340,2345  LookupCreationNamespace(const char *nspname)
--- 2340,2383 
  }
  
  /*
+  * Check new namespace validity in ALTER OBJECT ... SET SCHEMA ... and
+  * ereport(ERROR, ...) in case of any problem.
+  */
+ void
+ CheckSetNamespace(Oid oldNspOid, Oid nspOid, Oid classid, Oid objid,
+   const char *name)
+ {
+ 	if (oldNspOid == nspOid)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+  errmsg("%s already exists in schema \"%s\"",
+ 		getObjectDescriptionOids(classid, objid),
+ 		get_namespace_name(nspOid))));
+ 
+ 	/* disallow renaming into or out of temp schemas */
+ 	if (isAnyTempNamespace(nspOid) || isAnyTempNamespace(oldNspOid))
+ 		ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+  errmsg("cannot move objects into or out of temporary schemas")));
+ 
+ 	/* same for TOAST schema */
+ 	if (nspOid == PG_TOAST_NAMESPACE || oldNspOid == PG_TOAST_NAMESPACE)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+  errmsg("cannot move objects into or out of TOAST schema")));
+ 
+ 	/* check for duplicate name (more friendly than unique-index failure) */
+ 	if (SearchSysCacheExists2(TYPENAMENSP,
+ 			  CStringGetDatum(name),
+ 			  ObjectIdGetDatum(nspOid)))
+ 		ereport(ERROR,
+ (errcode(ERRCODE_DUPLICATE_OBJECT),
+  errmsg("%s already exists in schema \"%s\"",
+ 		getObjectDescriptionOids(classid, objid),
+ 		get_namespace_name(nspOid))));
+ }
+ 
+ /*
   * QualifiedNameGetCreationNamespace
   *		Given a possibly-qualified name for an object (in List-of-Values
   *		format), determine what namespace the object should be created in.
*** a/src/backend/commands/alter.c
--- b/src/backend/commands/alter.c
***
*** 182,192  ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt)
--- 182,208 
     stmt->newschema);
  			break;
  
+ 		case OBJECT_CONVERSION:
+ 			AlterConversionNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
  		case OBJECT_FUNCTION:
  			AlterFunctionNamespace(stmt->object, stmt->objarg, false,
     stmt->newschema);
  			break;
  
+ 		case OBJECT_OPERATOR:
+ 			AlterOperatorNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_OPCLASS:
+ 			AlterOpClassNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_OPFAMILY:
+ 			AlterOpFamilyNamespace(stmt->object, stmt->objarg, stmt->newschema);
+ 			break;
+ 
  		case OBJECT_SEQUENCE:
  		case OBJECT_TABLE:
  		case OBJECT_VIEW:
***
*** 195,200  ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt)
--- 211,232 
  stmt->objectType, AccessExclusiveLock);
  			break;
  
+ 		case OBJECT_TSPARSER:
+ 			AlterTSParserNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_TSDICTIONARY:
+ 			AlterTSDictionaryNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT_TSTEMPLATE:
+ 			AlterTSTemplateNamespace(stmt->object, stmt->newschema);
+ 			break;
+ 
+ 		case OBJECT

Re: [HACKERS] Alter column to type serial

2010-11-04 Thread Tom Lane
Thom Brown  writes:
> Would it be possible (or reasonable) to add support for changing the type of
> a column to serial or bigserial (yes, yes, I know they're not actual
> types)?

We've looked at that in the past and decided there were enough corner
cases that it wasn't clearly a good idea.  In particular, what do you do
with the existing data in the column?  What do you do if there's already
a DEFAULT expression for the column, throw it away?  In particular, what
of the special case that the column is in fact already a serial, so the
default is pointing at an existing sequence?

It is possible to accomplish everything that such a command would do
manually, so the argument for having it boils down to wanting it to
be a bit easier.  But unless the command can always do the right thing
automatically, I'm not sure "easy" is a good argument.

There's also the objection that such an operation would actually have
very little to do with ALTER COLUMN TYPE --- most of the things it would
do are not that.  The fact that serial was bolted on as a fake type is a
wart that maybe we shouldn't extend in this particular fashion.

regards, tom lane



Re: [HACKERS] contrib: auth_delay module

2010-11-04 Thread Robert Haas
On Thu, Nov 4, 2010 at 6:35 AM, Stephen Frost  wrote:
> * Jan Urbański (wulc...@wulczer.org) wrote:
>> On 04/11/10 14:09, Robert Haas wrote:
>> > Hmm, I wonder how useful this is given that restriction.
>>
>> As KaiGai mentioned, it's more to make brute-forcing difficult (read:
>> time consuming), right?
>
> Which it would still do, since the attacker would be bumping up against
> max_connections.  max_connections would be a DOS point, but that's no
> different from today.  Other things could be put in place to address
> that (max # of connections from a given IP or range could be implemented
> using iptables, as an example).
>
> 5 second delay w/ max connections at 100 would mean max of 20 attempts
> per second, no?  That's a lot fewer than 100*(however many attempts can
> be done in a second).  Doing a stupid while true; psql -d blah; done
> managed to get 50 successful ident auths+no-db-found errors done in a
> second on one box here.  5000 >> 20, and I wasn't even trying.

OK.  I was just asking.  I don't object to it if people think it's
useful, especially if they are looking at it as "I would actually use
this on my system" rather than "I can imagine a hypothetical person
using this".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] contrib: auth_delay module

2010-11-04 Thread KaiGai Kohei

(2010/11/04 22:05), Itagaki Takahiro wrote:

2010/11/4 KaiGai Kohei:

The attached patch is a contrib module that injects a delay of a few
seconds on failed authentication. It is also a proof of concept for
the new ClientAuthentication_hook.

This module provides a feature similar to pam_faildelay on operating
systems. Injecting a delay of a few seconds on authentication failure
prevents (or at least hampers) brute-force attacks, because it limits
the number of candidates an attacker can verify per unit of time.
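The delay-on-failure pattern the module implements can be sketched in a few lines of Python (purely illustrative, not the C hook itself; `auth_delay_milliseconds` and `check_password` are invented names for the sketch):

```python
import time

# Sketch of the auth_delay idea: sleep only after a *failed*
# authentication attempt, so each wrong guess costs the attacker
# at least the configured delay.  Hypothetical names throughout.
auth_delay_milliseconds = 5000
PASSWORDS = {"alice": "s3cret"}  # toy credential store

def check_password(user, password, sleep=time.sleep):
    ok = PASSWORDS.get(user) == password
    if not ok:
        sleep(auth_delay_milliseconds / 1000.0)  # delay only on failure
    return ok
```

The `sleep` parameter is injectable only so the behavior can be tested without actually waiting; the real module would simply sleep in the authentication hook after a failure is reported.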


+1 for the feature.  We have "post_auth_delay" parameter,
but it has different purpose; it's as DEVELOPER_OPTIONS
for delay to attach a debugger.

BTW, the module could save CPU usage of the server on attacks,
but do nothing about connection flood attacks, right?
If an attacker attacks the server with multiple connections,
the server still consumes max_connections even with the module.


Good point. pam_faildelay, the model for this feature, likewise does
nothing about connection-flood attacks.
However, if the server closes the connection immediately, the attacker
can try the next candidate right away. Do you have any good ideas?

Thanks,
--
KaiGai Kohei 



Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread Tom Lane
Hannu Krosing  writes:
> Are you sure that having each try/except use a subtransaction is the
> right way to do it?

Actually it is not: what you have to do is use a subtransaction in the
plpy.execute() operation, so that if the called SQL operation fails, you
can clean it up and then report the error to Python as if it were any
other Python error.  Messing with the host language's exception handling
is a sure route to misery.  plperl and pltcl both contain examples of
doing this properly.
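The pattern Tom describes can be illustrated with a toy Python model (this is not PL/Python's real implementation; the `SPIError` class and the dict-based backend are invented for the sketch): `execute()` snapshots state on entry, and on SQL failure restores it and raises an ordinary Python exception, leaving Python's own exception machinery untouched:

```python
class SPIError(Exception):
    """Toy stand-in for an SPI error object carrying a SQLSTATE."""
    def __init__(self, message, sqlstate):
        super().__init__(message)
        self.sqlstate = sqlstate

class ToyBackend:
    def __init__(self):
        self.rows = []  # "committed" state

    def execute(self, row, fail=False):
        saved = list(self.rows)  # "begin subtransaction"
        try:
            self.rows.append(row)
            if fail:
                raise RuntimeError("unique_violation")
        except RuntimeError as e:
            self.rows = saved  # roll back the subtransaction
            # Report the failure as a plain Python exception.
            raise SPIError(str(e), sqlstate="23505") from None

db = ToyBackend()
db.execute(1)
try:
    db.execute(1, fail=True)
except SPIError as e:
    print(e.sqlstate)  # 23505
print(db.rows)         # [1] -- the failed insert was undone
```

The key property is that after the failed `execute()` the backend state is clean and the caller sees a normal catchable exception, which is what the subtransaction-inside-execute approach buys.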

regards, tom lane



Re: [HACKERS] contrib: auth_delay module

2010-11-04 Thread Stephen Frost
* Jan Urbański (wulc...@wulczer.org) wrote:
> On 04/11/10 14:09, Robert Haas wrote:
> > Hmm, I wonder how useful this is given that restriction.
> 
> As KaiGai mentioned, it's more to make brute-forcing difficult (read:
> time consuming), right?

Which it would still do, since the attacker would be bumping up against
max_connections.  max_connections would be a DOS point, but that's no
different from today.  Other things could be put in place to address
that (max # of connections from a given IP or range could be implemented
using iptables, as an example).

5 second delay w/ max connections at 100 would mean max of 20 attempts
per second, no?  That's a lot fewer than 100*(however many attempts can
be done in a second).  Doing a stupid while true; psql -d blah; done
managed to get 50 successful ident auths+no-db-found errors done in a
second on one box here.  5000 >> 20, and I wasn't even trying.
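Stephen's throughput estimate can be reproduced directly (a sketch only; the 5-second delay, 100 connection slots, and ~50 attempts/s figures are the ones quoted in this message):

```python
# Upper bound on guesses per second when every failed attempt is
# delayed, assuming the attacker keeps all max_connections slots
# saturated the whole time.
def max_attempts_per_second(max_connections, fail_delay_seconds):
    return max_connections / fail_delay_seconds

# Figures from the example above: 5 s delay, 100 connection slots.
with_delay = max_attempts_per_second(100, 5.0)  # 20 guesses/s
print(with_delay)

# Without any delay: ~50 attempts/s observed on one connection in the
# informal test above, times 100 slots.
without_delay = 50 * 100
print(without_delay)  # 5000
```

Hence the 5000 >> 20 comparison: the delay caps the aggregate guessing rate regardless of how fast a single authentication exchange completes.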

Thanks,

Stephen




Re: [HACKERS] Comparison with "true" in source code

2010-11-04 Thread Robert Haas
On Wed, Nov 3, 2010 at 6:45 PM, Itagaki Takahiro
 wrote:
> On Wed, Nov 3, 2010 at 2:19 AM, Michael Meskes  wrote:
>> On Mon, Nov 01, 2010 at 12:17:02PM +0900, Itagaki Takahiro wrote:
>>> There are some "== true" in the codes, but they might not be safe
>>> because all non-zero values are true in C. Is it worth cleaning up them?
>
> Here is a proposed cleanup that replaces "boolean == true" with "boolean".
> I didn't touch "== false" unless it was paired with a comparison
> with true, because comparison with false is valid C code.

It looks like you have one or two irrelevant whitespace changes in ecpg.c.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] contrib: auth_delay module

2010-11-04 Thread Jan Urbański
On 04/11/10 14:09, Robert Haas wrote:
> On Thu, Nov 4, 2010 at 6:05 AM, Itagaki Takahiro
>  wrote:
>> 2010/11/4 KaiGai Kohei :
>>> The attached patch is a contrib module to inject a few seconds
>>> delay on authentication failed. It is also a proof of the concept
>>> using the new ClientAuthentication_hook.
>>>
>>> This module provides a similar feature to pam_faildelay on
>>> operating systems. Injection of a few seconds delay on
>>> authentication fails prevents (or makes hard at least) brute-force
>>> attacks, because it limits number of candidates that attacker can
>>> verify within a unit of time.
>>
>> +1 for the feature.  We have "post_auth_delay" parameter,
>> but it has different purpose; it's as DEVELOPER_OPTIONS
>> for delay to attach a debugger.
>>
>> BTW, the module could save CPU usage of the server on attacks,
>> but do nothing about connection flood attacks, right?
>> If an attacker attacks the server with multiple connections,
>> the server still consumes max_connections even with the module.
> 
> Hmm, I wonder how useful this is given that restriction.

As KaiGai mentioned, it's more to make bruteforcing difficult (read: time
consuming), right?

Jan



Re: [HACKERS] contrib: auth_delay module

2010-11-04 Thread Robert Haas
On Thu, Nov 4, 2010 at 6:05 AM, Itagaki Takahiro
 wrote:
> 2010/11/4 KaiGai Kohei :
>> The attached patch is a contrib module to inject a few seconds
>> delay on authentication failed. It is also a proof of the concept
>> using the new ClientAuthentication_hook.
>>
>> This module provides a similar feature to pam_faildelay on
>> operating systems. Injection of a few seconds delay on
>> authentication fails prevents (or makes hard at least) brute-force
>> attacks, because it limits number of candidates that attacker can
>> verify within a unit of time.
>
> +1 for the feature.  We have "post_auth_delay" parameter,
> but it has different purpose; it's as DEVELOPER_OPTIONS
> for delay to attach a debugger.
>
> BTW, the module could save CPU usage of the server on attacks,
> but do nothing about connection flood attacks, right?
> If an attacker attacks the server with multiple connections,
> the server still consumes max_connections even with the module.

Hmm, I wonder how useful this is given that restriction.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] contrib: auth_delay module

2010-11-04 Thread Itagaki Takahiro
2010/11/4 KaiGai Kohei :
> The attached patch is a contrib module to inject a few seconds
> delay on authentication failed. It is also a proof of the concept
> using the new ClientAuthentication_hook.
>
> This module provides a similar feature to pam_faildelay on
> operating systems. Injection of a few seconds delay on
> authentication fails prevents (or makes hard at least) brute-force
> attacks, because it limits number of candidates that attacker can
> verify within a unit of time.

+1 for the feature.  We have "post_auth_delay" parameter,
but it has different purpose; it's as DEVELOPER_OPTIONS
for delay to attach a debugger.

BTW, the module could save CPU usage of the server on attacks,
but do nothing about connection flood attacks, right?
If an attacker attacks the server with multiple connections,
the server still consumes max_connections even with the module.

-- 
Itagaki Takahiro



[HACKERS] contrib: auth_delay module

2010-11-04 Thread KaiGai Kohei
The attached patch is a contrib module to inject a few seconds
delay on authentication failed. It is also a proof of the concept
using the new ClientAuthentication_hook.

This module provides a similar feature to pam_faildelay on
operating systems. Injection of a few seconds delay on
authentication fails prevents (or makes hard at least) brute-force
attacks, because it limits number of candidates that attacker can
verify within a unit of time.

Thanks,
-- 
KaiGai Kohei 


pgsql-v9.1-auth-delay.1.patch
Description: application/octect-stream



Re: [HACKERS] Hash support for arrays

2010-11-04 Thread Kenneth Marshall
On Thu, Nov 04, 2010 at 10:00:40AM +, Dean Rasheed wrote:
> On 3 November 2010 09:24, Nicolas Barbier  wrote:
> > 2010/11/2 Kenneth Marshall :
> >
> >> Given that our hash implementation mixes the input data well (It does.
> >> I tested it.) then a simple rotate-and-xor method is all that should
> >> be needed to maintain all of the needed information. The original
> >> hash function has done the heavy lifting in this case.
> >
> > Even with the perfect hash function for the elements, certain
> > combinations of elements could still lead to massive collisions. E.g.,
> > if repeated values are typical in the input data we are talking about,
> > then the rotate-and-xor method would still lead to collisions between
> > any array of the same values of certain lengths, regardless of the
> > value. In Tom's implementation, as he mentioned before, those
> > problematical lengths would be multiples of 32 (e.g., an array of 32
> > 1s would collide with an array of 32 2s would collide with an array of
> > 32 3s, etc).
> >
> 
> Yeah, rotate-and-xor is a pretty weak hashing algorithm, since any
> array of 32 identical elements will hash to either 0 or -1. Similarly
> various permutations or multiples of that array length will cause it
> to perform badly.
> 
> The multiply-by-m algorithm doesn't have that weakness, provided m is
> chosen carefully. There are a couple of qualities a good algorithm
> should possess:
> 
> 1). The bits from the individual element hash values should be
> distributed evenly so that no 2 different hash values would result in
> the same contribution to the final value. This is easy to achieve -
> just make sure that m is odd.
> 
> 2). The way that each element's hash value bits are distributed should
> be different from the way that every other element's hash value bits
> are distributed. m=31 achieves this pretty well, although there are
> plenty of other equally valid choices.
> 
> Regards,
> Dean
> 
Hi Dean,

In my comment yesterday, I included a simple function that would
allow us to leverage our current hash function's mixing process to
scramble the bits effectively while retaining the maximum amount of
information in the hash.

Regards,
Ken



[HACKERS] Alter column to type serial

2010-11-04 Thread Thom Brown
Hi all,

Would it be possible (or reasonable) to add support for changing the type of
a column to serial or bigserial (yes, yes, I know they're not actual
types)?  In effect this would mean that users who forgot to set up a
sequence could change its type so that a new implicit sequence will be
created, with its current value set to the highest value of the
column it is bound to.  This thought was triggered by a user on IRC wishing
to migrate from MySQL, but had tables with some sort of ID column without
any associated sequence.

So if you had:

CREATE TABLE stuff (id int, content text);

INSERT INTO stuff (id, content) values (1,'alpha'),(2,'beta'),(5,'gamma');

You could just issue:

ALTER TABLE stuff ALTER COLUMN id TYPE serial;

And continue as so:

INSERT INTO stuff (content) values ('delta');

SELECT id from stuff;

 id
----
  1
  2
  5
  6
(4 rows)

This would be instead of having to do:

CREATE SEQUENCE id_stuff_seq;

SELECT setval('id_stuff_seq', (SELECT max(id) FROM stuff));

ALTER TABLE stuff ALTER COLUMN id SET DEFAULT
nextval('id_stuff_seq'::regclass);

Which would also mean the sequence would not get dropped with the table.

Abhorrent idea, or acceptable?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935


Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread Peter Eisentraut
On ons, 2010-11-03 at 14:15 -0700, David E. Wheeler wrote:
> /me wants a global $dbh that mimics the DBI interface but just uses
> SPI under the hood. Not volunteering, either…

Already exists: DBD::PgSPI.  Probably needs lots of updating though.




Re: [HACKERS] Hash support for arrays

2010-11-04 Thread Dean Rasheed
On 3 November 2010 09:24, Nicolas Barbier  wrote:
> 2010/11/2 Kenneth Marshall :
>
> >> Given that our hash implementation mixes the input data well (It does.
>> I tested it.) then a simple rotate-and-xor method is all that should
>> be needed to maintain all of the needed information. The original
>> hash function has done the heavy lifting in this case.
>
> Even with the perfect hash function for the elements, certain
> combinations of elements could still lead to massive collisions. E.g.,
> if repeated values are typical in the input data we are talking about,
> then the rotate-and-xor method would still lead to collisions between
> any array of the same values of certain lengths, regardless of the
> value. In Tom's implementation, as he mentioned before, those
> problematical lengths would be multiples of 32 (e.g., an array of 32
> 1s would collide with an array of 32 2s would collide with an array of
> 32 3s, etc).
>

Yeah, rotate-and-xor is a pretty weak hashing algorithm, since any
array of 32 identical elements will hash to either 0 or -1. Similarly
various permutations or multiples of that array length will cause it
to perform badly.

The multiply-by-m algorithm doesn't have that weakness, provided m is
chosen carefully. There are a couple of qualities a good algorithm
should possess:

1). The bits from the individual element hash values should be
distributed evenly so that no 2 different hash values would result in
the same contribution to the final value. This is easy to achieve -
just make sure that m is odd.

2). The way that each element's hash value bits are distributed should
be different from the way that every other element's hash value bits
are distributed. m=31 achieves this pretty well, although there are
plenty of other equally valid choices.
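
The difference between the two combining schemes can be demonstrated with a
toy 32-bit version of each (illustrative only, not the actual PostgreSQL
implementation):

```python
MASK = 0xFFFFFFFF  # keep everything in 32 bits

def rotate_xor_hash(elems):
    # rotate-and-xor combiner: rotate the accumulator left one bit,
    # then XOR in the next element's hash value
    h = 0
    for e in elems:
        h = ((h << 1) | (h >> 31)) & MASK
        h ^= e & MASK
    return h

def multiply_hash(elems, m=31):
    # multiply-by-m combiner Dean describes: h = h * m + element, m odd
    h = 0
    for e in elems:
        h = (h * m + (e & MASK)) & MASK
    return h

# Any array of 32 identical elements collapses to 0 or 0xFFFFFFFF under
# rotate-and-xor (only the parity of the element's bits survives), so
# [1]*32 and [2]*32 collide; multiply-by-31 keeps them distinct.
```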

Regards,
Dean



Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread Hannu Krosing
On Thu, 2010-11-04 at 11:46 +0200, Hannu Krosing wrote:
> On Wed, 2010-11-03 at 21:43 +0100, Jan Urbański wrote:
> > The validator is ready, once I'm done with the hash tables I'll try to
> > fix up the error checking (get rid of the global error state) and
> > finally do what started it all, that is make plpythonu use
> > subtransactions for SPI and be able to do:
> > 
> > try:
> >     plpy.execute("insert into foo values(1)")
> > except plpy.UniqueViolation, e:
> >     plpy.notice("Ooops, you got yourself a SQLSTATE %d", e.sqlstate)
> 
> Are you sure that having each try/except use a subtransaction is the
> right way to do it ?

Another objection 

> I'd like to make it more explicit and use 
> 
> with plpy.subtransaction():
>     do your stuff

Possibly better syntax would be

with plpy.subtransaction() as subtrx:
    try:
        plpy.execute("insert into foo values(1)")
    except plpy.UniqueViolation, e:
        subtrx.rollback()
        plpy.notice("Ooops, you got yourself a SQLSTATE %d", e.sqlstate)
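
A rough sketch of how the proposed context manager could behave; the
_begin/_rollback/_release hooks below are illustrative stand-ins for real
SPI subtransaction calls, not actual plpy API:

```python
class Subtransaction:
    """Sketch of the proposed plpy.subtransaction() context manager.

    The _begin/_rollback/_release methods here only record what they
    were asked to do; in a real implementation they would enter, roll
    back, or release an SPI-level subtransaction.
    """
    def __init__(self):
        self.log = []

    def _begin(self):
        self.log.append("begin")

    def _rollback(self):
        self.log.append("rollback")

    def _release(self):
        self.log.append("release")

    def __enter__(self):
        self._begin()          # start a subtransaction on entry
        return self

    def __exit__(self, exc_type, exc, tb):
        if exc_type is not None:
            self._rollback()   # roll back on any exception...
        else:
            self._release()    # ...or commit (release) on success
        return False           # never swallow the exception

# success path releases; an exception inside the block rolls back
sub = Subtransaction()
with sub:
    pass
```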


---
Hannu Krosing
PostgreSQL Infinite Scalability and Performance Consultant
PG Admin Book: http://www.2ndQuadrant.com/books/





Re: [HACKERS] why does plperl cache functions using just a bool for is_trigger

2010-11-04 Thread Hannu Krosing
On Wed, 2010-11-03 at 21:43 +0100, Jan Urbański wrote:
> The validator is ready, once I'm done with the hash tables I'll try to
> fix up the error checking (get rid of the global error state) and
> finally do what started it all, that is make plpythonu use
> subtransactions for SPI and be able to do:
> 
> try:
>     plpy.execute("insert into foo values(1)")
> except plpy.UniqueViolation, e:
>     plpy.notice("Ooops, you got yourself a SQLSTATE %d", e.sqlstate)

Are you sure that having each try/except use a subtransaction is the
right way to do it ?

I'd like to make it more explicit and use 

with plpy.subtransaction():
    do your stuff

adding subtransactions to try/except would also act differently on
postgresql and python data, that is things in postgresql tables would
get rolled back but those made to python would not 

or at least make the "rollback to savepoint x" optional

try:
    plpy.savepoint('sp1')
    for i in range(-5, 5):
        plpy.execute("insert into foo values(%s)", [abs(10/i)])
except plpy.UniqueViolation, e:
    plpy.rollback('sp1')
    plpy.notice("Ooops, you got yourself a SQLSTATE %d", e.sqlstate)
except ZeroDivisionError:
    plpy.notice("Only some values were inserted")



---
Hannu Krosing
PostgreSQL Infinite Scalability and Performance Consultant
PG Admin Book: http://www.2ndQuadrant.com/books/





Re: [HACKERS] SQL/MED estimated time of arrival?

2010-11-04 Thread Itagaki Takahiro
On Thu, Nov 4, 2010 at 6:04 PM, Shigeru HANADA
 wrote:
> For example:
> * PRIMARY ACCESS_METHOD -> HANDLER of FOREIGN DATA WRAPPER
> * am_scancost()         -> FdwRoutine.EstimateCosts()
> * am_open()             -> FdwRoutine.Open()
> * am_beginscan()        -> first call of FdwRoutine.Iterate()?

It might be good to have a separate "beginscan" method if we use
asynchronous scans on multiple foreign servers in one query,
because multiple foreign servers can run their queries in parallel.
(Imagine pushing down an aggregate function into each foreign server.)
I think it is different from "open" because "open" is called
before query execution, for example by EXPLAIN.

> * am_getnext()          -> FdwRoutine.Iterate()
> * am_rescan()           -> FdwRoutine.ReOpen()
> * am_close()            -> FdwRoutine.Close()
> * Table descriptor      -> Relation, Form_pg_class
> * Qual descriptor       -> PlanState.qual

Do you think you have all counterpart methods for VTI AMs?
If so, that's good news ;-)  We could support foreign table
features at the same level as Informix.

-- 
Itagaki Takahiro



Re: [HACKERS] SQL/MED estimated time of arrival?

2010-11-04 Thread Shigeru HANADA
On Wed, 03 Nov 2010 13:32:18 -0700
Eric Davies  wrote:
> On Informix, we were able to take advantage of the VTI (Virtual Table 
> Interface) feature to support "table" scans and indexing. (See 
> http://www.ibm.com/developerworks/data/zones/informix/library/techarticle/db_vti.html
>  
> .) Do you have any idea of how long it will be before SQL/MED on 
> PostgreSQL will be available, and perhaps how similar it will be to 
> Informix VTI?

SQL/MED is now under discussion/development for PostgreSQL 9.1, and
9.1 would be released one year after 9.0, maybe around Sep 2011.  For
details of the release schedule, please see the development plan of
PostgreSQL 9.1.

http://wiki.postgresql.org/wiki/PostgreSQL_9.1_Development_Plan

I looked into the VTI documents you pointed to.  ISTM that VTI and
SQL/MED share a lot of common ideas, and most VTI items could be mapped
to SQL/MED counterparts, except for features about updating
data and indexing.

For example:

* PRIMARY ACCESS_METHOD -> HANDLER of FOREIGN DATA WRAPPER
* am_scancost()         -> FdwRoutine.EstimateCosts()
* am_open()             -> FdwRoutine.Open()
* am_beginscan()        -> first call of FdwRoutine.Iterate()?
* am_getnext()          -> FdwRoutine.Iterate()
* am_rescan()           -> FdwRoutine.ReOpen()
* am_close()            -> FdwRoutine.Close()
* Table descriptor      -> Relation, Form_pg_class
* Qual descriptor       -> PlanState.qual

I hope the summary of SQL/MED described in wiki page helps you.

http://wiki.postgresql.org/wiki/SQL/MED

Any comments and questions are welcome.

Regards,
--
Shigeru Hanada

