date:20100819

Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-19 Thread Steven Schlansker

On Aug 19, 2010, at 2:35 PM, Tom Lane wrote:

> Steven Schlansker  writes:
>> I'm having a rather annoying problem - a particular string is causing the 
>> Postgres COPY functionality to lose a byte, causing data corruption in 
>> backups and transferred data.
> 
> I was able to reproduce this on my own Mac.  Some tracing shows that the
> problem is that isspace(0x85) returns true when in locale en_US.utf-8.
> This causes array_in to drop the final byte of the array element string,
> thinking that it's insignificant whitespace.

The 0x85 seems to be the second byte of a multibyte UTF-8
sequence.  I'm not at all experienced with character encodings so I could
be totally off base, but isn't it wrong to ever call isspace(0x85), 
whatever the result may be, given that the actual character is 0xCF85?
(U+03C5, GREEK SMALL LETTER UPSILON)


>  I believe that you must
> not have produced the data file data.copy on a Mac, or at least not in
> that locale setting, because array_out should have double-quoted the
> array element given that behavior of isspace().

Correct, it was produced on a Linux machine.  That said, the charset
there was also UTF-8.

> 
> Now, it's probably less than sane for isspace() to be behaving that way,
> since in a UTF8-based locale 0x85 can't be regarded as a valid character
> code at all.  But I'm not hopeful about the results of filing a bug with
> Apple, because their UTF8-based locales have a lot of other bu^H^Hdubious
> behaviors too, which they appear not to care much about.

I actually can't reproduce that behavior here:

#include 
#include 
int main() {
printf("%d\n", isspace(0x85));
return 0;
}

[ste...@xxx:~]% gcc -o test test.c
[ste...@xxx:~]% ./test
0
[ste...@xxx:~]% locale
LANG="en_US.utf-8"
LC_COLLATE="en_US.utf-8"
LC_CTYPE="en_US.utf-8"
LC_MESSAGES="en_US.utf-8"
LC_MONETARY="en_US.utf-8"
LC_NUMERIC="en_US.utf-8"
LC_TIME="en_US.utf-8"
LC_ALL=
[ste...@xxx:~]% uname -a
Darwin xxx.local 10.4.0 Darwin Kernel Version 10.4.0: Fri Apr 23 18:28:53 PDT 
2010; root:xnu-1504.7.4~1/RELEASE_I386 i386 i386


Thanks much for your help,
Steven Schlansker


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-19 Thread Steven Schlansker


On Aug 19, 2010, at 3:24 PM, Tom Lane wrote:
> Steven Schlansker  writes:
>> 
>> I'm not at all experienced with character encodings so I could
>> be totally off base, but isn't it wrong to ever call isspace(0x85), 
>> whatever the result may be, given that the actual character is 0xCF85?
>> (U+03C5, GREEK SMALL LETTER UPSILON)
> 
> We generally assume that in server-safe encodings, the ctype.h functions
> will behave sanely on any single-byte value.  You can argue the wisdom
> of that, but deciding to change that policy would be a rather massive
> code change; I'm not excited about going that direction.

Fair enough.  I presume there are no "server-safe encodings" for which
a multibyte sequence 0x XX20 would be valid - which would break anyway
(as the second byte looks like a real space)

> You need a setlocale() call, else the program acts as though it's in C
> locale regardless of environment.

Sigh.  I hate C sometimes. :-p

Anyway, it looks like this is actually a BSD bug which got copy +
pasted into Apple's Darwin source -

http://lists.freebsd.org/pipermail/freebsd-i18n/2007-September/000157.html

I have a couple of contacts at Apple so I'll see if there's any interest in
backporting a fix, but I wouldn't hope for it to happen quickly if at all...

Thanks for taking a look into fixing this, I hope you guys can reach
consensus on how to get it fixed :)

Best,
Steven Schlansker

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] small smgrcreate cleanup patch

2010-08-19 Thread Tom Lane

Robert Haas  writes:
> So I propose moving the TablespaceCreateDbspace() call to mdcreate(),
> removing the redundant check from smgrcreate(), and deleting that
> portion of the comment which says this is in the wrong place.

While I don't care for having smgr.c call tablespace.c, moving the call to
md.c instead is surely not an improvement :-(.  The problem here is that
we'd like the tablespace code to be above the smgr code, not below.
Calling it from md.c makes the layer inversion worse not better.

> You could argue that perhaps md.c isn't the right place either, but it
> certainly makes more sense than smgr.c, and I'd argue it's exactly
> right.

On what grounds pray tell?

> A related, interesting question is whether there's any purpose to the
> smgr layer at all.

Not a lot anymore, I think.  This has been ranted about before, but
I think the Berkeley guys bet on the wrong horse when they put an API
layer here.  The facilities that might once have been usefully switched
between at this level are now all down inside kernel device drivers.
I could see merging smgr.c and md.c entirely.  But they'd still be
appropriately placed below, not above, the tablespace commands.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security hook on authorization

2010-08-19 Thread KaiGai Kohei

(2010/08/20 11:45), Robert Haas wrote:
> 2010/8/19 KaiGai Kohei:
>> I also plan to add a security hook on authorization time.
>> It shall allow external security providers to set up credential of
>> the authenticated clients.
>>
>> Please note that it is not intended to control authentication process.
>> It is typically checked based on a pair of username and password.
>> What I want to discuss is things after success of this authentication
>> steps.
>>
>>  From viewpoint of SE-PostgreSQL, it uses getpeercon(3) which obtains
>> a security label of the peer process, so it does not need to consider
>> database username. But we can easily assume other security mechanism
>> which assigns a certain label based on the authenticated database user
>> such as Oracle Label Security.
>>
>> So, I think this hook should be also invoked on the code path of
>> SET SESSION AUTHORIZATION, not only database login time, although
>> SE-PostgreSQL ignores this case.
>>
>> So, I think SetSessionUserId() is a candidate to put this hook which is
>> entirely called from both of the code path.
>> This routine is to assign credential of the default database privilege
>> mechanism, so it seems to me it is a good point where external security
>> provider also assigns its credential of the authenticated database user.
> 
> How is this different from what we rejected before?
> 
It made clear the purpose of this hook.

I also intended to use the previous hook for authorization purpose,
but it was deployed just after initialize_acl() without no argument.
It might be suitable for SE-PostgreSQL, because it does not depend on
authenticated database user, but might be too specific.

The new hook shall be invoked on two code paths (database login and
SET SESSION AUTHORIZATION). It allows upcoming security module which
may assign client's credential based on the database user to utilize
this hook also.

Thanks,
-- 
KaiGai Kohei 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] security hook on authorization

2010-08-19 Thread Robert Haas

2010/8/19 KaiGai Kohei :
> I also plan to add a security hook on authorization time.
> It shall allow external security providers to set up credential of
> the authenticated clients.
>
> Please note that it is not intended to control authentication process.
> It is typically checked based on a pair of username and password.
> What I want to discuss is things after success of this authentication
> steps.
>
> From viewpoint of SE-PostgreSQL, it uses getpeercon(3) which obtains
> a security label of the peer process, so it does not need to consider
> database username. But we can easily assume other security mechanism
> which assigns a certain label based on the authenticated database user
> such as Oracle Label Security.
>
> So, I think this hook should be also invoked on the code path of
> SET SESSION AUTHORIZATION, not only database login time, although
> SE-PostgreSQL ignores this case.
>
> So, I think SetSessionUserId() is a candidate to put this hook which is
> entirely called from both of the code path.
> This routine is to assign credential of the default database privilege
> mechanism, so it seems to me it is a good point where external security
> provider also assigns its credential of the authenticated database user.

How is this different from what we rejected before?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] small smgrcreate cleanup patch

2010-08-19 Thread Robert Haas

smgrcreate() currently contains a call to TablespaceCreateDbspace().
As the comment says, this is a rather silly place to put it.  The
silliest thing about it, IMHO, is that it forces the following check
to be done in both smgrcreate() and mdcreate():

  if (isRedo && reln->md_fd[forknum] != NULL)
  return;

So I propose moving the TablespaceCreateDbspace() call to mdcreate(),
removing the redundant check from smgrcreate(), and deleting that
portion of the comment which says this is in the wrong place.  You
could argue that perhaps md.c isn't the right place either, but it
certainly makes more sense than smgr.c, and I'd argue it's exactly
right.  Moving the TablespaceCreateDbspace() logic into md.c (or
smgr.c) seems like it would be more awkward, though I suppose it could
be done.  One less-than-thrilling result would be that the
TablespaceCreateLock logic wouldn't be confined to tablespace.c, not
that that's *necessarily* a disaster.

A related, interesting question is whether there's any purpose to the
smgr layer at all.  Would we accept a patch that implemented an
alternative smgr layer, perhaps on a per-tablespace basis?  The only
real alternative I can imagine to "magnetic disk" storage would be
network storage.  You could imagine using such a thing for temporary
tablespaces, essentially writing out temporary table pages to a RAM
cache on a remote machine; or perhaps more interestingly, using it for
fault tolerance, keeping 2 or 3 copies of each page on different
servers.  Maybe someone will say "hey, just use iSCSI" or "hey, just
use AFS" or something like that, but I dunno if I trust them to do the
right thing with fsync, and they might not be as well-optimized as
would be possible if we rolled our own.  At any rate, if we DON'T
think we'd ever consider supporting something like this, perhaps we
ought to just fold the md.c stuff into smgr.c and eliminate all the
jumping through hoops.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company


smgrcreate_cleanup.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] security hook on authorization

2010-08-19 Thread KaiGai Kohei

I also plan to add a security hook on authorization time.
It shall allow external security providers to set up credential of
the authenticated clients.

Please note that it is not intended to control authentication process.
It is typically checked based on a pair of username and password.
What I want to discuss is things after success of this authentication
steps.

>From viewpoint of SE-PostgreSQL, it uses getpeercon(3) which obtains
a security label of the peer process, so it does not need to consider
database username. But we can easily assume other security mechanism
which assigns a certain label based on the authenticated database user
such as Oracle Label Security.

So, I think this hook should be also invoked on the code path of
SET SESSION AUTHORIZATION, not only database login time, although
SE-PostgreSQL ignores this case.

So, I think SetSessionUserId() is a candidate to put this hook which is
entirely called from both of the code path.
This routine is to assign credential of the default database privilege
mechanism, so it seems to me it is a good point where external security
provider also assigns its credential of the authenticated database user.

Thanks,
-- 
KaiGai Kohei 
 src/backend/utils/init/miscinit.c |   14 ++
 src/include/miscadmin.h   |4 
 2 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/src/backend/utils/init/miscinit.c b/src/backend/utils/init/miscinit.c
index b3243aa..81f7bd1 100644
--- a/src/backend/utils/init/miscinit.c
+++ b/src/backend/utils/init/miscinit.c
@@ -229,6 +229,15 @@ static int	SecurityRestrictionContext = 0;
 /* We also remember if a SET ROLE is currently active */
 static bool SetRoleIsActive = false;
 
+/*
+ * SetSessionUserId_hook allows external security providers to authorize
+ * the authenticated client on database login and SET SESSION AUTHORIZATION.
+ * It takes two arguments; userid_old and userid_new.
+ * If userid_old would be InvalidOid, it means the hook was invoked on
+ * database login time. Elsewhere, it was invoked due to SET SESSION
+ * AUTHORIZATION.
+ */
+SetSessionUserId_hook_type SetSessionUserId_hook = NULL;
 
 /*
  * GetUserId - get the current effective user ID.
@@ -282,6 +291,11 @@ SetSessionUserId(Oid userid, bool is_superuser)
 {
 	AssertState(SecurityRestrictionContext == 0);
 	AssertArg(OidIsValid(userid));
+
+	/* We also allow security provides to authorize the client */
+	if (SetSessionUserId_hook)
+		(*SetSessionUserId_hook)(SessionUserId, userid);
+
 	SessionUserId = userid;
 	SessionUserIsSuperuser = is_superuser;
 	SetRoleIsActive = false;
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index 2c775c1..5478de6 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -278,6 +278,10 @@ extern void SetDataDir(const char *dir);
 extern void ChangeToDataDir(void);
 extern char *make_absolute_path(const char *path);
 
+/* Hook for plugins to get control in SetSessionUserId */
+typedef void (*SetSessionUserId_hook_type)(Oid userid_old, Oid userid_new);
+extern PGDLLIMPORT SetSessionUserId_hook_type SetSessionUserId_hook;
+
 /* in utils/misc/superuser.c */
 extern bool superuser(void);	/* current user is superuser */
 extern bool superuser_arg(Oid roleid);	/* given user is superuser */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 06:03:57PM -0400, Robert Haas wrote:
> On Thu, Aug 19, 2010 at 5:53 PM, David Fetter  wrote:
> > On Thu, Aug 19, 2010 at 05:40:46PM -0400, Tom Lane wrote:
> >> "Kevin Grittner"  writes:
> >> > Robert Haas  wrote:
> >> >> I suppose there could also be a bit of an ambiguity if you're working
> >> >> with a type like int4 where the values are discrete steps.  Like, what
> >> >> do you do with {1, 2}?
> >>
> >> Hmm, good point.
> >>
> >> > The same thing you do with the avg function?
> >>
> >> avg's approach is not at all datatype-independent though.  If
> >> you're willing to give up the idea of a polymorphic median()
> >> function, that would probably be the thing to do.  If not, you
> >> need to take the left or right one of the two central elements.
> >
> > Whether the median needs to be in the sample is one question that
> > doesn't really have a universal right answer.  A separate
> > question, also without a universal right answer, is whether the
> > median needs to be in the same domain as the sample, int4 being
> > the case above.
> >
> > We could just go with "whatever Oracle, DB2 and MS-SQL Server
> > have," assuming it's the same thing, until something appears in
> > the SQL standard.
> 
> That's usually a sensible default, when in doubt.  If nothing else,
> it improves compatibility.

That's assuming we find such a thing, which I haven't yet.

Trying to stay cheery,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-19 Thread Tatsuo Ishii

> We generally assume that in server-safe encodings, the ctype.h functions
> will behave sanely on any single-byte value.

I think this "wisedom" is only true for C locale.  I'm not surprised
all that it does not work with non C locales.

>From array_funcs.c:

while (isspace((unsigned char) *p))
p++;

IMO this should be something like:

while (isspace((unsigned char) *p))
p += pg_mblen(p);
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] trace_recovery_messages

2010-08-19 Thread Tom Lane

Fujii Masao  writes:
> The explanation of trace_recovery_messages in the document
> is inconsistent with the definition of it in guc.c.

I've applied a patch for this.

I was tempted to propose that we just rip out trace_recovery_messages
altogether, but I suppose Simon would object.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-08-19 Thread Quan Zongliang

I don't know how to edit documents exactly before.

Thanks.

On Thu, 19 Aug 2010 08:48:53 -0700
David Fetter  wrote:

> On Thu, Aug 19, 2010 at 10:24:54PM +0800, Quan Zongliang wrote:
> > documents attached. html and man-page
> 
> Thanks!
> 
> For future reference, the way to patch docs is by patching the SGML
> source.  Please find enclosed a patch which incorporates the code
> patch you sent with these docs.
> 
> Cheers,
> David.
> -- 
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter  XMPP: david.fet...@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
> 
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Quan Zongliang 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-08-19 Thread Quan Zongliang

Because Windows's CreateService has serial start-type:
SERVICE_AUTO_START
SERVICE_BOOT_START
SERVICE_DEMAND_START
SERVICE_DISABLED
SERVICE_SYSTEM_START

Although all of them are not useful for pg service.
I think it is better to use enum.

On Thu, 19 Aug 2010 16:48:53 -0400
Alvaro Herrera  wrote:

> Excerpts from David Fetter's message of jue ago 19 16:40:18 -0400 2010:
> > On Thu, Aug 19, 2010 at 03:47:43PM -0400, Alvaro Herrera wrote:
> > > Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010:
> > > 
> > > > +
> > > > + -S  > > > class="parameter">
> > > 
> > > You omitted the start-type inside the  tag.  Also, the "a"
> > > and "d" values seem to be accepted but not documented.
> > 
> > D'oh!  Changed patch enclosed.  Now in context format :)
> 
> Thanks.
> 
> Another thing -- why is there an enum at all?  Seems it'd be
> simpler to assign the right value to the variable in the getopt() code
> to start with.
> 
> -- 
> Álvaro Herrera 
> The PostgreSQL Company - Command Prompt, Inc.
> PostgreSQL Replication, Consulting, Custom Development, 24x7 support


-- 
Quan Zongliang 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Avoiding deadlocks ...

2010-08-19 Thread Josh Berkus

Kevin,

This one is for you:

Two sessions, in transaction:

Process A   Process B

update session where id = X;
update order where orderid = 5;
update order where orderid = 5;
update order where orderid = 5;
... deadlock error.

It seems like we ought to be able to avoid a deadlock in this case;
there's a clear precedence of who grabbed the order row first.  Does
your serializability patch address the above situation at all?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-19 Thread Tom Lane

Steven Schlansker  writes:
> On Aug 19, 2010, at 2:35 PM, Tom Lane wrote:
>> I was able to reproduce this on my own Mac.  Some tracing shows that the
>> problem is that isspace(0x85) returns true when in locale en_US.utf-8.
>> This causes array_in to drop the final byte of the array element string,
>> thinking that it's insignificant whitespace.

> The 0x85 seems to be the second byte of a multibyte UTF-8
> sequence.

Check.

> I'm not at all experienced with character encodings so I could
> be totally off base, but isn't it wrong to ever call isspace(0x85), 
> whatever the result may be, given that the actual character is 0xCF85?
> (U+03C5, GREEK SMALL LETTER UPSILON)

We generally assume that in server-safe encodings, the ctype.h functions
will behave sanely on any single-byte value.  You can argue the wisdom
of that, but deciding to change that policy would be a rather massive
code change; I'm not excited about going that direction.

>> I believe that you must
>> not have produced the data file data.copy on a Mac, or at least not in
>> that locale setting, because array_out should have double-quoted the
>> array element given that behavior of isspace().

> Correct, it was produced on a Linux machine.  That said, the charset
> there was also UTF-8.

Right ... but you had an isspace function that meets our expectations.

> I actually can't reproduce that behavior here:

You need a setlocale() call, else the program acts as though it's in C
locale regardless of environment.  My test case looks like this:

$ cat isspace.c
#include 
#include 
#include 

int main()
{
  int c;

  setlocale(LC_ALL, "");

  for (c = 1; c < 256; c++)
{
  if (isspace(c))
printf("%3o is space\n", c);
}

  return 0;
}
$ gcc -O -Wall isspace.c
$ LANG=C ./a.out
 11 is space
 12 is space
 13 is space
 14 is space
 15 is space
 40 is space
$ LANG=en_US.utf-8 ./a.out
 11 is space
 12 is space
 13 is space
 14 is space
 15 is space
 40 is space
205 is space
240 is space
$ 

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread Robert Haas

On Thu, Aug 19, 2010 at 5:53 PM, David Fetter  wrote:
> On Thu, Aug 19, 2010 at 05:40:46PM -0400, Tom Lane wrote:
>> "Kevin Grittner"  writes:
>> > Robert Haas  wrote:
>> >> I suppose there could also be a bit of an ambiguity if you're working
>> >> with a type like int4 where the values are discrete steps.  Like, what
>> >> do you do with {1, 2}?
>>
>> Hmm, good point.
>>
>> > The same thing you do with the avg function?
>>
>> avg's approach is not at all datatype-independent though.  If you're
>> willing to give up the idea of a polymorphic median() function, that
>> would probably be the thing to do.  If not, you need to take the left
>> or right one of the two central elements.
>
> Whether the median needs to be in the sample is one question that
> doesn't really have a universal right answer.  A separate question,
> also without a universal right answer, is whether the median needs to
> be in the same domain as the sample, int4 being the case above.
>
> We could just go with "whatever Oracle, DB2 and MS-SQL Server have,"
> assuming it's the same thing, until something appears in the SQL
> standard.

That's usually a sensible default, when in doubt.  If nothing else, it
improves compatibility.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 05:40:46PM -0400, Tom Lane wrote:
> "Kevin Grittner"  writes:
> > Robert Haas  wrote:
> >> I suppose there could also be a bit of an ambiguity if you're working
> >> with a type like int4 where the values are discrete steps.  Like, what
> >> do you do with {1, 2}?
> 
> Hmm, good point.
>  
> > The same thing you do with the avg function?
> 
> avg's approach is not at all datatype-independent though.  If you're
> willing to give up the idea of a polymorphic median() function, that
> would probably be the thing to do.  If not, you need to take the left
> or right one of the two central elements.

Whether the median needs to be in the sample is one question that
doesn't really have a universal right answer.  A separate question,
also without a universal right answer, is whether the median needs to
be in the same domain as the sample, int4 being the case above.

We could just go with "whatever Oracle, DB2 and MS-SQL Server have,"
assuming it's the same thing, until something appears in the SQL
standard.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread Tom Lane

"Kevin Grittner"  writes:
> Robert Haas  wrote:
>> I suppose there could also be a bit of an ambiguity if you're working
>> with a type like int4 where the values are discrete steps.  Like, what
>> do you do with {1, 2}?

Hmm, good point.

> The same thing you do with the avg function?

avg's approach is not at all datatype-independent though.  If you're
willing to give up the idea of a polymorphic median() function, that
would probably be the thing to do.  If not, you need to take the left
or right one of the two central elements.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Old git repo

2010-08-19 Thread Jeff Davis

On Thu, 2010-08-19 at 23:30 +0200, Magnus Hagander wrote:
> It might well be, and the cost is low. But if you're talking about
> gitweb links or so, they'll still be invalid, because it would have to
> be renamed to "postgresql-old" or something like that...

Sure, that's fine.

It would just be nice to have a place to turn to if you have an old SHA1
and no other information; even if it's slightly inconvenient.

Regards,
Jeff Davis



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [BUGS] COPY FROM/TO losing a single byte of a multibyte UTF-8 sequence

2010-08-19 Thread Tom Lane

Steven Schlansker  writes:
> I'm having a rather annoying problem - a particular string is causing the 
> Postgres COPY functionality to lose a byte, causing data corruption in 
> backups and transferred data.

I was able to reproduce this on my own Mac.  Some tracing shows that the
problem is that isspace(0x85) returns true when in locale en_US.utf-8.
This causes array_in to drop the final byte of the array element string,
thinking that it's insignificant whitespace.  I believe that you must
not have produced the data file data.copy on a Mac, or at least not in
that locale setting, because array_out should have double-quoted the
array element given that behavior of isspace().

Now, it's probably less than sane for isspace() to be behaving that way,
since in a UTF8-based locale 0x85 can't be regarded as a valid character
code at all.  But I'm not hopeful about the results of filing a bug with
Apple, because their UTF8-based locales have a lot of other bu^H^Hdubious
behaviors too, which they appear not to care much about.

In any case, stepping back and taking a larger viewpoint, it seems unsafe
for array_in/array_out to have any locale-sensitive behavior anyhow.
As an example, in a LATINx locale it is entirely sane for isspace() to
return true for 0xA0, while it should certainly not do so in C locale.
This means we are at risk of data corruption, ie dropping a valid data
character, when an array value starting or ending with 0xA0 is dumped
from a C-locale database and loaded into a LATINx-locale one.

So it seems like the safest answer is to modify array_in/array_out to
use an ASCII-only definition of isspace().  I believe this is
traditionally defined as space, tab, CR, LF, VT, FF.  We could perhaps
trim that further, like just space and tab, but there might be some risk
of breaking client code that expects the other traditional whitespace to
be ignored.

I'm not sure if there are any other places with similar risks.
hstore's I/O routines contain isspace calls, but I haven't
analyzed the implications.  There is an isspace call in record_out
but it is just there for cosmetic purposes and doesn't protect
any decisions in record_in, so I think it's okay if it makes
platform/locale-dependent choices.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Old git repo

2010-08-19 Thread Christopher Browne

On Thu, Aug 19, 2010 at 5:29 PM, Jeff Davis  wrote:
> The new git repository will have different SHA1s for all of the commits,
> so any old SHA1s will be useless without the old repository.
>
> Hopefully nobody used links to specific commits (or SHA1s) pointing to
> the old git repository for anything important. But I found myself doing
> so occasionally for unimportant things (if it was important, I included
> the date as a safeguard) -- so I assume a few other people did, as well.
>
> Would it be worth keeping the old git repository around in a read-only
> mode, just in case people have links/SHA1s floating around for it?

Perhaps 
"http://git.postgresql.org/gitweb?p=postgresql-before-official-git-migration.git;a=summary";
-- 
http://linuxfinances.info/info/linuxdistributions.html

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Old git repo

2010-08-19 Thread Magnus Hagander

On Thu, Aug 19, 2010 at 23:29, Jeff Davis  wrote:
> The new git repository will have different SHA1s for all of the commits,
> so any old SHA1s will be useless without the old repository.
>
> Hopefully nobody used links to specific commits (or SHA1s) pointing to
> the old git repository for anything important. But I found myself doing
> so occasionally for unimportant things (if it was important, I included
> the date as a safeguard) -- so I assume a few other people did, as well.
>
> Would it be worth keeping the old git repository around in a read-only
> mode, just in case people have links/SHA1s floating around for it?

It might well be, and the cost is low. But if you're talking about
gitweb links or so, they'll still be invalid, because it would have to
be renamed to "postgresql-old" or something like that...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Old git repo

2010-08-19 Thread Jeff Davis

The new git repository will have different SHA1s for all of the commits,
so any old SHA1s will be useless without the old repository.

Hopefully nobody used links to specific commits (or SHA1s) pointing to
the old git repository for anything important. But I found myself doing
so occasionally for unimportant things (if it was important, I included
the date as a safeguard) -- so I assume a few other people did, as well.

Would it be worth keeping the old git repository around in a read-only
mode, just in case people have links/SHA1s floating around for it?

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [pgsql-hackers] Daily digest v1.11023 (17 messages)

2010-08-19 Thread Caleb Welton

From: Tom Lane mailto:t...@sss.pgh.pa.us>>
Date: August 19, 2010 10:25:36 AM PDT
To: David Fetter mailto:da...@fetter.org>>
Cc: Kevin Grittner 
mailto:kevin.gritt...@wicourts.gov>>, Robert Haas 
mailto:robertmh...@gmail.com>>, Pavel Stehule 
mailto:pavel.steh...@gmail.com>>, Greg Stark 
mailto:gsst...@mit.edu>>, PostgreSQL Hackers 
mailto:pgsql-hackers@postgresql.org>>
Subject: Re: wip: functions median and percentile

David Fetter mailto:da...@fetter.org>> writes:
On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
http://www.merriam-webster.com/dictionary/median

If you do a google search for "median" and poke around, you'll find
many places where this is the only definition mentioned; the others
seem to be rather infrequently used.  Why not make the commone usage
convenient?

The reason not to is the same reason that MEDIAN doesn't appear in the
SQL standard, namely that what's common in one field is wrong in
another.

Hmm, do you have any *evidence* that that's why it's not in the standard?

My own take on that is that it's reasonably probable that the SQL
committee might standardize a function by that name someday.  What we
need to be worrying about is the prospect that if there are multiple
definitions for the term, they might pick a different one than we did.
A name like "arithmetic_median" seems much less likely to get blindsided
by future standardization.

regards, tom lane

Median is in the standard, you just have to look a little harder, under the 
section
on inverse distribution functions:

SELECT  PERCENTILE_DIST(0.5) WITHIN GROUP (order by x) ...
or
SELECT  PERCENTILE_CONT(0.5) WITHIN GROUP (order by x) ...

Depending on whether you want a discrete or continuous median.

Oracle added support for the inverse distribution functions in Oracle 9, which
is perhaps why you can find it in the standard.

Oracle added the "median(x)" aggregate as a synonym for "percentile_cont(0.5) 
within group (order by x)"
in Oracle 10.

My money would be on this become standardized at some point, especially since 
it is a
much friendlier syntax.

Regards,
  Caleb

Re: [HACKERS] small typed-table bug in gram.y

2010-08-19 Thread Kevin Grittner

Robert Haas  wrote:
 
> While examining gram.y today I happened to notice an oversight in
the
> grammar rules for creating typed tables: the fourth CREATE TABLE
> production ignores $2.  Which I think means (although of course I
> didn't test it) that if you do something like "CREATE TEMP TABLE IF
> NOT EXISTS foo OF bar", the TEMP won't stick.
 
I tried it on a build from yesterday's HEAD.  You're right.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 04:48:53PM -0400, Alvaro Herrera wrote:
> Excerpts from David Fetter's message of jue ago 19 16:40:18 -0400 2010:
> > On Thu, Aug 19, 2010 at 03:47:43PM -0400, Alvaro Herrera wrote:
> > > Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010:
> > > 
> > > > +
> > > > + -S  > > > class="parameter">
> > > 
> > > You omitted the start-type inside the  tag.  Also,
> > > the "a" and "d" values seem to be accepted but not documented.
> > 
> > D'oh!  Changed patch enclosed.  Now in context format :)
> 
> Thanks.
> 
> Another thing -- why is there an enum at all?  Seems it'd be simpler
> to assign the right value to the variable in the getopt() code to
> start with.

That's a question for the patch author.  I was just cleaning up the
docs :)

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] small typed-table bug in gram.y

2010-08-19 Thread Robert Haas

While examining gram.y today I happened to notice an oversight in the
grammar rules for creating typed tables: the fourth CREATE TABLE
production ignores $2.  Which I think means (although of course I
didn't test it) that if you do something like "CREATE TEMP TABLE IF
NOT EXISTS foo OF bar", the TEMP won't stick.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-08-19 Thread Alvaro Herrera

Excerpts from David Fetter's message of jue ago 19 16:40:18 -0400 2010:
> On Thu, Aug 19, 2010 at 03:47:43PM -0400, Alvaro Herrera wrote:
> > Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010:
> > 
> > > +
> > > + -S  > > class="parameter">
> > 
> > You omitted the start-type inside the  tag.  Also, the "a"
> > and "d" values seem to be accepted but not documented.
> 
> D'oh!  Changed patch enclosed.  Now in context format :)

Thanks.

Another thing -- why is there an enum at all?  Seems it'd be
simpler to assign the right value to the variable in the getopt() code
to start with.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 03:47:43PM -0400, Alvaro Herrera wrote:
> Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010:
> 
> > +
> > + -S  > class="parameter">
> 
> You omitted the start-type inside the  tag.  Also, the "a"
> and "d" values seem to be accepted but not documented.

D'oh!  Changed patch enclosed.  Now in context format :)

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
*** a/doc/src/sgml/ref/pg_ctl-ref.sgml
--- b/doc/src/sgml/ref/pg_ctl-ref.sgml
***
*** 96,101  PostgreSQL documentation
--- 96,107 
 -U username
 -P password
 -D datadir
+-S
+  
+a[uto]
+d[emand]
+  
+
 -w
 -t seconds
 -o options
***
*** 377,382  PostgreSQL documentation
--- 383,400 

   
  
+ 
+ 
+  -S start-type
+  
+   
+Start type of the system service to register.  start-type can
+be auto, or demand, or
+the first letter of one of these two. If this is omitted,
+auto is used.
+   
+  
+ 
 

  
*** a/src/bin/pg_ctl/pg_ctl.c
--- b/src/bin/pg_ctl/pg_ctl.c
***
*** 69,74  typedef enum
--- 69,82 
RUN_AS_SERVICE_COMMAND
  } CtlCommand;
  
+ 
+ typedef enum
+ {
+   PGCTL_START_AUTO,
+   PGCTL_START_DEMAND
+ } PgctlWin32StartType;
+ 
+ 
  #define DEFAULT_WAIT  60
  
  static bool do_wait = false;
***
*** 87,92  static char *exec_path = NULL;
--- 95,101 
  static char *register_servicename = "PostgreSQL"; /* FIXME: + 
version ID? */
  static char *register_username = NULL;
  static char *register_password = NULL;
+ static PgctlWin32StartType win32_start_type = PGCTL_START_AUTO;
  static char *argv0 = NULL;
  static bool allow_core_files = false;
  
***
*** 1132,1137  pgwin32_doRegister(void)
--- 1141,1147 
  {
SC_HANDLE   hService;
SC_HANDLE   hSCM = OpenSCManager(NULL, NULL, SC_MANAGER_ALL_ACCESS);
+   DWORD   starttype;
  
if (hSCM == NULL)
{
***
*** 1145,1153  pgwin32_doRegister(void)
exit(1);
}
  
if ((hService = CreateService(hSCM, register_servicename, 
register_servicename,
   SERVICE_ALL_ACCESS, 
SERVICE_WIN32_OWN_PROCESS,
! 
SERVICE_AUTO_START, SERVICE_ERROR_NORMAL,
  
pgwin32_CommandLine(true),
   NULL, NULL, "RPCSS\0", register_username, register_password)) == 
NULL)
{
--- 1155,1168 
exit(1);
}
  
+   if (win32_start_type == PGCTL_START_AUTO)
+   starttype = SERVICE_AUTO_START;
+   else
+   starttype = SERVICE_DEMAND_START;
+ 
if ((hService = CreateService(hSCM, register_servicename, 
register_servicename,
   SERVICE_ALL_ACCESS, 
SERVICE_WIN32_OWN_PROCESS,
! starttype, 
SERVICE_ERROR_NORMAL,
  
pgwin32_CommandLine(true),
   NULL, NULL, "RPCSS\0", register_username, register_password)) == 
NULL)
{
***
*** 1586,1592  do_help(void)
printf(_("  %s killSIGNALNAME PID\n"), progname);
  #if defined(WIN32) || defined(__CYGWIN__)
printf(_("  %s register   [-N SERVICENAME] [-U USERNAME] [-P PASSWORD] 
[-D DATADIR]\n"
!"[-w] [-t SECS] [-o \"OPTIONS\"]\n"), 
progname);
printf(_("  %s unregister [-N SERVICENAME]\n"), progname);
  #endif
  
--- 1601,1607 
printf(_("  %s killSIGNALNAME PID\n"), progname);
  #if defined(WIN32) || defined(__CYGWIN__)
printf(_("  %s register   [-N SERVICENAME] [-U USERNAME] [-P PASSWORD] 
[-D DATADIR]\n"
!"[-S START-TYPE] [-w] [-t SECS] [-o 
\"OPTIONS\"]\n"), progname);
printf(_("  %s unregister [-N SERVICENAME]\n"), progname);
  #endif
  
***
*** 1627,1632  do_help(void)
--- 1642,1654 
printf(_("  -N SERVICENAME  service name with which to register 
PostgreSQL server\n"));
printf(_("  -P PASSWORD password of account to register PostgreSQL 
server\n"));
printf(_("  -U USERNAME user name of account to register PostgreSQL 
server\n"));
+   printf(_("  -S START-TYPE   service start type to register PostgreSQL 
server,\n"
+   "  can be auto or demand\n"));
+ 
+

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread Kevin Grittner

Robert Haas  wrote:
 
> I suppose there could also be a bit of an ambiguity if you're
working
> with a type like int4 where the values are discrete steps.  Like,
what
> do you do with {1, 2}?
 
The same thing you do with the avg function?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread Robert Haas

On Thu, Aug 19, 2010 at 4:21 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> You've made this assertion at least three times now, but I confess
>> that I've only ever learned one way to compute a median; and quick
>> Google searches for "median", "kinds of median", and few other
>> variants failed to turn up anything obvious either.
>
> There are different ways to define it when the number of samples is even.
> However I believe that "use the mean of the two middle values" is much
> the most common way to deal with that.

I suppose there could also be a bit of an ambiguity if you're working
with a type like int4 where the values are discrete steps.  Like, what
do you do with {1, 2}?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread Tom Lane

Robert Haas  writes:
> You've made this assertion at least three times now, but I confess
> that I've only ever learned one way to compute a median; and quick
> Google searches for "median", "kinds of median", and few other
> variants failed to turn up anything obvious either.

There are different ways to define it when the number of samples is even.
However I believe that "use the mean of the two middle values" is much
the most common way to deal with that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread Robert Haas

On Thu, Aug 19, 2010 at 9:46 AM, David Fetter  wrote:
> As to median, please make sure you say in detail which median you're
> using and name it so, as there is no single, authoritative median.

You've made this assertion at least three times now, but I confess
that I've only ever learned one way to compute a median; and quick
Google searches for "median", "kinds of median", and few other
variants failed to turn up anything obvious either.  It seems to me
that if median is good enough for Oracle, Sybase, Excel, and the nun
who taught my fifth-grade elementary school class, it ought to be good
enough for us, too.  (Don't mess with Sr. Catherine!)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-08-19 Thread Alvaro Herrera

Excerpts from David Fetter's message of jue ago 19 11:48:53 -0400 2010:

> +
> + -S  class="parameter">

You omitted the start-type inside the  tag.  Also, the "a"
and "d" values seem to be accepted but not documented.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] refactoring comment.c

2010-08-19 Thread Tom Lane

Robert Haas  writes:
> Any other kibitzing before I commit this?

Sure ...

+  * If the object is a relation or a child object of a relation (e.g. an
+  * attribute or contraint, *relp will set to point to that relation).  This

Parenthesis in the wrong place here, grammar and spelling not much better.
Also, I still feel that this comment could do better about explaining the
behavior, particularly with respect to locking.  Perhaps say

+  * If the target object is a relation or a child object of a relation
+  * (e.g. an attribute or constraint), the relation is also opened, and *relp
+  * receives the open relcache entry pointer; otherwise *relp is set to NULL.
+  * This is a bit grotty but it makes life simpler, since the caller will
+  * typically need the relcache entry too.  Caller must close the relcache
+  * entry when done with it.  The relation is locked with the specified
+  * lockmode if the target object is the relation itself or an attribute,
+  * but for other child objects, only AccessShareLock is acquired on the
+  * relation.


+   ScanKeyInit(&skey[0], ObjectIdAttributeNumber, BTEqualStrategyNumber,
+   F_OIDEQ, ObjectIdGetDatum(address.objectId));

There's a standard convention for the layout of ScanKeyInit calls, and
this isn't it.  Trivial, I know, but it's better to make similar code
look similar.

There's no longer any need for a diff in src/backend/parser/Makefile.

+ #define OBJECTADDRESS_H
+ 
+ #include "nodes/parsenodes.h"
+ #include "nodes/pg_list.h"
+ #include "storage/lock.h"
+ #include "utils/rel.h"

You shouldn't need pg_list.h here, as parsenodes.h surely includes it.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Tom Lane

David Fetter  writes:
> On Thu, Aug 19, 2010 at 01:25:36PM -0400, Tom Lane wrote:
>> A name like "arithmetic_median" seems much less likely to get
>> blindsided by future standardization.

> Yep.

OTOH, if Pavel's right that Oracle already has an aggregate named
median(), it seems quite unlikely that the SQL committee would pick
a definition incompatible with Oracle's to standardize on.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Kevin Grittner

Pavel Stehule  wrote:
 
> I looked there and Oracle11g use "median" in common sense.
 
As does Sybase IQ.  FWIW, Excel spreadsheets do, too.
 
The chance of the SQL committee picking some other meaning for bare
MEDIAN seems vanishingly small; although I have to grant that with
only Oracle and Sybase as a precedent in the SQL world, it's not
zero.
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Tom Lane

Heikki Linnakangas  writes:
> On 19/08/10 20:18, Tom Lane wrote:
>> But I'm still not seeing how this self-pipe hack avoids a race
>> condition.  If the signal handler is sending a byte whenever it
>> executes, then you could have bytes already stacked up in the pipe
>> from previous interrupts that didn't happen to come while inside
>> pg_usleep.  If you clear those before sleeping, you have a race
>> condition, and if you don't, then you fail to sleep the intended
>> amount of time even though there was no interrupt this time.

> You clear the pipe after waking up.

Hmm ... that doesn't answer my second objection about failing to sleep
the expected amount of time, but on reflection I guess that can't be
done: we have to be able to cope with interrupts occurring just before
the sleep actually begins, and there's no way to define "just before"
except by reference to the calling code having done whatever it might
need to do before/after sleeping.

> Hmm, will need to think about a suitable API for that.

Offhand I'd suggest something like

SetSleepInterrupt() -- called by signal handlers, writes pipe
ClearSleepInterrupt()   -- called by sleep-and-do-something loops, clears pipe

pg_usleep() itself remains the same, but it is now guaranteed to return
immediately if SetSleepInterrupt is called, or has been called since the
last ClearSleepInterrupt.

> The nice thing 
> would be that we could implement it using pselect() where available. 
> (And reliable - the Linux select() man page says that glibc's pselect() 
> is emulated using select(), and suffers from the very same race 
> condition pselect() was invented to solve. How awful is that!?)

Ick.  So how would you tell if it's trustworthy?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 01:25:36PM -0400, Tom Lane wrote:
> David Fetter  writes:
> > On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
> >> http://www.merriam-webster.com/dictionary/median
> >> 
> >> If you do a google search for "median" and poke around, you'll find
> >> many places where this is the only definition mentioned; the others
> >> seem to be rather infrequently used.  Why not make the commone usage
> >> convenient?
> 
> > The reason not to is the same reason that MEDIAN doesn't appear in the
> > SQL standard, namely that what's common in one field is wrong in
> > another.
> 
> Hmm, do you have any *evidence* that that's why it's not in the standard?

Only from Joe Celko, who was on the standards committee, and has
written on the subject.  There's nothing I've found in the standard
itself for this, if that's what you're looking for.

> My own take on that is that it's reasonably probable that the SQL
> committee might standardize a function by that name someday.  What
> we need to be worrying about is the prospect that if there are
> multiple definitions for the term, they might pick a different one
> than we did.

Exactly :)

> A name like "arithmetic_median" seems much less likely to get
> blindsided by future standardization.

Yep.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Pavel Stehule

2010/8/19 David Fetter :
> On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
>> David Fetter  wrote:
>>
>> > Median may be useful, but we pretty much can't just call it
>> > "median." Instead, we need to call it something like "left_median"
>> > or "arithmetic_median."
>>
>> I think it would be reasonable, and perhaps preferable, to use just
>> "median" for the semantics described in most dictionaries -- for
>> example, this:
>>
>> http://www.merriam-webster.com/dictionary/median
>>
>> If you do a google search for "median" and poke around, you'll find
>> many places where this is the only definition mentioned; the others
>> seem to be rather infrequently used.  Why not make the commone usage
>> convenient?
>
> The reason not to is the same reason that MEDIAN doesn't appear in the
> SQL standard, namely that what's common in one field is wrong in
> another.

I think some else. The reason can be more simple - implementation of
median is significantly harder  then other aggregates.

I looked there and Oracle11g use "median" in common sense.

Regards

Pavel Stehule
>
> Cheers,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fet...@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Pavel Stehule

2010/8/19 David Fetter :
> On Thu, Aug 19, 2010 at 12:49:45PM -0400, Robert Haas wrote:
>> On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane  wrote:
>> > Greg Stark  writes:
>> >> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule  
>> >> wrote:
>> >>> I am sending a prototype implementation of functions median and
>> >>> percentile. This implementation is very simple and I moved it to
>> >>> contrib for this moment - it is more easy maintainable. Later I'll
>> >>> move it to core.
>> >
>> >> So if the entire result set fits in memory it would be nice to use the
>> >> O(n) Quickselect algorithm -- which would only be a small change to
>> >> the existing Quicksort code -- instead of sorting the entire set.
>> >
>> > That seems like rather a lot of added infrastructure for functions whose
>> > popularity is at best unknown.  I think we should KISS for the first
>> > implementation.
>>
>> +1.  I think the functions are useful, but the perfect is the enemy of the 
>> good.
>
> Percentile is already there as NTILE, a windowing function.  Median

I don't think it. The purpose is same, but usage is different -
aggregate functions can be used as window func, but window functions
cannot be used as aggregates.


> may be useful, but we pretty much can't just call it "median."
> Instead, we need to call it something like "left_median" or
> "arithmetic_median."

I put some time to deep searching in this area - and I talked with my
kolegues mathematican and everywhere use a just median. Some longer or
different name isn't barrier for me, but people can be confused.

Regards
Pavel

>
> Cheers,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fet...@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Heikki Linnakangas


On 19/08/10 20:18, Tom Lane wrote:

Heikki Linnakangas  writes:

On 19/08/10 19:57, Tom Lane wrote:

Hmm, but couldn't you still do that inside pg_usleep?  Signal handlers
that do that couldn't know if they were interrupting a sleep per se,
so this would have to be a backend-wide convention.



I don't understand, do what inside pg_usleep()?


I was envisioning still using pg_usleep, and having this interaction
between signal handlers and the delay be private to pg_usleep, rather
than putting such ugly code into forty-nine different places.  There
are a *lot* of places where we have loops that break down delays
into at-most-one-second pg_usleep calls, and if we're going to have a
hack like this we should fix them all, not just two or three that SR
cares about.


Hmm, will need to think about a suitable API for that. The nice thing 
would be that we could implement it using pselect() where available. 
(And reliable - the Linux select() man page says that glibc's pselect() 
is emulated using select(), and suffers from the very same race 
condition pselect() was invented to solve. How awful is that!?)



But I'm still not seeing how this self-pipe hack avoids a race
condition.  If the signal handler is sending a byte whenever it
executes, then you could have bytes already stacked up in the pipe
from previous interrupts that didn't happen to come while inside
pg_usleep.  If you clear those before sleeping, you have a race
condition, and if you don't, then you fail to sleep the intended
amount of time even though there was no interrupt this time.


You clear the pipe after waking up. Before sending all the pending WAL 
and deciding that you're fully caught up again:


for(;;)
{
  1. select()
  2. clear pipe
  3. send any pending WAL
}

There's more information on the self-pipe trick at e.g 
http://lwn.net/Articles/177897/


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Tom Lane

David Fetter  writes:
> On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
>> http://www.merriam-webster.com/dictionary/median
>> 
>> If you do a google search for "median" and poke around, you'll find
>> many places where this is the only definition mentioned; the others
>> seem to be rather infrequently used.  Why not make the commone usage
>> convenient?

> The reason not to is the same reason that MEDIAN doesn't appear in the
> SQL standard, namely that what's common in one field is wrong in
> another.

Hmm, do you have any *evidence* that that's why it's not in the standard?

My own take on that is that it's reasonably probable that the SQL
committee might standardize a function by that name someday.  What we
need to be worrying about is the prospect that if there are multiple
definitions for the term, they might pick a different one than we did.
A name like "arithmetic_median" seems much less likely to get blindsided
by future standardization.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Tom Lane

Heikki Linnakangas  writes:
> On 19/08/10 19:57, Tom Lane wrote:
>> Hmm, but couldn't you still do that inside pg_usleep?  Signal handlers
>> that do that couldn't know if they were interrupting a sleep per se,
>> so this would have to be a backend-wide convention.

> I don't understand, do what inside pg_usleep()?

I was envisioning still using pg_usleep, and having this interaction
between signal handlers and the delay be private to pg_usleep, rather
than putting such ugly code into forty-nine different places.  There
are a *lot* of places where we have loops that break down delays
into at-most-one-second pg_usleep calls, and if we're going to have a
hack like this we should fix them all, not just two or three that SR
cares about.

But I'm still not seeing how this self-pipe hack avoids a race
condition.  If the signal handler is sending a byte whenever it
executes, then you could have bytes already stacked up in the pipe
from previous interrupts that didn't happen to come while inside
pg_usleep.  If you clear those before sleeping, you have a race
condition, and if you don't, then you fail to sleep the intended
amount of time even though there was no interrupt this time.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 12:12:12PM -0500, Kevin Grittner wrote:
> David Fetter  wrote:
>  
> > Median may be useful, but we pretty much can't just call it
> > "median." Instead, we need to call it something like "left_median"
> > or "arithmetic_median."
>  
> I think it would be reasonable, and perhaps preferable, to use just
> "median" for the semantics described in most dictionaries -- for
> example, this:
>  
> http://www.merriam-webster.com/dictionary/median
>  
> If you do a google search for "median" and poke around, you'll find
> many places where this is the only definition mentioned; the others
> seem to be rather infrequently used.  Why not make the commone usage
> convenient?

The reason not to is the same reason that MEDIAN doesn't appear in the
SQL standard, namely that what's common in one field is wrong in
another.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Kevin Grittner

David Fetter  wrote:

> Median may be useful, but we pretty much can't just call it
> "median." Instead, we need to call it something like "left_median"
> or "arithmetic_median."

I think it would be reasonable, and perhaps preferable, to use just
"median" for the semantics described in most dictionaries -- for
example, this:

http://www.merriam-webster.com/dictionary/median

If you do a google search for "median" and poke around, you'll find
many places where this is the only definition mentioned; the others
seem to be rather infrequently used.  Why not make the commone usage
convenient?

-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Heikki Linnakangas


On 19/08/10 19:57, Tom Lane wrote:

Heikki Linnakangas  writes:

On 19/08/10 16:38, Tom Lane wrote:

Considering that pg_usleep is implemented with select, I'm not following
what you mean by "replace pg_usleep() with select()"?



Instead of using pg_usleep(), call select() directly, waiting not only
for the timeout, but also for data to arrive on the "self-pipe". The
signal handler writes a byte to the self-pipe, waking up the select().
That way the select() is interupted by the signal arriving, even if
signals per se don't interrupt it. And it closes the race condition
involved with setting a flag in the signal handler and checking that in
the main loop.


Hmm, but couldn't you still do that inside pg_usleep?  Signal handlers
that do that couldn't know if they were interrupting a sleep per se,
so this would have to be a backend-wide convention.


I don't understand, do what inside pg_usleep()?

We only need to respond quickly to one signal, the one that tells 
walsender "there's some new WAL that you should send". We can rely on 
polling for all the other signals like SIGHUP for config reload or 
shutdown request, like we do today.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 12:49:45PM -0400, Robert Haas wrote:
> On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane  wrote:
> > Greg Stark  writes:
> >> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule  
> >> wrote:
> >>> I am sending a prototype implementation of functions median and
> >>> percentile. This implementation is very simple and I moved it to
> >>> contrib for this moment - it is more easy maintainable. Later I'll
> >>> move it to core.
> >
> >> So if the entire result set fits in memory it would be nice to use the
> >> O(n) Quickselect algorithm -- which would only be a small change to
> >> the existing Quicksort code -- instead of sorting the entire set.
> >
> > That seems like rather a lot of added infrastructure for functions whose
> > popularity is at best unknown.  I think we should KISS for the first
> > implementation.
> 
> +1.  I think the functions are useful, but the perfect is the enemy of the 
> good.

Percentile is already there as NTILE, a windowing function.  Median
may be useful, but we pretty much can't just call it "median."
Instead, we need to call it something like "left_median" or
"arithmetic_median."

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Tom Lane

Heikki Linnakangas  writes:
> On 19/08/10 16:38, Tom Lane wrote:
>> Considering that pg_usleep is implemented with select, I'm not following
>> what you mean by "replace pg_usleep() with select()"?

> Instead of using pg_usleep(), call select() directly, waiting not only 
> for the timeout, but also for data to arrive on the "self-pipe". The 
> signal handler writes a byte to the self-pipe, waking up the select(). 
> That way the select() is interupted by the signal arriving, even if 
> signals per se don't interrupt it. And it closes the race condition 
> involved with setting a flag in the signal handler and checking that in 
> the main loop.

Hmm, but couldn't you still do that inside pg_usleep?  Signal handlers
that do that couldn't know if they were interrupting a sleep per se,
so this would have to be a backend-wide convention.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the

2010-08-19 Thread Heikki Linnakangas


On 19/08/10 18:08, Robert Haas wrote:

On Thu, Aug 19, 2010 at 9:47 AM, Tom Lane  wrote:

Another possibility is for EXECUTE USING to coerce any unknowns to TEXT
before it calls the parser at all.  This would square with the typical
default assumption for unknown literals, and it would avoid having to
have any semantics changes below the SPI call.


That seems more intuitive than just chucking an error.


Ok, I reverted the previous patch, and did that instead.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgsSQL EXECUTE USING INTO

2010-08-19 Thread Tom Lane

Heikki Linnakangas  writes:
> While testing the recent issue with unknown params in EXECUTE USING, I 
> accidentally did this:

>EXECUTE 'SELECT ''foo'' || $1' USING 'bar' INTO t;

> The mistake I made? I put the USING and INTO clauses in wrong order, 
> INTO needs to go first. We should throw an error on that, but it looks 
> like the INTO clause is just silently ignored.

This is more interesting than it looks.  It appears that the plpgsql
parser interprets the USING's argument expression as being
'bar' INTO t
so it generates a plplgsql expression with query
SELECT 'bar' INTO t
and the only reason that you don't get a failure is that
exec_simple_check_plan fails to notice the intoClause, so it thinks
this represents a "simple expression", which means it evaluates the
'bar' subexpression and ignores the INTO altogether.  That's
certainly a bug in exec_simple_check_plan :-(

I think that accepting this order of the clauses would require some
duplication of code in the stmt_dynexecute production.  It might be
worth doing anyway, because if you made this mistake then certainly
others will.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Robert Haas

On Thu, Aug 19, 2010 at 11:33 AM, Tom Lane  wrote:
> Greg Stark  writes:
>> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule  
>> wrote:
>>> I am sending a prototype implementation of functions median and
>>> percentile. This implementation is very simple and I moved it to
>>> contrib for this moment - it is more easy maintainable. Later I'll
>>> move it to core.
>
>> So if the entire result set fits in memory it would be nice to use the
>> O(n) Quickselect algorithm -- which would only be a small change to
>> the existing Quicksort code -- instead of sorting the entire set.
>
> That seems like rather a lot of added infrastructure for functions whose
> popularity is at best unknown.  I think we should KISS for the first
> implementation.

+1.  I think the functions are useful, but the perfect is the enemy of the good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Heikki Linnakangas


On 19/08/10 16:38, Tom Lane wrote:

Heikki Linnakangas  writes:

BTW, on what platforms signals don't interrupt sleep? Although that
issue has been discussed many times before, I couldn't find any
reference to a real platform in the archives.


I've got one in captivity (my old HPUX box).  Happy to test whatever you
come up with.

Considering that pg_usleep is implemented with select, I'm not following
what you mean by "replace pg_usleep() with select()"?


Instead of using pg_usleep(), call select() directly, waiting not only 
for the timeout, but also for data to arrive on the "self-pipe". The 
signal handler writes a byte to the self-pipe, waking up the select(). 
That way the select() is interupted by the signal arriving, even if 
signals per se don't interrupt it. And it closes the race condition 
involved with setting a flag in the signal handler and checking that in 
the main loop.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Heikki Linnakangas


On 19/08/10 18:08, Alvaro Herrera wrote:

Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010:

On 19/08/10 04:46, Robert Haas wrote:



  And so far we haven't seen a patch for that.
Somebody write one.  And then let's get it reviewed and committed RSN.


Fujii is on vacation, but I've started working on it. The two issues
with Fujii's latest patch are that it would not respond promptly on
platforms where signals don't interrupt sleep, and it suffers the
classic race condition that pselect() was invented for. I'm going to
replace pg_usleep() with select(), and use the so called "self-pipe
trick" to get over the race condition. I have that written up but I want
to do some testing and cleanup before posting the patch.


Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because
select() doesn't handle pipes, only sockets.  You may need some extra
hack to make it work there.


That's fine, we can do the naive set-a-flag implementation on Windows 
because on Windows our "signals" are only delivered at 
CHECK_FOR_INTERRUPTS(), so we don't have the race condition to begin with.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] refactoring comment.c

2010-08-19 Thread Robert Haas

On Thu, Aug 19, 2010 at 11:57 AM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010:
>>> Here's v3.
>
>> The header comment in objectaddress.c contains a funny mistake: it says
>> it works with ObjectAddresses.  However, ObjectAddresses is a different
>> type altogether, so I recommend not using that as plural for
>> ObjectAddress.  Maybe "ObjectAddress objects"?  :-D
>
> Alternatively, maybe ObjectAddresses was a bad choice of type name,
> and it should be ObjectAddressList or ObjectAddressArray or some such.
> But changing that might be more trouble than it's worth.

Yeah, I think it was a bad choice of type name.  If I were otherwise
touching that code, I'd probably advocate for changing it, but since
I'm not, I'm inclined to just reword the comment.  It might be
something to keep in mind if we ever overhaul that part of the system,
though, since at that point anything that must be back-patched will
have merge conflicts anyway.

Any other kibitzing before I commit this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] refactoring comment.c

2010-08-19 Thread Tom Lane

Alvaro Herrera  writes:
> Excerpts from Robert Haas's message of miÃ© ago 18 21:32:48 -0400 2010:
>> Here's v3.

> The header comment in objectaddress.c contains a funny mistake: it says
> it works with ObjectAddresses.  However, ObjectAddresses is a different
> type altogether, so I recommend not using that as plural for
> ObjectAddress.  Maybe "ObjectAddress objects"?  :-D

Alternatively, maybe ObjectAddresses was a bad choice of type name,
and it should be ObjectAddressList or ObjectAddressArray or some such.
But changing that might be more trouble than it's worth.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fw: patch for pg_ctl.c to add windows service start-type

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 10:24:54PM +0800, Quan Zongliang wrote:
> documents attached. html and man-page

Thanks!

For future reference, the way to patch docs is by patching the SGML
source.  Please find enclosed a patch which incorporates the code
patch you sent with these docs.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/doc/src/sgml/ref/pg_ctl-ref.sgml b/doc/src/sgml/ref/pg_ctl-ref.sgml
index 20d87bb..a4e675e 100644
--- a/doc/src/sgml/ref/pg_ctl-ref.sgml
+++ b/doc/src/sgml/ref/pg_ctl-ref.sgml
@@ -377,6 +377,18 @@ PostgreSQL documentation
   
  
 
+
+
+ -S 
+ 
+  
+   Start type of the system service to register.  start-type can
+   be auto, or demand, or
+   the first letter of one of these two. If this is omitted,
+   auto is used.
+  
+ 
+

   
 
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 1caec12..42a8aa9 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -69,6 +69,14 @@ typedef enum
RUN_AS_SERVICE_COMMAND
 } CtlCommand;
 
+
+typedef enum
+{
+   PGCTL_START_AUTO,
+   PGCTL_START_DEMAND
+} PgctlWin32StartType;
+
+
 #define DEFAULT_WAIT   60
 
 static bool do_wait = false;
@@ -87,6 +95,7 @@ static char *exec_path = NULL;
 static char *register_servicename = "PostgreSQL";  /* FIXME: + 
version ID? */
 static char *register_username = NULL;
 static char *register_password = NULL;
+static PgctlWin32StartType win32_start_type = PGCTL_START_AUTO;
 static char *argv0 = NULL;
 static bool allow_core_files = false;
 
@@ -1132,6 +1141,7 @@ pgwin32_doRegister(void)
 {
SC_HANDLE   hService;
SC_HANDLE   hSCM = OpenSCManager(NULL, NULL, SC_MANAGER_ALL_ACCESS);
+   DWORD   starttype;
 
if (hSCM == NULL)
{
@@ -1145,9 +1155,14 @@ pgwin32_doRegister(void)
exit(1);
}
 
+   if (win32_start_type == PGCTL_START_AUTO)
+   starttype = SERVICE_AUTO_START;
+   else
+   starttype = SERVICE_DEMAND_START;
+
if ((hService = CreateService(hSCM, register_servicename, 
register_servicename,
   SERVICE_ALL_ACCESS, 
SERVICE_WIN32_OWN_PROCESS,
- 
SERVICE_AUTO_START, SERVICE_ERROR_NORMAL,
+ starttype, 
SERVICE_ERROR_NORMAL,
  
pgwin32_CommandLine(true),
   NULL, NULL, "RPCSS\0", register_username, register_password)) == 
NULL)
{
@@ -1586,7 +1601,7 @@ do_help(void)
printf(_("  %s killSIGNALNAME PID\n"), progname);
 #if defined(WIN32) || defined(__CYGWIN__)
printf(_("  %s register   [-N SERVICENAME] [-U USERNAME] [-P PASSWORD] 
[-D DATADIR]\n"
-"[-w] [-t SECS] [-o \"OPTIONS\"]\n"), 
progname);
+"[-S START-TYPE] [-w] [-t SECS] [-o 
\"OPTIONS\"]\n"), progname);
printf(_("  %s unregister [-N SERVICENAME]\n"), progname);
 #endif
 
@@ -1627,6 +1642,13 @@ do_help(void)
printf(_("  -N SERVICENAME  service name with which to register 
PostgreSQL server\n"));
printf(_("  -P PASSWORD password of account to register PostgreSQL 
server\n"));
printf(_("  -U USERNAME user name of account to register PostgreSQL 
server\n"));
+   printf(_("  -S START-TYPE   service start type to register PostgreSQL 
server,\n"
+   "  can be auto or demand\n"));
+
+   printf(_("\nStart types are:\n"));
+   printf(_("  auto   service start automatically during system 
startup\n"));
+   printf(_("  demand service start on demand\n"));
+
 #endif
 
printf(_("\nReport bugs to .\n"));
@@ -1696,6 +1718,25 @@ set_sig(char *signame)
 
 
 
+#if defined(WIN32) || defined(__CYGWIN__)
+static void
+set_starttype(char *starttypeopt)
+{
+   if (strcmp(starttypeopt, "a") == 0 || strcmp(starttypeopt, "auto") == 0)
+   win32_start_type = PGCTL_START_AUTO;
+   else if (strcmp(starttypeopt, "d") == 0 || strcmp(starttypeopt, 
"demand") == 0)
+   win32_start_type = PGCTL_START_DEMAND;
+   else
+   {
+   write_stderr(_("%s: unrecognized start type \"%s\"\n"), 
progname, starttypeopt);
+   do_advice();
+   exit(1);
+   }
+}
+#endif
+
+
+
 int
 main(int argc, char **argv)
 {
@@ -1772,7 +1813,7 @@ main(int argc, char **argv)
/* process command-line options */
while (optind < argc)
{
-   while ((c = getopt_

Re: [HACKERS] refactoring comment.c

2010-08-19 Thread Alvaro Herrera

Excerpts from Robert Haas's message of mié ago 18 21:32:48 -0400 2010:

> Here's v3.

The header comment in objectaddress.c contains a funny mistake: it says
it works with ObjectAddresses.  However, ObjectAddresses is a different
type altogether, so I recommend not using that as plural for
ObjectAddress.  Maybe "ObjectAddress objects"?  :-D

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Tom Lane

Greg Stark  writes:
> On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule  
> wrote:
>> I am sending a prototype implementation of functions median and
>> percentile. This implementation is very simple and I moved it to
>> contrib for this moment - it is more easy maintainable. Later I'll
>> move it to core.

> So if the entire result set fits in memory it would be nice to use the
> O(n) Quickselect algorithm -- which would only be a small change to
> the existing Quicksort code -- instead of sorting the entire set.

That seems like rather a lot of added infrastructure for functions whose
popularity is at best unknown.  I think we should KISS for the first
implementation.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] FTS wildcard and custom ispell dictionary problem

2010-08-19 Thread darklow

Hello,

 I am using PostgreSQL 8.4 full text search in following way:
 Custom FTS configuration called "dc2" with these dictionaries in
following order for asciihword token: latvian_ispell, english_stem,
russian_stem

 Latvian ispell dictionary contains words with different endings but
same meaning (latvian langiage specifics, plural words, etc)

 The problem starts when using wildcard :* to_tsquery syntax.

 For example. If i look for the word "kriev" i am automatically adding
wildcard using syntax:  to_tsquery('dc2', 'kriev:*');
 By searching kriev:* FTS founds word "krievs" in latvian_ispell
dictionary which is totally okei.

 SELECT * from ts_debug('dc2', 'kriev:*');
   alias   |   description   | token |
dictionaries   |   dictionary   | lexemes
---+-+---+--++--
 asciiword | Word, all ASCII | kriev |
{latvian_ispell,english_stem,russian_stem} | latvian_ispell | {krievs}
 blank | Space symbols   | :*| {}


If understand correctly now database uses not kriev:* but krievs:* for
following queries.
And here is the problem, data contains also word: Krievija, and in
this case search doesn't find it, because it looks for Krievs:* and
not Kriev:* anymore.

Is there any solution anone could suggest to get results by both
criterias - kriev:* (starting query) and krievs:* (founded in ispell
dict).
Only idea i had is to somehow combine two tsqueries one -
to_tsquery('dc2', 'kriev:*') and to_tsquery('english', 'kriev:*'); so
the search looks for both - kriev:* and krievs:* but anyway didnt
figured out any syntax i could use :(

Thanks

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgsSQL EXECUTE USING INTO

2010-08-19 Thread Alvaro Herrera

Excerpts from Heikki Linnakangas's message of jue ago 19 04:29:19 -0400 2010:
> While testing the recent issue with unknown params in EXECUTE USING, I 
> accidentally did this:
> 
> postgres=# DO $$
> DECLARE
>t text;
> BEGIN
>EXECUTE 'SELECT ''foo'' || $1' USING 'bar' INTO t;
>RAISE NOTICE '%', t;
> END;
> $$;
> NOTICE:  
> DO
> 
> The mistake I made? I put the USING and INTO clauses in wrong order, 
> INTO needs to go first. We should throw an error on that, but it looks 
> like the INTO clause is just silently ignored.

Can't we just accept either order?

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Magnus Hagander

On Thu, Aug 19, 2010 at 17:08, Alvaro Herrera
 wrote:
> Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010:
>> On 19/08/10 04:46, Robert Haas wrote:
>
>> >  And so far we haven't seen a patch for that.
>> > Somebody write one.  And then let's get it reviewed and committed RSN.
>>
>> Fujii is on vacation, but I've started working on it. The two issues
>> with Fujii's latest patch are that it would not respond promptly on
>> platforms where signals don't interrupt sleep, and it suffers the
>> classic race condition that pselect() was invented for. I'm going to
>> replace pg_usleep() with select(), and use the so called "self-pipe
>> trick" to get over the race condition. I have that written up but I want
>> to do some testing and cleanup before posting the patch.
>
> Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because
> select() doesn't handle pipes, only sockets.  You may need some extra
> hack to make it work there.

We have a pipe implementation using sockets that is used on Windows
for just this reason, IIRC.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the

2010-08-19 Thread David E. Wheeler

On Aug 19, 2010, at 8:08 AM, Robert Haas wrote:

>> Another possibility is for EXECUTE USING to coerce any unknowns to TEXT
>> before it calls the parser at all.  This would square with the typical
>> default assumption for unknown literals, and it would avoid having to
>> have any semantics changes below the SPI call.
> 
> That seems more intuitive than just chucking an error.

It'd be nice if SPI itself could work this way for UNKNOWNs, too.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Alvaro Herrera

Excerpts from Heikki Linnakangas's message of jue ago 19 02:02:34 -0400 2010:
> On 19/08/10 04:46, Robert Haas wrote:

> >  And so far we haven't seen a patch for that.
> > Somebody write one.  And then let's get it reviewed and committed RSN.
> 
> Fujii is on vacation, but I've started working on it. The two issues 
> with Fujii's latest patch are that it would not respond promptly on 
> platforms where signals don't interrupt sleep, and it suffers the 
> classic race condition that pselect() was invented for. I'm going to 
> replace pg_usleep() with select(), and use the so called "self-pipe 
> trick" to get over the race condition. I have that written up but I want 
> to do some testing and cleanup before posting the patch.

Hmm, IIRC the self-pipe trick doesn't work on Windows, mainly because
select() doesn't handle pipes, only sockets.  You may need some extra
hack to make it work there.

-- 
Álvaro Herrera 
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the

2010-08-19 Thread Robert Haas

On Thu, Aug 19, 2010 at 9:47 AM, Tom Lane  wrote:
> Another possibility is for EXECUTE USING to coerce any unknowns to TEXT
> before it calls the parser at all.  This would square with the typical
> default assumption for unknown literals, and it would avoid having to
> have any semantics changes below the SPI call.

That seems more intuitive than just chucking an error.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wip: functions median and percentile

2010-08-19 Thread Greg Stark

On Thu, Aug 19, 2010 at 11:59 AM, Pavel Stehule  wrote:
> I am sending a prototype implementation of functions median and
> percentile. This implementation is very simple and I moved it to
> contrib for this moment - it is more easy maintainable. Later I'll
> move it to core.

So if the entire result set fits in memory it would be nice to use the
O(n) Quickselect algorithm -- which would only be a small change to
the existing Quicksort code -- instead of sorting the entire set. That
should be possible both for percentile() and median(). It would be
good to generalize the tuplesort api sufficiently so that you can
implement this as a contrib module even if we eventually decide to
integrate it in core. It's probably worth having two copies of the
sort code, one for Quickselect and one for Quicksort just for speed,
though I suppose it's worth benchmarking.

But I'm not aware of a generalization of tapesort to allow O(n)
selection with the sequential i/o properties of tapesort. It would be
especially interesting, probably more so than the in-memory case.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread Pavel Stehule

2010/8/19 David Fetter :
> On Thu, Aug 19, 2010 at 12:45:13PM +0200, Pavel Stehule wrote:
>> Hello
>>
>> > I'll test both variant first. Maybe there are not any significant
>> > difference between them. Now nodeAgg can build, fill a tuplesort.
>> > So I think is natural use it. It needs only one - skip a calling a
>> > transident function and directly call final function with external
>> > tuplesort. Minimally you don't need 2x same code.
>>
>> yesterday I did a small test. Aggregates without transident
>> functions are only about 2% faster, so there has no sense thinking
>> more about them.  I'll send a patch with median and percentile
>> functions immediately - these functions are implemented like usual
>> aggregates.
>
> NTILE is already a windowing function.  Might want to check into any
> performance improvements you can give that.

The performance has to be +/- same. It is based on same technology - tuplesort.

>
> As to median, please make sure you say in detail which median you're
> using and name it so, as there is no single, authoritative median.

I searched about this. And I found so there is a different methods for
same statistic value with different names. But result has to be same.
I don't found a other median than "arithmetic" or "financial" median
(with a two derived forms - left, right median). These methods are a
different forms of some SQL - see
http://www.simple-talk.com/sql/t-sql-programming/median-workbench/ and
it is SQL related solutions. With tuplesort I can to use a simple
solution - I have a number of elements, have  a sorted set and can
move to n position is set.

 Next forms of median what I found are estimations.

regards

Pavel

>
> Cheers,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
> Skype: davidfetter      XMPP: david.fet...@gmail.com
> iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the

2010-08-19 Thread Tom Lane

Heikki Linnakangas  writes:
> I'm starting to wonder if it's worth enforcing the rule that all unknown 
> Params must be coerced to the same target type. We could just document 
> the behavior. Or maybe we should revert the whole thing, and add a check 
> to PL/pgSQL EXECUTE USING code to just throw a nicer error message if 
> you pass an unknown parameter in the USING clause.

+1 for the latter.  There's no good reason to be passing unknowns to USING.
I'm also getting more and more uncomfortable with the amount of new
behavior that's being pushed into an existing SPI call.

Another possibility is for EXECUTE USING to coerce any unknowns to TEXT
before it calls the parser at all.  This would square with the typical
default assumption for unknown literals, and it would avoid having to
have any semantics changes below the SPI call.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread David Fetter

On Thu, Aug 19, 2010 at 12:45:13PM +0200, Pavel Stehule wrote:
> Hello
> 
> > I'll test both variant first. Maybe there are not any significant
> > difference between them. Now nodeAgg can build, fill a tuplesort.
> > So I think is natural use it. It needs only one - skip a calling a
> > transident function and directly call final function with external
> > tuplesort. Minimally you don't need 2x same code.
> 
> yesterday I did a small test. Aggregates without transident
> functions are only about 2% faster, so there has no sense thinking
> more about them.  I'll send a patch with median and percentile
> functions immediately - these functions are implemented like usual
> aggregates.

NTILE is already a windowing function.  Might want to check into any
performance improvements you can give that.

As to median, please make sure you say in detail which median you're
using and name it so, as there is no single, authoritative median.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] CommitFest 2009-07: Yay, Kevin! Thanks, reviewers!

2010-08-19 Thread Tom Lane

Heikki Linnakangas  writes:
> BTW, on what platforms signals don't interrupt sleep? Although that 
> issue has been discussed many times before, I couldn't find any 
> reference to a real platform in the archives.

I've got one in captivity (my old HPUX box).  Happy to test whatever you
come up with.

Considering that pg_usleep is implemented with select, I'm not following
what you mean by "replace pg_usleep() with select()"?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PL/pgsSQL EXECUTE USING INTO

2010-08-19 Thread Robert Haas

On Thu, Aug 19, 2010 at 4:29 AM, Heikki Linnakangas
 wrote:
> While testing the recent issue with unknown params in EXECUTE USING, I
> accidentally did this:
>
> postgres=# DO $$
> DECLARE
>  t text;
> BEGIN
>  EXECUTE 'SELECT ''foo'' || $1' USING 'bar' INTO t;
>  RAISE NOTICE '%', t;
> END;
> $$;
> NOTICE:  
> DO
>
> The mistake I made? I put the USING and INTO clauses in wrong order, INTO
> needs to go first. We should throw an error on that, but it looks like the
> INTO clause is just silently ignored.

Another option would be to make it work as expected.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] wip: functions median and percentile

2010-08-19 Thread Pavel Stehule

Hello

I am sending a prototype implementation of functions median and
percentile. This implementation is very simple and I moved it to
contrib for this moment - it is more easy maintainable. Later I'll
move it to core.

These functions are relative simple, there are not barrier for
implementation own specific mutations of this functions - so I propose
move to core only basic and well known form of these to core.

postgres=# select median(v) from generate_series(1,10) g(v);
 median

5.5
(1 row)

Time: 1.475 ms
postgres=# select percentile(v,50) from generate_series(1,10) g(v);
 percentile

  5
(1 row)

Time: 0.626 ms

This implementation is based on tuplesort and the speed is relative
well - the result from 100 rows is less 1 sec.

Regards

Pavel Stehule
*** ./contrib/median/Makefile.orig	2010-08-19 12:38:56.144777253 +0200
--- ./contrib/median/Makefile	2010-08-18 20:23:39.180156339 +0200
***
*** 0 
--- 1,17 
+ # $PostgreSQL: pgsql/contrib/median/Makefile,v 1.1 2008/07/29 18:31:20 tgl Exp $
+ 
+ MODULES = median
+ DATA_built = median.sql
+ DATA = uninstall_median.sql
+ REGRESS = median
+ 
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = contrib/median
+ top_builddir = ../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
*** ./contrib/median/median.c.orig	2010-08-19 12:39:01.456650776 +0200
--- ./contrib/median/median.c	2010-08-19 12:35:32.104649418 +0200
***
*** 0 
--- 1,244 
+ /*
+  * $PostgreSQL: pgsql/contrib/citext/citext.c,v 1.2 2009/06/11 14:48:50 momjian Exp $
+  */
+ #include "postgres.h"
+ 
+ #include "funcapi.h"
+ #include "miscadmin.h"
+ #include "catalog/pg_type.h"
+ #include "parser/parse_coerce.h"
+ #include "parser/parse_oper.h"
+ #include "utils/builtins.h"
+ #include "utils/tuplesort.h"
+ 
+ Datum median_transfn(PG_FUNCTION_ARGS);
+ Datum median_finalfn(PG_FUNCTION_ARGS);
+ Datum percentile_transfn(PG_FUNCTION_ARGS);
+ Datum percentile_finalfn(PG_FUNCTION_ARGS);
+ 
+ 
+ #ifdef PG_MODULE_MAGIC
+ PG_MODULE_MAGIC;
+ #endif
+ 
+ PG_FUNCTION_INFO_V1(median_transfn);
+ PG_FUNCTION_INFO_V1(median_finalfn);
+ PG_FUNCTION_INFO_V1(percentile_transfn);
+ PG_FUNCTION_INFO_V1(percentile_finalfn);
+ 
+ 
+ typedef struct
+ {
+ 	int	nelems;		/* number of valid entries */
+ 	Tuplesortstate *sortstate;
+ 	FmgrInfo	cast_func_finfo;
+ 	int	p;		/* nth for percentille */
+ } StatAggState;
+ 
+ static StatAggState *
+ makeStatAggState(FunctionCallInfo fcinfo)
+ {
+ 	MemoryContext oldctx;
+ 	MemoryContext aggcontext;
+ 	StatAggState *aggstate;
+ 	Oid	sortop,
+ 			castfunc;
+ 	Oid	   valtype;
+ 	CoercionPathType		pathtype;
+ 
+ 	if (!AggCheckCallContext(fcinfo, &aggcontext))
+ 	{
+ 		/* cannot be called directly because of internal-type argument */
+ 		elog(ERROR, "string_agg_transfn called in non-aggregate context");
+ 	}
+ 
+ 	oldctx = MemoryContextSwitchTo(aggcontext);
+ 	
+ 	aggstate = (StatAggState *) palloc(sizeof(StatAggState));
+ 	aggstate->nelems = 0;
+ 	
+ 	valtype = get_fn_expr_argtype(fcinfo->flinfo, 1);
+ 	get_sort_group_operators(valtype,
+ 		true, false, false,
+ 		&sortop, NULL, NULL);
+ 
+ 	aggstate->sortstate = tuplesort_begin_datum(valtype,
+ 			sortop,
+ 			SORTBY_NULLS_DEFAULT,
+ 			work_mem, false);
+ 
+ 	MemoryContextSwitchTo(oldctx);
+ 
+ 	if (valtype != FLOAT8OID)
+ 	{
+ 		/* find a cast function */
+ 	
+ 		pathtype = find_coercion_pathway(FLOAT8OID, valtype,
+ 	COERCION_EXPLICIT,
+ 	&castfunc);
+ 		if (pathtype == COERCION_PATH_FUNC)
+ 		{
+ 			Assert(OidIsValid(castfunc));
+ 			fmgr_info_cxt(castfunc, &aggstate->cast_func_finfo,
+ 	aggcontext);
+ 		} 
+ 		else if (pathtype == COERCION_PATH_RELABELTYPE)
+ 		{
+ 			aggstate->cast_func_finfo.fn_oid = InvalidOid;
+ 		}
+ 		else 
+ 			elog(ERROR, "no conversion function from %s %s",
+ 	 format_type_be(valtype),
+ 	 format_type_be(FLOAT8OID));
+ 	}
+ 
+ 	return aggstate;
+ }
+ 
+ /*
+  *  append a non NULL value to tuplesort
+  */
+ Datum
+ median_transfn(PG_FUNCTION_ARGS)
+ {
+ 	StatAggState *aggstate;
+ 	
+ 	aggstate = PG_ARGISNULL(0) ? NULL : (StatAggState *) PG_GETARG_POINTER(0);
+ 	
+ 	if (!PG_ARGISNULL(1))
+ 	{
+ 		if (aggstate == NULL)
+ 			aggstate = makeStatAggState(fcinfo);
+ 		
+ 		tuplesort_putdatum(aggstate->sortstate, PG_GETARG_DATUM(1), false);
+ 		aggstate->nelems++;
+ 	}
+ 
+ 	PG_RETURN_POINTER(aggstate);
+ }
+ 
+ static double 
+ to_double(Datum value, FmgrInfo *cast_func_finfo)
+ {
+ 	if (cast_func_finfo->fn_oid != InvalidOid)
+ 	{
+ 		return DatumGetFloat8(FunctionCall1(cast_func_finfo, value));
+ 	}
+ 	else
+ 		return DatumGetFloat8(value);
+ }
+ 
+ Datum
+ median_finalfn(PG_FUNCTION_ARGS)
+ {
+ 	StatAggState *aggstate;
+ 	
+ 	Assert(AggCheckCallContext(fcinfo, NULL));
+ 
+ 	aggstate = PG_ARGISNULL(0) ? NULL : (StatAggState *) PG_GETARG_POINTER(0);
+ 
+ 	if (aggst

Re: [HACKERS] proposal: tuplestore, tuplesort aggregate functions

2010-08-19 Thread Pavel Stehule

Hello

>
> I'll test both variant first. Maybe there are not any significant
> difference between them. Now nodeAgg can build, fill a tuplesort. So I
> think is natural use it. It needs only one - skip a calling a
> transident function and directly call final function with external
> tuplesort. Minimally you don't need 2x same code.

yesterday I did a small test. Aggregates without transident functions
are only about 2% faster, so there has no sense thinking more about
them.  I'll send a patch with median and percentile functions
immediately - these functions are implemented like usual aggregates.

Regards

Pavel

>
> Regards
>
> Pavel Stehule
>
>>                        regards, tom lane
>>
>

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] git: uh-oh

2010-08-19 Thread Magnus Hagander

On Thu, Aug 19, 2010 at 07:00, Michael Haggerty  wrote:
> Magnus Hagander wrote:
>> Is there some way to make cvs2git work this way, and just not bother
>> even trying to create merge commits, or is that fundamentally
>> impossible and we need to look at another tool?
>
> The good news: (I just reminded myself/realized that) Max Bowsher has
> already implemented pretty much exactly what you want in the cvs2svn
> trunk version, including noting in the commit messages any cherry-picks
> that are not reflected in the repo ancestry.

Ah, that's great.


> The bad news: It is broken [1].  But I don't think it should be too much
> work to fix it.

That's less great of course, but it gives hope!

Thanks for your continued efforts!


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] PL/pgsSQL EXECUTE USING INTO

2010-08-19 Thread Heikki Linnakangas

While testing the recent issue with unknown params in EXECUTE USING, I 
accidentally did this:


postgres=# DO $$
DECLARE
  t text;
BEGIN
  EXECUTE 'SELECT ''foo'' || $1' USING 'bar' INTO t;
  RAISE NOTICE '%', t;
END;
$$;
NOTICE:  
DO

The mistake I made? I put the USING and INTO clauses in wrong order, 
INTO needs to go first. We should throw an error on that, but it looks 
like the INTO clause is just silently ignored.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: [COMMITTERS] pgsql: Coerce 'unknown' type parameters to the right type in the

2010-08-19 Thread Heikki Linnakangas


On 18/08/10 18:03, Heikki Linnakangas wrote:

On 18/08/10 16:57, Tom Lane wrote:

hei...@postgresql.org (Heikki Linnakangas) writes:

Log Message:
---
Coerce 'unknown' type parameters to the right type in the fixed-params
parse_analyze() function. That case occurs e.g with PL/pgSQL
EXECUTE ... USING 'stringconstant'.



The coercion with a CoerceViaIO node. The result is similar to the
coercion
via input function performed for unknown constants in coerce_type(),
except that this happens at runtime.


Unfortunately, this entirely fails to enforce the rule that an unknown
Param be coerced the same way everywhere. You'd need a cleanup pass as
well, cf check_variable_parameters().


Yeah, you're right. I'll find a way to do the cleanup pass in fixed
params case too.


It turned out to be messier than I imagined, but I have a working patch 
now. It still doesn't behave exactly like the variable params case, 
though. To wit:


postgres=# DO $$
declare
  t text;
begin
  EXECUTE 'SELECT 1+ $1, $1' INTO t USING '123' ;
  RAISE NOTICE '%', t;
end;
$$;
ERROR:  could not determine data type of parameter $1
LINE 1: SELECT 1+ $1, $1
  ^
QUERY:  SELECT 1+ $1, $1
CONTEXT:  PL/pgSQL function "inline_code_block" line 5 at EXECUTE statement

The varparams code doesn't thrown an error on that. At the first 
reference to $1, the parameter is resolved to int4. On the 2nd 
reference, it's an int4 and there's nothing to force coercion to 
anything else, so it's returned as an int4. In the fixed params case, 
however, that throws an error. We could make it behave the same if we 
really wanted to, but that would add even more code.


I'm starting to wonder if it's worth enforcing the rule that all unknown 
Params must be coerced to the same target type. We could just document 
the behavior. Or maybe we should revert the whole thing, and add a check 
to PL/pgSQL EXECUTE USING code to just throw a nicer error message if 
you pass an unknown parameter in the USING clause.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
diff --git a/src/backend/parser/analyze.c b/src/backend/parser/analyze.c
index 6b99a10..2cb9f31 100644
--- a/src/backend/parser/analyze.c
+++ b/src/backend/parser/analyze.c
@@ -92,6 +92,10 @@ parse_analyze(Node *parseTree, const char *sourceText,
 
 	query = transformStmt(pstate, parseTree);
 
+	/* make sure all is well with parameter types */
+	if (numParams > 0)
+		check_fixed_parameters(pstate, query);
+
 	free_parsestate(pstate);
 
 	return query;
diff --git a/src/backend/parser/parse_param.c b/src/backend/parser/parse_param.c
index 60917f4..7cb34c2 100644
--- a/src/backend/parser/parse_param.c
+++ b/src/backend/parser/parse_param.c
@@ -59,7 +59,10 @@ static Node *variable_coerce_param_hook(ParseState *pstate, Param *param,
 static Node *fixed_coerce_param_hook(ParseState *pstate, Param *param,
 		   Oid targetTypeId, int32 targetTypeMod,
 		   int location);
-static bool check_parameter_resolution_walker(Node *node, ParseState *pstate);
+static bool check_fixed_parameter_resolution_walker(Node *node,
+		ParseState *pstate);
+static bool check_variable_parameter_resolution_walker(Node *node,
+		   ParseState *pstate);
 
 
 /*
@@ -322,6 +325,135 @@ variable_coerce_param_hook(ParseState *pstate, Param *param,
 	return NULL;
 }
 
+
+/*
+ * Check for consistent assignment of unknown parameters after completion
+ * of parsing with parse_fixed_parameters.
+ *
+ * Note: this code intentionally does not check that all parameter positions
+ * were used, nor that all got non-UNKNOWN types assigned.	Caller of parser
+ * should enforce that if it's important.
+ */
+void
+check_fixed_parameters(ParseState *pstate, Query *query)
+{
+	FixedParamState *parstate = (FixedParamState *) pstate->p_ref_hook_state;
+
+	/*
+	 * If parse_fixed_parameters() didn't resolve any unknown types, there's
+	 * nothing to do.
+	 */
+	if (parstate->unknownParamTypes != NULL)
+		(void) query_tree_walker(query,
+ check_fixed_parameter_resolution_walker,
+ (void *) pstate, 0);
+}
+
+/*
+ * Traverse a fully-analyzed tree to verify that all references to unknown
+ * Params are coerced to the same type.  Although we check in
+ * fixed_coerce_param_hook() that an unknown Param is not coerced to different
+ * types at different locations in the query, some Params might still be
+ * uncoerced, if there wasn't anything to force their coercion, and yet other
+ * instances seen later might have gotten coerced.
+ */
+static bool
+check_fixed_parameter_resolution_walker(Node *node, ParseState *pstate)
+{
+	FixedParamState *parstate = (FixedParamState *) pstate->p_ref_hook_state;
+
+	if (node == NULL)
+		return false;
+
+	/*
+	 * Check if this is a CoerceViaIO(Param of type 'unknown') construct,
+	 * created by parse_fixed_parameters(). In theory, it could be a similar
+	 * construct created by other means, but that doesn't currently happen;
+	 * unknown Params needing coerc

75 matches

Mail list logo