[HACKERS] PATCH: CITEXT 2.0 v2

2008-07-05 Thread David E . Wheeler

On Jun 27, 2008, at 18:22, David E. Wheeler wrote:

Please find attached a patch adding a locale-aware, case-insensitive  
text type, called citext, as a contrib module.


Here is a new version of the patch, with the following changes:

* Fixed formatting to be more like core.
* Added appropriate NEGATORs to operators.
* Removed NEGATOR from the || operator.
* Added hash index function and operator class.
* The = operator now supports HASHES and MERGES.
* citext_cmp and citextcmp both return int32.
* Changed // comments to /* comments */.
* Added test confirming láska'::citext <> 'laská'::citext.
* A few other organizational, formatting, and pasto fixes.
* Updated the FAQ entry on case-insensitive queries to recommend  
citext (it would, of course, need to be translated).


Stuff I was asked about but didn't change:

* citext_cmp() still uses varstr_cmp() instead of strncmp(). When I  
tried the latter, everything seemed to be equivalent.
* citext_smaller() and citext_larger() don't have memory leaks, says  
Tom, so I added no calls to PG_FREE_IF_COPY().


Thank you everyone for your feedback and suggestions!

Best,

David



citext2.patch.gz
Description: GNU Zip compressed data





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest rules

2008-07-05 Thread Tom Lane
Robert Treat <[EMAIL PROTECTED]> writes:
> Hmm, looks like some of the things I was thinking about have been added 
> recenelt... cool. One question I have still remains though, on the main 
> developer page (http://wiki.postgresql.org/wiki/Development_information) it 
> has a link to the "current commitfest", which points to september's 
> commitfest page. ISTM the current commitfest is July's, since that's the one 
> we're currently working on.

The meaning of "current commitfest" as used on that page is "the place
you should submit a new patch today".  I agree there's a terminological
problem here, and we need to somehow distinguish that meaning from "the
commitfest we are currently trying to close out".  But you are not
helping matters by trying to eliminate the distinction.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] time_stamp type

2008-07-05 Thread Tom Lane
"Stephen R. van den Berg" <[EMAIL PROTECTED]> writes:
> What's the deal with this type?

It's a domain over timestamptz, as required by the SQL spec definition
of the information_schema.

postgres=# \dD information_schema.time_stamp
  List of domains
   Schema   |Name|Type |
  Modifier  | Check 
++-++---
 information_schema | time_stamp | timestamp(2) with time zone | default 
('now'::text)::timestamp(2) with time zone | 
(1 row)

[ re-reads spec... ]  Hm, actually the spec is self-contradictory here:
SQL99 20.7 saith

 CREATE DOMAIN TIME_STAMP AS TIMESTAMP (2)
 DEFAULT CURRENT_TIMESTAMP(2);

which appears to imply that TIME_STAMP is a domain over timestamp
*without* time zone ... but that is contradicted by the specification
that the default is CURRENT_TIMESTAMP, which yields a value *with*
time zone.  (LOCALTIMESTAMP is the function that should have been
mentioned if they really meant without time zone.)

[ pokes further... ]  Hmm, last year's SQL200n draft saith

CREATE DOMAIN TIME_STAMP AS TIMESTAMP(2) WITH TIME ZONE;

with no mention of a default.  I do wish these people could make
up their minds.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCHES] [HACKERS] Solaris ident authentication using unix domain sockets

2008-07-05 Thread Tom Lane
Andrew Dunstan <[EMAIL PROTECTED]> writes:
> Robert Treat wrote:
>> Hmm... I've always been told that Solaris didn't support this because the 
>> Solaris developers feel that IDENT is inherently insecure.

> We don't actually use the Ident protocol for Unix sockets on any 
> platform.

Indeed.  If the Solaris folk feel that getupeercred() is insecure,
they had better explain why their kernel is that broken.  This is
entirely unrelated to the known shortcomings of the "ident" IP
protocol.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-05 Thread David E. Wheeler

On Jul 5, 2008, at 08:13, Tom Lane wrote:


Stupid question: What would the negation of || actually be? There
isn't one is, there?


Per the docs, NEGATOR is only sensible for operators returning  
boolean.


Message received. Many thanks, Tom, as usual.

Best,

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-05 Thread David E. Wheeler

On Jul 5, 2008, at 02:58, Gregory Stark wrote:

do I need to worry about memory leaks in citext_eq, citext_ne,  
citext_gt,

etc.,


yes


Thanks.

David

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-05 Thread David E. Wheeler

On Jul 5, 2008, at 02:58, Gregory Stark wrote:


   txt = cilower( PG_GETARG_TEXT_PP(0) );
   str = VARDATA_ANY(txt);

   result = hash_any((unsigned char *) str, VARSIZE_ANY_EXHDR(txt));


I thought your data type implemented a locale dependent collation,  
not just

a case insensitive collation. That is, does this hash agree with your
citext_eq on strings like "foo bar" <=> "foobar" and "fooß" <=>  
"fooss" ?


CITEXT is basically intended to replace all those queries that do  
`WHERE LOWER(col) = LOWER(?)` by doing it internally. That's it. It's  
locale-aware to the same extent that `LOWER()` is (and that citext 1.0  
is not, since it only compares ASCII characters case-insensitively).  
And I expect that it does, in fact, agree with your examples, in that  
all the current tests for = and <> pass:


try=# select 'foo bar' = 'foobar';
 ?column?
--
 f

try=# SELECT 'fooß' = 'fooss';
 ?column?
--
 f


You may have to use strxfrm


In the patch against CVS HEAD, it uses str_tolower() in formatting.c.

Best,

David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] A Windows x64 port of PostgreSQL

2008-07-05 Thread Ron Mayer

chris wrote:

 C++0x standards
committee where they finalized long long as being required to be 8


AFAIK, we oughtn't care what C++ standards say, because PostgreSQL is
implemented in C, and therefore needs to follow what the *C* standards
say.


I agree the C++ standards should matter one bit to postgresql, but
AFAIK C99 also says "long long" is at least 64 bits too -- but if
we're talking C99, we'd be better off using whichever of int64_t
or int_least64_t or int_fast64_t we really meant anyway.  Since we
don't I assume we're trying to be compatible with pre-c99 C too
which AFAICT means you can't assume much about "long long" either.
Pre-C99 you can't really count on much.  I've spent time where "int"
was 20 bits; and on another platform where int was 32 bits and
long 40 bits.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [PATCHES] [HACKERS] Solaris ident authentication using unix domain sockets

2008-07-05 Thread Andrew Dunstan



Robert Treat wrote:

On Thursday 03 July 2008 14:01:22 Tom Lane wrote:
  

Garick Hamlin <[EMAIL PROTECTED]> writes:


I have a patch that I have been using to support postgresql's
notion of ident authentication when using unix domain sockets on
Solaris.  This patch basically just adds support for using
getupeercred() on Solaris so unix sockets and ident auth works just
like it does on Linux and elsewhere.
  

Cool.




Hmm... I've always been told that Solaris didn't support this because the 
Solaris developers feel that IDENT is inherently insecure. If that is more 
than just a philosphical opinion, I wonder if there should be additional 
hurdles in place to enable this on that platform. Note that isn't an 
objection from me, though I'm curious if any of the Sun guys want to chime in 
on this. 

  



We don't actually use the Ident protocol for Unix sockets on any 
platform. AIUI, this patch just implements what we do on platforms like 
Linux or *BSD.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest rules

2008-07-05 Thread Robert Treat
On Saturday 05 July 2008 18:07:46 Robert Treat wrote:
> On Thursday 03 July 2008 15:12:08 Joshua D. Drake wrote:
> > On Thu, 2008-07-03 at 20:06 +0100, Dave Page wrote:
> > > On Thu, Jul 3, 2008 at 8:02 PM, Marko Kreen <[EMAIL PROTECTED]> wrote:
> > > > On 7/3/08, Dave Page <[EMAIL PROTECTED]> wrote:
> > > >> it concerns me that despite it being day 3 of the July commit fest,
> > > >>  people are still being advised to add new items to the wiki page.
> > > >>
> > > >>  So please - new patches to the September  page!
> > > >
> > > > But updates to existing patches should be ok?
> > >
> > > Yes.
> >
> > Perhaps this would be helpful:
> >
> > http://wiki.postgresql.org/wiki/CommitFest:Help
>
> Actually, I think we need to be better about linking to that and the
> various commmitfest pages within the wiki itself. Right now it's a bit
> disorienting trying to find your way around within the wiki itself
> well, maybe I will get some time to fix that in the next couple of days.
>

Hmm, looks like some of the things I was thinking about have been added 
recenelt... cool. One question I have still remains though, on the main 
developer page (http://wiki.postgresql.org/wiki/Development_information) it 
has a link to the "current commitfest", which points to september's 
commitfest page. ISTM the current commitfest is July's, since that's the one 
we're currently working on. Also, if you're looking to submit a new patch, 
the July page also has a link to Septembers page, so you end up where you 
need to be; but if you want to see what is currently being worked on, going 
to Septembers page has no link to July's page, so you end up having to hit 
the search engine to find the right page. So, I'm think that the first link 
should point to whichever commitfest page is either actively being worked on, 
or will be next to be worked on (the link being updated next when the July 
fest closes). Any objections?

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH RECURSIVE updated to CVS TIP

2008-07-05 Thread David Fetter
On Sat, Jul 05, 2008 at 10:43:57AM +0200, Hans-Juergen Schoenig wrote:
> hello david,
>
> i did some quick testing with this wonderful patch.
> it seems there are some flaws in there still:
>
> test=# explain select count(*)
> test-# from ( WITH RECURSIVE t(n) AS ( SELECT 1 UNION ALL SELECT 
> DISTINCT n+1 FROM t )
> test(# SELECT * FROM t WHERE n < 50) as t
> test-# WHERE n < 100;
> server closed the connection unexpectedly
> This probably means the server terminated abnormally
> before or while processing the request.
> The connection to the server was lost. Attempting reset: Failed.
> !> \q
>
> this one will kill the planner :(
> removing the (totally stupid) distinct avoids the core dump.

Any idea why this might be happening?

> i found one more issue;
>
> -- broken: wrong result
> test=# select count(*) from ( WITH RECURSIVE t(n) AS (
> SELECT 1 UNION ALL SELECT n + 1 FROM t)
> SELECT * FROM t WHERE n < 50) as t WHERE n < (
> select count(*) from ( WITH RECURSIVE t(n) AS (
> SELECT 1 UNION ALL SELECT n + 1 FROM t )
> SELECT * FROM t WHERE n < 50) as t WHERE n < 100) ;
>  count
> ---
>  1
> (1 row)
>
> if i am not totally wrong, this should give us a different result.

What result should it give, and what do you think is going wrong here?

> i am looking forward to see this patch in core :).

So am I :)

> it is simply wonderful ...
>
>   many thanks,

Thanks go to the kind people who actually wrote the thing.  I've just
been using git to keep the bit-rot off it :)

Cheers,
David.
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest rules

2008-07-05 Thread Robert Treat
On Thursday 03 July 2008 18:16:38 Dave Page wrote:
> On Thu, Jul 3, 2008 at 10:37 PM, Tom Lane <[EMAIL PROTECTED]> wrote:
> > "Merlin Moncure" <[EMAIL PROTECTED]> writes:
> >> Are you suggesting that omission of a patch on the 'fest' page means
> >> that you are bumped from the fest?
> >
> > No, if you had submitted the patch on time then the correct next step
> > is to get it added to the fest page; I don't think that should be
> > controversial.  But the reviewers aren't gonna review it if it's not
> > listed on that page...
>
> Right, but the author should take some responsibility for ensuring the
> patch is listed on time. What we don't want is forgotten patches
> getting added at the last minute, right as the CommitFest manager is
> wrapping things up having got 95% of the patches reviewed and the
> other 5% in progress.
>

I think people are still working there way through the process, but it's 
starting to sound like submitting a patch involves two steps from now on; 
email to the list, and add your patch to the next commitfest page. Does that 
sound right? 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CommitFest rules

2008-07-05 Thread Robert Treat
On Thursday 03 July 2008 15:12:08 Joshua D. Drake wrote:
> On Thu, 2008-07-03 at 20:06 +0100, Dave Page wrote:
> > On Thu, Jul 3, 2008 at 8:02 PM, Marko Kreen <[EMAIL PROTECTED]> wrote:
> > > On 7/3/08, Dave Page <[EMAIL PROTECTED]> wrote:
> > >> it concerns me that despite it being day 3 of the July commit fest,
> > >>  people are still being advised to add new items to the wiki page.
> > >>
> > >>  So please - new patches to the September  page!
> > >
> > > But updates to existing patches should be ok?
> >
> > Yes.
>
> Perhaps this would be helpful:
>
> http://wiki.postgresql.org/wiki/CommitFest:Help
>

Actually, I think we need to be better about linking to that and the various 
commmitfest pages within the wiki itself. Right now it's a bit disorienting 
trying to find your way around within the wiki itself well, maybe I will 
get some time to fix that in the next couple of days. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Solaris ident authentication using unix domain sockets

2008-07-05 Thread Robert Treat
On Thursday 03 July 2008 14:01:22 Tom Lane wrote:
> Garick Hamlin <[EMAIL PROTECTED]> writes:
> > I have a patch that I have been using to support postgresql's
> > notion of ident authentication when using unix domain sockets on
> > Solaris.  This patch basically just adds support for using
> > getupeercred() on Solaris so unix sockets and ident auth works just
> > like it does on Linux and elsewhere.
>
> Cool.
>

Hmm... I've always been told that Solaris didn't support this because the 
Solaris developers feel that IDENT is inherently insecure. If that is more 
than just a philosphical opinion, I wonder if there should be additional 
hurdles in place to enable this on that platform. Note that isn't an 
objection from me, though I'm curious if any of the Sun guys want to chime in 
on this. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-05 Thread Simon Riggs

On Sun, 2008-07-06 at 03:40 +0900, H.Harada wrote:
> Hi,
> 
> 2008/7/6 Simon Riggs <[EMAIL PROTECTED]>:
> >
> > On Sat, 2008-07-05 at 16:20 +0200, Martijn van Oosterhout wrote:
> >> On Sat, Jul 05, 2008 at 07:04:29PM +0900, H.Harada wrote:
> >
> >> > http://umitanuki.net/pgsql/wfv01/design.html
> >> >
> >> > The problem is, as written in the "Things to discussed" section of the
> >> > document, how you define window functions (e.g. RANK()). My idea is to
> >> > treat them as specialized functions such as SET OF functions and mark
> >> > it in pg_proc. But this doesn't resolve RANK() boundary problem.
> >>
> >> Actually, I would make RANK() and ROW_NUMBER() act more like
> >> aggregates. ISTM you have two kinds of window functions:
> >>
> >> - aggregation: a result is calculated over a set and the result copied
> >>   across all the rows.
> >> - order depenadant: same as above, but the result is different for each
> >>   row.
> >>
> >> I think you could make the latter work using the current aggregation
> >> setup, just by calling the final_func for each row rather than just
> >> once at the end.
> >
> > AFAICS there's no overlap between windowed aggregates and normal
> > aggregates, so we can different infrastructure for each. I like the
> > suggestion of doing it very similarly to current aggregates, but I would
> > introduce a new function hook for windowed aggregates, wfunc.
> 
> I think there are two types of functions for windowed mode.
> - windowed aggregate
> this type of function is exactly same as normal aggregate. So we use
> functions that have been in pgsql already. Actually in my patch above,
> I didn't introduce any new function. This type of function includes
> simply sum(), avg(), etc. which returns same values on a partition or
> a window frame.
> 
> - windowed function
> this is the NEW type of function. I guess we should add a new function
> type to pgsql. This type of function includes rank(), rank_dense(),
> row_number(), etc. Windowed functions returns different values per
> tuple.
> 
> The difference between two types is if the function returns the same
> value during a partition or different values.
> 
> So, windowed aggregate and normal aggregate overlap each other. How
> you know which one is that you see OVER clause in SQL just after the
> function call. When you see OVER after func(), and pg_proc says it's
> an aggregate, it's a windowed aggregate. Otherwise, it's a windowed
> function.
> 
> If I misunderstood about those definitions please correct me.

Yes, I understand that and I think Martijn does also.

I've done some thinking and rooting around on this and I think I have a
different proposal for you, different to what we just discussed.

SQL2008 specifies window functions as

* rank functions
* distribution functions: percent_rank() and cume_dist()
* rownumber()
* ntile()
* lead() and lag()
* first, last and n-th value functions
* inverse distribution functions (similar to n-th value, based upon
distribution function results)

plus window aggregate functions (the normal aggregates COUNT, SUM etc)

Now looking through all of those, I don't see *any* window functions
that need access to different datatypes, or actually need to see the
values of the attributes. 

The normal aggregates work with windows identically to the way they do
without windows, so no change needed there.

AFAICS we could define all of the non-aggregate window functions on the
above list *without* defining them as functions in pg_proc. That would
be a benefit because the window functions are very powerful and we'd
need to give them access to any/all tuples in the window.

So that would mean we don't provide a mechanism for user-defined
windowed aggregate functions at all. Which solves the discussion about
how to pass generic info through to them (at least long enough to get
the first implementation done).

We do already have such functions in code, e.g. greatest(). Sure they
need to be defined in code, but we don't need to come up with a generic
API for them.

If you disagree, think about how we'd implement lag() or ntile() and
what info we'd need to pass them.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-05 Thread H . Harada
2008/7/5 Martijn van Oosterhout <[EMAIL PROTECTED]>:
> On Sat, Jul 05, 2008 at 07:04:29PM +0900, H.Harada wrote:
>> Hi,
>>
>> As I proposed a month before, I am working on window function.
>
> Very nice!
>
>> http://umitanuki.net/pgsql/wfv01/design.html
>>
>> The problem is, as written in the "Things to discussed" section of the
>> document, how you define window functions (e.g. RANK()). My idea is to
>> treat them as specialized functions such as SET OF functions and mark
>> it in pg_proc. But this doesn't resolve RANK() boundary problem.
>
> Actually, I would make RANK() and ROW_NUMBER() act more like
> aggregates. ISTM you have two kinds of window functions:
>
> - aggregation: a result is calculated over a set and the result copied
>  across all the rows.
> - order depenadant: same as above, but the result is different for each
>  row.

So I agree the definition of these two types.

> I think you could make the latter work using the current aggregation
> setup, just by calling the final_func for each row rather than just
> once at the end.

How do you know which type of two above is used in the same SQL syntax?


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-05 Thread H . Harada
Hi,

2008/7/6 Simon Riggs <[EMAIL PROTECTED]>:
>
> On Sat, 2008-07-05 at 16:20 +0200, Martijn van Oosterhout wrote:
>> On Sat, Jul 05, 2008 at 07:04:29PM +0900, H.Harada wrote:
>
>> > http://umitanuki.net/pgsql/wfv01/design.html
>> >
>> > The problem is, as written in the "Things to discussed" section of the
>> > document, how you define window functions (e.g. RANK()). My idea is to
>> > treat them as specialized functions such as SET OF functions and mark
>> > it in pg_proc. But this doesn't resolve RANK() boundary problem.
>>
>> Actually, I would make RANK() and ROW_NUMBER() act more like
>> aggregates. ISTM you have two kinds of window functions:
>>
>> - aggregation: a result is calculated over a set and the result copied
>>   across all the rows.
>> - order depenadant: same as above, but the result is different for each
>>   row.
>>
>> I think you could make the latter work using the current aggregation
>> setup, just by calling the final_func for each row rather than just
>> once at the end.
>
> AFAICS there's no overlap between windowed aggregates and normal
> aggregates, so we can different infrastructure for each. I like the
> suggestion of doing it very similarly to current aggregates, but I would
> introduce a new function hook for windowed aggregates, wfunc.

I think there are two types of functions for windowed mode.
- windowed aggregate
this type of function is exactly same as normal aggregate. So we use
functions that have been in pgsql already. Actually in my patch above,
I didn't introduce any new function. This type of function includes
simply sum(), avg(), etc. which returns same values on a partition or
a window frame.

- windowed function
this is the NEW type of function. I guess we should add a new function
type to pgsql. This type of function includes rank(), rank_dense(),
row_number(), etc. Windowed functions returns different values per
tuple.

The difference between two types is if the function returns the same
value during a partition or different values.

So, windowed aggregate and normal aggregate overlap each other. How
you know which one is that you see OVER clause in SQL just after the
function call. When you see OVER after func(), and pg_proc says it's
an aggregate, it's a windowed aggregate. Otherwise, it's a windowed
function.

If I misunderstood about those definitions please correct me.

> Denserank is fairly simple
>
> CREATE AGGREGATE denserank()
> (
>sfunc = increment
>stype = bigint
>initcond = 0
> )

I think this is ROW_NUMBER(), not DENSE_RANK(), isn't it?

> rank() is fairly complex because the state data must track 3 things:
> * the number of tuples seen so far (bigint)
> * the value of the last tuple seen (anyelement)
> * the rank of the last tuple seen (bigint)
>
> sfunc would compare the new value with the last value, if they match
> then we return the rank of the last tuple. If they don't match then we
> set the stored value and rank, then return the number of tuples seen so
> far as the rank.

Yeah, I think RANK() needs to know some or all of tuple of current
row. But it seems not to take any argument, which is the "boundary
problem" I said. Definitely RANK() or windowed function should know
about current tuple so that when to increment rank. But how? As
explicit argument? or specialized method?


-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-05 Thread Simon Riggs

On Sat, 2008-07-05 at 16:20 +0200, Martijn van Oosterhout wrote:
> On Sat, Jul 05, 2008 at 07:04:29PM +0900, H.Harada wrote:

> > http://umitanuki.net/pgsql/wfv01/design.html
> > 
> > The problem is, as written in the "Things to discussed" section of the
> > document, how you define window functions (e.g. RANK()). My idea is to
> > treat them as specialized functions such as SET OF functions and mark
> > it in pg_proc. But this doesn't resolve RANK() boundary problem.
> 
> Actually, I would make RANK() and ROW_NUMBER() act more like
> aggregates. ISTM you have two kinds of window functions:
> 
> - aggregation: a result is calculated over a set and the result copied
>   across all the rows.
> - order depenadant: same as above, but the result is different for each
>   row.
> 
> I think you could make the latter work using the current aggregation
> setup, just by calling the final_func for each row rather than just
> once at the end.

AFAICS there's no overlap between windowed aggregates and normal
aggregates, so we can different infrastructure for each. I like the
suggestion of doing it very similarly to current aggregates, but I would
introduce a new function hook for windowed aggregates, wfunc.

i.e. to create a windowed aggregate you would do

CREATE AGGREGATE window_func() 
(
sfunc = ...
stype = ...
wfunc = ...
initcond = 
)

For each row we would execute the transition function (sfunc) then, if
there is a window function (wfunc) then we call that to return a value
for this tuple (so in that case we execute two functions per tuple in
the window). If wfunc is not set then we return the transition datatype
itself.

Doing it this way

* it will be clear which aggregates are windowed and which non-windowed,
so we can avoid errors running a windowed aggregate in a non-windowed
context

* it also allows us to avoid executing two functions when the windowed
function is very simple - denserank() for example just returns the
number of rows seen so far in the window.

Denserank is fairly simple

CREATE AGGREGATE denserank()
(
sfunc = increment
stype = bigint
initcond = 0
)

rank() is fairly complex because the state data must track 3 things:
* the number of tuples seen so far (bigint)
* the value of the last tuple seen (anyelement)
* the rank of the last tuple seen (bigint)

sfunc would compare the new value with the last value, if they match
then we return the rank of the last tuple. If they don't match then we
set the stored value and rank, then return the number of tuples seen so
far as the rank.

> That would make RANK() a normal aggrgate which returns the number of
> distinct values seen so far (assuming input is ordered) and
> ROW_NUMBER() is just an alias for COUNT().

> I hope this is clear, let me know if it doesn't make sense.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] time_stamp type

2008-07-05 Thread Stephen R. van den Berg
What's the deal with this type?  Is it used internally?
It's the only type that seems to have anything meaningfull in
typdefaultbin and typdefault columns in pg_type.
-- 
Sincerely,
   Stephen R. van den Berg.

WARNING: Do not look into laser with remaining eye

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-05 Thread Tom Lane
"David E. Wheeler" <[EMAIL PROTECTED]> writes:
>>> Also OPERATOR || has probably wrong negator.
>> 
>> Right, good catch.

> Stupid question: What would the negation of || actually be? There  
> isn't one is, there?

Per the docs, NEGATOR is only sensible for operators returning boolean.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] introduction of WIP window function patch

2008-07-05 Thread Martijn van Oosterhout
On Sat, Jul 05, 2008 at 07:04:29PM +0900, H.Harada wrote:
> Hi,
> 
> As I proposed a month before, I am working on window function.

Very nice!

> http://umitanuki.net/pgsql/wfv01/design.html
> 
> The problem is, as written in the "Things to discussed" section of the
> document, how you define window functions (e.g. RANK()). My idea is to
> treat them as specialized functions such as SET OF functions and mark
> it in pg_proc. But this doesn't resolve RANK() boundary problem.

Actually, I would make RANK() and ROW_NUMBER() act more like
aggregates. ISTM you have two kinds of window functions:

- aggregation: a result is calculated over a set and the result copied
  across all the rows.
- order depenadant: same as above, but the result is different for each
  row.

I think you could make the latter work using the current aggregation
setup, just by calling the final_func for each row rather than just
once at the end.

That would make RANK() a normal aggrgate which returns the number of
distinct values seen so far (assuming input is ordered) and
ROW_NUMBER() is just an alias for COUNT().

I hope this is clear, let me know if it doesn't make sense.

Have a nice day,
-- 
Martijn van Oosterhout   <[EMAIL PROTECTED]>   http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while 
> boarding. Thank you for flying nlogn airlines.


signature.asc
Description: Digital signature


[HACKERS] introduction of WIP window function patch

2008-07-05 Thread H . Harada
Hi,

As I proposed a month before, I am working on window function.
Although this work is at quite early step, I would like to introduce
it since a part of it have been finished. If you can afford and are
interested in it, please review the document and patch, or compile the
applied source to execute an attached sample SQL.

http://umitanuki.net/pgsql/wfv01/design.html

Currently, only aggregation over window does work. I am planning to
work for the combination of window and normal aggregation from now on,
which I guess I can manage to do.

The problem is, as written in the "Things to discussed" section of the
document, how you define window functions (e.g. RANK()). My idea is to
treat them as specialized functions such as SET OF functions and mark
it in pg_proc. But this doesn't resolve RANK() boundary problem.

I am so happy with any kind of comments, reviews or critiques.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-05 Thread Gregory Stark
"David E. Wheeler" <[EMAIL PROTECTED]> writes:

> do I need to worry about memory leaks in citext_eq, citext_ne, citext_gt,
> etc.,

yes

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PATCH: CITEXT 2.0

2008-07-05 Thread Gregory Stark
"David E. Wheeler" <[EMAIL PROTECTED]> writes:

> On Jul 3, 2008, at 09:53, Alvaro Herrera wrote:
>
>>> Thanks. What would citext_hash() look like? I don't see a  text_hash() to
>>> borrow from anywhere in src/.
>>
>> See hash_any().  I assume the difficulty is making sure that
>> hash("FOO") = hash("foo") ...
>
> Great, big help, thank you. So does this look sensible?
>
> txt = cilower( PG_GETARG_TEXT_PP(0) );
> str = VARDATA_ANY(txt);
>
> result = hash_any((unsigned char *) str, VARSIZE_ANY_EXHDR(txt));

I thought your data type implemented a locale dependent collation, not just
a case insensitive collation. That is, does this hash agree with your
citext_eq on strings like "foo bar" <=> "foobar" and "fooß" <=> "fooss" ?

You may have to use strxfrm

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [COMMITTERS] pgsql: Fix a couple of bugs in win32 shmem name generation: * Don't cut

2008-07-05 Thread Magnus Hagander
Alvaro Herrera wrote:
> Magnus Hagander wrote:
>> Tom Lane wrote:
> 
>>> According to what you just told me, the original coding is storing the
>>> name in a "local namespace", which presumably means it won't conflict
>>> anyway.  Ergo, the existing coding is simply broken and there's nothing
>>> we can do about it.
>> Local namespace = Session local, not process local. So it would properly
>> protect against two processes started in the same session. One session
>> is, for example, an interactive login. But not if they were started by
>> different users, since they'd be in different sessions.
> 
> But those different users would not have access to the same set of
> files, so it wouldn't work anyway, right?

Depends on what permissions you set on the directory, obviously
Default ones depend on windows version and where in the filesystem they go.

//Magnus

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITH RECURSIVE updated to CVS TIP

2008-07-05 Thread Hans-Juergen Schoenig

hello david,

i did some quick testing with this wonderful patch.
it seems there are some flaws in there still:

test=# explain select count(*)
test-# from ( WITH RECURSIVE t(n) AS ( SELECT 1 UNION ALL  
SELECT DISTINCT n+1 FROM t )

test(# SELECT * FROM t WHERE n < 50) as t
test-# WHERE n < 100;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!> \q

this one will kill the planner :(
removing the (totally stupid) distinct avoids the core dump.


i found one more issue;

-- broken: wrong result
test=# select count(*) from ( WITH RECURSIVE t(n) AS (
SELECT 1 UNION ALL SELECT n + 1 FROM t)
SELECT * FROM t WHERE n < 50) as t WHERE n < (
select count(*) from ( WITH RECURSIVE t(n) AS (
SELECT 1 UNION ALL SELECT n + 1 FROM t )
SELECT * FROM t WHERE n < 50) as t WHERE n < 100) ;
 count
---
 1
(1 row)

if i am not totally wrong, this should give us a different result.

i am looking forward to see this patch in core :).
it is simply wonderful ...

many thanks,

hans






On Jul 3, 2008, at 1:11 AM, David Fetter wrote:


Folks,

Please find patch enclosed, including some documentation.

Can we see about getting this in this commitfest?

Cheers,
David.
--
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/ 
donate

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers




--
Cybertec Schönig & Schönig GmbH
PostgreSQL Solutions and Support
Gröhrmühlgasse 26, 2700 Wiener Neustadt
Tel: +43/1/205 10 35 / 340
www.postgresql-support.de, www.postgresql-support.com