Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg

2008-09-05 Thread Gregory Stark

Regarding the patch listed on the commitfest 3 new functions into intarray
and intagg (which I just noticed has a reviewer listed -- doh):

http://archives.postgresql.org/message-id/[EMAIL PROTECTED]

I definitely like the int_array_append_aggregate function but I don't see
anything int[] specific about it. We should be able to have a generic
array_union() aggregate which uses the same IsA(fcinfo-context, AggState)
trick to scribble on its state variable. It don't even see any reason it
couldn't work for arrays of varlenas, though it would take a bit of
restructuring.

So I would be definitely for a adding this to core if it were rewritten to
work with generic arrays which, unless there are problems I'm not seeing, I
don't think would be very hard.

As far as detailed code commentary the only thing which jumps out at me is
that it's using MemoryContextAlloc to grow the array instead of repalloc which
seems like a waste. This isn't a new thing though, it was how intagg was
written and this patch just didn't change it.

I'm not against putting more functions into intagg and intarray and bidx and
the grouping/counting thing seem like they might be useful functionality. but
I have a feeling others might feel differently.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg

2008-09-05 Thread Markus Wanner

Hi,

Gregory Stark wrote:

Regarding the patch listed on the commitfest 3 new functions into intarray
and intagg (which I just noticed has a reviewer listed -- doh):


..well, just add your name as well, no?


I definitely like the int_array_append_aggregate function but I don't see
anything int[] specific about it. We should be able to have a generic
array_union() aggregate which uses the same IsA(fcinfo-context, AggState)
trick to scribble on its state variable. It don't even see any reason it
couldn't work for arrays of varlenas, though it would take a bit of
restructuring.


Yeah, the same idea was bugging me. Doesn't such code already exist?


So I would be definitely for a adding this to core if it were rewritten to
work with generic arrays which, unless there are problems I'm not seeing, I
don't think would be very hard.

As far as detailed code commentary the only thing which jumps out at me is
that it's using MemoryContextAlloc to grow the array instead of repalloc which
seems like a waste. This isn't a new thing though, it was how intagg was
written and this patch just didn't change it.


Oh, good catch.


I'm not against putting more functions into intagg and intarray and bidx and
the grouping/counting thing seem like they might be useful functionality. but
I have a feeling others might feel differently.


The naming 'bidx' seems a bit weired to me, but otherwise I'm also 
optimistic about it.


Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg

2008-09-05 Thread Gregory Stark
Markus Wanner [EMAIL PROTECTED] writes:

 Hi,

 Gregory Stark wrote:
 Regarding the patch listed on the commitfest 3 new functions into intarray
 and intagg (which I just noticed has a reviewer listed -- doh):

 ..well, just add your name as well, no?

Yeah, people should feel free to comment on items even if other people have or
are as well. It would have just been more useful for me to pick one that
didn't already have someone reading it is all.

 As far as detailed code commentary the only thing which jumps out at me is
 that it's using MemoryContextAlloc to grow the array instead of repalloc 
 which
 seems like a waste. This isn't a new thing though, it was how intagg was
 written and this patch just didn't change it.

 Oh, good catch.

Actually on further thought what's going on is that it's resizing the array by
doubling its size when necessary. palloc/repalloc does that as well, so you're
getting two layers of extra space to handle reallocations. I think it's
simpler and cleaner to just repalloc just enough space at each growth and let
repalloc handle allocating extra space to handle future growth. I think that
would be necessary for handling varlenas also.

 The naming 'bidx' seems a bit weired to me, but otherwise I'm also optimistic
 about it.

If we wanted to put that in core it would make more sense to have a flag on
the array indicating whether it's sorted or not which is maintained whenever
we construct or alter an array. Then just have the regular _int_contains()
(which is an operator @) take advantage of it if it's set and the data type
is fixed-size.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg

2008-09-05 Thread Markus Wanner

Hi,

Gregory Stark wrote:

The naming 'bidx' seems a bit weired to me, but otherwise I'm also optimistic
about it.


If we wanted to put that in core


Uh.. ATM it's just a patch against contrib. I don't think 'bidx' needs 
to go into core. Otherwise we'd have to move the whole intarr contrib 
module as well, no?



it would make more sense to have a flag on
the array indicating whether it's sorted or not which is maintained whenever
we construct or alter an array. Then just have the regular _int_contains()
(which is an operator @) take advantage of it if it's set and the data type
is fixed-size.


Yeah, that sounds like a good thing. Currently, the intarr module 
doesn't provide the optimized functions for the outside world 
(_int_inter_inner() and such.. the _inner appendix really means inside 
intarr module only). I've ended up copy'n'pasting that code into my own 
module, where I take care about ordering myself. But still want to 
maintain the optimization.


However, that's probably not within the scope of this patch.

Regards

Markus Wanner



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg

2008-09-05 Thread Markus Wanner

Hi,

Gregory Stark wrote:

I definitely like the int_array_append_aggregate function but I don't see
anything int[] specific about it. We should be able to have a generic
array_union() aggregate which uses the same IsA(fcinfo-context, AggState)
trick to scribble on its state variable. It don't even see any reason it
couldn't work for arrays of varlenas, though it would take a bit of
restructuring.


I've just noticed that this already is a todo item:

  Add array_accum() and array_to_set() functions for arrays

The standards specify array_agg() and UNNEST.

See also:
http://archives.postgresql.org/pgsql-hackers/2007-08/msg00464.php

Regards

Markus Wanner

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg

2008-08-14 Thread Zdenek Kotala

Dmitry Koterov napsal(a):

Hello.

Here are these functions with detailed documentation:
http://en.dklab.ru/lib/dklab_postgresql_patch/


Added to next commit fest patch list.

Zdenek

--
Zdenek Kotala  Sun Microsystems
Prague, Czech Republic http://sun.com/postgresql


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Patch: propose to include 3 new functions into intarray and intagg

2008-08-13 Thread Dmitry Koterov
Hello.

Here are these functions with detailed documentation:
http://en.dklab.ru/lib/dklab_postgresql_patch/

intagg.int_array_append_aggregate(int[]): fast merge arrays into one large
list
intarray._int_group_count_sort(int[], bool): frequency-based sorting
intarray.bidx(int[], int): binary search in a sorted array

Tested for about a year on a real PostgreSQL cluster (10 machines, Slony
replication) under a heavy load (millions of requests).
No crash nor memory problem detected during a year, so I suppose these
functions are well-tested.

What do you think about that?