Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg
Regarding the patch listed on the commitfest 3 new functions into intarray and intagg (which I just noticed has a reviewer listed -- doh): http://archives.postgresql.org/message-id/[EMAIL PROTECTED] I definitely like the int_array_append_aggregate function but I don't see anything int[] specific about it. We should be able to have a generic array_union() aggregate which uses the same IsA(fcinfo-context, AggState) trick to scribble on its state variable. It don't even see any reason it couldn't work for arrays of varlenas, though it would take a bit of restructuring. So I would be definitely for a adding this to core if it were rewritten to work with generic arrays which, unless there are problems I'm not seeing, I don't think would be very hard. As far as detailed code commentary the only thing which jumps out at me is that it's using MemoryContextAlloc to grow the array instead of repalloc which seems like a waste. This isn't a new thing though, it was how intagg was written and this patch just didn't change it. I'm not against putting more functions into intagg and intarray and bidx and the grouping/counting thing seem like they might be useful functionality. but I have a feeling others might feel differently. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's RemoteDBA services! -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg
Hi, Gregory Stark wrote: Regarding the patch listed on the commitfest 3 new functions into intarray and intagg (which I just noticed has a reviewer listed -- doh): ..well, just add your name as well, no? I definitely like the int_array_append_aggregate function but I don't see anything int[] specific about it. We should be able to have a generic array_union() aggregate which uses the same IsA(fcinfo-context, AggState) trick to scribble on its state variable. It don't even see any reason it couldn't work for arrays of varlenas, though it would take a bit of restructuring. Yeah, the same idea was bugging me. Doesn't such code already exist? So I would be definitely for a adding this to core if it were rewritten to work with generic arrays which, unless there are problems I'm not seeing, I don't think would be very hard. As far as detailed code commentary the only thing which jumps out at me is that it's using MemoryContextAlloc to grow the array instead of repalloc which seems like a waste. This isn't a new thing though, it was how intagg was written and this patch just didn't change it. Oh, good catch. I'm not against putting more functions into intagg and intarray and bidx and the grouping/counting thing seem like they might be useful functionality. but I have a feeling others might feel differently. The naming 'bidx' seems a bit weired to me, but otherwise I'm also optimistic about it. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg
Markus Wanner [EMAIL PROTECTED] writes: Hi, Gregory Stark wrote: Regarding the patch listed on the commitfest 3 new functions into intarray and intagg (which I just noticed has a reviewer listed -- doh): ..well, just add your name as well, no? Yeah, people should feel free to comment on items even if other people have or are as well. It would have just been more useful for me to pick one that didn't already have someone reading it is all. As far as detailed code commentary the only thing which jumps out at me is that it's using MemoryContextAlloc to grow the array instead of repalloc which seems like a waste. This isn't a new thing though, it was how intagg was written and this patch just didn't change it. Oh, good catch. Actually on further thought what's going on is that it's resizing the array by doubling its size when necessary. palloc/repalloc does that as well, so you're getting two layers of extra space to handle reallocations. I think it's simpler and cleaner to just repalloc just enough space at each growth and let repalloc handle allocating extra space to handle future growth. I think that would be necessary for handling varlenas also. The naming 'bidx' seems a bit weired to me, but otherwise I'm also optimistic about it. If we wanted to put that in core it would make more sense to have a flag on the array indicating whether it's sorted or not which is maintained whenever we construct or alter an array. Then just have the regular _int_contains() (which is an operator @) take advantage of it if it's set and the data type is fixed-size. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's On-Demand Production Tuning -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg
Hi, Gregory Stark wrote: The naming 'bidx' seems a bit weired to me, but otherwise I'm also optimistic about it. If we wanted to put that in core Uh.. ATM it's just a patch against contrib. I don't think 'bidx' needs to go into core. Otherwise we'd have to move the whole intarr contrib module as well, no? it would make more sense to have a flag on the array indicating whether it's sorted or not which is maintained whenever we construct or alter an array. Then just have the regular _int_contains() (which is an operator @) take advantage of it if it's set and the data type is fixed-size. Yeah, that sounds like a good thing. Currently, the intarr module doesn't provide the optimized functions for the outside world (_int_inter_inner() and such.. the _inner appendix really means inside intarr module only). I've ended up copy'n'pasting that code into my own module, where I take care about ordering myself. But still want to maintain the optimization. However, that's probably not within the scope of this patch. Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg
Hi, Gregory Stark wrote: I definitely like the int_array_append_aggregate function but I don't see anything int[] specific about it. We should be able to have a generic array_union() aggregate which uses the same IsA(fcinfo-context, AggState) trick to scribble on its state variable. It don't even see any reason it couldn't work for arrays of varlenas, though it would take a bit of restructuring. I've just noticed that this already is a todo item: Add array_accum() and array_to_set() functions for arrays The standards specify array_agg() and UNNEST. See also: http://archives.postgresql.org/pgsql-hackers/2007-08/msg00464.php Regards Markus Wanner -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Patch: propose to include 3 new functions into intarray and intagg
Dmitry Koterov napsal(a): Hello. Here are these functions with detailed documentation: http://en.dklab.ru/lib/dklab_postgresql_patch/ Added to next commit fest patch list. Zdenek -- Zdenek Kotala Sun Microsystems Prague, Czech Republic http://sun.com/postgresql -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Patch: propose to include 3 new functions into intarray and intagg
Hello. Here are these functions with detailed documentation: http://en.dklab.ru/lib/dklab_postgresql_patch/ intagg.int_array_append_aggregate(int[]): fast merge arrays into one large list intarray._int_group_count_sort(int[], bool): frequency-based sorting intarray.bidx(int[], int): binary search in a sorted array Tested for about a year on a real PostgreSQL cluster (10 machines, Slony replication) under a heavy load (millions of requests). No crash nor memory problem detected during a year, so I suppose these functions are well-tested. What do you think about that?