Re: [HACKERS] tsvector pg_stats seems quite a bit off.

2010-05-30 Thread Jesper Krogh
On 2010-05-29 15:56, Jan Urbański wrote: On 29/05/10 12:34, Jesper Krogh wrote: On 2010-05-28 23:47, Jan Urbański wrote: On 28/05/10 22:22, Tom Lane wrote: Now I tried to substitute some numbers there, and so assuming the English language has ~1e6 words H(W) is around 6.5. Let's

Re: [HACKERS] [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-30 Thread Andres Freund
On Sunday 30 May 2010 04:56:09 Greg Stark wrote: This sounds familiar. If you search back in the archives around 2004 or so I think you'll find a similar discussion when we replaced the crc32 implementation with what we have now. We put a fair amount of effort into searching for faster

Re: [HACKERS] small exclusion constraints patch

2010-05-30 Thread Marko Tiikkaja
On 2010-05-30 06:55 +0300, Robert Haas wrote: On Sat, May 29, 2010 at 11:40 PM, Tom Lane t...@sss.pgh.pa.us wrote: Yes, I've seen Jeff's example. It's a cute hack but somehow I doubt that there is going to be a land rush to implement such things. Can you point to any pre-existing example

[HACKERS] server authentication over Unix-domain sockets

2010-05-30 Thread Peter Eisentraut
It has been discussed several times in the past that there is no way for a client to authenticate a server over Unix-domain sockets. So depending on circumstances, a local user could easily insert his own server and collect passwords and data. Suggestions for possible remedies included: You can

Re: [HACKERS] pg_trgm

2010-05-30 Thread Peter Eisentraut
On sön, 2010-05-30 at 11:05 +0900, Tatsuo Ishii wrote: Wait. This works fine for me with stock pg_trgm. local is C and encoding is UTF8. What version of PostgreSQL are you using? Mine is 8.4.4. This is in 9.0, because 8.4 doesn't recognize the \u escape syntax. If you run this in

Re: [HACKERS] Is there anyway to get list of table name, before raw parser is analyze?

2010-05-30 Thread Andrew Dunstan
Mohammad Heykal Abdillah wrote: On Min, 2010-05-30 at 00:44 -0400, Andrew Dunstan wrote: Mohammad Heykal Abdillah wrote: Hi all, Right now i am trying to understand how SQL parser is work. My question is there anyway to get list of table name and its atribut before raw parser is

Re: [HACKERS] Is there anyway to get list of table name, before raw parser is analyze?

2010-05-30 Thread Mohammad Heykal Abdillah
On Min, 2010-05-30 at 07:57 -0400, Andrew Dunstan wrote: Yes that i know, expanding '*' is done in analyzer part. I am try to do is, move the expanding process to before raw_parser that produce by gram.y is processed by analyzer. Like this : sql query - gram.y - raw_parse_tree - (expand

Re: [HACKERS] Winflex

2010-05-30 Thread Andrew Dunstan
Magnus Hagander wrote: Using http://www.postgresql.org/ftp/misc/winflex/ on a 64-bit windows, but in 32-bit mode (this certainly used to work), trying to build HEAD. first of all, it returns error code 128 when running flex -V, which means our script claims it doesn't work. Commenting out

Re: [HACKERS] dividing money by money

2010-05-30 Thread Kevin Grittner
Andy Balholm wrote: On May 26, 2010, at 11:18 AM, Kevin Grittner wrote: Do you want to package this up as a patch for 9.1? If not, is it OK if I do? I'm not quite sure how to go about changing it from an add-on function to a built-in one. So if you want to do that, go ahead. If you'd

Re: [HACKERS] small exclusion constraints patch

2010-05-30 Thread Tom Lane
Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes: On 2010-05-30 06:55 +0300, Robert Haas wrote: I've often wished for the ability to constrain a tale to hold just one row, so I don't find that use case implausible at all. As I pointed out in

Re: [HACKERS] Is there anyway to get list of table name, before raw parser is analyze?

2010-05-30 Thread Tom Lane
Mohammad Heykal Abdillah heykal.abdil...@gmail.com writes: Yes that i know, expanding '*' is done in analyzer part. I am try to do is, move the expanding process to before raw_parser that produce by gram.y is processed by analyzer. Like this : sql query - gram.y - raw_parse_tree - (expand the

Re: [HACKERS] tsvector pg_stats seems quite a bit off.

2010-05-30 Thread Tom Lane
Jesper Krogh jes...@krogh.cc writes: On 2010-05-29 15:56, Jan Urbański wrote: AFAIK statistics for everything other than tsvectors are built based on the values of whole rows. Wouldn't it make sense to treat array types like the tsvectors? Yeah, I have a personal TODO item to look into that

Re: [HACKERS] pg_trgm

2010-05-30 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes: This is still ignoring the point: arbitrarily changing the module's longstanding standard behavior isn't acceptable. You need to provide a way for the user to control the behavior. (Once you've done that, I think it can be just either alnum or

Re: [HACKERS] tsvector pg_stats seems quite a bit off.

2010-05-30 Thread Jan Urbański
Jesper Krogh jes...@krogh.cc writes: On 2010-05-29 15:56, Jan Urbański wrote: AFAIK statistics for everything other than tsvectors are built based on the values of whole rows. Wouldn't it make sense to treat array types like the tsvectors? Yeah, I have a personal TODO item to look

Re: [HACKERS] tsvector pg_stats seems quite a bit off.

2010-05-30 Thread Tom Lane
Jan =?UTF-8?Q?Urba=C5=84ski?= wulc...@wulczer.org writes: I think the only relevance of stopwords to the current problem is that *if* stopwords have been removed, we would see a Zipfian distribution with the first few entries removed, and I'm not sure if it's still really Zipfian afterwards.

Re: [HACKERS] functional call named notation clashes with SQL feature

2010-05-30 Thread Tom Lane
So as far as I can tell, no one is opposed to replacing expr AS name with name := expr in the named-parameter syntax. Obviously we had better get this done before beta2. Is anyone actually working on the code/docs changes? If not, I'll pick it up. regards, tom lane --

Re: [HACKERS] pg_trgm

2010-05-30 Thread Tatsuo Ishii
This is in 9.0, because 8.4 doesn't recognize the \u escape syntax. If you run this in 8.4, you're just comparing a sequence of ASCII letters and digits. Hum. Still I prefer 8.4's behavior since anything is better than returning NaN. It seems 9.0 does not have any escape route for

Re: [HACKERS] Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-30 Thread Greg Stark
On Sun, May 30, 2010 at 5:29 PM, Greg Stark gsst...@mit.edu wrote: On Sun, May 30, 2010 at 4:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: I read through that thread and couldn't find much discussion of alternative CRC implementations --- we spent all our time on arguing about whether we needed

Re: [HACKERS] small exclusion constraints patch

2010-05-30 Thread Robert Haas
On Sun, May 30, 2010 at 10:01 AM, Tom Lane t...@sss.pgh.pa.us wrote: Marko Tiikkaja marko.tiikk...@cs.helsinki.fi writes: On 2010-05-30 06:55 +0300, Robert Haas wrote: I've often wished for the ability to constrain a tale to hold just one row, so I don't find that use case implausible at all.

Re: [HACKERS] pg_trgm

2010-05-30 Thread Greg Stark
On Sun, May 30, 2010 at 3:41 PM, Tom Lane t...@sss.pgh.pa.us wrote: I don't think it's unreasonable to insist that behavioral changes be made in an upward compatible fashion ... especially ones that seem as least as likely to break some current usages as to enable new usages. Fwiw I don't

Re: [HACKERS] small exclusion constraints patch

2010-05-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: ... The fact that not very many people will want to do something is not a reason to prevent it. It's not about preventing it for no reason. The proposed patch removes a significant sanity check from code that still hasn't gotten out of beta. I might be

Re: [HACKERS] pg_trgm

2010-05-30 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: There seem to be three behaviours on the table here: You're neglecting 4) Let the user decide whether he wants pg_trgm to consider word elements to be alphanumerics or any non-space. The main problem I have with Tatsuo's patch is that it forecloses any

Re: [HACKERS] tsvector pg_stats seems quite a bit off.

2010-05-30 Thread Jan Urbański
On 30/05/10 09:08, Jesper Krogh wrote: On 2010-05-29 15:56, Jan Urbański wrote: On 29/05/10 12:34, Jesper Krogh wrote: I can fairly easy try out patches or do other kind of testing. I'll try to come up with a patch for you to try and fiddle with these values before Monday. Here's a

Re: [HACKERS] Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-30 Thread Andres Freund
On Sunday 30 May 2010 18:29:31 Greg Stark wrote: On Sun, May 30, 2010 at 4:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: I read through that thread and couldn't find much discussion of alternative CRC implementations --- we spent all our time on arguing about whether we needed 64-bit CRC or

Re: [HACKERS] Re: [RFC][PATCH]: CRC32 is limiting at COPY/CTAS/INSERT ... SELECT + speeding it up

2010-05-30 Thread Andres Freund
On Sunday 30 May 2010 18:43:12 Greg Stark wrote: On Sun, May 30, 2010 at 5:29 PM, Greg Stark gsst...@mit.edu wrote: On Sun, May 30, 2010 at 4:54 AM, Tom Lane t...@sss.pgh.pa.us wrote: I read through that thread and couldn't find much discussion of alternative CRC implementations --- we

Re: [HACKERS] tsvector pg_stats seems quite a bit off.

2010-05-30 Thread Tom Lane
=?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes: Here's a patch against recent git, but should apply to 8.4 sources as well. It would be interesting to measure the memory and time needed to analyse the table after applying it, because we will be now using a lot bigger bucket size

Re: [HACKERS] tsvector pg_stats seems quite a bit off.

2010-05-30 Thread Jan Urbański
On 31/05/10 00:07, Tom Lane wrote: =?UTF-8?B?SmFuIFVyYmHFhHNraQ==?= wulc...@wulczer.org writes: I committed the attached revised version of the patch. Revisions are mostly minor but I did make two substantive changes: * The patch changed the target number of mcelems from 10 *

Re: [HACKERS] dividing money by money

2010-05-30 Thread Andy Balholm
On May 30, 2010, at 6:53 AM, Kevin Grittner wrote: You would basically move the functions and their prototypes to cash.c and cash.h, and then (instead of CREATE FUNCTION, etc.) add corresponding entries to pg_proc.h and pg_operator.h. (If I'm missing something, someone please jump in.) Of

Re: [HACKERS] release notes

2010-05-30 Thread Andrew Dunstan
Tim Bunce wrote: On Mon, May 17, 2010 at 11:34:37AM -0400, Tom Lane wrote: Andrew Dunstan and...@dunslane.net writes: Why do the release notes say this, under plperl: * PL/Perl subroutines are now given names (Tim Bunce) This is for the use of profiling and code coverage

Re: [HACKERS] dividing money by money

2010-05-30 Thread Tom Lane
Andy Balholm a...@balholm.com writes: How do I come up with OID numbers for my new operators and functions? Go into src/include/catalog and run the unused_oids script found there. Pick some. Your chances of getting numbers close to those currently in use for money-related functions are

[HACKERS] PGCon 2010 pictures available

2010-05-30 Thread Oleg Bartunov
Hi there, I finally managed to publish selected pictures from developers meeting and PGCon 2010 closing session. The URL is http://www.flickr.com/photos/obartunov/sets/72157624042768831/ To see full size photo click on All sizes, then Original. I'd be happy to correct mistakes and add

Re: [HACKERS] fillfactor gets set to zero for toast tables

2010-05-30 Thread Takahiro Itagaki
This is still an open item for 9.0, and we should also backport it to 8.4. Which do we take on? Or is there better idea? Alvaro's idea: Maybe a better solution is to have some kind of notion of a default-only entry, which is sufficient to insert the default into the struct but isn't

[HACKERS] Adding regexp_match() function

2010-05-30 Thread Daniele Varrazzo
Hello, I'd like to contribute a regexp_match() function as discussed in bug #5469 [1] The aim is to overcome the limitation outlined in the thread above http://archives.postgresql.org/pgsql-bugs/2010-05/msg00227.php. PostgreSQL currently offers the function regexp_matches(), a SRF (which, unless

Re: [HACKERS] Unexpected page allocation behavior on insert-only tables

2010-05-30 Thread Tom Lane
Alvaro Herrera alvhe...@alvh.no-ip.org writes: Excerpts from Michael Renner's message of sáb may 15 20:24:36 -0400 2010: I've written a simple tool to generate traffic on a database [1], which did about 30 TX/inserts per second to a table. Upon inspecting the data in the table, I noticed the

Re: [HACKERS] Unexpected page allocation behavior on insert-only tables

2010-05-30 Thread Robert Haas
On Sun, May 30, 2010 at 10:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: pretty clear what is going on.  See the logic in RelationGetBufferForTuple, and note that at no time do we have any FSM data for the bid table: Is this because, in the absence of updates or deletes, we never vacuum it? 4.

Re: [HACKERS] Unexpected page allocation behavior on insert-only tables

2010-05-30 Thread Takahiro Itagaki
Tom Lane t...@sss.pgh.pa.us wrote: 3. After awhile, autovacuum notices all the insert activity and kicks off an autoanalyze on the bid table. When committed, this forces a relcache flush for each other backend's relcache entry for bid. In particular, the smgr targblock gets reset. 4.

Re: [HACKERS] Unexpected page allocation behavior on insert-only tables

2010-05-30 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes: On Sun, May 30, 2010 at 10:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: pretty clear what is going on.  See the logic in RelationGetBufferForTuple, and note that at no time do we have any FSM data for the bid table: Is this because, in the absence of

Re: [HACKERS] Unexpected page allocation behavior on insert-only tables

2010-05-30 Thread Greg Stark
On Mon, May 31, 2010 at 3:42 AM, Tom Lane t...@sss.pgh.pa.us wrote: note that at no time do we have any FSM data for the bid table: 3. After awhile, autovacuum notices all the insert activity and kicks off an autoanalyze on the bid table.  When committed, this forces a relcache flush for

Re: [HACKERS] Unexpected page allocation behavior on insert-only tables

2010-05-30 Thread Tom Lane
Greg Stark gsst...@mit.edu writes: This is an analyze-only scan? Why does analyze need to issue a relcache flush? Directly: to cause other backends to pick up the updated pg_class row (with new relpages/reltuples data). Indirectly: to cause cached plans for the rel to be invalidated, so that