Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL]
Am Mittwoch, den 19.10.2005, 22:04 +0200 schrieb Tino Wildenhain: > Am Mittwoch, den 19.10.2005, 16:29 -0300 schrieb Marc G. Fournier: > > I'm CC'ng this over to -hackers ... Tom? Comments? > > > ... > > >> Then we are broken too :) > > >> > > >> # select 'a ' = 'a '; > > >> ?column? > > >> -- > > >> f > > >> (1 row) > > > experiment=# SELECT 'a '::char = 'a '::char; > ?column? > -- > t > Sorry, copied wrong line :) experiment=# SELECT 'a '::char(10) = 'a '::char(10); ?column? -- t and: SELECT '|' || 'foo '::char(10) || '|'; ?column? -- |foo| vs. SELECT '|' || 'foo ' || '|'; ?column? -- |foo | ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] [HACKERS] 'a' == 'a '
Tom Lane <[EMAIL PROTECTED]> writes: > Chris Travers <[EMAIL PROTECTED]> writes: > > If I understand the spec correctly, it seems to indicate that this is > > specific to the locale/character set. > > The spec associates padding behavior with collations, which per spec are > separate from the datatypes --- that is, you should be able to able to > specify a collation for each string-type table column (whether char(N) > or varchar(N)) and even for each literal string constant. We do not > currently have that capability, and accordingly fall back to binding > PAD SPACE behavior to char(N) and NO PAD behavior to varchar(N). > > AFAICS this choice is allowed by the spec since the default collation is > implementation-defined. Does it even make sense for char(N) to not be space padded? I had the impression char(N) was always N characters long, not more or less. I can't picture any other character being used for padding, then you would need a more flexible rtrim function. And I can understand the collation order determining whether 'a' and 'a ' compare equal. But surely if you store 'a' in a varchar(N) you have to get 'a' back out, not some other string! Does the spec really allow varchar to actually be padded and not just compare ignoring trailing space? (I can't believe anyone really wants varchar to be space padded. Space padding always seemed like a legacy feature for databases with fixed record length data types. Why would anyone want a string data type that can't represent all strings?) -- greg ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [GENERAL] [HACKERS] 'a' == 'a '
Chris Travers <[EMAIL PROTECTED]> writes: > If I understand the spec correctly, it seems to indicate that this is > specific to the locale/character set. The spec associates padding behavior with collations, which per spec are separate from the datatypes --- that is, you should be able to able to specify a collation for each string-type table column (whether char(N) or varchar(N)) and even for each literal string constant. We do not currently have that capability, and accordingly fall back to binding PAD SPACE behavior to char(N) and NO PAD behavior to varchar(N). AFAICS this choice is allowed by the spec since the default collation is implementation-defined. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] [GENERAL] Call for translators
Lyubomir Rusanov wrote: > Hi, > I am also interested in helping building Bulgarian translation for > PostgreSQL. > I think that we will not have enough time for 8.1 but maybe for 8.2. The best you could do is submit your incremental improvements for 8.1, so there is at least _some_ translation. Then you complete it for 8.2. I advise you start with the client-side tools (which you can reasonably expect to finish for 8.1), and leave the backend messages for last, because of the size. -- Alvaro Herrerahttp://www.PlanetPostgreSQL.org Voy a acabar con todos los humanos / con los humanos yo acabaré voy a acabar con todos / con todos los humanos acabaré (Bender) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [GENERAL] [HACKERS] 'a' == 'a '
Josh Berkus wrote: Dann, I think that whatever is done ought to be whatever the standard says. If I misinterpret the standard and PostgreSQL is doing it right, then that is fine. It is just that PostgreSQL is very counter-intuitive compared to other database systems that I have used in this one particular area. When I read the standard, it looked to me like PostgreSQL was not performing correctly. It is not unlikely that I read it wrong. AFAIT, the standard says "implementation-specific". So we're standard. The main cost for comparing trimmed values is performance; factoring an rtrim into every comparison will add significant overhead to the already CPU-locked process of, for example, creating indexes. We're looking for ways to make the comparison operators lighter-weight, not heavier. If I understand the spec correctly, it seems to indicate that this is specific to the locale/character set. Assuming that the standard doesn't have anything to do with any character sets, it should be possible to make this available for those who want it as an initdb option. Whether or not this is important enough to offer or not is another matter. Personally my questions are: 1) How many people have been bitten by this badly? 2) How many people have been bitten by joins that depend on padding? Personally, unlike case folding, this seems to be an area where a bit of documentation (i.e. all collation sets have are assumed to have the NO PAD option in the SQL standard) would be sufficient to answer to questions of standards-compliance. My general perspective on this is that if trailing blanks are a significant hazard for your application, then trim them on data input. That requires a *lot* less peformance overhead than doing it every time you compare something. In general I agree. But I am not willing to jump to the conclusion that it will never be warranted to add this as an initdb option. I am more interested in what cases people see where this would be required. But I agree that the bar is much higher than it is in many other cases. Best Wishes, Chris Travers Metatron Technology Consulting ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] RSS feeds of CVS revision logs
Hello, Some of you may have noted the new project I am playing with the PostgreSQL Source browser. It is a Subversion->Trac interface to the PostgreSQL CVS repository. The entire repository is represented. It is currently updated daily but I am hoping to get this down to hourly in the near future. The site can be viewed here: http://www-new.commandprompt.com/projects/public/pgsql The URL will change in the future, probably to pgcode.org or maybe just projects.commandprompt.com. I was speaking with JoshB and he mentioned that one particular facet of this may be interesting to you folks, which is the RSS feeds. JoshB was thinking this would be useful for replacing pgsql-patches@ for some people as their mailboxes get filled up. You have the ability using any RSS reader to subscribe to any portion of the revision log. For example, for the docs project, they can subscribe to Trunk->docs and thus only see the revision logs within their feed for the docs project. At your leisure you may either choose to see only a brief synopsis for the complete commit message. Here is an example of the short feed: http://www-new.commandprompt.com/projects/public/pgsql/log/trunk?limit=100&mode=stop_on_copy&stop_rev=&format=rss Here is an example of the long feed: http://www-new.commandprompt.com/projects/public/pgsql/log/trunk?verbose=on&format=rss&stop_rev=&limit=100&mode=stop_on_copy Notice the only real difference is the verbose=on The power of this at least from my perspective, is that I use Thunderbird as my RSS reader. I can view a current revision, click forward, and send a question to the list about a specific change. Sincerely, Joshua D. Drake Command Prompt, Inc. -- The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564 PostgreSQL Replication, Consulting, Custom Development, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 'a' == 'a '
If there is a significant performance benefit to not expanding text columns in comparison operations, then it seems it should be OK. I probably read the standard wrong, but it seems to me that varchar, char, and bpchar columns should all behave the same (e.g. if you do not expand with or the PAD character (whatever that is) then all char type columns should behave the same. I guess that there could be different default collations for different column types though (that is clearly allowed in the standard). Perhaps it just needs to be documented in such a way that even a blockhead like me can comprehend it easily. > -Original Message- > From: Josh Berkus [mailto:[EMAIL PROTECTED] > Sent: Wednesday, October 19, 2005 5:06 PM > To: pgsql-hackers@postgresql.org > Cc: Dann Corbit; Stephan Szabo; Terry Fielder; Tino Wildenhain; Marc G. > Fournier; [EMAIL PROTECTED]; pgsql-general@postgresql.org > Subject: Re: [HACKERS] 'a' == 'a ' > > Dann, > > > I think that whatever is done ought to be whatever the standard says. > > If I misinterpret the standard and PostgreSQL is doing it right, then > > that is fine. It is just that PostgreSQL is very counter-intuitive > > compared to other database systems that I have used in this one > > particular area. When I read the standard, it looked to me like > > PostgreSQL was not performing correctly. It is not unlikely that I read > > it wrong. > > AFAIT, the standard says "implementation-specific". So we're standard. > > The main cost for comparing trimmed values is performance; factoring an > rtrim into every comparison will add significant overhead to the already > CPU-locked process of, for example, creating indexes. We're looking for > ways to make the comparison operators lighter-weight, not heavier. > > My general perspective on this is that if trailing blanks are a > significant > hazard for your application, then trim them on data input. That requires > a *lot* less peformance overhead than doing it every time you compare > something. > > Changing the behaviour would break backwards compatibility for some users. > For that matter, I've been subscribed to 8 PostgreSQL mailing lists since > 1999, and this is the first time I can recall someone complaining about > this comparison behavior. So it's obviously not a widespread issue. > > -- > --Josh > > Josh Berkus > Aglio Database Solutions > San Francisco ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 'a' == 'a '
Dann, > I think that whatever is done ought to be whatever the standard says. > If I misinterpret the standard and PostgreSQL is doing it right, then > that is fine. It is just that PostgreSQL is very counter-intuitive > compared to other database systems that I have used in this one > particular area. When I read the standard, it looked to me like > PostgreSQL was not performing correctly. It is not unlikely that I read > it wrong. AFAIT, the standard says "implementation-specific". So we're standard. The main cost for comparing trimmed values is performance; factoring an rtrim into every comparison will add significant overhead to the already CPU-locked process of, for example, creating indexes. We're looking for ways to make the comparison operators lighter-weight, not heavier. My general perspective on this is that if trailing blanks are a significant hazard for your application, then trim them on data input. That requires a *lot* less peformance overhead than doing it every time you compare something. Changing the behaviour would break backwards compatibility for some users. For that matter, I've been subscribed to 8 PostgreSQL mailing lists since 1999, and this is the first time I can recall someone complaining about this comparison behavior. So it's obviously not a widespread issue. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buysInnobase)
> -Original Message- > From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED] > Sent: Wednesday, October 19, 2005 2:46 PM > To: Dann Corbit > Cc: Terry Fielder; Tino Wildenhain; Marc G. Fournier; > [EMAIL PROTECTED]; pgsql-hackers@postgresql.org; pgsql- > [EMAIL PROTECTED] > Subject: Re: 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle > buysInnobase) > > On Wed, Oct 19, 2005 at 02:05:20PM -0700, Dann Corbit wrote: > > > When the compared datatypes are VARCHAR: YES > > > > What is the value of doing that? > > > > I can see plenty of harm and absolutely no return. We are talking about > > blank padding before comparison. Do you really want 'Danniel ' > > considered distinct from 'Danniel ' in a comparison? In real life, > > what does that buy you? > > Well, looking from the point of view of using indexes, indexes can only > really match on things that are equal. Which means the system is going > to have to trim them anyway. I'm of the opinion that strings are > strings and spaces are no different from other characters. > > That bit of the standard quoted earlier, if you read the PAD character > that is different from any other character as being the NUL character, > then 'a' is clearly different from 'a'. This > whacky space behaviour is something I associate with the char(N) and is > the main reason I never use it. > > > Perhaps this is old hat to the long-timers around here and there is a > > good explanation as to why varchar should have non-blank padding when > > comparisons are performed. Can someone point me to documentation that > > explains it? > > The way I understood it: > > char(N) is blank padding > varchar(N) is not > > If you make varchar(n) do blank padding, then what's the difference > between the two types? Storage. The blank padding is only for comparison purposes. >You may as well get rid of one... All the other database systems seem to handle it in the way that I expect. Which is not to say that it is the right way or that it agrees with the standard. But it is how it appears to me, so far. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buysInnobase)
> -Original Message- > From: Stephan Szabo [mailto:[EMAIL PROTECTED] > Sent: Wednesday, October 19, 2005 2:34 PM > To: Dann Corbit > Cc: Terry Fielder; Tino Wildenhain; Marc G. Fournier; > [EMAIL PROTECTED]; pgsql-hackers@postgresql.org; pgsql- > [EMAIL PROTECTED] > Subject: Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] > Oracle buysInnobase) > > On Wed, 19 Oct 2005, Dann Corbit wrote: > > > > -Original Message- > > > From: Terry Fielder [mailto:[EMAIL PROTECTED] > > > Sent: Wednesday, October 19, 2005 2:05 PM > > > To: Dann Corbit > > > Cc: Tino Wildenhain; Marc G. Fournier; [EMAIL PROTECTED]; > > > pgsql-hackers@postgresql.org; pgsql-general@postgresql.org > > > Subject: Re: 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle > > > buysInnobase) > > > > > > > > > > > > Dann Corbit wrote: > > > > Try this query in Oracle, SQL*Server, DB/2, Informix, etc.: > > > > > > > > connxdatasync=# select 1 where cast('a' as varchar(30)) = cast('a ' > > as > > > > varchar(30)); > > > > ?column? > > > > -- > > > > (0 rows) > > > > > > > > I see how you can interpret the SQL Standard to make the above > > response > > > > a correct one. But is it the response that you would like? > > > > > > When the compared datatypes are VARCHAR: YES > > > > What is the value of doing that? > > > > I can see plenty of harm and absolutely no return. We are talking about > > blank padding before comparison. Do you really want 'Danniel ' > > considered distinct from 'Danniel ' in a comparison? In real life, > > what does that buy you? > > It buys you the ability to store things where trailing spaces are > signficant (for example passwords) within the existing limitations of not > having a full set of the collation behavior. I suppose that there may be some value in having 'J&^% ' be different from 'J&^% '. I would expect to insert a password with trailing blanks to be the same as inserting a password with no trailing blanks. I think that whatever is done ought to be whatever the standard says. If I misinterpret the standard and PostgreSQL is doing it right, then that is fine. It is just that PostgreSQL is very counter-intuitive compared to other database systems that I have used in this one particular area. When I read the standard, it looked to me like PostgreSQL was not performing correctly. It is not unlikely that I read it wrong. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buysInnobase)
On Wed, Oct 19, 2005 at 02:05:20PM -0700, Dann Corbit wrote: > > When the compared datatypes are VARCHAR: YES > > What is the value of doing that? > > I can see plenty of harm and absolutely no return. We are talking about > blank padding before comparison. Do you really want 'Danniel ' > considered distinct from 'Danniel ' in a comparison? In real life, > what does that buy you? Well, looking from the point of view of using indexes, indexes can only really match on things that are equal. Which means the system is going to have to trim them anyway. I'm of the opinion that strings are strings and spaces are no different from other characters. That bit of the standard quoted earlier, if you read the PAD character that is different from any other character as being the NUL character, then 'a' is clearly different from 'a'. This whacky space behaviour is something I associate with the char(N) and is the main reason I never use it. > Perhaps this is old hat to the long-timers around here and there is a > good explanation as to why varchar should have non-blank padding when > comparisons are performed. Can someone point me to documentation that > explains it? The way I understood it: char(N) is blank padding varchar(N) is not If you make varchar(n) do blank padding, then what's the difference between the two types? You may as well get rid of one... Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgpl2oFftSN1C.pgp Description: PGP signature
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle
On Wed, 19 Oct 2005, Dann Corbit wrote: > > -Original Message- > > From: Terry Fielder [mailto:[EMAIL PROTECTED] > > Sent: Wednesday, October 19, 2005 2:05 PM > > To: Dann Corbit > > Cc: Tino Wildenhain; Marc G. Fournier; [EMAIL PROTECTED]; > > pgsql-hackers@postgresql.org; pgsql-general@postgresql.org > > Subject: Re: 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle > > buysInnobase) > > > > > > > > Dann Corbit wrote: > > > Try this query in Oracle, SQL*Server, DB/2, Informix, etc.: > > > > > > connxdatasync=# select 1 where cast('a' as varchar(30)) = cast('a ' > as > > > varchar(30)); > > > ?column? > > > -- > > > (0 rows) > > > > > > I see how you can interpret the SQL Standard to make the above > response > > > a correct one. But is it the response that you would like? > > > > When the compared datatypes are VARCHAR: YES > > What is the value of doing that? > > I can see plenty of harm and absolutely no return. We are talking about > blank padding before comparison. Do you really want 'Danniel ' > considered distinct from 'Danniel ' in a comparison? In real life, > what does that buy you? It buys you the ability to store things where trailing spaces are signficant (for example passwords) within the existing limitations of not having a full set of the collation behavior. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buysInnobase)
> -Original Message- > From: Terry Fielder [mailto:[EMAIL PROTECTED] > Sent: Wednesday, October 19, 2005 2:05 PM > To: Dann Corbit > Cc: Tino Wildenhain; Marc G. Fournier; [EMAIL PROTECTED]; > pgsql-hackers@postgresql.org; pgsql-general@postgresql.org > Subject: Re: 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle > buysInnobase) > > > > Dann Corbit wrote: > > Try this query in Oracle, SQL*Server, DB/2, Informix, etc.: > > > > connxdatasync=# select 1 where cast('a' as varchar(30)) = cast('a ' as > > varchar(30)); > > ?column? > > -- > > (0 rows) > > > > I see how you can interpret the SQL Standard to make the above response > > a correct one. But is it the response that you would like? > > When the compared datatypes are VARCHAR: YES What is the value of doing that? I can see plenty of harm and absolutely no return. We are talking about blank padding before comparison. Do you really want 'Danniel ' considered distinct from 'Danniel ' in a comparison? In real life, what does that buy you? And even if people think it is a good idea, it seems to be at odds with the way that I read the standard (though I am easily confused by simple enough language at times). It could well be that I am jaded from years of doing it the wrong way (I expect two character strings with all leading non-blanks in agreement to compare equal). Perhaps this is old hat to the long-timers around here and there is a good explanation as to why varchar should have non-blank padding when comparisons are performed. Can someone point me to documentation that explains it? [snip] ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buys
"Guy Rouillier" <[EMAIL PROTECTED]> writes: > Tino Wildenhain wrote: > > > > experiment=# SELECT 'a '::char = 'a '::char; > > ?column? > > -- > > t > > > > This does't show anything useful, because the ::char casting simply > takes the first char of any string: > > select 'abc'::char = 'axy'::char > > Also results in 'true'. Hopefully no one in this discussion would want > those two strings to be equal. A better experiment: db=> select 'a '::char(3) = 'a '::char(3); ?column? -- t (1 row) -- greg ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buysInnobase)
Try this query in Oracle, SQL*Server, DB/2, Informix, etc.: connxdatasync=# select 1 where cast('a' as varchar(30)) = cast('a ' as varchar(30)); ?column? -- (0 rows) I see how you can interpret the SQL Standard to make the above response a correct one. But is it the response that you would like? Suppose, for instance, that you have an employee table in your database. You have another table called benefits. Perhaps it is even in another database. Conceivably even in a database other than PostgreSQL. Anyway, you want to match information between the two systems so you join on some redundant columns like a.First_Name = b.fname AND a.Last_name = b.lname AND a.Middle_Initial = b.mi AND a.City = b.city AND a.Street_name = b.street If the columns are not the same length (or one is fixed length and the other variable), then 'Danniel' won't match 'Danniel' unless you trim it. If you trim it, then the indexes go out the window. If the indexes go out the window, then we table scan. I don't like that behavior. Perhaps others who know more than me can say why not blank padding comparisons is a good idea. Clearly, one could argue that having redundant data is bad and that every attribute in a database intended to match should be exactly the same type. But there are lots of database systems badly designed. And of well designed systems, it is not uncommon to have more than one database in your organization, and a need to perform federated joins as well because of it. > -Original Message- > From: Tino Wildenhain [mailto:[EMAIL PROTECTED] > Sent: Wednesday, October 19, 2005 1:05 PM > To: Marc G. Fournier > Cc: Dann Corbit; [EMAIL PROTECTED]; pgsql- > [EMAIL PROTECTED]; pgsql-general@postgresql.org > Subject: Re: 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle > buysInnobase) > > Am Mittwoch, den 19.10.2005, 16:29 -0300 schrieb Marc G. Fournier: > > I'm CC'ng this over to -hackers ... Tom? Comments? > > > > On Wed, 19 Oct 2005, Dann Corbit wrote: > > > > > Yes, clearly that is the wrong result according to the SQL standard. > > > > > > Here is a SQL*Server query: > > > select 1 where 'a' = 'a ' AND 'a' = 'a ' AND 'a ' = 'a ' > > > > > > It returns (correctly): 1 > > > > > >> -Original Message- > > >> From: [EMAIL PROTECTED] [mailto:pgsql-general- > > >> [EMAIL PROTECTED] On Behalf Of Marc G. Fournier > > >> Sent: Wednesday, October 19, 2005 11:41 AM > > >> To: [EMAIL PROTECTED] > > >> Cc: pgsql-general@postgresql.org > > >> Subject: Re: [pgsql-advocacy] [GENERAL] Oracle buys Innobase > > >> > > >> On Wed, 19 Oct 2005, [EMAIL PROTECTED] wrote: > > >> > > >>> I was referring to trailing blanks, but did not explicitly say it, > > >>> though showed it in the examples. I am pretty sure that the SQL > > >>> standard says that trailing whitespace is insignificant in string > > >>> comparison. > > >> > > >> Then we are broken too :) > > >> > > >> # select 'a ' = 'a '; > > >> ?column? > > >> -- > > >> f > > >> (1 row) > > > experiment=# SELECT 'a '::char = 'a '::char; > ?column? > -- > t > ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buys
Am Mittwoch, den 19.10.2005, 16:29 -0300 schrieb Marc G. Fournier: > I'm CC'ng this over to -hackers ... Tom? Comments? > > On Wed, 19 Oct 2005, Dann Corbit wrote: > > > Yes, clearly that is the wrong result according to the SQL standard. > > > > Here is a SQL*Server query: > > select 1 where 'a' = 'a ' AND 'a' = 'a ' AND 'a ' = 'a ' > > > > It returns (correctly): 1 > > > >> -Original Message- > >> From: [EMAIL PROTECTED] [mailto:pgsql-general- > >> [EMAIL PROTECTED] On Behalf Of Marc G. Fournier > >> Sent: Wednesday, October 19, 2005 11:41 AM > >> To: [EMAIL PROTECTED] > >> Cc: pgsql-general@postgresql.org > >> Subject: Re: [pgsql-advocacy] [GENERAL] Oracle buys Innobase > >> > >> On Wed, 19 Oct 2005, [EMAIL PROTECTED] wrote: > >> > >>> I was referring to trailing blanks, but did not explicitly say it, > >>> though showed it in the examples. I am pretty sure that the SQL > >>> standard says that trailing whitespace is insignificant in string > >>> comparison. > >> > >> Then we are broken too :) > >> > >> # select 'a ' = 'a '; > >> ?column? > >> -- > >> f > >> (1 row) experiment=# SELECT 'a '::char = 'a '::char; ?column? -- t ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] A costing analysis tool
Kevin, > This would require capture of information beyond what I was thinking > about in terms of schema. Do you think we need to capture just index > type, or something more? Do you propose that we capture the pg_* > metadata related to every object referenced in the plan, every object > related to every table in the query, or everything in the entire > database? Not right away. However, having collected the info in the first place would give us the option of duping a copy of the system tables so that we could mine them for more information. Also, for our test harness, we will have pre-programmed object information for the objects we use. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [pgsql-advocacy] [GENERAL] Oracle buys Innobase
Given this part of that same rule applied to the strings: "b) If the length in characters of X is not equal to the length in characters of Y, then the shorter string is effectively replaced, for the purposes of comparison, with a copy of itself that has been extended to the length of the longer string by concatenation on the right of one or more pad characters, where the pad character is chosen based on CS. If CS has the NO PAD characteristic, then the pad character is an implementation-dependent character different from any character in the character set of X and Y that collates less than any string under CS. Otherwise, the pad character is a ." I do not see how two strings which are otherwise equal (except for length or blank padding) can possibly compare unequal unless the NO PAD property is applied and the implementation defined pad character is also something other than a . Is that the case for PostgreSQL? Even if it is, is seems truly bizarre that the NO PAD attribute would be applied to string constants. > -Original Message- > From: Marc G. Fournier [mailto:[EMAIL PROTECTED] > Sent: Wednesday, October 19, 2005 12:53 PM > To: Dann Corbit > Cc: Stephan Szabo; Marc G. Fournier; [EMAIL PROTECTED]; pgsql- > [EMAIL PROTECTED]; pgsql-general@postgresql.org > Subject: RE: [pgsql-advocacy] [GENERAL] Oracle buys Innobase > > On Wed, 19 Oct 2005, Dann Corbit wrote: > > >> -Original Message- > >> From: Stephan Szabo [mailto:[EMAIL PROTECTED] > >> Sent: Wednesday, October 19, 2005 12:39 PM > >> To: Dann Corbit > >> Cc: Marc G. Fournier; [EMAIL PROTECTED]; pgsql- > >> [EMAIL PROTECTED] > >> Subject: Re: [pgsql-advocacy] [GENERAL] Oracle buys Innobase > >> > >> On Wed, 19 Oct 2005, Dann Corbit wrote: > >> > >>> Yes, clearly that is the wrong result according to the SQL standard. > >>> > >>> Here is a SQL*Server query: > >>> select 1 where 'a' = 'a ' AND 'a' = 'a ' AND 'a ' = 'a ' > >>> > >>> It returns (correctly): 1 > >> > >> Doesn't that depend on the collating sequence in use, or is a NO PAD > >> collating sequence not allowed here? > > > > If the implementation defines constants as NO PAD and the implementation > > defined pad character is something other than space, then they could > > compare unequal. > > > > I would find that implementation disturbing. But I am easily bent out > > of shape. > > > > The attached HTML file in my earlier post is the official quote from the > > SQL 99 standard. That is the formal and correct definition, far > > superior to my off the cuff approximations. > > 'k, if I'm reading the right section (you say its bolded, but I'm using > pine which doesn't seem to do a good job of reading HTML): > > === > d) Depending on the collating sequence, two strings may compare as > equal even if they are of different lengths or contain different > sequences of characters. When any of the operations MAX, MIN, and > DISTINCT reference a grouping column, and the UNION, EXCEPT, and > INTERSECT operators refer to character strings, the specific value > selected by these operations from a set of such equal values is > implementation-dependent. > === > > I think the key part of that 'clause' is "two strings *may* compare as > equal" ... sounds implementation dependent to me, depending on how the > implementor interprets it ... or am I reading the wrong section? > > > Marc G. Fournier Hub.Org Networking Services > (http://www.hub.org) > Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: > 7615664 ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [pgsql-advocacy] [GENERAL] Oracle buys Innobase
On Wed, 19 Oct 2005, Dann Corbit wrote: -Original Message- From: Stephan Szabo [mailto:[EMAIL PROTECTED] Sent: Wednesday, October 19, 2005 12:39 PM To: Dann Corbit Cc: Marc G. Fournier; [EMAIL PROTECTED]; pgsql- [EMAIL PROTECTED] Subject: Re: [pgsql-advocacy] [GENERAL] Oracle buys Innobase On Wed, 19 Oct 2005, Dann Corbit wrote: Yes, clearly that is the wrong result according to the SQL standard. Here is a SQL*Server query: select 1 where 'a' = 'a ' AND 'a' = 'a ' AND 'a ' = 'a ' It returns (correctly): 1 Doesn't that depend on the collating sequence in use, or is a NO PAD collating sequence not allowed here? If the implementation defines constants as NO PAD and the implementation defined pad character is something other than space, then they could compare unequal. I would find that implementation disturbing. But I am easily bent out of shape. The attached HTML file in my earlier post is the official quote from the SQL 99 standard. That is the formal and correct definition, far superior to my off the cuff approximations. 'k, if I'm reading the right section (you say its bolded, but I'm using pine which doesn't seem to do a good job of reading HTML): === d) Depending on the collating sequence, two strings may compare as equal even if they are of different lengths or contain different sequences of characters. When any of the operations MAX, MIN, and DISTINCT reference a grouping column, and the UNION, EXCEPT, and INTERSECT operators refer to character strings, the specific value selected by these operations from a set of such equal values is implementation-dependent. === I think the key part of that 'clause' is "two strings *may* compare as equal" ... sounds implementation dependent to me, depending on how the implementor interprets it ... or am I reading the wrong section? Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] A costing analysis tool
This would require capture of information beyond what I was thinking about in terms of schema. Do you think we need to capture just index type, or something more? Do you propose that we capture the pg_* metadata related to every object referenced in the plan, every object related to every table in the query, or everything in the entire database? >>> Josh Berkus >>> [smip] the only way you're going analyze index costing by type of index is if the index name is stored somewhere (db_object) so that it can be matched to its characteristics. [snip] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] 'a' == 'a ' (Was: RE: [pgsql-advocacy] [GENERAL] Oracle buys Innobase)
I'm CC'ng this over to -hackers ... Tom? Comments? On Wed, 19 Oct 2005, Dann Corbit wrote: Yes, clearly that is the wrong result according to the SQL standard. Here is a SQL*Server query: select 1 where 'a' = 'a ' AND 'a' = 'a ' AND 'a ' = 'a ' It returns (correctly): 1 -Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-general- [EMAIL PROTECTED] On Behalf Of Marc G. Fournier Sent: Wednesday, October 19, 2005 11:41 AM To: [EMAIL PROTECTED] Cc: pgsql-general@postgresql.org Subject: Re: [pgsql-advocacy] [GENERAL] Oracle buys Innobase On Wed, 19 Oct 2005, [EMAIL PROTECTED] wrote: I was referring to trailing blanks, but did not explicitly say it, though showed it in the examples. I am pretty sure that the SQL standard says that trailing whitespace is insignificant in string comparison. Then we are broken too :) # select 'a ' = 'a '; ?column? -- f (1 row) Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] A costing analysis tool
Robert Treat wrote: On Tuesday 18 October 2005 18:05, Kevin Grittner wrote: Regarding the idea of a site where results could be posted and loaded into a database which would be available for public access -- I agree that would be great; however, my client is not willing to take that on. If anyone wants to volunteer, that wuold be fantastic. Josh, does the setup on the foundry homepages give you enough to do this? pgFoundry currently does not provide either a database or web programs for groups, for security reasons. All you get is the ability to make static pages. The new machine will provide both, we hope. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] A costing analysis tool
Maybe we could associate a set of defaults to runtime_environment, and you would associate any overrides with the runtime_options. Does this address both your concerns? >>> Josh Berkus >>> Kevin, > When it gets downt to the detail, it may make sense to combine > or split some of these. For example, runtime_options should > probably not have a column for each currently known option, > but a child table which maps to all non-default option values. I'm a little cautious about storing only non-defaults; the defaults have changed from version to version, after all. If we did that, we'd need to have a "defaults" table in the db as a reference list. Also, we'll need to store runtime options both on the "machine" level and on the "query" level, in order to allow testing of changing an enable_* or other query cost option at runtime. Not sure how to capture this, though. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] A costing analysis tool
On Tuesday 18 October 2005 18:05, Kevin Grittner wrote: > Regarding the idea of a site where results could be posted > and loaded into a database which would be available for > public access -- I agree that would be great; however, my > client is not willing to take that on. If anyone wants to > volunteer, that wuold be fantastic. > Josh, does the setup on the foundry homepages give you enough to do this? -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] A costing analysis tool
I'm not interested in storing less information. I'm trying to make sure that all redundant information is justified. Since I plan to store the actual query text and the full EXPLAIN ANALYZE output, every column I pull out and put in another table is redundant data. The questions are, why do we have it, is it going to be used heavily enough in such a way that it should be stored redundantly? If the goal is to be able to get at the number of filters, etc., perhaps another table which digests this to a form which can be used in summary queries is what's needed. If this is a rare type of query which few people will run, maybe they can go to the EXPLAIN ANALYZE, using the line number info, and grab it from there. -Kevin >>> Josh Berkus >>> I'm also really unclear on why you're so focused on storing less information rather than more. In an "investigation" tool like this, it's important to collect as much data as possible because we don't know what's going to be valuable until we analyze it. You seem to be starting out with the idea that you *already* know exactly where the problems are located, in which case why develop a tool at all? Just fix the problem. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] A costing analysis tool
Kevin, > When it gets downt to the detail, it may make sense to combine > or split some of these. For example, runtime_options should > probably not have a column for each currently known option, > but a child table which maps to all non-default option values. I'm a little cautious about storing only non-defaults; the defaults have changed from version to version, after all. If we did that, we'd need to have a "defaults" table in the db as a reference list. Also, we'll need to store runtime options both on the "machine" level and on the "query" level, in order to allow testing of changing an enable_* or other query cost option at runtime. Not sure how to capture this, though. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] A costing analysis tool
Summary of schema I'm considering. Comments welcome. When it gets downt to the detail, it may make sense to combine or split some of these. For example, runtime_options should probably not have a column for each currently known option, but a child table which maps to all non-default option values. submitter Identifies who submitted test results. Name, optional info such as email address and organization runtime_environment Identifies the overall test environment. OS with distribution & version, CPU number, type speed, RAM, background load, static configuration, etc. This provides context for a series of tests, to see how the numbers look in a given environment dataset_characteristics Identifies data metrics which may affect costing accuracy Table counts, row counts, column counts, disk space used, level of fragmentation. Maybe some of the "standard" tests will share common dataset characteristics across multiple environments. cache_state Identifies the level of initial caching for a test, and the degree to which the referenced data can be cached during execution runtime_options Identifies the runtime choices in effect for a query run. The state of EXPLAIN, ANALYZE, enable_xxx, and dynamic configuration settings, such as random_page_cost. query Identifies a test query, possibly run by many people in many environments against various datasets with different cache states and runtime options test_result_summary Ties a query to details about a run, with a summary of results. Run time from the client perspective, rows returned. test_result_step_detail Shows EXPLAIN ANALYZE information (if any) for each step. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 8.04 and RedHat/CentOS init script issue
Hi, On Tue, 18 Oct 2005, Tony Caduto wrote: I installed 8.04 via RPM on Centos 4.2 which is the same as RedHat 4.2 and while booting the init script reports that the daemon [FAILED], but after I logon it shows the postmaster running and I am able to connect from any client remotely. I made not modifcations to the script and there is nothing out of the ordinary in the log. Hmm. In 8.0.4 RPM init scripts, we were using a 1 second of sleep time (see sleep 1 line in the init script). On some cases where the system is slow, you are prompted about the startup failure; however this is not the real case. In 8.1 RPMs, the sleep time was increased to 2 seconds; which we believe that won't have the problem you've reported: http://cvs.pgfoundry.org/cgi-bin/cvsweb.cgi/pgsqlrpms/patches/8.1/postgresql.init?rev=1.2&content-type=text/x-cvsweb-markup So please increase this sleep time and give another try. Regards, -- Devrim GUNDUZ Kivi Bilişim Teknolojileri - http://www.kivi.com.tr devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] A costing analysis tool
Kevin, > If we stored the actual queries and the EXPLAIN ANALYZE results (when > generated) in the database, what would be the purpose of the node_name, > db_object, and condition_detail columns? They don't seem like they > would be useful for statistical analysis, and it seems like the > information would be more useful in context. Are these column really > needed? Yes. For example, the only way you're going analyze index costing by type of index is if the index name is stored somewhere (db_object) so that it can be matched to its characteristics. For condition_detail, again we could determine that (for example) we have costing problems when filters involve more than 2 columns or complex expressions. Node_name is as actually duplicative of some of the other columns, so I suppose it could be dropped. > For a given node_type, are there mutiple valid condition_type values? > If so, I need to modify my python script to capture this. If not, I > don't see a need to store it. I'm not sure. Even if there aren't now, there could be in the future. I'm more focused on supporting cross-node-type conditions. For example, "Filter" conditions can apply to a variety of node types (Index Scan, Merge Join, Subquery Scan, Seq Scan, aggregates). If we were costing Filters, we'd want to be able to aggregate their stats regardless of the node in which they occurred. I'm also really unclear on why you're so focused on storing less information rather than more. In an "investigation" tool like this, it's important to collect as much data as possible because we don't know what's going to be valuable until we analyze it. You seem to be starting out with the idea that you *already* know exactly where the problems are located, in which case why develop a tool at all? Just fix the problem. -- --Josh Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] A costing analysis tool
If we stored the actual queries and the EXPLAIN ANALYZE results (when generated) in the database, what would be the purpose of the node_name, db_object, and condition_detail columns? They don't seem like they would be useful for statistical analysis, and it seems like the information would be more useful in context. Are these column really needed? For a given node_type, are there mutiple valid condition_type values? If so, I need to modify my python script to capture this. If not, I don't see a need to store it. -Kevin >>> Josh Berkus >>> [snip] so, for example, the query step: "-> Seq Scan on detail0009 (cost=0.00..20500.11 rows=26 width=1005) (actual time=453.000..5983.000 rows=53588 loops=1)" " Filter: ((txcontenttype ~~* '%html%'::text) AND ((vchost)::text ~~* '%www.%'::text))" Could be displayed as: query_instance 12008 step_id 14701 parent_step 14698 node_name Seq Scan on detail0009 node_type Seq Scan cost_start 0 cost_end20500.11 est_rows26 time_start 453.0 time_end5983.0 actual_rows 53588 loops 1 db_object detail009 condition_type Filter condition_detail((txcontenttype ~~* '%html%'::text) AND ((vchost)::text ~~* '%www.%'::text)) By collecting all of this data, you make it possible to perform other sorts of analysis on the cost estimates. For example, statistical analysis might tell us that 3-or-more-condition filters take significantly longer to execute than single-condition filters, which would be important to know for the cost model. Limiting it to collecting only 3 of the 13 bits of node data produced by EA would very much limit the usefulness of the tool and the reliability of its statistics. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Optimization system
http://lca2005.linux.org.au/Papers/Neil%20Conway/Inside%20the%20PostgreSQL%20Query%20Optimizer/pg_query_optimizer.pdf > -Original Message- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Sebastián Ávila > Sent: Wednesday, October 19, 2005 9:04 AM > To: pgsql-hackers@postgresql.org > Subject: [HACKERS] Optimization system > > Hi people, > > Please I need help about this, where I could find information about > the query optimization system in postgresql? > Thanks! > > ---(end of broadcast)--- > TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Optimization system
see src/backend/optimizer/README On 10/19/05, Sebastián Ávila <[EMAIL PROTECTED]> wrote: Hi people,Please I need help about this, where I could find information aboutthe query optimization system in postgresql?Thanks!---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings-- Respectfully,Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation http://www.enterprisedb.com/
[HACKERS] Optimization system
Hi people, Please I need help about this, where I could find information about the query optimization system in postgresql? Thanks! ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Segv in pg_autovacuum in 8.0.x
daveg wrote: Apologies if this is old news, but pg_autovacuum in 8.0.x has the bad habit of SEGVing and exiting when a table gets dropped out from under it. This creates problems if you rely on pg_autovacuum for the bulk of your vacuuming as it forgets it's statistics when it is restarted and so will skip some desireable vacuums. I have a patch for this that should apply against any 8.0.x if there is any interest. Please post the patch, it might get committed into 8.0.x. I personally haven't heard any other reports of it dieing when tables are dropped, but I don't doubt there are some lingering issues like this. I looked at the new autovacuum in 8.1 and it appears from casual inspection not to have the same problem. The autovacuum in 8.1 is a very different critter. While its design is based on the contrib autovacuum, it is mostly a total rewrite, and probably a vast improvement :-) Matt ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [GENERAL] Call for translators
Hi, I am also interested in helping building Bulgarian translation for PostgreSQL. I think that we will not have enough time for 8.1 but maybe for 8.2. Regards Lyubomir Rusanov Ivan Pavlov wrote: Peter Eisentraut wrote: As the release of PostgreSQL 8.1 draws near, it is once again time to update the translations of the program messages. To check the status of your language, check out this web site: http://developer.postgresql.org/~petere/nlsstatus/ To participate in the translation effort, please follow the instructions set out on that page, or contact me directly if you have further questions. Is there any particular reason why there is no translation in Bulgarian? If it's only the lack of contributors, you can count me as the first one. Ivan Pavlov ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] 2nd try @NetBSD/2.0 Alpha
Martijn van Oosterhout wrote: > On Tue, Oct 18, 2005 at 10:07:26PM -0500, Larry Rosenman wrote: >>> ...or attach with a debugger like gdb. >> >> d'oh. I go stupid occasionally :) >> >> If someone wants me to, I can try that. > > Yes, actually. See, its dying in the seg test already with: > > -- Open intervals > SELECT '0..'::seg AS seg; > ! ERROR: floating-point exception > ! DETAIL: An invalid floating-point operation was signaled. This > probably means an out-of-range result or an invalid operation, s > uch as division by zero. SELECT '0...'::seg AS seg; ! ERROR: > floating-point exception ! DETAIL: An invalid floating-point > operation was signaled. This probably means an out-of-range result or > an invalid operation, s uch as division by zero. > > You need to attach a debugger to find out where that error is > actually happenening. Just startup the backend, connect to it and > connect gdb to the newly spawned backend and just run that query by > hand. Then you should get the backtrace at SIGFPE. > I don't have the time today (need to do some paying work). However, If someone wants, I can pass an ID/PW along so that they May find it. (or it can wait till the weekend). LER -- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: ler@lerctr.org US Mail: 3535 Gaspar Drive, Dallas, TX 75220-3611 US ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Question about Ctrl-C and less
On Wed, Oct 19, 2005 at 04:24:21AM -0700, Kevin Brown wrote: > It does? It looked to me like it was setting the signal handler to > SIG_IGN before doing the popen(), and resetting it to the internal > signal handler after doing the pclose(). Oops, you're right. It works because less sets it's own signal handler, as does more. Sure, the status is inherited but I doubt there's a pager out there that doesn't set its own signal handlers. Still, best do it properly. > If I were to take that approach, I'd do the check immediately after > writing to the pipe for sure, and possibly after reading from the > database handle. I'd have to look at the code to see what's going on > and thus where the best places to stick the checks are. But in > general, you want to put a check anywhere within the loop that is > immediately after a blocking operation such as a read or write. > However, that said... Well, the pager is only run after all the data has been collected from the server so none of this is an issue while the query is being processed. Send the cancel request and toss the data. Once we start the pager just check after each line is printed. > The semantics for handling SIGINT here should be pretty much the same > as they are when there's no pager at all (which I know works because I You can't do a pclose in a signal handler, it's not one of the "reeentrant safe" functions and could lead to deadlocks. The signal manpage documents the ones you can use. Just set a flag. Setting the descriptor to NULL is worse because then we have check before every output function. fprintf(NULL, ...) will segfault on most architechtures I wager. Also, the signal handler can't free the memory, there's way too many arrays. A flag allows the output functions to exit sanely. > If you're hitting SIGQUIT a lot then perhaps you need to assign it to > a different key. :-) I use SIGQUIT to get out of the situation you get currently when you press Ctrl-C to interrupt less. Somehow all the "q"s go to psql and all the Ctrl-Ds to less. SIGQUIT just kills them both from where I can clean the terminal up. So I will press SIGQUIT less once we fix up psql to not do stupid things on SIGINT. :) Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgp3AL1cKVSLD.pgp Description: PGP signature
Re: [HACKERS] [GENERAL] Call for translators
Peter Eisentraut wrote: > As the release of PostgreSQL 8.1 draws near, it is once again time to update > the translations of the program messages. To check the status of your > language, check out this web site: > > http://developer.postgresql.org/~petere/nlsstatus/ > > To participate in the translation effort, please follow the instructions set > out on that page, or contact me directly if you have further questions. > Is there any particular reason why there is no translation in Bulgarian? If it's only the lack of contributors, you can count me as the first one. Ivan Pavlov ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Question about Ctrl-C and less
On 2005-10-19, Kevin Brown <[EMAIL PROTECTED]> wrote: > Making assumptions about what the pager will do upon receipt of SIGINT > is folly as well. > > Setting up SIGINT to be ignored may be the right answer (I don't > believe it is -- see below), but if so then it needs to be done > properly. If it gets ignored prior to the popen(), then the child > will also end up ignoring it by default, because signal disposition is > inherited by child processes. If we ignore SIGINT, it should be after > the popen(), not before. I do not believe it is possible to do the signal disposition correctly and still use popen() to run the pager. (You would need to reimplement popen using raw syscalls.) > So I think the right answer here is for psql to handle SIGINT > internally by doing a pclose() first The chances that psql can do this safely approach zero. pclose() is not a signal-safe function, so it can only be called from a signal handler if you _know_ that the signal did not interrupt any non-signal-safe function. (Nor can the signal handler longjmp out in such a case, unless the code is never again going to call any unsafe function.) -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] pg_dump permissions
On 2005-10-19, Christopher Kings-Lynne <[EMAIL PROTECTED]> wrote: > Suppose I have a database with a table in it. I try to dump that > database. The user I dump as is not the owner of that table nor does it > have the SELECT permission on that table. > > Why can't I do a schema-only dump of that table? I find that I need > SELECT permission on the table, even though my user can happily query > the catalogs. pg_dump takes an access share lock on everything it touches, to avoid having the schema change out from under it. -- Andrew, Supernews http://www.supernews.com - individual and corporate NNTP services ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Question about Ctrl-C and less
Martijn van Oosterhout wrote: > On Tue, Oct 18, 2005 at 09:32:25PM -0700, Kevin Brown wrote: > > So I think the right answer here is for psql to handle SIGINT > > internally by doing a pclose() first (and at this point, it probably > > should ignore SIGINT altogether), then returning to the main loop > > (and, of course, cleaning up anything that needs it along the way). > > If the child hasn't exited then pclose() will block until it has. The > > end result should be the semantics you want: if psql is in the middle > > of sending a bunch of rows of output to the pager, this will interrupt > > that process. If the pager remains running then it will hopefully > > give the user the ability to scroll through whatever results were sent > > to it. > > That's what I meant by "more comprehensive patch". Basically, the > longjmp has to go because it leaks memory (and file descriptors) and > doesn't allow you to control things at the right level. My little patch > basically set the signal handler *after* the popen so everything works > as expected. It does? It looked to me like it was setting the signal handler to SIG_IGN before doing the popen(), and resetting it to the internal signal handler after doing the pclose(). Am I examining the wrong patch? > My plan is to have the interrupt handler set a flag "control_c_pressed" > and check it at strategic points. Then memory can be deallocated and > returned properly. It's a lot more invasive and the corners cases will > be "interesting". If I were to take that approach, I'd do the check immediately after writing to the pipe for sure, and possibly after reading from the database handle. I'd have to look at the code to see what's going on and thus where the best places to stick the checks are. But in general, you want to put a check anywhere within the loop that is immediately after a blocking operation such as a read or write. However, that said... The semantics for handling SIGINT here should be pretty much the same as they are when there's no pager at all (which I know works because I can interrupt the output anytime and get the psql prompt back). In fact, this suggests that the proper course of action would be to store the pipe's file descriptor someplace accessible to the normal SIGINT handler, and have the SIGINT handler check its value upon receipt of SIGINT. If the value is nonzero then pclose() it and then proceed as before (and zero the descriptor after any pclose() so that the handler always does the right thing). If there's any additional memory that has to be freed, then do that too, of course. That'll take care of the descriptor leak and it'll get you the right semantics as a nice bonus. > Your point about SIGQUIT is valid. I didn't include it in my patch > since I wasn't sure what the expected behaviour would be. If I got core > files everytime I pressed SIGQUIT, I'd have a lot of core files > scattered around my disk; one of the reasons I disable core files by > default. If you're hitting SIGQUIT a lot then perhaps you need to assign it to a different key. :-) Seriously, you shouldn't be using SIGQUIT unless you want a core file. Otherwise, use SIGINT. If the application is unresponsive to SIGINT, my normal course of action is to suspend it via job control signals (i.e., send SIGTSTP) and then send it a SIGTERM via the kill command. If that doesn't do it, then it gets a SIGKILL. And if it doesn't stop upon receipt of SIGTSTP I kill it externally and complain to its developers (assuming it isn't hung in some uninterruptible sleep in the kernel or something). > OTOH, if I wanted to trap psql, I'd run it under a debugger or attach > one which would catch SIGQUIT even if the program ignores it. Suppose it's hung on a really hard to reproduce condition? Point being that you don't necessarily know ahead of time that you're going to want a core file. Otherwise SIGQUIT wouldn't have the default semantics it has, and it wouldn't have a terminal key associated with it. The guys that designed this stuff knew what they were doing. -- Kevin Brown [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] collector/autovacuum after crash (8.1beta3)
Hi All, At system crash or poweroff the autovacuum statistics will be lost, because this statistics only stored in RAM and saved/restored at service shutdown/startup. I think it should be saved periodically and not to be deleted after crash. -Sygma ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] 2nd try @NetBSD/2.0 Alpha
On Tue, Oct 18, 2005 at 10:07:26PM -0500, Larry Rosenman wrote: > >...or attach with a debugger like gdb. > > d'oh. I go stupid occasionally :) > > If someone wants me to, I can try that. Yes, actually. See, its dying in the seg test already with: -- Open intervals SELECT '0..'::seg AS seg; ! ERROR: floating-point exception ! DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, s uch as division by zero. SELECT '0...'::seg AS seg; ! ERROR: floating-point exception ! DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, s uch as division by zero. You need to attach a debugger to find out where that error is actually happenening. Just startup the backend, connect to it and connect gdb to the newly spawned backend and just run that query by hand. Then you should get the backtrace at SIGFPE. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgp437RUDPiCT.pgp Description: PGP signature
Re: [HACKERS] Question about Ctrl-C and less
On Tue, Oct 18, 2005 at 09:32:25PM -0700, Kevin Brown wrote: > So I think the right answer here is for psql to handle SIGINT > internally by doing a pclose() first (and at this point, it probably > should ignore SIGINT altogether), then returning to the main loop > (and, of course, cleaning up anything that needs it along the way). > If the child hasn't exited then pclose() will block until it has. The > end result should be the semantics you want: if psql is in the middle > of sending a bunch of rows of output to the pager, this will interrupt > that process. If the pager remains running then it will hopefully > give the user the ability to scroll through whatever results were sent > to it. That's what I meant by "more comprehensive patch". Basically, the longjmp has to go because it leaks memory (and file descriptors) and doesn't allow you to control things at the right level. My little patch basically set the signal handler *after* the popen so everything works as expected. My plan is to have the interrupt handler set a flag "control_c_pressed" and check it at strategic points. Then memory can be deallocated and returned properly. It's a lot more invasive and the corners cases will be "interesting". Your point about SIGQUIT is valid. I didn't include it in my patch since I wasn't sure what the expected behaviour would be. If I got core files everytime I pressed SIGQUIT, I'd have a lot of core files scattered around my disk; one of the reasons I disable core files by default. OTOH, if I wanted to trap psql, I'd run it under a debugger or attach one which would catch SIGQUIT even if the program ignores it. Anyway, thanks for the response, hopefully we can get this sorted out in a later release... -- Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. pgpZo9xP9YYFB.pgp Description: PGP signature
[HACKERS] Segv in pg_autovacuum in 8.0.x
Apologies if this is old news, but pg_autovacuum in 8.0.x has the bad habit of SEGVing and exiting when a table gets dropped out from under it. This creates problems if you rely on pg_autovacuum for the bulk of your vacuuming as it forgets it's statistics when it is restarted and so will skip some desireable vacuums. I have a patch for this that should apply against any 8.0.x if there is any interest. I looked at the new autovacuum in 8.1 and it appears from casual inspection not to have the same problem. -dg -- David Gould [EMAIL PROTECTED] If simplicity worked, the world would be overrun with insects. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
[HACKERS] pg_dump permissions
Suppose I have a database with a table in it. I try to dump that database. The user I dump as is not the owner of that table nor does it have the SELECT permission on that table. Why can't I do a schema-only dump of that table? I find that I need SELECT permission on the table, even though my user can happily query the catalogs. PostgreSQL version: 7.4 Chris ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] SIGSEGV in Postgres 8.0.3 (libpq4)
Anand Kumria wrote: Hi, I have a set of perl scripts which invoke each other (via system()); eventually I found that they were crashing and ultimately causing Perl to SIGSEGV. I am using Debian testing and have recompiled postgres-8.0.3-15 to include debugging symbols. This is a dual-CPU dual-stacked IPv4/IPv6 host running Linux 2.4.22. From what I have been able to determine the problem is in libpq4: Starting program: /usr/bin/perl ./add_address 265 2001:502:d399:0:0:0:0:44 [Thread debugging using libthread_db enabled] [New Thread 16384 (LWP 8523)] (no debugging symbols found) Program received signal SIGSEGV, Segmentation fault. [Switching to Thread 16384 (LWP 8523)] The first thing the developers will ask for is a reproducible test-case. Could you reduce "add_address" to the minimum? The next thing they'll ask is whether you built with --enable-thread-safety turned on. http://www.postgresql.org/docs/8.0/static/libpq-threading.html -- Richard Huxton Archonet Ltd ---(end of broadcast)--- TIP 6: explain analyze is your friend