Re: [HACKERS] Selecting a constant question: A summary
> For some Unicode character sets, element_width can be as much as 4 In UTF8 one char can be up to 6 bytes, so 4 is not correct in general. Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question: A summary
"Chuck McDevitt" <[EMAIL PROTECTED]> writes: > Just a curiosity question: Why is the type of a literal '1' "unknown" > instead of varchar(1)? Even if it was assigned a text datatype it would be the unconstrainted "text" not varchar(1). If we used varchar(1) then things like: create table foo as select 'foo'; would create a table with varchar(3) which would then complain if you tried to insert 'foobar' into. There doesn't seem to be enough evidence that the user intended to constrain the input to just 'foo' in that case. Of course right now you get a table with a column of type "unknown" which is very unlikely to be what the user expects. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question: A summary
On Wed, Jun 13, 2007 at 02:12:37AM -0400, Chuck McDevitt wrote: > So, where x = '(1,2)' might be legal for comparing to x, but a field of > type varchar(5) might not be, as in where x = y, where y is type > varchar(5) containing '(1,2)'. Normally, just about every type can be converted to or from text. So if postgres converted to varchar first you have problems with the statement x='const' where x is type "foo". It's now ambiguous since you either convert x to varchar or the constant to "foo". Instead, postgres marks the constant as unknown and now it always gets converted because nothing can convert to unknown. Thus far this system has worked excellently, though not perfectly obviously. It matches people's expectations well, which is the most important part. > I find PostgreSQL's handling of this strange, as I come from systems > where 'xxx' is either a varchar or char type, in all contexts, and > implicit casts handle any needed conversions. > But now I understand why it does things this way. User-defined types makes for lots of interesting choices, but they are by far the most powerful feature of postgres and we don't want to mess with that. Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Selecting a constant question: A summary
I see... PostgreSQL wants to guess the datatype, given no clear syntactic information, and perhaps a varchar(n) wouldn't be a valid cast to some of the possible datatypes. So, where x = '(1,2)' might be legal for comparing to x, but a field of type varchar(5) might not be, as in where x = y, where y is type varchar(5) containing '(1,2)'. (Time values don't have this problem in pure ANSI SQL, since the literal is TIME '12:34', but I can see for user types it might be ambiguous). I find PostgreSQL's handling of this strange, as I come from systems where 'xxx' is either a varchar or char type, in all contexts, and implicit casts handle any needed conversions. But now I understand why it does things this way. Thanks. -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 12, 2007 9:50 PM To: Chuck McDevitt Cc: Andrew Hammond; Josh Berkus; pgsql-hackers@postgresql.org; Dann Corbit; Larry McGhaw Subject: Re: [HACKERS] Selecting a constant question: A summary "Chuck McDevitt" <[EMAIL PROTECTED]> writes: > Just a curiosity question: Why is the type of a literal '1' "unknown" > instead of varchar(1)? Because, for instance, it might be intended as an integer or float or numeric value. Change the content a little, like '(1,2)' or '12:34', and maybe it's a point or time value. There are plenty of contexts in which the intended type of a literal is obviously not text/varchar. We assign unknown initially as a way of flagging that the type assignment is uncertain. Once we have a value that we think is varchar (a table column for instance), the rules for deciding to cast it to a different type get a lot more stringent. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question
Hannu Krosing wrote: Ühel kenal päeval, T, 2007-06-12 kell 13:40, kirjutas Larry McGhaw: For what its worth .. Your statement about why we are the first people to mention this problem really got me thinking. Anyone who would attempt to write an ODBC driver for Postgres would run into the exact same issue. So I installed the official Postgres ODBC driver, and ran the identical query and here are my results: I probably should have looked at this first There is a whole Postgres ODBC dialog dedicated to the very subject of this thread: Handling of "unknown" data sizes. The pgodbc driver is configured to treat unknowns as varchar(255) by default, As shown by my example below. This can be configured up or down as desired. BTW, what is the reason you are writing your own ODBC driver ? They aren't I don't think. I think they are using the ODBC driver as an example. Joshua D. Drake What problems in the official one are you trying to solve ? -- Hannu ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question: A summary
"Chuck McDevitt" <[EMAIL PROTECTED]> writes: > Just a curiosity question: Why is the type of a literal '1' "unknown" > instead of varchar(1)? Because, for instance, it might be intended as an integer or float or numeric value. Change the content a little, like '(1,2)' or '12:34', and maybe it's a point or time value. There are plenty of contexts in which the intended type of a literal is obviously not text/varchar. We assign unknown initially as a way of flagging that the type assignment is uncertain. Once we have a value that we think is varchar (a table column for instance), the rules for deciding to cast it to a different type get a lot more stringent. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question: A summary
Just a curiosity question: Why is the type of a literal '1' "unknown" instead of varchar(1)? Wouldn't varchar(1) cast properly to any use of the literal '1'? What is the benefit of assuming it's an unknown? ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Selecting a constant question
Ühel kenal päeval, T, 2007-06-12 kell 13:40, kirjutas Larry McGhaw: > For what its worth .. Your statement about why we are the first people > to mention this problem really got me thinking. Anyone who would > attempt to write an ODBC driver for Postgres would run into the exact > same issue. So I installed the official Postgres ODBC driver, and ran > the identical query and here are my results: > > I probably should have looked at this first There is a whole > Postgres ODBC dialog dedicated to the very subject of this thread: > Handling of "unknown" data sizes. The pgodbc driver is configured to > treat unknowns as varchar(255) by default, > As shown by my example below. This can be configured up or down as > desired. BTW, what is the reason you are writing your own ODBC driver ? What problems in the official one are you trying to solve ? -- Hannu ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question: A summary
"Andrew Hammond" <[EMAIL PROTECTED]> writes: > - Implicit casting of unknown to char(n) or anything else seems rather > sketchy to me, but I can't see any specific objection, except that... > - I don't know when the right time to do the cast is. And doing it too > early seems obviously wrong. Well, I don't see any reason that we'd consider an implicit cast to char(N) without context to drive us in that direction. The system is currently biased to prefer casts to text. You could make a reasonable case for forcing a cast to text if the constant's type is still unresolved at the end of parsing, and indeed people have proposed that off and on just so that clients would have one less type to think about. In itself it doesn't do anything for Dann's problem though, because unspecified width is unspecified width. I've been thinking lately about trying harder to unify the text and varchar types; I'm not sure about details yet, except that text should be *exactly* the same thing as unconstrained-width varchar, rather than almost the same except we claim it's a different type. The reason I'd been thinking about this was mainly to get rid of the complexity and runtime overhead that comes from having RelabelType nodes all over the place when someone uses varchar instead of text. But if we did that, we could also arrange that unknown literals coerce to varchar(N) with N equal to their actual width, rather than coercing to text, and not create any weird corner-case behaviors in the type system. But at the end of the day this all would only solve Dann's problem for the specific case of a SELECT with an undecorated literal constant in its target list. He's still going to have to deal with unknown-width columns in an enormous variety of cases, and so I completely fail to see the point of changing the system's behavior for this one case. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question: A summary
Josh Berkus <[EMAIL PROTECTED]> writes: >> What's the point? You keep reminding us that your code is middleware >> that can't assume anything much about the queries you're dealing with. > Hmmm? I thought that Dann was just talking about constants, and not column > results. Am I confused? Well, the specific example he was on about was a constant, but I don't think it does him any good for us to fix just that one case. He'll still have to deal with columns of indeterminate width in a whole lot of other cases. If there were a reasonable path for us to report a useful width bound in *every* case, then I could see spending time on it ... but there's not. BTW, it would certainly be trivial to extend libpq to report the actual max width of a column within an already-retrieved PGresult. This isn't anything the client code can't compute for itself, of course, but libpq could get it in somewhat fewer cycles. However, I'm under the impression that Dann wants the number at statement prepare time, and we simply haven't got the information then. > I'd think it would be possible to do this in an abstract way ... having a > "DisplayLength()" call for each data type and value. That would require > casting the constant, though, or computing all uncast constants as text. No, the point is about predicting the max width of a column of a query result in advance of actually running the query. After you've got the values in hand, it's not a very interesting problem. Before, well, consider these examples: select repeat(text_col, int_col) from my_table; select repeat(text_col, int_col * random()) from my_table; select repeat(text_col, some_user_defined_function(int_col)) from my_table; The problem's really not soluble unless you want to dumb Postgres down to approximately the capabilities of SQL89 -- no user-defined functions, let alone user-defined types, plus pull out a whole lot of the built-in functions that don't have readily predictable result widths. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Selecting a constant question: A summary
On 6/12/07, Josh Berkus <[EMAIL PROTECTED]> wrote: Tom, > What's the point? You keep reminding us that your code is middleware > that can't assume anything much about the queries you're dealing with. > Therefore, I see no real value in fixing up one corner case. Your > argument about space allocation falls to the ground unless we can > provide a guaranteed, and usefully tight, upper bound on the column > width in *every* situation. If we cannot (which we can't), you're still > going to need those client-side "kluges". Hmmm? I thought that Dann was just talking about constants, and not column results. Am I confused? > BTW, the reason I'm resistant to even thinking about this is that > Postgres is designed as an extensible system. Trying to do what you > want is not a matter of fixing literal constants and concatenation > and one or two other places --- it's a matter of imposing a new and > potentially hard-to-meet requirement on every datatype under the sun, > including a lot of user-written code that we don't control and would > break by adding such a requirement. So it's not even likely that we'd > think very hard about making this work, let alone actually do it. I'd think it would be possible to do this in an abstract way ... having a "DisplayLength()" call for each data type and value. That would require casting the constant, though, or computing all uncast constants as text. The simplest formulation of this problem appears to be that constant strings that are uncast are treated as type unknown. The connx guys seem to think that they should be implicitly cast to char(n) where n is the length of the string. Is that a reasonable description, or are you guys looking for something more general? If you're just talking about the strings, then here are the thoughts I've gleaned from the preceding thread. - This makes possible some performance tweaks for drivers - It achieves spec compliance (albeit for a stupid part of the spec) - Implicit casting of unknown to char(n) or anything else seems rather sketchy to me, but I can't see any specific objection, except that... - I don't know when the right time to do the cast is. And doing it too early seems obviously wrong. - This only helps in corner case of string constants that are 1. not already cast and 2. not manipulated in any way And that seems like a very small corner case with little or no practical use. I guess if you have some code that turns query output into some flavor of pretty-print, it'd make sense to have a constant column as output of a CASE statement or something. - The corner case must already be correctly handled by the general case for arbitrary sized text, or alternatively phrased: there is no way to conform to the standard while supporting arbitrary sized text. Unless you're willing to pay the cost of scanning twice, or maintaining "biggest entry" data for each variable length column. - I don't know how much effort it would require to implement this, nor how much complexity it would add to the code base. Clearly both of these would be non-zero values. Given the above, I agree with Tom: this seems like corner case where the returns are marginal at best, compared to the cost to implement and maintain. Is there something I'm getting wrong in this summary? Andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question: A summary
Tom, > What's the point? You keep reminding us that your code is middleware > that can't assume anything much about the queries you're dealing with. > Therefore, I see no real value in fixing up one corner case. Your > argument about space allocation falls to the ground unless we can > provide a guaranteed, and usefully tight, upper bound on the column > width in *every* situation. If we cannot (which we can't), you're still > going to need those client-side "kluges". Hmmm? I thought that Dann was just talking about constants, and not column results. Am I confused? > BTW, the reason I'm resistant to even thinking about this is that > Postgres is designed as an extensible system. Trying to do what you > want is not a matter of fixing literal constants and concatenation > and one or two other places --- it's a matter of imposing a new and > potentially hard-to-meet requirement on every datatype under the sun, > including a lot of user-written code that we don't control and would > break by adding such a requirement. So it's not even likely that we'd > think very hard about making this work, let alone actually do it. I'd think it would be possible to do this in an abstract way ... having a "DisplayLength()" call for each data type and value. That would require casting the constant, though, or computing all uncast constants as text. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question: A summary
As Dann pointed out we were one of the first companies to port Postgres to windows many many years ago (7.1 days), and part of that porting work is in the current postgresql product. As I pointed out in a prior post, for the ODBC specification at least (probably others), a maximum upper bound on returned data *must* be reported and determined ahead of time when using binding ... A technique where the client application allocates memory for the data and supplies a pointer to that memory location for the driver. Postgres unlike other databases shifts the burden of determining this maximum size to the client and/or driver. Our company specializes in access to wide variety of databases, both relational and non relational, including SQL Server, Oracle, DB2, Sybase, Informix, etc. Postgres sticks out as the only database that we have encountered with this behavior .. Which is why we posted the original message. Also as Dann pointed out even if this issue was addressed, it does not help us because every existing installation of Postgres has the metadata bug, so we *have* to bandaid it at the client/driver level anyway. At least we have a record of the issue, so the next time a developer in the community runs across the same oddity hopefully they will find this and won't be scratching their heads like we were for a bit :) Thanks lm -Original Message- From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 12, 2007 3:11 PM To: Dann Corbit Cc: pgsql-hackers@postgresql.org; Larry McGhaw Subject: Re: [HACKERS] Selecting a constant question: A summary "Dann Corbit" <[EMAIL PROTECTED]> writes: > In the case of a SELECT query that selects a fixed constant of any > sort, it would be a definite improvement for PostgreSQL to give some > sort of upper maximum. What's the point? You keep reminding us that your code is middleware that can't assume anything much about the queries you're dealing with. Therefore, I see no real value in fixing up one corner case. Your argument about space allocation falls to the ground unless we can provide a guaranteed, and usefully tight, upper bound on the column width in *every* situation. If we cannot (which we can't), you're still going to need those client-side "kluges". In my opinion, variable-length data is a fact of life and you should endeavor to make your code deal with it gracefully. There are bits of the SQL spec that assume fixed-width data specifications are useful, but to be blunt that's all a holdover from 1960s 80-column-punch-card thinking. It's no way to design a modern application. BTW, the reason I'm resistant to even thinking about this is that Postgres is designed as an extensible system. Trying to do what you want is not a matter of fixing literal constants and concatenation and one or two other places --- it's a matter of imposing a new and potentially hard-to-meet requirement on every datatype under the sun, including a lot of user-written code that we don't control and would break by adding such a requirement. So it's not even likely that we'd think very hard about making this work, let alone actually do it. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question: A summary
"Dann Corbit" <[EMAIL PROTECTED]> writes: > In the case of a SELECT query that selects a fixed constant of any sort, > it would be a definite improvement for PostgreSQL to give some sort of > upper maximum. What's the point? You keep reminding us that your code is middleware that can't assume anything much about the queries you're dealing with. Therefore, I see no real value in fixing up one corner case. Your argument about space allocation falls to the ground unless we can provide a guaranteed, and usefully tight, upper bound on the column width in *every* situation. If we cannot (which we can't), you're still going to need those client-side "kluges". In my opinion, variable-length data is a fact of life and you should endeavor to make your code deal with it gracefully. There are bits of the SQL spec that assume fixed-width data specifications are useful, but to be blunt that's all a holdover from 1960s 80-column-punch-card thinking. It's no way to design a modern application. BTW, the reason I'm resistant to even thinking about this is that Postgres is designed as an extensible system. Trying to do what you want is not a matter of fixing literal constants and concatenation and one or two other places --- it's a matter of imposing a new and potentially hard-to-meet requirement on every datatype under the sun, including a lot of user-written code that we don't control and would break by adding such a requirement. So it's not even likely that we'd think very hard about making this work, let alone actually do it. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Selecting a constant question: A summary
Dan, > Secondly, we > believe that we should treat the customers the way that we want to be > treated. > I think that the PostgreSQL group has managed the first objective, but > not the second. I just read this whole thread, and I feel that the sort of comment above is completely unjustified, and counterproductive to your goal of adding a feature to PostgreSQL which will make your driver work better. You'll be a lot more likely to persuade people in the community to work with you if you're not trying to convince them to change the project culture at the same time. You are on the developer mailing list for an open, community-based open source project and *not* a commercial company. Therefore we do not have "customers" and your paradigm is wrong. The PostgreSQL developers *are* treating you exactly has they expect to be treated; as a developer, meaning that you argue things out and defend your desire for a change. If you read anybody else's discussion on this list you'll see that's how everyone else interacts. If anything you've gotten more than your fair share of attention ... 40+ posts from 1/2 dozen senior developers in less than 48 hours! If you would prefer a more formal customer-vendor relationship, then I suggest that you sign up as a customer of EnterpriseDB, Red Hat, Sun, Command Prompt, SRA etc. or similar. Now, that aside: > According to SQL/CLI and ODBC 3.5, we should bind the length of a > character column. This is a much better approach. Standards are always nice. > But it would be very nice if the database could provide a good estimate > for us so that PostgreSQL could work like all of the other database > systems. Code full of kludges is harder to maintain. Do you have any information about how binding works in other databases? A clear roadmap would make it easier for eventual developer implementation, and obviously this is a solved problem elsewhere. > And so I hope that we can get off on a better foot this time. If the > answer is "No, the priority for this sort of thing is low, and we do not > consider it important for our customers." Again, we don't have "customers". So your desire to implement a change in behavior is dependant on: 1. Getting this list to agree on the specification; 2. Convincing an *individual* PostgreSQL developer or contributing company that this issue is in their high priority interest to fix, OR Fixing it yourself and submitting the patch to PostgreSQL.org. > Then we will have to work > around it. Hopefully, at least, it will get put into a queue of future > enhancements. Getting it on the TODO list is a good first step. However, that doesn't get it implemented until it becomes some other developer's problem as well. -- Josh Berkus PostgreSQL @ Sun San Francisco ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Selecting a constant question: A summary
Dann Corbit wrote: First a comment: At CONNX Solutions Inc., we believe sincerely that we should do whatever is necessary to make our customers prosper. This means creation of excellent tools and being responsive to customer needs. Secondly, we believe that we should treat the customers the way that we want to be treated. I think that the PostgreSQL group has managed the first objective, but not the second. Of course, that is only an opinion, but I think that success hinges on both factors. Our objective in this issue has also been to improve PostgreSQL so that it can become more useful to the end users and not to denigrate the work of the engineers that have toiled on it. I will also admit that frustration has caused our tone to become sharp at times. This is clearly a mistake on our part and for this, I apologize. Woah, now this is interesting. This morning, I read this whole thread, wondering what in the world could possibly be taking so long ;). I will admit that many of us in the community tend to try to provide a solution without actually understanding the problem. I think it kind of comes with the territory, a lot of times it seems like FOSS is all about the work around versus the solution because the solution takes longer. Perception is a powerful thing. Personally, I didn't see any of the community doing anything but trying their best to help you with the problem you were experiencing. What I did see, is a lot of tenseness from your side, to what basically amounts to "free" support. Remember that we are here, at no cost to you. Lastly, the PostgreSQL community doesn't have customer. You have customer, CMD has customers, the PostgreSQL community does not. The best correlation I can give you is this. We (the community) are all a team. You are part of that team. We are not your vendor. Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Larry McGhaw > Sent: Tuesday, June 12, 2007 1:40 PM > To: Martijn van Oosterhout > Cc: Andrew Dunstan; Hannu Krosing; Tom Lane; Alvaro Herrera; Dann Corbit; > Gregory Stark; pgsql-hackers@postgresql.org > Subject: RE: [HACKERS] Selecting a constant question > > For what its worth .. Your statement about why we are the first people to > mention this problem really got me thinking. Anyone who would attempt to > write an ODBC driver for Postgres would run into the exact same issue. > So I installed the official Postgres ODBC driver, and ran the identical > query and here are my results: > > I probably should have looked at this first There is a whole Postgres > ODBC dialog dedicated to the very subject of this thread: > Handling of "unknown" data sizes. The pgodbc driver is configured to > treat unknowns as varchar(255) by default, > As shown by my example below. This can be configured up or down as > desired. > > SQLExecDirect: > In: hstmt = 0x003C18E0, szSqlStr = "Select a,b,c, '123' , '123'::char(3), > '123'::varchar(3) from...", cbSqlStr = -3 > Return: SQL_SUCCESS=0 > > Describe Column All: > icol, szColName, *pcbColName, *pfSqlType, *pcbColDef, *pibScale, > *pfNullable > 1, a, 1, SQL_VARCHAR=12, 20, 0, SQL_NULLABLE=1 > 2, b, 1, SQL_CHAR=1, 10, 0, SQL_NULLABLE=1 > 3, c, 1, SQL_INTEGER=4, 10, 0, SQL_NULLABLE=1 > 4, ?column?, 8, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1 > 5, bpchar, 6, SQL_CHAR=1, 3, 0, SQL_NULLABLE=1 > 6, varchar, 7, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1 > > From psqlodbc.h > > #define MAX_VARCHAR_SIZE 255 /* default maximum size of >* varchar fields (not including null > term) */ > > So I guess the bottom line is that we are not the first to encounter this > problem .. Its just been covered up by assigning > An arbitrary maximum size .. So I guess we will do the same and make it > configurable like the official postgres driver. Of course, the downside here is that choosing a default will truncate the data when the actual data is larger than the default chosen. ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] Selecting a constant question: A summary
First a comment: At CONNX Solutions Inc., we believe sincerely that we should do whatever is necessary to make our customers prosper. This means creation of excellent tools and being responsive to customer needs. Secondly, we believe that we should treat the customers the way that we want to be treated. I think that the PostgreSQL group has managed the first objective, but not the second. Of course, that is only an opinion, but I think that success hinges on both factors. Our objective in this issue has also been to improve PostgreSQL so that it can become more useful to the end users and not to denigrate the work of the engineers that have toiled on it. I will also admit that frustration has caused our tone to become sharp at times. This is clearly a mistake on our part and for this, I apologize. Next, the problem: According to SQL/CLI and ODBC 3.5, we should bind the length of a character column. Here are some references from the relevant documentation (SQL/CLI and ODBC are clones of one another): == ANSI/ISO/IEC 9075-3-1999 for Information Technology Database Language SQL Part 3: Call-Level Interface (SQL/CLI) Section 6.5 BindCol Along with function SQLBindCol from the ODBC specification http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/ht m/odbcsqlbindcol.asp This bit should be sufficient to explain what we are after: "BufferLength [Input] Length of the *TargetValuePtr buffer in bytes. The driver uses BufferLength to avoid writing past the end of the *TargetValuePtr buffer when returning variable-length data, such as character or binary data. Note that the driver counts the null-termination character when returning character data to *TargetValuePtr. *TargetValuePtr must therefore contain space for the null-termination character or the driver will truncate the data. When the driver returns fixed-length data, such as an integer or a date structure, the driver ignores BufferLength and assumes the buffer is large enough to hold the data. It is therefore important for the application to allocate a large enough buffer for fixed-length data or the driver will write past the end of the buffer. SQLBindCol returns SQLSTATE HY090 (Invalid string or buffer length) when BufferLength is less than 0 but not when BufferLength is 0. However, if TargetType specifies a character type, an application should not set BufferLength to 0, because ISO CLI-compliant drivers return SQLSTATE HY090 (Invalid string or buffer length) in that case." == Now, there are times when (according to the spec) we have to defer binding. However, this causes great problems for end user tools and should only be done in what is basically a dire emergency. In the case of a SELECT query that selects a fixed constant of any sort, it would be a definite improvement for PostgreSQL to give some sort of upper maximum. For example: SELECT "Eastern Division", sum(Inventory_level), sum(Inventory_backorder), Manager_last_name FROM WHERE division_id = 9 GROUP BY Manager_last_name Will return 3 columns of data. The first column is of unknown length. Imagine if you are a spreadsheet in OpenOffice: http://www.openoffice.org/ which happens to support ODBC connections. You would like to fill out a report for the president of your company. Unfortunately, the first column is of "unknown length" That makes it a bit difficult to format this spreadsheet. Now, I will admit that we may not know a-priori if "Eastern Division" is character or Unicode or MBCS. But in the worst case scenario it will be (16 + 1) * element_width bytes in length. For some Unicode character sets, element_width can be as much as 4, so that leaves 68 octets as an upper possible maximum. Now, you might protest, 68 bytes might be much too large. That is true, but I know that if I allocate 68 bytes we will not have data truncation. It is no worse than a varchar(255) field that has a largest item 15 characters wide in it. The grid will successfully bind and we will be able to produce the report. Generally speaking, grids are smart enough to automatically resize themselves to max_length(grid_column_title, grid_column_data) and so the report will look very nice. It is also true that it is possible for us to work around the problem. We certainly can know the exact type information about the constants in our queries and reformat the PostgreSQL queries to decorate them with things like: SELECT "Eastern Division"::char(16), sum(Inventory_level)::Numeric(16,4), sum(Inventory_backorder) ::Numeric(16,4), Manager_last_name FROM WHERE division_id = 9 GROUP BY Manager_last_name But it would be very nice if the database could provide a good estimate for us so that PostgreSQL could work like all of the other database systems. Code full of kludges is ha
Re: [HACKERS] Selecting a constant question
For what its worth .. Your statement about why we are the first people to mention this problem really got me thinking. Anyone who would attempt to write an ODBC driver for Postgres would run into the exact same issue. So I installed the official Postgres ODBC driver, and ran the identical query and here are my results: I probably should have looked at this first There is a whole Postgres ODBC dialog dedicated to the very subject of this thread: Handling of "unknown" data sizes. The pgodbc driver is configured to treat unknowns as varchar(255) by default, As shown by my example below. This can be configured up or down as desired. SQLExecDirect: In: hstmt = 0x003C18E0, szSqlStr = "Select a,b,c, '123' , '123'::char(3), '123'::varchar(3) from...", cbSqlStr = -3 Return: SQL_SUCCESS=0 Describe Column All: icol, szColName, *pcbColName, *pfSqlType, *pcbColDef, *pibScale, *pfNullable 1, a, 1, SQL_VARCHAR=12, 20, 0, SQL_NULLABLE=1 2, b, 1, SQL_CHAR=1, 10, 0, SQL_NULLABLE=1 3, c, 1, SQL_INTEGER=4, 10, 0, SQL_NULLABLE=1 4, ?column?, 8, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1 5, bpchar, 6, SQL_CHAR=1, 3, 0, SQL_NULLABLE=1 6, varchar, 7, SQL_VARCHAR=12, 255, 0, SQL_NULLABLE=1 From psqlodbc.h #define MAX_VARCHAR_SIZE255 /* default maximum size of * varchar fields (not including null term) */ So I guess the bottom line is that we are not the first to encounter this problem .. Its just been covered up by assigning An arbitrary maximum size .. So I guess we will do the same and make it configurable like the official postgres driver. Thanks lm -Original Message- From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 12, 2007 10:43 AM To: Larry McGhaw Cc: Andrew Dunstan; Hannu Krosing; Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question Hi, Nobody is tring to attack anyone, but we're all surprised this is an issue since you're the first person to have mentioned it. I have some a query to test below: On Tue, Jun 12, 2007 at 10:21:09AM -0700, Larry McGhaw wrote: > We noticed inexplicably that when we used a constant with a postgres > query, our records per second dropped From 60,000 records per second > to 600 records per second, so we started digging into the issue. > > We discovered that libpq was not describing the metadata properly for > the constant column, and it appears That the 3rd party grid control > was relying on that metadata somehow .. > The bottom line is that there was > A huge performance drag. What I don't understand is *why* it's complaining about the constant column and not, for example, any other variable length column. There are a very small number of cases where a useful length is returned, 99% of the time it doesn't, yet you're obviously not get any performance problems there. Just a quick test, does the problem go away if you do: SELECT '1'::varchar FROM table; If that fixes it then the bug is (probably) that the middleware thinks that a length of -2 means it's 65534 bytes long. Note, in the test query I gave, it will return -1 for the length. I don't want to blame the middleware, but I want to make sure we're diagnosing the problem correctly. If that query has the same problem, then we really need to think of something else. Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to litigate. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question
"Larry McGhaw" <[EMAIL PROTECTED]> writes: > The statement above is contrary to my actual results. The proper length > is returned in all non-const cases. > > Here is a specific example: > > test=# create table test1 ( a varchar(20), b char(10), c integer ); > CREATE TABLE It's not returning a length at all though. It's returning the typmod, ie, the thing in the parentheses above. In that respect it's perfectly correct to return -1 for the '123' case as well since it's interpreted as an unbounded string and has no maximum length. It happens to only be three characters but then the values in the table could happen to be much less than the 10 or 20 characters you declared them as. The reason you might want to get this has more to do with understanding the semantics of the data you're receiving than optimizing storage. If you queried a Numeric column you would get something very different from the length from which you could extract the maximum precision and scale. This might help you display or work with the results maintaining the precision and scale a user expects. One reason why it might be useful to add an actual measure of the expected length (Postgres does make guesses about the length for planning purposes) would be to so a driver could size buffers appropriately. For example, in psql where we use cursors to process rows, we might want to automatically use a fetch count calculated to be large enough to receive approximately one ethernet frame of data. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question
That one surprised me as well. Thanks lm -Original Message- From: Andrew Dunstan [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 12, 2007 12:00 PM To: Larry McGhaw Cc: Brian Hurt; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question Larry McGhaw wrote: > For constant '123'::varchar(3) libpq returns the following: > Pqfsize returns -1 > Pqfmod returns -1 > > That one certainly looks odd. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question
Larry McGhaw wrote: For constant '123'::varchar(3) libpq returns the following: Pqfsize returns -1 Pqfmod returns -1 That one certainly looks odd. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
>> What I don't understand is *why* it's complaining about the constant column >> and not, for example, any other variable length column. There are a very small >> number of cases where a useful length is returned, 99% of the time it doesn't, >> yet you're obviously not get any performance problems there. The statement above is contrary to my actual results. The proper length is returned in all non-const cases. Here is a specific example: test=# create table test1 ( a varchar(20), b char(10), c integer ); CREATE TABLE test=# Note .. The table is empty, and contains no data at this point: Select a, b, c, '123' , '123'::char(3), '123'::varchar(3) from test1 For column a libpq returns the following: Pqfsize returns -1 Pqfmod (-4) returns 20 For column b libpq returns the following: Pqfsize returns -1 Pqfmod (-4) returns 10 For column c libpq returns the following: Pqfsize returns 4 For constant '123' libpq returns the following: Pqfsize returns -2 Pqfmod returns -1 For constant '123'::char(3) libpq returns the following: Pqfsize returns -1 Pqfmod (-4) returns 3 For constant '123'::varchar(3) libpq returns the following: Pqfsize returns -1 Pqfmod returns -1 Thanks lm -Original Message- From: Brian Hurt [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 12, 2007 11:09 AM To: Larry McGhaw Cc: pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question Larry McGhaw wrote: >I'm really frustrated by this process I'm not trying to attack anyone >here. I'm just surprised that no one will even entertain the idea that >this is an issue that needs to be addressed. > >Instead nearly all of the responses have been attacking the >applications that rely on the metadata. > > Having been following this debate, I think what people have really been attacking is the idea that the metadata for: SELECT '1' AS varchar_column; should be different from the metadata for: SELECT varchar_column FROM really_big_table; or for: SELECT varchar_column FROM really_small_table; Or at least that's what I've taken away from the dicussion- it's not so much that the metadata shouldn't be relied on, it's that the metadata may be more generic than theoretically necessary. And that the metadata may not contain the length of a variable length field even when that length is known. Brian ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Selecting a constant question
Larry McGhaw wrote: I'm really frustrated by this process I'm not trying to attack anyone here. I'm just surprised that no one will even entertain the idea that this is an issue that needs to be addressed. Instead nearly all of the responses have been attacking the applications that rely on the metadata. Having been following this debate, I think what people have really been attacking is the idea that the metadata for: SELECT '1' AS varchar_column; should be different from the metadata for: SELECT varchar_column FROM really_big_table; or for: SELECT varchar_column FROM really_small_table; Or at least that's what I've taken away from the dicussion- it's not so much that the metadata shouldn't be relied on, it's that the metadata may be more generic than theoretically necessary. And that the metadata may not contain the length of a variable length field even when that length is known. Brian ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question
Hi, Nobody is tring to attack anyone, but we're all surprised this is an issue since you're the first person to have mentioned it. I have some a query to test below: On Tue, Jun 12, 2007 at 10:21:09AM -0700, Larry McGhaw wrote: > We noticed inexplicably that when we used a constant with a postgres > query, our records per second dropped > From 60,000 records per second to 600 records per second, so we started > digging into the issue. > > We discovered that libpq was not describing the metadata properly for > the constant column, and it appears > That the 3rd party grid control was relying on that metadata somehow .. > The bottom line is that there was > A huge performance drag. What I don't understand is *why* it's complaining about the constant column and not, for example, any other variable length column. There are a very small number of cases where a useful length is returned, 99% of the time it doesn't, yet you're obviously not get any performance problems there. Just a quick test, does the problem go away if you do: SELECT '1'::varchar FROM table; If that fixes it then the bug is (probably) that the middleware thinks that a length of -2 means it's 65534 bytes long. Note, in the test query I gave, it will return -1 for the length. I don't want to blame the middleware, but I want to make sure we're diagnosing the problem correctly. If that query has the same problem, then we really need to think of something else. Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Selecting a constant question
I'm really frustrated by this process I'm not trying to attack anyone here. I'm just surprised that no one will even entertain the idea that this is an issue that needs to be addressed. Instead nearly all of the responses have been attacking the applications that rely on the metadata. Let me back up and explain the situation. This issue came to light for us when we were using a query tool to examine performance of postgres queries. We were not only measuring the performance of the database itself, but also the TCP/IP transport, And the rendering of the data .. Comparing SQL Server, Oracle, and Postgres head to head with the same queries. We noticed inexplicably that when we used a constant with a postgres query, our records per second dropped From 60,000 records per second to 600 records per second, so we started digging into the issue. We discovered that libpq was not describing the metadata properly for the constant column, and it appears That the 3rd party grid control was relying on that metadata somehow .. The bottom line is that there was A huge performance drag. * OK ... I agree that the memory handling in the grid control could be better, but I would imagine that this issue is not an isolated to this one particular control, and that other applications and controls that rely on resultset metadata may have this issue. Bottom line, we only reported this problem because we thought you would be interested in doing everything possible to make postgres more mainstream and conform to SQL standards. In the past such suggestions have been absorbed with zeal. I have no vested interest in you improving the interface or not, and I'm not going to "plead a case" for you To do something that every other commercial database has done out of the box. It is in your hands now :) Thanks lm -Original Message- From: Andrew Dunstan [mailto:[EMAIL PROTECTED] Sent: Tuesday, June 12, 2007 4:50 AM To: Larry McGhaw Cc: Hannu Krosing; Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question Larry McGhaw wrote: > Again, the issue is not our tool, but the deficiency in libpq/postgres > ... even mysql gets its right .. why not Postgres? > > Its not hard for a database to report metadata properly. > > if I issue a sql statement: > select '123' from > the database should report that the maximum length of the 1st column > in the resultset is 3 ... it cant be any more plain than that. > > > Making assertions like this does not make your case for you. If you think it's that easy then send in a patch. I suspect that doing what you want in the cases where it could be supported would require a protocol change, with possibly an extra field in the RowDescription object. If that's true you'd need to make a very good and compelling case indeed for such a change. If this is so vital I'm curious to know why driver authors haven't been screaming about it until now. I'm not dismissing what you want, but just waving your hand and saying "it's not hard" really won't do. cheers andrew ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
Larry McGhaw wrote: Again, the issue is not our tool, but the deficiency in libpq/postgres ... even mysql gets its right .. why not Postgres? Its not hard for a database to report metadata properly. if I issue a sql statement: select '123' from the database should report that the maximum length of the 1st column in the resultset is 3 ... it cant be any more plain than that. Making assertions like this does not make your case for you. If you think it's that easy then send in a patch. I suspect that doing what you want in the cases where it could be supported would require a protocol change, with possibly an extra field in the RowDescription object. If that's true you'd need to make a very good and compelling case indeed for such a change. If this is so vital I'm curious to know why driver authors haven't been screaming about it until now. I'm not dismissing what you want, but just waving your hand and saying "it's not hard" really won't do. cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Selecting a constant question
On Tue, Jun 12, 2007 at 12:47:55PM +0200, Zeugswetter Andreas ADI SD wrote: > I think this focuses too much on those cases where it is not possible. > When it is not feasible like with a text column, clients deal with it > already (obviously some better than others). > It is for those cases where it would be feasible, like constants (or > concateneted columns), where the max length if properly returned could > be used to improve performance. For constants there is a basic problem that Postgres, if at all possible, doesn't even analyse the string at all. If it's not part of a join or sort, then in every likelyhood it's passed through the entire execution untouched and comes out the other end as type unknown. The length indicator of -2 indicates a null-terminated string, postgres never even bothered calculating the length of it. For the situation of concatinating varchar columns, it's a fairly special case. The typmod, in the *special case* of varchar is the maximum length, but for other types it means something else. Additionally, the planner doesn't know that || is concatination, a consequence of the user-defined operators. So to make this work you need to change the planner so that: 1. It special cases varchar to know what the typmod means 2. It special cases the || operator to add the typmods together. 3. Has to take special care not to break user-defined operators All a pile of hacks and special cases to handle something that, to be honest, the vast majority of people never notice. So no, no patch is going to be accepted to handle this special case, because it's far too hacky for a corner case. On the other hand, if you can piggyback it into something like the "user-defined typmod" stuff, it may have a better chance, though I really think the first problem is basically "won't fix" from an optimisation point of view. Hope this clarifies things a bit, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Selecting a constant question
Heikki Linnakangas wrote: > Actually, if you're in such a high throughput, client-side CPU-intensive > situation that this makes any difference, why are you copying the value > to another buffer in the first place? Just access it directly in the > libpq buffer returned by PQgetvalue, and move on. That's a *very* good point. The original design for the pgAdmin query tool made it's own copy of the data to display in the grid which is exactly why we used to get complaints about having a query time and a display time. The modern versions use a virtual table which enables the grid to retrieve the data directly from the libpq buffer when it needs to draw each cell which has effectively eliminated that display time. Regards, Dave. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Selecting a constant question
Zeugswetter Andreas ADI SD wrote: Thats exactly the point. Consider select mytext from mytable ; How can PostgreSQL possibly know the maximum length of the returned values *before* it has scanned the whole table? I think this focuses too much on those cases where it is not possible. When it is not feasible like with a text column, clients deal with it already (obviously some better than others). It is for those cases where it would be feasible, like constants (or concateneted columns), where the max length if properly returned could be used to improve performance. I doubt there's any measurable performance benefit here. You might as well allocate a buffer of say 128 bytes, and enlarge it from there when you see a value larger than that. Even in the worst case, you'll only need to enlarge the buffer a few times per query until you reach the real max length. Actually, if you're in such a high throughput, client-side CPU-intensive situation that this makes any difference, why are you copying the value to another buffer in the first place? Just access it directly in the libpq buffer returned by PQgetvalue, and move on. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Selecting a constant question
> Thats exactly the point. Consider > select mytext from mytable ; > > How can PostgreSQL possibly know the maximum length of the > returned values *before* it has scanned the whole table? I think this focuses too much on those cases where it is not possible. When it is not feasible like with a text column, clients deal with it already (obviously some better than others). It is for those cases where it would be feasible, like constants (or concateneted columns), where the max length if properly returned could be used to improve performance. Andreas ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question
Dann Corbit wrote: -Original Message- From: Hannu Krosing [mailto:[EMAIL PROTECTED] Since libpq function PQfsize returns -2 for all constant character strings in SQL statements ... What is the proper procedure to determine the length of a constant character column after query execution but before fetching the first row of data? Why not just get the first row and determine the width from it before you actually use any of tha data ? What if the second row is 1000x longer? Thats exactly the point. Consider select mytext from mytable ; How can PostgreSQL possibly know the maximum length of the returned values *before* it has scanned the whole table? greetings, Florian Pflug ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Selecting a constant question
"Larry McGhaw" <[EMAIL PROTECTED]> writes: > The database *knows* this size of the char constant (obviously), and > should report the size via a metadata call, as all other relational > databases do. I'm not even clear whether you and Dan are talking about the same thing. He's talking about the number of bytes required hold the constant. You seem to be talking about the character length of strings. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Selecting a constant question
> > Again, *all* other major relational databases do this ... > even blob fields have a maximum length reported from the database. > > So what are you doing with the max length? Not all data types > and values have a meaningful max length, so you have to be > able to deal with variable length data anyway. Imho it has a lot to do with optimizing the interface. If you know, that the max length is e.g. 16 bytes in UTF-8 for the 3 chars, you will probably not want any on the fly allocation smarts and preallocate and bind those 16 bytes. When the max length value gets larger, and it is a variable lenght type, the overhead of varlen allocation smarts starts to pay off. A generic interface should keep the sql parsing smarts at a minimum, thus it cannot know that a returned column is actually a text constant. Imho the request for a max length is very reasonable, but has no value once it exceeds a certain size e.g. 64k. Andreas ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
Larry McGhaw wrote: Again, *all* other major relational databases do this ... even blob fields have a maximum length reported from the database. So what are you doing with the max length? Not all data types and values have a meaningful max length, so you have to be able to deal with variable length data anyway. For blobs, exactly what max length would you like to get; 1GB? 1TB? Why, what good is that for? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Selecting a constant question
Again, the issue is not our tool, but the deficiency in libpq/postgres ... even mysql gets its right .. why not Postgres? Its not hard for a database to report metadata properly. if I issue a sql statement: select '123' from the database should report that the maximum length of the 1st column in the resultset is 3 ... it cant be any more plain than that. Thanks lm From: Hannu Krosing [mailto:[EMAIL PROTECTED] Sent: Mon 6/11/2007 10:43 PM To: Larry McGhaw Cc: Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question Ühel kenal päeval, E, 2007-06-11 kell 22:11, kirjutas Larry McGhaw: > As far as I am aware these statements are true. If you have a > specific example you could provide to the contrary that would be > interesting. > > Even if there are such conditions it does not change the fact that > libpq and/or postgresql is deficient in this area. > > For any query, the database should be capable of describing the > metadata for the columns, which includes > 1) the column type > and > 2) the column maximum length. > > This is such a basic database interface principle that I very > disappointed that someone has not recognized this and simply said " > yes, we see the issue we will work on it". > > Again, *all* other major relational databases do this ... even blob > fields have a maximum length reported from the database. > > I hope someone who truly understands database interfaces will read > this thread and address the issue. > For now we will have to "special case" postgres in our application > until it is addressed. > or redesign your application so that it allocates memory as needed and won't waste client memory by allocating maximum possible amount for each and every grid cell weather needed or not ;) As I understand from this discussion you are writing some kind of middleware (i.e. tools), and I'd expect toolmakers to do the right thing. allocating as much as possibly ever needed is something that would be excusable in quick-n-dirty end user application, but not in a tool. Hannu
Re: [HACKERS] Selecting a constant question
Ühel kenal päeval, E, 2007-06-11 kell 22:55, kirjutas Dann Corbit: > > -Original Message- ... > > > I hope someone who truly understands database interfaces will read > > > this thread and address the issue. > > > For now we will have to "special case" postgres in our application > > > until it is addressed. > > > > > > > or redesign your application so that it allocates memory as needed and > > won't waste client memory by allocating maximum possible amount for each > > and every grid cell weather needed or not ;) > > > > As I understand from this discussion you are writing some kind of > > middleware (i.e. tools), and I'd expect toolmakers to do the right > > thing. > > In this case the middleware is: > ODBC/JDBC/OLEDB/.NET data drivers for PostgreSQL. > > There are other related tools, but the above is the product for which the bug > needs corrected. You mean you use some kind of "Grid" inside JDBC/.NET drivers , and it needs to know max size for a column ? can't you replace it with a dynamically allocated Grid component, which would also work well for other expressions, not just constants ? > > allocating as much as possibly ever needed is something that would be > > excusable in quick-n-dirty end user application, but not in a tool. > > It's a requirement of the ODBC/JDBC/OLEDB/.NET specifications. Is that a requirement only for "constants" or for any expression, say "SELECT substring(reallybigblob, 1, random(100)) from somebigtable" ? > I suppose we could scan the > table twice to figure out how large a column might be, but that would make > the PostgreSQL > driver run at 1/2 speed. Not a very appetizing solution. by scanninc twice you find out how big the largest column _is_, not might be . > None of the other database vendors has any trouble reporting this information > correctly. By "this information" you mean the max possible size of data returned by and expression ? - Hannu ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Hannu Krosing [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 10:43 PM > To: Larry McGhaw > Cc: Tom Lane; Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van > Oosterhout; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Selecting a constant question > > Ühel kenal päeval, E, 2007-06-11 kell 22:11, kirjutas Larry McGhaw: > > As far as I am aware these statements are true. If you have a > > specific example you could provide to the contrary that would be > > interesting. > > > > Even if there are such conditions it does not change the fact that > > libpq and/or postgresql is deficient in this area. > > > > For any query, the database should be capable of describing the > > metadata for the columns, which includes > > 1) the column type > > and > > 2) the column maximum length. > > > > This is such a basic database interface principle that I very > > disappointed that someone has not recognized this and simply said " > > yes, we see the issue we will work on it". > > > > Again, *all* other major relational databases do this ... even blob > > fields have a maximum length reported from the database. > > > > I hope someone who truly understands database interfaces will read > > this thread and address the issue. > > For now we will have to "special case" postgres in our application > > until it is addressed. > > > > or redesign your application so that it allocates memory as needed and > won't waste client memory by allocating maximum possible amount for each > and every grid cell weather needed or not ;) > > As I understand from this discussion you are writing some kind of > middleware (i.e. tools), and I'd expect toolmakers to do the right > thing. In this case the middleware is: ODBC/JDBC/OLEDB/.NET data drivers for PostgreSQL. There are other related tools, but the above is the product for which the bug needs corrected. > allocating as much as possibly ever needed is something that would be > excusable in quick-n-dirty end user application, but not in a tool. It's a requirement of the ODBC/JDBC/OLEDB/.NET specifications. I suppose we could scan the table twice to figure out how large a column might be, but that would make the PostgreSQL driver run at 1/2 speed. Not a very appetizing solution. None of the other database vendors has any trouble reporting this information correctly. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Hannu Krosing [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 8:42 PM > To: Dann Corbit > Cc: Tom Lane; Gregory Stark; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Selecting a constant question > > Ühel kenal päeval, E, 2007-06-11 kell 13:38, kirjutas Dann Corbit: > > > -Original Message- > > > From: Tom Lane [mailto:[EMAIL PROTECTED] > > > Sent: Monday, June 11, 2007 1:32 PM > > > To: Dann Corbit > > > Cc: Gregory Stark; pgsql-hackers@postgresql.org > > > Subject: Re: [HACKERS] Selecting a constant question > ... > > > You should be treating typlen as signed not unsigned, and not assuming > > a > > > fixed width for any negative value. > > > > > > Since the width refers to the server internal representation, and not > > to > > > what comes down the wire, I find it pretty strange for an application > > to > > > be using typlen for anything at all actually. > > > > Thanks for the response. > > > > Since libpq function PQfsize returns -2 for all constant character > > strings in SQL statements ... What is the proper procedure to determine > > the length of a constant character column after query execution but > > before fetching the first row of data? > > Why not just get the first row and determine the width from it before > you actually use any of tha data ? What if the second row is 1000x longer? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question
Ühel kenal päeval, E, 2007-06-11 kell 22:11, kirjutas Larry McGhaw: > As far as I am aware these statements are true. If you have a > specific example you could provide to the contrary that would be > interesting. > > Even if there are such conditions it does not change the fact that > libpq and/or postgresql is deficient in this area. > > For any query, the database should be capable of describing the > metadata for the columns, which includes > 1) the column type > and > 2) the column maximum length. > > This is such a basic database interface principle that I very > disappointed that someone has not recognized this and simply said " > yes, we see the issue we will work on it". > > Again, *all* other major relational databases do this ... even blob > fields have a maximum length reported from the database. > > I hope someone who truly understands database interfaces will read > this thread and address the issue. > For now we will have to "special case" postgres in our application > until it is addressed. > or redesign your application so that it allocates memory as needed and won't waste client memory by allocating maximum possible amount for each and every grid cell weather needed or not ;) As I understand from this discussion you are writing some kind of middleware (i.e. tools), and I'd expect toolmakers to do the right thing. allocating as much as possibly ever needed is something that would be excusable in quick-n-dirty end user application, but not in a tool. Hannu ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Selecting a constant question
Ühel kenal päeval, E, 2007-06-11 kell 13:38, kirjutas Dann Corbit: > > -Original Message- > > From: Tom Lane [mailto:[EMAIL PROTECTED] > > Sent: Monday, June 11, 2007 1:32 PM > > To: Dann Corbit > > Cc: Gregory Stark; pgsql-hackers@postgresql.org > > Subject: Re: [HACKERS] Selecting a constant question ... > > You should be treating typlen as signed not unsigned, and not assuming > a > > fixed width for any negative value. > > > > Since the width refers to the server internal representation, and not > to > > what comes down the wire, I find it pretty strange for an application > to > > be using typlen for anything at all actually. > > Thanks for the response. > > Since libpq function PQfsize returns -2 for all constant character > strings in SQL statements ... What is the proper procedure to determine > the length of a constant character column after query execution but > before fetching the first row of data? Why not just get the first row and determine the width from it before you actually use any of tha data ? -- Hannu ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
As far as I am aware these statements are true. If you have a specific example you could provide to the contrary that would be interesting. Even if there are such conditions it does not change the fact that libpq and/or postgresql is deficient in this area. For any query, the database should be capable of describing the metadata for the columns, which includes 1) the column type and 2) the column maximum length. This is such a basic database interface principle that I very disappointed that someone has not recognized this and simply said " yes, we see the issue we will work on it". Again, *all* other major relational databases do this ... even blob fields have a maximum length reported from the database. I hope someone who truly understands database interfaces will read this thread and address the issue. For now we will have to "special case" postgres in our application until it is addressed. Thanks lm From: Tom Lane [mailto:[EMAIL PROTECTED] Sent: Mon 6/11/2007 5:32 PM To: Larry McGhaw Cc: Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org Subject: Re: [HACKERS] Selecting a constant question "Larry McGhaw" <[EMAIL PROTECTED]> writes: > I think perhaps we have lost sight of the main issue: > 1) libpq can properly describe the maximum internal data size of any > numeric or char column in a table via Pqfsize > 2) libpq can properly describe the maximum internal data size of any > varchar column via Pqfmod > 3) libpq can properly describe the maximum internal data size of any > numeric constant in a SQL statement via Pqfsize None of the above statements are actually true, at least not when you take off your blinders and note the existence of unconstrained-width numeric and text columns. > The database *knows* this size of the char constant (obviously), No, what it knows (and reports) is type information. There are a small number of datatypes where you can infer a maximum width from knowledge of the datatype. There are many others where you can't set an upper bound from this knowledge --- at least not a usefully tight one. Anyway, if we were to cast those constants to something other than unknown, it would be text, not varchar, and you'd still have the same issue. regards, tom lane
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Tom Lane [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 3:35 PM > To: Dann Corbit > Cc: Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org; > Larry McGhaw > Subject: Re: [HACKERS] Selecting a constant question > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > > Giving me the information about the data type will be enough. As an > > example, in this case we have varchar data. If the server should be so > > kind as to report varchar(1) for '1' or varchar(3) for '123' then I > > would not have any difficulty binding the data to a grid. > > This seems merest fantasy. Reflect on multibyte character sets for a > bit --- even if it's known that the column is varchar(3) there is no > guarantee that the value will fit in 3 bytes. If the server bound the data as UNICODE, then it will tell me UNICODE(3). I know how big this will be. In the worst case scenario it will fit in 3*4 = 12 bytes. If the server is built without UNICODE enabled, then it will definitely fit in 3 bytes. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Tom Lane [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 5:32 PM > To: Larry McGhaw > Cc: Alvaro Herrera; Dann Corbit; Gregory Stark; Martijn van Oosterhout; > pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Selecting a constant question > > "Larry McGhaw" <[EMAIL PROTECTED]> writes: > > I think perhaps we have lost sight of the main issue: > > 1) libpq can properly describe the maximum internal data size of any > > numeric or char column in a table via Pqfsize > > 2) libpq can properly describe the maximum internal data size of any > > varchar column via Pqfmod > > 3) libpq can properly describe the maximum internal data size of any > > numeric constant in a SQL statement via Pqfsize > > None of the above statements are actually true, at least not when you > take off your blinders and note the existence of unconstrained-width > numeric and text columns. Unconstrained width columns are not what are being discussed here. It is constant expressions of known width. > > The database *knows* this size of the char constant (obviously), > > No, what it knows (and reports) is type information. There are a small > number of datatypes where you can infer a maximum width from knowledge > of the datatype. There are many others where you can't set an upper > bound from this knowledge --- at least not a usefully tight one. If you do not know how large 1::numeric is, then how can you know whether it is safe or not to insert it into a column of type numeric(12,4)? If you do not know how large 'Joe'::varchar is, then how can you know whether it is safe to insert it into a column of type varchar(256)? Clearly, neither of these operations will cause any problems and so the size of a constant can be determined. > Anyway, if we were to cast those constants to something other than > unknown, it would be text, not varchar, and you'd still have the same > issue. Other database systems can manage this, and the programmers of those database systems are not smarter than the programmers of the PostgreSQL group. Therefore I can conclude that if the PostgreSQL group decides it is important, then they can figure out the size of a string or numeric constant. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
"Larry McGhaw" <[EMAIL PROTECTED]> writes: > I think perhaps we have lost sight of the main issue: > 1) libpq can properly describe the maximum internal data size of any > numeric or char column in a table via Pqfsize > 2) libpq can properly describe the maximum internal data size of any > varchar column via Pqfmod > 3) libpq can properly describe the maximum internal data size of any > numeric constant in a SQL statement via Pqfsize None of the above statements are actually true, at least not when you take off your blinders and note the existence of unconstrained-width numeric and text columns. > The database *knows* this size of the char constant (obviously), No, what it knows (and reports) is type information. There are a small number of datatypes where you can infer a maximum width from knowledge of the datatype. There are many others where you can't set an upper bound from this knowledge --- at least not a usefully tight one. Anyway, if we were to cast those constants to something other than unknown, it would be text, not varchar, and you'd still have the same issue. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Andrew Dunstan > Sent: Monday, June 11, 2007 5:12 PM > To: Larry McGhaw > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Selecting a constant question > > > > Larry McGhaw wrote: > > 4) libpq **cannot** describe the maximum internal data size of a char or > > varchar constant! > > Example: select '123' from > > > > This is clearly a bug or serious oversight in libpq that should be > > addressed. > > > > The database *knows* this size of the char constant (obviously), and > > should report the size via a metadata call, as all other relational > > databases do. > > > > > > > > What is not clear to me is why it is so important for you to know the > length of a piece of data you are supplying. If it is so vitally > important, you could always cast it, e.g. select '123'::varchar(3) We're a middleware company. We are not in control of the queries that are sent. We can intercept and reformat them, and perhaps that is what we will need to do for PostgreSQL ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question
Larry McGhaw wrote: 4) libpq **cannot** describe the maximum internal data size of a char or varchar constant! Example: select '123' from This is clearly a bug or serious oversight in libpq that should be addressed. The database *knows* this size of the char constant (obviously), and should report the size via a metadata call, as all other relational databases do. What is not clear to me is why it is so important for you to know the length of a piece of data you are supplying. If it is so vitally important, you could always cast it, e.g. select '123'::varchar(3) cheers andrew ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Kris Jurka > Sent: Monday, June 11, 2007 5:04 PM > To: Larry McGhaw > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Selecting a constant question > > > > On Mon, 11 Jun 2007, Larry McGhaw wrote: > > > I think perhaps we have lost sight of the main issue: > > > > 2) libpq can properly describe the maximum internal data size of any > > varchar column via Pqfmod > > SELECT cola || colb FROM tab; Suggestion: Return (column size of cola) + (column size of colb) in the maximum length field. > > 3) libpq can properly describe the maximum internal data size of any > > numeric constant in a SQL statement via Pqfsize > > SELECT 3::numeric; Suggestion: Return sizeof (numeric(1,0)) -- after all, it's a constant here. In the words of the great poet "Spike Lee": 'Always do the right thing.' ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
On Mon, 11 Jun 2007, Larry McGhaw wrote: I think perhaps we have lost sight of the main issue: 2) libpq can properly describe the maximum internal data size of any varchar column via Pqfmod SELECT cola || colb FROM tab; 3) libpq can properly describe the maximum internal data size of any numeric constant in a SQL statement via Pqfsize SELECT 3::numeric; Kris Jurka ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
I think perhaps we have lost sight of the main issue: 1) libpq can properly describe the maximum internal data size of any numeric or char column in a table via Pqfsize 2) libpq can properly describe the maximum internal data size of any varchar column via Pqfmod 3) libpq can properly describe the maximum internal data size of any numeric constant in a SQL statement via Pqfsize 4) libpq **cannot** describe the maximum internal data size of a char or varchar constant! Example: select '123' from This is clearly a bug or serious oversight in libpq that should be addressed. The database *knows* this size of the char constant (obviously), and should report the size via a metadata call, as all other relational databases do. Thanks lm -Original Message- From: Alvaro Herrera [mailto:[EMAIL PROTECTED] Sent: Monday, June 11, 2007 3:44 PM To: Dann Corbit Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org; Larry McGhaw Subject: Re: [HACKERS] Selecting a constant question Dann Corbit wrote: > If the server bound the data as UNICODE, then it will tell me > UNICODE(3). I know how big this will be. > > In the worst case scenario it will fit in 3*4 = 12 bytes. > > If the server is built without UNICODE enabled, then it will > definitely fit in 3 bytes. Unless it's some other multibyte encoding. And nowadays, the server is always "unicode enabled". The stuff sent down the wire is unicode or not depending on a configuration parameter. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
I think perhaps we have lost sight of the main issue: 1) libpq can properly describe the maximum internal data size of any numeric or char column in a table via Pqfsize 2) libpq can properly describe the maximum internal data size of any varchar column via Pqfmod 3) libpq can properly describe the maximum internal data size of any numeric constant in a SQL statement via Pqfsize 4) libpq **cannot** describe the maximum internal data size of a char or varchar constant! Example: select '123' from This is clearly a bug or serious oversight in libpq that should be addressed. The database *knows* this size of the char constant (obviously), and should report the size via a metadata call, as all other relational databases do. Thanks lm -Original Message- From: Alvaro Herrera [mailto:[EMAIL PROTECTED] Sent: Monday, June 11, 2007 3:44 PM To: Dann Corbit Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org; Larry McGhaw Subject: Re: [HACKERS] Selecting a constant question Dann Corbit wrote: > If the server bound the data as UNICODE, then it will tell me > UNICODE(3). I know how big this will be. > > In the worst case scenario it will fit in 3*4 = 12 bytes. > > If the server is built without UNICODE enabled, then it will > definitely fit in 3 bytes. Unless it's some other multibyte encoding. And nowadays, the server is always "unicode enabled". The stuff sent down the wire is unicode or not depending on a configuration parameter. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Tom Lane [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 3:50 PM > To: Dann Corbit > Cc: Martijn van Oosterhout; Alvaro Herrera; Gregory Stark; pgsql- > [EMAIL PROTECTED]; Larry McGhaw > Subject: Re: [HACKERS] Selecting a constant question > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > >> To be honest, the concept that a widget requires a constant that can't > >> be changed later is also a bit odd. > > > Not when the data itself is a constant that cannot be changed. > > Surely this case is not sufficiently important to justify designing > your entire application (not to mention the client/server protocol) > around it. You're always going to have variable-width columns in there > somewhere. Right. But normally I get back a length for those variable length columns, or I can collect it from the metadata of the database. Surely, PostgreSQL can determine the size of a constant string. Otherwise it would be impossible to know if it would be safe to insert a constant string into a database column. PostgreSQL has decided upon a data type, and gives me data bound in that type. It is only the length that it is unwilling to divulge. ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Alvaro Herrera [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 3:44 PM > To: Dann Corbit > Cc: Tom Lane; Gregory Stark; Martijn van Oosterhout; pgsql- > [EMAIL PROTECTED]; Larry McGhaw > Subject: Re: [HACKERS] Selecting a constant question > > Dann Corbit wrote: > > > If the server bound the data as UNICODE, then it will tell me > > UNICODE(3). I know how big this will be. > > > > In the worst case scenario it will fit in 3*4 = 12 bytes. > > > > If the server is built without UNICODE enabled, then it will definitely > > fit in 3 bytes. > > Unless it's some other multibyte encoding. And nowadays, the server is > always "unicode enabled". The stuff sent down the wire is unicode or > not depending on a configuration parameter. Even at that, we still know an absolute maximum of 12 bytes. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
"Dann Corbit" <[EMAIL PROTECTED]> writes: >> To be honest, the concept that a widget requires a constant that can't >> be changed later is also a bit odd. > Not when the data itself is a constant that cannot be changed. Surely this case is not sufficiently important to justify designing your entire application (not to mention the client/server protocol) around it. You're always going to have variable-width columns in there somewhere. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Selecting a constant question
Dann Corbit wrote: > If the server bound the data as UNICODE, then it will tell me > UNICODE(3). I know how big this will be. > > In the worst case scenario it will fit in 3*4 = 12 bytes. > > If the server is built without UNICODE enabled, then it will definitely > fit in 3 bytes. Unless it's some other multibyte encoding. And nowadays, the server is always "unicode enabled". The stuff sent down the wire is unicode or not depending on a configuration parameter. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 3:29 PM > To: Dann Corbit > Cc: Alvaro Herrera; Gregory Stark; pgsql-hackers@postgresql.org; Larry > McGhaw > Subject: Re: [HACKERS] Selecting a constant question > > On Mon, Jun 11, 2007 at 03:18:33PM -0700, Dann Corbit wrote: > > Sure, but when I bind to a grid, I need to know a-priori how big the > > biggest returned instance can be. Reading the entire data set twice to > > learn the size of a constant seems rather conceptually odd to me. > > To be honest, the concept that a widget requires a constant that can't > be changed later is also a bit odd. Not when the data itself is a constant that cannot be changed. > There are many times you won't know > beforehand how big the data is, surely the framework should be smart > enough to handle these cases? If it were impossible to know the size of a string constant supplied in the query, then I think I would agree with you here. However, it seems to me that the maximum possible size of such a known, constant-width string is not hard to determine. > Start the width at 100, if it turns out to be too small, make it > bigger... If that were a good idea, then why report data sizes at all? Just let it always be a surprise when it comes streaming down the pipe. Honestly, I cannot fathom this answer. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
Dann Corbit wrote: I have a PostgreSQL feature request: Report the maximum size of a variable length string from the server. Surely, we cannot be the only people who will need this information. If (for example) someone wants to bind to a grid, then the maximum size has to be known in advance. Does PQfmod not tell you what you need if the field is varchar(n) ? cheers andrew ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Selecting a constant question
"Dann Corbit" <[EMAIL PROTECTED]> writes: > Giving me the information about the data type will be enough. As an > example, in this case we have varchar data. If the server should be so > kind as to report varchar(1) for '1' or varchar(3) for '123' then I > would not have any difficulty binding the data to a grid. This seems merest fantasy. Reflect on multibyte character sets for a bit --- even if it's known that the column is varchar(3) there is no guarantee that the value will fit in 3 bytes. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Selecting a constant question
On Mon, Jun 11, 2007 at 03:18:33PM -0700, Dann Corbit wrote: > Sure, but when I bind to a grid, I need to know a-priori how big the > biggest returned instance can be. Reading the entire data set twice to > learn the size of a constant seems rather conceptually odd to me. To be honest, the concept that a widget requires a constant that can't be changed later is also a bit odd. There are many times you won't know beforehand how big the data is, surely the framework should be smart enough to handle these cases? Start the width at 100, if it turns out to be too small, make it bigger... Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Selecting a constant question
Dann Corbit wrote: > > Oh, you have the length information for each datum all right. It's on > > the first four bytes of it. > > Sure, but when I bind to a grid, I need to know a-priori how big the > biggest returned instance can be. Reading the entire data set twice to > learn the size of a constant seems rather conceptually odd to me. Did you read up on typmod already? I think that's part of the info sent down in the query response. -- Alvaro Herrera http://www.flickr.com/photos/alvherre/ "No single strategy is always right (Unless the boss says so)" (Larry Wall) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Alvaro Herrera [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 3:16 PM > To: Dann Corbit > Cc: Gregory Stark; Martijn van Oosterhout; pgsql-hackers@postgresql.org; > Larry McGhaw > Subject: Re: [HACKERS] Selecting a constant question > > Dann Corbit wrote: > > > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > > > > > > In fact psql needs it and implements this. It has to skim through the > > > entire > > > result set to calculate the column widths. It's quite a lot of work > > but > > > the > > > server is in no better position to do it than psql. > > > > Reading the data twice sounds a little painful. What if there are 30 > > million rows? > > You get an "out of memory" error. > > > > On the contrary the server is missing quite a bit of information of > > > how you intend to display the information. Do you need the number of > > > bytes or characters? Are all the characters the same width in your > > > display system? What about currency symbols? Do you intend to > > > reverse any quoting or just display backslashes? > > > > Giving me the information about the data type will be enough. As an > > example, in this case we have varchar data. If the server should be so > > kind as to report varchar(1) for '1' or varchar(3) for '123' then I > > would not have any difficulty binding the data to a grid. > > Oh, you have the length information for each datum all right. It's on > the first four bytes of it. Sure, but when I bind to a grid, I need to know a-priori how big the biggest returned instance can be. Reading the entire data set twice to learn the size of a constant seems rather conceptually odd to me. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question
Dann Corbit wrote: > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > > > > In fact psql needs it and implements this. It has to skim through the > > entire > > result set to calculate the column widths. It's quite a lot of work > but > > the > > server is in no better position to do it than psql. > > Reading the data twice sounds a little painful. What if there are 30 > million rows? You get an "out of memory" error. > > On the contrary the server is missing quite a bit of information of > > how you intend to display the information. Do you need the number of > > bytes or characters? Are all the characters the same width in your > > display system? What about currency symbols? Do you intend to > > reverse any quoting or just display backslashes? > > Giving me the information about the data type will be enough. As an > example, in this case we have varchar data. If the server should be so > kind as to report varchar(1) for '1' or varchar(3) for '123' then I > would not have any difficulty binding the data to a grid. Oh, you have the length information for each datum all right. It's on the first four bytes of it. -- Alvaro Herrera http://www.PlanetPostgreSQL.org/ "The West won the world not by the superiority of its ideas or values or religion but rather by its superiority in applying organized violence. Westerners often forget this fact, non-Westerners never do." (Samuel P. Huntington) ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Gregory Stark [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 2:41 PM > To: Dann Corbit > Cc: Martijn van Oosterhout; pgsql-hackers@postgresql.org; Larry McGhaw > Subject: Re: [HACKERS] Selecting a constant question > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > > > Surely, we cannot be the only people who will need this information. If > > (for example) someone wants to bind to a grid, then the maximum size has > > to be known in advance. > > In fact psql needs it and implements this. It has to skim through the > entire > result set to calculate the column widths. It's quite a lot of work but > the > server is in no better position to do it than psql. Reading the data twice sounds a little painful. What if there are 30 million rows? > On the contrary the server is missing quite a bit of information of how > you > intend to display the information. Do you need the number of bytes or > characters? Are all the characters the same width in your display system? > What > about currency symbols? Do you intend to reverse any quoting or just > display > backslashes? Giving me the information about the data type will be enough. As an example, in this case we have varchar data. If the server should be so kind as to report varchar(1) for '1' or varchar(3) for '123' then I would not have any difficulty binding the data to a grid. > Even knowing how many characters and assuming fixed character widths that > wouldn't even be enough to set your grid control widths. Usually people > like > numeric quantities decimal aligned and so two records "1.00" and "0.01" > will > take much more width than two records with "1.00" and "2.00". SQL*Server, Oracle, Ingres, DB/2 and other database systems somehow manage to do it, so I guess it is not technically intractable. I suspect that your own ODBC/JDBC and other drivers suffer from this same effect. Now, I do recognize that sometimes nobody is going to know how big something is, including the server. But with a query using a constant it seems like it ought to be well defined to me. Perhaps the difficulties are escaping me because I am not familiar with the low level guts of this problem. But I suspect that lots of people besides me would benefit if sizes of things were known when it is possible to know them. As I said before, I see that it cannot be known right now. So I am putting it in as a feature request. If you could be so kind as to point out the right spot to look in the server code, I imagine we could fix it and check in the patch ourselves. ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Selecting a constant question
"Dann Corbit" <[EMAIL PROTECTED]> writes: > Surely, we cannot be the only people who will need this information. If > (for example) someone wants to bind to a grid, then the maximum size has > to be known in advance. In fact psql needs it and implements this. It has to skim through the entire result set to calculate the column widths. It's quite a lot of work but the server is in no better position to do it than psql. On the contrary the server is missing quite a bit of information of how you intend to display the information. Do you need the number of bytes or characters? Are all the characters the same width in your display system? What about currency symbols? Do you intend to reverse any quoting or just display backslashes? Even knowing how many characters and assuming fixed character widths that wouldn't even be enough to set your grid control widths. Usually people like numeric quantities decimal aligned and so two records "1.00" and "0.01" will take much more width than two records with "1.00" and "2.00". -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Dann Corbit > Sent: Monday, June 11, 2007 1:52 PM > To: Martijn van Oosterhout > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Selecting a constant question > > > -Original Message- > > From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED] > > Sent: Monday, June 11, 2007 1:46 PM > > To: Dann Corbit > > Subject: Re: [HACKERS] Selecting a constant question > > > > On Mon, Jun 11, 2007 at 01:29:37PM -0700, Dann Corbit wrote: > > > Our application is using the libPQ interface to access postgres. > > > > > > The query is "select '123' from " .. the table is not > > > important. > > > > > > After executing the query, we interrogate the metadata of the result > set > > > using the PQfsize, PQfmod and PQftype functions. > > > > Did you read the documentation of the PQfsize function? > > > > PQfsize returns the space allocated for this column in a database row, > > in other words the size of the server's internal representation of the > > data type. (Accordingly, it is not really very useful to clients.) A > > negative value indicates the data type is variable-length. > > > > > http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html#LIBPQ-EXE > C- > > SELECT-INFO > > > > > The size of the column is returned as 65534 (or -2 if you consider > this > > > a signed short value) > > > > It's variable length, you can't say anything more. > > So what you are saying is that the constant '1' is variable length, and > there is no way to find out the maximum length from the database. I have a PostgreSQL feature request: Report the maximum size of a variable length string from the server. Surely, we cannot be the only people who will need this information. If (for example) someone wants to bind to a grid, then the maximum size has to be known in advance. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Martijn van Oosterhout [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 1:46 PM > To: Dann Corbit > Subject: Re: [HACKERS] Selecting a constant question > > On Mon, Jun 11, 2007 at 01:29:37PM -0700, Dann Corbit wrote: > > Our application is using the libPQ interface to access postgres. > > > > The query is "select '123' from " .. the table is not > > important. > > > > After executing the query, we interrogate the metadata of the result set > > using the PQfsize, PQfmod and PQftype functions. > > Did you read the documentation of the PQfsize function? > > PQfsize returns the space allocated for this column in a database row, > in other words the size of the server's internal representation of the > data type. (Accordingly, it is not really very useful to clients.) A > negative value indicates the data type is variable-length. > > http://www.postgresql.org/docs/8.2/interactive/libpq-exec.html#LIBPQ-EXE C- > SELECT-INFO > > > The size of the column is returned as 65534 (or -2 if you consider this > > a signed short value) > > It's variable length, you can't say anything more. So what you are saying is that the constant '1' is variable length, and there is no way to find out the maximum length from the database. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Selecting a constant question
"Tom Lane" <[EMAIL PROTECTED]> writes: > Trying the example in psql seems to be about the same speed both ways, with > if anything a slight advantage to select '1'. Fwiw I see a slight advantage for '1' as well. I wonder why. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Tom Lane [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 1:32 PM > To: Dann Corbit > Cc: Gregory Stark; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Selecting a constant question > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > > The issue is this: > > Postgres describes the column with a typmod of -1 (unknown) and a length > > of 65534. > > Oh, you're looking at typlen not typmod. Please observe the comments in > pg_type.h: > > /* >* For a fixed-size type, typlen is the number of bytes we use to >* represent a value of this type, e.g. 4 for an int4. But for > a >* variable-length type, typlen is negative. We use -1 to indicate > a >* "varlena" type (one that has a length word), -2 to indicate a >* null-terminated C string. >*/ > int2typlen; > > You should be treating typlen as signed not unsigned, and not assuming a > fixed width for any negative value. > > Since the width refers to the server internal representation, and not to > what comes down the wire, I find it pretty strange for an application to > be using typlen for anything at all actually. Thanks for the response. Since libpq function PQfsize returns -2 for all constant character strings in SQL statements ... What is the proper procedure to determine the length of a constant character column after query execution but before fetching the first row of data? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Selecting a constant question
"Dann Corbit" <[EMAIL PROTECTED]> writes: > The issue is this: > Postgres describes the column with a typmod of -1 (unknown) and a length > of 65534. Oh, you're looking at typlen not typmod. Please observe the comments in pg_type.h: /* * For a fixed-size type, typlen is the number of bytes we use to * represent a value of this type, e.g. 4 for an int4. But for a * variable-length type, typlen is negative. We use -1 to indicate a * "varlena" type (one that has a length word), -2 to indicate a * null-terminated C string. */ int2typlen; You should be treating typlen as signed not unsigned, and not assuming a fixed width for any negative value. Since the width refers to the server internal representation, and not to what comes down the wire, I find it pretty strange for an application to be using typlen for anything at all actually. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
On Mon, Jun 11, 2007 at 12:55:55PM -0700, Dann Corbit wrote: > The issue is this: > > Postgres describes the column with a typmod of -1 (unknown) and a length > of 65534. Postgres does no such thing. How can it possibly know the maximum size of a column before executing the query? Have a nice day, -- Martijn van Oosterhout <[EMAIL PROTECTED]> http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
Re: [HACKERS] Selecting a constant question
> -Original Message- > From: Gregory Stark [mailto:[EMAIL PROTECTED] > Sent: Monday, June 11, 2007 12:48 PM > To: Dann Corbit > Cc: pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] Selecting a constant question > > "Dann Corbit" <[EMAIL PROTECTED]> writes: > > > SELECT 1 FROM test.dbo.a_003 > > > > gets about 60,000 records per second > > > > SELECT '1' FROM test.dbo.a_003 > > > > gets about 600 records per second. > > > > The cause is that postgres describes the return column as "unknown" > > length 65534 in the 2nd case. > > Wait, back up. How does this cause it to go slower? The issue is this: Postgres describes the column with a typmod of -1 (unknown) and a length of 65534. This means that any client application must reserve 65534 bytes of spaces for every row of data (like a grid control for example), which postgres should know (and report) that the maximum length of the column is 1. This is not a PSQL issue, it's an issue with other products relying on the accuracy of the reported postgres metadata for a given SQL statement. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Selecting a constant question
"Dann Corbit" <[EMAIL PROTECTED]> writes: > SELECT 1 FROM test.dbo.a_003 > > gets about 60,000 records per second > > SELECT '1' FROM test.dbo.a_003 > > gets about 600 records per second. > > The cause is that postgres describes the return column as "unknown" > length 65534 in the 2nd case. Wait, back up. How does this cause it to go slower? -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Selecting a constant question
"Dann Corbit" <[EMAIL PROTECTED]> writes: > SELECT 1 FROM test.dbo.a_003 > gets about 60,000 records per second > SELECT '1' FROM test.dbo.a_003 > gets about 600 records per second. > The cause is that postgres describes the return column as "unknown" > length 65534 in the 2nd case. Postgres describes it in no such fashion --- unknown will always have a typmod of -1 which means "unspecified". Possibly you have some client code that knows much less than it thinks it does about the meanings of typmod values? The actual volume of data transmitted is going to be just about the same either way, so I'm not sure you've diagnosed the cause of slowdown correctly. Trying the example in psql seems to be about the same speed both ways, with if anything a slight advantage to select '1'. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Selecting a constant question
SELECT 1 FROM test.dbo.a_003 gets about 60,000 records per second SELECT '1' FROM test.dbo.a_003 gets about 600 records per second. The cause is that postgres describes the return column as "unknown" length 65534 in the 2nd case. Since the value is a constant, it seems rather odd to make the length 65534 characters. Why not make it char(1) or some other appropriate and less costly data type? After all, it isn't going to grow during the query.