Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On 10/29/2011 03:26 PM, Eric Ridge wrote: Would y'all accept a patch that extended the "SELECT *" syntax to let you list fields to exclude from the A_Star? Quite regularly I'll be testing queries via psql and want to see all the columns from a fairly wide table except maybe a giant text or xml column. A syntax like: SELECT * EXCLUDING (big_col1, big_col2) FROM foo; If it is quite regular I would actually argue two things: 1. Use a view 2. You haven't normalized correctly I am not trying to be a pedantic zealot or anything but those would be my arguments against. Sincerely, Joshua D. Drake -- Command Prompt, Inc. - http://www.commandprompt.com/ PostgreSQL Support, Training, Professional Services and Development The PostgreSQL Conference - http://www.postgresqlconference.org/ @cmdpromptinc - @postgresconf - 509-416-6579 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
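Joshua's "use a view" suggestion can be illustrated with a short sketch. This is not anything from the thread — the table and column names (foo, big_col1, big_col2) are hypothetical, and the helper merely generates the CREATE VIEW text you would otherwise type by hand:

```python
# Sketch of the "use a view" workaround for SELECT * EXCLUDING.
# All table/column names are invented for illustration.

def create_view_sql(view, table, columns, excluded):
    """Generate a CREATE VIEW statement selecting all columns except `excluded`."""
    kept = [c for c in columns if c not in excluded]
    return "CREATE VIEW {} AS SELECT {} FROM {};".format(
        view, ", ".join(kept), table)

sql = create_view_sql("foo_slim", "foo",
                      ["id", "name", "big_col1", "big_col2"],
                      {"big_col1", "big_col2"})
print(sql)  # CREATE VIEW foo_slim AS SELECT id, name FROM foo;
```

The view then gives you a stable "everything except the wide columns" target for interactive psql work.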
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
2011/10/30 Darren Duncan : > I agree that this feature would be quite useful and should be included in > SQL. The exact syntax is less of an issue, but just the ability to cleanly > say "select all columns except for these". I have in fact argued for the > same feature in the past. > > If you want to and can implement this feature then more power to you. I'll > look forward to it being in Pg 9.2. > > I think then the only discussion point should be what (terse) syntax to use > for it, not whether the feature should exist at all. > > Arguing against this feature is like arguing against supporting "where > not()" or "except" or "not in". One should be able to do complements not > only of rows but of columns too. Basic good language design. My experience says otherwise - I have only seen this needed in badly designed projects, or very badly designed projects. I don't see any reason to do it at the SQL level. It could make sense only in psql, as some special filter - if we wanted to enhance the report features there. Regards Pavel Stehule > > -- Darren Duncan > > Eric Ridge wrote: >> >> Would y'all accept a patch that extended the "SELECT *" syntax to let >> you list fields to exclude from the A_Star? >> >> Quite regularly I'll be testing queries via psql and want to see all >> the columns from a fairly wide table except maybe a giant text or xml >> column. A syntax like: >> >> SELECT * EXCLUDING (big_col1, big_col2) FROM foo; >> >> would be pretty handy. It would definitely save some typing in >> certain cases. It seems like such a syntax would better document the >> intent of a query too, rather than leaving one wondering if "big_col1" >> was supposed to be omitted from the target list or not. >> >> Anyways, I just wanted to run the idea by youse guys before I put too >> much more effort into it. I've already made what appear to be the >> minimum necessary changes to gram.y, and a few quick greps through the >> code make me think the rest will be pretty easy. 
>> >> Maybe the SQL spec says something about this and nobody's done the work >> yet? >> >> Thanks for your input! >> >> eric >> > > > -- > Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-hackers > -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
2011/10/30 Tom Lane : > Hitoshi Harada writes: >> On Sat, Oct 29, 2011 at 8:13 AM, Tom Lane wrote: >>> I have not looked at the code, but ISTM the way that this has to work is >>> that you set up a portal for each active scan. Then you can fetch a few >>> rows at a time from any one of them. > >> Hmm, true. Looking back at the original proposal (I haven't looked >> at the code either), there seems to be a cursor mode. ISTM it is hard for the FDW >> to know how the whole plan tree looks, so should we always use a cursor >> regardless of the estimated row count? > > I think we have to. Even if we estimate that a given scan will return > only a few rows, what happens if we're wrong? We don't want to blow out > memory on the local server by retrieving gigabytes in one go. Oh, I overlooked the possibility of a wrong estimate. Older PostgreSQL versions use 1000 as the default estimate, so a big table which has not been analyzed might crash the backend. To make data retrieval safe, we would need to get the actual size of the result, maybe by executing SELECT COUNT(*) in the planning phase. That sounds too heavy to do for every scan, and it still lacks the actual row width. One possible idea is to change the default value of the min_cursor_rows option to 0 so that pgsql_fdw uses a CURSOR by default, but that alone seems insufficient. I'll drop the simple SELECT mode from the first version of pgsql_fdw for safety. >> I haven't had much experience with cursors myself, but are they as >> efficient as non-cursor access? > > No, but if you need max efficiency you shouldn't be using foreign tables > in the first place; they're always going to be expensive to access. > > It's likely that making use of native protocol portals (instead of > executing a lot of FETCH commands) would help. But I think we'd be well > advised to do the first pass with just the existing libpq facilities, > and then measure to see where to improve performance. I long for protocol-level cursors. 
:) -- Shigeru Hanada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
2011/10/29 Hitoshi Harada : > I have a doubt here about sharing a connection for each server. What if > there are simultaneous scans on the same plan? Say, > > -> Nested Loop > -> Foreign Scan to table T1 on server A > -> Foreign Scan to table T2 on server A > > Okay, you are thinking about Foreign Join, so the example above is too > simple. But it is always possible to execute such a query if the foreign > scan nodes are far apart, isn't it? As far as I can see from your > explanation, scan T1 and scan T2 share the same connection. Now the join > node scans one row from the left (T1) while asking for rows from the right (T2) > without fetching all the rows from the left. If T2 sends a request to server A, > the connection's pending result (for T1) is discarded. Do I understand > correctly? I think that sharing a connection doesn't cause any problem. In cursor mode, using multiple cursors concurrently through one connection is OK. In SELECT mode, pgsql_fdw executes the SELECT statement with PQexecParams and retrieves the whole result *inside* the first Iterate call for an outer tuple. So the libpq connection is already available when another scan needs to call its Iterate function. -- Shigeru Hanada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Sat, Oct 29, 2011 at 4:36 PM, Tom Lane wrote: >> Well, it might not be strictly necessary for pg_utf8_increment() and >> pg_eucjp_increment(), but it's clearly necessary for the generic >> incrementer function for exactly the same reason it was needed in the >> old coding. I suppose we could weaken the rule to "you must leave a >> valid character behind rather than a bunch of bytes that doesn't >> encode to a character", but the cycle savings are negligible and the >> current rule seems both simpler and more bullet-proof. > > No, it's *not* necessary any more, AFAICS. make_greater_string discards > the data and overwrites it with a null instantly upon getting a failure > return from the incrementer. The reason we used to need it was that we > did pg_mbcliplen after failing to increment, but now we do that before > we ever increment anything, so we already know the length of the last > character. It doesn't matter whether those bytes are still valid or > contain garbage. Oh, dude. I think you are right. I guess we can rip that crap out then. That's much cleaner. >>> I'm also quite distressed that you ignored my advice to limit the number >>> of combinations tried. This patch could be horribly slow when dealing >>> with wide characters, eg think what will happen when starting from >>> U+1. > >> Uh, I think it will try at most one character in that position and >> then truncate away that character entirely, per my last email on this >> topic (to which you never responded): >> http://archives.postgresql.org/pgsql-hackers/2011-09/msg01195.php > > Oh! You are right, I was expecting it to try multiple characters at the > same position before truncating the string. This change seems to have > lobotomized things rather thoroughly. What is the rationale for that? > As an example, when dealing with a single-character string, it will fail > altogether if the next code value sorts out-of-order, so this seems to > me to be a rather large step backwards. 
> > I think we ought to go back to the previous design of incrementing till > failure and then truncating, which puts the onus on the incrementer to > make a reasonable tradeoff of how many combinations to try per character > position. There's a simple tweak we could make to the patch to limit > that: once we've maxed out a lower-order byte of a multibyte char, > *don't reset it to minimum*, just move on to incrementing the next > higher byte. This preserves the old property that the maximum number of > combinations tried is bounded by 256 * string's length in bytes. On this point I believe you are still confused. The old code tried one character per position, and the new code tries one character per position. Nothing has been lobotomized in any way. The difference is that the old code used a "guess and check" approach to generate the character, so there was an inner loop that was trying to generate a character (possibly generating various garbage strings that did not represent a character along the way) and then, upon success, checked the sort order of that single string before truncating and retrying. The new code does exactly the same thing in the outer loop - i.e. truncate one character per iteration - but the inner loop has, at least for UTF-8 and EUC-JP, been replaced with an algorithm that is guaranteed to produce a valid character without needing to loop. Now having said that, I think there is a possibility for some improvement here. If we know we're not going to spend a lot of time uselessly screwing around trying to get something that will pass pg_verifymbstr(), then we could probably afford to call ltproc several times per position, rather than just once. But that's not restoring the behavior of the old algorithm; that's improving on the old algorithm. And before we do that, we need to think about a couple of things: first, that silly looping behavior is still there for anything other than UTF-8 and EUC-JP. 
Until we have constant-time increment functions for every encoding we support, we probably don't want to get too jiggy with it; second, we need to convince ourselves that this will succeed in a meaningful number of cases where the current algorithm fails. I played around a bit with the UTF-8 case (with collation = en_US.UTF-8) before committing this and I suspect that trying 4 or 5 characters per position could be a win - you might for example be looking at something like an accented E, and you might have several different versions of E in a row before you get to something that's no longer E-like. If you get to that point and still don't find something that compares favorably, it's probably time to throw in the towel. Another idea would be to do the first increment by one, and then increment by two, four, eight before giving up, or something like that. It probably needs some research, and frankly I'm happy to leave it to someone who is having a real-world problem with it. The fact that we haven't gotten any complaints be
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
I agree that this feature would be quite useful and should be included in SQL. The exact syntax is less of an issue, but just the ability to cleanly say "select all columns except for these". I have in fact argued for the same feature in the past. If you want to and can implement this feature then more power to you. I'll look forward to it being in Pg 9.2. I think then the only discussion point should be what (terse) syntax to use for it, not whether the feature should exist at all. Arguing against this feature is like arguing against supporting "where not()" or "except" or "not in". One should be able to do complements not only of rows but of columns too. Basic good language design. -- Darren Duncan Eric Ridge wrote: Would y'all accept a patch that extended the "SELECT *" syntax to let you list fields to exclude from the A_Star? Quite regularly I'll be testing queries via psql and want to see all the columns from a fairly wide table except maybe a giant text or xml column. A syntax like: SELECT * EXCLUDING (big_col1, big_col2) FROM foo; would be pretty handy. It would definitely save some typing in certain cases. It seems like such a syntax would better document the intent of a query too, rather than leaving one wondering if "big_col1" was supposed to be omitted from the target list or not. Anyways, I just wanted to run the idea by youse guys before I put too much more effort into it. I've already made what appear to be the minimum necessary changes to gram.y, and a few quick greps through the code make me think the rest will be pretty easy. Maybe the SQL spec says something about this and nobody's done the work yet? Thanks for your input! eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
On Sat, Oct 29, 2011 at 6:35 PM, Stephen Frost wrote: >> Maybe the SQL spec says something about this and nobody's done the work yet? > > I don't know of anything like this in the spec. Also, there would be > concern about this possibly going against spec, breaking possibly valid > queries, promoting keywords to reserved words, and maybe ending up in a > bad situation if the SQL committee decides to support that kind of > syntax for something completely different. At least concerning breaking valid queries and promoting keywords, I don't think the former can happen (they'd fail to parse today) and the latter doesn't seem necessary as "EXCLUDING"'s use in this case appears to be completely unambiguous. However, I realize there's no second-guessing what the SQL committee might do in the future. > In general, I doubt this is something we'd implement, but others may > feel differently. I hope so. :) > What might be interesting to consider is how hard it > would be to make psql smarter when it comes to line editing in this > regard. Maybe if there was a way to easily expand the '*' from psql and > then you could remove the columns from the list easily..? Probably really dang hard, especially when you consider a "SELECT *" involving lots of joins. And even if it turned out to be easy, it would be limited to psql. Anyways, it's just something I've wanted for quite awhile and thought I'd actually do the work to make it happen, *if* y'all would take it. eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Eric, * Eric Ridge (eeb...@gmail.com) wrote: > It seems like such a syntax would better document the > intent of a query too, rather than leaving one wondering if "big_col1" > was supposed to be omitted from the target list or not. Well, I expect most here would say that any application query should be specific about exactly what columns it wants (iow- don't use select * in your apps :). As for it being useful in psql, I could see that. > Anyways, I just wanted to run the idea by youse guys before I put too > much more effort into it. I've already made what appear to be the > minimum necessary changes to gram.y, and a few quick greps through the > code make me think the rest will be pretty easy. > > Maybe the SQL spec says something about this and nobody's done the work yet? I don't know of anything like this in the spec. Also, there would be concern about this possibly going against spec, breaking possibly valid queries, promoting keywords to reserved words, and maybe ending up in a bad situation if the SQL committee decides to support that kind of syntax for something completely different. In general, I doubt this is something we'd implement, but others may feel differently. What might be interesting to consider is how hard it would be to make psql smarter when it comes to line editing in this regard. Maybe if there was a way to easily expand the '*' from psql and then you could remove the columns from the list easily..? Thanks, Stephen signature.asc Description: Digital signature
[HACKERS] Thoughts on "SELECT * EXCLUDING (...) FROM ..."?
Would y'all accept a patch that extended the "SELECT *" syntax to let you list fields to exclude from the A_Star? Quite regularly I'll be testing queries via psql and want to see all the columns from a fairly wide table except maybe a giant text or xml column. A syntax like: SELECT * EXCLUDING (big_col1, big_col2) FROM foo; would be pretty handy. It would definitely save some typing in certain cases. It seems like such a syntax would better document the intent of a query too, rather than leaving one wondering if "big_col1" was supposed to be omitted from the target list or not. Anyways, I just wanted to run the idea by youse guys before I put too much more effort into it. I've already made what appear to be the minimum necessary changes to gram.y, and a few quick greps through the code make me think the rest will be pretty easy. Maybe the SQL spec says something about this and nobody's done the work yet? Thanks for your input! eric -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] strange code in array_in
On Sat, 2011-10-29 at 15:13 -0400, Tom Lane wrote: > What other lookups? I just meant anything after that point in the function would surely fail (get_type_io_data). > array_out, and I believe a bunch of other places, use the same trick. OK. In retrospect it is a very simple trick, but at the time I was slightly confused by it. I guess I just haven't seen that idiom before. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
Hitoshi Harada writes: > On Sat, Oct 29, 2011 at 8:13 AM, Tom Lane wrote: >> I have not looked at the code, but ISTM the way that this has to work is >> that you set up a portal for each active scan. Then you can fetch a few >> rows at a time from any one of them. > Hmm, true. Looking back at the original proposal (I haven't looked > at the code either), there seems to be a cursor mode. ISTM it is hard for the FDW > to know how the whole plan tree looks, so should we always use a cursor > regardless of the estimated row count? I think we have to. Even if we estimate that a given scan will return only a few rows, what happens if we're wrong? We don't want to blow out memory on the local server by retrieving gigabytes in one go. > I haven't had much experience with cursors myself, but are they as > efficient as non-cursor access? No, but if you need max efficiency you shouldn't be using foreign tables in the first place; they're always going to be expensive to access. It's likely that making use of native protocol portals (instead of executing a lot of FETCH commands) would help. But I think we'd be well advised to do the first pass with just the existing libpq facilities, and then measure to see where to improve performance. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
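Tom's memory argument — never pull an unbounded result set in one go — is the core of the cursor approach. The batching pattern can be sketched independently of libpq; everything below is invented for illustration (`rows` stands in for a remote result stream):

```python
# Sketch of cursor-style batched fetching: pull at most `batch` rows at a
# time from a source, so memory use stays bounded regardless of result size.

def fetch_in_batches(rows, batch):
    """Yield lists of up to `batch` rows, like repeated FETCH <batch> calls."""
    it = iter(rows)
    while True:
        chunk = []
        for _ in range(batch):
            try:
                chunk.append(next(it))
            except StopIteration:
                break
        if not chunk:
            return
        yield chunk

batches = list(fetch_in_batches(range(7), 3))
print(batches)  # [[0, 1, 2], [3, 4, 5], [6]]
```

Whether the estimate said 7 rows or 7 billion, the consumer only ever holds one chunk in memory at a time — which is exactly why a wrong row estimate stops being dangerous.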
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
On Sat, Oct 29, 2011 at 8:13 AM, Tom Lane wrote: > Hitoshi Harada writes: >> I have a doubt here about sharing a connection for each server. What if >> there are simultaneous scans on the same plan? Say, > >> -> Nested Loop >> -> Foreign Scan to table T1 on server A >> -> Foreign Scan to table T2 on server A > >> Okay, you are thinking about Foreign Join, so the example above is too >> simple. But it is always possible to execute such a query if the foreign >> scan nodes are far apart, isn't it? As far as I can see from your >> explanation, scan T1 and scan T2 share the same connection. Now the join >> node scans one row from the left (T1) while asking for rows from the right (T2) >> without fetching all the rows from the left. If T2 sends a request to server A, >> the connection's pending result (for T1) is discarded. Do I understand >> correctly? > > I have not looked at the code, but ISTM the way that this has to work is > that you set up a portal for each active scan. Then you can fetch a few > rows at a time from any one of them. Hmm, true. Looking back at the original proposal (I haven't looked at the code either), there seems to be a cursor mode. ISTM it is hard for the FDW to know how the whole plan tree looks, so should we always use a cursor regardless of the estimated row count? I haven't had much experience with cursors myself, but are they as efficient as non-cursor access? Regards, -- Hitoshi Harada -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
Robert Haas writes: > On Sat, Oct 29, 2011 at 3:35 PM, Tom Lane wrote: >> Ummm ... why do the incrementer functions think they need to restore the >> previous value on failure? AFAICS that's a waste of code and cycles, >> since there is only one caller and it doesn't care in the least. > Well, it might not be strictly necessary for pg_utf8_increment() and > pg_eucjp_increment(), but it's clearly necessary for the generic > incrementer function for exactly the same reason it was needed in the > old coding. I suppose we could weaken the rule to "you must leave a > valid character behind rather than a bunch of bytes that doesn't > encode to a character", but the cycle savings are negligible and the > current rule seems both simpler and more bullet-proof. No, it's *not* necessary any more, AFAICS. make_greater_string discards the data and overwrites it with a null instantly upon getting a failure return from the incrementer. The reason we used to need it was that we did pg_mbcliplen after failing to increment, but now we do that before we ever increment anything, so we already know the length of the last character. It doesn't matter whether those bytes are still valid or contain garbage. >> I'm also quite distressed that you ignored my advice to limit the number >> of combinations tried. This patch could be horribly slow when dealing >> with wide characters, eg think what will happen when starting from >> U+1. > Uh, I think it will try at most one character in that position and > then truncate away that character entirely, per my last email on this > topic (to which you never responded): > http://archives.postgresql.org/pgsql-hackers/2011-09/msg01195.php Oh! You are right, I was expecting it to try multiple characters at the same position before truncating the string. This change seems to have lobotomized things rather thoroughly. What is the rationale for that? 
As an example, when dealing with a single-character string, it will fail altogether if the next code value sorts out-of-order, so this seems to me to be a rather large step backwards. I think we ought to go back to the previous design of incrementing till failure and then truncating, which puts the onus on the incrementer to make a reasonable tradeoff of how many combinations to try per character position. There's a simple tweak we could make to the patch to limit that: once we've maxed out a lower-order byte of a multibyte char, *don't reset it to minimum*, just move on to incrementing the next higher byte. This preserves the old property that the maximum number of combinations tried is bounded by 256 * string's length in bytes. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
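The tweak Tom proposes — max out a lower-order byte, then carry to the next-higher byte *without* resetting the lower ones — bounds the search at roughly 256 candidates per byte of the string. A toy sketch of that bound, ignoring multibyte validity entirely (the function name is invented, not PostgreSQL code):

```python
def bounded_candidates(s):
    """Yield successor candidates per the proposed tweak: increment the last
    byte until it maxes out, then move on to the next-higher byte WITHOUT
    resetting lower bytes, so at most ~256 * len(s) candidates are produced."""
    b = bytearray(s)
    for pos in reversed(range(len(b))):
        while b[pos] < 0xFF:
            b[pos] += 1
            yield bytes(b)

# A 2-byte string can never produce more than ~512 candidates.
print(len(list(bounded_candidates(b"\x00\x00"))))  # 510
```

Not resetting the lower bytes is what keeps the bound linear in the string length, instead of combinatorial across byte positions.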
Re: [HACKERS] Add socket dir to pg_config..?
On Fri, Oct 28, 2011 at 06:33:39PM +0200, Dimitri Fontaine wrote: > Andrew Dunstan writes: > > Er, which distros other than debian/ubuntu? > > Well, any and all derivatives I guess, to begin with. > > http://distrowatch.com/dwres.php?resource=independence#debian > Based on Debian GNU/Linux: 129 Distributions > > More seriously, I'm not sure how to understand why some people will both > frown upon distribution allowing themselves to patch the version of > PostgreSQL they are packaging, and vote against making their life > easier. > > If /tmp is the only decent place where to put the socket file on Unix > when security and other concerns are considered, then sure, making > distro life difficult is a good thing to do. But then let's take it to > the FHS that debian and ubuntu are implementing, AFAIUI. > > I'm puzzled, maybe I'm not understanding a key point here though. > > Regards, > -- > Dimitri Fontaine > http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support In Gentoo, we change the socket directory to /var/run/postgresql via pg_config_manual.h. However, I'm not too terribly interested in pg_config outputting the directory location. We inform users at the end of every install where the default location is. Further, all of the packages we maintain build against the sources so the packages automatically know where the socket directory is located. -- Mr. Aaron W. Swenson Gentoo Linux Developer Email: titanof...@gentoo.org GnuPG FP : 2C00 7719 4F85 FB07 A49C 0E31 5713 AA03 D1BB FDA0 GnuPG ID : D1BBFDA0 pgph6Z3VCbZWh.pgp Description: PGP signature
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Sat, Oct 29, 2011 at 3:35 PM, Tom Lane wrote: > Robert Haas writes: >> I've committed this, after a good deal of hacking on the comments, >> some coding style cleanup, and one bug fix: > > Ummm ... why do the incrementer functions think they need to restore the > previous value on failure? AFAICS that's a waste of code and cycles, > since there is only one caller and it doesn't care in the least. Well, it might not be strictly necessary for pg_utf8_increment() and pg_eucjp_increment(), but it's clearly necessary for the generic incrementer function for exactly the same reason it was needed in the old coding. I suppose we could weaken the rule to "you must leave a valid character behind rather than a bunch of bytes that doesn't encode to a character", but the cycle savings are negligible and the current rule seems both simpler and more bullet-proof. > I'm also quite distressed that you ignored my advice to limit the number > of combinations tried. This patch could be horribly slow when dealing > with wide characters, eg think what will happen when starting from > U+1. Uh, I think it will try at most one character in that position and then truncate away that character entirely, per my last email on this topic (to which you never responded): http://archives.postgresql.org/pgsql-hackers/2011-09/msg01195.php -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_upgrade if 'postgres' database is dropped
On Fri, Oct 28, 2011 at 9:22 PM, Bruce Momjian wrote: > OK, the attached, applied patch removes the pg_upgrade dependency on the > 'postgres' database existing in the new cluster. However, vacuumdb, > used by pg_upgrade, still has this dependency: > > conn = connectDatabase("postgres", host, port, username, > prompt_password, progname); > > In fact, all the /scripts binaries use the postgres database, except for > createdb/dropdb, which has this heuristic: > > /* > * Connect to the 'postgres' database by default, except have the > * 'postgres' user use 'template1' so he can create the 'postgres' > * database. > */ > conn = connectDatabase(strcmp(dbname, "postgres") == 0 ? "template1" : > "postgres", > host, port, username, prompt_password, progname); > > This makes sense because you might be creating or dropping the postgres > database. Do we want these to have smarter database selection code? Well, I suppose as long as we're cleaning this up, we might as well be thorough, so, sure, why not? I think the algorithm pg_dumpall uses is pretty sensible: let the user specify the database to use if they so desire; if not, try postgres first and then template1. I think we could stick some logic for that in common.c which could be shared by clusterdb, createdb, dropdb, dropuser, reindexdb, and vacuumdb. However, we need to rethink the flag to be used for this: pg_dumpall uses -l, but many of the other utilities already use that for some other purpose, and it's not exactly mnemonic anyway. "-d" for database could work, but that's also in use in some places, and furthermore somewhat confusing since many if not all of these utilities have an option to operate on a single database only, and you might think that -d would specify the database to operate on, rather than the one to be used to get the list of databases. pgAdmin uses the term "maintenance database" to refer to a database to be used when none is explicitly specified, and I think that's fairly clear terminology. 
So I propose that we add a --maintenance-db option (with no short form, since this is a relatively obscure need) to the tools listed above. The tools will pass the associated value (or NULL if the option is not specified) to the above-mentioned routine in common.c, which will do the rest. If nobody objects, I'll go do that. Hopefully that should be enough to put this problem to bed more or less permanently. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
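The fallback Robert describes — honor an explicit choice, otherwise try postgres, then template1 — is simple enough to sketch. This is an invented helper, not the actual common.c code:

```python
def maintenance_db_candidates(user_specified=None):
    """Databases to try when connecting, in order: an explicitly requested
    maintenance database wins; otherwise try postgres, then template1."""
    if user_specified is not None:
        return [user_specified]
    return ["postgres", "template1"]

print(maintenance_db_candidates())        # ['postgres', 'template1']
print(maintenance_db_candidates("mydb"))  # ['mydb']
```

A caller would attempt a connection to each candidate in turn and stop at the first success, which is what makes the scheme survive a dropped 'postgres' database.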
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
Robert Haas writes: > I've committed this, after a good deal of hacking on the comments, > some coding style cleanup, and one bug fix: Ummm ... why do the incrementer functions think they need to restore the previous value on failure? AFAICS that's a waste of code and cycles, since there is only one caller and it doesn't care in the least. I'm also quite distressed that you ignored my advice to limit the number of combinations tried. This patch could be horribly slow when dealing with wide characters, eg think what will happen when starting from U+1. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] strange code in array_in
Jeff Davis writes: > In array_in(), I see the following code: > my_extra->element_type = ~element_type; > It seems like it was explicitly changed from InvalidOid to > ~element_type. At first I thought it was a mistake, but then I thought > maybe it was to ensure that the next branch was taken even if > element_type == InvalidOid. Exactly. > But the rest of the lookups will surely fail > on InvalidOid, and it seems like it would be easy to test for InvalidOid > at the beginning if we wanted to fail. What other lookups? We do need to protect ourselves against element_type being InvalidOid, and the problem is that if we just set my_extra->element_type to InvalidOid and press on, the lookup won't be made and we'll do something bizarre (possibly even crash) using bogus cached information. On the other hand, element_type *shouldn't* be InvalidOid here, so adding an explicit test for that seemed like a waste of code. It's sufficient to let get_type_io_data complain about the case. array_out, and I believe a bunch of other places, use the same trick. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
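The ~element_type trick Tom describes can be mimicked outside C. The sketch below is hypothetical Python, not array_in itself; the point is only that initializing the cached key to the bitwise complement of the real key guarantees a cache miss on first use, even when the key is 0 (InvalidOid):

```python
# Sketch of the ~element_type trick: seed a cache key with the bitwise
# complement of the real key, so the first comparison always fails and the
# lookup path runs -- which is where a bogus key gets properly reported.

class IOCache:
    def __init__(self, element_type):
        # ~x never equals x, so the first lookup() call always misses.
        self.element_type = ~element_type
        self.io_data = None

    def lookup(self, element_type, do_lookup):
        if self.element_type != element_type:
            self.io_data = do_lookup(element_type)  # would complain for InvalidOid
            self.element_type = element_type
        return self.io_data

calls = []
cache = IOCache(0)                    # even InvalidOid (0) misses: ~0 == -1
cache.lookup(0, lambda oid: calls.append(oid) or "io")
cache.lookup(0, lambda oid: calls.append(oid) or "io")
print(calls)  # [0] -- the lookup ran exactly once
```

Seeding with InvalidOid instead would make the comparison spuriously succeed for a zero key, and the stale cached data would be used without any lookup — the crash scenario Tom mentions.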
[HACKERS]
Would it be possible to remove the double quotes in the daterange display of BC dates?

select '[0001-10-29 BC,2011-10-29)'::daterange;
          daterange
------------------------------
 ["0001-10-29 BC",2011-10-29)
(1 row)

after all, it's also:

select '0001-10-29 BC'::date;
     date
---------------
 0001-10-29 BC
(1 row)

without double quotes. It's not important, perhaps; it just looks ugly to me. Do they serve a purpose? Erik Rijkers
Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases
On Sat, Oct 29, 2011 at 1:16 PM, Kyotaro Horiguchi wrote: > Hello, I feel at a loss what to do... > >> I thought that code was looking for 0xED/0xF4 in the second position, >> but it's actually looking for them in the first position, which makes >> vastly more sense. Whee! > > Anyway, I will try to describe another aspect of this code at present. I've committed this, after a good deal of hacking on the comments, some coding style cleanup, and one bug fix: + workstr[len] = '\0'; Without that, when we start fiddling with the second-to-last character, the last one is still hanging around, which is different than the old behavior. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
[HACKERS] Re: [v9.2] make_greater_string() does not return a string in some cases
Hello, I feel at a loss what to do... > I thought that code was looking for 0xED/0xF4 in the second position, > but it's actually looking for them in the first position, which makes > vastly more sense. Whee! Anyway, I will try to describe another aspect of this code at present.

The switch block in pg_utf8_increment folds together five individual manipulations according to the byte length of the sequence. The separation presupposes that the input bytes and length form a valid UTF-8 sequence.

For a sequence of 5 or more bytes in length, it returns false.

For 4 bytes, the sequence ranges between U+10000 and U+10ffff. If charptr[3] is less than 0xbf, increment it and return true. Else assign 0x80 to charptr[3], and then if charptr[2] is less than 0xbf, increment it and return true. Else assign 0x80 to charptr[2], and then, if (charptr[1] is less than 0x8f when charptr[0] == 0xf4) or (charptr[1] is less than 0xbf when charptr[0] != 0xf4), increment it and return true. Else assign 0x80 to charptr[1], and then if charptr[0] is not 0xf4, increment it and return true. Else the input sequence must be 0xf4 0x8f 0xbf 0xbf, which represents U+10ffff, and this is the upper limit of UTF-8 representation. Restore the sequence and return false.

For 3 bytes, the sequence ranges between U+0800 and U+ffff. If charptr[2] is less than 0xbf, increment it and return true. Else assign 0x80 to charptr[2], and then, if (charptr[1] is less than 0x9f when charptr[0] == 0xed) or (charptr[1] is less than 0xbf when charptr[0] != 0xed), increment it and return true. (The sequence 0xed 0x9f 0xbf, which represents U+d7ff, is in the end incremented to 0xee 0x80 0x80, U+e000, skipping the surrogate area.) Else assign 0x80 to charptr[1], and then if charptr[0] is not 0xef, increment it and return true. Else the input sequence must be 0xef 0xbf 0xbf, which represents U+ffff, and the next UTF-8 sequence has a length of 4. Restore the sequence and return false.

For 2 bytes, the sequence ranges between U+80 and U+7ff. If charptr[1] is less than 0xbf, increment it and return true. Else assign 0x80 to charptr[1], and then if charptr[0] is not 0xdf, increment it and return true. Else the input sequence must be 0xdf 0xbf, which represents U+7ff, and the next UTF-8 sequence has a length of 3. Restore the sequence and return false.

For 1 byte, the byte ranges between U+0 and U+7f. If charptr[0] is less than 0x7f, increment it and return true. Else the input sequence must be 0x7f, which represents U+7f, and the next UTF-8 sequence has a length of 2. Restore the sequence and return false.

-- Kyotaro Horiguchi
[HACKERS] strange code in array_in
In array_in(), I see the following code: my_extra->element_type = ~element_type; It seems like it was explicitly changed from InvalidOid to ~element_type. At first I thought it was a mistake, but then I thought maybe it was to ensure that the next branch was taken even if element_type == InvalidOid. But the rest of the lookups will surely fail on InvalidOid, and it seems like it would be easy to test for InvalidOid at the beginning if we wanted to fail. Can someone please explain, and perhaps include a comment indicating what's going on? Or is it just too early and I missed something simple? Regards, Jeff Davis
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
Hitoshi Harada writes: > I have a doubt here, on sharing connection for each server. What if > there are simultaneous scan on the same plan? Say, > -> Nested Loop > -> Foreign Scan to table T1 on server A > -> Foreign Scan to table T2 on server A > Okay, you are thinking about Foreign Join, so example above is too > simple. But it is always possible to execute such a query if foreign > scan nodes are separated far, isn't it? As far as I see from your > explanation, scan T1 and scan T2 share the same connection. Now join > node scans one row from left (T1) while asking rows from right (T2) > without fetching all the rows from left. If T2 requests to server A, > the connection's result (of T1) is discarded. Am I understand > correctly? I have not looked at the code, but ISTM the way that this has to work is that you set up a portal for each active scan. Then you can fetch a few rows at a time from any one of them. If you're doing this through libpq, it'd be necessary to implement each scan using a cursor. I'm not sure whether it'd be worth our time to add more functions to libpq to allow more-direct access to the protocol portal feature. regards, tom lane
Re: [HACKERS] So where are we on the open commitfest?
On Fri, Oct 28, 2011 at 8:50 PM, Tom Lane wrote: > * Non-inheritable check constraints > > Greg Stark claimed this one for committing a few weeks ago, but has > not done anything visible since then. Greg? Sorry, I had hoped to look at it during pgconfeu but found Amsterdam a bit too distracting. I'm looking at it now. -- greg
Re: [HACKERS] Add socket dir to pg_config..?
On Fri, Oct 28, 2011 at 5:33 PM, Dimitri Fontaine wrote: > I'm puzzled, maybe I'm not understanding a key point here though. I think the only thing you're missing here is that, despite appearances some days, Postgres is not run by a single hive mind. Tom thinks there needs to be a single location for all builds as it's part of the protocol, others agree with the approach Debian took, and we haven't really resolved that discrepancy. So Postgres's treatment of this configuration is naturally a bit schizophrenic. -- greg
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
On Sat, Oct 29, 2011 at 12:25:46AM -0700, Hitoshi Harada wrote: > I have a doubt here, on sharing connection for each server. What if > there are simultaneous scan on the same plan? Say, > > -> Nested Loop > -> Foreign Scan to table T1 on server A > -> Foreign Scan to table T2 on server A > > Okay, you are thinking about Foreign Join, so example above is too > simple. But it is always possible to execute such a query if foreign > scan nodes are separated far, isn't it? As far as I see from your > explanation, scan T1 and scan T2 share the same connection. Now join > node scans one row from left (T1) while asking rows from right (T2) > without fetching all the rows from left. If T2 requests to server A, > the connection's result (of T1) is discarded. Am I understand > correctly? This would need to be factored into the cost calculations. For remote servers there is an overhead per tuple transmitted, so in the above case it might actually be quicker to do the nested loop locally. To handle the parallel case you might need to materialise in the inner loop; that would avoid the double scan. Or we could fix the protocol so you can stream multiple queries at once. Actually, you can already do this if you use DECLARE CURSOR for all the queries upfront and then FETCH as needed. That way you can do it all over one connection. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > He who writes carelessly confesses thereby at the very outset that he does > not attach much importance to his own thoughts. -- Arthur Schopenhauer
Re: [HACKERS] unite recovery.conf and postgresql.conf
On Fri, Sep 9, 2011 at 10:56 AM, Fujii Masao wrote: > In previous discussion, we've reached the consensus that we should unite > recovery.conf and postgresql.conf. The attached patch does that. The > patch is WIP, I'll have to update the document, but if you notice something, > please feel free to comment.

My short summary of the thread is:

Fujii proposes we allow parameters currently in recovery.conf to be specified in postgresql.conf. All agree to that.

Fujii suggests that if we have both postgresql.conf and recovery.conf then recovery.conf should contain overrides. Fujii then suggests that if such an override exists, then SHOW would not work properly. Magnus is rightly horrified and many speak against allowing recovery.conf to continue to exist for this reason. I note that if recovery.conf is an include file of postgresql.conf then the overrides would work correctly, just as if postgresql.conf had multiple settings for that parameter. So the premise is incorrect, and the conclusion is not relevant.

Simon, JD and Greg Stark speak in favour of the usefulness of having a recovery.conf separate from postgresql.conf. Tatsuo confirms pgpool uses this.

Simon, Fujii and Peter agree an automatic include of recovery.conf would be useful.

Robert points out that pg_ctl promote was a good feature.

Simon and JD say that backwards compatibility is important.

Everybody agrees a neater way of invoking standby mode would be good.

Peter points out that including recovery target parameters in postgresql.conf would be difficult and require manual editing, and also that pg_ctl -o is not a suitable interface.

The thread also includes a variety of other alternate ideas, misunderstandings and other commentary.

- - -

My thoughts on how to resolve this are...

Everybody agrees these two points:

* Allow recovery parameters to be handled the same way as other GUCs, and specified in postgresql.conf if desired.
* Allow parameters to be reloaded at SIGHUP and visible using SHOW.
Those two things do not themselves force us to break backwards compatibility.

We also agree that we want a neater way to start up in standby mode. In 9.1 we added "pg_ctl promote" as a better way of indicating failover/switchover. When we did that we kept the trigger_file parameter added in 9.0, which shows it is possible to add a new API without breaking backwards compatibility. We should add a "pg_ctl standby" command as a better way of indicating starting up (also described as triggering) standby mode. We keep the standby_mode parameter. There is no difference here between file based and stream based replication: you can have file, stream or both file and stream (as intended). In this mode the recovery target parameters are *ignored* even if specified (explained below). http://developer.postgresql.org/pgdocs/postgres/recovery-target-settings.html

In 9.2 the presence of recovery.conf can and therefore should continue to act as it does in 9.1. This file should be automatically included at the end of postgresql.conf, which naturally and with no additional code allows us to override settings, with overrides visible by SHOW. We don't make any specific checks to see if someone has added a postgresql.conf parameter in there. If there is a recovery target parameter in recovery.conf we enter recovery, otherwise we operate as a standby. recovery.conf is no longer *required* for standby modes.

These things are announced as deprecated and will be removed when we go to release 10.0:

* trigger_file
* standby_mode
* recovery.conf indicating standby

recovery.conf should continue to be required to perform a PITR. If we place the recovery_target parameters into postgresql.conf we will have no way to differentiate between (1) a recovery that has successfully completed then crashed and (2) a user-specified recovery, which was the original rationale for its use.
This is OK, since we now encourage people to enter a recovery by creating recovery.conf and, for entering standby mode, to use a new cleaner API without the confusing use of the word "recovery". I think that meets all requirements, as far as technically possible. Best Regards -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
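The automatic-include behaviour described above could look roughly like this in postgresql.conf (a hypothetical sketch of the proposal, not committed syntax; parameter values are placeholders):

```
# postgresql.conf (sketch of the proposed 9.2 layout)
hot_standby = on
primary_conninfo = 'host=10.0.0.1'

# Implicit include at the end of the file: anything set in
# recovery.conf overrides the settings above, by the same
# last-setting-wins rule as a duplicated parameter within
# postgresql.conf itself -- and SHOW reports the effective value.
include 'recovery.conf'
```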
Re: [HACKERS] So where are we on the open commitfest?
On Fri, Oct 28, 2011 at 9:50 PM, Simon Riggs wrote: >> * unite recovery.conf and postgresql.conf >> >> This one also seems to be lacking consensus more than anything else. >> What do we do about that? > > I'll re-read the thread in detail to see if I can break impasse. I've added a new comment onto the OP to summarise this. ISTM there is a good way forwards. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] So where are we on the open commitfest?
On Sat, Oct 29, 2011 at 2:21 AM, Fujii Masao wrote: > On Sat, Oct 29, 2011 at 5:50 AM, Simon Riggs wrote: >> On Fri, Oct 28, 2011 at 8:50 PM, Tom Lane wrote: >> >>> * Separating bgwriter and checkpointer >>> >>> Same for this one. >> >> Will commit by end of Monday > > There are plenty of source comments (and probably documents) describing that > checkpoint is performed by bgwriter, but the patch that you posted > didn't correct > them. Are you going to include the change of them in the patch? Or commit > separately? Yes, I will resolve all your comments either in the first commit, or in ones immediately following. Thank you again for making those observations. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] So where are we on the open commitfest?
On 29.10.2011 06:40, Tom Lane wrote: Robert Haas writes: On Fri, Oct 28, 2011 at 3:50 PM, Tom Lane wrote: * Range Types This has certainly had plenty of work done too. If it's not committable yet, I think we should mark it Returned With Feedback for now. I have been thinking about looking at committing at least part of this, but thought Heikki might be planning to pick it up. Yeah, this one is in Heikki's court AFAIK. I'm waiting for Jeff to fix the caching of type metadata in range type functions to work like the corresponding caching in array functions. I believe that's the last outstanding item on that patch. Jeff, can you get around to doing that in the next few days? If not, I might go and do it myself, but I'm lazy and would prefer not to ;-). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] pgsql_fdw, FDW for PostgreSQL server
2011/10/25 Shigeru Hanada : > > Connection management > = > The pgsql_fdw establishes a new connection when a foreign server is > accessed first for the local session. Established connection is shared > between all foreign scans in the local query, and shared between even > scans in following queries. Connections are discarded when the current > transaction aborts so that unexpected failure won't cause connection > leak. This is implemented with resource owner mechanism. > I have a doubt here, on sharing connection for each server. What if there are simultaneous scan on the same plan? Say, -> Nested Loop -> Foreign Scan to table T1 on server A -> Foreign Scan to table T2 on server A Okay, you are thinking about Foreign Join, so example above is too simple. But it is always possible to execute such a query if foreign scan nodes are separated far, isn't it? As far as I see from your explanation, scan T1 and scan T2 share the same connection. Now join node scans one row from left (T1) while asking rows from right (T2) without fetching all the rows from left. If T2 requests to server A, the connection's result (of T1) is discarded. Am I understand correctly? Regards, -- Hitoshi Harada