Re: [HACKERS] Does anybody use ORDER BY x USING y?
On Sun, Sep 18, 2005 at 11:23:01PM -0400, Tom Lane wrote: snip class families to relate opclasses for different datatypes. Basically I'd like to solve most of these issues by constructing a new layer atop opclasses, not by deciding that an opclass doesn't convey the full story about the behavior of an index column. Where I'm currently going is creating a table of COLLATE orders. These collate orders would refer to operator classes but tweak them. For example, things like: - Sort ascending or descending (descending reverses the bt*cmp test) - NULLs first or last - Locale for text types - etc They could be declared in the operator class definition, or generated automatically. You could then do things like: CREATE INDEX ... (field1 COLLATE ascending, field2 COLLATE descending) for those queries where you want ascending on one column and descending on another. Or perhaps: CREATE INDEX ... (textfield COLLATE ignore_case) CREATE INDEX ... (textfield COLLATE locale_us) CREATE INDEX ... (textfield COLLATE optimise_regex) CREATE INDEX ... (point COLLATE distance) However, I can't see how this can relate families of operator classes like you talk about Tom. ISTM that needs to dealt with somewhere else, given that it's unrelated to order. This is going way out of spec though... -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. pgpZ1XhWSBlJF.pgp Description: PGP signature
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On P, 2005-09-18 at 23:34 -0400, Tom Lane wrote: Greg Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: That would be an extremely bad idea, because it would immediately remove index scans as one way to meet an ORDER BY. Well couldn't the index scan be taught to go fetch the NULLs in a separate traversal? (1) IS NULL is not an indexable operation, so no, not without significant overhaul of the index AM API. But we do store NULLs in indexes, so why is it not indexable? This is either an interface bug (not making use of stored info) or storage bug (wasting space storing unneccessary info) (2) This propagates a problem that is specific to orderable indexes (ie btree) into code that is generic to all indexes, and thus creates the problem of how do you deal with specifying NULL ordering without any definition of ordering for non-NULLs. we dont need an ordering of NULLs for cases without ORDER BY. You can't specify NULLS FIRST/LAST without ORDER BY. When one needs to use index for ordering we could use a plan like APPEND INDEX SCAN FOR NULLS, FILTER IS NULL INDEX SCAN FOR NOT NULLS, FILTER IS NOT NULL if NULL's are needed to be returned as sorted first/last If no index scan is used, sorting code should be made smart enough to recognize nulls and deal with it. (3) You still have to invent a mechanism to define whether you want nulls first or last ... and make sure that that mechanism works for plans that use explicit SORT steps as well as those that use index scans. The main place I see problems is multiple field indexes, where some non- first field is null. For single field indexes simply making two index scans, possibly in different directions seems easy. -- Hannu Krosing [EMAIL PROTECTED] ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] FW:How to modify a tuple returned by SPI_execute
Sorry for offtopic, since the pgsql-general ML does not seem to work I post here. I checked the documentation, and still do not get it. I can use SPI_copytuple to return a modified version of a tuple, but how do I modify a column of type HeapTuple. In my case, I just want to modify a INT32 column for sorting. Any ideas? Regards, Mario Weilguni ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Beta2 Wrap Up ...
Also, the change to pg_cancel_backend breaks backwards compatibility with 8.0, which is a whole lot worse than breaking it with 8.1-beta1. Unfortunately, core doesn't see this as backward compatibility break, instead it's regarded as adjustment of a new function. Anything that's not in core isn't worth a single thought This function has been in core since 8.0. The other functions were new (to core that is, imported from external module), but this one was in the main backend already. Is there a reason the old/new can't be aliaseed to each other, instead of the old just being removed? It shuold be possible to do for the changed names - not sure if we want to. Can't do it for pg_cancel_backend(), because only the return type has changed and you acn't overload two functions that differ only in return type. //Magnus ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Beta2 Wrap Up ...
Also, the change to pg_cancel_backend breaks backwards compatibility with 8.0, which is a whole lot worse than breaking it with 8.1-beta1. Yeah, I thought about that (and Bruce and I already discussed it offlist before I committed the changes). The function was newly added in 8.0 -- if we're *ever* going to fix it, fixing it before 8.1 ships is the best time to do so. I would also guess that (a) not many people are using the function (b) the changes in client code should be minimal (as you point out). So IMHO making the API change and noting it in the release notes was probably best. Hmm. Yeah, I agree provided the if we're ever going to. I'm just not as convinced we have to do it - it's not that broken in the first place. Sure, can be fairly easily recoded with CASE, but... If nothing else this needs to go in as a backwards incompatible change in the release notes. This is already done. Great. //Magnus ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On Mon, Sep 19, 2005 at 11:13:05AM +0300, Hannu Krosing wrote: (1) IS NULL is not an indexable operation, so no, not without significant overhaul of the index AM API. But we do store NULLs in indexes, so why is it not indexable? This is either an interface bug (not making use of stored info) or storage bug (wasting space storing unneccessary info) Err, indexes used to not store NULLs to save space. However, it turns out that SQL UNIQUE has something to say about NULLs in unique columns so they had to be included. However, the machinary to decide if an index is usable assumes that usable operators have two arguments and IS NULL isn't really an operator in the PostgreSQL sense and doesn't have two arguments either. *If* that can be fixed, then we can be more flexible. But if it were easy it would have been done long ago... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. pgpdwzN3w7ZIG.pgp Description: PGP signature
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On E, 2005-09-19 at 11:24 +0200, Martijn van Oosterhout wrote: On Mon, Sep 19, 2005 at 11:13:05AM +0300, Hannu Krosing wrote: (1) IS NULL is not an indexable operation, so no, not without significant overhaul of the index AM API. But we do store NULLs in indexes, so why is it not indexable? This is either an interface bug (not making use of stored info) or storage bug (wasting space storing unneccessary info) Err, indexes used to not store NULLs to save space. However, it turns out that SQL UNIQUE has something to say about NULLs in unique columns so they had to be included. surely not UNIQUE hannu=# create table tabuniq(i int ); CREATE TABLE hannu=# create index tabuniq_ndx on tabuniq(i); CREATE INDEX hannu=# insert into tabuniq values(1); INSERT 20560497 1 hannu=# insert into tabuniq values(2); INSERT 20560498 1 hannu=# insert into tabuniq values(null); INSERT 20560499 1 hannu=# insert into tabuniq values(null); INSERT 20560500 1 maybe the problem is with PRIMARY KEY However, the machinary to decide if an index is usable assumes that usable operators have two arguments and IS NULL isn't really an operator in the PostgreSQL sense and doesn't have two arguments either. *If* that can be fixed, then we can be more flexible. But if it were easy it would have been done long ago... sure :) -- Hannu Krosing [EMAIL PROTECTED] ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] 64-bit API for large objects
Mark, If you don't mind contributing the changes, we'd be glad to take a look at them. Thanks. -JonahOn 9/18/05, Mark Dilger [EMAIL PROTECTED] wrote: My company has written a 64-bit large object API, extending the postgresqlserver to be able to read/write/seek/tell/open/close objects larger than 2GB.If the hackers community considers this valuable, we will submit the changes back for the rest of the community to share. From one of my programmers, Jeremy Drake:I tested this out on my box with a 4gb dvd iso image, and it appears towork correctly.The test code I found for large object things does not really seem to exercise the api very well though.And the regressiontests do not seem to even touch large objects (they all still pass afterthis change).Mark, can you take a look at this and make sure I haven't broken anything too obviously?I wrote it into the same file as the large objectcode, since that filehas some static stuff for caching things which I would like to share.Iopted to add new functions tell64 and seek64 rather than changing the existing ones for backwards compatibility.I plugged them into thepg_proc catalog, but everything in that file has an explicit OID, and I donot feel comfortable (yet) grabbing up OIDs for stuff.So I set them to an OID of zero, which means the scripts will assign it one which is notused (in the range 1-something).Since the convention is that suchfunctions have explicit assigned OIDs, it would probably be required to get real ones if this were ever to be submitted back.Also, in the libpqstuff, at the moment I have it fail if it cannot find the seek64 or tell64functions.It may be best to have it work as long as you don't try to call them, in order to preserve backwards compatibility with other serverversions.If you think this is a reasonable patch, it might be nice to send it tothem, be a good neighbor and all that... ---(end of broadcast)---TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match-- Respectfully,Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporationhttp://www.enterprisedb.com/
Re: [HACKERS] New dot releases
Hi, On Fri, 16 Sep 2005, Michael Fuhr wrote: On Fri, Sep 16, 2005 at 09:28:39AM -0600, Michael Fuhr wrote: FWIW, I have a Solaris 9/sparc box with gcc 3.4.2 (same setup as gerbil) and have no problems with REL7_2_STABLE through HEAD. I'll test REL8_0_STABLE with gerbil's configure options when I get a chance. I just built REL8_0_STABLE with the following configure options (same as gerbil): ./configure --enable-cassert --enable-debug --enable-nls \ --enable-integer-datetimes --with-perl --with-python \ --with-openssl --with-pgport=5682 gmake check returned the following: == All 96 tests passed. == So no need to hold the new dot releases? :) I want to work on new RPM sets and don't want to apply countless patches... :) Regards, -- Devrim GUNDUZ Kivi Bilişim Teknolojileri - http://www.kivi.com.tr devrim~gunduz.org, devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr http://www.gunduz.org ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[HACKERS] DISTINCT vs. GROUP BY
I was wondering whether it is possible to teach the planner to handle DISTINCT in a more efficient way: em=# explain select distinct lastname from import.testtest; QUERY PLAN Unique (cost=2647377.45..2709467.70 rows=1 width=7) - Sort (cost=2647377.45..2678422.58 rows=12418051 width=7) Sort Key: lastname - Seq Scan on testtest (cost=0.00..370082.51 rows=12418051 width=7) (4 Zeilen) Isn't it possible to perform the same operation using a HashAggregate? We have seen that a GROUP BY workaround is usually a lot faster than sort-unique - at least when work_mem is large enough. best regards, hans -- Cybertec Geschwinde Schönig GmbH Schöngrabern 134; A-2020 Hollabrunn Tel: +43/1/205 10 35 / 340 www.postgresql.at, www.cybertec.at ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] DISTINCT vs. GROUP BY
On Mon, 2005-19-09 at 16:27 +0200, Hans-Jürgen Schönig wrote: I was wondering whether it is possible to teach the planner to handle DISTINCT in a more efficient way: [...] Isn't it possible to perform the same operation using a HashAggregate? One problem is that DISTINCT ON is defined to return the first unique row (according to the query's ORDER BY) for the set of DISTINCT ON columns, which can't easily be done via hashing. -Neil ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New dot releases
Devrim GUNDUZ [EMAIL PROTECTED] writes: So no need to hold the new dot releases? :) I still object to releasing them until we find out what's going on on gerbil. That machine was building 8.0 fine until the patch, and it's failing consistently since then. To assume this is not our problem would be the height of hubris. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Beta2 Wrap Up ...
Marc G. Fournier [EMAIL PROTECTED] writes: Is there a reason the old/new can't be aliaseed to each other, instead of the old just being removed? Any change like that would require another initdb. If we were going to force another initdb, my vote would be to revert these functions to where they were in beta1. It was a mistake to change them in such a hurry. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Start translating
Peter Eisentraut [EMAIL PROTECTED] writes: I think beta 2 is a good time to start fixing up the translations again. One thing I had wanted to do before issuing the call for translations was to make a pass over the recent GiST changes, looking at elog-vs-ereport decisions and message wording. With all due respect to Oleg and Teodor, English is not their first language, and the messages need some work. I'm on quasi-vacation this week and don't have the time to do anything in this line; perhaps you can look at it? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] DISTINCT vs. GROUP BY
=?ISO-8859-1?Q?Hans-J=FCrgen_Sch=F6nig?= [EMAIL PROTECTED] writes: I was wondering whether it is possible to teach the planner to handle DISTINCT in a more efficient way: Probably (although the interactions with ORDER BY might be tricky). No one has touched that part of the planner in a very long time. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Per-table freeze limit proposal
It would also be very handy to be able to tell how many transactions (or inserts/updates/deletes) have occured since the last vacuum. Presumably autovacuum needs to know this already, but is it exposed? On Thu, Sep 15, 2005 at 07:46:26AM -0400, Jim Buttafuoco wrote: while you are at it, can you put in some audit timestamps as to when the vacuum occurred (full vs not full). -- Original Message --- From: Alvaro Herrera [EMAIL PROTECTED] To: Hackers pgsql-hackers@postgresql.org Sent: Wed, 14 Sep 2005 22:14:23 -0400 Subject: [HACKERS] Per-table freeze limit proposal Hackers, As you've probably heard too many times already, I'm thinking in improving vacuum, so we can keep track of the freeze Xid on a table level, rather than database level. Hopefully this will eliminate the need for database-wide vacuums. -- Jim C. Nasby, Sr. Engineering Consultant [EMAIL PROTECTED] Pervasive Software http://pervasive.comwork: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461 ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Beta2 Wrap Up ...
Is there a reason the old/new can't be aliaseed to each other, instead of the old just being removed? Any change like that would require another initdb. If we were going to force another initdb, my vote would be to revert these functions to where they were in beta1. It was a mistake to change them in such a hurry. If we want to do that, we should probably try to roll a beta3 with that as soon as possible, so not too many people have upgraded to beta2... //Magnus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Does anybody use ORDER BY x USING y?
Martijn van Oosterhout kleptog@svana.org writes: On Sun, Sep 18, 2005 at 11:23:01PM -0400, Tom Lane wrote: snip class families to relate opclasses for different datatypes. Basically I'd like to solve most of these issues by constructing a new layer atop opclasses, not by deciding that an opclass doesn't convey the full story about the behavior of an index column. The thing is that these opclasses you're describing are closely related. It ought to be possible to use a single index to produce results in any of the four orders you describe. Where I'm currently going is creating a table of COLLATE orders. These collate orders would refer to operator classes but tweak them. For example, things like: - Sort ascending or descending (descending reverses the bt*cmp test) - NULLs first or last - Locale for text types - etc These aren't all related in the same way. While it obviously isn't hard to produce results ascending or descending, and it shouldn't be hard to produce NULLs first or last regardless of where they appear in the index, it would be utterly impossible to use an index built with the wrong locale collation. -- greg ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Does anybody use ORDER BY x USING y?
Greg Stark [EMAIL PROTECTED] writes: The thing is that these opclasses you're describing are closely related. It ought to be possible to use a single index to produce results in any of the four orders you describe. Wrong --- only two of them. You can't magically swap nulls from one end of the index to the other (and Hannu's flight of fantasy about double indexscans is just a flight of fantasy; it would be solving the problem at entirely the wrong place). These aren't all related in the same way. They are all desirable properties of an index column, however. In particular, we do have a market for genuine reverse-sort columns, so that you can use a double-column index to get orderings like ORDER BY x ASC, y DESC. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] DISTINCT vs. GROUP BY
Neil Conway [EMAIL PROTECTED] writes: On Mon, 2005-19-09 at 16:27 +0200, Hans-Jürgen Schönig wrote: I was wondering whether it is possible to teach the planner to handle DISTINCT in a more efficient way: [...] Isn't it possible to perform the same operation using a HashAggregate? One problem is that DISTINCT ON is defined to return the first unique row (according to the query's ORDER BY) for the set of DISTINCT ON columns, which can't easily be done via hashing. Uhm. Sure it can. DISTINCT is really just special a case of GROUP BY. Even DISTINCT ON is just GROUP BY with a kind of first() aggregate function. What would be really neat would be to teach GROUP BY about first() and last() and how it can skip over some index entries and still satisfy the query. Then make DISTINCT and DISTINCT ON be handled through the exact same code path. For bonus points teach it that min() and max() can sometimes be treated the same way if the path is presenting records sorted on that column. -- greg ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Beta2 Wrap Up ...
On Mon, 19 Sep 2005, Tom Lane wrote: Marc G. Fournier [EMAIL PROTECTED] writes: Is there a reason the old/new can't be aliaseed to each other, instead of the old just being removed? Any change like that would require another initdb. If we were going to force another initdb, my vote would be to revert these functions to where they were in beta1. It was a mistake to change them in such a hurry. +1 on reverting them back then ... and on a quick beta3 (ie. by end of week?) Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Does anybody use ORDER BY x USING y?
Tom Lane [EMAIL PROTECTED] writes: Greg Stark [EMAIL PROTECTED] writes: The thing is that these opclasses you're describing are closely related. It ought to be possible to use a single index to produce results in any of the four orders you describe. Wrong --- only two of them. You can't magically swap nulls from one end of the index to the other (and Hannu's flight of fantasy about double indexscans is just a flight of fantasy; it would be solving the problem at entirely the wrong place). I think that was my flight of fantasy. I didn't say it was pretty but it would solve the problem. Whereas having a separate opclass would mean someone would need a second index to satisfy the ordering which seems silly. -- greg ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Does anybody use ORDER BY x USING y?
On Mon, Sep 19, 2005 at 12:21:00PM -0400, Greg Stark wrote: Tom Lane [EMAIL PROTECTED] writes: Greg Stark [EMAIL PROTECTED] writes: The thing is that these opclasses you're describing are closely related. It ought to be possible to use a single index to produce results in any of the four orders you describe. Wrong --- only two of them. You can't magically swap nulls from one end of the index to the other (and Hannu's flight of fantasy about double indexscans is just a flight of fantasy; it would be solving the problem at entirely the wrong place). I think that was my flight of fantasy. I didn't say it was pretty but it would solve the problem. Whereas having a separate opclass would mean someone would need a second index to satisfy the ordering which seems silly. As I understand it, they would only need a second index, if they did want to use the index to determine the sort order, for two different sort orders. I don't see any easy way out of this. I think it could be optimized to scan less than twice, but it would be an incredibly effort and maintenance nightmare, for a minimal return. mark -- [EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] __ . . _ ._ . . .__. . ._. .__ . . . .__ | Neighbourhood Coder |\/| |_| |_| |/|_ |\/| | |_ | |/ |_ | | | | | | \ | \ |__ . | | .|. |__ |__ | \ |__ | Ottawa, Ontario, Canada One ring to rule them all, one ring to find them, one ring to bring them all and in the darkness bind them... http://mark.mielke.cc/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] postmaster core dump
I seem to have an unhappy postgresql: (gdb) bt #0 0xbd99871b in kill () from /usr/lib/libc.so.12 #1 0xbda217e7 in abort () from /usr/lib/libc.so.12 #2 0x0820c1fa in ExceptionalCondition ( conditionName=0x8298920 !(batchno hashtable-curbatch), errorType=0x823919f FailedAssertion, fileName=0x82988e0 /usr/src/local/pgsql/src/backend/executor/nodeHash.c, lineNumber=675) at /usr/src/local/pgsql/src/backend/utils/error/assert.c:51 #3 0x08136c09 in ExecHashTableInsert (hashtable=0x83e9c9c, tuple=0x83e9ce8, hashvalue=4294941132) at /usr/src/local/pgsql/src/backend/executor/nodeHash.c:679 #4 0x081363e1 in MultiExecHash (node=0x83e91b4) at /usr/src/local/pgsql/src/backend/executor/nodeHash.c:114 #5 0x0812c24f in MultiExecProcNode (node=0x83e91b4) at /usr/src/local/pgsql/src/backend/executor/execProcnode.c:439 #6 0x0813707a in ExecHashJoin (node=0x83e65ac) at /usr/src/local/pgsql/src/backend/executor/nodeHashjoin.c:160 #7 0x0812c11a in ExecProcNode (node=0x83e65ac) at /usr/src/local/pgsql/src/backend/executor/execProcnode.c:358 #8 0x081348f5 in agg_fill_hash_table (aggstate=0x83e62b4) at /usr/src/local/pgsql/src/backend/executor/nodeAgg.c:911 #9 0x081345fb in ExecAgg (node=0x83e62b4) at /usr/src/local/pgsql/src/backend/executor/nodeAgg.c:681 #10 0x0812c152 in ExecProcNode (node=0x83e62b4) at /usr/src/local/pgsql/src/backend/executor/execProcnode.c:377 #11 0x0812aa2b in ExecutePlan (estate=0x83e601c, planstate=0x83e62b4, operation=CMD_SELECT, numberTuples=0, direction=ForwardScanDirection, dest=0x83d8d8c) at /usr/src/local/pgsql/src/backend/executor/execMain.c:1110 #12 0x08129dc7 in ExecutorRun (queryDesc=0x83db848, direction=ForwardScanDirection, count=0) at /usr/src/local/pgsql/src/backend/executor/execMain.c:231 #13 0x081a89a3 in PortalRunSelect (portal=0x83e401c, forward=1 '\001', count=2147483647, dest=0x83d8d8c) at /usr/src/local/pgsql/src/backend/tcop/pquery.c:797 #14 0x081a8758 in PortalRun (portal=0x83e401c, count=2147483647, dest=0x83d8d8c, altdest=0x83d8d8c, completionTag=0xbfbfe1d0 ) at /usr/src/local/pgsql/src/backend/tcop/pquery.c:648 #15 0x081a4c2f in exec_simple_query ( query_string=0x834501c select timesliced, count(stats_id) from trans left j /* * put the tuple into a temp file for later batches */ Assert(batchno hashtable-curbatch); ExecHashJoinSaveTuple(tuple, hashvalue, hashtable-innerBatchFile[batchno]); (gdb) print batchno $2 = 2 (gdb) print *hashtable $3 = {nbuckets = 2063, buckets = 0x83f601c, nbatch = 16, curbatch = 3, nbatch_original = 16, nbatch_outstart = 16, growEnabled = 1 '\001', totalTuples = 25998, innerBatchFile = 0x83f401c, outerBatchFile = 0x83f4068, hashfunctions = 0x83ea0a4, spaceUsed = 136816, spaceAllowed = 1048576, hashCxt = 0x835a648, batchCxt = 0x835a6d4} cvs of 1 Sept, nodeHash.c v 1.94 Any thoughts? Cheers, Patrick ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] statement_timeout logging
Simon Riggs wrote: On Fri, 2005-09-16 at 20:48 -0400, Bruce Momjian wrote: We can go three ways. We can add a boolean GUC to control printing of the query during a timeout, but that seems like overkill. We can add a new level for log_min_error_statement that is just above error, but that seems confusing. I think the right solution would be to allow log_min_duration_statement to work for canceled queries. Right now, log_min_duration_statement doesn't work for canceled queries because the query never completes to give a final duration and hit the test code. Should that be fixed now or added to the TODO list? The last one seems the right way to go. OK. I tested it and it actually works, and I added documentation suggesting its usage. I was a little confused above because the STATEMENT: line is only output to the server logs because of the way elog.c handles a STATEMENT print for log_min_error_statement. It does not output to the client no matter what log_min_messages is set to, and if someone is concerned about that we can fix it. Technically STATEMENT is not a log level message. So, if I do: test= SET statement_timeout = 1; SET test= SET log_min_error_statement = 'ERROR'; SET test= SELECT * FROM pg_class; ERROR: canceling statement due to statement timeout in the logs I see: ERROR: canceling statement due to statement timeout STATEMENT: select * from pg_class; So, reformat the message at statement_timeout, so that the log looks exactly like log_min_duration_statement: e.g. LOG: statement_timeout has been activated to cancel statement LOG: duration 1625652ms statement SELECT * from bigOne LOG: query has been cancelled by user action Not really. The problem here is that the last line is wrong --- it was not cancelled by user action. The attached, applied patch adds a cancel_from_timeout variable that properly records if the cancel was because of a timeout or user interaction, and displays the proper log message. Perhaps we should change the message from kill() to be statement rather than query also... kill() is the wrong place to print the message. You will see the attached patch does it in a cleaner way. I'd vote fix now, but I guess that seems to be becoming a regular viewpoint from me. OK, fixed. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 Index: doc/src/sgml/config.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.17 diff -c -c -r1.17 config.sgml *** doc/src/sgml/config.sgml13 Sep 2005 15:24:56 - 1.17 --- doc/src/sgml/config.sgml19 Sep 2005 16:58:53 - *** *** 3232,3238 listitem para Abort any statement that takes over the specified number of ! milliseconds. A value of zero (the default) turns off the limitation. /para /listitem /varlistentry --- 3232,3241 listitem para Abort any statement that takes over the specified number of ! milliseconds. If varnamelog_min_error_statement/ is set to ! literalERROR/ or lower, the statement that timed out will also be ! logged. A value of zero (the default) turns off the ! limitation. /para /listitem /varlistentry Index: src/backend/storage/lmgr/proc.c === RCS file: /cvsroot/pgsql/src/backend/storage/lmgr/proc.c,v retrieving revision 1.163 diff -c -c -r1.163 proc.c *** src/backend/storage/lmgr/proc.c 20 Aug 2005 23:26:24 - 1.163 --- src/backend/storage/lmgr/proc.c 19 Sep 2005 16:58:55 - *** *** 78,83 --- 78,84 /* Mark these volatile because they can be changed by signal handler */ static volatile bool statement_timeout_active = false; static volatile bool deadlock_timeout_active = false; + volatile bool cancel_from_timeout = false; /* statement_fin_time is valid only if statement_timeout_active is true */ static struct timeval statement_fin_time; *** *** 1058,1063 --- 1059,1065 Assert(!deadlock_timeout_active); statement_fin_time = fin_time; statement_timeout_active = true; + cancel_from_timeout = false; } else if (statement_timeout_active) { *** *** 1128,1141 MemSet(timeval, 0, sizeof(struct itimerval)); if (setitimer(ITIMER_REAL, timeval, NULL)) { ! statement_timeout_active = deadlock_timeout_active = false;
Re: [HACKERS] statement logging / extended query protocol issues
Oliver Jowett wrote: Bruce Momjian wrote: Well, from the application writer perspective, you are right it doesn't make sense, This is exactly what the end user is going to say. but this is only because jdbc is using prepare internally. Isn't this mostly irrelevant to the result we want to see? It's a detail of how the interface layer chooses to execute its queries, and 90% of the time the end user is not going to know or care about it. Right, but have no way to know if the user is using an interface that hides prepares from them, or they are using prepares visibly in their applications. For this reason, we should just display whatever the backend is doing. If all interfaces used prepares invisibly like jdbc, we would be right to suppress the log information. If you were to have written it in libpq, it would make sense, I think, and internally, this is what is happening. We can't assume only interface libraries like jdbc are using this feature. Wait, so is the extended query protocol the poor cousin of what libpq does, or what? You can do Parse/Bind using libpq, can't you? Sure. The *meaning* of the Parse/Bind/Execute sequence is quite clear regardless of what interface library is used. I still think that logging just the queries that were actually executed, once per execution, is the sensible thing to do here. I can't see a sequence of protocol messages that would produce a strange result if we used the rules I suggested -- do you have an example where it breaks? I have no idea. As far as I understand things, the protocol-level prepare/execute is identical to the SQL-level prepare/execute, except that there is no need to parse the execute, so it should log like the SQL-level statements, if possible. You can Parse any SQL statement, but you can't PREPARE any SQL statement. So, no, they're not equivalent. That's one aspect of what I meant about generating synthetic statements that weren't syntactially correct (the strange FETCH syntax with ROWS/MAXROWS that Simon was suggesting is another case). I am hesitant to add another log syntax to be used just for protocol-level prepare. I think it adds complexity with little benefit, particularly for people reading those logs with automated tools. Simon's page is in the patches queue. What would you like changed, exactly? -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Beta2 Wrap Up ...
-Original Message- From: Marc G. Fournier[EMAIL PROTECTED] Sent: 19/09/05 16:56:23 To: Tom Lane[EMAIL PROTECTED] Cc: Marc G. Fournier[EMAIL PROTECTED], Magnus Hagander[EMAIL PROTECTED], Andreas Pflug[EMAIL PROTECTED], Dave Pagedpage@vale-housing.co.uk, pgsql-hackers@postgresql.orgpgsql-hackers@postgresql.org Subject: Re: [HACKERS] Beta2 Wrap Up ... +1 on reverting them back then ... and on a quick beta3 (ie. by end of week?) +1 from me as well. /D -Unmodified Original Message- On Mon, 19 Sep 2005, Tom Lane wrote: Marc G. Fournier [EMAIL PROTECTED] writes: Is there a reason the old/new can't be aliaseed to each other, instead of the old just being removed? Any change like that would require another initdb. If we were going to force another initdb, my vote would be to revert these functions to where they were in beta1. It was a mistake to change them in such a hurry. +1 on reverting them back then ... and on a quick beta3 (ie. by end of week?) Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Beta2 Wrap Up ...
On Mon, 2005-19-09 at 10:57 -0400, Tom Lane wrote: Any change like that would require another initdb. If we were going to force another initdb, my vote would be to revert these functions to where they were in beta1. What purpose would that serve? About the only thing purpose I can see is to avoid the API compatibility break for pg_cancel_backend() -- do people actually consider that a major issue? At any rate, I don't see any reason to revert the other changes (i.e. those other than pg_cancel_backend()). -Neil ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] postmaster core dump
On Mon, Sep 19, 2005 at 06:12:54PM +0100, Patrick Welche wrote: #15 0x081a4c2f in exec_simple_query ( query_string=0x834501c select timesliced, count(stats_id) from trans left j I just truncated one line early.. the query was: # explain select timesliced, count(stats_id) from trans left join stats on stats_id=stats.id group by timesliced; QUERY PLAN --- HashAggregate (cost=123718.66..123738.61 rows=1596 width=8) - Hash Left Join (cost=4143.88..115550.16 rows=1633701 width=8) Hash Cond: (outer.stats_id = inner.id) - Seq Scan on trans (cost=0.00..61341.01 rows=1633701 width=4) - Hash (cost=3292.30..3292.30 rows=123430 width=8) - Seq Scan on stats (cost=0.00..3292.30 rows=123430 width=8) (6 rows) Cheers, Patrick ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] DISTINCT vs. GROUP BY
Greg Stark [EMAIL PROTECTED] writes: DISTINCT is really just special a case of GROUP BY. Even DISTINCT ON is just GROUP BY with a kind of first() aggregate function. What would be really neat would be to teach GROUP BY about first() and last() and how it can skip over some index entries and still satisfy the query. Then make DISTINCT and DISTINCT ON be handled through the exact same code path. You've missed the point entirely. first() is not a substitute for sorting the input; it is only useful if the input comes pre-sorted. And if you are going to sort the input, you might as well use the current implementation of DISTINCT ON and skip the effort and memory-overflow-risk associated with a hashtable. I do think hash aggregation is a plausible alternative implementation of plain DISTINCT, but I don't see the case for using it for DISTINCT ON. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] postmaster core dump
Patrick Welche [EMAIL PROTECTED] writes: I seem to have an unhappy postgresql: Let's see a test case, not a stack trace. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] postmaster core dump
On Mon, Sep 19, 2005 at 03:59:35PM -0400, Tom Lane wrote: Patrick Welche [EMAIL PROTECTED] writes: I seem to have an unhappy postgresql: Let's see a test case, not a stack trace. I haven't set up the minimalist test case yet, but the 2 tables involved are incredibly simple. stats.id is an integer primary key, trans.stats_id points to it. You just need a query which uses a HashJoin. This time, no core dump, however: transatlantic=# set enable_hashjoin=on; SET transatlantic=# select timeslice,count(stats_id) from trans,stats where trans.stats_id=stats.id group by timeslice; timeslice | count ---+--- (0 rows) transatlantic=# set enable_hashjoin=off; SET transatlantic=# select timeslice,count(stats_id) from trans,stats where trans.stats_id=stats.id group by timeslice; timeslice | count -+--- 2005-08-28 00:00:00 | 586 2005-08-28 00:00:01 | 378 2005-08-28 00:20:00 | 878 ... So, no results with enable_hashjoin=on. Broken: QUERY PLAN -- GroupAggregate (cost=326296.78..338449.98 rows=97067 width=12) - Sort (cost=326296.78..329943.40 rows=1458648 width=12) Sort Key: stats.timeslice - Hash Join (cost=4203.88..108728.93 rows=1458648 width=12) Hash Cond: (outer.stats_id = inner.id) - Seq Scan on trans (cost=0.00..59706.48 rows=1458648 width=4) - Hash (cost=3292.30..3292.30 rows=123430 width=12) - Seq Scan on stats (cost=0.00..3292.30 rows=123430 width=12) Working: QUERY PLAN --- GroupAggregate (cost=506460.77..518613.97 rows=97067 width=12) - Sort (cost=506460.77..510107.39 rows=1458648 width=12) Sort Key: stats.timeslice - Merge Join (cost=263024.32..288892.93 rows=1458648 width=12) Merge Cond: (outer.id = inner.stats_id) - Index Scan using stats_pkey on stats (cost=0.00..3688.21 rows=123430 width=12) - Sort (cost=263024.32..266670.94 rows=1458648 width=4) Sort Key: trans.stats_id - Seq Scan on trans (cost=0.00..59706.48 rows=1458648 width=4) I'll make a smaller test case over night.. Cheers, Patrick ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] DISTINCT vs. GROUP BY
Tom Lane [EMAIL PROTECTED] writes: I do think hash aggregation is a plausible alternative implementation of plain DISTINCT, but I don't see the case for using it for DISTINCT ON. It could be done without presorting the input though not with a simple first()-like function. It would have be a sort of two-argument min() function that kept a state variable for the smallest value found so far of the sort key. My main motivation here is that it's odd to have two code paths for implementing the two language constructs when one is really just a special case of the other. It's a source of cases like this where the code to implement a query path exists but isn't accessible due to the way the query is written. -- greg ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Spinlocks, yet again: analysis and proposed patches
I have removed this TODO item: * Research use of sched_yield() for spinlock acquisition failure --- Tom Lane wrote: Greg Stark [EMAIL PROTECTED] writes: Marko Kreen marko@l-t.ee writes: (I speculate that it's set up to only yield the processor to other processes already affiliated to that processor. In any case, it is definitely capable of getting through 1 yields without running the guy who's holding the spinlock.) Maybe it should try sched_yield once and then use select after that? I tried that, actually, but it didn't seem to offer any particular benefit. (Maybe it would have helped more on older Linux kernels before they changed sched_yield?) I'm feeling even more disenchanted with sched_yield now that Marko pointed out that the behavior was changed recently. Here we have a construct that is not portable cross-platform, does not act as documented in its man page, and the kernel guys feel free to whack its behavior around in major ways without documenting that either. It seems to be a crap-shoot whether it will be useful on any particular machine or not. At least with the select() code we can be reasonably confident we know what will happen. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] logging blemishes
While preparing for a presentation, I noticed some mildly ugly effects with log_line_prefix during session startup if log_connections is turned on. Example (log_line_prefix = '%t [EMAIL PROTECTED] %r %p %c:%l'): 2005-09-19 19:16:39 EDT [EMAIL PROTECTED] 6541 432f46d7.198d:1 LOG: connection received: host=[local] port= 2005-09-19 19:16:42 EDT [EMAIL PROTECTED] 6543 432f46da.198f:1 LOG: connection received: host=[local] port= 2005-09-19 19:16:42 EDT [EMAIL PROTECTED] [local] 6543 432f46da.198f:2 LOG: connection authorized: user=postgres database=template1 2005-09-19 19:16:44 EDT [EMAIL PROTECTED] [local] 6543 432f46da.198f:3 LOG: disconnection: session time: 0:00:01.95 user=postgres database=template1 host=[local] port= I'm also unclear why we get two lines at the start of each connection - that seems like a bug, or is it due to some sort of protocol negotiation. This is a fresh FC4 machine with only pg 8.0.3 ever installed. Anyway, currently, we test for stop producing output here with the following code in elog.c: case 'q': /* in postmaster and friends, stop if %q is seen */ /* in a backend, just ignore */ if (MyProcPort == NULL) i = format_len; break; I'm wondering if we should extend that test slightly, to something like if (MyProcPort == NULL || MyProcPort-username == NULL || *(MyProcPort-username) == '\0') Thoughts? cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] 64-bit API for large objects
Jonah H. Harris wrote: Mark, If you don't mind contributing the changes, we'd be glad to take a look at them. Thanks. -Jonah Ok, we will post it back soon. We have tested it on two different 64-bit architectures (Sparc and AMD) and are now testing on pentium before posting up to the list. mark ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [pgsql-hackers-win32] Time to close hackers-win32?
Magnus Hagander wrote: It occurs to me that there is no longer any great need to have a separate hackers list for win32 development. Perhaps we should close it down now and keep all development on -hackers? I also think this is a good idea. The number of win32 only issues of -hacker level is significantly smaller now, and having to bounce people between the lists can be kind of annoying... Agreed. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] DISTINCT vs. GROUP BY
Added to TODO: * Allow DISTINCT to use hashing like GROUP BY --- Greg Stark wrote: Neil Conway [EMAIL PROTECTED] writes: On Mon, 2005-19-09 at 16:27 +0200, Hans-J?rgen Sch?nig wrote: I was wondering whether it is possible to teach the planner to handle DISTINCT in a more efficient way: [...] Isn't it possible to perform the same operation using a HashAggregate? One problem is that DISTINCT ON is defined to return the first unique row (according to the query's ORDER BY) for the set of DISTINCT ON columns, which can't easily be done via hashing. Uhm. Sure it can. DISTINCT is really just special a case of GROUP BY. Even DISTINCT ON is just GROUP BY with a kind of first() aggregate function. What would be really neat would be to teach GROUP BY about first() and last() and how it can skip over some index entries and still satisfy the query. Then make DISTINCT and DISTINCT ON be handled through the exact same code path. For bonus points teach it that min() and max() can sometimes be treated the same way if the path is presenting records sorted on that column. -- greg ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] logging blemishes
Andrew Dunstan wrote: While preparing for a presentation, I noticed some mildly ugly effects with log_line_prefix during session startup if log_connections is turned on. Example (log_line_prefix = '%t [EMAIL PROTECTED] %r %p %c:%l'): 2005-09-19 19:16:39 EDT [EMAIL PROTECTED] 6541 432f46d7.198d:1 LOG: connection received: host=[local] port= 2005-09-19 19:16:42 EDT [EMAIL PROTECTED] 6543 432f46da.198f:1 LOG: connection received: host=[local] port= 2005-09-19 19:16:42 EDT [EMAIL PROTECTED] [local] 6543 432f46da.198f:2 LOG: connection authorized: user=postgres database=template1 2005-09-19 19:16:44 EDT [EMAIL PROTECTED] [local] 6543 432f46da.198f:3 LOG: disconnection: session time: 0:00:01.95 user=postgres database=template1 host=[local] port= I'm also unclear why we get two lines at the start of each connection - that seems like a bug, or is it due to some sort of protocol negotiation. This is a fresh FC4 machine with only pg 8.0.3 ever installed. I can not reproduce your problem here with current CVS. I see: 2005-09-19 22:37:58 EDT LOG: transaction ID wrap limit is 2147484146, limited by database postgres 2005-09-19 22:39:17 EDT [EMAIL PROTECTED] 6404 432f7655.1904:1LOG: connection received: host=[local] port= 2005-09-19 22:39:17 EDT [EMAIL PROTECTED] [local] 6404 432f7655.1904:2LOG: connection authorized: user=postgres database=test 2005-09-19 22:39:19 EDT [EMAIL PROTECTED] [local] 6404 432f7655.1904:3LOG: disconnection: session time: 0:00:01.62 user=postgres da tabase=test host=[local] port= What other things did you change in postgresql.conf. I turned on log_connections, and log_disconnections, which shows as enabled in your output above. Anyway, currently, we test for stop producing output here with the following code in elog.c: case 'q': /* in postmaster and friends, stop if %q is seen */ /* in a backend, just ignore */ if (MyProcPort == NULL) i = format_len; break; I'm wondering if we should extend that test slightly, to something like if (MyProcPort == NULL || MyProcPort-username == NULL || *(MyProcPort-username) == '\0') Interesting, but I would like to find a need to add those tests. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] New dot releases
On Mon, Sep 19, 2005 at 10:53:44AM -0400, Tom Lane wrote: Devrim GUNDUZ [EMAIL PROTECTED] writes: So no need to hold the new dot releases? :) I still object to releasing them until we find out what's going on on gerbil. That machine was building 8.0 fine until the patch, and it's failing consistently since then. To assume this is not our problem would be the height of hubris. In an earlier message you said that the owner of the machine has been completely unhelpful about providing any information to track it down. Is he not responding at all, or is he responding but with not enough information? Most of gerbil's failures are: creating information schema ... Bus Error - core dumped Is the message implying that the postgres process that initdb starts is dumping core? Any ideas on how the patch might cause that? The most recent failures are shmat(id=8326) failed: Not enough space and the default settings are selecting default max_connections ... 10 selecting default shared_buffers ... 50 Earlier tests that got as far as creating information schema had defaults lower than the maximums: selecting default max_connections ... 40 selecting default shared_buffers ... 700 Could the reduced settings (and thus what they imply about the amount of shared memory) be relevant? Could anything in the patch be affected by that? If you think it might be worthwhile, I could mess around with my box's shared memory settings and test it. Just looking for differences between gerbil and my box -- Michael Fuhr ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly