[HACKERS] New btree_gist code has a few problems
I tried running 'make installcheck' in contrib just now, and didn't get past btree_gist :-( The interval test fails with the attached diffs. text, varchar, char, bytea, bit, varbit and numeric either dump core or go into infinite loops during CREATE INDEX. (It's probably significant that these are all varlena datatypes...) This is on HPUX 10.20 using gcc. Let me know if you have any thoughts about tracking it down.

regards, tom lane

*** ./expected/interval.out	Fri May 28 06:43:27 2004
--- ./results/interval.out	Tue Jun  1 01:35:07 2004
***************
*** 49,66 ****
  SELECT count(*) FROM intervaltmp WHERE a = '199 days 21:21:23'::interval;
   count
  -------
!      1
  (1 row)

  SELECT count(*) FROM intervaltmp WHERE a >= '199 days 21:21:23'::interval;
   count
  -------
!    271
  (1 row)

  SELECT count(*) FROM intervaltmp WHERE a > '199 days 21:21:23'::interval;
   count
  -------
!    270
  (1 row)
--- 49,66 ----
  SELECT count(*) FROM intervaltmp WHERE a = '199 days 21:21:23'::interval;
   count
  -------
!      0
  (1 row)

  SELECT count(*) FROM intervaltmp WHERE a >= '199 days 21:21:23'::interval;
   count
  -------
!      8
  (1 row)

  SELECT count(*) FROM intervaltmp WHERE a > '199 days 21:21:23'::interval;
   count
  -------
!      8
  (1 row)

Core dump in varchar test looks like:

Program terminated with signal 10, Bus error.
#0  0xc0c6de40 in gbt_var_key_copy (u=0x7b03e4f0, force_node=5 '\005')
    at btree_utils_var.c:31
31              r = (GBT_VARKEY *) palloc(VARSIZE(u->lower) + VARSIZE(u->upper) + VARHDRSZ);
(gdb) bt
#0  0xc0c6de40 in gbt_var_key_copy (u=0x7b03e4f0, force_node=5 '\005')
    at btree_utils_var.c:31
#1  0xc0c6e3f8 in gbt_var_bin_union (u=0x7b03d920, e=0x40110cd8, tinfo=0x7afff594)
    at btree_utils_var.c:220
[ gdb gets confused here, possibly stack is smashed ]

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] CVS tip problems
On Tue, 2004-06-01 at 01:33, Tom Lane wrote:
 First you might want to check which flavor of strerror_r() your platform has --- does it return int or char* ?

I made the following change to the strerror_r call, which makes it work correctly with threading enabled:

--- src/port/thread.c	23 Apr 2004 18:15:55 -	1.20
+++ src/port/thread.c	1 Jun 2004 07:18:26 -
@@ -71,7 +71,8 @@
 #if defined(FRONTEND) && defined(ENABLE_THREAD_SAFETY) && defined(HAVE_STRERROR_R)
 	/* reentrant strerror_r is available */
 	/* some early standards had strerror_r returning char * */
-	strerror_r(errnum, strerrbuf, buflen);
+	char buf[256];
+	StrNCpy(strerrbuf, strerror_r(errnum, buf, 256), buflen);
 	return strerrbuf;
 #else

(I realise this is not sufficient for a patch to correct the problem.)

-- Oliver Elphick [EMAIL PROTECTED]
Isle of Wight  http://www.lfix.co.uk/oliver
GPG: 1024D/A54310EA  92C8 39E7 280E 3631 3F0E 1EC0 5664 7A2F A543 10EA
"Thou will show me the path of life; in thy presence is fullness of joy; at thy right hand there are pleasures for evermore."  Psalms 16:11
Re: [HACKERS] New btree_gist code has a few problems
Ok, I'll have a look.

Tom Lane wrote:
 I tried running 'make installcheck' in contrib just now, and didn't get past btree_gist :-( The interval test fails with the attached diffs. text, varchar, char, bytea, bit, varbit and numeric either dump core or go into infinite loops during CREATE INDEX. (It's probably significant that these are all varlena datatypes...) This is on HPUX 10.20 using gcc. Let me know if you have any thoughts about tracking it down.
 [...]

-- Teodor Sigaev   E-mail: [EMAIL PROTECTED]
Re: [HACKERS] pg_dump --comment?
In article [EMAIL PROTECTED], Chris Campbell [EMAIL PROTECTED] writes:

 Harald Fuchs wrote:
  Why don't you just do
    ( echo "-- This is my comment"
      pg_dump whatever ) > dumpfile
  ?

 How could I dump using the custom format, and then use dumpfile with pg_restore to restore the dump? If I just prepend the comment to the file, then pg_restore will choke, since the file won't be in the proper custom format. I would have to remove the comment before sending the file to pg_restore. Is there an easy way to do that? That can be easily automated, and not take a huge amount of time given a 4 gig dump file that must be modified?

Since pg_restore is able to read from standard input, that should not be a problem: instead of

  pg_restore options dumpfile

just do

  sed 1d dumpfile | pg_restore options
Re: [HACKERS] Fast index build vs. PITR
On Tue, 2004-06-01 at 01:24, Tom Lane wrote: I was just about to commit a patch that revises the btree index build procedure as discussed here: http://archives.postgresql.org/pgsql-general/2004-05/msg00480.php specifically, not using shared buffers during index build and bypassing WAL-logging in favor of just fsyncing the index file before commit. I was actually writing the commit message when it occurred to me that this would seriously break PITR. If the WAL datastream doesn't contain enough info to rebuild the index then rolling forward from a past backup isn't gonna work. I thought for a little bit about a magic reconstruct the index WAL entry that would invoke the index build procedure in toto, but that doesn't look like it will fly either. (Two problems: during crash recovery, you couldn't be sure that what's on disk for the underlying table exactly matches the index you need to build --- it could be a later state of the table; and besides, the environment of the WAL replay process isn't capable of running user-defined functions, so it couldn't work for functional indexes.) So AFAICS, we've got to dump the index contents into WAL to support PITR. This is a tad annoying. What I'm thinking about right now is tweaking the index-build code to write to WAL only if it sees that PITR is actually in use. It would have to look at the GUC variables to determine whether WAL archiving is enabled. If archiving isn't turned on, then we could assume that rollforward from a past backup isn't needed in this installation, and use the WAL-less index build method. Comments? The mechanism you suggest would also break crash recovery, not just PITR - though the avoidance of shared buffers seems like a gain either way. ..You raise the whole subject of UNRECOVERABLE data objects. Also known as NOT LOGGED etc. There are many significant performance gains to be had by turning off recoverability for certain large operations. Oracle and Teradata make extensive use of such features, i.e. 
especially in Data Warehousing. Examples of such operations might be:
- index builds
- INSERT SELECTs into previously empty tables

This is an important area for performance... not just index builds. A suggestion would be:
- add the "don't send to xlog" functionality as a user option on each statement, default=LOGGING - this could be Oracle compatible, or not, but the concept is similar. Put the hooks in now and we can add this to all appropriate statement syntax later. e.g.
  CREATE INDEX blah ... NO LOGGING... ;
  INSERT INTO blah ... NO LOGGING SELECT... ;
- if the conf file says don't use fsync, then don't write to log - clearly they don't mind losing data in the event of a crash... i.e. default=NOLOGGING on all statements

Best regards, Simon Riggs
Re: [HACKERS] Converting postgresql.conf parameters to kilobytes
On Monday 31 May 2004 22:00, Tom Lane wrote:
 Shridhar Daithankar [EMAIL PROTECTED] writes:
  Right now the following are measured in pages: wal_buffers, shared_buffers, effective_cache_size, while the rest of the memory parameters are in KB. I thought being uniform would be good. Besides, it will make it independent of page size as well.

 It would probably be reasonable to change effective_cache_size, since we really do not know what the kernel's unit of buffering is (except on Linux, where we *do* know that it ain't 8K ;-)). Personally I'd opt for measuring it in MB not KB, though; that would be a much more convenient unit on modern machines. We could easily make it a float for anyone who thinks they know the cache size to sub-MB accuracy.

I have no problems with MB. The only thing I want to see is a bit more user-friendly and consistent configuration. Initially I thought of bytes as Oracle does, but said 'let's be a little more practical' and put KB.. :-) MB is perfectly OK. It would just change the multiplier. And I don't think specifying a float is such a good idea. It is just so counter-intuitive. I mean, how many people would care for a fraction of an MB wasted?

 As for the others, I'll side with Emerson: a foolish consistency is the hobgoblin of little minds. We know very well what the unit of allocation of those is, and it's pages. There's no advantage to using KB except making it harder to work out what's really happening. We could measure max_connections in KB too if we had a mind to: there's a very definite shared-mem cost per connection slot. Or the FSM parameters, or checkpoint_segments, or max_locks_per_transaction. The fact that they have quantifiable space costs doesn't mean that space is the most useful way to measure them.

Agreed. This is not to change things left and right. It is only to put some consistency in place.

 BTW, were you intending to convert KB to NBuffers by charging exactly 8K per buffer, or were you intending to allow for the additional shmem costs such as buffer headers, per-buffer LWLocks, etc? If not the latter, then what are you really measuring? For sure it's not shared memory size --- charging an artificial number isn't going to help anyone who's trying to pick shared_buffers to arrive at a particular actual shmem size. But if it is the latter then it'll become even more impossible to tell what's really happening, and we'll be forced to invent some way of reading out how many buffers really got allocated.

Well, for that purpose, this is beyond what I am trying to do. As of now there is an int value sitting in postgresql.conf which is in page blocks. Now there will be a conversion before it is used anytime, so that it is a bit more user-friendly. The change should be skin-deep so as to be low impact.

As far as putting a suffix such as K or M goes, I don't know how much work that would be. Does that mean we need to convert shared_buffers to a string parameter and parse it? [EMAIL PROTECTED] (I would gladly write a real name but alas) said yesterday that there is a patch pending with Bruce for such a framework. I don't know what and how it does. For simplicity, I would convert all memory parameters to either KB or MB and state so in postgresql.conf. No floats, no suffixes. This is my opinion of course. Any suggestions are always welcome.

Actually I need to find out a few more things about it. It is not as simple as adding an assign_hook. When I tried initdb with the changes, it demanded 64MB of shared buffers, which I (now) think means that somewhere NBuffers is used before postgresql.conf is parsed. So 8192*8000 = 64MB. But this is just guesswork. I haven't looked into it there. If this seems reasonably OK, then I would spend some more time on it. We would need quite some documentation update then.

 So I disagree with the premise. Measuring these things in KB is not an improvement.

As I said, KBs or MBs is not the issue. Not having it in terms of pagesize is.

Shridhar
Re: [HACKERS] Converting postgresql.conf parameters to kilobytes
On Tuesday 01 June 2004 14:12, Shridhar Daithankar wrote:
 Actually I need to find out a few more things about it. It is not as simple as adding an assign_hook. When I tried initdb with the changes, it demanded 64MB of shared buffers, which I (now) think means that somewhere NBuffers is used before postgresql.conf is parsed. So 8192*8000 = 64MB. But this is just guesswork. Haven't looked into it there.

Found it. Following is the code that is causing the problem.

guc.c:2998
---
if (conf->assign_hook)
    if (!(*conf->assign_hook) (newval, changeVal, source))
    {
        ereport(elevel,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid value for parameter \"%s\": %d",
                        name, newval)));
        return false;
    }

if (changeVal || makeDefault)
{
    if (changeVal)
    {
        *conf->variable = newval;
        conf->gen.source = source;
    }
---

So even if the assign_hook is executed, the value of the variable is overwritten in the next step, which nullifies any factoring/change in value done in the assign hook. I find this as a convention at many other places in guc.c: call the assign_hook and then overwrite the value. So is the assign_hook called only to validate the value? How do I modify the value of the variable without getting specific?

I tried

    if (changeVal && !(conf->assign_hook))

and it worked. However that is just for int variables. I am not sure if that is a design decision. What should I do?

Regards, Shridhar
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Fri, 28 May 2004 14:47:01 -0400, Tom Lane [EMAIL PROTECTED] wrote: If putting back xmax is the price we must pay for nested transactions, then we *will* pay that price. Maybe not in this release, but it will inevitably happen. we = every Postgres user, even those that do not use subtransactions. price = 2% to 5% performance loss for databases where the working set is larger than main memory. Don't bother hollering veto ;-) Ok, I'll shut up till I have something concrete to support my opinion. Servus Manfred ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Fri, 28 May 2004 21:59:53 -0400, Alvaro Herrera [EMAIL PROTECTED] wrote:
 So the assumption was that when we see that this has happened, the Cmin is no longer important (== every future command can already see the tuple)

If it was that simple, we wouldn't have had to invent the CMIN_IS_CMAX flag. This has been discussed two years ago. Did you follow the link I posted last week? "Every future command" is not enough. You have to consider the current command and even commands started before this.

Servus, Manfred
Re: [HACKERS] SELECT * FROM table LIMIT 1; is really slow
On Tue, Jun 01, 2004 at 02:37:37PM +0200, Manfred Koizar wrote: On Fri, 28 May 2004 21:59:53 -0400, Alvaro Herrera [EMAIL PROTECTED] wrote: So the assumption was that when we see that this has happenned, the Cmin is no longer important (== every future command can already see the tuple) If it was that simple, we wouldn't have had to invent the CMIN_IS_CMAX flag. This has been discussed two years ago. Did you follow the link I posted last week? Every future command is not enough. You have to consider the current command and even commands started before this. Yes, I did follow it (you mean XMAX_IS_XMIN, right? I suppose no tuple can really have Cmin == Cmax). I'm not claiming I understood it fully though. But as you see, since the assumption is not valid we have to drop the idea and put back the Xmax as a field on its own on HeapTupleHeader (which is what I had done before Bruce persuaded me not to). I don't really like this idea but I don't see other way out. A couple of days ago I was going to propose putting Xmax as a separate field only as needed, in a way similar to the way Oid is handled --- thus we would enlarge the tuple if and only if the creating transaction deletes it. This would be nice because one would expect that there are not that many tuples created and deleted by the same transaction, so we'd localize the inefficiency of storing both fields (Cmin and Xmax) only on tuples that need it. While I was writing the proposal I realised that it'd mean enlarging tuples that are already on disk, and there's no way we can do that. If you have other ideas I'm all ears. I'm the last one to want that nested xacts make everything else work slower. -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) The eagle never lost so much time, as when he submitted to learn of the crow. (William Blake) ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Fast index build vs. PITR
 I think an actually implementable version of this would be:
 1. Don't log any index operations at all in WAL.
 2. When recovering from WAL, restore all the table contents by WAL replay. (This would of course include the system catalog contents that describe the indexes.) Then sit there and do a global REINDEX to rebuild all the indexes.
 This would gain a reduction of some percentage in WAL traffic, at the cost of a hugely expensive recovery cycle any time you actually needed to use the WAL. I guess this could be attractive to some installations, but I'm not sure very many people would want it ...

I think only the "global" part of it is not really acceptable. If we had a flag for each index that marks it inconsistent, reindexing only those that are marked would be great. Could we log a WAL record that basically only marks an index for deferred reindex after WAL recovery? During WAL replay all records for this index could be ignored (this is not a must because of the post-update page images in WAL; the index would still stay inconsistent until reindex, of course). I think such a reindex step could also be responsible for those non-btree indexes that don't fully support WAL (gist?).

Andreas
Re: [HACKERS] Nested xacts: looking for testers and review
This may be out of scope but I'm going to mention it. Would error trapping help any of these issues? In Oracle PL/SQL you have an exception section to handle any known or unknown errors. Is this for the future, or does the nested xacts code include this at all?

Bruce Momjian [EMAIL PROTECTED] wrote on 05/28/2004 03:05 PM (To: Alvaro Herrera; cc: Tatsuo Ishii, [EMAIL PROTECTED]; Subject: Re: [HACKERS] Nested xacts: looking for testers and review):

Alvaro Herrera wrote: On Fri, May 28, 2004 at 01:43:16PM -0400, Bruce Momjian wrote: In this case, I want to try all of the inserts, but any of them can fail, then I want the bottom part done. I wonder where everyone was when I asked this question a long time ago. I said I thought the behavior should be like I described, and no one objected. Sorry, I didn't understand the question at the time, or wasn't paying attention. Personally I think it would be a mistake to allow the COMMIT for the subtransaction to ignore the fact that the subxact was aborted. However I realize what you are proposing, and maybe this can be implemented using a parameter to COMMIT (indicating to not propagate the error if it's in aborted state, but commit normally otherwise). However if everyone disagrees, I can take that part out, and the code would be simpler. IMHO however, it would be less reliable.

Imagine this case used in a script:

BEGIN;
DROP TABLE test;
CREATE TABLE test(x int);
COMMIT;

This will not work because the drop might fail. However you could use this:

BEGIN;
BEGIN;
DROP TABLE test;
COMMIT;
CREATE TABLE test(x int);
COMMIT;

It is done in a transaction so the table replace is an atomic operation. One interesting idea would be for COMMIT to affect the outer transaction, and END not affect the outer transaction.
Of course that kills the logic that COMMIT and END are the same, but it is an interesting idea, and doesn't affect backward compatibility because END/COMMIT behave the same in non-nested transactions. If this is the type of issue we are dealing with for the patch, I feel very good. Good job Alvaro.

-- Bruce Momjian | http://candle.pha.pa.us
[EMAIL PROTECTED] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [HACKERS] Proposed Query Planner TODO items
On 12 Feb, Tom Lane wrote: [EMAIL PROTECTED] writes: Ok, I have EXPLAIN ANALYZE results for both the power and throughput tests: http://developer.osdl.org/markw/dbt3-pgsql/ Thanks. I just looked at Q9 and Q21, since those are the slowest queries according to your chart. (Are all the queries weighted the same for evaluation purposes, or are some more important than others?) [snip] The estimate for the part/partsupp join is close enough (60K vs 90K rows), but why is it estimating 92 rows out of the join to lineitem when the true figure is 681518? With a more accurate estimate the planner would probably have chosen different join methods above this point. Can you show us the pg_stats rows for the columns p_partkey, l_partkey, ps_suppkey, and l_suppkey? It would also be interesting to see whether a better estimate emerges if you increase default_statistics_target (try 100 or so). http://developer.osdl.org/markw/dbt3-pgsql/62/ This run changes default_statistics_target to 1000 and I have p_partkey, l_partkey, ps_suppkey, and l_suppkey pg_stats here at 1 min intervals http (no links on the web page.) Pretty significant performance change. Power: http://developer.osdl.org/markw/dbt3-pgsql/62/db_stat/power.l_partkey.out http://developer.osdl.org/markw/dbt3-pgsql/62/db_stat/power.l_suppkey.out http://developer.osdl.org/markw/dbt3-pgsql/62/db_stat/power.p_partkey.out http://developer.osdl.org/markw/dbt3-pgsql/62/db_stat/power.ps_suppkey.out Throughput: http://developer.osdl.org/markw/dbt3-pgsql/62/db_stat/thuput.l_partkey.out http://developer.osdl.org/markw/dbt3-pgsql/62/db_stat/thuput.l_suppkey.out http://developer.osdl.org/markw/dbt3-pgsql/62/db_stat/thuput.p_partkey.out http://developer.osdl.org/markw/dbt3-pgsql/62/db_stat/thuput.ps_suppkey.out Something went wrong when I tried to run another test with the Q21 changes overnight, so I'll have to get back to you on that one. Mark ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Fast index build vs. PITR
Simon Riggs [EMAIL PROTECTED] writes: The mechanism you suggest would also break crash recovery, not just PITR No it wouldn't. regards, tom lane ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] dblink - custom datatypes NOW work :)
Mark Gibson wrote:
 I've found the problem, although I'm still a bit confused by it.

I hate to do this to you now, but after some thought I think I have a better approach -- I'd be interested in opinions on that assessment. The attached eliminates pgresultGetTupleDesc() entirely and instead depends on the TupleDesc passed as rsinfo->expectedDesc from the executor. What this means is that the string representation of the remote value (from the "out" function on the remote side, as provided by libpq) will get fed into the "in" function corresponding to the local type you assign in your SQL statement. Assuming the types on the two sides are the same (or at least compatible), it should work well. Please give this a try and let me know what you think.

Joe

Index: contrib/dblink/dblink.c
===================================================================
RCS file: /opt/src/cvs/pgsql-server/contrib/dblink/dblink.c,v
retrieving revision 1.29
diff -c -r1.29 dblink.c
*** contrib/dblink/dblink.c	28 Nov 2003 05:03:01 -	1.29
--- contrib/dblink/dblink.c	13 Feb 2004 18:23:49 -
***************
*** 82,88 ****
  static int16 get_attnum_pk_pos(int16 *pkattnums, int16 pknumatts, int16 key);
  static HeapTuple get_tuple_of_interest(Oid relid, int16 *pkattnums, int16 pknumatts, char **src_pkattvals);
  static Oid	get_relid_from_relname(text *relname_text);
- static TupleDesc pgresultGetTupleDesc(PGresult *res);
  static char *generate_relation_name(Oid relid);
  
  /* Global */
--- 82,87 ----
***************
*** 395,400 ****
--- 394,400 ----
  	StringInfo	str = makeStringInfo();
  	char	   *curname = NULL;
  	int		howmany = 0;
+ 	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
  
  	if (PG_NARGS() == 3)
  	{
***************
*** 457,463 ****
  	if (functyptype == 'c')
  		tupdesc = TypeGetTupleDesc(functypeid, NIL);
  	else if (functyptype == 'p' && functypeid == RECORDOID)
! 		tupdesc = pgresultGetTupleDesc(res);
  	else
  		/* shouldn't happen */
  		elog(ERROR, "return type must be a row type");
--- 457,472 ----
  	if (functyptype == 'c')
  		tupdesc = TypeGetTupleDesc(functypeid, NIL);
  	else if (functyptype == 'p' && functypeid == RECORDOID)
! 	{
! 		if (!rsinfo)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_SYNTAX_ERROR),
! 					 errmsg("returning setof record is not "
! 							"allowed in this context")));
! 
! 		/* get the requested return tuple description */
! 		tupdesc = CreateTupleDescCopy(rsinfo->expectedDesc);
! 	}
  	else
  		/* shouldn't happen */
  		elog(ERROR, "return type must be a row type");
***************
*** 550,555 ****
--- 559,565 ----
  	char	   *sql = NULL;
  	char	   *conname = NULL;
  	remoteConn *rcon = NULL;
+ 	ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
  
  	/* create a function context for cross-call persistence */
  	funcctx = SRF_FIRSTCALL_INIT();
***************
*** 620,626 ****
  	if (functyptype == 'c')
  		tupdesc = TypeGetTupleDesc(functypeid, NIL);
  	else if (functyptype == 'p' && functypeid == RECORDOID)
! 		tupdesc = pgresultGetTupleDesc(res);
  	else
  		/* shouldn't happen */
  		elog(ERROR, "return type must be a row type");
--- 630,645 ----
  	if (functyptype == 'c')
  		tupdesc = TypeGetTupleDesc(functypeid, NIL);
  	else if (functyptype == 'p' && functypeid == RECORDOID)
! 	{
! 		if (!rsinfo)
! 			ereport(ERROR,
! 					(errcode(ERRCODE_SYNTAX_ERROR),
! 					 errmsg("returning setof record is not "
! 							"allowed in this context")));
! 
! 		/* get the requested return tuple description */
! 		tupdesc = CreateTupleDescCopy(rsinfo->expectedDesc);
! 	}
  	else
  		/* shouldn't happen */
  		elog(ERROR, "return type must be a row type");
***************
Re: [HACKERS] CVS tip problems
Oliver Elphick [EMAIL PROTECTED] writes:

 -	strerror_r(errnum, strerrbuf, buflen);
 +	char buf[256];
 +	StrNCpy(strerrbuf, strerror_r(errnum, buf, 256), buflen);
  	return strerrbuf;

Easier and safer would be

-	strerror_r(errnum, strerrbuf, buflen);
-	return strerrbuf;
+	return strerror_r(errnum, strerrbuf, buflen);

The real point here is that we need to code differently depending on which flavor of strerror_r we have.

regards, tom lane
Re: [pgsql-hackers-win32] [HACKERS] select like...not using index
 Sometime between yesterday and today, queries in the form of

   select * from t where t.f like 'k%'

 have been broken so that they never use the index (on win32, not sure about others). On win32, at least, they have been broken for a while, but this was due to a known issue based on the locales. AFAICT, the current cvs has addressed this issue and (show lc_collate returns C) there seems to be no reason why the queries aren't working properly.
 Merlin

 Did you do an ANALYZE on the table?

Yes. Just for kicks, I also dropped/re-created the index... no help. Following that, I ran a fresh initdb which reported: "The database cluster will be initialized with locale English_United States.1252." I then ran... I just recently had the same issue (due to locale problems). This was recently fixed in cvs and replaced the hack I was using to work around the problem. The index search no longer works and I am very suspicious about a locale-related issue. This is all off of a fresh copy of 7.5devel from the anonymous cvs server.

 Are there a lot of duplicate keys? How big is the table?

About 250k rows with less than 1% duplication.

 What does the explain look like?

cpc=# explain select * from hchassis where vin_no = '2FTZX08W8WCA24365';
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Index Scan using hchassis_vin_no_idx on hchassis  (cost=0.00..8.94 rows=2 width=437)
   Index Cond: (vin_no = '2FTZX08W8WCA24365'::bpchar)
(2 rows)

cpc=# explain select * from hchassis where vin_no like '2FTZX08W8WCA24365%';
                          QUERY PLAN
--------------------------------------------------------------
 Seq Scan on hchassis  (cost=0.00..19577.70 rows=1 width=437)
   Filter: (vin_no ~~ '2FTZX08W8WCA24365%'::text)
(2 rows)

Merlin
Re: [pgsql-hackers-win32] [HACKERS] select like...not using index
 Sometime between yesterday and today queries in the form of select * from t where t.f like 'k%' have been broken so that they never use the index (on win32, not sure about others). [...]
 cpc=# explain select * from hchassis where vin_no like '2FTZX08W8WCA24365%';
  Seq Scan on hchassis (cost=0.00..19577.70 rows=1 width=437)
    Filter: (vin_no ~~ '2FTZX08W8WCA24365%'::text)
 (2 rows)

It looks to me like you have an index of type bpchar but are searching with type text. I find type conversions very limited with LIKE. I would create an index on 'vin_no' using a cast to TEXT. This should work on both queries.
Re: [pgsql-hackers-win32] [HACKERS] select like...not using index
It looks to me like you have an index of type bpchar but are searching with type text. I find type conversions very limited with LIKE. I would create an index on 'vin_no' using a cast to TEXT. This should work on both queries. Not in this case. Just to be sure, I created a new column as text type, created index, analyzed, and searched and got the same behavior. Furthermore, I did this: cpc=# show lc_collate; lc_collate C (1 row) cpc=# show lc_ctype; lc_ctype -- C (1 row) followed by this: C:\postgres\pgsql\src\test\locale> pg_controldata [...] LC_COLLATE: English_United States.1252 LC_CTYPE: English_United States.1252 At this point I'm about 90% sure I've turned up a locale related bug...initdb warned me wrt the locale but psql is still reporting 'C'. Plus, my queries don't work where they used to about a week ago. My next step is to initdb --locale=C to confirm this. I've informed Magnus about this and he is looking into it. Merlin ---(end of broadcast)--- TIP 8: explain analyze is your friend
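For anyone chasing the same symptom: under a non-C lc_collate, a plain b-tree index cannot serve LIKE 'prefix%' queries at all, so casting the column never helps. The usual workaround (available since 7.4) is an index using one of the *_pattern_ops operator classes, which compare character by character regardless of locale. A sketch against the table from this thread (the index name is illustrative):

```sql
-- A plain index on vin_no follows the locale's collation order, which
-- under a non-C locale does not match the ordering LIKE prefixes need.
-- An index built with varchar_pattern_ops uses plain character-by-
-- character comparison, so the planner can turn LIKE 'prefix%' into
-- a range scan on it:
CREATE INDEX hchassis_vin_no_pattern_idx
    ON hchassis (vin_no varchar_pattern_ops);
ANALYZE hchassis;

EXPLAIN SELECT * FROM hchassis
 WHERE vin_no LIKE '2FTZX08W8WCA24365%';
-- should now show an index scan rather than a seq scan; the original
-- hchassis_vin_no_idx still serves the plain = query
```

Note that the pattern_ops index only handles pattern matching; ordinary <, >, and ORDER BY comparisons still need a regular index.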
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
Sounds like a project manager type should be put into place to organize this information into a straight stream instead of 50 random mists of water. [Andrew Dunstan [EMAIL PROTECTED], sent by [EMAIL PROTECTED], 06/01/2004 11:10 AM, To: [EMAIL PROTECTED], Subject: Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004, wrote:] Marc G. Fournier wrote: Just so that everyone is aware, we are going to push the freeze date for 7.5 to July 1st. Although we feel that there are enough improvements and features already in place for 7.5, Tom's felt that if we gave it that extra month, we could also have PITR in place for 7.5 ... If anyone is working on other features that they feel can be polished off before the July 1st deadline, we would be most happy to incorporate those as well, but do recommend submitting patches for review *sooner*, rather then later, so that any recommended corrections can be addressed before teh deadline. I welcome this, as I always thought June 1 was too soon. However, I think that the process by which the date was eventually arrived at was unfortunate. I would modestly suggest that there should be a minimum period of notice of a feature freeze - 6 weeks or 2 months seems about right to me, given the development cycle we seem to have, and the fact that many of the critical things people are working on are quite large. (I'd also like to see someone who would get regular progress reports from people who have undertaken to work on large/critical items, so that we don't get into a position of thinking they will make a cutoff date and then finding out late in the piece that they will not, but maybe that's a discussion for another day). 
cheers andrew ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster * PRIVILEGED AND CONFIDENTIAL: This communication, including attachments, is for the exclusive use of addressee and may contain proprietary, confidential and/or privileged information. If you are not the intended recipient, any use, copying, disclosure, dissemination or distribution is strictly prohibited. If you are not the intended recipient, please notify the sender immediately by return e-mail, delete this communication and destroy all copies. * ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
Marc G. Fournier wrote: Just so that everyone is aware, we are going to push the freeze date for 7.5 to July 1st. Although we feel that there are enough improvements and features already in place for 7.5, Tom felt that if we gave it that extra month, we could also have PITR in place for 7.5 ... If anyone is working on other features that they feel can be polished off before the July 1st deadline, we would be most happy to incorporate those as well, but do recommend submitting patches for review *sooner*, rather than later, so that any recommended corrections can be addressed before the deadline. I welcome this, as I always thought June 1 was too soon. However, I think that the process by which the date was eventually arrived at was unfortunate. I would modestly suggest that there should be a minimum period of notice of a feature freeze - 6 weeks or 2 months seems about right to me, given the development cycle we seem to have, and the fact that many of the critical things people are working on are quite large. (I'd also like to see someone who would get regular progress reports from people who have undertaken to work on large/critical items, so that we don't get into a position of thinking they will make a cutoff date and then finding out late in the piece that they will not, but maybe that's a discussion for another day). cheers andrew ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Win32, PITR, nested transactions, tablespaces
Christopher Browne [EMAIL PROTECTED] writes: PITR may turn out to be a don't care item if Slony1 winds up providing its own approach to PITR. (e.g. - if you write out to disk the sets of SQL statements that are to be applied to a replica, then the spooled sets of these statements represent a history of updates that can be used to do PITR.) In the long run nothing can substitute for PITR. It's the only way to get a backup that is guaranteed to restore you to exactly what you had before. Logical dumps a la pg_dump suffer from having to unparse and parse all the data. Any data types that don't accurately store themselves as text (such as arrays currently, due to the index lower bound) get corrupted. SQL level replication, a la Slony wouldn't serve if you have any non-deterministic behaviour. And given SQL's set theoretic roots non-deterministic behaviour can creep in in places where it's not expected, such as any query that doesn't have an ORDER BY clause. Both of these tools have their uses, but they don't provide a rock solid guarantee that you can restore a machine to exactly what you had previously. To do that you really want something that works at the storage level and doesn't try to re-interpret data or reapply any logic. -- greg ---(end of broadcast)--- TIP 8: explain analyze is your friend
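Greg's point about unordered queries can be made concrete. The sketch below (table and statements are hypothetical, not from the thread) shows a statement that is perfectly legal SQL yet unsafe to replay as text on a replica, which is exactly the gap a storage-level mechanism like PITR closes:

```sql
CREATE TABLE job_queue (id int, state text);

-- "Claim one pending job."  Without an ORDER BY, which row the
-- subquery returns is implementation-dependent: it can vary with
-- physical row order, plan choice, or vacuum activity.
UPDATE job_queue
   SET state = 'claimed'
 WHERE id = (SELECT id FROM job_queue
              WHERE state = 'pending' LIMIT 1);
-- Replaying this statement on a replica may claim a different row
-- than it did on the origin, so the two databases silently diverge.
```

Storage-level recovery replays the resulting page changes rather than re-running the statement, so the nondeterminism never gets a second chance to bite.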
Re: [HACKERS] Fast index build vs. PITR
Tom Lane [EMAIL PROTECTED] writes: I thought for a little bit about a magic reconstruct the index WAL entry that would invoke the index build procedure in toto, but that doesn't look like it will fly either. (Two problems: during crash recovery, you couldn't be sure that what's on disk for the underlying table exactly matches the index you need to build --- it could be a later state of the table; and besides, the environment of the WAL replay process isn't capable of running user-defined functions, so it couldn't work for functional indexes.) Could you just mark the index as unusable? Have the optimizer ignore such indexes and PITR recovery can notify the user of these indexes and/or invoke a rebuild automatically? It wouldn't happen unless the user had done an index rebuild since the last complete backup, so it wouldn't even be a performance issue. Restoring the index from the WAL replay of an index rebuild must take a long time anyways. -- greg ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
On Tue, 1 Jun 2004, Andrew Dunstan wrote: Marc G. Fournier wrote: Just so that everyone is aware, we are going to push the freeze date for 7.5 to July 1st. Although we feel that there are enough improvements and features already in place for 7.5, Tom's felt that if we gave it that extra month, we could also have PITR in place for 7.5 ... If anyone is working on other features that they feel can be polished off before the July 1st deadline, we would be most happy to incorporate those as well, but do recommend submitting patches for review *sooner*, rather then later, so that any recommended corrections can be addressed before teh deadline. I welcome this, as I always thought June 1 was too soon. However, I think that the process by which the date was eventually arrived at was unfortunate. I would modestly suggest that there should be a minimum period of notice of a feature freeze - 6 weeks or 2 months seems about right to me, given the Oh, you mean the original freeze date that was set at the start of the dev cycle 6 months ago? Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
Marc G. Fournier wrote: On Tue, 1 Jun 2004, Andrew Dunstan wrote: Marc G. Fournier wrote: Just so that everyone is aware, we are going to push the freeze date for 7.5 to July 1st. Although we feel that there are enough improvements and features already in place for 7.5, Tom's felt that if we gave it that extra month, we could also have PITR in place for 7.5 ... If anyone is working on other features that they feel can be polished off before the July 1st deadline, we would be most happy to incorporate those as well, but do recommend submitting patches for review *sooner*, rather then later, so that any recommended corrections can be addressed before teh deadline. I welcome this, as I always thought June 1 was too soon. However, I think that the process by which the date was eventually arrived at was unfortunate. I would modestly suggest that there should be a minimum period of notice of a feature freeze - 6 weeks or 2 months seems about right to me, given the Oh, you mean the original freeze date that was set at the start of the dev cycle 6 months ago? I am far from being the only person to whom this was less than clear. I also know that when I discussed this with one or two members of the core team *they* were not clear about it either. Maybe I missed something in an email somewhere ... In any case, I think a target date should be set at the beginning of a dev cycle and a hard date should be set closer to the end of the cycle. Trying to adhere rigidly to a date set nine or twelve months previously doesn't strike me as good practice. cheers andrew ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Fast index build vs. PITR
On Tue, Jun 01, 2004 at 12:52:32PM -0400, Greg Stark wrote: Tom Lane [EMAIL PROTECTED] writes: I thought for a little bit about a magic reconstruct the index WAL entry that would invoke the index build procedure in toto, but that doesn't look like it will fly either. (Two problems: during crash recovery, you couldn't be sure that what's on disk for the underlying table exactly matches the index you need to build --- it could be a later state of the table; and besides, the environment of the WAL replay process isn't capable of running user-defined functions, so it couldn't work for functional indexes.) Could you just mark the index as unusable? Have the optimizer ignore such indexes and PITR recovery can notify the user of these indexes and/or invoke a rebuild automatically? The big problem I see with this kind of approaches is that building an index from scratch can take a huge amount of time, because you have to sort the data. Building from WAL does not have this problem, so it can be much faster. Of course, when you are restoring using a PITR approach you probably want it to be very fast, and have the DB running with as little quirks as possible, as soon as possible. -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) El realista sabe lo que quiere; el idealista quiere lo que sabe (Anónimo) ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
I really hate seeing all the developers wasting time and brain cycles on this type of stuff. I would much rather that time and brain cycles be put to the design and development of the code. Would a project manager type position be of any value to take some of this off the developers and onto the project manager? They would be the focal point for this type of stuff and responsible to get updates from the developers and check statuses and things of that nature. [Andrew Dunstan [EMAIL PROTECTED], sent by [EMAIL PROTECTED], 06/01/2004 12:26 PM, To: [EMAIL PROTECTED], Subject: Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004, wrote:] Marc G. Fournier wrote: On Tue, 1 Jun 2004, Andrew Dunstan wrote: Marc G. Fournier wrote: Just so that everyone is aware, we are going to push the freeze date for 7.5 to July 1st. Although we feel that there are enough improvements and features already in place for 7.5, Tom's felt that if we gave it that extra month, we could also have PITR in place for 7.5 ... If anyone is working on other features that they feel can be polished off before the July 1st deadline, we would be most happy to incorporate those as well, but do recommend submitting patches for review *sooner*, rather then later, so that any recommended corrections can be addressed before teh deadline. I welcome this, as I always thought June 1 was too soon. However, I think that the process by which the date was eventually arrived at was unfortunate. I would modestly suggest that there should be a minimum period of notice of a feature freeze - 6 weeks or 2 months seems about right to me, given the Oh, you mean the original freeze date that was set at the start of the dev cycle 6 months ago? I am far from being the only person to whom this was less than clear. I also know that when I discussed this with one or two members of the core team *they* were not clear about it either. 
Maybe I missed something in an email somewhere ... In any case, I think a target date should be set at the beginning of a dev cycle and a hard date should be set closer to the end of the cycle. Trying to adhere rigidly to a date set nine or twelve months previously doesn't strike me as good practice. cheers andrew ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Fast index build vs. PITR
Alvaro Herrera [EMAIL PROTECTED] writes: The big problem I see with this kind of approaches is that building an index from scratch can take a huge amount of time, because you have to sort the data. Building from WAL does not have this problem, so it can be much faster. I'm not clear that building from WAL is really going to be that much faster. A) algorithmically it's only the factor of log(n) that you're talking about. and B) the WAL will have records for every write, not just the final product, so it might potentially have a lot more writes to do. I thought part of the original problem was specifically that going through WAL slowed down the index rebuild much more than a factor of 2, which would tend to imply that in fact rebuilding from WAL isn't going to be as fast as you might expect. Another possibility is doing the complete index build without going through WAL and then inserting a complete copy of the index into the WAL without syncing or activating the rebuilt index until the copy to WAL is done. That kind of sucks since it's equivalent to just taking another backup of the data files immediately after the rebuild, but might be a more direct solution using the existing tools. Of course, when you are restoring using a PITR approach you probably want it to be very fast, and have the DB running with as little quirks as possible, as soon as possible. This is certainly true. -- greg ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
Having a good hard copy (not having to search mailing list archives) of release dates would be really nice, not just for developers, but users too. Even if they are subject to change without notice. I think Mozilla has a great concept with their Milestone Schedule, the gray table at: http://www.mozilla.org/roadmap.html#milestone-schedule. I'm sure having just a small table like what Mozilla uses on the PostgreSQL developers page would work wonders to eliminate much of the confusion in the future. On Tue, 2004-06-01 at 13:26 -0400, Andrew Dunstan wrote: Marc G. Fournier wrote: On Tue, 1 Jun 2004, Andrew Dunstan wrote: Marc G. Fournier wrote: Just so that everyone is aware, we are going to push the freeze date for 7.5 to July 1st. Although we feel that there are enough improvements and features already in place for 7.5, Tom's felt that if we gave it that extra month, we could also have PITR in place for 7.5 ... If anyone is working on other features that they feel can be polished off before the July 1st deadline, we would be most happy to incorporate those as well, but do recommend submitting patches for review *sooner*, rather then later, so that any recommended corrections can be addressed before teh deadline. I welcome this, as I always thought June 1 was too soon. However, I think that the process by which the date was eventually arrived at was unfortunate. I would modestly suggest that there should be a minimum period of notice of a feature freeze - 6 weeks or 2 months seems about right to me, given the Oh, you mean the original freeze date that was set at the start of the dev cycle 6 months ago? I am far from being the only person to whom this was less than clear. I also know that when I discussed this with one or two members of the core team *they* were not clear about it either. Maybe I missed something in an email somewhere ... 
In any case, I think a target date should be set at the beginning of a dev cycle and a hard date should be set closer to the end of the cycle. Trying to adhere rigidly to a date set nine or twelve months previously doesn't strike me as good practice. cheers andrew ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] -- Mike Benoit [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Fast index build vs. PITR
Greg Stark [EMAIL PROTECTED] writes: I'm not clear that building from WAL is really going to be that much faster. A) algorithmically it's only the factor of log(n) that you're talking about. and B) the WAL will have records for every write, not just the final product, so it might potentially have a lot more writes to do. Wrong ... what we log in WAL for a btree index build is just the series of completed index page images. Recreation of the index would proceed at whatever your disk read/write bandwidth is. Like Alvaro, I suspect that people who are using PITR will be concerned about recovery time, and would not be thrilled with any scenario that involves REINDEX to get the system back on its feet. regards, tom lane ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Fast index build vs. PITR
On Tue, Jun 01, 2004 at 01:55:38PM -0400, Greg Stark wrote: Alvaro Herrera [EMAIL PROTECTED] writes: The big problem I see with this kind of approaches is that building an index from scratch can take a huge amount of time, because you have to sort the data. Building from WAL does not have this problem, so it can be much faster. I'm not clear that building from WAL is really going to be that much faster. A) algorithmically it's only the factor of log(n) that you're talking about. and B) the WAL will have records for every write, not just the final product, so it might potentially have a lot more writes to do. Maybe it is log(n) algorithmically, but the constants are big because there's a lot of non-sequential I/O involved. With the WAL approach you only read the pages from the log and copy them somewhere else. It's not nearly the same amount of I/O, and it's mostly sequential. And if I understood correctly what Tom said, after index construction the whole index pages are dropped, so there's no need to redo each node split operation. I thought part of the original problem was specifically that going through WAL slowed down the index rebuild much more than a factor of 2, which would tend to imply that in fact rebuilding from WAL isn't going to be as fast as you might expect. I think there was more than one problem in the code. I expect at least those not related (such as some locking issue apparently) are solved. And I'd expect WAL construction to be heavier than recovery from WAL, because some WAL writes have to be fsync()ed, and this is a heavy burden, while recovery does not need fsync on the new files AFAIK (but I could be wrong on this). One of the things that bothered me was that for some reason you couldn't get the whole performance benefit you would expect from simultaneous index builds when using a multiprocessor machine. 
Another possibility is doing the complete index build without going through WAL and then inserting a complete copy of the index into the WAL without syncing or activating the rebuilt index until the copy do WAL is done. That kind of sucks since it's equivalent to just taking another backup of the data files immediately after the rebuild, but might be a more direct solution using the existing tools. Apparently this is the current state of affairs, though I'm not sure. -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) La persona que no quería pecar / estaba obligada a sentarse en duras y empinadas sillas/ desprovistas, por cierto de blandos atenuantes (Patricio Vogel) ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
Or even KDE as an example where they have both a document on the website for release schedule and another one that is a list of features that are desired for the next release, have been worked on, and have been completed. http://developer.kde.org/development-versions/ Having a good hard copy (not having to search mailing list archives) of release dates would be really nice, not just for developers, but users too. Even if they are subject to change without notice. I think Mozilla has a great concept with there Milestone Schedule, the gray table at: http://www.mozilla.org/roadmap.html#milestone-schedule. I'm sure having just a small table like what Mozilla uses on the PostgreSQL developers page would work wonders to eliminate much of the confusion in the future. On Tue, 2004-06-01 at 13:26 -0400, Andrew Dunstan wrote: Marc G. Fournier wrote: On Tue, 1 Jun 2004, Andrew Dunstan wrote: Marc G. Fournier wrote: Just so that everyone is aware, we are going to push the freeze date for 7.5 to July 1st. Although we feel that there are enough improvements and features already in place for 7.5, Tom's felt that if we gave it that extra month, we could also have PITR in place for 7.5 ... If anyone is working on other features that they feel can be polished off before the July 1st deadline, we would be most happy to incorporate those as well, but do recommend submitting patches for review *sooner*, rather then later, so that any recommended corrections can be addressed before teh deadline. I welcome this, as I always thought June 1 was too soon. However, I think that the process by which the date was eventually arrived at was unfortunate. I would modestly suggest that there should be a minimum period of notice of a feature freeze - 6 weeks or 2 months seems about right to me, given the Oh, you mean the original freeze date that was set at the start of the dev cycle 6 months ago? I am far from being the only person to whom this was less than clear. 
I also know that when I discussed this with one or two members of the core team *they* were not clear about it either. Maybe I missed something in an email somewhere ... In any case, I think a target date should be set at the beginning of a dev cycle and a hard date should be set closer to the end of the cycle. Trying to adhere rigidly to a date set nine or twelve months previously doesn't strike me as good practice. cheers andrew ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED] -- Mike Benoit [EMAIL PROTECTED] ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Win32, PITR, nested transactions, tablespaces
On Thu, 27 May 2004, Christopher Browne wrote: [EMAIL PROTECTED] (Bruce Momjian) wrote: Win32 has 98% of its code in CVS, so it will make it Tablespaces - Christopher says it is ready, and has run tests PITR - some functionality might be in 7.5, but we aren't sure Nested transactions - Alvaro thinks it is close, but we don't know Does this mean that 2PC is likely to be deferred? I believe that integration work on that was awaiting having nested transactions in the codebase... I'm not waiting for the nested transactions to land anymore. I have simply been too busy to finish up 2PC. :( When I said I'd wait for nested transactions, some weeks ago, I thought that Alvaro's patch was going to be checked in within a couple of days or a week. That impression was wrong. Anyway, I don't believe the patches conflict very badly, so there's not really a need for either one to wait. Pushing the freeze date to 1. July just might be enough for me to finish the 2PC patch, but don't get your hopes up. I don't know how much time I can spend on it in June. The issue that I'm currently pondering is what to do with SET SESSION variables on 2PC. Thus far my thinking has been that the PREPARE TRANSACTION command (1st phase) detaches the transaction from the backend. You can do whatever you want with the backend, run other transactions, drop the connection etc. The prepared transaction acts just like a running transaction in another backend. You can also use a different backend to commit the transaction. The question is, what happens to session variables when the first transaction commits? The original backend could be in the middle of another transaction, surely you can't just modify the variables. But you can't really ignore them either. There is also a lot of other functionality that throws not implemented errors if you try to use 2PC with them. For example, DDL and notifications. I believe they are quite straightforward, I just haven't had the time to tackle them yet. 
You can always check the latest version of the patch at http://www.hut.fi/~hlinnaka/pgsql/, I try to update it whenever there is progress. - Heikki ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
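For readers unfamiliar with the feature, the flow Heikki describes, prepare in one backend and resolve from any backend, maps onto commands along these lines. (The spellings below are the ones PostgreSQL ultimately shipped; the work-in-progress patch may have differed in detail, and the table is illustrative.)

```sql
-- Backend A: do the work, then detach the transaction in prepared state.
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
PREPARE TRANSACTION 'txn-42';   -- phase 1: the transaction now lives on
                                -- its own; backend A is free to run other
                                -- transactions or even drop the connection

-- Backend B (any other connection) resolves it later:
COMMIT PREPARED 'txn-42';       -- phase 2
-- or: ROLLBACK PREPARED 'txn-42';
```

The session-variable question Heikki raises follows directly from this shape: between phase 1 and phase 2 the transaction has no owning session whose state a committed SET SESSION could sensibly modify.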
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
On Tue, 2004-06-01 at 18:26, Andrew Dunstan wrote: Marc G. Fournier wrote: On Tue, 1 Jun 2004, Andrew Dunstan wrote: Marc G. Fournier wrote: Just so that everyone is aware, we are going to push the freeze date for 7.5 to July 1st. Although we feel that there are enough improvements and features already in place for 7.5, Tom's felt that if we gave it that extra month, we could also have PITR in place for 7.5 ... If anyone is working on other features that they feel can be polished off before the July 1st deadline, we would be most happy to incorporate those as well, but do recommend submitting patches for review *sooner*, rather then later, so that any recommended corrections can be addressed before teh deadline. I welcome this, as I always thought June 1 was too soon. However, I think that the process by which the date was eventually arrived at was unfortunate. I would modestly suggest that there should be a minimum period of notice of a feature freeze - 6 weeks or 2 months seems about right to me, given the Oh, you mean the original freeze date that was set at the start of the dev cycle 6 months ago? I am far from being the only person to whom this was less than clear. I also know that when I discussed this with one or two members of the core team *they* were not clear about it either. Maybe I missed something in an email somewhere ... The June 1st date was first mentioned on list in mid-March (to me), but wasn't generally announced until May under a specific heading. If it was set in January, I was never knowingly party to that info. Major-architectural changes notwithstanding, xlog archiving was originally completed in late April, having started in Feb. A published schedule might have helped all of us to understand the impact of an extra weeks discussion etc.. Personally, I feel I had good notice, but that doesn't mean it was possible for me to finish by that time... Best Regards, Simon Riggs ---(end of broadcast)--- TIP 6: Have you searched our list archives? 
http://archives.postgresql.org
Re: [HACKERS] Fast index build vs. PITR
Tom Lane wrote: Greg Stark [EMAIL PROTECTED] writes: I'm not clear that building from WAL is really going to be that much faster. A) algorithmically it's only the factor of log(n) that you're talking about. and B) the WAL will have records for every write, not just the final product, so it might potentially have a lot more writes to do. Wrong ... what we log in WAL for a btree index build is just the series of completed index page images. Recreation of the index would proceed at whatever your disk read/write bandwidth is. Like Alvaro, I suspect that people who are using PITR will be concerned about recovery time, and would not be thrilled with any scenario that involves REINDEX to get the system back on its feet. Agreed. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Fast index build vs. PITR
On Tue, 2004-06-01 at 03:21, Bruce Momjian wrote: Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: I assume if someone turns on archiving in postgresql.conf, sighups the postmaster, then does a tar backup, they should be able to do archiving, no? I would have zero problem with labeling the archive parameter as changeable only at postmaster start. I guess the question is whether it would be possible to start/stop it at other times. And what process are we going to use to do a tar backup? Do they turn off archiving before doing the tar, or tell the system to tar to another location? This situation has been discussed and agreed twice already. First we discussed it was SUSET, then SIGHUP, now we talk about postmaster startup. I'm not sure I'm too bothered either way, but the code has now been written to make it a SIGHUP operation. Making it SIGHUP affects the way we invoke the archiver process at postmaster startup, so if we want to change things again we must do so real soon. Postmaster startup is the simplest scenario at run-time, so I'd suggest we move to that NOW and then MAYBE back to SIGHUP at a later time, when we are more certain everything works in production. And you brought up the issue of how do we feed multiple archive files back into the xlog directory during restore if they don't all fit on the disk. That has already been requested by Tom and agreed as on the PITR feature list as an embellishment of the general recover-to-a-point scenario. It *MIGHT* make it into this release, if we get the other stuff done first. I think we need to explore the procedures we are going to use for PITR. Much of that has already been discussed. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
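For context, the archiving knob under discussion is a postgresql.conf parameter, and the whole SUSET/SIGHUP/postmaster-start debate is about when a change to it should take effect. A sketch of the configuration style the design arrived at (parameter name and the %p/%f escapes as eventually shipped; the path is purely illustrative):

```
# A non-empty archive_command enables WAL archiving.  The server runs
# this command once per completed WAL segment; %p is replaced by the
# path of the segment to archive, %f by its file name only.
archive_command = 'cp %p /mnt/server/archivedir/%f'   # illustrative path
```

Because setting the command non-empty is what turns the archiver on, allowing it to flip at SIGHUP time means the archiver process can appear or disappear mid-run, which is exactly the complication Simon describes.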
Re: [HACKERS] Fast index build vs. PITR
On Tue, 2004-06-01 at 03:13, Alvaro Herrera wrote: A completely different idea would be to log a logical index creation, so that during normal recovery those entries are saved somewhere; after the rest of WAL recovery is done, the system is taken into a more normal post-recovery pre-usable state, in which those indexes are recreated from user data. This would be cheapest in WAL traffic, but probably it'll also require more code and new hooks in the startup mechanism. Also, it'd require examining later WAL entries that refer to the index and acting accordingly (e.g. ignore the entry if it modifies the index, and forget the creation if it's a DROP INDEX command.) There will be many ways to optimise recovery once we have PITR working... The current code does a straight replay of all changes. We can imagine lots of different multi-pass or lookahead strategies for replaying xlog records, but please let's wait awhile... Not that I like either of those ideas really ... issuing normal WAL index creation traffic if PITR is active is certainly the easiest way. I agree, certainly for now. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Converting postgresql.conf parameters to kilobytes
Shridhar, Tom, As long as we're messing around with postgresql.conf, I propose that we comment the file much more thoroughly -- in the style of Apache's httpd.conf and our own pg_hba.conf (though maybe not quite as long as hba). Someone proposed this for 7.4 and we ran out of time, and as I've thought about the idea over the last 6 months I've come to like it. I'm happy to do the work for this ... it should be relatively easy if everyone's doc patches are up to date. I can't see any drawback to it; is there something I'm missing? -- -Josh Berkus Aglio Database Solutions San Francisco ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Converting postgresql.conf parameters to kilobytes
Josh Berkus [EMAIL PROTECTED] writes: As long as we're messing around with PostgreSQL.conf, I propose that we comment the file much more thouroughly -- in the style of Apache's httpd.conf and our own pg_hba.conf (though maybe not quite as long as hba). ISTM we had decided that putting vast amounts of documentation into the file comments was exactly the thing *not* to do. It duplicates the SGML documentation, thereby doubling the maintenance effort, to very little purpose. pg_hba's comments in particular had gotten quite out of hand at one point, and have been pruned back severely. Let's not follow the same wheel of fate for postgresql.conf. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Win32, PITR, nested transactions, tablespaces
Heikki Linnakangas [EMAIL PROTECTED] writes: There is also a lot of other functionality that throws not implemented errors if you try to use 2PC with them. For example, DDL and notifications. Why would DDL be different from any other query? regards, tom lane ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
[HACKERS] ACLs versus ALTER OWNER
I've noticed yet another hole in our handling of object permissions, which is that ALTER OWNER doesn't modify the object ACL list at all. This leads to unpleasant results. For example, in CVS tip:

regression=# \c - alice
You are now connected as new user alice.
regression=> create table atable (f1 int);
CREATE TABLE
regression=> grant select on atable to public;
GRANT
regression=> \z atable
    Access privileges for database regression
 Schema |  Name  | Type  |       Access privileges
--------+--------+-------+--------------------------------
 public | atable | table | {alice=arwdRxt/alice,=r/alice}
(1 row)

regression=> \c - postgres
You are now connected as new user postgres.
regression=# alter table atable owner to bob;
ALTER TABLE
regression=# \c - bob
You are now connected as new user bob.
regression=> insert into atable values(1);
ERROR:  permission denied for relation atable

Bob hasn't got insert permissions on his own table ... the ACL says so. Well, since Bob is now the owner he can fix that:

regression=> grant all on atable to bob;
GRANT
regression=> insert into atable values(1);
INSERT 154991 1

but he's not out of the woods yet. The ACL now looks like this:

regression=> \z atable
         Access privileges for database regression
 Schema |  Name  | Type  |            Access privileges
--------+--------+-------+------------------------------------------------
 public | atable | table | {alice=arwdRxt/alice,=r/alice,bob=arwdRxt/bob}
(1 row)

Alice still has all permissions, and PUBLIC still has select permissions, and there isn't a darn thing Bob can do about it because he didn't grant those permissions:

regression=> revoke all on atable from alice;
REVOKE
regression=> revoke all on atable from public;
REVOKE
regression=> \z atable
         Access privileges for database regression
 Schema |  Name  | Type  |            Access privileges
--------+--------+-------+------------------------------------------------
 public | atable | table | {alice=arwdRxt/alice,=r/alice,bob=arwdRxt/bob}
(1 row)

Even more interesting, the superuser can't fix it either, at least not without manual hacking of the ACL entry, because any GRANT/REVOKE the superuser issues on the object will be treated as issued by Bob.
The *only* way to get rid of those rights is to persuade Alice to revoke them. (Or for the superuser to revert the ownership change, revoke the rights as-if-Alice, and then give the table back to Bob. Blech.) ISTM that reasonable behavior for ALTER OWNER would include doing surgery on the object's ACL to replace references to the old owner by references to the new owner. A simplistic approach would just be to do that everywhere in both the grantor and grantee fields. If there are existing entries mentioning the new owner then this could produce duplicate ACL entries, which would need to be merged together. I think there are corner cases where the merging might produce unintuitive results, but it couldn't be as spectacularly bad as doing nothing is. Comments? regards, tom lane ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Nested transactions and tuple header info
Alvaro Herrera wrote: Yes, I did follow it (you mean XMAX_IS_XMIN, right? I suppose no tuple can really have Cmin == Cmax). I'm not claiming I understood it fully though. But as you see, since the assumption is not valid we have to drop the idea and put back the Xmax as a field on its own on HeapTupleHeader (which is what I had done before Bruce persuaded me not to). I don't really like this idea but I don't see any other way out. A couple of days ago I was going to propose putting Xmax as a separate field only as needed, in a way similar to the way Oid is handled --- thus we would enlarge the tuple if and only if the creating transaction deletes it. This would be nice because one would expect that there are not that many tuples created and deleted by the same transaction, so we'd localize the inefficiency of storing both fields (Cmin and Xmax) only on tuples that need it. While I was writing the proposal I realised that it'd mean enlarging tuples that are already on disk, and there's no way we can do that. I have read the archives and I think I understand the issue. Before subtransactions, the only transaction that could see and hence delete a tuple created by an open transaction was the transaction itself, and to keep the cmin and cmax, we created a separate tuple bit which indicated the xmin and xmax were the same. With subtransactions, other xids (subtransaction ids) can see and delete tuples created by earlier parts of the main transaction, and the tuple bit xmin=xmax doesn't work. So, we need a way to record the xmin and xmax while keeping cmin and cmax in the tuple header. My idea is for subtransactions to create additional xid's that represent the opposite of the commit state for changing tuples created by earlier subtransactions:

BEGIN;              xid=1
INSERT a;
    BEGIN;          xid=2
    INSERT b;
    DELETE a;       xid=3
    COMMIT;
COMMIT;

When DELETE a happens, we remove the xmin=1 from the tuple header and replace it with xmin=3.
xid=3 will be marked as committed if xid2 aborts, and will be marked as aborted if xid2 commits. So, if xid2 aborts, the insert of xid1 should be honored, and xid3 is marked as committed, and the opposite if xid2 commits. We would have to use pg_subtrans so these phantom xids could point to the base xid, and a list would have to be maintained so that higher-level subtransactions aborting would trigger changes in these phantom xids; that is, if xid1 aborts, xid2 should abort as well. Anyway, this is more of a sketch of a possible way to do this without extending the tuple header for all transactions. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Nested transactions and tuple header info
On Tue, Jun 01, 2004 at 06:40:07PM -0400, Bruce Momjian wrote: So, we need a way to record the xmin and xmax while keeping cmin and cmax in the tuple header. My idea is for subtransactions to create additional xid's that represent the opposite of the commit state for changing tuples created by earlier subtransactions. Hmm, interesting idea. What seems more interesting is that the change seems to be confined to HeapTupleHeaderSetXmax. Every transaction and subtransaction will need two Xids (I think we can even optimize it so the abort xid is taken only as needed). I don't see anything immediately that would invalidate this idea. I'll marinate it while I write the trigger stuff, and wait for other comments. If nothing bad arises I'll try an implementation and report back. Thanks, -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) Nunca confiaré en un traidor. Ni siquiera si el traidor lo he creado yo (Barón Vladimir Harkonnen) ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Nested transactions and tuple header info
Bruce Momjian [EMAIL PROTECTED] writes: BEGIN; xid=1 INSERT a; BEGIN; xid=2 INSERT b; DELETE a; xid=3 COMMIT; COMMIT; When DELETE a happens, we remove the xmin=1 from the tuple header and replace it with xmin=3. You can't change xmin --- this would break visibility tests. Consider a cursor opened in the outer transaction after the INSERT a. It should be able to see the a row (note that this depends on recognizing xid equality and then comparing cid's within the outer transaction). If the subtransaction mangles xmin then it is no longer possible to make this test correctly. This is exactly the same argument as not being able to overwrite cmin. regards, tom lane ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Converting postgresql.conf parameters to kilobytes
Tom Lane wrote: ISTM we had decided that putting vast amounts of documentation into the file comments was exactly the thing *not* to do. It duplicates the SGML documentation, thereby doubling the maintenance effort, to very little purpose. I agree. If people really think that adding more comments to pg_hba.conf is a good idea, I think the best way to do that is to automatically extract that information from the main SGML docs. -Neil ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [pgsql-hackers-win32] [HACKERS] select like...not using index
It looks to me like you have an index of type bpchar but are searching with type text. I find type conversions very limited with LIKE. I would create an index on 'vin_no' using a cast to TEXT. This should work on both queries. Not in this case. Just to be sure, I created a new column as text type, created an index, analyzed, and searched, and got the same behavior. Hmmm, snipped from your reply was the explain plan from the query where it was clear you were using two different character data types: bpchar and text. That, alone, may have been a problem. Looking at your defaults, did you do: initdb --locale=C somepath ? I found, at some point, 'C' used to be the default; now it seems initdb wants to fish out what locale your system is using. Personally, I think, if I do not specify a locale, I don't want a specific locale. Period. I haven't been paying too close attention to the hackers list to say when this happened, but it bit me a couple times. Furthermore, I did this:

cpc=# show lc_collate;
 lc_collate
------------
 C
(1 row)

cpc=# show lc_ctype;
 lc_ctype
----------
 C
(1 row)

followed by this:

C:\postgres\pgsql\src\test\locale pg_controldata
[...]
LC_COLLATE: English_United States.1252
LC_CTYPE: English_United States.1252

At this point I'm about 90% sure I've turned up a locale related bug... initdb warned me wrt the locale but psql is still reporting 'C'. Plus, my queries don't work where they used to about a week ago. My next step is to initdb --locale=C to confirm this. I've informed Magnus about this and he is looking into it. Merlin ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Persistent main memory Storage Manager
Diego Montenegro [EMAIL PROTECTED] writes: Also, in released versions, MM.c is included but not used, does anyone know if it should work if we define the STABLE_MEMORY_STORAGE, or do a lot coding has to be done for it to work? Actually, I removed that file a few days ago because I didn't see any point in updating it to match the recent smgr API changes. Seems very unlikely that it'd still work, given that it hasn't been maintained for so many years. It's unclear to me why anyone would bother with it anyway. Putting $PGDATA on a RAM disk should have approximately the same effect. regards, tom lane ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
On Tuesday 01 June 2004 16:08, Simon Riggs wrote: The June 1st date was first mentioned on list in mid-March (to me), but wasn't generally announced until May under a specific heading. If it was set in January, I was never knowingly party to that info. Well, it should not have surprised anyone. We have targeted June 1 as a beta freeze date for several versions, not just 7.5. In fact, looking back through last year's pre-7.4 discussion, it's deja vu all over again. Please read the thread Release cycle length ( http://archives.postgresql.org/pgsql-hackers/2003-11/msg00889.php ) and follow it through. We're following the same track we did with 7.4. Are we going to be a full year this time? (4.5 months from freeze to release last time.) But I could not find the date June 1 using the archives (except in relation to the 7.4 freeze, 6/1/2003). The closest to such a discussion would have been in the thread http://archives.postgresql.org/pgsql-hackers/2004-01/msg00273.php, at least that's all I could find. -- Lamar Owen Director of Information Technology Pisgah Astronomical Research Institute 1 PARI Drive Rosman, NC 28772 (828)862-5554 www.pari.edu ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
[HACKERS] Participate to translation
Hello, I just saw that a number of documents in the PostgreSQL project are not yet translated, so I want to participate in translating them from English to French. I am just a French guy who has been using Linux for 8 years; it's time for me to give some of my time to the Open Source community. So, if possible: -- I want to start a new translation, or continue an existing one while avoiding duplicating work someone else is already doing; -- I am looking for other people who want to work together on translating. Best regards. ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
[HACKERS] Restore live backup from posgtres/data/base/* gives PHPPGADMIN errors, works fine otherwise
Hi there fellow admins, One of our databases had to be restored from a backup tape. Since there weren't any dumps made, all I had was a live backup (I know, I know) of /var/lib/postgres/data/base/number. I managed to discover the number for the corresponding database after some searching with strings and grep. First of all: isn't there an easy way, or a file, to look up the number for the corresponding databases? Second question: what could have gone wrong while restoring just the corresponding directory for the database? In the log files I only found out that the database wasn't shut down properly (when I started postgres again after the restore), which I assume is correct because it was restored from a live system backup. One CLOG file turned out to be missing; however, no database transactions are used in this database application. Recovery was done and successful by the postmaster. The database is now up and running. The website running from the content works fine again. All records including BLOBs are there as well. In psql there are no strange warnings or errors... HOWEVER, when I browse the DB with phpPgAdmin (version 2.4.1-2) I get all kinds of resultset errors. For example: Warning: Supplied argument is not a valid PostgreSQL result resource in /usr/share/phppgadmin/db_details.php on line 344 And this is ONLY for tables from the restored DB; the other (non-restored) DBs give no phpPgAdmin DB errors. Has anyone a clue what could be wrong? Or are there tools I can run to do more consistency checks / rebuilds / whatever? Postgres version: 7.2.1-2woody4 Thanks in advance!! -- Kind regards, Jo Voordeckers ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
Lamar Owen wrote: On Tuesday 01 June 2004 16:08, Simon Riggs wrote: The June 1st date was first mentioned on list in mid-March (to me), but wasn't generally announced until May under a specific heading. If it was set in January, I was never knowingly party to that info. Well, it should not have surprised anyone. We have targeted June 1 as a beta freeze date for several versions, not just 7.5. In fact, looking back through last year's pre-7.4 discussion, it's deja vu all over again. Here's what Tom said on 31 March:

--
Simon Riggs [EMAIL PROTECTED] writes: [ expecting to finish PITR by early June ] Is this all still OK for 7.5? (My attempts at cataloguing changes has fallen by the wayside in concentrating on the more important task of PITR.) Do we have a planned freeze month yet? There's not really a plan at the moment, but I had June in the back of my head as a good time; it looks to me like the Windows port will be stable enough for beta in another month or two, and it'd be good if PITR were ready to go by then.
--

That seems to indicate that at that stage, barely 2 months ago, the month was not definite, let alone the day. I confess that as a newcomer I was not around before the 7.4 cycle, so saying that people should have known the freeze date because it follows past patterns doesn't help me much. Are people supposed to obtain this info by trawling mailing list archives years back, or by some sort of divine revelation? Other OS projects manage this whole process better, IMNSHO. I'm not trying to point fingers, but to get future improvement. cheers andrew ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
[HACKERS] Why repalloc() != realloc() ?
Hackers, Is there a reason why repalloc() does not behave the same as realloc? realloc(NULL, size) behaves the same as malloc(size), and it seems useful behavior -- I wonder why repalloc() chooses to Assert() against this exact condition? I assume this is because the NULL pointer would not know what context it belongs to, but the obvious answer is CurrentMemoryContext just like palloc() does. So there must be another reason. Can this behavior be changed? -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) Para tener más hay que desear menos ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Converting postgresql.conf parameters to kilobytes
Neil Conway wrote: Tom Lane wrote: ISTM we had decided that putting vast amounts of documentation into the file comments was exactly the thing *not* to do. It duplicates the SGML documentation, thereby doubling the maintenance effort, to very little purpose. I agree. If people really think that adding more comments to pg_hba.conf is a good idea, I think the best way to do that is to automatically extract that information from the main SGML docs. And if folks want more info in postgresql.conf, we can perhaps pull that from the docs too, or at least the descriptions we added. Also, it seems postgres --describe-config isn't working. It outputs nothing here. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 2: you can get off all lists at once with the unregister command (send unregister YourEmailAddressHere to [EMAIL PROTECTED])
Re: [HACKERS] Nested transactions and tuple header info
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: BEGIN; xid=1 INSERT a; BEGIN; xid=2 INSERT b; DELETE a; xid=3 COMMIT; COMMIT; When DELETE a happens, we remove the xmin=1 from the tuple header and replace it with xmin=3. You can't change xmin --- this would break visibility tests. Consider a cursor opened in the outer transaction after the INSERT a. It should be able to see the a row (note that this depends on recognizing xid equality and then comparing cid's within the outer transaction). If the subtransaction mangles xmin then it is no longer possible to make this test correctly. This is exactly the same argument as not being able to overwrite cmin. Basically the phantom xid's are a shorthand for saying the tuple was created by xid1 and deleted by xid2, both part of the same main transaction. A cursor looking at the rows has to recognize the xid is a phantom (via pg_subtrans) and look up the creation xid. Also, we will need a phantom xid for every xid1/xid2 pair. You can't just create one phantom xid per subtransaction because you must be able to independently control commit/rollback of rows based on the status of the insert transaction. In this case:

BEGIN;
    BEGIN;              xid=1
    INSERT a;
        BEGIN;          xid=2
        INSERT b;
            BEGIN;      xid=3
            DELETE a;   xid=4
            DELETE b;   xid=5
            COMMIT;
        COMMIT;
    COMMIT;
COMMIT;

xid4 and xid5 have to be adjusted based on the status of xid1 and xid2. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 4: Don't 'kill -9' the postmaster
Re: [HACKERS] Official Freeze Date for 7.5: July 1st, 2004
Andrew Dunstan wrote: That seems to indicate that at that stage, barely 2 months ago, the month was not definite, let alone the day. I confess that as a newcomer I was not around before the 7.4 cycle, so saying that people should have known the freeze date because it is following past patterns doesn't help me much. Are people supposed to obtain this info by trawling mailing list archives years back, or by some sort of divine revelation? Other OS projects manage this whole process better, IMNSHO. I'm not trying to point fingers, but to get future improvement. I sent this email on April 16th asking for a status on the big 7.5 features: --- From pgman Fri Apr 16 14:22:42 2004 Subject: PITR, nested transactions, tablespaces, 2-phase commit: Status request To: PostgreSQL-development [EMAIL PROTECTED] Date: Sat, 17 Apr 2004 21:16:20 -0400 (EDT) Would folks report on the current status of these projects: o nested transactions (Alvaro Herrera) o tablespaces (Gavin Sherry) o PITR (Simon Riggs) o 2-phase commit (Heikki Linnakangas) --- I got no replies, except for Simon, I think. Without replies, it is very hard for us to adjust feature freeze timing for these features. I should add I have been proposing a longer development period for a long time because our features are getting more complex. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 6: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Why repalloc() != realloc() ?
Alvaro Herrera [EMAIL PROTECTED] writes: Is there a reason why repalloc() does not behave the same as realloc? realloc(NULL, size) behaves the same as malloc(size), and it seems useful behavior -- I wonder why repalloc() chooses to Assert() against this exact condition? We don't allow palloc(0) either, and we don't return NULL on failure, and for that matter we don't allow pfree(NULL). Please don't argue that we ought to be just like libc. I assume this is because the NULL pointer would not know what context it belongs to, That's a sufficient reason from my point of view. One of the main properties of repalloc is that the alloc'd memory stays in the same context it was first allocated in. I'm not excited about allowing a special exception that makes that behavior less predictable. To give a concrete example of why this sort of corner-case exception is bad, imagine an aggregate function or similar thing that concatenates strings. It starts with a palloc() in a specific context and then assumes that repalloc's will stay in that context without prodding. The code works fine --- unless there are corner cases for palloc(0) returning NULL and repalloc() accepting NULL. Can this behavior be changed? Not without a significantly better argument than you've offered. regards, tom lane ---(end of broadcast)--- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faqs/FAQ.html
Re: [HACKERS] Nested transactions and tuple header info
Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: You can't change xmin --- this would break visibility tests. Basically the phantom xid's are a shorthand for saying the tuple was created by xid1 and deleted by xid2, both part of the same main transaction. That would be fine if the shorthand were readable, but it's not. A cursor looking at the rows has to recognize the xid is a phantom (via pg_subtrans) and look up the creation xid. And it will find that how? Imagine that the creating transaction is itself a subtransaction, and the deleting one is a few nesting levels further down. I don't see how the tuple is going to carry enough information to let you determine what's what, if the deleting subxact overwrites the creating one's XID. regards, tom lane ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Re: [HACKERS] Nested transactions and tuple header info
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Tom Lane wrote: You can't change xmin --- this would break visibility tests. Basically the phantom xid's are a shorthand for saying the tuple was created by xid1 and deleted by xid2, both part of the same main transaction. That would be fine if the shorthand were readable, but it's not. A cursor looking at the rows has to recognize the xid is a phantom (via pg_subtrans) and look up the creation xid. And it will find that how? Imagine that the creating transaction is itself a subtransaction, and the deleting one is a few nesting levels further down. I don't see how the tuple is going to carry enough information to let you determine what's what, if the deleting subxact overwrites the creating one's XID. The backend that created _and_ expired the tuple has to do a lookup to find the creation or expire xid. We need two things. First, we need to control the visibility of the tuple once the entire transaction is done. This does that. Second, we need to be able to find the creation and expire xids, and that information is only required to be visible to the main transaction and its subtransactions. On commit we can adjust these xid statuses to show the proper visibility. How do they do the lookup? Well, one idea would be to just create a local backend hash of these xids and their creation/expire xids. Another idea is that pg_subtrans already points to a parent xid. We could use the same method and point to both creation and expire xids. Why does this not work? -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Nested transactions and tuple header info
On Tue, Jun 01, 2004 at 11:17:40PM -0400, Bruce Momjian wrote: Basically the phantom xid's are a shorthand for saying the tuple was created by xid1 and deleted by xid2, both part of the same main transaction. Hmm... this would spread the ugliness elsewhere (hopefully only HeapTupleHeaderGetXmin). A cursor looking at the rows has to recognize the xid is a phantom (via pg_subtrans) and look up the creation xid. No need to look at pg_subtrans because we know all the Xids of our transaction tree. I think this can be kept in local memory. Also, we will need a phantom xid for every xid1/xid2 pair. You can't just create one phantom xid per subtransaction because you must be able to control independently commit/rollback rows based on the status of the insert transaction. Oh, sure. This could get huge pretty fast. We still need to think on the effects this could have on crash recovery though -- we'd have to write the phantom Xids to Xlog somehow (indicating which ones are committed and which are aborted). And we still don't know what effect it would have on CPU cost for every visibility check. -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) Los dioses no protegen a los insensatos. Éstos reciben protección de otros insensatos mejor dotados (Luis Wu, Mundo Anillo) ---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Why repalloc() != realloc() ?
On Tue, Jun 01, 2004 at 11:39:57PM -0400, Tom Lane wrote: I assume this is because the NULL pointer would not know what context it belongs to, That's a sufficient reason from my point of view. Right, you've convinced me. Just wanted to know if I could save three lines of code. Probably not a compelling reason to change the behavior. -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) El realista sabe lo que quiere; el idealista quiere lo que sabe (Anónimo) ---(end of broadcast)--- TIP 8: explain analyze is your friend
Re: [HACKERS] Nested transactions and tuple header info
Alvaro Herrera wrote: Also, we will need a phantom xid for every xid1/xid2 pair. You can't just create one phantom xid per subtransaction because you must be able to control independently commit/rollback rows based on the status of the insert transaction. Oh, sure. This could get huge pretty fast. We still need to think on the effects this could have on crash recovery though -- we'd have to write the phantom Xids to Xlog somehow (indicating which ones are committed and which are aborted). And we still don't know what effect it would have on CPU cost for every visibility check. As I understand, this overhead would only be needed for subtransactions. I also don't think there will be a lot of them because it is only for creation/expire in the same main transaction, and it is only needed for unique creation/expire combinations, which should be pretty small. -- Bruce Momjian| http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]