Re: [HACKERS] slow IN() clause for many cases
On Wed, 2005-11-30 at 07:18 +0100, Martijn van Oosterhout wrote:
> And finally, why can't:
>   Select * From Sales where month IN (select month from time_dimension where FinYear = 2005 and Quarter = 3)
> be written as:
>   Select Sales.* From Sales, time_dimension where Sales.month = time_dimension.month and time_dimension.FinYear = 2005 and time_dimension.Quarter = 3;
> As long as there are no NULLs it returns the same as the IN() version and PostgreSQL can optimise it just fine.

It can, of course, but there must be value in that optimization. If you consider how IN () would be transformed into =ANY(ARRAY(subselect)), you'll see that the subselect values would be treated as constants that could result in a bitmap index lookup. Transforming IN () into straight joins would not take the same approach when more than one join (i.e. 3 or more tables) was requested.

Best Regards, Simon Riggs

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
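The equivalence Martijn describes (and its caveat) can be sketched outside SQL. This is a minimal illustration in Python, with made-up data standing in for the two query shapes; it is not how the planner actually rewrites anything:

```python
# Why "month IN (subselect)" and a plain join agree only when the subselect
# yields no NULLs (and no duplicates, which would multiply join output rows).
sales = [{"id": 1, "month": 7}, {"id": 2, "month": 8}, {"id": 3, "month": 11}]
q3_months = [7, 8, 9]  # stand-in for: select month from time_dimension where ...

# IN() form: a membership test per outer row
in_result = [s for s in sales if s["month"] in q3_months]

# join form: pair each outer row with matching subselect rows
join_result = [s for m in q3_months for s in sales if s["month"] == m]

assert sorted(r["id"] for r in in_result) == sorted(r["id"] for r in join_result)
```

If `q3_months` contained a duplicate month, the join form would emit duplicate sales rows while the IN() form would not, which is why the rewrite needs the no-duplicates/no-NULLs assumption.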
Re: [HACKERS] Using multi-row technique with COPY
On Wed, 2005-11-30 at 02:10 -0500, Tom Lane wrote:
> Simon Riggs [EMAIL PROTECTED] writes:
> [...context omitted...]
>> We would need to flush all the blocks in the table out of cache at commit time, for that table only.
> This seems strikingly close to the Old Concept of temp tables, which we got rid of for good-and-sufficient reasons. You might want to spend some time reviewing the archives before proposing that we reintroduce it.

My suggestion is that I use the same mechanism as was introduced for 8.1 with the CREATE TABLE AS SELECT statement.

Best Regards, Simon Riggs
Re: [HACKERS] slow IN() clause for many cases
On Fri, Dec 02, 2005 at 08:18:44AM +, Simon Riggs wrote:
> It can, of course, but there must be value in that optimization. If you consider how IN () would be transformed into =ANY(ARRAY(subselect)) you'll see that the subselect values would be treated as constants that could result in a bitmap index lookup. Transforming IN () into straight joins would not take the same approach when more than one join (i.e. 3 or more tables) was requested.

Are you sure? If you have one table joined to many others, that is the single most obvious case for bitmap indexes. And joins are converted to bitmap index scans all the time, so I'm unsure why this case would be any different. If the results are the same, the optimiser should optimise both the same, no?

Anyway, maybe I'm just old-fashioned in thinking that joins are by far the easiest to optimise because they are closest to relational algebra. IN() can also be converted to a join, except for the NULL effect.

Have a nice day,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them.
Re: [HACKERS] generalizing the planner knobs
On Thu, 2005-12-01 at 22:01, Tom Lane wrote:
> Greg Stark [EMAIL PROTECTED] writes:
>> On the other hand the type I would prefer to see are hints that feed directly into filling in information the planner lacks. This only requires that the user understand his own data and still lets the planner pick the best plan based on the provided information.
> This would avoid some issues, but it still is vulnerable to the problem that the hint you put in your code today will fail to track changes in your data tomorrow.

Tom, I have to disagree here. At least in our application, we must provide for an acceptable worst case scenario, and sometimes a slightly wrong estimate can lead to a plan which is very fast 99% of the time but completely wrong in 1% of the cases. Sometimes the percentage is 50/50.

I've had this situation with some limit plans where the planner had chosen a wrong index. The problem there was that the planner had estimated that the query would have 20 rows as a result, but it had less, and that resulted in the complete scan of the index... as opposed to a much smaller scan that would have resulted from scanning the other index, as that one would have provided an end condition orders of magnitude sooner. Now the statistics will always be only an estimation, and +/- a few can really make a big difference in some situations. In this particular situation the index choice of the planner would have been faster for all cases where there were really 20 rows returned, but I forced it to always choose the other plan (by adding the proper order by) because I can't risk a bad result in any of the cases.

In this particular case I was able to force the planner to choose a specific plan, but that might not always be possible, so I guess it really would make sense to be able to tell the planner how selective some conditions are. And yes, sometimes I would like to freeze a specific safe plan for a specific query, even if it is not optimal.

So for me the hint mechanism is good for telling the server that I'm not interested in the BEST plan, which risks getting very bad on occasions, but in a good-enough plan which is safe. And as for the selectivity changes over time, the hints will change along. In most of the situations when selectivity changes, the SQL has to change too, sometimes even the complete workflow. I find that if changed hints will help on some occasions, then having them would mean less maintenance than the code rewriting that would otherwise be involved... and I'm completely sure the server can't compensate for the change of the dynamics of the data all the time. And it definitely can't keep up with highly dynamic data, where the statistics change constantly in big tables...

Our application for example has a kind of batch processing, where we insert smaller or larger batches of data in a HUGE table (~200 million rows), and then that data is immediately used for different operations and then reports, and furthermore it is heavily updated. I can't think of any reasonable statistics target and ANALYZE strategy which could satisfy both small batches and large batches without running ANALYZE permanently with high statistics targets on the key fields... and even that would not be specific enough when limit 20 is involved. For queries involving this table I really would like to freeze plans, as any misplanning has bad consequences.

Cheers, Csaba.
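Csaba's LIMIT scenario can be caricatured with a toy cost model. All numbers below are invented for illustration; this is not the PostgreSQL planner, just a sketch of how a small rowcount misestimate flips the choice between a plan that is usually fast and a plan that is always safe:

```python
# Toy cost model: index A stops early only if at least LIMIT matching rows
# exist; otherwise the whole index is walked. Index B always does a bounded
# range scan (the "safe" plan). Numbers are made up.
LIMIT = 20
INDEX_SIZE = 1_000_000      # cost of walking index A to the end
RANGE_SCAN_COST = 5_000     # cost of the bounded scan on index B

def cost_index_a(actual_rows):
    # cheap when enough rows exist, catastrophic when they don't
    return LIMIT * 10 if actual_rows >= LIMIT else INDEX_SIZE

def plan_choice(estimated_rows):
    # the planner picks whichever plan is cheaper under its *estimate*
    return "index_a" if cost_index_a(estimated_rows) < RANGE_SCAN_COST else "index_b"

# Estimate of 20 rows -> planner picks index A...
assert plan_choice(20) == "index_a"
# ...but if only 15 rows actually exist, index A degenerates to a full scan
assert cost_index_a(15) == INDEX_SIZE   # 200x the cost of the safe plan
```

The point of the sketch: the plan choice is correct under the estimate, yet a shortfall of five rows makes the actual cost explode, which is exactly the case where freezing the safe plan is attractive.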
Re: [HACKERS] Shared locking in slru.c
On Wed, 30 Nov 2005 13:53:13 -0500, Tom Lane [EMAIL PROTECTED] wrote:
> The way the attached patch attacks this is for the shared-lock access case to simply set the page's LRU counter to zero, without bumping up the LRU counters of the other pages as the normal adjustment would do.

If you still plan to do this, you might also want to revert the micro-optimisation introduced by the original SLRU patch:

| Apart from refactoring I made a little change to SlruRecentlyUsed,
| formerly ClogRecentlyUsed: It now skips incrementing lru_counts, if
| slotno is already the LRU slot, thus saving a few CPU cycles.
|
|+#define SlruRecentlyUsed(shared, slotno) \
|+	do { \
|+		if ((shared)->page_lru_count[slotno] != 0) { \
|+			int iilru; \
|+			for (iilru = 0; iilru < NUM_CLOG_BUFFERS; iilru++) \
|+				(shared)->page_lru_count[iilru]++; \
|+			(shared)->page_lru_count[slotno] = 0; \
|+		} \
|+	} while (0)

Otherwise you could end up with a stable state of several pages having lru_count == 0.

Servus
Manfred
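Manfred's point can be illustrated with a small simulation. This is assumed semantics in Python, not the real slru.c: it combines the proposed shared-lock shortcut ("just zero this slot") with the macro's skip-if-already-zero micro-optimisation, and shows several slots tying at lru_count == 0:

```python
# Toy model of SLRU LRU counters (illustrative, not PostgreSQL code).
NUM_BUFFERS = 4
lru_count = [3, 2, 1, 0]   # a normal state: exactly one least-recently-used slot

def slru_recently_used(slot):
    # the micro-optimisation: do nothing if the slot already looks like LRU
    if lru_count[slot] != 0:
        for i in range(NUM_BUFFERS):
            lru_count[i] += 1
        lru_count[slot] = 0

def shared_lock_access(slot):
    # the proposed cheap path: zero this slot without bumping the others
    lru_count[slot] = 0

shared_lock_access(0)
shared_lock_access(1)
slru_recently_used(0)   # skipped: slot 0 is already 0
slru_recently_used(1)   # skipped: slot 1 is already 0
# three slots now tie at 0, making LRU victim selection ambiguous
assert lru_count.count(0) == 3
```

With the skip removed, each `slru_recently_used` call would re-separate the counters, which is why reverting the micro-optimisation matters once the shared-lock path stops bumping the other pages.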
Re: [HACKERS] generalizing the planner knobs
On Fri, Dec 02, 2005 at 11:07:06AM +0100, Csaba Nagy wrote:
> So for me the hint mechanism is good for telling the server that I'm not interested in the BEST plan, which risks getting very bad on occasions, but in a good-enough plan which is safe.

I'm wondering if long term another approach might be to have another parameter in the planner, cost_error or selectivity_error, which is an indication of how accurate we think the estimate is. So for example an index scan might cost x with a possible error of 15%, and the seqscan might cost y with an error of 1%. The error for a nested loop would be the product of the two inputs, whereas a merge join would be much less sensitive to error. A sort or hash join would react badly to large variations of input.

So in cases where there is a choice between two index scans, one slightly more expensive but more accurate that can feed a merge join would be a better choice than a possibly highly selective index without accurate info that needs to be fed into a nested loop. Even though the latter might look better, the former is the safer option.

I think this would solve the problem where people see sudden flip-flops between good and bad plans. The downside is that it's yet another parameter for the planner to get wrong. Unfortunately, this is the kind of thing people write theses on, and I don't think many people have the grounding in statistics to make it all work.

Have a nice day,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
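A back-of-the-envelope sketch of how such an error parameter might propagate. The formulas and the `cost_error` idea are hypothetical, following Martijn's suggestion rather than any real planner code:

```python
# Propagate a relative-error estimate through two join strategies
# (purely illustrative model of the hypothetical "cost_error" parameter).
def nested_loop(cost_a, err_a, cost_b, err_b):
    # cost compounds multiplicatively, so relative errors compound too
    return cost_a * cost_b, (1 + err_a) * (1 + err_b) - 1

def merge_join(cost_a, err_a, cost_b, err_b):
    # roughly additive cost, so much less sensitive to input error
    cost = cost_a + cost_b
    err = (cost_a * err_a + cost_b * err_b) / cost
    return cost, err

_, nl_err = nested_loop(100, 0.15, 100, 0.15)   # ~32% combined error
_, mj_err = merge_join(100, 0.15, 100, 0.15)    # stays at 15%
assert nl_err > mj_err   # the "safe" plan carries the tighter error bound
```

Under this model a planner could prefer the merge join even when its point estimate is slightly worse, which is exactly the flip-flop-avoiding behaviour described above.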
[HACKERS] Graphics in postgress using GTK
Hi,

We are trying to implement a progress estimator for long queries. We are trying to make the display graphical using GTK, but we get an error saying:

Xlib: Connection to :0.0 refused by server
Xlib: No protocol specified
Gtk-WARNING **: Cannot open display

Can someone suggest a method to integrate this GTK code into Postgres and solve this problem?

Thanks and Regards
Anuj Tripathi
Re: [HACKERS] Graphics in postgress using GTK
On Fri, Dec 02, 2005 at 04:15:55PM +0530, Anuj Tripathi wrote:
> Hi We are trying to implement a progress estimator for long queries. We are trying to make the display graphical using GTK but we get an error saying Xlib : Connection to :0.0 refused by server xlib : no protocol specified

Err, why on earth would the *backend* need to talk to the X server? Shouldn't it be sending messages to the client, and the client then making the pics? How do you plan to deal with 10 queries running simultaneously on a server at the other side of the world? I'm not even running an X server on my database server... Anyway, the database server doesn't have permission to access the X server; that's the error.

> Can someone suggest a method to integrate this GTK code into postgres and solve this problem.

Umm, don't?

Have a nice day,
-- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Graphics in postgress using GTK
Anuj Tripathi wrote:
> Hi We are trying to implement a progress estimator for long queries. We are trying to make the display graphical using GTK but we get an error saying Xlib : Connection to :0.0 refused by server xlib : no protocol specified GTK-warning ** : Cannot open display Can someone suggest a method to integrate this GTK code into postgres and solve this problem.

Um - don't you want to be doing this in the client, not in the backend?

-- Richard Huxton
Archonet Ltd
Re: [HACKERS] Buildfarm: Bear, Branch 2?
Michael Glaesemann said:
> Out of curiosity, what is this beast? http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=bear&dt=2005-11-13%2012:01:08

pilot error. I'll clean it up.

cheers
andrew
Re: [HACKERS] Buildfarm: Bear, Branch 2?
On Dec 2, 2005, at 20:54, Andrew Dunstan wrote:
> Michael Glaesemann said:
>> Out of curiosity, what is this beast? http://pgbuildfarm.org/cgi-bin/show_log.pl?nm=bear&dt=2005-11-13%2012:01:08
> pilot error. I'll clean it up.

Shucks. I was hoping it was a secret new port that reduced count(*) to a simple lookup, removed the need for vacuuming, solved all context-storm problems, and provided simple in-place upgrades and column-level locale support, as well as a nice, async multi-master solution. Guess I better get to work. :)

Michael Glaesemann
grzm myrealbox com
Re: [HACKERS] generalizing the planner knobs
Greg Stark [EMAIL PROTECTED] writes:
> You more or less missed my entire point.

Only because I am still getting used to how powerful and flexible Postgres is; but I am working on expanding my horizons.

> In the extreme, no amount of added intelligence in the optimizer is going to help it come up with any sane selectivity estimate for something like WHERE radius_authenticate(user) = 'OK'

Yeah, I can see where something like this would be problematic. While I still think that in an ideal world you want to leave all of this to the engine, it is true that in the real world sometimes we still have to do some of the thinking for the computer. It's just that I've seen code absolutely littered with optimizer hints, and that really bothers me. But you can't not build a useful tool just because some would abuse it.

Mike Pollard
SUPRA Server SQL Engineering and Support
Cincom Systems, Inc.
Re: [HACKERS] Shared locking in slru.c
Manfred Koizar [EMAIL PROTECTED] writes:
> On Wed, 30 Nov 2005 13:53:13 -0500, Tom Lane [EMAIL PROTECTED] wrote:
>> The way the attached patch attacks this is for the shared-lock access case to simply set the page's LRU counter to zero, without bumping up the LRU counters of the other pages as the normal adjustment would do.
> If you still plan to do this, you might also want to revert the micro-optimisation introduced by the original SLRU patch:

Good point --- thanks for mentioning it. I'm still fooling with the modified code because it seems like it's not doing very well at managing the SLRU pool, and perhaps that's got something to do with it ...

regards, tom lane
Re: [HACKERS] Reducing relation locking overhead
On Fri, 2005-12-02 at 02:14 -0500, Tom Lane wrote:
> Greg Stark [EMAIL PROTECTED] writes:
>> It was a *major* new feature that many people were waiting for when Oracle finally implemented live CREATE INDEX and REINDEX. The ability to create an index without blocking any operations on a table, even updates, was absolutely critical for 24x7 operation.
> Well, we're still not in *that* ballpark and I haven't seen any serious proposals to make us so. How absolutely critical is it really? Is REINDEX-in-parallel-with-reads-but-not-writes, which is what we actually have at the moment, an absolutely critical facility?

REINDEX isn't run that regularly, so perhaps might warrant special attention. (I think there are other things we could do to avoid ever needing to run a REINDEX.) CREATE/DROP INDEX is important however, since we may want to try out new index choices without stopping access altogether. But we do also want the locking contention to be reduced.

I know at least one other RDBMS that uses optimistic locking when creating indexes. It checks the table description, builds the index with a read lock, then checks the table description again before attempting to lock the catalog, create the index and then complete. There is a risk of getting a "table restructured" error after the build is nearly complete. If we did that, then we wouldn't need to lock the indexes because you wouldn't be able to see an index until it was built. Doing something similar might allow us to have online CREATEs yet without a locking overhead.

24x7 operation is actually fairly common. Maybe not with a strong SLA for availability, but many websites and embedded apps are out there all the time. The PostgreSQL claim to fame has concurrency at the top of the list, so we should assume that in all we do.

Best Regards, Simon Riggs
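The optimistic protocol Simon describes can be sketched as follows. This is a hypothetical simplification in Python; the dict-based table and the `schema_version` field are invented stand-ins for the catalog checks:

```python
# Sketch of optimistic index creation: read the table definition, build
# under only a read lock, then re-check the definition before swapping the
# index in; fail with "table restructured" if the definition changed.
class TableRestructuredError(Exception):
    pass

def create_index_optimistic(table):
    version_before = table["schema_version"]      # check table description
    index = sorted(table["rows"])                 # long build, read lock only
    if table["schema_version"] != version_before: # re-check before committing
        raise TableRestructuredError("table restructured during index build")
    table["indexes"].append(index)                # short exclusive catalog step

t = {"schema_version": 1, "rows": [3, 1, 2], "indexes": []}
create_index_optimistic(t)
assert t["indexes"] == [[1, 2, 3]]
```

The key property is that the expensive build phase holds no exclusive lock; only the final catalog swap does, at the price of occasionally having to retry when the table changed underneath.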
[HACKERS] Optimizer oddness, possibly compounded in 8.1
The optimizer seems to want to use sequential scans on inherited tables when crossed with another table, as the following seems to demonstrate:

Create Table base(f1 bigserial);
create table inh1(f2 bigint) inherits (base);
create table inh2(f2 bigint) inherits (base);
create table inh3(f2 bigint) inherits (base);
create table inh4(f2 bigint) inherits (base);
insert into inh1(f2) values(1);
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
insert into inh1(f2) select f2 from inh1;
create unique index base_f1 on base(f1);
create unique index inh1_f1 on inh1(f1);
create unique index inh2_f1 on inh2(f1);
create unique index inh3_f1 on inh3(f1);
create unique index inh4_f1 on inh4(f1);
vacuum analyze base;
vacuum analyze inh1;
vacuum analyze inh2;
vacuum analyze inh3;
vacuum analyze inh4;
create table t2(f1 bigint);
insert into t2 values(1);
insert into t2 values(2);
insert into t2 values(128);
insert into t2 values(32768);

explain analyze select * from t2,base where base.f1=t2.f1;

gives:

 Hash Join  (cost=1.05..1546.04 rows=150 width=16) (actual time=0.433..436.791 rows=4 loops=1)
   Hash Cond: ("outer".f1 = "inner".f1)
   ->  Append  (cost=0.00..1181.66 rows=72366 width=8) (actual time=0.279..331.698 rows=65536 loops=1)
         ->  Seq Scan on base  (cost=0.00..29.40 rows=1940 width=8) (actual time=0.002..0.002 rows=0 loops=1)
         ->  Seq Scan on inh1 base  (cost=0.00..1073.36 rows=65536 width=8) (actual time=0.273..148.326 rows=65536 loops=1)
         ->  Seq Scan on inh2 base  (cost=0.00..26.30 rows=1630 width=8) (actual time=0.002..0.002 rows=0 loops=1)
         ->  Seq Scan on inh3 base  (cost=0.00..26.30 rows=1630 width=8) (actual time=0.003..0.003 rows=0 loops=1)
         ->  Seq Scan on inh4 base  (cost=0.00..26.30 rows=1630 width=8) (actual time=0.002..0.002 rows=0 loops=1)
   ->  Hash  (cost=1.04..1.04 rows=4 width=8) (actual time=0.132..0.132 rows=0 loops=1)
         ->  Seq Scan on t2  (cost=0.00..1.04 rows=4 width=8) (actual time=0.111..0.119 rows=4 loops=1)
 Total runtime: 436.880 ms

unwrapping the query into a series of UNIONs on the child tables reduces the run time by a factor of several hundred under PG8.0:

explain analyze
select z.f1 from t2,only base z where z.f1=t2.f1
UNION ALL select z.f1 from t2,inh1 z where z.f1=t2.f1
UNION ALL select z.f1 from t2,inh2 z where z.f1=t2.f1
UNION ALL select z.f1 from t2,inh3 z where z.f1=t2.f1
UNION ALL select z.f1 from t2,inh4 z where z.f1=t2.f1

 Append  (cost=0.00..94.87 rows=20 width=8) (actual time=0.184..0.485 rows=4 loops=1)
   ->  Subquery Scan "*SELECT* 1"  (cost=0.00..20.42 rows=4 width=8) (actual time=0.096..0.096 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..20.38 rows=4 width=8) (actual time=0.093..0.093 rows=0 loops=1)
               ->  Seq Scan on t2  (cost=0.00..1.04 rows=4 width=8) (actual time=0.033..0.043 rows=4 loops=1)
               ->  Index Scan using base_f1 on base z  (cost=0.00..4.82 rows=1 width=8) (actual time=0.008..0.008 rows=0 loops=4)
                     Index Cond: (z.f1 = "outer".f1)
   ->  Subquery Scan "*SELECT* 2"  (cost=0.00..13.18 rows=4 width=8) (actual time=0.084..0.194 rows=4 loops=1)
         ->  Nested Loop  (cost=0.00..13.14 rows=4 width=8) (actual time=0.081..0.178 rows=4 loops=1)
               ->  Seq Scan on t2  (cost=0.00..1.04 rows=4 width=8) (actual time=0.002..0.012 rows=4 loops=1)
               ->  Index Scan using inh1_f1 on inh1 z  (cost=0.00..3.01 rows=1 width=8) (actual time=0.031..0.033 rows=1 loops=4)
                     Index Cond: (z.f1 = "outer".f1)
   ->  Subquery Scan "*SELECT* 3"  (cost=0.00..20.42 rows=4 width=8) (actual time=0.061..0.061 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..20.38 rows=4 width=8) (actual time=0.057..0.057 rows=0 loops=1)
               ->  Seq Scan on t2  (cost=0.00..1.04 rows=4 width=8) (actual time=0.003..0.011 rows=4 loops=1)
               ->  Index Scan using inh2_f1 on inh2 z  (cost=0.00..4.82 rows=1 width=8) (actual time=0.007..0.007 rows=0 loops=4)
                     Index Cond: (z.f1 = "outer".f1)
   ->  Subquery Scan "*SELECT* 4"  (cost=0.00..20.42 rows=4 width=8) (actual time=0.058..0.058 rows=0 loops=1)
         ->  Nested Loop  (cost=0.00..20.38 rows=4 width=8) (actual time=0.055..0.055 rows=0 loops=1)
               ->  Seq Scan on t2  (cost=0.00..1.04 rows=4 width=8) (actual time=0.002..0.011 rows=4 loops=1)
               ->  Index Scan using inh3_f1 on inh3 z
Re: [HACKERS] Graphics in postgress using GTK
What are you doing? Trying to run the GTK application on the same server as the database? Or are you trying to do a graphical interface on a server without an X server? I think there is something very wrong with your approach: your graphical interface should be running on the client side, NOT on the server side.

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]] On Behalf Of Anuj Tripathi
Sent: Friday, 02 December 2005 04:46 a.m.
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] Graphics in postgress using GTK

Hi We are trying to implement a progress estimator for long queries. We are trying to make the display graphical using GTK but we get an error saying Xlib : Connection to :0.0 refused by server xlib : no protocol specified GTK-warning ** : Cannot open display Can someone suggest a method to integrate this GTK code into postgres and solve this problem. Thanks and Regards Anuj Tripathi
Re: [HACKERS] generalizing the planner knobs
> In the extreme, no amount of added intelligence in the optimizer is going to help it come up with any sane selectivity estimate for something like WHERE radius_authenticate(user) = 'OK'

Why not? The missing capability in this case is to be able to provide or generate (self-learning?) statistics for a function that describe a typical result and the cost of getting that result.

--
Re: [HACKERS] Graphics in postgress using GTK
On 12/2/05, Anuj Tripathi [EMAIL PROTECTED] wrote:
> Hi We are trying to implement a progress estimator for long queries. We are trying to make the display graphical using GTK but we get an error saying Xlib : Connection to :0.0 refused by server xlib : no protocol specified GTK-warning ** : Cannot open display Can someone suggest a method to integrate this GTK code into postgres and solve this problem. Thanks and Regards Anuj Tripathi

What about returning a cursor or a set of rows to your application, and letting your progress bar advance while looping over them?

--
regards,
Jaime Casanova
(DBA: DataBase Aniquilator ;)
Re: [HACKERS] Optimizer oddness, possibly compounded in 8.1
Philip Warner [EMAIL PROTECTED] writes:
> The optimizer seems to want to use sequential scans on inherited tables when crossed with another table, as the following seems to demonstrate:

Is it intentional that your test case omits an analyze on t2? Coz when I add that, I get the same plan you show for 8.0. Without the knowledge that t2 is small, that plan is not a good choice. (The larger point that joins of inheritance unions aren't well-planned is true, but it's always been true...)

regards, tom lane
[HACKERS] postgres questions (semi-joins, algebraic space)
Good morning, I'm an Italian student and I'm working on a project involving Postgres. I'm sorry for my very bad English. I have some questions:

1) Does Postgres support semi-joins?
2) What about projections or selections in query tree rewriting? Does Postgres work in the algebraic space (in other words, when/how does Postgres process algebraic operators? Are they used, as is common, to reduce the size of the space that the search strategy has to explore?)

Thank you all
Lucky
Re: [HACKERS] What`s wrong with the lists?
> Is there something weird going on with the lists? I ask because lately there has been spam showing up rather frequently.

I'm not seeing this here. Can you point to a specific message on hackers? The only list I see that has spam is pg-press.

--
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200512021436
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
Re: [HACKERS] What`s wrong with the lists?
Greg Sabino Mullane wrote:
>> Is there something weird going on with the lists? I ask because lately there has been spam showing up rather frequently.
> I'm not seeing this here. Can you point to a specific message on hackers? The only list I see that has spam is pg-press.

Doh! I understand what's happening now. The problem is that some spam arrives to [EMAIL PROTECTED] with Cc: pgsql-hackers, and I have a procmail rule that redirects that to pgsql-hackers before the rule that redirects it to the pgsql-es-ayuda-owner mbox. The solution seems to be to put the -owner rule before the others ... sorry for the noise.

-- Alvaro Herrera  http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] Reducing relation locking overhead
Simon Riggs [EMAIL PROTECTED] writes:
> On Fri, 2005-12-02 at 02:14 -0500, Tom Lane wrote:
>> Greg Stark [EMAIL PROTECTED] writes:
>>> It was a *major* new feature that many people were waiting for when Oracle finally implemented live CREATE INDEX and REINDEX. The ability to create an index without blocking any operations on a table, even updates, was absolutely critical for 24x7 operation.
>> Well, we're still not in *that* ballpark and I haven't seen any serious proposals to make us so. How absolutely critical is it really? Is REINDEX-in-parallel-with-reads-but-not-writes, which is what we actually have at the moment, an absolutely critical facility?

Alright, I'll grant Tom that "absolutely critical" was a bit of hyperbole.

> I know at least one other RDBMS that uses optimistic locking when creating indexes. It checks the table description, builds the index with a read lock, then checks the table description again before attempting to lock the catalog, create the index and then complete. There is a risk of getting a "table restructured" error after the build is nearly complete.

I suspect this comes out of a very different storage model from Postgres's. Postgres would have no trouble building an index of the existing data using only shared locks. The problem is that any newly inserted (or updated) records could be missing from such an index. To do it you would then have to gather up all those newly inserted records. And of course while you're doing that, new records could be inserted. And so on. There's no guarantee it would ever finish, though I suppose you could detect the situation if the size of the new batch wasn't converging to 0 and throw an error.

One optimization would be to have a flag that disabled the use of the FSM, forcing all inserts to extend the table and allocate new tuples at the end. This would at least limit the amount the index build would have to scan. The index build could just do one-by-one insertions for the remaining tuples until it catches up to the head.

At the end of the index build there's also a problem upgrading locks to put in place the new index. That would create a deadlock risk. Perhaps that's where the "table restructured" error comes up in these other databases?

> 24x7 operation is actually fairly common. Maybe not with a strong SLA for availability, but many websites and embedded apps are out there all the time. The PostgreSQL claim to fame has concurrency at the top of the list, so we should assume that in all we do.

Off the top of my head I would put these items on the list of necessary for 24x7 operation:
. (non-FULL) VACUUM
. Online/PITR backups
. Partitioned Tables
. online index builds
Of which Postgres has 2.5 out of 4. And most of those have come in just the last 12 months or so. Doing pretty damned good.

-- greg
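The catch-up loop Greg sketches, including the convergence check, can be written down as a toy model. This is illustrative Python, not PostgreSQL code; the batch lists stand in for rows inserted during each round of the build:

```python
# Build an index from a snapshot, then repeatedly index rows inserted
# meanwhile; abort if successive batches stop shrinking toward zero.
def build_index_with_catchup(batches):
    # batches[0] is the initial table scan; batches[i] holds rows that
    # arrived while round i-1 was being indexed.
    index = set(batches[0])
    prev_size = None
    for batch in batches[1:]:
        if not batch:
            return index                      # caught up with the head: done
        if prev_size is not None and len(batch) >= prev_size:
            raise RuntimeError("catch-up not converging; aborting build")
        index.update(batch)
        prev_size = len(batch)
    return index

# shrinking batches converge and every row ends up indexed
idx = build_index_with_catchup([[1, 2, 3, 4], [5, 6], [7], []])
assert idx == {1, 2, 3, 4, 5, 6, 7}
```

A growing-batch input would trigger the `RuntimeError`, which corresponds to Greg's "throw an error if the new batch isn't converging to 0" escape hatch (or, per the follow-up suggestion, the point at which you give up and take a blocking lock for the final pass).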
Re: [HACKERS] generalizing the planner knobs
Rod Taylor [EMAIL PROTECTED] writes:
>> In the extreme, no amount of added intelligence in the optimizer is going to help it come up with any sane selectivity estimate for something like WHERE radius_authenticate(user) = 'OK'
> Why not? The missing capability in this case is to be able to provide or generate (self-learning?) statistics for a function that describe a typical result and the cost of getting that result.

Ok, try WHERE radius_authenticate(user, (select ...), ?)

The point is that you can improve the estimates the planner gets. But you can never make them omniscient. There will always be cases where the user knows his data more than the planner. And those hints are still valid when a new optimizer has new plans available.

This is different from hints that tell the planner what plan to use. Every situation where the predicted cost is inaccurate despite accurate estimates represents a fixable bug in the optimizer's cost model. When a new version of the optimizer is available with a more accurate cost model or new available plans, those kinds of hints will only get in the way.

-- greg
Re: [HACKERS] Reducing relation locking overhead
On 02 Dec 2005 15:25:58 -0500, Greg Stark [EMAIL PROTECTED] wrote: I suspect this comes out of a very different storage model from Postgres's. Postgres would have no trouble building an index of the existing data using only shared locks. The problem is that any newly inserted (or updated) records could be missing from such an index. To do it you would then have to gather up all those newly inserted records. And of course while you're doing that new records could be inserted. And so on. There's no guarantee it would ever finish, though I suppose you could detect the situation if the size of the new batch wasn't converging to 0 and throw an error. After you're mostly caught up, change locking behavior to block further updates while the final catchup happens. This could be driven by a heuristic that says make up to N attempts to catch up without blocking, after that just take a lock and finish the job. Presumably the catchup would be short compared to the rest of the work. Are there environments which could not tolerate even this minimal hit? Probably, which leaves the choice of telling them 'don't reindex then' or providing a knob which would tell it to never block (would just try N times and then give up, failing the reindex). ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
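Greg's heuristic is easy to state precisely. The sketch below (Python toy model; the function and its arguments are invented for illustration, nothing like this exists in the backend) walks through the bounded non-blocking catch-up followed by a single blocking final pass:

```python
def build_index_concurrently(existing_tuples, incoming, max_passes=3):
    """existing_tuples: rows present when the build starts (shared lock only).
    incoming: callable returning rows inserted since it was last called.
    Returns (index, had_to_block)."""
    index = sorted(existing_tuples)         # initial bulk build
    for _ in range(max_passes):
        batch = incoming()                  # rows that arrived meanwhile
        if not batch:
            return index, False             # converged without blocking
        index.extend(batch)                 # one-by-one catch-up insertions
        index.sort()
    # batches never converged to empty: block writers for one final pass
    index.extend(incoming())                # nothing new can arrive now
    index.sort()
    return index, True
```

The never-block knob Greg mentions would simply replace the final blocking pass with an error.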
Re: [HACKERS] generalizing the planner knobs
On 02 Dec 2005 15:49:02 -0500, Greg Stark [EMAIL PROTECTED] wrote: Rod Taylor [EMAIL PROTECTED] writes: The missing capability in this case is to be able to provide or generate (self learning?) statistics for a function that describe a typical result and the cost of getting that result. Ok, try WHERE radius_authenticate(user, (select ...), ?) The point is that you can improve the estimates the planner gets. But you can never make them omniscient. There will always be cases where the user knows his data more than the planner. And those hints are still valid when a new optimizer has new plans available. Actually... If a statistics engine stores the entire query as well and uses that as a key I don't see why it couldn't figure this out. I.e. in queries that look like Z operation X has historically had selectivity Y. Then the instruction to the user is simple: 'make sure that queries with different results look different'. This is often naturally the case. The challenge becomes how do you group together queries which are mostly the same so that you get enough data, but not falsely cluster queries with different statistics. The simplest way: check the statistics list for the most similar query match, and use that information. If the result is similar to what is expected, use it to update the statistics record. If the measured selectivity is too different make a new record which will then attract similar queries. Sounds like a good research project for someone. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
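As a thought experiment, the "query shape as key" idea might look like the toy sketch below (pure illustration: the class, its normalization regex, and the tolerance threshold are all invented, not anything PostgreSQL does):

```python
import re

class SelectivityMemory:
    """Toy learned-statistics store: selectivity estimates keyed by a
    normalized query shape, with a crude restart when observations
    diverge too far from the stored record."""

    def __init__(self, tolerance=0.5):
        self.records = {}               # shape -> last learned selectivity
        self.tolerance = tolerance

    def shape(self, query):
        # strip literals so "mostly the same" queries cluster under one key
        return re.sub(r"\b\d+\b|'[^']*'", "?", query.lower())

    def estimate(self, query, default=0.33):
        return self.records.get(self.shape(query), default)

    def observe(self, query, measured):
        key = self.shape(query)
        prev = self.records.get(key)
        if prev is None:
            self.records[key] = measured
        elif abs(prev - measured) / max(prev, 1e-9) > self.tolerance:
            self.records[key] = measured                # too different: restart the record
        else:
            self.records[key] = (prev + measured) / 2   # similar: refine the estimate
```

The hard part Greg identifies, grouping without falsely clustering, is exactly the choice of `shape()` and `tolerance` here.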
Re: [HACKERS] Reducing relation locking overhead
Gregory Maxwell wrote: On 02 Dec 2005 15:25:58 -0500, Greg Stark [EMAIL PROTECTED] wrote: I suspect this comes out of a very different storage model from Postgres's. Postgres would have no trouble building an index of the existing data using only shared locks. The problem is that any newly inserted (or updated) records could be missing from such an index. To do it you would then have to gather up all those newly inserted records. And of course while you're doing that new records could be inserted. And so on. There's no guarantee it would ever finish, though I suppose you could detect the situation if the size of the new batch wasn't converging to 0 and throw an error. After you're mostly caught up, change locking behavior to block further updates while the final catchup happens. This could be driven by a heuristic that says make up to N attempts to catch up without blocking, after that just take a lock and finish the job. Presumably the catchup would be short compared to the rest of the work. The problem is that you need to upgrade the lock at the end of the operation. This is very deadlock prone, and likely to abort the whole operation just when it's going to finish. Is this a showstopper? Tom seems to think it is. I'm not sure anyone is going to be happy if they find that their two-day reindex was aborted just when it was going to finish. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Reducing relation locking overhead
On Fri, 2005-12-02 at 19:04 -0300, Alvaro Herrera wrote: Gregory Maxwell wrote: On 02 Dec 2005 15:25:58 -0500, Greg Stark [EMAIL PROTECTED] wrote: I suspect this comes out of a very different storage model from Postgres's. Postgres would have no trouble building an index of the existing data using only shared locks. The problem is that any newly inserted (or updated) records could be missing from such an index. To do it you would then have to gather up all those newly inserted records. And of course while you're doing that new records could be inserted. And so on. CREATE INDEX uses SnapshotAny, so the scan that feeds the build could easily include rows added after the CREATE INDEX started. When the scan was exhausted we could mark that last TID and return to it after the sort/build. There's no guarantee it would ever finish, though I suppose you could detect the situation if the size of the new batch wasn't converging to 0 and throw an error. After you're mostly caught up, change locking behavior to block further updates while the final catchup happens. This could be driven by a heuristic that says make up to N attempts to catch up without blocking, after that just take a lock and finish the job. Presumably the catchup would be short compared to the rest of the work. The problem is that you need to upgrade the lock at the end of the operation. This is very deadlock prone, and likely to abort the whole operation just when it's going to finish. Is this a showstopper? Tom seems to think it is. I'm not sure anyone is going to be happy if they find that their two-day reindex was aborted just when it was going to finish. If that is the only objection against such a seriously useful feature, then we should look at making some exceptions. (I understand the lock upgrade issue). Greg has come up with an exceptional idea here, so can we look deeper? We already know others have done it. What types of statement would cause the index build to fail? 
How else can we prevent them from executing while the index is being built? Best Regards, Simon Riggs ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Reducing relation locking overhead
On 12/2/05, Alvaro Herrera wrote: Gregory Maxwell wrote: After you're mostly caught up, change locking behavior to block further updates while the final catchup happens. This could be driven by a heuristic that says make up to N attempts to catch up without blocking, after that just take a lock and finish the job. Presumably the catchup would be short compared to the rest of the work. The problem is that you need to upgrade the lock at the end of the operation. This is very deadlock prone, and likely to abort the whole operation just when it's going to finish. Is this a showstopper? Tom seems to think it is. I'm not sure anyone is going to be happy if they find that their two-day reindex was aborted just when it was going to finish. How about the following scenario for building a new index: - create an empty index - flag it as incomplete - commit it so it becomes visible to new transactions - new transactions will update the index when inserting / updating - the planner will not use it for queries because it is flagged as incomplete - wait until the index is visible to all running transactions - start a new seqscan and insert all records in the index - commit - remove the incomplete flag Wouldn't this overcome the lock upgrade problem? Jochem ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
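Jochem's two-phase scheme can be simulated in a few lines. This is a Python toy model with invented names, not how a catalog flag would actually be implemented; it only shows why no lock upgrade is needed when writers maintain the index from the moment it becomes visible:

```python
class IncompleteIndex:
    """Toy model of an index committed with an 'incomplete' flag: new
    writers maintain it immediately, and a later backfill scan adds the
    pre-existing rows."""

    def __init__(self):
        self.entries = set()
        self.complete = False           # planner ignores it while False

    def insert(self, tid):              # called by every new INSERT/UPDATE
        self.entries.add(tid)

    def backfill(self, old_rows):       # seqscan of rows that predate the index
        self.entries.update(old_rows)
        self.complete = True            # flag cleared: planner may now use it

idx = IncompleteIndex()
idx.insert(4)                           # a write arriving during the build
idx.backfill({1, 2, 3})                 # rows that existed before CREATE INDEX
```

The union of the two sources covers every row without ever blocking writers, which is the core of the idea.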
Re: [HACKERS] Reducing relation locking overhead
Simon Riggs [EMAIL PROTECTED] writes: CREATE INDEX uses SnapshotAny, so the scan that feeds the build could easily include rows added after the CREATE INDEX started. When the scan was exhausted we could mark that last TID and return to it after the sort/build. And do what? This has nothing to do with the fundamental problem of never being able to catch up unless you can upgrade your lock to exclude writes. What's worse, once you have excluded writes you have to rescan the entire table to be sure you haven't missed anything. So in the scenarios where this whole thing is actually interesting, ie enormous tables, you're still talking about a fairly long interval with writes locked out. Maybe not as long as a complete REINDEX, but long. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Optional postgres database not so optional in 8.1
I never saw a followup to this. Is someone working on a ping protocol extension, or should we revert pg_ctl to using template1 on the grounds that it does a poor man's ping anyway? cheers andrew Andrew Dunstan wrote: Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: I now notice that pg_ctl -w start fails if the postgres db is missing. I am not sure that changing pg_ctl to use this rather than template1 was a good thing, and it can't be overridden. I suggest we revert that particular change - it seems to me to confer little to no benefit, unlike the case with createdb etc. pg_ctl -w is already incredibly fragile because it needs a working password-free login name. Rather than worrying about whether the database name exists, what we ought to do is invent the long-awaited ping extension to the postmaster protocol --- something that would just ask are you up and ready to accept connections without having to specify a valid user *or* database name. You can sort of do this today if you are willing to examine the error message that comes back from the postmaster, but I think it'd be cleaner to have an official protocol extension. Actually, it looks like pg_ctl already does this: if ((conn = PQsetdbLogin(NULL, portstr, NULL, NULL, "postgres", NULL, NULL)) != NULL && (PQstatus(conn) == CONNECTION_OK || (strcmp(PQerrorMessage(conn), PQnoPasswordSupplied) == 0))) { PQfinish(conn); success = true; break; } ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[HACKERS] Spam 508
It appears that a bunch of spammers have invaded -hackers, claiming they can avoid the heat death of the universe in exchange for 2 bytes. Or something like that. :-) Even my Krull-powered Spam filters can't cope. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] generalizing the planner knobs
On Fri, 2005-12-02 at 11:53 +0100, Martijn van Oosterhout wrote: On Fri, Dec 02, 2005 at 11:07:06AM +0100, Csaba Nagy wrote: So for me the hint mechanism is good for telling the server that I'm not interested at all in the BEST plan but which risks getting very bad on occasions, but in a good enough plan which is safe. I'm wondering if long term another approach might be to have another parameter in the planner, cost_error or selectivity_error which is an indication of how accurate we think it is. So for example you have an index scan might cost x but with a possible error of 15% and the seqscan might cost y but with an error of 1%. The error for nested loop would be the product of the two inputs, whereas a merge join would be much less sensitive to error. A sort or hash join would react badly to large variations of input. So in cases where there is a choice between two indexscans with one slightly more expensive and more accurate but can result in a mergejoin would be a better choice than a possibly highly selective index but without accurate info that needs to be fed into a nested loop. Even though the latter might look better, the former is the safer option. I think this would solve the problem where people see sudden flip-flops between good and bad plans. The downside is that it's yet another parameter for the planner to get wrong. Measuring parameters more accurately is a lengthy experimental job, not a theoretical one. I think we are just waiting for someone to do this. Unfortunately, this is the kind of thing people write theses on and I don't think many people have the grounding in statistics to make it all work. I'd considered that before; it's just a lot of work. The theory of error propagation is straightforward: you just take the root mean square of the errors on the parameters. Trouble is, many of the planning parameters are just guesses, so you have no idea of the error estimates either. 
Hence you can't really calculate the error propagation accurately enough to make a sensible stab at risk control. But it would be useful sometimes, which is about the best it gets with the planner. Right now the worst part of the planner is: - the estimation of number of distinct values, which is an inherent statistical limitation - need for multi-column interaction statistics The two are somewhat related. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 6: explain analyze is your friend
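For what it's worth, the root-mean-square rule Simon mentions is simple to state numerically: for a product of independent estimates (such as the row counts feeding a nested loop), first-order relative errors combine in quadrature. A quick sketch (the function name is invented for illustration):

```python
import math

def product_rel_error(*rel_errors):
    """First-order relative error of a product of independent estimates:
    individual relative errors combine as the root of the sum of squares."""
    return math.sqrt(sum(e * e for e in rel_errors))

# Martijn's example: an index scan estimate 15% off fed into a nested loop
# with a seqscan 1% off -- the join essentially inherits the larger error
nested_loop_error = product_rel_error(0.15, 0.01)
```

Which is precisely the trouble Simon notes: the formula is trivial, but the input error bars are themselves guesses.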
Re: [HACKERS] generalizing the planner knobs
Is it possible to submit a hand-written or arbitrary execution plan to the retrieval engine? (That is, can one bypass the SQL parser and planner or optimizer and just provide instructions to nested loop join table a to table b ...) ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Spam 508
I haven't received any yet, that I can tell ... sure it's coming through the lists, and not around them? On Fri, 2 Dec 2005, Simon Riggs wrote: It appears that a bunch of spammers have invaded -hackers, claiming they can avoid the heat death of the universe in exchange for 2 bytes. Or something like that. :-) Even my Krull-powered Spam filters can't cope. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Reducing relation locking overhead
Jochem van Dieten [EMAIL PROTECTED] writes: How about the following scenario for building a new index: - create an empty index - flag it as incomplete - commit it so it becomes visible to new transactions - new transactions will update the index when inserting / updating - the planner will not use it for queries because it is flagged as incomplete - wait until the index is visible to all running transactions - start a new seqscan and insert all records in the index - commit - remove the incomplete flag Wouldn't this overcome the lock upgrade problem? Doesn't really solve the problem for REINDEX, though. Presumably, the reason that you are REINDEXing is that you would like to defragment the existing index. Well, that requires collecting all the index entries and sorting them. The above method is not going to produce a nicely sorted index; whatever entries get made on-the-fly during the first stage are going to determine the index shape. This same problem applies to the build-lock-catchup paradigm, although less severely since you can hope that the entries to be added on at the end are only a small part of the total and will fit in the excess space that you leave in the index leaf pages. If you spend too long catching up, though (as in the multiple-pass ideas that various people were advocating), you'll end up with an index messy enough that it's questionable why you bothered. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
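Tom's point about index shape can be illustrated with a toy leaf-page model (Python, invented; real btree behavior differs in many details): a sorted bulk load packs each leaf up to the chosen fill factor, while keys trickling in one at a time grow the index through midpoint splits and leave it markedly looser.

```python
import bisect

PAGE_CAPACITY = 8                       # keys per toy leaf page

def bulk_build_pages(keys, fillfactor=0.9):
    """Sorted bulk load: fill each leaf to the fill factor and move on."""
    per_page = max(1, int(PAGE_CAPACITY * fillfactor))
    return -(-len(keys) // per_page)    # ceiling division

def incremental_insert_pages(keys):
    """Insert keys one at a time; a full leaf splits into two half-full leaves."""
    pages = [[]]                        # leaves, each a sorted key list
    bounds = []                         # high key of every leaf except the last
    for k in keys:
        i = bisect.bisect_left(bounds, k)   # which leaf covers this key
        leaf = pages[i]
        bisect.insort(leaf, k)
        if len(leaf) > PAGE_CAPACITY:       # overflow: split at the midpoint
            mid = len(leaf) // 2
            pages[i:i + 1] = [leaf[:mid], leaf[mid:]]
            bounds.insert(i, leaf[mid - 1])
    return len(pages)

keys = list(range(500))
tight = bulk_build_pages(keys)          # packed, defragmented build
loose = incremental_insert_pages(keys)  # shape dictated by arrival order
```

The gap between the two page counts is the fragmentation a REINDEX is supposed to remove, which is why an index whose shape was set by on-the-fly entries defeats the purpose.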
[HACKERS] strange behavior (corruption?) of large production database
We have very strange behavior from an internal production database. There are multiple symptoms, all pointing to a problem with clusterwide tables. For example: [EMAIL PROTECTED]:~ psql -U postgres -p 5433 cyspec Welcome to psql 7.4.8, the PostgreSQL interactive terminal. Type: \copyright for distribution terms \h for help with SQL commands \? for help on internal slash commands \g or terminate with semicolon to execute query \q to quit cyspec=# select version(); version - PostgreSQL 7.4.8 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 3.3.3 (SuSE Linux) (1 row) [EMAIL PROTECTED]:~ psql -l List of databases Name | Owner | Encoding +--+--- cyspec | postgres | SQL_ASCII temp_mike | postgres | SQL_ASCII temp_mike_new | postgres | SQL_ASCII temp_mike_orig | postgres | SQL_ASCII template0 | postgres | SQL_ASCII template1 | postgres | SQL_ASCII (6 rows) cyspec=# select * from pg_database; datname | datdba | encoding | datistemplate | datallowconn | datlastsysoid | datvacuumxid | datfrozenxid | datpath | datconfig | datacl -++--+---+--+---+--+--+-+---+ (0 rows) cyspec=# \l List of databases Name | Owner | Encoding --+---+-- (0 rows) No databases found. Additionally: cyspec=# select usename, usesysid from pg_shadow; usename | usesysid --+-- postgres |1 colxl| 102 colro| 101 l400509 | 105 (4 rows) cyspec=# \c - colprod You are now connected as new user colprod. cyspec= \c - colxl You are now connected as new user colxl. cyspec= \c - colprod You are now connected as new user colprod. cyspec= \c - zxcvvb FATAL: user zxcvvb does not exist Previous connection kept The colprod user has disappeared from pg_shadow (there was one previously, and it was never intentionally dropped), but I can still connect with that user. The current problem was actually initially found because pg_dump complained that the owner of the colprod schema didn't exist. 
One more thing: cyspec=# show wal_sync_method; wal_sync_method - fdatasync (1 row) That works, but SHOW ALL and select * from pg_settings; return lines and lines of nothing in psql. (I mean literally blank lines, not even (0 rows)) After issuing \o /tmp/filename the output is all there, and looks normal. The oddness was first noticed about 3 days after a maintenance shutdown. As far as I have been told, during the maintenance window, there may have been OS level package upgrades, and there was a firmware upgrade done on the storage subsystem (NetApp). Any advice at what to look at/do would be appreciated. This database is somewhere around 1.1 TB in size, so dump and reload is not something we're anxious to do. Joe ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Spam 508
Marc G. Fournier [EMAIL PROTECTED] writes: On Fri, 2 Dec 2005, Simon Riggs wrote: Even my Krull-powered Spam filters can't cope. I haven't received any yet, that I can tell ... sure its coming through the lists, and not around them? I think Simon is complaining about the length of the -patches thread, not the -hackers one ... regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] strange behavior (corruption?) of large production database
Joe Conway [EMAIL PROTECTED] writes: We have very strange behavior from an internal production database. There are multiple symptoms, all pointing to a problem with clusterwide tables. For example: psql -l really should produce the same results as doing \l in the template1 database. Does it? If so, the next thing to look at is probably whether the pg_class and pg_attribute entries for pg_database look the same in template1 and in cyspec. Similarly you could compare what pg_shadow looks like from different databases, and what the local system tables have as entries for it. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Spam 508
On 12/2/2005 6:19 PM, Marc G. Fournier wrote: I haven't received any yet, that I can tell ... sure its coming through the lists, and not around them? Some Tom Lane guy and a bunch of other well known addresses sent it. Could be forged From fields though ;-) Jan On Fri, 2 Dec 2005, Simon Riggs wrote: It appears that a bunch of spammers have invaded -hackers, claiming they can avoid the heat death of the universe in exchange for 2 bytes. Or something like that. :-) Even my Krull-powered Spam filters can't cope. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Spam 508
*roll eyes* It's not even April Fools yet ... On Fri, 2 Dec 2005, Jan Wieck wrote: On 12/2/2005 6:19 PM, Marc G. Fournier wrote: I haven't received any yet, that I can tell ... sure its coming through the lists, and not around them? Some Tom Lane guy and a bunch of other well known addresses sent it. Could be forged From fields though ;-) Jan On Fri, 2 Dec 2005, Simon Riggs wrote: It appears that a bunch of spammers have invaded -hackers, claiming they can avoid the heat death of the universe in exchange for 2 bytes. Or something like that. :-) Even my Krull-powered Spam filters can't cope. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664 ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] strange behavior (corruption?) of large production database
Joe Conway [EMAIL PROTECTED] writes: So they agree in template1 and cyspec databases. OK, in that case I'd wonder about whether you've suffered XID wraparound in pg_database and/or pg_shadow. The typical symptom of this is that entries are valid from the system's point of view but not visible to queries, and that seems to be what you have. If so, a restart will NOT fix it. You could try a VACUUM FREEZE on pg_database though. Before doing that, I'd suggest looking at the rows' xmin values (use pg_filedump or grovel through the bits by hand) to confirm the wraparound theory. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
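For readers unfamiliar with the failure mode Tom describes: PostgreSQL compares 32-bit transaction IDs in modulo-2^32 arithmetic (the logic of TransactionIdPrecedes, shown here in simplified form; the real code also special-cases the permanent/bootstrap XIDs). Once a never-frozen tuple's xmin falls more than 2^31 transactions behind, it abruptly compares as "in the future" and vanishes from query results:

```python
def xid_precedes(id1, id2):
    """True if id1 is logically older than id2, comparing mod 2^32;
    equivalent to signed 32-bit (id1 - id2) < 0."""
    diff = (id1 - id2) & 0xFFFFFFFF
    return diff >= 0x80000000

OLD_XMIN = 1000                          # row inserted long ago, never frozen

# shortly after insertion: xmin is safely in the past, so the row is visible
recent = xid_precedes(OLD_XMIN, 5000)

# more than 2^31 transactions later: the same xmin now compares as future,
# and the row silently drops out of queries until a VACUUM freezes it
much_later = (OLD_XMIN + 2**31 + 1) % 2**32
ancient = xid_precedes(OLD_XMIN, much_later)
```

This is why the rows remain valid to the system (the tuples are physically intact) while being invisible to queries, and why a restart cannot help but a VACUUM can.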
Re: [HACKERS] strange behavior (corruption?) of large production
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: So they agree in template1 and cyspec databases. OK, in that case I'd wonder about whether you've suffered XID wraparound in pg_database and/or pg_shadow. The typical symptom of this is that entries are valid from the system's point of view but not visible to queries, and that seems to be what you have. If so, a restart will NOT fix it. You could try a VACUUM FREEZE on pg_database though. Before doing that, I'd suggest looking at the rows' xmin values (use pg_filedump or grovel through the bits by hand) to confirm the wraparound theory. Talking to the maintainer of this cluster, it sounds like XID wraparound could be the problem. I thought they were running database wide vacuums at some regularity, but apparently they are only vacuuming specific production tables. Since this is a production machine, putting pg_filedump on it may be problematic -- if I grovel through the bits by hand, can you give me a hint about what to look for? Thanks, Joe ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] strange behavior (corruption?) of large production database
Joe Conway [EMAIL PROTECTED] writes: Tom Lane wrote: You could try a VACUUM FREEZE on pg_database though. Since this is a production machine, putting pg_filedump on it may be problematic -- if I grovel through the bits by hand, can you give me a hint about what to look for? How about you copy off the pg_database file to someplace where it's OK to run pg_filedump? Doing that by hand is mighty tedious. BTW, forget the FREEZE part, just VACUUM: http://archives.postgresql.org/pgsql-general/2005-11/msg00097.php regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Optimizer oddness, possibly compounded in 8.1
Is it intentional that your test case omits an analyze on t2? No; my mistake. (The larger point that joins of inheritance unions aren't well-planned is true, but it's always been true...) It also seems to have a problem with unions in views. Is there anything that can be done about this -- workarounds etc? Any plans to address it? We've got a couple of places where it's beginning to bite us due to growth of tables. ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Optimizer oddness, possibly compounded in 8.1
Philip Warner [EMAIL PROTECTED] writes: (The larger point that joins of inheritance unions aren't well-planned is true, but it's always been true...) It also seems to have a probkem with unions in views. Is there anything that can be done about this -- workarounds etc? Any plans to address it? We've got a couple of places where it's beginning to bite us due to growth of tables. It's something that's on the ever-growing TODO list ... I dunno if anyone has any near-term plans to work on it. It'd definitely be nice to teach the planner to do joins-over-unions well, and then make inheritance just invoke that behavior instead of being a crocky special case. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] strange behavior (corruption?) of large production database
Joe Conway [EMAIL PROTECTED] writes: Since this database has many large, but static tables (i.e. new data is loaded each day, but the tables are partitioned into year-month tables), I'm thinking we can run VACUUM FREEZE on the whole database once, and then run VACUUM FREEZE periodically on just the tables that have had rows added since the last time -- is that correct? That would work for the user tables, but the lesson to draw from this is not to forget about the system catalogs ... regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] strange behavior (corruption?) of large production
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: Since this database has many large, but static tables (i.e. new data is loaded each day, but the tables are partitioned into year-month tables), I'm thinking we can run VACUUM FREEZE on the whole database once, and then run VACUUM FREEZE periodically on just the tables that have had rows added since the last time -- is that correct? That would work for the user tables, but the lesson to draw from this is not to forget about the system catalogs ... Right, I was just thinking about that too. Important safety tip :-) Thanks again! Joe ---(end of broadcast)--- TIP 6: explain analyze is your friend
[HACKERS] Building Windows Server Extensions Using VC++ 2005
Hi everyone, I've been able to successfully build server extensions using Visual Studio 2005 for Windows Postgresql 8.1. However, it took a few tweaks which I thought I should document (maybe these issues could be fixed in future postgresql versions?): 1. There is no lib file for VC++ to link against. This can be created easily enough by going to src/backend directory and running: lib /MACHINE:x86 /NAME:postgres.exe /DEF:postgres.def Note the use of /NAME to tell VC++ it is linking against an executable and not a dll. It would be nice if this lib file was automatically installed on windows when installing postgresql. 2. Requirement on strings.h In c.h:69 there is this code: #ifdef HAVE_STRINGS_H #include <strings.h> #endif In pg_config.h:405 this is defined: /* Define to 1 if you have the strings.h header file. */ #define HAVE_STRINGS_H 1 However, Visual Studio 2005 does not include this file. For a workaround I simply added it but that's a bit of a hack. 3. This is a bigger issue, and took a while to figure out. If you try to use the Version 1 calling convention, your function will be called but if you try to do anything with the passed in parameters a segmentation fault will occur. If you use the Version 0 calling convention things work fine. The problem is if you use PG_FUNCTION_INFO_V1 postgres does not see the generated function because it is not exported from the dll and thus assumes the Version 0 calling convention when in fact your function is using Version 1. The problem is in fmgr.h:298 #define PG_FUNCTION_INFO_V1(funcname) \ extern Pg_finfo_record * CppConcat(pg_finfo_,funcname) (void); \ ... For windows to export this function it must be: extern __declspec(dllexport) Pg_finfo_record * CppConcat(pg_finfo_,funcname) (void); Would it be possible to add a DLLEXPORT macro here to fix this? 4. Last, and the biggest issue, if my function calls pfree it blows up. 
What is happening is that pfree somehow ends up pointing to a random spot in memory - thus when you try to call it you execute invalid code (in fact you never execute pfree at all as far as I can see). I worked around this by using pgport_pfree which does work. Haven't a clue why... Here is the assembly for successfully calling pgport_pfree: pgport_pfree(fileName); 100112D3 mov eax,dword ptr [ebp-0Ch] 100112D6 push eax 100112D7 call 100110C3 100112DC add esp,4 100110C3 jmp 1001131A pgport_pfree: 1001131A jmp dword ptr ds:[10016288h] 005CF140 push ebp 005CF141 mov ebp,esp 005CF143 sub esp,8 005CF146 mov eax,dword ptr [ebp+8] 005CF149 mov dword ptr [esp+4],eax 005CF14D mov eax,dword ptr ds:[006A9F94h] 005CF152 mov dword ptr [esp],eax 005CF155 call 005CF0D0 005CF15A leave 005CF15B ret And here is pfree. Note at the end the code tries to execute db at 005E1560 causing a segmentation fault. pfree( fileName ); 100112D3 mov eax,dword ptr [ebp-0Ch] 100112D6 push eax 100112D7 call 1001110E 100112DC add esp,4 1001110E jmp 10011238 pfree: 10011238 jmp dword ptr ds:[1001628Ch] 005E1560 db ffh Hope this helps others. It would be great if building postgresql server extensions with VC++ worked out of the box on Windows (in addition of course to using MinGW) since I think it would open up a wider audience. Charlie
Re: [HACKERS] strange behavior (corruption?) of large production database
Joe Conway [EMAIL PROTECTED] writes: cyspec=# vacuum freeze pg_catalog.pg_class; ERROR: failed to re-find parent key in pg_class_relname_nsp_index It seems that we cannot vacuum pg_class, because vacuum itself fails. Any suggestions on how to bootstrap the fixing of pg_class? REINDEX? regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] strange behavior (corruption?) of large production
Tom Lane wrote: Joe Conway [EMAIL PROTECTED] writes: cyspec=# vacuum freeze pg_catalog.pg_class; ERROR: failed to re-find parent key in pg_class_relname_nsp_index It seems that we cannot vacuum pg_class, because vacuum itself fails. Any suggestions on how to bootstrap the fixing of pg_class? REINDEX? As usual, sage advice. That worked -- thanks. Joe ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Optimizer oddness, possibly compounded in 8.1
Tom Lane wrote: It's something that's on the ever-growing TODO list ... I dunno if anyone has any near-term plans to work on it. It'd definitely be nice to teach the planner to do joins-over-unions well, and then make inheritance just invoke that behavior instead of being a crocky special case. Sounds good; currently if you use the polymorphism of inherited tables, and happen to cross 2 such tables, you get O(n^2) performance. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster