Re: [PATCHES] Remove all trace of EXPLAIN EXECUTE
On Tue, 2005-08-09 at 18:50 -0400, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: There once was a rumour of a EXPLAIN EXECUTE command. This minor patch removes all trace of that, but without disturbing other valid occurrences of the EXECUTE command, which still lives on. It's not as dead as you seem to think. regression=# prepare foo as select * from tenk1; PREPARE regression=# explain execute foo; QUERY PLAN - Seq Scan on tenk1 (cost=0.00..458.00 rows=1 width=244) (1 row) regression=# Got me! LOL Patch retracted. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[PATCHES] 5 new entries for FAQ
[Sorry for the duplicate post, sent to wrong list first time] Hi, After going through pgsql-general a bit I figured there were a few important questions missing from the FAQ, so I wrote some. Comments welcome. I can write more, if people can suggest things to write about. I was thinking something about collation and locales but I'm sure sure I understand them myself. Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. Index: FAQ.html === RCS file: /projects/cvsroot/pgsql/doc/src/FAQ/FAQ.html,v retrieving revision 1.310 diff -u -r1.310 FAQ.html --- FAQ.html30 May 2005 13:11:06 - 1.310 +++ FAQ.html10 Aug 2005 07:34:41 - @@ -116,7 +116,11 @@ does not exist errors when accessing temporary tables in PL/PgSQL functions?BR A href=#4.204.20/A) What replication solutions are available?BR - + A href=#4.214.21/A) How can I make PostgreSQL compare strings ignoring case like some other databases?BR + A href=#4.224.22/A) Why are PostgreSQL tables names case-sensitive?BR + A href=#4.234.23/A) Why is PostgreSQL only using one CPU to execute my query?BR + A href=#4.244.24/A) How can I arrange to have a query run at a certain time of day? Regular intervals?BR + A href=#4.254.25/A) What does 'index row size ... exceeds btree maximum, 2713' mean? HR @@ -1000,5 +1004,114 @@ PThere are also commercial and hardware-based replication solutions available supporting a variety of replication models./P + +H3A name=4.214.21/A) How can I make PostgreSQL compare strings +ignoring case like some other databases?/H3 + +PAll text types in PostgreSQL are case-sensitive. This is generally +what you want. If you have an identifier in a table that must be a +particular format, it's better to use a SMALLCHECK/SMALL constraint. +If you want to avoid duplicates that differ only in case, create a +UNIQUE index like so: + +PRE +CREATE UNIQUE INDEX mytable_myfield ON mytable ( lower(myfield) ); +/PRE + +PIf you want to do case-insensitive searches then that index can help +you also (in a non-unique form). For regular expressions, see A +href=#4.8FAQ 4.8/A. + +PIf you really need to have a column that is case-insensitive (perhaps +for porting) there is a project on A +href=http://gborg.postgresql.org/;GBorg/A called A + href=http://gborg.postgresql.org/project/citext/projdisplay.php;citext/A +which can give you a case insensitive text data type. + +H3A name=4.224.22/A) Why are PostgreSQL table names case-sensitive?/H3 + +PActually, what happens is that unquoted identifiers are folded to a +single case before comparison. The SQL standard requires folding to +uppercase, PostgreSQL folds to lowercase. What this means is that the +identifiers SMALLMyTable/SMALL, SMALLMytable/SMALL and +SMALLmytable/SMALL all compare the same. But +SMALLMyTable/SMALL, SMALLMytable/SMALL and +SMALLMYTABLE/SMALL are all different, even in totally SQL +compliant databases. + +PThe usual cause for this complaint is that people use a GUI admin +interface to create their tables and it has helpfully quoted the table +and field names so that only identical quoting everywhere else can match +it. + +PA good rule of thumb is to either balways/b quote your identifiers +(in which case you must get the case right all the time) or bnever/b +quote them (in which case the case never matters). + +PIf this is your problem, then you can use your admin interface to rename +all the columns to all-lowercase names. Or execute queries like those +below. + +PRE +ALTER TABLE MyTable RENAME TO MyTable; +ALTER TABLE MyTable RENAME MyField TO MyField; +/PRE + +H3A name=4.234.23/A) Why is PostgreSQL only using one CPU to execute my query?/H3 + +PThe PostgreSQL process model has one single threaded backend handling +each client. This means that multiple queries from different clients +will have their loads spread out as determined most appropriate by the +operating system. But within a single query there is only a single +thread of execution, so a single query is run only on one CPU (though +the OS can move it around, PostgreSQL is unaware of this). + +PUsually this is not a problem as the system is generally I/O bound +(limited by disk speed), not CPU bound. However, for some unusual +workloads you may need to think about connecting more than once to the +same database so you can have different portions work in parallel. + +H3A name=4.244.24/A) How can I arrange to have a query run at a certain time of day? Regular
Re: [PATCHES] COPY FROM performance improvements
Simon, That part of the code was specifically written to take advantage of processing pipelines in the hardware, not because the actual theoretical algorithm for that approach was itself faster. Yup, good point. Nobody's said what compiler/hardware they have been using, so since both Alon and Tom say their character finding logic is faster, it is likely to be down to that? Name your platforms gentlemen, please. In this case, we've been using gcc (3.2.3 RHEL3 Linux, 3.4.3 Solaris 10) on Opteron and Intel Xeon and Pentium 4. Alon's performance comparisons for the parse only were done on a HT enabled P4 3.0GHz on RHEL3 with gcc 3.2.3, probably with optimization -O2, but possibly -O3. Note that the level of microparallelism on upcoming CPUs is increasing with increasing pipeline depth. Though there will be a step back on the Intel line with the introduction of the Centrino-based Xeon cores in 2006/7, other CPUs continue the trend, and I expect the next generation of multi-core CPUs to possibly introduce threaded micro-architectures which can also be scheduled as pipelines. The gcc 4 compiler introduces auto vectorization, which may enhance the optimality of some loops. I think the key thing is to make as much parallelism apparent to the compiler as possible, which will generally mean loops. This means faster code on all modern CPUs and it won't hurt older CPU speeds. My feeling is that we may learn something here that applies more widely across many parts of the code. Yes, I think one thing we've learned is that there are important parts of the code, those that are in the data path (COPY, sort, spill to disk, etc) that are in dire need of optimization. For instance, the fgetc() pattern should be banned everywhere in the data path. BTW - we are tracking down (in our spare time :-() the extremely slow sort performance. We're seeing sort times of 1.7MB/s on our fastest machines, even when the work_mem is equal to the square root of the sort set. This is a *serious* problem for us and we aren't getting to it - ideas are welcome. Optimization here means both the use of good fundamental algorithms and micro-optimization (minimize memory copies, expose long runs of operations to the compiler, maximize computational intensity by working in cache-resident blocks, etc). - Luke ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] 5 new entries for FAQ
PIf you're really desparate, you can increase the pagesize a bit (to Should be desperate, not desparate. ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] 5 new entries for FAQ
On Wed, 2005-08-10 at 04:04, Martijn van Oosterhout wrote: [Sorry for the duplicate post, sent to wrong list first time] Hi, After going through pgsql-general a bit I figured there were a few important questions missing from the FAQ, so I wrote some. Comments welcome. I can write more, if people can suggest things to write about. I was thinking something about collation and locales but I'm sure sure I understand them myself. I might suggest adding links to the relevant portions of the docs, for example add a link to http://www.postgresql.org/docs/8.0/static/sql-syntax.html#SQL-SYNTAX-IDENTIFIERS Robert Treat -- Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] 5 new entries for FAQ
On Wed, Aug 10, 2005 at 10:04:23AM +0200, Martijn van Oosterhout wrote: Comments welcome. I can write more, if people can suggest things to write about. I was thinking something about collation and locales but I'm sure sure I understand them myself. I'd really love to see a QA for encodings, recoding, and I see strange characters. Not sure how to phrase the question though. -- Alvaro Herrera (alvherre[a]alvh.no-ip.org) Java is clearly an example of a money oriented programming (A. Stepanov) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] Fix oversight in pts_error_callback()
Qingqing Zhou [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes Please exhibit a case in which you feel this is needed. Suppose I want to print a debug info in parseTypeString() like this: + elog(DEBUG1, parse type %s, buf.data); raw_parsetree_list = raw_parser(buf.data); That's a contrived example (and if I believed it, I would think that the right answer is to emit no errcontext if the elevel is less than ERROR). Give me an actual use case in which the patch gives a better rather than worse error report. I think for most people it would just obfuscate the message. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [PATCHES] 5 new entries for FAQ
On Wed, Aug 10, 2005 at 11:17:40AM -0400, Alvaro Herrera wrote: On Wed, Aug 10, 2005 at 10:04:23AM +0200, Martijn van Oosterhout wrote: Comments welcome. I can write more, if people can suggest things to write about. I was thinking something about collation and locales but I'm sure sure I understand them myself. I'd really love to see a QA for encodings, recoding, and I see strange characters. Not sure how to phrase the question though. I think you could write a whole section just on them and all the issues on various platforms. But having never dealt with a system with multiple languages / encodings I'm not sure I really understand the issues. You know, like: Encoding / character sets gotchas / recommendations: Languages: Asian European Programming: Perl Python Java ODBC Regular expressions Full text indexing etc... Platforms: Windows UNIX etc... The main thing I wonder about is does UTF-8 handle all characters anybody would want to use. I've been told it doesn't for Asian languages, in which case I don't see how this is a solvable problem anyway. I've collected quite a few comments from other people, so I'll post a slightly revised patch later. -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a tool for doing 5% of the work and then sitting around waiting for someone else to do the other 95% so you can sue them. pgpDnETFKBczb.pgp Description: PGP signature
Re: [PATCHES] COPY FROM performance improvements
Simon Riggs [EMAIL PROTECTED] writes: Nobody's said what compiler/hardware they have been using, so since both Alon and Tom say their character finding logic is faster, it is likely to be down to that? Name your platforms gentlemen, please. I tested on HPPA with gcc 2.95.3 and on a Pentium 4 with gcc 3.4.3. Got pretty much the same results on both. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] COPY FROM performance improvements
Luke Lonergan [EMAIL PROTECTED] writes: Yes, I think one thing we've learned is that there are important parts of the code, those that are in the data path (COPY, sort, spill to disk, etc) that are in dire need of optimization. For instance, the fgetc() pattern should be banned everywhere in the data path. Luke, I dislike whacking people upside the head, but this discussion seems to presume that raw speed on Intel platforms is the only thing that matters. We have a few other concerns. Portability, readability, maintainability, and correctness all trump platform-specific optimizations. The COPY patch as presented lost badly on all those counts, and you are lucky that it didn't get rejected completely. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] COPY FROM performance improvements
On Wed, Aug 10, 2005 at 09:16:08AM -0700, Luke Lonergan wrote: On 8/10/05 8:37 AM, Tom Lane [EMAIL PROTECTED] wrote: Luke, I dislike whacking people upside the head, but this discussion seems to presume that raw speed on Intel platforms is the only thing that matters. We have a few other concerns. Portability, readability, maintainability, and correctness all trump platform-specific optimizations. The COPY patch as presented lost badly on all those counts, and you are lucky that it didn't get rejected completely. It's a pleasure working with you too Tom :-) Until you present a result on platform that is faster than Alon's in the code that was modified, our proof still stands that his is 20% faster than yours. AFAIR he never claimed otherwise ... his point was that to gain that additional speedup, the code has to be made considerable worse (in maintenability terms.) Have you (or Alon) tried to port the rest of the speed improvement to the new code? Maybe it's possible to have at least some of it without worsening the maintenability too badly. Another question that comes to mind is: have you tried another compiler? I see you are all using GCC at most 3.4; maybe the new optimizing infrastructure in GCC 4.1 means you can have most of the speedup without uglifying the code. What about Intel's compiler? PostgreSQL needs major improvement to compete with Oracle and even MySQL on speed. No whacking on the head is going to change that. Certainly. I think the point is what cost do we want to pay for the speedup. I think we all agree that even if we gain a 200% speedup by rewriting COPY in assembly, it's simply not acceptable. Another point may be that Bizgres can have a custom patch for the extra speedup, without inflicting the maintenance cost on the community. -- Alvaro Herrera (alvherre[a]alvh.no-ip.org) La libertad es como el dinero; el que no la sabe emplear la pierde (Alvarez) ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] COPY FROM performance improvements
Luke Lonergan wrote: Tom, On 8/10/05 8:37 AM, Tom Lane [EMAIL PROTECTED] wrote: Luke, I dislike whacking people upside the head, but this discussion seems to presume that raw speed on Intel platforms is the only thing that matters. We have a few other concerns. Portability, readability, maintainability, and correctness all trump platform-specific optimizations. The COPY patch as presented lost badly on all those counts, and you are lucky that it didn't get rejected completely. It's a pleasure working with you too Tom :-) Until you present a result on platform that is faster than Alon's in the code that was modified, our proof still stands that his is 20% faster than yours. Well, we could write it in assembler and make it even faster. :-) I assume no one is suggesting that, so in such cases, we need to weigh readability with performance. I have not looked at the patch issues, but usually loop unrolling is the opposite of readability, so we have to make a tradeoff. We have used macros in places where function call overhead is a major hit, so we can consider loop unrolling in places that are a major performance hit. The macros we have used have maintained the readability of the function call (unless you look at the macro contents) so perhaps the optimizations you suggest can be done with a similar eye to readability. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] COPY FROM performance improvements
Also, as we proved the last time the correctness argument was thrown in, we can fix the bugs and still make it a lot faster - and I would stick to that whether it's a PA-RISC, DEC Alpha, Intel or AMD or event Ultra Sparc. Luke this comment doesn't work. Do you have a test case that shows that on an Ultra Sparc or PPC that you are accurate? Both of these CPUs are pretty big players in the Enterprise space. PostgreSQL needs major improvement to compete with Oracle and even MySQL on speed. No whacking on the head is going to change that. I am going to assume that you forgot to clarify this statement with IN BULK LOADING, because if you didn't I would like to see your test results. My very real life experience shows that MySQL can't not keep up with PostgreSQL under load. Nobody here argues that PostgreSQL needs to improve. If it didn't need to improve I would be out of business because it would be perfect. Sincerely, Joshua D. Drake - Luke ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240 PostgreSQL Replication, Consulting, Custom Programming, 24x7 support Managed Services, Shared and Dedicated Hosting Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [PATCHES] COPY FROM performance improvements
Alvaro Herrera wrote: Another question that comes to mind is: have you tried another compiler? I see you are all using GCC at most 3.4; maybe the new optimizing infrastructure in GCC 4.1 means you can have most of the speedup without uglifying the code. What about Intel's compiler? EnterpriseDB mentioned on their web page they use the Intel compiler, so I assume they saw a speedup by using it: http://www.enterprisedb.com/edb-db.do -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [PATCHES] COPY FROM performance improvements
Alvaro, On 8/10/05 9:46 AM, Alvaro Herrera [EMAIL PROTECTED] wrote: AFAIR he never claimed otherwise ... his point was that to gain that additional speedup, the code has to be made considerable worse (in maintenability terms.) Have you (or Alon) tried to port the rest of the speed improvement to the new code? Maybe it's possible to have at least some of it without worsening the maintenability too badly. As I suggested previously, there is another, more maintainable way to get more performance from the parsing logic. It involves replacing something like this: char c = input_routine() if (c == '\n') { else if ( . . . } With something like this: char [32] carr; nread = Input_routine_new(carr,32) for (i=0; inread; i++) { if (carr[I] == '\n') { . . . } And this section would run much faster (3x?). This is what I think could make the overall patch 50% faster than it is now (on the parsing part). The issue that I expect we'll hear about is that since the parsing is already 500% faster, it has vanished in the profile. That's why Tom's testing is not showing much difference between his and Alon's code, we actually drop the other sections to bring it forward where we see the bigger difference. However, what I'm arguing here and elsewhere is that there's still a lot more of this kind of optimization to be done. 12 MB/s COPY speed is not enough. There's 40% of the time in processing left to smack down. Another question that comes to mind is: have you tried another compiler? I see you are all using GCC at most 3.4; maybe the new optimizing infrastructure in GCC 4.1 means you can have most of the speedup without uglifying the code. What about Intel's compiler? We have routinely distributed PostgreSQL with the Intel compiler, up until recently. Interestingly, GCC now beats it handily in our tests on Opteron and matches it on Xeon, which is too bad - it's my fav compiler. The problem with this code is that it doesn't have enough micro-parallelism without loops on the character parsing core. The compiler can only do register optimizations and branch prediction (poorly) unless it is given more to work with. PostgreSQL needs major improvement to compete with Oracle and even MySQL on speed. No whacking on the head is going to change that. Certainly. I think the point is what cost do we want to pay for the speedup. I think we all agree that even if we gain a 200% speedup by rewriting COPY in assembly, it's simply not acceptable. Understood, and I totally agree. Another point may be that Bizgres can have a custom patch for the extra speedup, without inflicting the maintenance cost on the community. We are committed to making Postgres the best DBMS for Business Intelligence. Bizgres makes it safe for businesses to rely on open source for their production uses. As far as features go, I think the best way for our customers is to make sure that Bizgres features are supporting the PostgreSQL core and vis-versa. - Luke ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] COPY FROM performance improvements
On Wed, Aug 10, 2005 at 12:57:18PM -0400, Bruce Momjian wrote: Alvaro Herrera wrote: Another question that comes to mind is: have you tried another compiler? I see you are all using GCC at most 3.4; maybe the new optimizing infrastructure in GCC 4.1 means you can have most of the speedup without uglifying the code. What about Intel's compiler? EnterpriseDB mentioned on their web page they use the Intel compiler, so I assume they saw a speedup by using it: http://www.enterprisedb.com/edb-db.do Yes, but notice these guys are Greenplum, not EDB. -- Alvaro Herrera (alvherre[a]alvh.no-ip.org) Hay dos momentos en la vida de un hombre en los que no debería especular: cuando puede permitírselo y cuando no puede (Mark Twain) ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [PATCHES] [HACKERS] Autovacuum loose ends
Updated this patch again: - vacuum_cost_delay and vacuum_cost_limit can be set per table, as well as globally with autovacuum_vacuum_cost_{limit,delay} - pgstat is reset if recovery is required - pgstat reset at postmaster start is disabled by default - Xid-wraparound VACUUM is now FREEZE without ANALYZE, iff the database has datallowconn=false or datistemplate=true - A database-wide vacuum is also issued if the vacuumxid is found to be very old. Note that I had to add datvacuumxid to the pg_database flat file. -- Alvaro Herrera (alvherre[a]alvh.no-ip.org) Ciencias políticas es la ciencia de entender por qué los políticos actúan como lo hacen (netfunny.com) Index: doc/src/sgml/catalogs.sgml === RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/catalogs.sgml,v retrieving revision 2.110 diff -c -r2.110 catalogs.sgml *** doc/src/sgml/catalogs.sgml 31 Jul 2005 17:19:16 - 2.110 --- doc/src/sgml/catalogs.sgml 7 Aug 2005 19:03:22 - *** *** 1197,1202 --- 1197,1216 entryMultiplier for reltuples to add to structfieldanl_base_thresh//entry /row + +row + entrystructfieldvac_cost_delay/structfield/entry + entrytypeinteger/type/entry + entry/entry + entryCustom variablevacuum_cost_delay/ parameter/entry +/row + +row + entrystructfieldvac_cost_limit/structfield/entry + entrytypeinteger/type/entry + entry/entry + entryCustom variablevacuum_cost_limit/ parameter/entry +/row /tbody /tgroup /table *** *** 1217,1223 para Any of the numerical fields can contain literal-1/ (or indeed any negative value) to indicate that the system-wide default should !be used for this particular value. /para /sect1 --- 1231,1241 para Any of the numerical fields can contain literal-1/ (or indeed any negative value) to indicate that the system-wide default should !be used for this particular value. Observe that the !structfieldvac_cost_delay/ variable inherits its default value from the !varnameautovacuum_vacuum_cost_delay/ configuration parameter, !or from varnamevacuum_cost_delay/ if the former is set to a negative !value. The same applies to structfieldvac_cost_limit/. /para /sect1 Index: doc/src/sgml/runtime.sgml === RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/runtime.sgml,v retrieving revision 1.341 diff -c -r1.341 runtime.sgml *** doc/src/sgml/runtime.sgml 30 Jul 2005 17:15:35 - 1.341 --- doc/src/sgml/runtime.sgml 31 Jul 2005 23:22:11 - *** *** 3399,3404 --- 3399,3436 /listitem /varlistentry + varlistentry id=guc-autovacuum-vacuum-cost-delay xreflabel=autovacuum_vacuum_cost_delay + termvarnameautovacuum_vacuum_cost_delay/varname (typeinteger/type)/term + indexterm +primaryvarnameautovacuum_vacuum_cost_delay/ configuration parameter/primary + /indexterm + listitem +para + Specifies the default value that will be applied to each + commandVACUUM/ operation, for tables that do not have + a default value set in structnamepg_autovacuum/. If a + negative value is specified (like the default value of -1), + the varnamevacuum_cost_delay/ value will be applied instead. +/para + /listitem + /varlistentry + + varlistentry id=guc-autovacuum-cost-limit xreflabel=autovacuum_vacuum_cost_limit + termvarnameautovacuum_vacuum_cost_limit/varname (typeinteger/type)/term + indexterm +primaryvarnameautovacuum_vacuum_cost_limit/ configuration parameter/primary + /indexterm + listitem +para + Specifies the default value that will be applied to each + commandVACUUM/ operation, for tables that do not have + a default value set in structnamepg_autovacuum/. If a + negative value is specified (like the default value of -1), + the varnamevacuum_cost_limit/ value will be applied instead. +/para + /listitem + /varlistentry + /variablelist /sect2 Index: src/backend/access/transam/xlog.c === RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v retrieving revision 1.214 diff -c -r1.214 xlog.c *** src/backend/access/transam/xlog.c 30 Jul 2005 14:15:44 - 1.214 --- src/backend/access/transam/xlog.c 31 Jul 2005 21:36:45 - *** *** 33,38 --- 33,39 #include catalog/catversion.h #include catalog/pg_control.h #include miscadmin.h + #include pgstat.h #include postmaster/bgwriter.h #include storage/bufpage.h #include storage/fd.h ***
Re: [PATCHES] (was COPY FROM) performance improvements
On Wed, 2005-08-10 at 11:37 -0400, Tom Lane wrote: Luke Lonergan [EMAIL PROTECTED] writes: Yes, I think one thing we've learned is that there are important parts of the code, those that are in the data path (COPY, sort, spill to disk, etc) that are in dire need of optimization. For instance, the fgetc() pattern should be banned everywhere in the data path. this discussion seems to presume that raw speed on Intel platforms is the only thing that matters. We have a few other concerns. Portability, readability, maintainability, and correctness all trump platform-specific optimizations. I am interested in the general principle here, not this specific case. As you know, I have raised the need for specific hardware tuning in certain critical areas on a number of occasions. I very much respect the need for all of the other aspects of code quality mentioned. Pipeline parallelism is a feature of all modern CPUs since the Pentium, not just Intel's. I think judicious exploitation of hardware features that are common to multiple hardware architectures would be of considerable benefit to everybody. We do already exploit some common hardware tuning recommendations, such as buffer word alignment, but not others such as false sharing avoidance and pipeline parallelism of key loops. (There may be others...) I say judicious because I do not presume that I am the judge ... but I hope that judgements in these areas can fall towards the side of greater performance as often as possible. Hardware and OS do exist, much as I would prefer the simplicity of life in a layered IT architecture. Best Regards, Simon Riggs ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] (was COPY FROM) performance improvements
Simon Riggs wrote: As you know, I have raised the need for specific hardware tuning in certain critical areas on a number of occasions. I very much respect the need for all of the other aspects of code quality mentioned. Pipeline parallelism is a feature of all modern CPUs since the Pentium, not just Intel's. I think judicious exploitation of hardware features that are common to multiple hardware architectures would be of considerable benefit to everybody. We do already exploit some common hardware tuning recommendations, such as buffer word alignment, but not others such as false sharing avoidance and pipeline parallelism of key loops. (There may be others...) I say judicious because I do not presume that I am the judge ... but I hope that judgements in these areas can fall towards the side of greater performance as often as possible. Hardware and OS do exist, much as I would prefer the simplicity of life in a layered IT architecture. Right. We already have per-cpu test-and-set locks, and lots of macros, so we just need to decide what places we need these optionations, and how to do it cleanly. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
[PATCHES] Proposed patch for preventing OID collisions
Attached is a reasonably complete working-out of my earlier proposal to prevent OID collisions by probing catalogs' OID indexes and advancing to the next OID whenever there is a duplicate. The code is driven by the existence of a unique index, and so it is possible for users to take advantage of the feature too: create table foo(...) with oids; create unique index foo_oid on foo(oid); I'm not sure if we really want to document that though, as the implementation will perform poorly when there are long runs of consecutive OIDs in the table. This is relatively unlikely in the system catalogs but seems quite probable in a user table. The patch also removes CheckMaxObjectId, which seems no longer needed. If we commit it, we should remove the code in pg_dump that tries to set the OID counter, since that'd be dead code; but I didn't include that change in this patch. Documentation updates are lacking too. In principle I think this could be back-patched as far as 7.2, since RelationGetIndexList exists that far back, but of course it needs a good deal of testing before we even think about that. (There are a couple of aspects that depend on the recent changes to have fixed, preassigned OIDs for system catalogs, and we'd have to tweak those areas to make a back-patch.) Comments? regards, tom lane binllTLVzLsFX.bin Description: unique_oids.patch.gz ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [PATCHES] Fix oversight in pts_error_callback()
Tom Lane [EMAIL PROTECTED] writes: That's a contrived example (and if I believed it, I would think that the right answer is to emit no errcontext if the elevel is less than ERROR). Yes, I've thought about ignore errcontext by considering elevel. But I scratched the source code for other uses of errcontext, and I found seems all of them understand that errcontext will be called unconditionally. For example, buffer_write_error_callback(), it doesn't say error in writing block ... but says writing block So I think this place is not consistent with others - it just says invalid ... , and should be changed. Regards, Qingqing ---(end of broadcast)--- TIP 6: explain analyze is your friend
[PATCHES] Bug in canonicalize_path()
I found that in port/path.c::canonicalize_path, that if the path was supplied as /usr/local/bin/../.. we would return /usr/local/bin. The problem is then when we saw a trailing .. we stripped it off and the previous directory, but we never checked if the previous directory was itself ... Patch applied to suppress trimming of .. if .. is above it. I tried coding something that would handle ../.. but is started to look too messy and not worth the effort. I don't see a need to backpatch this, but it could produce errors with weird supplied paths. Comments? -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 ? pg_config_paths.h Index: path.c === RCS file: /cvsroot/pgsql/src/port/path.c,v retrieving revision 1.51 diff -c -r1.51 path.c *** path.c 26 Jan 2005 19:24:03 - 1.51 --- path.c 11 Aug 2005 03:52:06 - *** *** 284,290 if (len 2 strcmp(path + len - 2, /.) == 0) trim_directory(path); ! else if (len 3 strcmp(path + len - 3, /..) == 0) { trim_directory(path); trim_directory(path); /* remove directory above */ --- 284,293 if (len 2 strcmp(path + len - 2, /.) == 0) trim_directory(path); ! /* We can only deal with /usr/local/.., not /usr/local/../.. */ ! else if (len 3 strcmp(path + len - 3, /..) == 0 !(len != 5 || strcmp(path, ../..) != 0) !(len 6 || strcmp(path + len - 6, /../..) != 0)) { trim_directory(path); trim_directory(path); /* remove directory above */ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [PATCHES] [HACKERS] For review: Server instrumentation patch
Dave Page wrote: The list isn't complete. pgadmin uses these three functions for logfile tracking: - pg_logdir_ls to list logfiles - pg_file_length to check for changes of the current logfile - pg_file_read to retrieve a logfile Yes you're right, I didn't check thoroughly (in my defence, the coffee machine broke this morning). Anyhoo, pg_file_stat is used by pg_file_length, so that would be required as well. None of those allow any modification of the filesystem, so do not suffer the potential security issues that Tom was concerned about, so hopefully there is no problem with them going in? OK, I have modified the patch to include these functions: pg_reload_conf() pg_file_stat() pg_file_read() pg_file_length() pg_dir_ls() pg_logfile_rotate() pg_logdir_ls() These can only be run by the super-user, and can only access files inside PGDATA, or in the logdir directory if that is in a different place from PGDATA. The only part I didn't like about the patch is the stat display: test= select pg_file_stat('postgresql.conf'); pg_file_stat - (12287,2005-08-11 00:06:30,2005-08-11 00:06:43,2005-08-11 00:06:30,f) (1 row) Shouldn't this return multiple labeled columns rather than an array? The patch is attached and genfile.c goes in utils/adt. -- Bruce Momjian| http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup.| Newtown Square, Pennsylvania 19073 Index: doc/src/sgml/func.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v retrieving revision 1.276 diff -c -c -r1.276 func.sgml *** doc/src/sgml/func.sgml 2 Aug 2005 16:11:56 - 1.276 --- doc/src/sgml/func.sgml 11 Aug 2005 04:21:37 - *** *** 9061,9066 --- 9061,9069 indexterm zone=functions-admin primarypg_cancel_backend/primary /indexterm +indexterm zone=functions-admin + primarypg_reload_conf/primary +/indexterm indexterm zone=functions-admin primarysignal/primary *** *** 9068,9074 /indexterm para ! The function shown in xref linkend=functions-admin-signal-table sends control signals to other server processes. Use of this function is restricted to superusers. --- 9071,9077 /indexterm para ! The functions shown in xref linkend=functions-admin-signal-table sends control signals to other server processes. Use of this function is restricted to superusers. *** *** 9090,9110 entrytypeint/type/entry entryCancel a backend's current query/entry /row /tbody /tgroup /table para ! This function returns 1 if successful, 0 if not successful. The process ID (literalpid/literal) of an active backend can be found from the structfieldprocpid/structfield column in the structnamepg_stat_activity/structname view, or by listing the commandpostgres/command processes on the server with applicationps/. /para ! !indexterm zone=functions-admin ! primarypg_start_backup/primary !/indexterm indexterm zone=functions-admin primarypg_stop_backup/primary --- 9093,9121 entrytypeint/type/entry entryCancel a backend's current query/entry /row + row +entry + literalfunctionpg_reload_conf/function()/literal + /entry +entrytypeint/type/entry +entryTriggers the server processes to reload configuration files/entry + /row /tbody /tgroup /table para ! These functions return 1 if successful, 0 if not successful. The process ID (literalpid/literal) of an active backend can be found from the structfieldprocpid/structfield column in the structnamepg_stat_activity/structname view, or by listing the commandpostgres/command processes on the server with applicationps/. /para !para ! functionpg_reload_conf/ sends a SIGHUP event to the ! postmaster, and thus triggers a reload of the configuration files ! in all backend processes. !/para indexterm zone=functions-admin primarypg_stop_backup/primary *** *** 9309,9314 --- 9320,9457 appropriate. /para +para + The functions shown in xref + linkend=functions-admin-genfile provide native file access to + files on the machine hosting the server. They are restricted to + the cluster directory or the logfile directory. + Use of these functions is restricted to
Re: [PATCHES] remove BufferBlockPointers for speed and space
On Thu, 11 Aug 2005, Qingqing Zhou wrote: It is said that the BufferBlockPointers is used to speedup the BufferGetBlock() macro. I compared three ways of getting block pointers. I.e., existing method (arrary method), calculating block pointer by adding base addr and offset*blockid method (mul method) and optimizing mul method by using bit shift (shift method). All of them calculate the block pointer 8 times (i.e., the BufferBlockPointers array is of size 8), and each take 3 rounds. The result is: SunOS/gcc 3.2 duration round 1 of array method: 4.179 ms duration round 2 of array method: 4.160 ms duration round 3 of array method: 4.143 ms duration round 1 of mul method: 3.311 ms duration round 2 of mul method: 3.233 ms duration round 3 of mul method: 3.233 ms duration round 1 of shift method: 3.554 ms duration round 2 of shift method: 3.235 ms duration round 3 of shift method: 3.233 ms Linux/gcc 3.2 duration round 1 of array method: 0.422 ms duration round 2 of array method: 0.324 ms duration round 3 of array method: 0.354 ms duration round 1 of mul method: 0.271 ms duration round 2 of mul method: 0.248 ms duration round 3 of mul method: 0.304 ms duration round 1 of shift method: 0.322 ms duration round 2 of shift method: 0.239 ms duration round 3 of shift method: 0.265 ms We can conclude that: (1) mul or shift are definitely better than array method; (2) mul and shift are comparable; Do you have results for more recent gcc releases? Thanks, Gavin ---(end of broadcast)--- TIP 6: explain analyze is your friend