Re: [PATCHES] Remove all trace of EXPLAIN EXECUTE

2005-08-10 Thread Simon Riggs
On Tue, 2005-08-09 at 18:50 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  There once was a rumour of a EXPLAIN EXECUTE command.
  This minor patch removes all trace of that, but without disturbing other
  valid occurrences of the EXECUTE command, which still lives on.
 
 It's not as dead as you seem to think.
 
 regression=# prepare foo as select * from tenk1;
 PREPARE
 regression=# explain execute foo;
  QUERY PLAN
 -
  Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
 (1 row)
 
 regression=#

Got me! LOL

Patch retracted.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[PATCHES] 5 new entries for FAQ

2005-08-10 Thread Martijn van Oosterhout
[Sorry for the duplicate post, sent to wrong list first time]

Hi,

After going through pgsql-general a bit I figured there were a few
important questions missing from the FAQ, so I wrote some.

Comments welcome. I can write more, if people can suggest things to
write about. I was thinking something about collation and locales but
I'm sure sure I understand them myself.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.
Index: FAQ.html
===
RCS file: /projects/cvsroot/pgsql/doc/src/FAQ/FAQ.html,v
retrieving revision 1.310
diff -u -r1.310 FAQ.html
--- FAQ.html30 May 2005 13:11:06 -  1.310
+++ FAQ.html10 Aug 2005 07:34:41 -
@@ -116,7 +116,11 @@
 does not exist errors when accessing temporary tables in PL/PgSQL
 functions?BR
  A href=#4.204.20/A) What replication solutions are available?BR
- 
+ A href=#4.214.21/A) How can I make PostgreSQL compare strings 
ignoring case like some other databases?BR
+ A href=#4.224.22/A) Why are PostgreSQL tables names 
case-sensitive?BR
+ A href=#4.234.23/A) Why is PostgreSQL only using one CPU to execute 
my query?BR
+ A href=#4.244.24/A) How can I arrange to have a query run at a 
certain time of day? Regular intervals?BR
+ A href=#4.254.25/A) What does 'index row size ... exceeds btree 
maximum, 2713' mean?
 
 HR
 
@@ -1000,5 +1004,114 @@
 
 PThere are also commercial and hardware-based replication solutions
 available supporting a variety of replication models./P
+
+H3A name=4.214.21/A) How can I make PostgreSQL compare strings 
+ignoring case like some other databases?/H3
+
+PAll text types in PostgreSQL are case-sensitive. This is generally
+what you want. If you have an identifier in a table that must be a
+particular format, it's better to use a SMALLCHECK/SMALL constraint.
+If you want to avoid duplicates that differ only in case, create a
+UNIQUE index like so:
+
+PRE
+CREATE UNIQUE INDEX mytable_myfield ON mytable ( lower(myfield) );
+/PRE
+
+PIf you want to do case-insensitive searches then that index can help
+you also (in a non-unique form). For regular expressions, see A
+href=#4.8FAQ 4.8/A.
+
+PIf you really need to have a column that is case-insensitive (perhaps
+for porting) there is a project on A
+href=http://gborg.postgresql.org/;GBorg/A called A
+
href=http://gborg.postgresql.org/project/citext/projdisplay.php;citext/A
+which can give you a case insensitive text data type.
+
+H3A name=4.224.22/A) Why are PostgreSQL table names 
case-sensitive?/H3
+
+PActually, what happens is that unquoted identifiers are folded to a
+single case before comparison. The SQL standard requires folding to
+uppercase, PostgreSQL folds to lowercase. What this means is that the
+identifiers SMALLMyTable/SMALL, SMALLMytable/SMALL and
+SMALLmytable/SMALL all compare the same. But
+SMALLMyTable/SMALL, SMALLMytable/SMALL and
+SMALLMYTABLE/SMALL are all different, even in totally SQL
+compliant databases.
+
+PThe usual cause for this complaint is that people use a GUI admin
+interface to create their tables and it has helpfully quoted the table
+and field names so that only identical quoting everywhere else can match
+it.
+
+PA good rule of thumb is to either balways/b quote your identifiers
+(in which case you must get the case right all the time) or bnever/b
+quote them (in which case the case never matters).
+
+PIf this is your problem, then you can use your admin interface to rename
+all the columns to all-lowercase names. Or execute queries like those
+below.
+
+PRE
+ALTER TABLE MyTable RENAME TO MyTable;
+ALTER TABLE MyTable RENAME MyField TO MyField;
+/PRE
+
+H3A name=4.234.23/A) Why is PostgreSQL only using one CPU to 
execute my query?/H3
+
+PThe PostgreSQL process model has one single threaded backend handling
+each client. This means that multiple queries from different clients
+will have their loads spread out as determined most appropriate by the
+operating system. But within a single query there is only a single
+thread of execution, so a single query is run only on one CPU (though
+the OS can move it around, PostgreSQL is unaware of this).
+
+PUsually this is not a problem as the system is generally I/O bound
+(limited by disk speed), not CPU bound. However, for some unusual
+workloads you may need to think about connecting more than once to the
+same database so you can have different portions work in parallel.
+
+H3A name=4.244.24/A) How can I arrange to have a query run at a 
certain time of day? Regular 

Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Luke Lonergan
Simon,
 
 That part of the code was specifically written to take advantage of
 processing pipelines in the hardware, not because the actual theoretical
 algorithm for that approach was itself faster.

Yup, good point.
 
 Nobody's said what compiler/hardware they have been using, so since both
 Alon and Tom say their character finding logic is faster, it is likely
 to be down to that? Name your platforms gentlemen, please.

In this case, we've been using gcc (3.2.3 RHEL3 Linux, 3.4.3 Solaris 10) on 
Opteron and Intel Xeon and Pentium 4.  Alon's performance comparisons for the 
parse only were done on a HT enabled P4 3.0GHz on RHEL3 with gcc 3.2.3, 
probably with optimization -O2, but possibly -O3.
 
Note that the level of microparallelism on upcoming CPUs is increasing with 
increasing pipeline depth.  Though there will be a step back on the Intel line 
with the introduction of the Centrino-based Xeon cores in 2006/7, other CPUs 
continue the trend, and I expect the next generation of multi-core CPUs to 
possibly introduce threaded micro-architectures which can also be scheduled as 
pipelines.  The gcc 4 compiler introduces auto vectorization, which may enhance 
the optimality of some loops.
 
I think the key thing is to make as much parallelism apparent to the compiler 
as possible, which will generally mean loops.  This means faster code on all 
modern CPUs and it won't hurt older CPU speeds.

 My feeling is that we may learn something here that applies more widely
 across many parts of the code.

Yes, I think one thing we've learned is that there are important parts of the 
code, those that are in the data path (COPY, sort, spill to disk, etc) that are 
in dire need of optimization.  For instance, the fgetc() pattern should be 
banned everywhere in the data path.
 
BTW - we are tracking down (in our spare time :-() the extremely slow sort 
performance.  We're seeing sort times of 1.7MB/s on our fastest machines, even 
when the work_mem is equal to the square root of the sort set.  This is a 
*serious* problem for us and we aren't getting to it - ideas are welcome.
 
Optimization here means both the use of good fundamental algorithms and 
micro-optimization (minimize memory copies, expose long runs of operations to 
the compiler, maximize computational intensity by working in cache-resident 
blocks, etc).
 
- Luke




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] 5 new entries for FAQ

2005-08-10 Thread Matt Miller
 PIf you're really desparate, you can increase the pagesize a bit (to

Should be desperate, not desparate.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] 5 new entries for FAQ

2005-08-10 Thread Robert Treat
On Wed, 2005-08-10 at 04:04, Martijn van Oosterhout wrote:
 [Sorry for the duplicate post, sent to wrong list first time]
 
 Hi,
 
 After going through pgsql-general a bit I figured there were a few
 important questions missing from the FAQ, so I wrote some.
 
 Comments welcome. I can write more, if people can suggest things to
 write about. I was thinking something about collation and locales but
 I'm sure sure I understand them myself.
 


I might suggest adding links to the relevant portions of the docs, for
example add a link to
http://www.postgresql.org/docs/8.0/static/sql-syntax.html#SQL-SYNTAX-IDENTIFIERS


Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] 5 new entries for FAQ

2005-08-10 Thread Alvaro Herrera
On Wed, Aug 10, 2005 at 10:04:23AM +0200, Martijn van Oosterhout wrote:

 Comments welcome. I can write more, if people can suggest things to
 write about. I was thinking something about collation and locales but
 I'm sure sure I understand them myself.

I'd really love to see a QA for encodings, recoding, and I see strange
characters.  Not sure how to phrase the question though.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Java is clearly an example of a money oriented programming  (A. Stepanov)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Fix oversight in pts_error_callback()

2005-08-10 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] writes
 Please exhibit a case in which you feel this is needed.

 Suppose I want to print a debug info in parseTypeString() like this:
 + elog(DEBUG1, parse type %s, buf.data);
  raw_parsetree_list = raw_parser(buf.data);

That's a contrived example (and if I believed it, I would think that the
right answer is to emit no errcontext if the elevel is less than ERROR).
Give me an actual use case in which the patch gives a better rather than
worse error report.  I think for most people it would just obfuscate the
message.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] 5 new entries for FAQ

2005-08-10 Thread Martijn van Oosterhout
On Wed, Aug 10, 2005 at 11:17:40AM -0400, Alvaro Herrera wrote:
 On Wed, Aug 10, 2005 at 10:04:23AM +0200, Martijn van Oosterhout wrote:
 
  Comments welcome. I can write more, if people can suggest things to
  write about. I was thinking something about collation and locales but
  I'm sure sure I understand them myself.
 
 I'd really love to see a QA for encodings, recoding, and I see strange
 characters.  Not sure how to phrase the question though.

I think you could write a whole section just on them and all the issues
on various platforms. But having never dealt with a system with
multiple languages / encodings I'm not sure I really understand the
issues. You know, like:

Encoding / character sets gotchas / recommendations:

Languages:
  Asian
  European
Programming:
  Perl
  Python
  Java
  ODBC
  Regular expressions
  Full text indexing
  etc...
Platforms:
  Windows
  UNIX
  etc...

The main thing I wonder about is does UTF-8 handle all characters
anybody would want to use. I've been told it doesn't for Asian
languages, in which case I don't see how this is a solvable problem
anyway.

I've collected quite a few comments from other people, so I'll post a
slightly revised patch later.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpDnETFKBczb.pgp
Description: PGP signature


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Nobody's said what compiler/hardware they have been using, so since both
 Alon and Tom say their character finding logic is faster, it is likely
 to be down to that? Name your platforms gentlemen, please.

I tested on HPPA with gcc 2.95.3 and on a Pentium 4 with gcc 3.4.3.
Got pretty much the same results on both.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Tom Lane
Luke Lonergan [EMAIL PROTECTED] writes:
 Yes, I think one thing we've learned is that there are important parts
 of the code, those that are in the data path (COPY, sort, spill to
 disk, etc) that are in dire need of optimization.  For instance, the
 fgetc() pattern should be banned everywhere in the data path.

Luke, I dislike whacking people upside the head, but this discussion
seems to presume that raw speed on Intel platforms is the only thing
that matters.  We have a few other concerns.  Portability, readability,
maintainability, and correctness all trump platform-specific
optimizations.  The COPY patch as presented lost badly on all those
counts, and you are lucky that it didn't get rejected completely.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Alvaro Herrera
On Wed, Aug 10, 2005 at 09:16:08AM -0700, Luke Lonergan wrote:

 On 8/10/05 8:37 AM, Tom Lane [EMAIL PROTECTED] wrote:
 
  Luke, I dislike whacking people upside the head, but this discussion
  seems to presume that raw speed on Intel platforms is the only thing
  that matters.  We have a few other concerns.  Portability, readability,
  maintainability, and correctness all trump platform-specific
  optimizations.  The COPY patch as presented lost badly on all those
  counts, and you are lucky that it didn't get rejected completely.
 
 It's a pleasure working with you too Tom :-)
 
 Until you present a result on platform that is faster than Alon's in the
 code that was modified, our proof still stands that his is 20% faster than
 yours.

AFAIR he never claimed otherwise ... his point was that to gain that
additional speedup, the code has to be made considerable worse (in
maintenability terms.)  Have you (or Alon) tried to port the rest of the
speed improvement to the new code?  Maybe it's possible to have at least
some of it without worsening the maintenability too badly.

Another question that comes to mind is: have you tried another compiler?
I see you are all using GCC at most 3.4; maybe the new optimizing
infrastructure in GCC 4.1 means you can have most of the speedup without
uglifying the code.  What about Intel's compiler?

 PostgreSQL needs major improvement to compete with Oracle and even MySQL on
 speed.  No whacking on the head is going to change that.

Certainly.  I think the point is what cost do we want to pay for the
speedup.  I think we all agree that even if we gain a 200% speedup by
rewriting COPY in assembly, it's simply not acceptable.

Another point may be that Bizgres can have a custom patch for the extra
speedup, without inflicting the maintenance cost on the community.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
La libertad es como el dinero; el que no la sabe emplear la pierde (Alvarez)

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Bruce Momjian
Luke Lonergan wrote:
 Tom,
 
 On 8/10/05 8:37 AM, Tom Lane [EMAIL PROTECTED] wrote:
 
  Luke, I dislike whacking people upside the head, but this discussion
  seems to presume that raw speed on Intel platforms is the only thing
  that matters.  We have a few other concerns.  Portability, readability,
  maintainability, and correctness all trump platform-specific
  optimizations.  The COPY patch as presented lost badly on all those
  counts, and you are lucky that it didn't get rejected completely.
 
 It's a pleasure working with you too Tom :-)
 
 Until you present a result on platform that is faster than Alon's in the
 code that was modified, our proof still stands that his is 20% faster than
 yours.

Well, we could write it in assembler and make it even faster. :-)

I assume no one is suggesting that, so in such cases, we need to weigh
readability with performance.  I have not looked at the patch issues,
but usually loop unrolling is the opposite of readability, so we have to
make a tradeoff.  We have used macros in places where function call
overhead is a major hit, so we can consider loop unrolling in places
that are a major performance hit.  The macros we have used have
maintained the readability of the function call (unless you look at the
macro contents) so perhaps the optimizations you suggest can be done
with a similar eye to readability.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Joshua D. Drake

Also, as we proved the last time the correctness argument was thrown in, we
can fix the bugs and still make it a lot faster - and I would stick to that
whether it's a PA-RISC, DEC Alpha, Intel or AMD or event Ultra Sparc.


Luke this comment doesn't work. Do you have a test case that shows that 
on an Ultra Sparc or PPC that you are accurate? Both of these CPUs are 
pretty big players in the Enterprise space.



PostgreSQL needs major improvement to compete with Oracle and even MySQL on
speed.  No whacking on the head is going to change that.


I am going to assume that you forgot to clarify this statement with IN 
BULK LOADING, because if you didn't I would like to see your test results.


My very real life experience shows that MySQL can't not keep up with 
PostgreSQL under load.


Nobody here argues that PostgreSQL needs to improve. If it didn't need
to improve I would be out of business because it would be perfect.


Sincerely,

Joshua D. Drake





- Luke



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly



--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Bruce Momjian
Alvaro Herrera wrote:
 Another question that comes to mind is: have you tried another compiler?
 I see you are all using GCC at most 3.4; maybe the new optimizing
 infrastructure in GCC 4.1 means you can have most of the speedup without
 uglifying the code.  What about Intel's compiler?

EnterpriseDB mentioned on their web page they use the Intel compiler, so
I assume they saw a speedup by using it:

http://www.enterprisedb.com/edb-db.do

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Luke Lonergan
Alvaro,

On 8/10/05 9:46 AM, Alvaro Herrera [EMAIL PROTECTED] wrote:

 AFAIR he never claimed otherwise ... his point was that to gain that
 additional speedup, the code has to be made considerable worse (in
 maintenability terms.)  Have you (or Alon) tried to port the rest of the
 speed improvement to the new code?  Maybe it's possible to have at least
 some of it without worsening the maintenability too badly.

As I suggested previously, there is another, more maintainable way to get
more performance from the parsing logic.

It involves replacing something like this:


char c = input_routine()
 if (c == '\n') {
 else if (
.
.
.
 }


With something like this:


char [32] carr;

nread = Input_routine_new(carr,32)

  for (i=0; inread; i++) {
if (carr[I] == '\n') {
.
.
.
  }


And this section would run much faster (3x?).

This is what I think could make the overall patch 50% faster than it is now
(on the parsing part).

The issue that I expect we'll hear about is that since the parsing is
already 500% faster, it has vanished in the profile.  That's why Tom's
testing is not showing much difference between his and Alon's code, we
actually drop the other sections to bring it forward where we see the bigger
difference.

However, what I'm arguing here and elsewhere is that there's still a lot
more of this kind of optimization to be done.  12 MB/s COPY speed is not
enough.  There's 40% of the time in processing left to smack down.
  
 Another question that comes to mind is: have you tried another compiler?
 I see you are all using GCC at most 3.4; maybe the new optimizing
 infrastructure in GCC 4.1 means you can have most of the speedup without
 uglifying the code.  What about Intel's compiler?

We have routinely distributed PostgreSQL with the Intel compiler, up until
recently.  Interestingly, GCC now beats it handily in our tests on Opteron
and matches it on Xeon, which is too bad - it's my fav compiler.

The problem with this code is that it doesn't have enough micro-parallelism
without loops on the character parsing core.  The compiler can only do
register optimizations and branch prediction (poorly) unless it is given
more to work with.

 PostgreSQL needs major improvement to compete with Oracle and even MySQL on
 speed.  No whacking on the head is going to change that.
 
 Certainly.  I think the point is what cost do we want to pay for the
 speedup.  I think we all agree that even if we gain a 200% speedup by
 rewriting COPY in assembly, it's simply not acceptable.

Understood, and I totally agree.

 Another point may be that Bizgres can have a custom patch for the extra
 speedup, without inflicting the maintenance cost on the community.

We are committed to making Postgres the best DBMS for Business Intelligence.
Bizgres makes it safe for businesses to rely on open source for their
production uses.  As far as features go, I think the best way for our
customers is to make sure that Bizgres features are supporting the
PostgreSQL core and vis-versa.

- Luke 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] COPY FROM performance improvements

2005-08-10 Thread Alvaro Herrera
On Wed, Aug 10, 2005 at 12:57:18PM -0400, Bruce Momjian wrote:
 Alvaro Herrera wrote:
  Another question that comes to mind is: have you tried another compiler?
  I see you are all using GCC at most 3.4; maybe the new optimizing
  infrastructure in GCC 4.1 means you can have most of the speedup without
  uglifying the code.  What about Intel's compiler?
 
 EnterpriseDB mentioned on their web page they use the Intel compiler, so
 I assume they saw a speedup by using it:
 
   http://www.enterprisedb.com/edb-db.do

Yes, but notice these guys are Greenplum, not EDB.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Hay dos momentos en la vida de un hombre en los que no debería
especular: cuando puede permitírselo y cuando no puede (Mark Twain)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] [HACKERS] Autovacuum loose ends

2005-08-10 Thread Alvaro Herrera
Updated this patch again:

  - vacuum_cost_delay and vacuum_cost_limit can be set per table, as well
as globally with autovacuum_vacuum_cost_{limit,delay}
  
  - pgstat is reset if recovery is required
  
  - pgstat reset at postmaster start is disabled by default
 
 - Xid-wraparound VACUUM is now FREEZE without ANALYZE, iff the database
   has datallowconn=false or datistemplate=true

- A database-wide vacuum is also issued if the vacuumxid is found to be
  very old.

Note that I had to add datvacuumxid to the pg_database flat file.

-- 
Alvaro Herrera (alvherre[a]alvh.no-ip.org)
Ciencias políticas es la ciencia de entender por qué
 los políticos actúan como lo hacen  (netfunny.com)
Index: doc/src/sgml/catalogs.sgml
===
RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/catalogs.sgml,v
retrieving revision 2.110
diff -c -r2.110 catalogs.sgml
*** doc/src/sgml/catalogs.sgml  31 Jul 2005 17:19:16 -  2.110
--- doc/src/sgml/catalogs.sgml  7 Aug 2005 19:03:22 -
***
*** 1197,1202 
--- 1197,1216 
entryMultiplier for reltuples to add to
 structfieldanl_base_thresh//entry
   /row
+ 
+row
+ entrystructfieldvac_cost_delay/structfield/entry
+ entrytypeinteger/type/entry
+ entry/entry
+ entryCustom variablevacuum_cost_delay/ parameter/entry
+/row
+ 
+row
+ entrystructfieldvac_cost_limit/structfield/entry
+ entrytypeinteger/type/entry
+ entry/entry
+ entryCustom variablevacuum_cost_limit/ parameter/entry
+/row
  /tbody
 /tgroup
/table
***
*** 1217,1223 
para
 Any of the numerical fields can contain literal-1/ (or indeed
 any negative value) to indicate that the system-wide default should
!be used for this particular value.
/para
  
   /sect1
--- 1231,1241 
para
 Any of the numerical fields can contain literal-1/ (or indeed
 any negative value) to indicate that the system-wide default should
!be used for this particular value.  Observe that the
!structfieldvac_cost_delay/ variable inherits its default value from the
!varnameautovacuum_vacuum_cost_delay/ configuration parameter,
!or from varnamevacuum_cost_delay/ if the former is set to a negative
!value.  The same applies to structfieldvac_cost_limit/.
/para
  
   /sect1
Index: doc/src/sgml/runtime.sgml
===
RCS file: /home/alvherre/cvs/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.341
diff -c -r1.341 runtime.sgml
*** doc/src/sgml/runtime.sgml   30 Jul 2005 17:15:35 -  1.341
--- doc/src/sgml/runtime.sgml   31 Jul 2005 23:22:11 -
***
*** 3399,3404 
--- 3399,3436 
/listitem
   /varlistentry
  
+  varlistentry id=guc-autovacuum-vacuum-cost-delay 
xreflabel=autovacuum_vacuum_cost_delay
+   termvarnameautovacuum_vacuum_cost_delay/varname 
(typeinteger/type)/term
+   indexterm
+primaryvarnameautovacuum_vacuum_cost_delay/ configuration 
parameter/primary
+   /indexterm
+   listitem
+para
+ Specifies the default value that will be applied to each
+ commandVACUUM/ operation, for tables that do not have
+ a default value set in structnamepg_autovacuum/.  If a
+ negative value is specified (like the default value of -1),
+ the varnamevacuum_cost_delay/ value will be applied instead.
+/para
+   /listitem
+  /varlistentry
+ 
+  varlistentry id=guc-autovacuum-cost-limit 
xreflabel=autovacuum_vacuum_cost_limit
+   termvarnameautovacuum_vacuum_cost_limit/varname 
(typeinteger/type)/term
+   indexterm
+primaryvarnameautovacuum_vacuum_cost_limit/ configuration 
parameter/primary
+   /indexterm
+   listitem
+para
+ Specifies the default value that will be applied to each
+ commandVACUUM/ operation, for tables that do not have
+ a default value set in structnamepg_autovacuum/.  If a 
+ negative value is specified (like the default value of -1),
+ the varnamevacuum_cost_limit/ value will be applied instead.
+/para
+   /listitem
+  /varlistentry
+ 
  /variablelist
 /sect2
  
Index: src/backend/access/transam/xlog.c
===
RCS file: /home/alvherre/cvs/pgsql/src/backend/access/transam/xlog.c,v
retrieving revision 1.214
diff -c -r1.214 xlog.c
*** src/backend/access/transam/xlog.c   30 Jul 2005 14:15:44 -  1.214
--- src/backend/access/transam/xlog.c   31 Jul 2005 21:36:45 -
***
*** 33,38 
--- 33,39 
  #include catalog/catversion.h
  #include catalog/pg_control.h
  #include miscadmin.h
+ #include pgstat.h
  #include postmaster/bgwriter.h
  #include storage/bufpage.h
  #include storage/fd.h
***

Re: [PATCHES] (was COPY FROM) performance improvements

2005-08-10 Thread Simon Riggs
On Wed, 2005-08-10 at 11:37 -0400, Tom Lane wrote:
 Luke Lonergan [EMAIL PROTECTED] writes:
  Yes, I think one thing we've learned is that there are important parts
  of the code, those that are in the data path (COPY, sort, spill to
  disk, etc) that are in dire need of optimization.  For instance, the
  fgetc() pattern should be banned everywhere in the data path.
 
 this discussion
 seems to presume that raw speed on Intel platforms is the only thing
 that matters.  We have a few other concerns.  Portability, readability,
 maintainability, and correctness all trump platform-specific
 optimizations.

I am interested in the general principle here, not this specific case.

As you know, I have raised the need for specific hardware tuning in
certain critical areas on a number of occasions. I very much respect the
need for all of the other aspects of code quality mentioned.

Pipeline parallelism is a feature of all modern CPUs since the Pentium,
not just Intel's. I think judicious exploitation of hardware features
that are common to multiple hardware architectures would be of
considerable benefit to everybody. We do already exploit some common
hardware tuning recommendations, such as buffer word alignment, but not
others such as false sharing avoidance and pipeline parallelism of key
loops. (There may be others...)

I say judicious because I do not presume that I am the judge ... but I
hope that judgements in these areas can fall towards the side of greater
performance as often as possible. Hardware and OS do exist, much as I
would prefer the simplicity of life in a layered IT architecture.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] (was COPY FROM) performance improvements

2005-08-10 Thread Bruce Momjian
Simon Riggs wrote:
 As you know, I have raised the need for specific hardware tuning in
 certain critical areas on a number of occasions. I very much respect the
 need for all of the other aspects of code quality mentioned.
 
 Pipeline parallelism is a feature of all modern CPUs since the Pentium,
 not just Intel's. I think judicious exploitation of hardware features
 that are common to multiple hardware architectures would be of
 considerable benefit to everybody. We do already exploit some common
 hardware tuning recommendations, such as buffer word alignment, but not
 others such as false sharing avoidance and pipeline parallelism of key
 loops. (There may be others...)
 
 I say judicious because I do not presume that I am the judge ... but I
 hope that judgements in these areas can fall towards the side of greater
 performance as often as possible. Hardware and OS do exist, much as I
 would prefer the simplicity of life in a layered IT architecture.

Right. We already have per-cpu test-and-set locks, and lots of macros,
so we just need to decide what places we need these optionations, and
how to do it cleanly.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[PATCHES] Proposed patch for preventing OID collisions

2005-08-10 Thread Tom Lane
Attached is a reasonably complete working-out of my earlier proposal to
prevent OID collisions by probing catalogs' OID indexes and advancing to
the next OID whenever there is a duplicate.  The code is driven by the
existence of a unique index, and so it is possible for users to take
advantage of the feature too:

create table foo(...) with oids;
create unique index foo_oid on foo(oid);

I'm not sure if we really want to document that though, as the
implementation will perform poorly when there are long runs of
consecutive OIDs in the table.  This is relatively unlikely in the
system catalogs but seems quite probable in a user table.

The patch also removes CheckMaxObjectId, which seems no longer needed.
If we commit it, we should remove the code in pg_dump that tries to
set the OID counter, since that'd be dead code; but I didn't include
that change in this patch.  Documentation updates are lacking too.

In principle I think this could be back-patched as far as 7.2, since
RelationGetIndexList exists that far back, but of course it needs a
good deal of testing before we even think about that.  (There are a
couple of aspects that depend on the recent changes to have fixed,
preassigned OIDs for system catalogs, and we'd have to tweak those
areas to make a back-patch.)

Comments?

regards, tom lane



binllTLVzLsFX.bin
Description: unique_oids.patch.gz

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Fix oversight in pts_error_callback()

2005-08-10 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] writes:

 That's a contrived example (and if I believed it, I would think that the
 right answer is to emit no errcontext if the elevel is less than ERROR).


Yes, I've thought about ignore errcontext by considering elevel. But I
scratched the source code for other uses of errcontext, and I found seems
all of them understand that errcontext will be called unconditionally. For
example, buffer_write_error_callback(), it doesn't say error in writing
block ... but says writing  block  So I think this place is not
consistent with others - it just says invalid ... , and should be changed.


Regards,
Qingqing



---(end of broadcast)---
TIP 6: explain analyze is your friend


[PATCHES] Bug in canonicalize_path()

2005-08-10 Thread Bruce Momjian
I found that in port/path.c::canonicalize_path, that if the path was
supplied as /usr/local/bin/../.. we would return /usr/local/bin.  The
problem is then when we saw a trailing .. we stripped it off and the
previous directory, but we never checked if the previous directory was
itself ...

Patch applied to suppress trimming of .. if .. is above it.  I tried
coding something that would handle ../.. but is started to look too
messy and not worth the effort.

I don't see a need to backpatch this, but it could produce errors with
weird supplied paths.  Comments?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
? pg_config_paths.h
Index: path.c
===
RCS file: /cvsroot/pgsql/src/port/path.c,v
retrieving revision 1.51
diff -c -r1.51 path.c
*** path.c  26 Jan 2005 19:24:03 -  1.51
--- path.c  11 Aug 2005 03:52:06 -
***
*** 284,290 
  
if (len  2  strcmp(path + len - 2, /.) == 0)
trim_directory(path);
!   else if (len  3  strcmp(path + len - 3, /..) == 0)
{
trim_directory(path);
trim_directory(path);   /* remove directory above */
--- 284,293 
  
if (len  2  strcmp(path + len - 2, /.) == 0)
trim_directory(path);
!   /* We can only deal with /usr/local/.., not 
/usr/local/../.. */
!   else if (len  3  strcmp(path + len - 3, /..) == 0 
!(len != 5 || strcmp(path, ../..) != 0) 
!(len  6 || strcmp(path + len - 6, /../..) 
!= 0))
{
trim_directory(path);
trim_directory(path);   /* remove directory above */

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] For review: Server instrumentation patch

2005-08-10 Thread Bruce Momjian
Dave Page wrote:
  The list isn't complete. pgadmin uses these three functions 
  for logfile 
  tracking:
  
  - pg_logdir_ls to list logfiles
  - pg_file_length to check for changes of the current logfile
  - pg_file_read to retrieve a logfile
 
 Yes you're right, I didn't check thoroughly (in my defence, the coffee
 machine broke this morning). Anyhoo, pg_file_stat is used by
 pg_file_length, so that would be required as well.
 
 None of those allow any modification of the filesystem, so do not suffer
 the potential security issues that Tom was concerned about, so hopefully
 there is no problem with them going in?

OK, I have modified the patch to include these functions:

pg_reload_conf()
pg_file_stat()
pg_file_read()
pg_file_length()
pg_dir_ls()
pg_logfile_rotate()
pg_logdir_ls()

These can only be run by the super-user, and can only access files
inside PGDATA, or in the logdir directory if that is in a different
place from PGDATA.

The only part I didn't like about the patch is the stat display:

test= select pg_file_stat('postgresql.conf');
pg_file_stat

-

 (12287,2005-08-11 00:06:30,2005-08-11 00:06:43,2005-08-11 
00:06:30,f)
(1 row)

Shouldn't this return multiple labeled columns rather than an array?

The patch is attached and genfile.c goes in utils/adt.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
Index: doc/src/sgml/func.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.276
diff -c -c -r1.276 func.sgml
*** doc/src/sgml/func.sgml  2 Aug 2005 16:11:56 -   1.276
--- doc/src/sgml/func.sgml  11 Aug 2005 04:21:37 -
***
*** 9061,9066 
--- 9061,9069 
 indexterm zone=functions-admin
  primarypg_cancel_backend/primary
 /indexterm
+indexterm zone=functions-admin
+ primarypg_reload_conf/primary
+/indexterm
  
 indexterm zone=functions-admin
  primarysignal/primary
***
*** 9068,9074 
 /indexterm
  
 para
! The function shown in xref
  linkend=functions-admin-signal-table sends control signals to
  other server processes.  Use of this function is restricted
  to superusers.
--- 9071,9077 
 /indexterm
  
 para
! The functions shown in xref
  linkend=functions-admin-signal-table sends control signals to
  other server processes.  Use of this function is restricted
  to superusers.
***
*** 9090,9110 
 entrytypeint/type/entry
 entryCancel a backend's current query/entry
/row
   /tbody
  /tgroup
 /table
  
 para
! This function returns 1 if successful, 0 if not successful.
  The process ID (literalpid/literal) of an active backend can be found
  from the structfieldprocpid/structfield column in the
  structnamepg_stat_activity/structname view, or by listing the 
commandpostgres/command
  processes on the server with applicationps/.
 /para
! 
!indexterm zone=functions-admin
! primarypg_start_backup/primary
!/indexterm
  
 indexterm zone=functions-admin
  primarypg_stop_backup/primary
--- 9093,9121 
 entrytypeint/type/entry
 entryCancel a backend's current query/entry
/row
+   row
+entry
+ literalfunctionpg_reload_conf/function()/literal
+ /entry
+entrytypeint/type/entry
+entryTriggers the server processes to reload configuration 
files/entry
+   /row
   /tbody
  /tgroup
 /table
  
 para
! These functions return 1 if successful, 0 if not successful.
  The process ID (literalpid/literal) of an active backend can be found
  from the structfieldprocpid/structfield column in the
  structnamepg_stat_activity/structname view, or by listing the 
commandpostgres/command
  processes on the server with applicationps/.
 /para
!para
! functionpg_reload_conf/ sends a SIGHUP event to the
! postmaster, and thus triggers a reload of the configuration files
! in all backend processes.
!/para
  
 indexterm zone=functions-admin
  primarypg_stop_backup/primary
***
*** 9309,9314 
--- 9320,9457 
  appropriate.
 /para
  
+para
+ The functions shown in xref
+ linkend=functions-admin-genfile provide native file access to
+ files on the machine hosting the server. They are restricted to
+ the cluster directory or the logfile directory.
+ Use of these functions is restricted to 

Re: [PATCHES] remove BufferBlockPointers for speed and space

2005-08-10 Thread Gavin Sherry
On Thu, 11 Aug 2005, Qingqing Zhou wrote:

 It is said that the BufferBlockPointers is used to speedup the
 BufferGetBlock() macro. I compared three ways of getting block pointers.
 I.e., existing method (arrary method), calculating block pointer by adding
 base addr and offset*blockid method (mul method) and optimizing mul method
 by using bit shift (shift method). All of them calculate the block pointer
 8 times (i.e., the BufferBlockPointers array is of size 8), and each
 take 3 rounds.

 The result is:

 SunOS/gcc 3.2
 duration round 1 of array method: 4.179 ms
 duration round 2 of array method: 4.160 ms
 duration round 3 of array method: 4.143 ms
 duration round 1 of mul method: 3.311 ms
 duration round 2 of mul method: 3.233 ms
 duration round 3 of mul method: 3.233 ms
 duration round 1 of shift method: 3.554 ms
 duration round 2 of shift method: 3.235 ms
 duration round 3 of shift method: 3.233 ms

 Linux/gcc 3.2
 duration round 1 of array method: 0.422 ms
 duration round 2 of array method: 0.324 ms
 duration round 3 of array method: 0.354 ms
 duration round 1 of mul method: 0.271 ms
 duration round 2 of mul method: 0.248 ms
 duration round 3 of mul method: 0.304 ms
 duration round 1 of shift method: 0.322 ms
 duration round 2 of shift method: 0.239 ms
 duration round 3 of shift method: 0.265 ms

 We can conclude that:
 (1) mul or shift are definitely better than array method;
 (2) mul and shift are comparable;

Do you have results for more recent gcc releases?

Thanks,

Gavin

---(end of broadcast)---
TIP 6: explain analyze is your friend