Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
I wrote:
> the right way to make this faster is to refactor things so that we
> don't generate useless equivalence classes in the first place, or
> at least don't keep them around in the planner's lists once we realize
> they're useless.

After a bit of hacking, I propose the attached patch.

> I like Heikki's hack to cut down on searching in make_canonical_pathkey,
> but I think that complicating the data structure searching beyond that
> is just a band-aid.

With the given test case and this patch, we end up with exactly two
canonical pathkeys referencing a single EquivalenceClass.  So as far
as I can tell there's not a lot of point in refining the pathkey
searching.  Now, the EquivalenceClass has got 483 members, which
means that there's still some O(N^2) behavior in
get_eclass_for_sort_expr.  There might be some use in refining the
search for a matching eclass member.  It's not sticking out in
profiling like it did before though.
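
To illustrate why the canonical representation makes the remaining
pathkey comparisons cheap, here is a minimal sketch (hypothetical,
simplified structs, not the planner's actual definitions): once
EquivalenceClasses are canonical, a PathKey just points at its EC, so
matching two sort orderings degenerates to pointer equality rather
than a deep structural search.

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified structs, not the planner's definitions. */
typedef struct EquivalenceClass
{
	int			n_members;
} EquivalenceClass;

typedef struct PathKey
{
	EquivalenceClass *pk_eclass;	/* canonical EC for this sort key */
	bool		pk_asc;				/* ascending vs. descending */
} PathKey;

static bool
pathkeys_match(const PathKey *a, const PathKey *b)
{
	return a->pk_eclass == b->pk_eclass && a->pk_asc == b->pk_asc;
}

int
main(void)
{
	EquivalenceClass ec = {483};
	PathKey		k1 = {&ec, true};
	PathKey		k2 = {&ec, true};

	printf("%d\n", pathkeys_match(&k1, &k2));	/* prints 1 */
	return 0;
}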

regards, tom lane

diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index d6402cf911817b1b8c17da91019a1fac19bf051a..5c0786f2fe6dea9a72ad66ba93aa8833ab0e26ba 100644
*** a/src/backend/optimizer/README
--- b/src/backend/optimizer/README
*** sort ordering was important; and so usin
*** 632,640 
  orderings doesn't create any real problem.
  
  
  
- Though Bob Devine <bob.dev...@worldnet.att.net> was not involved in the 
- coding of our optimizer, he is available to field questions about
- optimizer topics.
  
  -- bjm & tgl
--- 632,670 
  orderings doesn't create any real problem.
  
  
+ Order of processing for EquivalenceClasses and PathKeys
+ --------------------------------------------------------
+ 
+ As alluded to above, there is a specific sequence of phases in the
+ processing of EquivalenceClasses and PathKeys during planning.  During the
+ initial scanning of the query's quals (deconstruct_jointree followed by
+ reconsider_outer_join_clauses), we construct EquivalenceClasses based on
+ mergejoinable clauses found in the quals.  At the end of this process,
+ we know all we can know about equivalence of different variables, so
+ subsequently there will be no further merging of EquivalenceClasses.
+ At that point it is possible to consider the EquivalenceClasses as
+ canonical and build canonical PathKeys that reference them.  Before
+ we reach that point (actually, before entering query_planner at all)
+ we also ensure that we have constructed EquivalenceClasses for all the
+ expressions used in the query's ORDER BY and related clauses.  These
+ classes might or might not get merged together, depending on what we
+ find in the quals.
+ 
+ Because all the EquivalenceClasses are known before we begin path
+ generation, we can use them as a guide to which indexes are of interest:
+ if an index's column is not mentioned in any EquivalenceClass then that
+ index's sort order cannot possibly be helpful for the query.  This allows
+ short-circuiting of much of the processing of create_index_paths() for
+ irrelevant indexes.
+ 
+ There are some cases where planner.c constructs additional
+ EquivalenceClasses and PathKeys after query_planner has completed.
+ In these cases, the extra ECs/PKs are needed to represent sort orders
+ that were not considered during query_planner.  Such situations should be
+ minimized since it is impossible for query_planner to return a plan
+ producing such a sort order, meaning an explicit sort will always be needed.
+ Currently this happens only for queries involving multiple window functions
+ with different orderings, so extra sorts are needed anyway.
  
  
  -- bjm & tgl
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index e44e960b5454d4698ed82e4e857794ffe2a9adf2..c101c272a14b2f1b9d92a54670688df057d84a13 100644
*** a/src/backend/optimizer/path/equivclass.c
--- b/src/backend/optimizer/path/equivclass.c
*** static bool reconsider_full_join_clause(
*** 78,83 
--- 78,87 
   * join.  (This is the reason why we need a failure return.  It's more
   * convenient to check this case here than at the call sites...)
   *
+  * On success return, we have also initialized the clause's left_ec/right_ec
+  * fields to point to the EquivalenceClass built from it.  This saves lookup
+  * effort later.
+  *
   * Note: constructing merged EquivalenceClasses is a standard UNION-FIND
   * problem, for which there exist better data structures than simple lists.
   * If this code ever proves to be a bottleneck then it could be sped up ---
*** process_equivalence(PlannerInfo *root, R
*** 106,111 
--- 110,119 
  			   *em2;
  	ListCell   *lc1;
  
+ 	/* Should not already be marked as having generated an eclass */
+ 	Assert(restrictinfo->left_ec == NULL);
+ 	Assert(restrictinfo->right_ec == NULL);
+ 
  	/* Extract info from given clause */
  	Assert(is_opclause(clause));
  	opno = ((OpExpr *) clause)->opno;
*** 
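
The comment in the patch above notes that merging EquivalenceClasses
is a standard UNION-FIND problem. For reference, a minimal sketch of
the classic data structure with path compression -- purely
illustrative; the actual planner code merges simple lists:

#include <stdio.h>

#define NNODES 1024

static int	parent[NNODES];

/* find() returns the class representative, compressing the path. */
static int
find(int x)
{
	if (parent[x] != x)
		parent[x] = find(parent[x]);
	return parent[x];
}

/* merge() unions the classes containing a and b. */
static void
merge(int a, int b)
{
	parent[find(a)] = find(b);
}

int
main(void)
{
	for (int i = 0; i < NNODES; i++)
		parent[i] = i;
	merge(1, 2);						/* a = b */
	merge(2, 3);						/* b = c  =>  {a, b, c} */
	printf("%d\n", find(1) == find(3));	/* prints 1 */
	return 0;
}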

Re: [HACKERS] sorted writes for checkpoints

2010-10-29 Thread Heikki Linnakangas

On 29.10.2010 06:00, Jeff Janes wrote:

> One of the items on the Wiki ToDo list is sorted writes for
> checkpoints.  The consensus seemed to be that this should be done by
> adding hook(s) into the main code, and then a contrib module to work
> with those hooks.  Is there an existing contrib module that one could
> best look to for inspiration on how to go about doing this?  I have
> the sorted checkpoint working under a guc, but don't know where to
> start on converting it to a contrib module instead.


I don't think it's a good idea to have this as a hook. Bgwriter
shouldn't need to load external code, and checkpoint robustness shouldn't
depend on user-written code. IIRC Tom Lane didn't even like pallocing
the memory for the list of dirty pages at checkpoint time because that
might cause an out-of-memory error. Calling a function in a contrib
module is much, much worse.


Simon's argument in the thread that the todo item points to 
(http://archives.postgresql.org/pgsql-patches/2008-07/msg00123.php) is 
basically that we don't know what the best algorithm is yet and 
benchmarking is a lot of work, so let's just let people do whatever they 
feel like until we settle on the best approach. I think we need to bite 
the bullet and do some benchmarking, and commit one carefully vetted 
patch to the backend.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] sorted writes for checkpoints

2010-10-29 Thread Itagaki Takahiro
On Fri, Oct 29, 2010 at 3:23 PM, Heikki Linnakangas
<heikki.linnakan...@enterprisedb.com> wrote:
> Simon's argument in the thread that the todo item points to
> (http://archives.postgresql.org/pgsql-patches/2008-07/msg00123.php) is
> basically that we don't know what the best algorithm is yet and benchmarking
> is a lot of work, so let's just let people do whatever they feel like until
> we settle on the best approach. I think we need to bite the bullet and do
> some benchmarking, and commit one carefully vetted patch to the backend.

When I submitted the patch, I tested it on a disk-based RAID-5 machine:
http://archives.postgresql.org/pgsql-hackers/2007-06/msg00541.php
But there were no additional benchmarking reports at that time. We still
need benchmarking before we re-examine the feature. For example, SSDs and
SSD RAID arrays were not popular at that time, but now they are worth considering.

I think patching the core directly is enough for the first round of
testing, and we can decide on the interface according to the
results. If one algorithm wins in all cases, we could just
include it in the core, and extensibility would not be needed.

-- 
Itagaki Takahiro



Re: [HACKERS] archives, attachments, etc

2010-10-29 Thread Matteo Beccati
Hi Gurjeet,

On 09/10/2010 22:54, Gurjeet Singh wrote:
> On Sat, Oct 9, 2010 at 3:30 PM, Dimitri Fontaine <dimi...@2ndquadrant.fr> wrote:
>> I wish our super admins would have some time to resume the work on the
>> new archives infrastructure, that was about ready for integration if not
>> prime time:
>>
>>  http://archives.beccati.org/pgsql-hackers/message/276290
>>
>> As you see it doesn't suffer from this problem, the threading is not
>> split arbitrarily, and, less obviously, it runs from a PostgreSQL
>> database. Yes, that means the threading code is exercising our recursive
>> querying facility, as far as I understand it.
>
> Something looks wrong with that thread. The message text in my mails is
> missing. Perhaps that is contained in the .bin files but I can't tell as
> the link leads to 404 Not Found.

Thanks for the private email pointing this thread out. I've been overly
busy lately and missed it.

I'll try to debug what happens with your message formatting as soon as I
can find some time.


Cheers
-- 
Matteo Beccati

Development & Consulting - http://www.beccati.com/



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> On the other hand, if I use a similar test case to my original one
> (i.e. the tables are much wider) then the query planning takes
> 1.42 seconds in 9.1 with this patch instead of about 4.7 seconds
> as we observed it using PostgreSQL 9.0.0. The beginning of the gprof
> output now looks like this:

Hi,

I'm really interested in this patch.

I tried a simple test case:

create table t (a integer, b text);

DO $$DECLARE i int;
BEGIN
FOR i IN 0..9000 LOOP
EXECUTE 'create table t' || i || ' ( CHECK (a > ' || i*10 || '
and a <= ' || (i+1)*10 || ' ) ) INHERITS (t)';
EXECUTE 'create index tidx' || i || ' ON t' || i || ' (a)';
END LOOP;
END$$;

explain select * from t where a > 1060 and a < 1090;

but I don't get any gain from the patch... explain time is still around 250 ms.

Tried with 9000 partitions, time is still 2 secs.


Maybe I've missed completely the patch purpose?


(I tried the test case at

http://archives.postgresql.org/message-id/4cbd9ddc.4040...@cybertec.at

and that, in fact, gets a boost with this patch).



Leonardo






Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> but I don't get any gain from the patch... explain time is still around 250 ms.
> Tried with 9000 partitions, time is still 2 secs.


Small correction: I tried with 3000 partitions (FOR i IN 0..3000 ...)
and got 250 ms with both versions; with 9000 partitions, 2 secs (again,
no gain from the patch).






Re: [HACKERS] sorted writes for checkpoints

2010-10-29 Thread Robert Haas
On Fri, Oct 29, 2010 at 2:58 AM, Itagaki Takahiro
<itagaki.takah...@gmail.com> wrote:
> On Fri, Oct 29, 2010 at 3:23 PM, Heikki Linnakangas
> <heikki.linnakan...@enterprisedb.com> wrote:
>> Simon's argument in the thread that the todo item points to
>> (http://archives.postgresql.org/pgsql-patches/2008-07/msg00123.php) is
>> basically that we don't know what the best algorithm is yet and benchmarking
>> is a lot of work, so let's just let people do whatever they feel like until
>> we settle on the best approach. I think we need to bite the bullet and do
>> some benchmarking, and commit one carefully vetted patch to the backend.
>
> When I submitted the patch, I tested it on a disk-based RAID-5 machine:
> http://archives.postgresql.org/pgsql-hackers/2007-06/msg00541.php
> But there were no additional benchmarking reports at that time. We still
> need benchmarking before we re-examine the feature. For example, SSDs and
> SSD RAID arrays were not popular at that time, but now they are worth considering.

There are really two separate things here:

(1) trying to do all the writes to file A before you start doing
writes to file B, and
(2) trying to write out blocks to each file in ascending logical block
number order

I'm much more convinced of the value of #1 than I am of the value of
#2.  If we do #1, we can then spread out the checkpoint fsyncs in a
meaningful way without fearing that we'll need to fsync the same file
a second time for the same checkpoint.  We've gotten some pretty
specific reports of problems in this area recently, so it seems likely
that there is some value to be had there.  On the other hand, #2 is
only a win if sorting the blocks in numerical order causes the OS to
write them in a better order than it would otherwise have done.  We've
had recent reports that our block-at-a-time relation extension policy
is leading to severe fragmentation on certain filesystems, so I'm a
bit skeptical about the value of this (though, of course, that can be
overturned if we can collect meaningful evidence).
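
A minimal sketch of the ordering described in #1 and #2 (hypothetical
types, not the actual patch): sort dirty buffers so all writes to one
relation file are grouped together, in ascending block order within
each file, which lets the checkpointer fsync each file once, right
after finishing it.

#include <stdint.h>
#include <stdlib.h>

/* Hypothetical dirty-buffer tag: which relation file a page belongs
 * to, and its logical block number within that file. */
typedef struct DirtyBuf
{
	uint32_t	relfile;
	uint32_t	blockno;
} DirtyBuf;

static int
dirtybuf_cmp(const void *a, const void *b)
{
	const DirtyBuf *x = a;
	const DirtyBuf *y = b;

	if (x->relfile != y->relfile)
		return x->relfile < y->relfile ? -1 : 1;	/* #1: group by file */
	if (x->blockno != y->blockno)
		return x->blockno < y->blockno ? -1 : 1;	/* #2: ascending blocks */
	return 0;
}

int
main(void)
{
	DirtyBuf	bufs[] = {{2, 7}, {1, 9}, {1, 3}};

	qsort(bufs, 3, sizeof(DirtyBuf), dirtybuf_cmp);
	/* bufs is now {1,3}, {1,9}, {2,7}: file 1 entirely before file 2 */
	return 0;
}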

> I think patching the core directly is enough for the first round of
> testing, and we can decide on the interface according to the
> results. If one algorithm wins in all cases, we could just
> include it in the core, and extensibility would not be needed.

I agree with this, and with Heikki's remarks also.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sorted writes for checkpoints

2010-10-29 Thread Tom Lane
Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
> Simon's argument in the thread that the todo item points to
> (http://archives.postgresql.org/pgsql-patches/2008-07/msg00123.php) is
> basically that we don't know what the best algorithm is yet and
> benchmarking is a lot of work, so let's just let people do whatever they
> feel like until we settle on the best approach. I think we need to bite
> the bullet and do some benchmarking, and commit one carefully vetted
> patch to the backend.

Yeah, I tend to agree.  We've used hooks in the past to allow people to
add on non-critical functionality.  Fooling with the behavior of
checkpoints is far from noncritical.  Furthermore, it's really hard to
see what a sane hook API would even look like.  As Robert comments,
part of any win here would likely come from controlling the timing of
fsyncs, not just writes.  Controlling all that at arm's length from
the code that actually does it seems likely to be messy and inefficient.

Another point is that I don't see any groundswell of demand out there
for custom checkpoint algorithms.  If we did manage to create a hook
API, how likely is it there would ever be more than one plugin?

regards, tom lane



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci <m_li...@yahoo.it> writes:
> I tried a simple test case:

> create table t (a integer, b text);

> DO $$DECLARE i int;
> BEGIN
> FOR i IN 0..9000 LOOP
> EXECUTE 'create table t' || i || ' ( CHECK (a > ' || i*10 || '
> and a <= ' || (i+1)*10 || ' ) ) INHERITS (t)';
> EXECUTE 'create index tidx' || i || ' ON t' || i || ' (a)';
> END LOOP;
> END$$;

> explain select * from t where a > 1060 and a < 1090;

This is going to be dominated by constraint exclusion checking.  There's
basically no fix for that except a more explicit representation of the
partitioning rules.  If the planner has to make 8999 theorem proofs to
remove the 8999 unwanted partitions from the plan, it's gonna take
awhile.

regards, tom lane



Re: [HACKERS] Tab completion for view triggers in psql

2010-10-29 Thread David Fetter
On Tue, Oct 26, 2010 at 11:55:07AM +0900, Itagaki Takahiro wrote:
> On Tue, Oct 26, 2010 at 11:34 AM, David Fetter <da...@fetter.org> wrote:
>> Do we need to 'add' it?
>> Possibly.  My understanding is that it couldn't really replace it.
>
> Ah, I see.  I was wrong.  We can have modification privileges for
> views even if they have no INSTEAD OF triggers.

Right.

> So, I think your original patch is the best solution.  We could use
> has_table_privilege() additionally, but we need to consider any
> other places if we use it.  For example, DROP privileges, etc.

That seems like a matter for a separate patch.  Looking this over, I
found I'd created a query that can never get used, so please find
enclosed the next version of the patch :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***
*** 303,308  static const SchemaQuery Query_for_list_of_tables = {
--- 303,359 
NULL
  };
  
+ /* The bit masks for the following three functions come from
+  * src/include/catalog/pg_trigger.h.
+  */
+ static const SchemaQuery Query_for_list_of_insertables = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind = 'r' OR (c.relkind = 'v' AND c.relhastriggers AND EXISTS "
+   "(SELECT 1 FROM pg_catalog.pg_trigger t WHERE t.tgrelid = c.oid AND "
+   "t.tgtype | (1 << 2) = t.tgtype))",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   /* namespace */
+   "c.relnamespace",
+   /* result */
+   "pg_catalog.quote_ident(c.relname)",
+   /* qualresult */
+   NULL
+ };
+ 
+ static const SchemaQuery Query_for_list_of_deleteables = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind = 'r' OR (c.relkind = 'v' AND c.relhastriggers AND EXISTS "
+   "(SELECT 1 FROM pg_catalog.pg_trigger t WHERE t.tgrelid = c.oid AND "
+   "t.tgtype | (1 << 3) = t.tgtype))",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   /* namespace */
+   "c.relnamespace",
+   /* result */
+   "pg_catalog.quote_ident(c.relname)",
+   /* qualresult */
+   NULL
+ };
+ 
+ static const SchemaQuery Query_for_list_of_updateables = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind = 'r' OR (c.relkind = 'v' AND c.relhastriggers AND EXISTS "
+   "(SELECT 1 FROM pg_catalog.pg_trigger t WHERE t.tgrelid = c.oid AND "
+   "t.tgtype | (1 << 4) = t.tgtype))",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   /* namespace */
+   "c.relnamespace",
+   /* result */
+   "pg_catalog.quote_ident(c.relname)",
+   /* qualresult */
+   NULL
+ };
+ 
  static const SchemaQuery Query_for_list_of_tisv = {
/* catname */
 	"pg_catalog.pg_class c",
***
*** 333,338  static const SchemaQuery Query_for_list_of_tsv = {
--- 384,404 
NULL
  };
  
+ static const SchemaQuery Query_for_list_of_tv = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind IN ('r', 'v')",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   /* namespace */
+   "c.relnamespace",
+   /* result */
+   "pg_catalog.quote_ident(c.relname)",
+   /* qualresult */
+   NULL
+ };
+ 
  static const SchemaQuery Query_for_list_of_views = {
/* catname */
 	"pg_catalog.pg_class c",
***
*** 630,636  psql_completion(char *text, int start, int end)
   *prev2_wd,
   *prev3_wd,
   *prev4_wd,
!  *prev5_wd;
  
static const char *const sql_commands[] = {
 		"ABORT", "ALTER", "ANALYZE", "BEGIN", "CHECKPOINT", "CLOSE",
 		"CLUSTER",
--- 696,703 
   *prev2_wd,
   *prev3_wd,
   *prev4_wd,
!  *prev5_wd,
!  *prev6_wd;
  
static const char *const sql_commands[] = {
 		"ABORT", "ALTER", "ANALYZE", "BEGIN", "CHECKPOINT", "CLOSE",
 		"CLUSTER",
***
*** 669,675  psql_completion(char *text, int start, int end)
completion_info_charp2 = NULL;
  
/*
!* Scan the input line before our current position for the last five
 * words. According to those we'll make some smart decisions on what the
 * user is probably intending to type. TODO: Use strtokx() to do this.
 */
--- 736,742 
completion_info_charp2 = NULL;
  
/*
!* Scan the input line before our current position for the 
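
For reference, the selcondition strings in the patch above test a
trigger-type bit with the identity (x | bit) = x, which holds exactly
when (x & bit) != 0. A small standalone sketch, using the bit
positions the patch's comment attributes to pg_trigger.h:

#include <stdbool.h>
#include <stdio.h>

/* Bit positions as assigned in src/include/catalog/pg_trigger.h. */
#define TRIGGER_TYPE_INSERT		(1 << 2)
#define TRIGGER_TYPE_DELETE		(1 << 3)
#define TRIGGER_TYPE_UPDATE		(1 << 4)

static bool
has_bit(int tgtype, int bit)
{
	return (tgtype | bit) == tgtype;	/* same as (tgtype & bit) != 0 */
}

int
main(void)
{
	int			tgtype = TRIGGER_TYPE_INSERT | TRIGGER_TYPE_UPDATE;

	printf("%d %d\n",
		   has_bit(tgtype, TRIGGER_TYPE_INSERT),
		   has_bit(tgtype, TRIGGER_TYPE_DELETE));	/* prints 1 0 */
	return 0;
}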

Re: [HACKERS] crash in plancache with subtransactions

2010-10-29 Thread Alvaro Herrera
Excerpts from Tom Lane's message of Wed Oct 27 18:18:06 -0300 2010:
> I wrote:
>> Heikki Linnakangas <heikki.linnakan...@enterprisedb.com> writes:
>>> One simple idea is to keep a flag along with the executor state to
>>> indicate that the executor state is currently in use. Set it just before
>>> calling ExecEvalExpr, and reset afterwards. If the flag is already set
>>> in the beginning of exec_eval_simple_expr, we have recursed, and must
>>> create a new executor state.
>
>> Yeah, the same thought occurred to me in the shower this morning.
>> I'm concerned about possible memory leakage during repeated recursion,
>> but maybe that can be dealt with.
>
> I spent quite a bit of time trying to deal with the memory-leakage
> problem without adding still more bookkeeping overhead.  It wasn't
> looking good, and then I had a sudden insight: if we see that the in-use
> flag is set, we can simply return FALSE from exec_eval_simple_expr.

I tried the original test cases that were handed to me (quite different
from what I submitted here) and they are fixed also.  Thanks.
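
For illustration, a minimal sketch of the in-use-flag idea being
discussed (hypothetical names, not the committed code): if the shared
executor state is already busy, report failure so the caller falls
back to the general evaluation path.

#include <stdbool.h>

/* Hypothetical names, not the committed code. */
typedef struct SimpleExprState
{
	bool		in_use;			/* set while this state is evaluating */
	/* ... cached executor state elided ... */
} SimpleExprState;

static bool
exec_eval_simple(SimpleExprState *st)
{
	if (st->in_use)
		return false;			/* recursed: fall back to the slow path */
	st->in_use = true;
	/* ... evaluate the expression using the cached executor state ... */
	st->in_use = false;
	return true;
}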

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> This is going to be dominated by constraint exclusion checking.  There's
> basically no fix for that except a more explicit representation of the
> partitioning rules.

Damn, I knew that was going to be more complicated :)

So in which case does this patch help? I guess in a multi-index
scenario? childtables.sql is kind of hard to read (I think a FOR loop
would have been more self-explanatory).






Re: [HACKERS] crash in plancache with subtransactions

2010-10-29 Thread Tom Lane
Alvaro Herrera <alvhe...@commandprompt.com> writes:
> Excerpts from Tom Lane's message of Wed Oct 27 18:18:06 -0300 2010:
>> I spent quite a bit of time trying to deal with the memory-leakage
>> problem without adding still more bookkeeping overhead.  It wasn't
>> looking good, and then I had a sudden insight: if we see that the in-use
>> flag is set, we can simply return FALSE from exec_eval_simple_expr.

> I tried the original test cases that were handed to me (quite different
> from what I submitted here) and they are fixed also.  Thanks.

It'd be interesting to know if there's any noticeable slowdown on
affected real-world cases.  (Of course, if they invariably crashed
before, there might not be a way to measure their previous speed...)

regards, tom lane



[HACKERS] [PATCH] Cleanup: Compare pointers to NULL instead of 0

2010-10-29 Thread Marti Raudsepp
Coccinelle found a few places in the code where pointer expressions
were compared to 0. I have changed them to NULL instead.

There was one line that I didn't dare to touch, which looks like a
false positive.

src/backend/regex/regc_lex.c:849:
if (v->now - save == 0 || ((int) c > 0 && (int) c <= v->nsubexp))

I couldn't find the definition of v (struct vars) anywhere. Is it
comparing two pointers here? Should it be v->now == save instead?

But this code doesn't originate from PostgreSQL, so maybe it's not
worth making cleanups here.

Regards,
Marti
From 0e939e93935995e9d47f04698903ae427834c257 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp <ma...@juffo.org>
Date: Fri, 29 Oct 2010 18:59:44 +0300
Subject: [PATCH] Cleanup: Compare pointers to NULL instead of 0

Most of these were discovered with Coccinelle (badzero.cocci from
coccicheck)

Marti Raudsepp
---
 src/backend/utils/adt/tsrank.c  |2 +-
 src/backend/utils/fmgr/dfmgr.c  |2 +-
 src/bin/pg_dump/pg_backup_tar.c |2 +-
 src/port/dirmod.c   |2 +-
 src/timezone/zic.c  |   12 ++--
 5 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/src/backend/utils/adt/tsrank.c b/src/backend/utils/adt/tsrank.c
index d61bcdd..b0417de 100644
--- a/src/backend/utils/adt/tsrank.c
+++ b/src/backend/utils/adt/tsrank.c
@@ -395,7 +395,7 @@ getWeights(ArrayType *win)
 	int			i;
 	float4	   *arrdata;
 
-	if (win == 0)
+	if (win == NULL)
 		return weights;
 
 	if (ARR_NDIM(win) != 1)
diff --git a/src/backend/utils/fmgr/dfmgr.c b/src/backend/utils/fmgr/dfmgr.c
index 566ac46..d08fddd 100644
--- a/src/backend/utils/fmgr/dfmgr.c
+++ b/src/backend/utils/fmgr/dfmgr.c
@@ -616,7 +616,7 @@ find_in_dynamic_libpath(const char *basename)
 	(errcode(ERRCODE_INVALID_NAME),
 	 errmsg("zero-length component in parameter \"dynamic_library_path\"")));
 
-		if (piece == 0)
+		if (piece == NULL)
 			len = strlen(p);
 		else
 			len = piece - p;
diff --git a/src/bin/pg_dump/pg_backup_tar.c b/src/bin/pg_dump/pg_backup_tar.c
index d7e4c46..006f7da 100644
--- a/src/bin/pg_dump/pg_backup_tar.c
+++ b/src/bin/pg_dump/pg_backup_tar.c
@@ -576,7 +576,7 @@ tarWrite(const void *buf, size_t len, TAR_MEMBER *th)
 {
 	size_t		res;
 
-	if (th->zFH != 0)
+	if (th->zFH != NULL)
 		res = GZWRITE((void *) buf, 1, len, th->zFH);
 	else
 		res = fwrite(buf, 1, len, th->nFH);
diff --git a/src/port/dirmod.c b/src/port/dirmod.c
index a8b8a90..7ab3c9a 100644
--- a/src/port/dirmod.c
+++ b/src/port/dirmod.c
@@ -246,7 +246,7 @@ pgsymlink(const char *oldpath, const char *newpath)
 	else
 		strcpy(nativeTarget, oldpath);
 
-	while ((p = strchr(p, '/')) != 0)
+	while ((p = strchr(p, '/')) != NULL)
 		*p++ = '\\';
 
 	len = strlen(nativeTarget) * sizeof(WCHAR);
diff --git a/src/timezone/zic.c b/src/timezone/zic.c
index 00ca6fa..8a95d6a 100644
--- a/src/timezone/zic.c
+++ b/src/timezone/zic.c
@@ -810,7 +810,7 @@ associate(void)
 			 * Note, though, that if there's no rule, a '%s' in the format is
 			 * a bad thing.
 			 */
-			if (strchr(zp->z_format, '%') != 0)
+			if (strchr(zp->z_format, '%') != NULL)
 				error(_("%s in ruleless zone"));
 		}
 	}
@@ -,9 +,9 @@ inzsub(char **fields, int nfields, int iscont)
 	z.z_filename = filename;
 	z.z_linenum = linenum;
 	z.z_gmtoff = gethms(fields[i_gmtoff], _("invalid UTC offset"), TRUE);
-	if ((cp = strchr(fields[i_format], '%')) != 0)
+	if ((cp = strchr(fields[i_format], '%')) != NULL)
 	{
-		if (*++cp != 's' || strchr(cp, '%') != 0)
+		if (*++cp != 's' || strchr(cp, '%') != NULL)
 		{
 			error(_("invalid abbreviation format"));
 			return FALSE;
@@ -1438,9 +1438,9 @@ rulesub(struct rule * rp, const char *loyearp, const char *hiyearp,
 	}
 	else
 	{
-		if ((ep = strchr(dp, '<')) != 0)
+		if ((ep = strchr(dp, '<')) != NULL)
 			rp->r_dycode = DC_DOWLEQ;
-		else if ((ep = strchr(dp, '>')) != 0)
+		else if ((ep = strchr(dp, '>')) != NULL)
 			rp->r_dycode = DC_DOWGEQ;
 		else
 		{
@@ -2826,7 +2826,7 @@ mkdirs(char *argname)
 	if (argname == NULL || *argname == '\0')
 		return 0;
 	cp = name = ecpyalloc(argname);
-	while ((cp = strchr(cp + 1, '/')) != 0)
+	while ((cp = strchr(cp + 1, '/')) != NULL)
 	{
 		*cp = '\0';
 #ifdef WIN32
-- 
1.7.3.2




Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Boszormenyi Zoltan <z...@cybertec.at> writes:
> On the other hand, if I use a similar test case to my original one
> (i.e. the tables are much wider) then the query planning takes
> 1.42 seconds in 9.1 with this patch instead of about 4.7 seconds
> as we observed it using PostgreSQL 9.0.0. The beginning of the gprof
> output now looks like this:

>   %   cumulative   self              self     total
>  time   seconds   seconds    calls   s/call   s/call  name
>  21.13      0.30     0.30   235091     0.00     0.00  SearchCatCache
>   7.04      0.40     0.10  1507206     0.00     0.00  hash_search_with_hash_value
>   3.52      0.45     0.05  2308219     0.00     0.00  AllocSetAlloc

Yeah, for me it looks even worse: oprofile shows about 77% of time in
SearchCatCache.  I poked around a little and it seems that probably most
of the time is going into searches of the STATRELATTINH syscache, which
looks like this:

$13 = {id = 41, cc_next = 0x2b43a60, 
  cc_relname = 0x7f6bc6ed2218 "pg_statistic", cc_reloid = 2619, 
  cc_indexoid = 2696, cc_relisshared = 0 '\000', cc_tupdesc = 0x7f6bc6ed11d8, 
  cc_ntup = 68922, cc_nbuckets = 1024, cc_nkeys = 3, cc_key = {1, 2, 3, 0}, 
  ...

Most of those entries are negative cache entries, since we don't have
any actual stats in this toy example.

I think that we probably should be very circumspect about believing that
this example is still a good guide to what to optimize next; in
particular, in a real-world example with real stats, I'm not sure that
the hot spots will still be in the same places.  I'd advise loading up
some real data and doing more profiling.

However, if the hot spot does stay in SearchCatCache, I can't help
noticing that those bucket chains are looking a bit overloaded ---
sixty-plus entries per bucket ain't good.  Maybe it's time to teach
catcache.c how to reorganize its hashtables once the load factor
exceeds a certain level.  Or more drastically, maybe it should lose
its private hashtable logic and use dynahash.c; I'm not sure at the
moment if the private implementation has any important characteristics
dynahash hasn't got.
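
A sketch of the load-factor reorganization suggested here
(hypothetical structures, not catcache.c): once the average chain
length passes a threshold, double the bucket array and rehash so
chains stay short.

#include <stdlib.h>

/* Hypothetical chained hash table; not catcache.c. */
typedef struct Entry
{
	struct Entry *next;
	unsigned	hash;
} Entry;

typedef struct HashTable
{
	Entry	  **buckets;
	unsigned	nbuckets;	/* power of two */
	unsigned	nentries;
} HashTable;

/* Double the bucket array once the average chain exceeds 2 entries. */
static void
maybe_enlarge(HashTable *ht)
{
	if (ht->nentries / ht->nbuckets < 2)
		return;					/* load factor still acceptable */

	unsigned	newn = ht->nbuckets * 2;
	Entry	  **newb = calloc(newn, sizeof(Entry *));

	for (unsigned i = 0; i < ht->nbuckets; i++)
	{
		Entry	   *e = ht->buckets[i];

		while (e != NULL)
		{
			Entry	   *next = e->next;
			unsigned	b = e->hash & (newn - 1);	/* power-of-two mask */

			e->next = newb[b];
			newb[b] = e;
			e = next;
		}
	}
	free(ht->buckets);
	ht->buckets = newb;
	ht->nbuckets = newn;
}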

regards, tom lane



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
I wrote:
> This is going to be dominated by constraint exclusion checking.

Hmm, maybe I spoke too soon.  With 9000 child tables I get a profile
like this:

samples  %        symbol name
447433   47.1553  get_tabstat_entry
185458   19.5456  find_all_inheritors
53064 5.5925  SearchCatCache
33864 3.5690  pg_strtok
26301 2.7719  hash_search_with_hash_value
22577 2.3794  AllocSetAlloc
6696  0.7057  MemoryContextAllocZeroAligned
6250  0.6587  expression_tree_walker
5141  0.5418  LockReleaseAll
4779  0.5037  get_relation_info
4506  0.4749  MemoryContextAlloc
4467  0.4708  expression_tree_mutator
4136  0.4359  pgstat_initstats
3914  0.4125  relation_excluded_by_constraints

get_tabstat_entry and find_all_inheritors are both obviously O(N^2) in
the number of tables they have to deal with.  However, the constant
factors are small enough that you need a heck of a lot of tables
before they become significant consumers of runtime.  I'm not convinced
that we should be optimizing for 9000-child-table cases.

It'd be worth fixing these if we can do it without either introducing a
lot of complexity, or slowing things down for typical cases that have
only a few tables.  Offhand not sure about how to do either.
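
(Why get_tabstat_entry is O(N^2) over a whole query: each lookup walks
a plain list of per-table entries, so touching N tables costs
1 + 2 + ... + N comparisons. A hypothetical simplification of the
pgstat bookkeeping, not the real code:)

#include <stdlib.h>

typedef struct TabStat
{
	unsigned	relid;
	long		tuples_inserted;
} TabStat;

static TabStat *entries;
static unsigned n_entries;
static unsigned cap_entries;

static TabStat *
get_tabstat_entry(unsigned relid)
{
	for (unsigned i = 0; i < n_entries; i++)	/* the O(N) scan */
		if (entries[i].relid == relid)
			return &entries[i];

	if (n_entries == cap_entries)				/* grow the array */
	{
		cap_entries = cap_entries ? cap_entries * 2 : 64;
		entries = realloc(entries, cap_entries * sizeof(TabStat));
	}
	entries[n_entries] = (TabStat) {relid, 0};
	return &entries[n_entries++];
}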

regards, tom lane



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Robert Haas
On Fri, Oct 29, 2010 at 12:53 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Boszormenyi Zoltan <z...@cybertec.at> writes:
>> On the other hand, if I use a similar test case to my original one
>> (i.e. the tables are much wider) then the query planning takes
>> 1.42 seconds in 9.1 with this patch instead of about 4.7 seconds
>> as we observed it using PostgreSQL 9.0.0. The beginning of the gprof
>> output now looks like this:
>
>>   %   cumulative   self              self     total
>>  time   seconds   seconds    calls   s/call   s/call  name
>>  21.13      0.30     0.30   235091     0.00     0.00  SearchCatCache
>>   7.04      0.40     0.10  1507206     0.00     0.00  hash_search_with_hash_value
>>   3.52      0.45     0.05  2308219     0.00     0.00  AllocSetAlloc
>
> Yeah, for me it looks even worse: oprofile shows about 77% of time in
> SearchCatCache.  I poked around a little and it seems that probably most
> of the time is going into searches of the STATRELATTINH syscache, which
> looks like this:
>
> $13 = {id = 41, cc_next = 0x2b43a60,
>   cc_relname = 0x7f6bc6ed2218 "pg_statistic", cc_reloid = 2619,
>   cc_indexoid = 2696, cc_relisshared = 0 '\000', cc_tupdesc = 0x7f6bc6ed11d8,
>   cc_ntup = 68922, cc_nbuckets = 1024, cc_nkeys = 3, cc_key = {1, 2, 3, 0},
>   ...
>
> Most of those entries are negative cache entries, since we don't have
> any actual stats in this toy example.
>
> I think that we probably should be very circumspect about believing that
> this example is still a good guide to what to optimize next; in
> particular, in a real-world example with real stats, I'm not sure that
> the hot spots will still be in the same places.  I'd advise loading up
> some real data and doing more profiling.
>
> However, if the hot spot does stay in SearchCatCache, I can't help
> noticing that those bucket chains are looking a bit overloaded ---
> sixty-plus entries per bucket ain't good.  Maybe it's time to teach
> catcache.c how to reorganize its hashtables once the load factor
> exceeds a certain level.  Or more drastically, maybe it should lose
> its private hashtable logic and use dynahash.c; I'm not sure at the
> moment if the private implementation has any important characteristics
> dynahash hasn't got.

I'm not sure what's happening in this particular case, but I seem to
remember poking at a case a while back where we were doing a lot of
repeated statistics lookups for the same columns.  If that's also the
case here and if there is some way to avoid it (hang a pointer to
the stats off the node tree somewhere?) we might be able to cut down
on the number of hash probes, as an alternative to or in addition to
making them faster.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> Hmm, maybe I spoke too soon.  With 9000 child tables I get a profile
> like this:


Well, the 9000-table-test-case was meant to check the difference in
performance with/without the patch... I don't see the reason for trying
to optimize such an unrealistic case.

BTW, can someone explain to me in which cases the patch
actually helps?


 



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci <m_li...@yahoo.it> writes:
> BTW, can someone explain to me in which cases the patch
> actually helps?

Cases with lots of irrelevant indexes.  Zoltan's example had 4 indexes
per child table, only one of which was relevant to the query.  In your
test case there are no irrelevant indexes, which is why the runtime
didn't change.

regards, tom lane



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Robert Haas <robertmh...@gmail.com> writes:
> On Fri, Oct 29, 2010 at 12:53 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> However, if the hot spot does stay in SearchCatCache, I can't help
>> noticing that those bucket chains are looking a bit overloaded ---
>> sixty-plus entries per bucket ain't good.  Maybe it's time to teach
>> catcache.c how to reorganize its hashtables once the load factor
>> exceeds a certain level.  Or more drastically, maybe it should lose
>> its private hashtable logic and use dynahash.c; I'm not sure at the
>> moment if the private implementation has any important characteristics
>> dynahash hasn't got.

> I'm not sure what's happening in this particular case, but I seem to
> remember poking at a case a while back where we were doing a lot of
> repeated statistics lookups for the same columns.  If that's also the
> case here and if there is some way to avoid it (hang a pointer to
> the stats off the node tree somewhere?) we might be able to cut down
> on the number of hash probes, as an alternative to or in addition to
> making them faster.

I think there are already layers of caching in the planner to avoid
fetching the same stats entries more than once per query.  The problem
here is that there are so many child tables that even fetching stats
once per table per query starts to add up.  (Also, as I said, I'm
worried that we're being misled by the fact that there are no stats to
fetch --- so we're not seeing the costs of actually doing something with
the stats if they existed.)

regards, tom lane




[HACKERS] Fix for cube picksplit function

2010-10-29 Thread Alexander Korotkov
Hackers,

The GiST picksplit method in the cube contrib module contains a bug in
its implementation of Guttman's split algorithm. It was mentioned before, but it
is still not fixed. The current implementation doesn't cause incorrect
behavior, but the resulting index is very bad: each index scan has to
read a significant part of the index.

test=# create table test (a cube);
test=# insert into test (select cube(array[random(),random(),random()]) from
generate_series(1,100));
test=# create index test_cube_idx on test using gist(a);

Before this patch, index scan of relatively small area requires read of 1235
index pages.

test=# explain (analyze,buffers) select * from test where a <@
cube(array[0.1,0.1,0.1],array[0.15,0.15,0.15]);
  QUERY
PLAN

--
 Bitmap Heap Scan on test  (cost=88.77..3046.68 rows=1000 width=56) (actual
time=17.497..18.188 rows=120 loops=1)
   Recheck Cond: (a <@ '(0.1, 0.1, 0.1),(0.15, 0.15, 0.15)'::cube)
   Buffers: shared hit=1354
   -  Bitmap Index Scan on test_cube_idx  (cost=0.00..88.52 rows=1000
width=0) (actual time=17.428..17.428 rows=120 loops=1)
 Index Cond: (a <@ '(0.1, 0.1, 0.1),(0.15, 0.15, 0.15)'::cube)
 Buffers: shared hit=1235
 Total runtime: 18.407 ms
(7 rows)

Time: 19,501 ms

After this patch, index scan of same area requires read of only 44 index
pages.

test=# explain (analyze,buffers) select * from test where a <@
cube(array[0.1,0.1,0.1],array[0.15,0.15,0.15]);
 QUERY
PLAN


 Bitmap Heap Scan on test  (cost=76.66..3034.57 rows=1000 width=56) (actual
time=0.828..1.543 rows=120 loops=1)
   Recheck Cond: (a <@ '(0.1, 0.1, 0.1),(0.15, 0.15, 0.15)'::cube)
   Buffers: shared hit=163
   -  Bitmap Index Scan on test_cube_idx  (cost=0.00..76.41 rows=1000
width=0) (actual time=0.767..0.767 rows=120 loops=1)
 Index Cond: (a <@ '(0.1, 0.1, 0.1),(0.15, 0.15, 0.15)'::cube)
 Buffers: shared hit=44
 Total runtime: 1.759 ms
(7 rows)

The seg contrib module and the pg_temporal project contain the same bug. But
there is a different fix for them, because there is no need to use Guttman's
split algorithm at all in the unidimensional case.
I'm going to add this patch to the commitfest.
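
(A sketch of the Guttman assignment step the attached one-line patch
fixes. Variable names follow cube.c, but this is an illustration, not
the contrib code: each entry joins the group whose bounding volume
grows least, and the running size of the chosen group must be updated
with that group's new size.)

/* size_l/size_r are the current volumes of the two groups;
 * size_alpha/size_beta are the volumes after adding the entry to the
 * left or right group respectively. */
static void
assign_entry(double *size_l, double *size_r,
			 double size_alpha, double size_beta)
{
	if (size_alpha - *size_l < size_beta - *size_r)
		*size_l = size_alpha;	/* entry joins the left group */
	else
		*size_r = size_beta;	/* the bug stored size_alpha here */
}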


With best regards,
Alexander Korotkov.
*** a/contrib/cube/cube.c
--- b/contrib/cube/cube.c
***
*** 615,621  g_cube_picksplit(PG_FUNCTION_ARGS)
  		else
  		{
  			datum_r = union_dr;
! 			size_r = size_alpha;
  			*right++ = i;
  			v->spl_nright++;
  		}
--- 615,621 
  		else
  		{
  			datum_r = union_dr;
! 			size_r = size_beta;
  			*right++ = i;
  			v->spl_nright++;
  		}



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Alvaro Herrera
Excerpts from Tom Lane's message of Fri Oct 29 14:15:55 -0300 2010:
> I wrote:
>> This is going to be dominated by constraint exclusion checking.
>
> Hmm, maybe I spoke too soon.  With 9000 child tables I get a profile
> like this:
>
> samples  %        symbol name
> 447433   47.1553  get_tabstat_entry

Is there a reason for keeping the pgstat info in plain lists?  We could
use dynahash there too, I think.  It would have more palloc overhead
that way, though (hmm, but perhaps that can be fixed by having a smart
alloc function for it, preallocating a bunch of entries instead of one
by one).

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Alvaro Herrera <alvhe...@commandprompt.com> writes:
> Excerpts from Tom Lane's message of Fri Oct 29 14:15:55 -0300 2010:
>> samples  %        symbol name
>> 447433   47.1553  get_tabstat_entry

> Is there a reason for keeping the pgstat info in plain lists?

Yeah: anything else loses for small numbers of tables per query, which
is the normal case.  I'd guess you'd need ~100 tables touched in
a single transaction before a hashtable is even worth thinking about.

We could possibly adopt a solution similar to the planner's approach for
joinrels: start with a simple list, and switch over to hashing if the
list gets too long.  But I'm really doubtful that it's worth the code
space.  Even with Zoltan's 500-or-so-table case, this wasn't on the
radar screen.
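
(A sketch of the hybrid just described -- hypothetical code, mirroring
the planner's joinrel approach: keep a plain list while it is short,
and switch to hash buckets only past a threshold:)

#include <stddef.h>

#define LIST_THRESHOLD 32
#define NBUCKETS 256

typedef struct Item
{
	unsigned	key;
	struct Item *next;
} Item;

typedef struct Lookup
{
	Item	   *list;				/* used while small */
	Item	   *buckets[NBUCKETS];	/* used after the switch */
	int			nitems;
	int			hashed;				/* switched to hashing yet? */
} Lookup;							/* must start zero-initialized */

static void
lookup_add(Lookup *lk, Item *it)
{
	if (!lk->hashed && ++lk->nitems >= LIST_THRESHOLD)
	{
		/* Threshold reached: move the whole list into hash buckets. */
		Item	   *e = lk->list;

		while (e != NULL)
		{
			Item	   *next = e->next;

			e->next = lk->buckets[e->key % NBUCKETS];
			lk->buckets[e->key % NBUCKETS] = e;
			e = next;
		}
		lk->list = NULL;
		lk->hashed = 1;
	}
	Item	  **head = lk->hashed ? &lk->buckets[it->key % NBUCKETS] : &lk->list;

	it->next = *head;
	*head = it;
}

static Item *
lookup_find(Lookup *lk, unsigned key)
{
	Item	   *e = lk->hashed ? lk->buckets[key % NBUCKETS] : lk->list;

	for (; e != NULL; e = e->next)
		if (e->key == key)
			return e;			/* short list scan, or one bucket chain */
	return NULL;
}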

regards, tom lane



[HACKERS] Tasks for Google Code-In

2010-10-29 Thread Selena Deckelmann
Oregon State University's Open Source Lab (OSL) has contacted me about
coordinating some student work for the Google Code-In, a program targeting
13-18 year olds.

See more here: http://code.google.com/opensource/gci/2010-11/index.html

This program specifically allows documentation, testing and advocacy
activity not normally allowed in the Google Summer of Code. And OSL is
offering to be an umbrella organization -- handling the administrative
details for us.

So!  Do you have some 2-3 week projects in mind appropriate for high school
students to tackle?

Please post some ideas, and we can get this going.

-selena

-- 
http://chesnok.com/daily - me


Re: [HACKERS] [PATCH] Cleanup: Compare pointers to NULL instead of 0

2010-10-29 Thread Tom Lane
Marti Raudsepp <ma...@juffo.org> writes:
> Coccinelle found a few places in the code where pointer expressions
> were compared to 0. I have changed them to NULL instead.

Applied, thanks!

> There was one line that I didn't dare to touch, which looks like a
> false positive.
> src/backend/regex/regc_lex.c:849:
> if (v->now - save == 0 || ((int) c > 0 && (int) c <= v->nsubexp))

Hmm.  I think this is a false positive in the tool: the difference
between two pointers is an integer, not a pointer, so comparison
to an integer is the semantically correct thing to do --- in fact,
I'd argue that s/0/NULL/ ought to lead to a warning here.

However, by the same token, the code overspecifies the computation.
We don't really need to compute the pointer difference, just compare
the pointers for equality.  If it were all integer arithmetic I'd
expect the compiler to figure that out, but given the type issues
maybe not.  So I changed it to v->now == save as you suggest.  The
only advantage of the previous coding would be if we wanted to change
the test to check for some other number of characters consumed, which
seems mighty unlikely.
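
(A minimal standalone illustration of the two spellings -- both well
defined for pointers into the same array:)

#include <stdio.h>

int
main(void)
{
	char		buf[8];
	char	   *save = buf;
	char	   *now = buf;

	printf("%d\n", now - save == 0);	/* old spelling: ptrdiff_t vs 0 */
	printf("%d\n", now == save);		/* preferred: direct equality */
	return 0;
}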

> I couldn't find the definition of v (struct vars) anywhere.

It's near the head of regcomp.c.

> But this code doesn't originate from PostgreSQL, so maybe it's not
> worth making cleanups here.

Well, neither is zic, but we still have to look at the warnings.

regards, tom lane



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Leonardo Francalanci
> Cases with lots of irrelevant indexes.  Zoltan's example had 4 indexes
> per child table, only one of which was relevant to the query.  In your
> test case there are no irrelevant indexes, which is why the runtime
> didn't change.



Mmh... I must be doing something wrong. It looks to me like it's not just
the irrelevant indexes: it's the ORDER BY that counts. If I remove that,
times are the same with and without the patch:

using the test case:

explain select * from inh_parent
where timestamp1 between '2010-04-06' and '2010-06-25'

this one runs in the same time with and without the patch; but adding:

order by timestamp2

made the non-patched version run 3 times slower.

My test case:

create table t (a integer, b integer, c integer, d integer, e text);

DO $$DECLARE i int;
BEGIN
FOR i IN 0..2000 LOOP
EXECUTE 'create table t' || i || ' ( CHECK (a > ' || i*10 || '
and a <= ' || (i+1)*10 || ' ) ) INHERITS (t)';
EXECUTE 'create index taidx' || i || ' ON t' || i || ' (a)';
EXECUTE 'create index tbidx' || i || ' ON t' || i || ' (b)';
EXECUTE 'create index tcidx' || i || ' ON t' || i || ' (c)';
EXECUTE 'create index tdidx' || i || ' ON t' || i || ' (d)';
END LOOP;
END$$;


explain select * from t where a > 1060 and a < 109000

this runs in 1.5 secs with and without the patch. But if I add

order by b

the non-patched version runs in 10 seconds.

Am I getting it wrong?






Re: [HACKERS] [PATCH] Cleanup: Compare pointers to NULL instead of 0

2010-10-29 Thread Alvaro Herrera
Excerpts from Marti Raudsepp's message of Fri Oct 29 13:25:03 -0300 2010:
> Coccinelle found a few places in the code where pointer expressions
> were compared to 0. I have changed them to NULL instead.
>
> There was one line that I didn't dare to touch, which looks like a
> false positive.
>
> src/backend/regex/regc_lex.c:849:
> if (v->now - save == 0 || ((int) c > 0 && (int) c <= v->nsubexp))
>
> I couldn't find the definition of v (struct vars) anywhere. Is it
> comparing two pointers here? Should it be v->now == save instead?

regcomp.c line 198.  Yes, it's a pointer.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] plan time of MASSIVE partitioning ...

2010-10-29 Thread Tom Lane
Leonardo Francalanci <m_li...@yahoo.it> writes:
>> Cases with lots of irrelevant indexes.  Zoltan's example had 4 indexes
>> per child table, only one of which was relevant to the query.  In your
>> test case there are no irrelevant indexes, which is why the runtime
>> didn't change.

> Mmh... I must be doing something wrong. It looks to me like it's not just
> the irrelevant indexes: it's the ORDER BY that counts.

Ah, I oversimplified a bit: actually, if you don't have an ORDER BY or
any mergejoinable join clauses, then the possibly_useful_pathkeys test
in find_usable_indexes figures out that we aren't interested in the sort
ordering of *any* indexes, so the whole thing gets short-circuited.
You need at least the possibility of interest in sorted output from an
indexscan before any of this code runs.

regards, tom lane



Re: [HACKERS] crash in plancache with subtransactions

2010-10-29 Thread Alvaro Herrera
Excerpts from Tom Lane's message of Fri Oct 29 12:54:52 -0300 2010:
> Alvaro Herrera <alvhe...@commandprompt.com> writes:

>> I tried the original test cases that were handed to me (quite different
>> from what I submitted here) and they are fixed also.  Thanks.
>
> It'd be interesting to know if there's any noticeable slowdown on
> affected real-world cases.  (Of course, if they invariably crashed
> before, there might not be a way to measure their previous speed...)

Yeah, the cases that were reported failed with one of:

ERROR: invalid memory alloc request size 18446744073482534916
ERROR: could not open relation with OID 0
ERROR: buffer 228 is not owned by resource owner Portal

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] sorted writes for checkpoints

2010-10-29 Thread Greg Smith

Itagaki Takahiro wrote:
> When I submitted the patch, I tested it on a disk-based RAID-5 machine:
> http://archives.postgresql.org/pgsql-hackers/2007-06/msg00541.php
> But there were no additional benchmarking reports at that time. We still
> need benchmarking before we re-examine the feature. For example, SSDs and
> SSD RAID arrays were not popular at that time, but now they are worth considering.


I did multiple rounds of benchmarking of that, but none of it showed any 
improvement, so I didn't bother reporting the results in detail.  I have 
recently figured out why the performance testing I did of that earlier 
patch probably failed to produce useful results on my system when I was 
testing it back then.  It relates to trivia around how ext3 
handles fsync that's well understood now (the whole cache flushes out 
when one comes in), but wasn't back then.


We have a working set of patches here that both rewrite the checkpoint 
logic to avoid several larger problems with how it works now, as well as 
adding instrumentation that makes it possible to directly measure and 
graph whether methods such as sorting writes provide any improvement or 
not to the process.  My hope is to have those all ready for initial 
submission as part of CommitFest 2010-11, as the main feature addition 
from myself toward improving 9.1.


I have a bunch of background information about this I'm presenting at 
PGWest next week, after which I'll start populating the wiki with more 
details and begin packaging the code too.  I had hoped to revisit the 
checkpoint sorting details after that.  Jeff or yourself are welcome to 
try your own tests in that area, I could use the help.  But I think my 
measurement patches will help you with that considerably once I release 
them in another couple of weeks.  Seeing a graph of latency sync times 
for each file is very informative for figuring out whether a change did 
something useful, more so than just staring at total TPS results.  Such 
latency graphs are what I've recently started to do here, with some 
server-side changes that then feed into gnuplot.


The idea of making something like the sorting logic into a pluggable 
hook seems like a waste of time to me, particularly given that the 
earlier implementation really needed to be allocated a dedicated block 
of shared memory to work well IMHO (and I believe that's still the 
case).  That area isn't where the real problems are here anyway, 
especially on large memory systems.  How the sync logic works is the 
increasingly troublesome part of the checkpoint code, because the 
problem it has to deal with grows proportionately to the size of the 
write cache on the system.  Typical production servers I deal with have 
about 8X as much RAM now as they did in 2007 when I last investigated 
write sorting.  Regular hard drives sure haven't gotten 8X faster since 
then, and battery-backed caches (which used to have enough memory to 
absorb a large portion of a checkpoint burst) have at best doubled in size.


--
Greg Smith, 2ndQuadrant US g...@2ndquadrant.com Baltimore, MD
PostgreSQL Training, Services and Support  www.2ndQuadrant.us





[HACKERS] [PATCH] More Coccinelli cleanups

2010-10-29 Thread Marti Raudsepp
Hi list,

Since my previous Coccinelle cleanup patches have gained positive
feedback I decided to try some more.

This time I wrote my own spatches.

patch 0001 turns (a - b == 0) into (a == b) and similarly with !=
patch 0002 applies the same to the operators <, <=, > and >=

I'm well aware that there's a point where code cleanups defeat their
purpose and become a burden. So this will probably be my last one,
I'll go to doing productive things instead. :)

Regards,
Marti
From aadcc5c31df00e940a89abf392e7c71b54e00b6a Mon Sep 17 00:00:00 2001
From: Marti Raudsepp ma...@juffo.org
Date: Sat, 30 Oct 2010 01:27:06 +0300
Subject: [PATCH 1/2] Cleanup: Simplify comparisons (a - b == 0) to (a == b)

This change was done using Coccinelle using the following spatch:

virtual org,diff
@@
expression a, b;
@@
(
-  a - b == 0
+  a == b
|
-  a - b != 0
+  a != b
)

Marti Raudsepp
---
 contrib/pgcrypto/crypt-des.c |4 ++--
 src/backend/access/gin/gindatapage.c |4 ++--
 src/backend/regex/regc_lex.c |2 +-
 src/timezone/zic.c   |2 +-
 4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/contrib/pgcrypto/crypt-des.c b/contrib/pgcrypto/crypt-des.c
index 1f49743..fc3c0e3 100644
--- a/contrib/pgcrypto/crypt-des.c
+++ b/contrib/pgcrypto/crypt-des.c
@@ -669,7 +669,7 @@ px_crypt_des(const char *key, const char *setting)
 	 * zeros.
 	 */
 	q = (uint8 *) keybuf;
-	while (q - (uint8 *) keybuf - 8)
+	while (q - (uint8 *) keybuf != 8)
 	{
 		if ((*q++ = *key << 1))
 			key++;
@@ -702,7 +702,7 @@ px_crypt_des(const char *key, const char *setting)
 			 * And XOR with the next 8 characters of the key.
 			 */
 			q = (uint8 *) keybuf;
-			while (q - (uint8 *) keybuf - 8 && *key)
+			while (q - (uint8 *) keybuf != 8 && *key)
 				*q++ ^= *key++ << 1;
 
 			if (des_setkey((char *) keybuf))
diff --git a/src/backend/access/gin/gindatapage.c b/src/backend/access/gin/gindatapage.c
index f74373c..d0873ce 100644
--- a/src/backend/access/gin/gindatapage.c
+++ b/src/backend/access/gin/gindatapage.c
@@ -285,7 +285,7 @@ GinDataPageAddItem(Page page, void *data, OffsetNumber offset)
 	else
 	{
 		ptr = GinDataPageGetItem(page, offset);
-		if (maxoff + 1 - offset != 0)
+		if (maxoff + 1 != offset)
 			memmove(ptr + GinSizeOfItem(page), ptr, (maxoff - offset + 1) * GinSizeOfItem(page));
 	}
 	memcpy(ptr, data, GinSizeOfItem(page));
@@ -494,7 +494,7 @@ dataSplitPage(GinBtree btree, Buffer lbuf, Buffer rbuf, OffsetNumber off, XLogRe
 	else
 	{
 		ptr = vector + (off - 1) * sizeofitem;
-		if (maxoff + 1 - off != 0)
+		if (maxoff + 1 != off)
 			memmove(ptr + sizeofitem, ptr, (maxoff - off + 1) * sizeofitem);
 		if (GinPageIsLeaf(lpage))
 		{
diff --git a/src/backend/regex/regc_lex.c b/src/backend/regex/regc_lex.c
index da3ff0b..3360cfb 100644
--- a/src/backend/regex/regc_lex.c
+++ b/src/backend/regex/regc_lex.c
@@ -846,7 +846,7 @@ lexescape(struct vars * v)
 			if (ISERR())
 FAILW(REG_EESCAPE);
 			/* ugly heuristic (first test is exactly 1 digit?) */
-			if (v->now - save == 0 || ((int) c > 0 && (int) c <= v->nsubexp))
+			if (v->now == save || ((int) c > 0 && (int) c <= v->nsubexp))
 			{
 NOTE(REG_UBACKREF);
 RETV(BACKREF, (chr) c);
diff --git a/src/timezone/zic.c b/src/timezone/zic.c
index 8a95d6a..694b45a 100644
--- a/src/timezone/zic.c
+++ b/src/timezone/zic.c
@@ -2780,7 +2780,7 @@ newabbr(const char *string)
 		while (isascii((unsigned char) *cp) &&
 			   isalpha((unsigned char) *cp))
 			++cp;
-		if (cp - string == 0)
+		if (cp == string)
 			wp = _("time zone abbreviation lacks alphabetic at start");
 		if (noise && cp - string > 3)
 			wp = _("time zone abbreviation has more than 3 alphabetics");
-- 
1.7.3.2

From 1d2ab651cf64b44d0ba68fc500fee0500f9487e4 Mon Sep 17 00:00:00 2001
From: Marti Raudsepp ma...@juffo.org
Date: Sat, 30 Oct 2010 02:22:03 +0300
Subject: [PATCH 2/2] Cleanup: Simplify comparisons (a - b > 0) to (a > b) etc

This change was done using Coccinelle using the following spatch:

virtual org,diff
@@
expression a, b;
@@
(
-  a - b < 0
+  a < b
|
-  a - b <= 0
+  a <= b
|
-  a - b > 0
+  a > b
|
-  a - b >= 0
+  a >= b
)

Marti Raudsepp
---
 contrib/pgcrypto/pgp-decrypt.c |2 +-
 src/backend/postmaster/bgwriter.c  |2 +-
 src/backend/utils/adt/formatting.c |2 +-
 src/backend/utils/adt/varlena.c|2 +-
 src/bin/scripts/createlang.c   |6 +++---
 src/bin/scripts/droplang.c |6 +++---
 src/test/examples/testlo.c |4 ++--
 7 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/contrib/pgcrypto/pgp-decrypt.c b/contrib/pgcrypto/pgp-decrypt.c
index c9aa6cd..ffd3287 100644
--- a/contrib/pgcrypto/pgp-decrypt.c
+++ b/contrib/pgcrypto/pgp-decrypt.c
@@ -747,7 +747,7 @@ copy_crlf(MBuf *dst, uint8 *data, int len, int *got_cr)
 			p = tmpbuf;
 		}
 	}
-	if (p - tmpbuf > 0)
+	if (p > tmpbuf)
 	{
 		res = mbuf_append(dst, tmpbuf, p - tmpbuf);
 		if (res < 0)
diff --git a/src/backend/postmaster/bgwriter.c b/src/backend/postmaster/bgwriter.c
index 0690ab5..c4232c7 100644
---