Re: [PATCHES] Bundle of patches

2007-02-03 Thread Bruce Momjian

I assume we have taken all the patches from here we want.

---

Teodor Sigaev wrote:
 The 1C (http://www.1c.ru/) company kindly permits to publish a set of patches
 we (Oleg and me) developed during our work on porting of the '1C:Enterprise' 
 system to support the PostgreSQL database.
 
 We would like to suggest to commit they to HEAD.
 
 
 1) Typmod for user-defined types
http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
Patch is based on ideas from
http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php
 
Patch adds to type definition two optional function: typmodinput and
typmodoutput. That allows to develop user-defined types with type's
modificators. Built-in types use typmod input/output functions too.
Typmod internally is represented  as non-negative int4 value,
but patch allows to point list of integer in type definition. So,
NUMERIC type works with a help of typmodin/typmodout function.
 
 
 2) ORDER BY .. [ NULLS ( FIRST | LAST ) ]
http://www.sigaev.ru/misc/NULLS_82-0.5.gz
Allow to sort NULLs as greater or lesser than any value. The goal was to
simplificate migrations from MySQL/MS SQL which think that NULL is less.
Also, syntax conforms to SQL2003. It operate on gram.y level, and
adds 'field is [not] null' qualification to sortClause.
Note, to allow queries like 'select .. union .. order by f nulls first'
pgsql now can rewrite that query to
'select * from (select .. union ..) order by f nulls first'. This solves 
 the
problem with 'resjunk' column in SelectStmt-sortClause.
 
 3) Allow to use index for IS [NOT] NULL
http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
Initially patch was developed by Martijn van Oosterhout 
 kleptog@svana.org.
But it's reworked  and support of searching NULLS to GiST too. Patch
adds new column named amsearchnull to pg_am. To recognize IS NULL clause
ScanKey-sk_flags contains (SK_ISNULL  SK_INDEXFINDNULL) and
ScanKey-sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
ScanKey-sk_strategy == BTLessStrategyNumber. Thats because NULLs are
treated greater than any value. It might be look some odd that
for IS [NOT] NULL clauses we use Btree strategy numbers even for GiST,
but if sk_flags contains SK_ISNULL then we never call user-defined 
 functions.
 
 4) OR clauses optimizations
http://www.sigaev.ru/misc/OR_82-0.6.gz
Patch can suggest new indexpaths to optimizer for ORed clauses. Patch uses
generate_bitmapscan and predicate_implied_by/predicate_refuted_by 
 machineries
 
 4.1) Allow any useful common restriction clauses to be extracted from
OR-of-AND quals. Also, it allows to combine several different
operations to one which can be used in index scan.
SELECT
  a, b
FROM
  tst
WHERE ( a = 5 ) OR ( a  5 AND b  5 )
ORDER BY a, b
LIMIT 20;
Limit  (cost=0.00..2.95 rows=20 width=8) (actual time=0.271..0.677 rows=20 
 loops=1)
 -  Index Scan using abidx on tst  (cost=0.00..3671.26 rows=24878 
 width=8) 
 (actual time=0.265..0.611 rows=20 loops=1)
   Index Cond: (a = 5)
   Filter: ((a = 5) OR ((a  5) AND (b  5)))
 4.2) When OR clauses aren't intersect and use the same index, it's possible
to just concatenate results of indexscans. For that, now postgres may use
Append node. Append node is modified to have a pathkeys.
 
SELECT
  a
FROM
  tst
WHERE ( a  6 AND a  61000 ) OR ( a  2 AND a  21000 )
ORDER BY ASC
LIMIT 20;
Limit  (cost=0.00..39.86 rows=20 width=4) (actual time=0.364..0.883 
 rows=20 
 loops=1)
 -  Result  (cost=0.00..4001.55 rows=2008 width=4) (actual 
 time=0.359..0.824 
 rows=20 loops=1)
   -  Append  (cost=0.00..4001.55 rows=2008 width=4) (actual 
 time=0.349..0.742 rows=20 loops=1)
 -  Index Scan using aidx on tst  (cost=0.00..2000.42 
 rows=990 
 width=4) (actual time=0.346..0.684 rows=20 loops=1)
   Index Cond: ((a  2) AND (a  21000))
 -  Index Scan using aidx on tst  (cost=0.00..2001.12 
 rows=1018 
 width=4) (never executed)
   Index Cond: ((a  6) AND (a  61000))
 
Also, if there is a 'ORDER BY' clause, childs nodes may be ordered by theys
ranges (compare plan with previous one).
SELECT
  a
FROM
  tst
WHERE ( a  6 AND a  61000 ) OR ( a  2 AND a  21000 )
ORDER BY a DESC
LIMIT 20;
Limit  (cost=0.00..39.86 rows=20 width=4) (actual time=0.162..0.651 
 rows=20 
 loops=1)
 -  Result  (cost=0.00..4001.55 rows=2008 width=4) (actual 
 time=0.157..0.589 
 rows=20 loops=1)
   -  Append  (cost=0.00..4001.55 rows=2008 width=4) (actual 
 time=0.149..0.511 rows=20 loops=1)
 -  

Re: [pgsql-patches] [HACKERS] [PATCHES] Bundle of patches

2007-01-10 Thread Teodor Sigaev

Nice, thanks a lot.

Tom Lane wrote:

Teodor Sigaev [EMAIL PROTECTED] writes:

Just a freshing for clean applying..
http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz


Applied with some revisions, and pg_dump support and regression tests
added.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-30 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 Just a freshing for clean applying..
 http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz

Applied with some revisions, and pg_dump support and regression tests
added.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-29 Thread Teodor Sigaev

Just a freshing for clean applying..
http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz

Is any objections to commit?
--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Bundle of patches

2006-12-29 Thread Teodor Sigaev

This is not responding to my concern.  What you presented was an

 Sorry, I see your point now.

Is that test enough? Or I should make more?

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-29 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 Just a freshing for clean applying..
 http://www.sigaev.ru/misc/user_defined_typmod-0.11.gz
 Is any objections to commit?

There's still a lot I don't particularly care for here (lack of
documentation being the biggest), but I'll make a pass at cleaning it up.

One thought I had after a quick reading of the patch is that there
doesn't seem to be a whole lot of percentage in trying to move the
typmod handling for time/timestamp/interval types into callable
functions, because we still have to special-case the darn things
thanks to the SQL spec's oddball syntax requirements.  For instance
in format_type_be() you've got

  case TIMETZOID:
  if (with_typemod)
! buf = psnprintf(50, time(%d) with time zone,
! typemod);
  else
  buf = pstrdup(time with time zone);
  break;

changed to

  case TIMETZOID:
  if (with_typemod)
! buf = printTypmod(time, typemod, typeform-typmodoutput);
  else
  buf = pstrdup(time with time zone);
  break;

which hardly seems like much of an improvement.  If we could have gotten
rid of the switch() branch for TIMETZOID entirely, then we'd be
somewhere, but there's no chance because the typmod-less case still has
to be special.  So I'm sort of inclined to leave the processing of these
datatypes as-is, and only use the typmodin/typmodout infrastructure for
datatypes that follow the canonical syntax of type_name(int[,int ...]).
Thoughts?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-21 Thread Teodor Sigaev

0.9 doesn't apply cleanly after Peter's changes, so, new version

http://www.sigaev.ru/misc/user_defined_typmod-0.10.gz

Teodor Sigaev wrote:

Perhaps an array of int4 would be better?  How much

Done
http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz


The patch needs more cleanup before applying, too, eg make comments
match code, get rid of unused keywords added to gram.y.


Cleaned.




--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Bundle of patches

2006-12-20 Thread Teodor Sigaev

   Perhaps an array of int4 would be better?  How much

Done
http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz


The patch needs more cleanup before applying, too, eg make comments
match code, get rid of unused keywords added to gram.y.


Cleaned.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-20 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 Perhaps an array of int4 would be better?  How much

 Done
 http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz

 The patch needs more cleanup before applying, too, eg make comments
 match code, get rid of unused keywords added to gram.y.

 Cleaned.

OK.  I'm up to my rear in the opclass/opfamily rewrite, but will take
another look at the typmod code as soon as I have a working HEAD again ;-)

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Bundle of patches

2006-12-05 Thread Teodor Sigaev

useful optimizations most of the time.  Have you done any benchmarking
to find out what the cost is when the optimizations don't succeed?
Test runs on my notebook with PIII/512Mb, FreeBSD 6.1, postgres was compiled 
with -O0 and --enable-debug --enable-cassert


Test results (ms)
 |[1]|[2]|[3]
---
[I]   8.3| 88490.275 | 46684.972 |  21.563
[II]  8.3 splited query  |  0.492|   0.560   |  0.635
[III] 8.3 rewrited query |  0.523|   0.952   |  1.004
[IV]  8.3+patch  |  0.444|   0.410   |  0.679

Query description
All queries are variants of sinple OR clause with index support. In tests
for 8.3 and 8.3+patch usial queries are used. '8.2 splited query' test executes
separate queries for each OR-ed clause (in supposition that apllication should
concatenate results). '8.3 rewrited query' test uses rewritten queries with
the same result of execution and queries are fast as I could find.

Test result shows a big advantage of using such kind of optimization and
doesn't demonstrate performance gap on optimization stage.

Test suite:
# select * into foo from generate_series(1,10) as f1, generate_series(1,100) 
as f2;

# create index idx on foo (f1,f2);
# vacuum analyze foo;

Queries and plans in details:
==[I] Without patch===
[1]
# explain analyze select * from foo where (f1=7 and f295) or f17 order 
by f1, f2 limit 10;
Limit  (cost=0.00..45.94 rows=10 width=8) (actual time=88490.044..88490.109 
rows=10 loops=1)
 -  Index Scan using idx on foo  (cost=0.00..14113503.78 rows=3071985 
width=8) (actual time=88490.035..88490.072 rows=10 loops=1)

  Filter: (((f1 = 7) AND (f2  95)) OR (f1  7))
Total runtime: 88490.275 ms

[2]
# explain analyze select * from foo where (f14 and f15) or (f16 
and f17) order by f1, f2 limit 10;
Limit  (cost=0.00..69.91 rows=10 width=8) (actual time=46684.725..46684.813 
rows=10 loops=1)
-  Index Scan using idx on foo  (cost=0.00..14138503.78 rows=2022419 
width=8) (actual time=46684.717..46684.768 rows=10 loops=1)
Filter: (((f1  4) AND (f1  5)) OR ((f1  6) AND (f1  
7)))

Total runtime: 46684.972 ms

[3]
# explain  analyze select * from foo where (f149980 and f15) or (f169980 
and f17) order by f1, f2 limit 10;
Limit  (cost=13581.04..13581.07 rows=10 width=8) (actual time=20.767..20.809 
rows=10 loops=1)
-  Sort  (cost=13581.04..13592.03 rows=4393 width=8) (actual 
time=20.759..20.773 rows=10 loops=1)

Sort Key: f1, f2
-  Bitmap Heap Scan on foo  (cost=102.12..13315.25 rows=4393 width=8) 
(actual time=3.495..11.911 rows=3800 loops=1)
Recheck Cond: (((f1  49980) AND (f1  5)) OR ((f1  69980) AND 
(f1  7)))
-  BitmapOr  (cost=102.12..102.12 rows=4393 width=0) (actual 
time=3.415..3.415 rows=0 loops=1)
-  Bitmap Index Scan on idx  (cost=0.00..51.01 rows=2191 
width=0) (actual time=1.887..1.887 rows=1900 loops=1)

Index Cond: ((f1  49980) AND (f1  5))
-  Bitmap Index Scan on idx  (cost=0.00..51.12 rows=2202 
width=0) (actual time=1.519..1.519 rows=1900 loops=1)

Index Cond: ((f1  69980) AND (f1  7))
Total runtime: 21.563 ms

=[II] Without patch, Two queries===
Here plans are omitted because they are simple index scans.
[1]
# explain analyze select * from foo where f1=7 and f295 order by f1, f2 
limit 10;

Total runtime: 0.219 ms
# explain analyze select * from foo where f17 order by f1, f2 limit 10;
Total runtime: 0.273 ms

[2]
#  explain analyze select * from foo where  f14 and f15 order by f1, 
f2 limit 10;

Total runtime: 0.245 ms
# explain analyze select * from foo where  f16 and f17 order by f1, f2 
limit 10;

Total runtime: 0.315 ms

[3]
# explain analyze select * from foo where f149980 and f15 order by f1, f2 
limit 10;

Total runtime: 0.292 ms
# explain analyze seleCt * from foo where f169980 and f17 order by f1, f2 
limit 10;

Total runtime: 0.343 ms


==[III]Without patch, rewrited query===
[1]
# explain analyze select * from foo where ((f1=7 and f295) or f17) and 
f1=7 order by f1, f2 limit 10;

Limit  (cost=0.00..50.33 rows=10 width=8) (actual time=0.320..0.387 rows=10 
loops=1)
-  Index Scan using idx on foo  (cost=0.00..3974489.10 rows=789613 
width=8) (actual time=0.313..0.348 rows=10 loops=1)

Index Cond: (f1 = 7)
Filter: (((f1 = 7) AND (f2  95)) OR (f1  7))
Total runtime: 0.523 ms

[2]
# explain  analyze
  select * from (
select * from (select * from foo where f14 and f15 order by f1, 
f2 limit 10) as t1

union all
select * from (select * from foo where f16 and f17 order by f1, 
f2 limit 10) as t2

  ) as t
  order by f1, f2 limit 

Re: [PATCHES] Bundle of patches

2006-12-05 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 useful optimizations most of the time.  Have you done any benchmarking
 to find out what the cost is when the optimizations don't succeed?

 Test runs on my notebook with PIII/512Mb, FreeBSD 6.1, postgres was compiled 
 with -O0 and --enable-debug --enable-cassert

This is not responding to my concern.  What you presented was an
advertisement for the cases where the patch is able to find a better
plan.  What I want to know about is how much planning time is added
for queries that it's *not* able to improve.  EXPLAIN ANALYZE output
doesn't address that point because it doesn't show planning time.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[PATCHES] Bundle of patches

2006-12-04 Thread Teodor Sigaev

The 1C (http://www.1c.ru/) company kindly permits to publish a set of patches
we (Oleg and me) developed during our work on porting of the '1C:Enterprise' 
system to support the PostgreSQL database.


We would like to suggest to commit they to HEAD.


1) Typmod for user-defined types
  http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
  Patch is based on ideas from
  http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
  http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php

  Patch adds to type definition two optional function: typmodinput and
  typmodoutput. That allows to develop user-defined types with type's
  modificators. Built-in types use typmod input/output functions too.
  Typmod internally is represented  as non-negative int4 value,
  but patch allows to point list of integer in type definition. So,
  NUMERIC type works with a help of typmodin/typmodout function.


2) ORDER BY .. [ NULLS ( FIRST | LAST ) ]
  http://www.sigaev.ru/misc/NULLS_82-0.5.gz
  Allow to sort NULLs as greater or lesser than any value. The goal was to
  simplificate migrations from MySQL/MS SQL which think that NULL is less.
  Also, syntax conforms to SQL2003. It operate on gram.y level, and
  adds 'field is [not] null' qualification to sortClause.
  Note, to allow queries like 'select .. union .. order by f nulls first'
  pgsql now can rewrite that query to
  'select * from (select .. union ..) order by f nulls first'. This solves the
  problem with 'resjunk' column in SelectStmt-sortClause.

3) Allow to use index for IS [NOT] NULL
  http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
  Initially patch was developed by Martijn van Oosterhout kleptog@svana.org.
  But it's reworked  and support of searching NULLS to GiST too. Patch
  adds new column named amsearchnull to pg_am. To recognize IS NULL clause
  ScanKey-sk_flags contains (SK_ISNULL  SK_INDEXFINDNULL) and
  ScanKey-sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
  ScanKey-sk_strategy == BTLessStrategyNumber. Thats because NULLs are
  treated greater than any value. It might be look some odd that
  for IS [NOT] NULL clauses we use Btree strategy numbers even for GiST,
  but if sk_flags contains SK_ISNULL then we never call user-defined functions.

4) OR clauses optimizations
  http://www.sigaev.ru/misc/OR_82-0.6.gz
  Patch can suggest new indexpaths to optimizer for ORed clauses. Patch uses
  generate_bitmapscan and predicate_implied_by/predicate_refuted_by machineries

4.1) Allow any useful common restriction clauses to be extracted from
  OR-of-AND quals. Also, it allows to combine several different
  operations to one which can be used in index scan.
  SELECT
a, b
  FROM
tst
  WHERE ( a = 5 ) OR ( a  5 AND b  5 )
  ORDER BY a, b
  LIMIT 20;
  Limit  (cost=0.00..2.95 rows=20 width=8) (actual time=0.271..0.677 rows=20 
loops=1)
   -  Index Scan using abidx on tst  (cost=0.00..3671.26 rows=24878 width=8) 
(actual time=0.265..0.611 rows=20 loops=1)

 Index Cond: (a = 5)
 Filter: ((a = 5) OR ((a  5) AND (b  5)))
4.2) When OR clauses aren't intersect and use the same index, it's possible
  to just concatenate results of indexscans. For that, now postgres may use
  Append node. Append node is modified to have a pathkeys.

  SELECT
a
  FROM
tst
  WHERE ( a  6 AND a  61000 ) OR ( a  2 AND a  21000 )
  ORDER BY ASC
  LIMIT 20;
  Limit  (cost=0.00..39.86 rows=20 width=4) (actual time=0.364..0.883 rows=20 
loops=1)
   -  Result  (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.359..0.824 
rows=20 loops=1)
 -  Append  (cost=0.00..4001.55 rows=2008 width=4) (actual 
time=0.349..0.742 rows=20 loops=1)
   -  Index Scan using aidx on tst  (cost=0.00..2000.42 rows=990 
width=4) (actual time=0.346..0.684 rows=20 loops=1)

 Index Cond: ((a  2) AND (a  21000))
   -  Index Scan using aidx on tst  (cost=0.00..2001.12 rows=1018 
width=4) (never executed)

 Index Cond: ((a  6) AND (a  61000))

  Also, if there is a 'ORDER BY' clause, childs nodes may be ordered by theys
  ranges (compare plan with previous one).
  SELECT
a
  FROM
tst
  WHERE ( a  6 AND a  61000 ) OR ( a  2 AND a  21000 )
  ORDER BY a DESC
  LIMIT 20;
  Limit  (cost=0.00..39.86 rows=20 width=4) (actual time=0.162..0.651 rows=20 
loops=1)
   -  Result  (cost=0.00..4001.55 rows=2008 width=4) (actual time=0.157..0.589 
rows=20 loops=1)
 -  Append  (cost=0.00..4001.55 rows=2008 width=4) (actual 
time=0.149..0.511 rows=20 loops=1)
   -  Index Scan Backward using aidx on tst  (cost=0.00..2001.12 
rows=1018 width=4) (actual time=0.145..0.450 rows=20 loops=1)

 Index Cond: ((a  6) AND (a  61000))
   -  Index Scan Backward using aidx on tst  (cost=0.00..2000.42 
rows=990 width=4) (never executed)

 Index Cond: ((a  2) AND (a  

Re: [PATCHES] Bundle of patches

2006-12-04 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 2) ORDER BY .. [ NULLS ( FIRST | LAST ) ]
http://www.sigaev.ru/misc/NULLS_82-0.5.gz

i haven't looked at the other patches yet, but this one is just horrid :-(
Instead of creating a proper parse-tree representation of the NULLS
FIRST/LAST option, you've kluged it to convert the syntax into a
double-column ordering using foo IS [NOT] NULL as the first column.
This has obvious semantic disdvantages (what if foo is an expensive
function?); but the real problem is that there's no way for the planner
to reason about ordering in this representation.  This patch would
guarantee that an ORDER BY with the NULLS option couldn't use an
indexscan, even if the index sorts nulls at the correct end.

I think a reasonable implementation requires introducing an explicit
concept of nulls-first-or-last into the planner's model of sort order,
and probably into btree index opclasses as well.  There is already some
discussion about this in the archives with respect to the problem of
handling descending-sort opclasses nicely.  (If you just switch the
operators to make a descending-sort opclass, then nulls end up at the
other end from where they would otherwise, meaning that a backwards
index scan does something unexpected.  We have to fix that or else
descending-sort opclasses can break mergejoins...)

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Bundle of patches

2006-12-04 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 3) Allow to use index for IS [NOT] NULL
http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
Initially patch was developed by Martijn van Oosterhout 
 kleptog@svana.org.
But it's reworked  and support of searching NULLS to GiST too. Patch
adds new column named amsearchnull to pg_am. To recognize IS NULL clause
ScanKey-sk_flags contains (SK_ISNULL  SK_INDEXFINDNULL) and
ScanKey-sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
ScanKey-sk_strategy == BTLessStrategyNumber. Thats because NULLs are
treated greater than any value.

And what happens when we implement NULLS FIRST/LAST correctly?  This is
really a poor choice of representation.

One thing I find questionable about this is the assumption that indexes
can support foo IS NULL and foo IS NOT NULL searches equally
conveniently.  This is demonstrably false for, say, hash.  (Hash could
store null keys by assigning them a fixed hashcode, say 0.  Then it
would be able to handle IS NULL searches, but not IS NOT NULL, because
it can't do full-index scans.)

I am not real sure that there is any point in making IS NOT NULL an
indexable condition.  We don't support  as an indexable condition,
and no one's yelled about that.  It might be best just to simplify
the patch to do IS NULL only.  But if we are going to support both,
we probably have to have two pg_am flags not one.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Bundle of patches

2006-12-04 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 4) OR clauses optimizations
http://www.sigaev.ru/misc/OR_82-0.6.gz

This seems kinda ugly --- it looks very expensive and unlikely to find
useful optimizations most of the time.  Have you done any benchmarking
to find out what the cost is when the optimizations don't succeed?

I guess my big problem with it is that it's a huge expansion of the
single ugliest, most ad-hoc part of the planner, ie, orindxqual.c.
And the new code is just as ad-hoc as what was there before ...

Also, what's with the pull_tlist bit?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] Bundle of patches

2006-12-04 Thread Teodor Sigaev

This has obvious semantic disdvantages (what if foo is an expensive
function?); 

Agree.


but the real problem is that there's no way for the planner
to reason about ordering in this representation.  This patch would
guarantee that an ORDER BY with the NULLS option couldn't use an
indexscan, even if the index sorts nulls at the correct end.


create table foo ( i int);
insert into foo values (1), (5), (NULL);
create index fooidx on foo (i);
set enable_seqscan=off;
set enable_bitmapscan=off;
explain select i from foo order by i asc nulls last;
QUERY PLAN
---
 Index Scan using fooidx on foo  (cost=0.00..12.05 rows=3 width=4)
explain select i from foo order by i desc nulls first;
 QUERY PLAN

 Index Scan Backward using fooidx on foo  (cost=0.00..12.05 rows=3 width=4)

Patch is smart enough about native NULL's ordering, so it adds quals only if 
it needed.


Index support of non-native NULL's ordering, IMHO, has some correlation with 
suggested OR-patch. Sorting by ASC NULLS FIRST may done by two index scan with 
append node:

Append
Index Scan
Cond: foo IS NULL
Index Scan
Cond: foo IS NOT NULL


I think a reasonable implementation requires introducing an explicit
concept of nulls-first-or-last into the planner's model of sort order,

Agree, but I tried to keep patches independent as possible...


If we will have agreement about ways to resolve, I'll will time to work
further in foreseeable future.
--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Bundle of patches

2006-12-04 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 1) Typmod for user-defined types
http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
Patch is based on ideas from
http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php

This one seems generally workable, but I really dislike the approach
that's been used for passing typmod arguments to the typmod_in function.
Representing them with an internal parameter means it'll be forever
impossible to write typmod functions in anything but C, which seems an
ugly restriction.  Perhaps an array of int4 would be better?  How much
flexibility do we really want to provide for typmod arguments?  Allowing
full expr_list in the grammar seems less than sane, considering the
result is still going to have to pack into 32 bits.

The patch needs more cleanup before applying, too, eg make comments
match code, get rid of unused keywords added to gram.y.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-04 Thread Tom Lane
Teodor Sigaev [EMAIL PROTECTED] writes:
 but the real problem is that there's no way for the planner
 to reason about ordering in this representation.  This patch would
 guarantee that an ORDER BY with the NULLS option couldn't use an
 indexscan, even if the index sorts nulls at the correct end.

 Patch is smart enough about native NULL's ordering, so it adds quals
 only if it needed.

[ looks again... ]  Oh, so you've hard-wired the assumption that ASC/DESC
correlate to NULLS FIRST/LAST right into the grammar :-(

This is not a workable base for future extension.  To be able to do
anything interesting with descending-order opclasses, we really have to
separate ASC/DESC from NULLS FIRST/LAST, not bind them permanently
together.

BTW, another sufficient reason to reject this approach is that a query
entered as ORDER BY foo NULLS FIRST would come out as something quite
different, if reverse-listed by ruleutils.c.  We've paid the price of
that sort of shortsightedness too many times in the past: implementation
details should not get exposed in the reverse-listing.  As a rule of
thumb, if you are putting a transformation involving semantic knowledge
into gram.y, you are probably putting it in the wrong place.  (One of
the reasons I like the typmod patch is that it helps pull a lot of
inappropriate knowledge out of gram.y ...)

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] Bundle of patches

2006-12-04 Thread Teodor Sigaev

This seems kinda ugly --- it looks very expensive and unlikely to find
useful optimizations most of the time.  Have you done any benchmarking
to find out what the cost is when the optimizations don't succeed?


Yep, it's a big win with order and limit specified.
Let table (a,b) has index over (a,b), so,
queries with ((a=3 AND b10) OR a3) ORDER BY a,b may be done with one pass of 
index scan with condition a=3 and filter ((a=3 and b10) or a3). And scan out 
is already sorted. The single way to execute it without patch is bitmap or scan 
over two index scans and following ordering. If limit is small enough then there 
is a lot  unnecessary work for executor. Thats case may be found by 
find_common_quals() which is fast enough.


Simplest case of second kind is ( a3 or a5 ). If it's possible to prove that 
set of rows for first condition and set for second are not intersected then 
output of correlated index scans can be simply joined/appended. In this case, we
broaden applicability of Append node. What is more, it's possible to order index 
scans by conditions, which allows do not use sort node.


I understand, that proving of non-intersecting and ordering by conditions is an 
expensive operation because of using predicate_implied_by/predicate_refuted_by, 
but in most cases they aren't even called. For using this optimization all 
conditions should on single index - and it's first step.


Suggested plan's is very effective when a or b is large values like varchar, not 
a just integers.



Also, what's with the pull_tlist bit?

pull target list from subplan.

I've add it because query
select b,a from foo where a3 or a5 order by a limit 1;
with plan
Result
Append
Index Scan
Index Scan
fails because Result node thinks that it gets (b,a) tuple, but in fact it gets 
(a,b). So, if pull_tlist is TRUE, then create_append_plan takes target list from 
first subplan. Currently, that bit set only with OR-optimized plan. And 
optimizer of OR-clause guarantee that target lists are the same. Sorry, but I 
didn't find clearer solution...



--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] Bundle of patches

2006-12-04 Thread Teodor Sigaev




This one seems generally workable, but I really dislike the approach
that's been used for passing typmod arguments to the typmod_in function.
Representing them with an internal parameter means it'll be forever
impossible to write typmod functions in anything but C, which seems an
ugly restriction.  Perhaps an array of int4 would be better?  How much

I don't think that is a problem - I'll change that


flexibility do we really want to provide for typmod arguments?  Allowing
full expr_list in the grammar seems less than sane, considering the
result is still going to have to pack into 32 bits.

As I remember, I tried to use some thing else but, I've got a lot conflicts
with
AexprConst:
func_name '(' expr_list ')' Sconst



The patch needs more cleanup before applying, too, eg make comments
match code, get rid of unused keywords added to gram.y.

Ok.

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [PATCHES] Bundle of patches

2006-12-04 Thread Teodor Sigaev

And what happens when we implement NULLS FIRST/LAST correctly?  This is
really a poor choice of representation.


If it's just appending of indexscan's it's not a problem...



One thing I find questionable about this is the assumption that indexes
can support foo IS NULL and foo IS NOT NULL searches equally
conveniently.  This is demonstrably false for, say, hash.  (Hash could
store null keys by assigning them a fixed hashcode, say 0.  Then it
would be able to handle IS NULL searches, but not IS NOT NULL, because
it can't do full-index scans.)


Is there a guarantee that hash value of some not-null keys doesn't equal to 
special hash code?




the patch to do IS NULL only.  But if we are going areto support both,
we probably have to have two pg_am flags not one.


GiST isn't effective with single NOT NULL condition ... So, using two flags may 
be useful.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-04 Thread Martijn van Oosterhout
On Mon, Dec 04, 2006 at 01:35:21PM -0500, Tom Lane wrote:
  3) Allow to use index for IS [NOT] NULL
 http://www.sigaev.ru/misc/indexnulls_82-0.6.gz
 Initially patch was developed by Martijn van Oosterhout 
  kleptog@svana.org.
 But it's reworked  and support of searching NULLS to GiST too. Patch
 adds new column named amsearchnull to pg_am. To recognize IS NULL clause
 ScanKey-sk_flags contains (SK_ISNULL  SK_INDEXFINDNULL) and
 ScanKey-sk_strategy == BTEqualStrategyNumber. For IS NOT NULL,
 ScanKey-sk_strategy == BTLessStrategyNumber. Thats because NULLs are
 treated greater than any value.
 
 I am not real sure that there is any point in making IS NOT NULL an
 indexable condition.  We don't support  as an indexable condition,
 and no one's yelled about that.  It might be best just to simplify
 the patch to do IS NULL only.  But if we are going to support both,
 we probably have to have two pg_am flags not one.

Originally I didn't have IS NOT NULL but added it because it was easy
and someone suggested a use case: for indexed columns that have a lot
of nulls, it allows you to create an index scan that stops as soon as
it reaches the first null entry. This is useful for the NULL FIRST/LAST
optimisation for example.

You're right, it doesn't work for hash indexes, but you can't do full
scans on them anyway, so it's not terribly important.

I'd say that ordered indexes like b-tree are the only ones that would
get any benefit from IS NOT NULL.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-04 Thread Martijn van Oosterhout
On Mon, Dec 04, 2006 at 02:04:26PM -0500, Tom Lane wrote:
 Teodor Sigaev [EMAIL PROTECTED] writes:
  1) Typmod for user-defined types
 http://www.sigaev.ru/misc/user_defined_typmod-0.7.gz
 Patch is based on ideas from
 http://archives.postgresql.org/pgsql-hackers/2004-06/msg00932.php
 http://archives.postgresql.org/pgsql-hackers/2005-08/msg01007.php
 
 This one seems generally workable, but I really dislike the approach
 that's been used for passing typmod arguments to the typmod_in function.
 Representing them with an internal parameter means it'll be forever
 impossible to write typmod functions in anything but C, which seems an
 ugly restriction.  Perhaps an array of int4 would be better?  How much
 flexibility do we really want to provide for typmod arguments?  Allowing
 full expr_list in the grammar seems less than sane, considering the
 result is still going to have to pack into 32 bits.

People have been discussion passing character set names as typmod
parameters, so restricting them to int4 seems too tight.

I'd favour the approach where the arguments to the typmod_in function
determine the types required. This allows the system to do proper
checking and casting and most important of all, good error messages,
eg:

ERROR: Invalid argument to type: must be one of
numeric(), numeric(integer), numeric(integer, integer)

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature