Re: [PATCHES] [HACKERS] Index creation takes for ever

2004-03-17 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Where was it posted anyway?

 Found it:

   
 http://groups.google.com/groups?hl=enlr=ie=UTF-8selm=200312010450.hB14ovH16330%40candle.pha.pa.usrnum=8

Thanks.  The original patch is much older than I thought --- I was
looking in the November/December part of the archives.

 Personally, because frequently accessed duplicates appear more forward
 in the duplicate index, I think the sorting is only valuable when
 creating a new index.

Yes, and that's what this does.  Looking back, the original discussion
got a little confused because the TODO item about order duplicate index
entries by tid got brought into the mix.  Actually this patch has
nothing to do with that, because it only acts during btree creation not
during index updates.

On inspection I have no problem with the patch, only with the comments ;-)
If you like I'll revise the comments and apply.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] [HACKERS] Index creation takes for ever

2004-03-17 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Where was it posted anyway?
 
  Found it:
 
  
  http://groups.google.com/groups?hl=enlr=ie=UTF-8selm=200312010450.hB14ovH16330%40candle.pha.pa.usrnum=8
 
 Thanks.  The original patch is much older than I thought --- I was
 looking in the November/December part of the archives.
 
  Personally, because frequently accessed duplicates appear more forward
  in the duplicate index, I think the sorting is only valuable when
  creating a new index.
 
 Yes, and that's what this does.  Looking back, the original discussion
 got a little confused because the TODO item about order duplicate index
 entries by tid got brought into the mix.  Actually this patch has
 nothing to do with that, because it only acts during btree creation not
 during index updates.
 
 On inspection I have no problem with the patch, only with the comments ;-)
 If you like I'll revise the comments and apply.

Great.  Seems harmless and he showed good performance with it.  I agree
the discussion got confused, and that is why I kept it in my mailbox to
revisit.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PATCHES] [HACKERS] Index creation takes for ever

2004-03-16 Thread Bruce Momjian

Here is more detail on the patch.

---

Manfred Koizar wrote:
 On Mon, 1 Dec 2003 00:02:54 -0500 (EST), Bruce Momjian
 [EMAIL PROTECTED] wrote:
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
 And if it doesn't help index
   creation speed, at least the resulting index has better correlation.
 
 ... which has been shown by the example in the original message:
  Result without patch:
 ctid
  --
   (153,14)
   (306,23)
   (305,80)
   (152,91)
(76,68)
(38,34)
   (153,34)
   (305,50)
 (9,62)
   (305,40)
  (10 rows)
  
  Result with patch:
ctid
  
(0,5)
   (0,10)
   (0,15)
   (0,20)
   (0,25)
   (0,30)
   (0,35)
   (0,40)
   (0,45)
   (0,50)
  (10 rows)
 
 And I think we all agree, that better index correlation leads to better
 performance.
 
  I think this is a *very* dubious idea.  It introduces a low-level
  implementation dependency into our sort behavior
 
 The patch modifies the static function comparetup_index() in
 tuplesort.c.
 The comment above this function says
 /*
  * Routines specialized for IndexTuple case
  *
  * NOTE: actually, these are specialized for the btree case; [...]
  */
 
 comparetup_index() compares two IndexTuples.  The structure
 IndexTupleData consists basically of not much more than an ItemPointer,
 and the patch is not much more than adding a comparison of two
 ItemPointers.  So how does the patch introduce a new low level
 implementation dependency?
 
 Roger --- patch removed.  Thanks.
 
 Could we agree on only removing the first five a half lines of the
 comment in the patch?
 
 Servus
  Manfred
 
 ---(end of broadcast)---
 TIP 3: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PATCHES] [HACKERS] Index creation takes for ever

2004-03-16 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Where are we on this?   It seems like a win to me.
 
 I thought it was a bad idea, although I no longer remember the details.

If I remember correctly, you didn't like the index routines reading the
tuple information, or something like that, but there was a performance
benefit for duplicate keys, so I think we should re-investigate this.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] [HACKERS] Index creation takes for ever

2004-03-16 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 If I remember correctly, you didn't like the index routines reading the
 tuple information, or something like that, but there was a performance
 benefit for duplicate keys, so I think we should re-investigate this.

I don't see the actual patch either in the hackers or patches archives,
nor on your to-apply pages, making it a bit difficult to re-investigate.
Where was it posted anyway?

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [PATCHES] [HACKERS] Index creation takes for ever

2004-03-16 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  If I remember correctly, you didn't like the index routines reading the
  tuple information, or something like that, but there was a performance
  benefit for duplicate keys, so I think we should re-investigate this.
 
 I don't see the actual patch either in the hackers or patches archives,
 nor on your to-apply pages, making it a bit difficult to re-investigate.
 Where was it posted anyway?

Found it:


http://groups.google.com/groups?hl=enlr=ie=UTF-8selm=200312010450.hB14ovH16330%40candle.pha.pa.usrnum=8

Personally, because frequently accessed duplicates appear more forward
in the duplicate index, I think the sorting is only valuable when
creating a new index.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-12-02 Thread Manfred Koizar
On Mon, 01 Dec 2003 13:32:10 -0500, Tom Lane [EMAIL PROTECTED] wrote:
Manfred Koizar [EMAIL PROTECTED] writes:
 comparetup_index() compares two IndexTuples.  The structure
 IndexTupleData consists basically of not much more than an ItemPointer,
 and the patch is not much more than adding a comparison of two
 ItemPointers.  So how does the patch introduce a new low level
 implementation dependency?

Because it sorts on tuple position, which is certainly about as low
level as you can get.

The patch affects only the sort during index creation.  Mapping key
values to tuple positions is the sole purpose of an index.  The notion
that an index should not care for tuple positions looks a bit strange to
me.

  More to the point, though, no evidence has been
provided that this is a good idea.

The test script I posted with the patch shows that the patch produces
more efficient b-tree indices when there are lots of duplicates.

Servus
 Manfred

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-12-01 Thread Manfred Koizar
On Mon, 1 Dec 2003 00:02:54 -0500 (EST), Bruce Momjian
[EMAIL PROTECTED] wrote:
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
And if it doesn't help index
  creation speed, at least the resulting index has better correlation.

... which has been shown by the example in the original message:
 Result without patch:
ctid
 --
  (153,14)
  (306,23)
  (305,80)
  (152,91)
   (76,68)
   (38,34)
  (153,34)
  (305,50)
(9,62)
  (305,40)
 (10 rows)
 
 Result with patch:
   ctid
 
   (0,5)
  (0,10)
  (0,15)
  (0,20)
  (0,25)
  (0,30)
  (0,35)
  (0,40)
  (0,45)
  (0,50)
 (10 rows)

And I think we all agree, that better index correlation leads to better
performance.

 I think this is a *very* dubious idea.  It introduces a low-level
 implementation dependency into our sort behavior

The patch modifies the static function comparetup_index() in
tuplesort.c.
The comment above this function says
/*
 * Routines specialized for IndexTuple case
 *
 * NOTE: actually, these are specialized for the btree case; [...]
 */

comparetup_index() compares two IndexTuples.  The structure
IndexTupleData consists basically of not much more than an ItemPointer,
and the patch is not much more than adding a comparison of two
ItemPointers.  So how does the patch introduce a new low level
implementation dependency?

Roger --- patch removed.  Thanks.

Could we agree on only removing the first five a half lines of the
comment in the patch?

Servus
 Manfred

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-12-01 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes:
 comparetup_index() compares two IndexTuples.  The structure
 IndexTupleData consists basically of not much more than an ItemPointer,
 and the patch is not much more than adding a comparison of two
 ItemPointers.  So how does the patch introduce a new low level
 implementation dependency?

Because it sorts on tuple position, which is certainly about as low
level as you can get.  More to the point, though, no evidence has been
provided that this is a good idea.

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-11-30 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 If qsort is to blame, then maybe this patch could help.  It sorts
 equal key values on item pointer.  And if it doesn't help index
 creation speed, at least the resulting index has better correlation.

 I will try to apply it within the next 48 hours.

I think this is a *very* dubious idea.  It introduces a low-level
implementation dependency into our sort behavior on the strength of no
more than an unfounded speculation that some platform's broken qsort
might run faster.  Even if the speculation were proven true, I'd be
hesistant to apply it.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-11-30 Thread Bruce Momjian

Patch removed from queue.

---

Manfred Koizar wrote:
 On Mon, 01 Sep 2003 08:46:09 -0400, Tom Lane [EMAIL PROTECTED]
 wrote:
 [EMAIL PROTECTED] writes:
  it took 69 minutes to finish, 75% of this time was devoted to create 2
  indexes on varchar(2) with value being 'O', 'N' or null;
 
 I still say it's either strcoll or qsort's fault.
 
 If qsort is to blame, then maybe this patch could help.  It sorts
 equal key values on item pointer.  And if it doesn't help index
 creation speed, at least the resulting index has better correlation.
 
 Test script:
 CREATE TABLE t (i int NOT NULL, t text NOT NULL);
 INSERT INTO t VALUES (1, 'lajshdflasjhdflajhsdfljhasdlfjhasdf');
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t VALUES (100, 's,dmfa.,smdn.famsndfamdnsbfmansdbf');
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 INSERT INTO t SELECT * FROM t;
 ANALYZE t;
 CREATE INDEX t_i ON t(i);
 SET enable_seqscan = 0;
 SELECT ctid FROM t WHERE i=100 LIMIT 10;
 
 Result without patch:
ctid
 --
  (153,14)
  (306,23)
  (305,80)
  (152,91)
   (76,68)
   (38,34)
  (153,34)
  (305,50)
(9,62)
  (305,40)
 (10 rows)
 
 Result with patch:
   ctid
 
   (0,5)
  (0,10)
  (0,15)
  (0,20)
  (0,25)
  (0,30)
  (0,35)
  (0,40)
  (0,45)
  (0,50)
 (10 rows)
 
 For testing purposes I have made a second patch that provides a
 boolean GUC variable sort_index.  It is available here:
 http://www.pivot.at/pg/23.test-IdxTupleSort.diff
 
 Servus
  Manfred

 diff -ruN ../base/src/backend/utils/sort/tuplesort.c 
 src/backend/utils/sort/tuplesort.c
 --- ../base/src/backend/utils/sort/tuplesort.c2003-08-17 21:58:06.0 
 +0200
 +++ src/backend/utils/sort/tuplesort.c2003-09-05 10:04:22.0 +0200
 @@ -2071,6 +2071,33 @@
   (errcode(ERRCODE_UNIQUE_VIOLATION),
errmsg(could not create unique index),
errdetail(Table contains duplicated values.)));
 + else
 + {
 + /*
 +  * If key values are equal, we sort on ItemPointer.  This might help
 +  * for some bad qsort implementation having performance problems
 +  * with many equal items.  OTOH I wouldn't trust such a weak qsort
 +  * to handle pre-sorted sequences very well ...
 +  *
 +  * Anyway, this code doesn't hurt much, and it helps produce indices
 +  * with better index correlation which is a good thing per se.
 +  */
 + ItemPointer tid1 = tuple1-t_tid;
 + ItemPointer tid2 = tuple2-t_tid;
 + BlockNumber blk1 = ItemPointerGetBlockNumber(tid1);
 + BlockNumber blk2 = ItemPointerGetBlockNumber(tid2);
 + 
 + if (blk1 != blk2)
 + return (blk1  blk2) ? -1 : 1;
 + else
 + {
 + OffsetNumber pos1 = ItemPointerGetOffsetNumber(tid1);
 + OffsetNumber pos2 = ItemPointerGetOffsetNumber(tid2);
 + 
 + if (pos1 != pos2)
 + return (pos1  pos2) ? -1 : 1;
 + }
 + }
  
   return 0;
  }

 
 ---(end of broadcast)---
 TIP 6: Have you searched our list archives?
 
http://archives.postgresql.org

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-11-30 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  If qsort is to blame, then maybe this patch could help.  It sorts
  equal key values on item pointer.  And if it doesn't help index
  creation speed, at least the resulting index has better correlation.
 
  I will try to apply it within the next 48 hours.
 
 I think this is a *very* dubious idea.  It introduces a low-level
 implementation dependency into our sort behavior on the strength of no
 more than an unfounded speculation that some platform's broken qsort
 might run faster.  Even if the speculation were proven true, I'd be
 hesistant to apply it.

Roger --- patch removed.  Thanks.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Index creation takes for ever

2003-09-08 Thread Zeugswetter Andreas SB SD

  I don't think so, because the patch does nothing to keep the sort
  order once the index is initially created.
 
 As Tom mentioned, we might not want to keep the tid's in order after the
 index is created because he wants the most recent tid's first, so the
 expired ones migrate to the end.

But on average this argument only holds true for unique indexes, no ?
Is there any code that stops the heap lookup after the visible tuple is found ?
At least in an index with more rows per key you will fetch all heaps after the 
first one anyway to get at the next row. This is better done in heap order, no ?

And the bitmap approach will not work for large result sets.

Summa summarum I would leave the TODO item (maybe add a comment 
(only for non-unique, evaluate performance))

Andreas

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Index creation takes for ever

2003-09-08 Thread Manfred Koizar
On Mon, 8 Sep 2003 11:31:05 +0200, Zeugswetter Andreas SB SD
[EMAIL PROTECTED] wrote:
 As Tom mentioned, we might not want to keep the tid's in order after the
 index is created because he wants the most recent tid's first, so the
 expired ones migrate to the end.

But on average this argument only holds true for unique indexes, no ?
Is there any code that stops the heap lookup after the visible tuple is found ?
At least in an index with more rows per key you will fetch all heaps after the 
first one anyway to get at the next row. This is better done in heap order, no ?

Yes, yes, yes, and yes.  Seems we all agree on that; the patch has
been queued for 7.5.

Servus
 Manfred

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Index creation takes for ever

2003-09-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 I assume this completes this TODO:
   * Order duplicate index entries by tid for faster heap lookups

I don't know why that TODO entry exists, but I think the idea is
counterproductive.  The existing btree code will tend to put newer
versions of a row earlier (because it puts a new entry in front of any
with duplicate keys), which usually reduces the time spent skipping dead
rows.  Forcing tid ordering will cost us more in dead-row skipping than
it's likely to save elsewhere.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-09-07 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  I assume this completes this TODO:
  * Order duplicate index entries by tid for faster heap lookups
 
 I don't know why that TODO entry exists, but I think the idea is
 counterproductive.  The existing btree code will tend to put newer
 versions of a row earlier (because it puts a new entry in front of any
 with duplicate keys), which usually reduces the time spent skipping dead
 rows.  Forcing tid ordering will cost us more in dead-row skipping than
 it's likely to save elsewhere.

I assume you are talking about a unique index that probably only has a
few non-expired rows (in which case the newer rows first is better). 
The TODO deals with cases where you have lots of valid duplicate index
rows, and you want to spin through all the matching rows in heap order
rather than randomly.  This is related to our CLUSTER capability.  The
idea originally came from Vadim.

At what point does this patch do the sorting?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-09-07 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 * Order duplicate index entries by tid for faster heap lookups

 I don't know why that TODO entry exists, but I think the idea is
 counterproductive.

 I assume you are talking about a unique index that probably only has a
 few non-expired rows (in which case the newer rows first is better). 
 The TODO deals with cases where you have lots of valid duplicate index
 rows, and you want to spin through all the matching rows in heap order
 rather than randomly.

Maybe so, but it would degrade the performance in the unique-index case
if we do it as the TODO is worded.

My own opinion is that the bitmap-index-lookup approach will be superior
to trying to keep the index entries in TID order.  (That's the idea
we've been discussing for awhile of separating the heap-fetch stage from
the index-scan stage: scan the index, make a sparse bitmap of the TIDs
we need to visit, possibly AND or OR this bitmap with maps derived from
other indexes, and finally visit the rows in heap order.)

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [PATCHES] [HACKERS] Index creation takes for ever

2003-09-07 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  * Order duplicate index entries by tid for faster heap lookups
 
  I don't know why that TODO entry exists, but I think the idea is
  counterproductive.
 
  I assume you are talking about a unique index that probably only has a
  few non-expired rows (in which case the newer rows first is better). 
  The TODO deals with cases where you have lots of valid duplicate index
  rows, and you want to spin through all the matching rows in heap order
  rather than randomly.
 
 Maybe so, but it would degrade the performance in the unique-index case
 if we do it as the TODO is worded.

Yes, the wording is just a guide.

 My own opinion is that the bitmap-index-lookup approach will be superior
 to trying to keep the index entries in TID order.  (That's the idea
 we've been discussing for awhile of separating the heap-fetch stage from
 the index-scan stage: scan the index, make a sparse bitmap of the TIDs
 we need to visit, possibly AND or OR this bitmap with maps derived from
 other indexes, and finally visit the rows in heap order.)

Oh, yes.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Index creation takes for ever

2003-09-07 Thread Manfred Koizar
On Sun, 7 Sep 2003 11:43:42 -0400 (EDT), Bruce Momjian
[EMAIL PROTECTED] wrote:
I assume this completes this TODO:

   * Order duplicate index entries by tid for faster heap lookups

I don't think so, because the patch does nothing to keep the sort
order once the index is initially created.

 If you want to post it now, [...]

I did already post it.  It's only the last page or so of the original
message.  The link in that message points to a testing aid which is
not part of what I would like to see committed.

Servus
 Manfred

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Index creation takes for ever

2003-09-01 Thread ohp

# requires BTREE_BUILD_STATS
#show_btree_build_stats = false


#
#   Access statistics collection
#
#stats_start_collector = true
#stats_reset_on_server_start = true
stats_command_string = true
stats_row_level = true
#stats_block_level = false


#
#   Lock Tracing
#
#trace_notify = false

# requires LOCK_DEBUG
#trace_locks = false
#trace_userlocks = false
#trace_lwlocks = false
#debug_deadlocks = false
#trace_lock_oidmin = 16384
#trace_lock_table = 0


#
#   Misc
#
#autocommit = true
#dynamic_library_path = '$libdir'
#search_path = '$user,public'
datestyle = 'postgres, european'
#timezone = unknown # actually, defaults to TZ environment setting
#australian_timezones = false
#client_encoding = sql_ascii# actually, defaults to database encoding
#authentication_timeout = 60# 1-600, in seconds
#deadlock_timeout = 1000# in milliseconds
#default_transaction_isolation = 'read committed'
#max_expr_depth = 1 # min 10
#max_files_per_process = 1000   # min 25
#password_encryption = true
#sql_inheritance = true
#transform_null_equals = false
#statement_timeout = 0  # 0 is disabled, in milliseconds
#db_user_namespace = false

#
# Local Settings
LC_MESSAGES = 'fr_FR'
LC_MONETARY = 'fr_FR'


The machine has 1 G DDR ECC Registred RAM, 2 1,8 GZ XEON running uw713
PGversion is 7.3.4, macine is not swaping.

Those index creation took 100% of 1 CPU.

Regards
On Thu, 28 Aug 2003, Tom Lane wrote:

 Date: Thu, 28 Aug 2003 10:13:21 -0400
 From: Tom Lane [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: pgsql-hackers list [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Index creation takes for ever

 [EMAIL PROTECTED] writes:
  I've then pg_dump'ed the database and recreate an other both on 7.3.4 and
  7.4b

  Both are still running after more than 30 minutes of CPU (100% cpu taken)
  creating the levt_lu_ligne_evt_key.

 That's hard to believe.  I get

 regression=# SELECT levt_lu,count(*) from ligne_evt group by levt_lu;
  levt_lu | count
 -+
  N   |  49435
  O   | 181242
 (2 rows)

 Time: 6927.28 ms
 regression=# create index levt_lu_ligne_evt_key on ligne_evt (levt_lu);
 CREATE INDEX
 Time: 14946.74 ms

 on a not-very-fast machine ... and it seems to be mostly I/O bound.

 What platform are you on?  I could believe that the local qsort() is
 incredibly inefficient with many equal keys.  Another possibility is
 that you're using a non-C locale and strcoll() is horribly slow.

   regards, tom lane


-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Index creation takes for ever

2003-09-01 Thread Tom Lane
[EMAIL PROTECTED] writes:
 it took 69 minutes to finish, 75% of this time was devoted to create 2
 indexes on varchar(2) with value being 'O', 'N' or null;
 I wonder if it's a configuration matter.

I still say it's either strcoll or qsort's fault.  Try swapping in our
own version of qsort to see if the behavior changes.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Index creation takes for ever

2003-09-01 Thread ohp
Hi Tom,
I've made some tests with your qsort and it DEFINITIVLY help
~3 mn instead of 69.

However this is for 7.3.4 I've got no probs with 7.4b.
Did something change in btree creation?
On Mon, 1 Sep 2003, Tom Lane wrote:

 Date: Mon, 01 Sep 2003 08:46:09 -0400
 From: Tom Lane [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: pgsql-hackers list [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Index creation takes for ever

 [EMAIL PROTECTED] writes:
  it took 69 minutes to finish, 75% of this time was devoted to create 2
  indexes on varchar(2) with value being 'O', 'N' or null;
  I wonder if it's a configuration matter.

 I still say it's either strcoll or qsort's fault.  Try swapping in our
 own version of qsort to see if the behavior changes.

   regards, tom lane


-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Index creation takes for ever

2003-09-01 Thread Tom Lane
[EMAIL PROTECTED] writes:
 I've made some tests with your qsort and it DEFINITIVLY help
 ~3 mn instead of 69.

 However this is for 7.3.4 I've got no probs with 7.4b.
 Did something change in btree creation?

Hmm, I wouldn't have thought so, but perhaps we did change something
that would affect this.  You might need to burrow down as far as seeing
exactly what qsort calls are being made in each version ...

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Index creation takes for ever

2003-08-30 Thread Bruce Momjian
Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  This is on unixware 7 (both 7.3.4 and 7.4b)
 
  I'm on the FR language (I'll re-initdb whith lang=C to see what happens)
 
 Okay.  If you find it's still slow in C locale, the next thing to try
 would be forcing use of our own qsort, as we already do for Solaris.
 You'd need to tweak this bit in configure.in:
 
 # Solaris has a very slow qsort in certain cases, so we replace it.
 case $host_os in solaris*) 
 AC_LIBOBJ(qsort) ;;
 esac
 
 I'm not sure why it's done that way though.  'Twould be better to let
 the per-platform template files determine it.

The macro has to be in the configure.in file, but we could flag whether
the macro should be used in the template file.  I have avoided adding a
variable just for one platform because it just adds an extra level of
abstraction for little benefit.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Index creation takes for ever

2003-08-30 Thread ohp
Thanks to evryone that help on this one.
I reinitdb --locale=C and reloaded everything today.
That did the trick.
However, is there a way to test that strcol is really the culprit?
On Sat, 30 Aug 2003, Bruce Momjian wrote:

 Date: Sat, 30 Aug 2003 11:11:11 -0400 (EDT)
 From: Bruce Momjian [EMAIL PROTECTED]
 To: Tom Lane [EMAIL PROTECTED]
 Cc: [EMAIL PROTECTED], pgsql-hackers list [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Index creation takes for ever

 Tom Lane wrote:
  [EMAIL PROTECTED] writes:
   This is on unixware 7 (both 7.3.4 and 7.4b)
 
   I'm on the FR language (I'll re-initdb whith lang=C to see what happens)
 
  Okay.  If you find it's still slow in C locale, the next thing to try
  would be forcing use of our own qsort, as we already do for Solaris.
  You'd need to tweak this bit in configure.in:
 
  # Solaris has a very slow qsort in certain cases, so we replace it.
  case $host_os in solaris*)
  AC_LIBOBJ(qsort) ;;
  esac
 
  I'm not sure why it's done that way though.  'Twould be better to let
  the per-platform template files determine it.

 The macro has to be in the configure.in file, but we could flag whether
 the macro should be used in the template file.  I have avoided adding a
 variable just for one platform because it just adds an extra level of
 abstraction for little benefit.



-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] Index creation takes for ever

2003-08-28 Thread ohp
Hi every one,

I've tried to reindex one of my customer's table to gain some disk space.

I had to stop after 90 m cpu...

I've then pg_dump'ed the database and recreate an other both on 7.3.4 and
7.4b

Both are still running after more than 30 minutes of CPU (100% cpu taken)
creating the levt_lu_ligne_evt_key.

Disks don't do anything, just cpu.

Here are the info I have
test=# \d ligneevt
Table public.ligne_evt
 Column  |   Type   |   Modifiers

-+--+---
-
 levt_cod| integer  | not null default nextval('seq_levt
_cod'::text)
 levt_tevt_cod   | integer  | not null
 levt_date   | timestamp with time zone | not null
 levt_type_per1  | integer  | not null
 levt_perso_cod1 | integer  | not null
 levt_type_per2  | integer  |
 levt_perso_cod2 | integer  |
 levt_texte  | text |
 levt_lu | character varying(2) | default 'N'
 levt_visible| character varying(2) |
 levt_attaquant  | integer  |
 levt_cible  | integer  |
Indexes: ligne_evt_pkey primary key btree (levt_cod),
 levt_attaquant_ligne_evt_key btree (levt_attaquant),
 levt_cible_ligne_evt_key btree (levt_cible),
 levt_lu_ligne_evt_key btree (levt_lu),
 levt_visible_ligne_evt_key btree (levt_visible),
 ligne_evt_levt_cod_key btree (levt_cod),
 ligne_evt_levt_date_key btree (levt_date),
 ligne_evt_levt_perso_cod1_key btree (levt_perso_cod1),
 ligne_evt_levt_perso_cod2_key btree (levt_perso_cod2),
 ligne_evt_levt_tevt_cod_key btree (levt_tevt_cod),
 ligne_evt_levt_type_per1_key btree (levt_type_per1),
 ligne_evt_levt_type_per2_key btree (levt_type_per2)
Triggers: RI_ConstraintTrigger_5453038

test=# SELECT count(*) from ligne_evt ;
 count

 230670
(1 row)

test=# SELECT levt_lu,count(*) from ligne_evt group by levt_lu;
 levt_lu | count
-+
 N   |  49435
 O   | 181242
(2 rows)

Can some one help me?

Regards
-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Index creation takes for ever

2003-08-28 Thread Tom Lane
[EMAIL PROTECTED] writes:
 I've then pg_dump'ed the database and recreate an other both on 7.3.4 and
 7.4b

 Both are still running after more than 30 minutes of CPU (100% cpu taken)
 creating the levt_lu_ligne_evt_key.

That's hard to believe.  I get

regression=# SELECT levt_lu,count(*) from ligne_evt group by levt_lu;
 levt_lu | count
-+
 N   |  49435
 O   | 181242
(2 rows)

Time: 6927.28 ms
regression=# create index levt_lu_ligne_evt_key on ligne_evt (levt_lu);
CREATE INDEX
Time: 14946.74 ms

on a not-very-fast machine ... and it seems to be mostly I/O bound.

What platform are you on?  I could believe that the local qsort() is
incredibly inefficient with many equal keys.  Another possibility is
that you're using a non-C locale and strcoll() is horribly slow.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Index creation takes for ever

2003-08-28 Thread Tom Lane
[EMAIL PROTECTED] writes:
 This is on unixware 7 (both 7.3.4 and 7.4b)

 I'm on the FR language (I'll re-initdb whith lang=C to see what happens)

Okay.  If you find it's still slow in C locale, the next thing to try
would be forcing use of our own qsort, as we already do for Solaris.
You'd need to tweak this bit in configure.in:

# Solaris has a very slow qsort in certain cases, so we replace it.
case $host_os in solaris*) 
AC_LIBOBJ(qsort) ;;
esac

I'm not sure why it's done that way though.  'Twould be better to let
the per-platform template files determine it.

regards, tom lane

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Index creation takes for ever

2003-08-28 Thread ohp
Okay, so far,
I've reinitdb (on 7.4b) with LANG=C and it worked.
So I reinitDB with LANG=FR and used LANG=C to psql -f xxx.sql template1 to
recreate the db and it worked too...

I did'nt initdb between cvs changes, maybe that's why.

7.4b seems ok.
However, is there a way I can reindex on my 7.3.4 production
cluster cluster without an initdb?

would LANG=C psql -creindex table blah db would help?
On Thu, 28 Aug 2003, Tom Lane wrote:

 Date: Thu, 28 Aug 2003 10:38:26 -0400
 From: Tom Lane [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: pgsql-hackers list [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Index creation takes for ever

 [EMAIL PROTECTED] writes:
  This is on unixware 7 (both 7.3.4 and 7.4b)

  I'm on the FR language (I'll re-initdb whith lang=C to see what happens)

 Okay.  If you find it's still slow in C locale, the next thing to try
 would be forcing use of our own qsort, as we already do for Solaris.
 You'd need to tweak this bit in configure.in:

 # Solaris has a very slow qsort in certain cases, so we replace it.
 case $host_os in solaris*)
 AC_LIBOBJ(qsort) ;;
 esac

 I'm not sure why it's done that way though.  'Twould be better to let
 the per-platform template files determine it.

   regards, tom lane


-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Index creation takes for ever

2003-08-28 Thread Tom Lane
[EMAIL PROTECTED] writes:
 I've reinitdb (on 7.4b) with LANG=C and it worked.
 So I reinitDB with LANG=FR and used LANG=C to psql -f xxx.sql template1 to
 recreate the db and it worked too...

That's weird.  I don't understand why an initdb in the same locale would
make the problem go away.

 I did'nt initdb between cvs changes, maybe that's why.

If you had only seen the problem in 7.4b then I could believe that theory,
but since you also saw it in the stable 7.3 installation, I don't.

 would LANG=C psql -creindex table blah db would help?

No, the LANG environment of psql wouldn't make the slightest difference
here.

I don't think you told us your platform?  (OS version, kernel, compiler,
etc)

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Index creation takes for ever

2003-08-28 Thread ohp
My platforms are Unixware 713.

Am I right to be afraid that I have to pg_dump and reload?
Stiil it's amazind, 7.3 has been up for months and I discover the proclem
today...
Well, I don't reindex that often...
On Thu, 28 Aug 2003, Tom Lane wrote:

 Date: Thu, 28 Aug 2003 12:19:13 -0400
 From: Tom Lane [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: pgsql-hackers list [EMAIL PROTECTED]
 Subject: Re: [HACKERS] Index creation takes for ever

 [EMAIL PROTECTED] writes:
  I've reinitdb (on 7.4b) with LANG=C and it worked.
  So I reinitDB with LANG=FR and used LANG=C to psql -f xxx.sql template1 to
  recreate the db and it worked too...

 That's weird.  I don't understand why an initdb in the same locale would
 make the problem go away.

  I did'nt initdb between cvs changes, maybe that's why.

 If you had only seen the problem in 7.4b then I could believe that theory,
 but since you also saw it in the stable 7.3 installation, I don't.

  would LANG=C psql -creindex table blah db would help?

 No, the LANG environment of psql wouldn't make the slightest difference
 here.

 I don't think you told us your platform?  (OS version, kernel, compiler,
 etc)

   regards, tom lane


-- 
Olivier PRENANT Tel: +33-5-61-50-97-00 (Work)
6, Chemin d'Harraud Turrou   +33-5-61-50-97-01 (Fax)
31190 AUTERIVE   +33-6-07-63-80-64 (GSM)
FRANCE  Email: [EMAIL PROTECTED]
--
Make your life a dream, make your dream a reality. (St Exupery)

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])