Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-24 Thread Martijn van Oosterhout
On Fri, Dec 23, 2005 at 11:19:01PM -0500, Rod Taylor wrote:
  In many cases you could use temporary tables, but sometimes you might want
  multiple processes or multiple transactions to be able to see the data.
 
 Could always implement GLOBAL TEMP tables that have the ability to use
 these kinds of shortcuts.

Indeed, that's the thought I get while reading this thread.

TEMP tables are not xlogged

For global temp tables I'd suggest creating a schema pg_globaltemp or
some such where:

1. No Xlog, like TEMP tables
2. Use shared buffer cache like normal tables.
3. On startup after unclean shutdown, delete anything in that schema

And (AFAIK) we get extra SQL complience to boot. If you really want an
extension, how about one to un-TEMP-ify a temp table.

That would allow the following:

CREATE GLOBAL TEMP TABLE blah INHERITS myhugetable;

-- Load data, munge, etc... no xlog traffic
-- If system crashes here, nothing to recover

ALTER TABLE blah ALTER SCHEMA TO public;  -- untempify

-- If PITR, write Xlog data now
-- If no PITR, just move file and update catalog

How does this not do what people are asking for?
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpOhtfDdTXRH.pgp
Description: PGP signature


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Christopher Kings-Lynne

I've gotten interested again in the issue of row comparisons, eg
(a, b, c) = (1, 2, 3)
We've discussed this before, the most comprehensive thread being
http://archives.postgresql.org/pgsql-performance/2004-07/msg00188.php
but nothing's gotten done.  Unless someone's already working on this
I think I will take it up.


Can someone explain to me how:

(a, b)  (1, 2)

is different to

a  1 and b  2

?

Chris

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] bugs with certain Asian multibyte charsets

2005-12-24 Thread Tatsuo Ishii
I have found long standing bug with with certain Asian multibyte
charsets handling(original report was made by Mr. Ishida).

Some text operations on certain Asian charsets such as EUCj-JP code
set 3 (JIS X 0212) make wrong results. As far as I know, these
include:

- strpos
- regular expression

It seems LIKE is not affected by this bug.

The bug has been there since 6.4. The reason we did not notice the bug
is the affected charsts are merely used. Other charsets affected by
the bug are EUC_CN code set 2, 3 (it seems they are not used at all)
and EUC_TW code set 2, 3 (it seems code set 3 is not used). As far as
I know, EUC_KR is not affected (I believe code set 2, 3 is not used in
EUC_KR).

Here are the description of the bug.

- strpos

In EUC_JP database:

SELECT strpos(hextostr('8faaa18faae1'), hextostr('8faae1'));

returns 1, instead of 2. where hextostr() is a hexadecial to string
conversion functin developed by Mr. Ishida. Those three bytes sequence
starting with 8f is a JIS X 0212 letter encoded in EUC-JP (for
example, 8faaa18faae1 consists of 2 EUC_JP letters).

- regexp

SELECT hextostr('8faaa18faaa1') ~ hextostr('8faae1');

returns false instead of true.

details of the bug:

In backend/utils/mb/wchar.c there are functions to convert multibyte
to wchar. When the conversion performed, the second or third byte was
masked by 0x3f and which makes, for example, 8faaa1 and 8faae1 look
same.

I'm going to commit fixes for 7.3-statble, 7.4-stable, 8.0-stable,
8.1-stable and current.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Martijn van Oosterhout
On Fri, Dec 23, 2005 at 03:18:21PM -0500, Tom Lane wrote:
 I've gotten interested again in the issue of row comparisons, eg
   (a, b, c) = (1, 2, 3)
 We've discussed this before, the most comprehensive thread being
 http://archives.postgresql.org/pgsql-performance/2004-07/msg00188.php
 but nothing's gotten done.  Unless someone's already working on this
 I think I will take it up.

snip

Since this is related to the COLLATE stuff I'm working on I'd like to
make a few comments.

 1. Determine which interpretations (btree strategy numbers) exist for
 each pairwise operator.  There must be at least one interpretation that
 is common to all the operators, else fail (for instance, it doesn't help
 if we can identify one operator as  and another as ).

One thing my COLLATE patch does is distinguish between collations and
operator classes. So the reverse operator class issue disappears
because it's just a collation and doesn't need a operator class
(although it won't break anything, see below).

 2. If there is more than one common interpretation, prefer the one that
 uses the largest number of default opclasses.  If there's a tie, we
 could either reject the construct as ambiguous, or select one of the
 possibilities arbitrarily ... any thoughts about that?

In standard SQL, each node in a query has a collation. Columns use the
collation they were given when the table was created, constants use the
default for the type. It's a little more complicated than that, see the
standard for details.

Anyway, a collation identifies a btree operator class so this problem
solves itself. For each pair of values you are comparing, determine the
collation and look up the operator class to ensure you're using the
same strategy type. There are minor details relating to reverse
collations but they're minor.

The only problem reverse operator classes bring here is that the system
won't realise it and thus won't know that the index is usable. Unless
the user specifies the collation as part of the query.

 3. A given operator could have the selected interpretation in more than
 one opclass.  Prefer the default opclass if any; otherwise, again we
 have the choice of rejecting or making an arbitrary choice.

If there's a problem, bail. The standard allows you to specify the
collation on a per node basis so any ambiguities can be resolved by the
user.

So something like:

(a COLLATE hungarian, b COLLATE posix, c COLLATE ignorecase) = ('x','y','z')

Would know exactly what to do (and if you could use an index)...

Now, since COLLATE support is still in progress, I'm not sure how much
any of this helps you. I'm up to modifying the scankeys but it's hard
when you jave to keep rgrepping the tree to work out what is called
from where...

For other people reading this thread, the reason why it can't be
decomposed into (a=1 AND b=2 AND c=3) is because the standard treats
the row as a unit, checking left to right, so:

(4,0,0)  (5,0,0)
(1,2,3)  (0,7,8)

So it needs a new node type and needs to know which index to use.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpajDYwByGz1.pgp
Description: PGP signature


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Christopher Kings-Lynne

Now, since COLLATE support is still in progress, I'm not sure how much
any of this helps you. I'm up to modifying the scankeys but it's hard
when you jave to keep rgrepping the tree to work out what is called
from where...


src/tools/make_ctags is your friend...

Chris

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Missing DATE selectivity

2005-12-24 Thread Simon Riggs
The TODO list has an item add missing date selectivity, which is a
hang-over from 1999 or before.

Is this still an issue? Can somebody describe exactly what it is?

I can't find the exact discussion of what the problem was/is, so its
better to ask directly...

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Martijn van Oosterhout
On Sat, Dec 24, 2005 at 06:05:58PM +0800, Christopher Kings-Lynne wrote:
 Now, since COLLATE support is still in progress, I'm not sure how much
 any of this helps you. I'm up to modifying the scankeys but it's hard
 when you jave to keep rgrepping the tree to work out what is called
 from where...
 
 src/tools/make_ctags is your friend...

That just shows you where a symbol is defined, not where it's called
from. When you change the parameters of a function, you need to make
sure you found all the places that used it...

IOW, it's good for going down the call tree, but not up it.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpFVFoqg15RC.pgp
Description: PGP signature


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Peter Eisentraut
Am Samstag, 24. Dezember 2005 11:46 schrieb Martijn van Oosterhout:
 That just shows you where a symbol is defined, not where it's called
 from.

I've never used ctags, but etags certainly do what you ask for.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Martijn van Oosterhout
On Sat, Dec 24, 2005 at 12:25:41PM +0100, Peter Eisentraut wrote:
 Am Samstag, 24. Dezember 2005 11:46 schrieb Martijn van Oosterhout:
  That just shows you where a symbol is defined, not where it's called
  from.
 
 I've never used ctags, but etags certainly do what you ask for.

Really? I've searched the emacs documentation but don't see it. M-.
finds the definition of a tag, but how do you find usages of a tag?

Thanks in advance,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp3qSXE2cOJa.pgp
Description: PGP signature


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-24 Thread Simon Riggs
On Fri, 2005-12-23 at 22:41 -0500, Greg Stark wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 
  Qingqing Zhou [EMAIL PROTECTED] writes:
   I revised the idea with MINIMAL XLOG (instead of WITHOUT XLOG) like the
   below. I think in this way, we can always gaurantee its correctness and
   can always improve it.
  
  I think the entire idea is a waste of time anyway.  If we have the COPY
  case covered then we have done the only case that has any practical use
  AFAICS --- you don't use UPDATE or DELETE for bulk loads, and you don't
  use retail INSERTs either if you are striving for speed.
 
 Well the industry standard approach for loading data is called ETL. I
 forget what the E stands for, but the idea is to first load the data into a
 table exactly as it appears in the input data.
 
 Then the T stands for transform. This could include just normalizing the
 input data into the correct format but it could also include looking up
 foreign key values for input strings, checking for duplicate records, etc.
 This would be a series of UPDATEs and DELETEs.
 
 I guess the L stands for load, where you insert it into the real tables.

Just for completeness: the whole data load area is typically known as
ETL now, but there are two main approaches: ETL and ELT

ETL stands for 
Extract - take the data from the source system
Transform - the process of mapping from source to target data model
Load - load the transformed data into the target system

In this mode, the only SQL used is COPY, during the L phase.

ETL is the industry standard as set by vendors who sell tools that
work that way: Informatica, Ascential etc. Another, equally viable route
is similar to your description, but is known as ELT, which is supported
by vendors such as Oracle (proprietary-only) and Sunopsis (open):

Extract - take the data from the source system
Load - load the data into the DW staging area in same form as source
Tranform - transform the data using SQL commands

In the latter ELT approach, you use COPY during the L phase, but then
you definitely do use INSERT SELECT and sometimes UPDATE commands,
rarely DELETEs during the T phase.

The ELT approach is the main basis for a *large* number of very large
Oracle, Teradata and DB2 data warehouses that have custom-built load
procedures. Check out this link for a high level explanation,
http://www.sunopsis.com/corporate/us/products/sunopsis/snps_dc.htm
but noting that Sunopsis did not invent this approach, or the name ELT.

 Up until the point where you load it into the real tables you don't really
 care about the transactional integrity of the data. If it's lost you can
 always just restart the process.

 In many cases you could use temporary tables, but sometimes you might want
 multiple processes or multiple transactions to be able to see the data.
 Consider for example a loading process that includes a human operator
 reviewing the data before the data is approved to be loaded into the final
 tables.

Agreed

 But I don't see turning on and off the WAL on a per-transaction basis to be
 useful. Every transaction in the system is affected by the WAL status of every
 other transaction working with the same tables. It doesn't serve any purpose
 to have one transaction bypassing the WAL while everyone else does WAL logging
 for the same table; they're all going to lose if the system crashes.

Yes, I really don't like that way.

 It seems to me the only rational way to approach this is to have a per-table
 flag that sets that table to be non-logged. Essentially changing a table's
 behaviour to that of a temporary table except that other transactions can see
 it. If the system crashes the table is truncated on system restore.

Often, you need to speed up the load into your largest tables. The data
you are loading often comes from a file that you can backed-up, so if
the database crashes during the load, you can always restart the load.
But that doesn't mean you want to lose the data that is already there if
you crash.

That's exactly what COPY LOCK has been designed to do. It is analagous
to Oracle's sql*loader direct path nologging mode.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 One thing my COLLATE patch does is distinguish between collations and
 operator classes. So the reverse operator class issue disappears
 because it's just a collation and doesn't need a operator class
 (although it won't break anything, see below).

Are you suggesting that COLLATE will impose comparison semantics on
all datatypes including non-string types?  If so, I'd be interested
to know what you have in mind.  If not, claiming that it makes the
issue go away is nonsensical.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Now, since COLLATE support is still in progress, I'm not sure how much
 any of this helps you. I'm up to modifying the scankeys but it's hard
 when you jave to keep rgrepping the tree to work out what is called
 from where...

 src/tools/make_ctags is your friend...

If grep -r is too slow for you, there's a package called glimpse that
I've used for years.  It builds a full-text index of any specified
collection of files, and then does grep-like searches nearly
instantaneously.  The output format is the same as grep so you can
teach emacs to visit all the hits.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 Can someone explain to me how:
 (a, b)  (1, 2)
 is different to
 a  1 and b  2

Right at the moment our code interprets it that way, but this behavior
is wrong per spec.  It should be an ordered column-by-column comparison,
so that the equivalent simple expression is

(a  1) OR (a = 1 AND b  2)

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-24 Thread Greg Stark

Martijn van Oosterhout kleptog@svana.org writes:

 CREATE GLOBAL TEMP TABLE blah INHERITS myhugetable;

I don't think you can have your temp table inherit from a real table. That
would make your real table have temp table semantics. Ie, records in it will
spontaneously disappear on reboot.

But you could use LIKE to define the table which I think does what you want
here.

Partitioned table support would eventually include the ability to take the
table (once you've done the untempify thing below) and move it into the
partitioned table as a partition.

 -- Load data, munge, etc... no xlog traffic
 -- If system crashes here, nothing to recover
 
 ALTER TABLE blah ALTER SCHEMA TO public;  -- untempify

Personally I don't like the idea of tying temporary table status with the
schema. Is this in the standard? It seems to me that you should be able to
create temporary tables in whatever schema you want. People sometimes use
schemas to separate different modules of their application. It's strange to
force them to share a single namespace for temporary tables.

I would think untempify should be a separate special command, not some hidden
feature in ALTER SCHEMA. It makes it less surprising when someone reads the
code. It also makes it more likely someone who needs the functionality will
find it.

ALTER TABLE blah SET NOT TEMPORARY

This also raises the question of whether it should be possible to do:

ALTER TABLE blah SET TEMPORARY

and when it should be possible. At a first cut, any table that isn't involved
in any foreign key relationships and isn't part of any inherited table
structure should be eligible. 

Come to think of it my database would benefit from this feature. I have cache
tables I rebuild nightly. They have to be visible from every session but I
don't care about losing them on a crash since they can always be rebuilt. As
it stands they inflate the daily backups, and if I were using PITR the nightly
build would flood the PITR logs with tons of useless data. Making them global
temporary tables would let me reduce the backup size, reduce PITR volume, and
speed up the nightly cache build in one stroke.

-- 
greg


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-24 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Martijn van Oosterhout kleptog@svana.org writes:
 CREATE GLOBAL TEMP TABLE blah INHERITS myhugetable;

 I don't think you can have your temp table inherit from a real table.

Bear in mind also that this notion of a GLOBAL TEMP table has less than
nothing to do with what the standard says a GLOBAL TEMP table is.  You
can't argue for the idea by claiming it increases our standards
compliance, because it actually is moving even further away.  The spec
says that a temp table is a table that exists but starts empty in every
session, ie, data inserted by one session is not visible to any other
session.  The GLOBAL/LOCAL choice does not affect this, it actually just
governs visibility of the table within different modules (a concept we
don't have at all).

An ALTER TABLE SET LOGGED/UNLOGGED switch might have some merit, but
please don't muddy the waters by confusing this with temp-table status.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PATCHES] default resource limits

2005-12-24 Thread Andrew Dunstan


[moving to -hackers]

Peter Eisentraut wrote:


Am Samstag, 24. Dezember 2005 00:20 schrieb Andrew Dunstan:
 


The rationale is one connection per apache thread (which on Windows
defaults to 400). If people think this is too many I could live with
winding it back a bit - the defaults number of apache workers on Unix is
250, IIRC.
   



It's 150.  I don't mind increasing the current 100 to 150, although I find 
tying this to apache pretty bogus.
 



According to 
http://httpd.apache.org/docs/2.0/mod/mpm_common.html#maxclients the 
default for the prefork MPM, which is the default on Unix, is 256. 400 
appears to be what is used for hybrid MPMs like worker, which is not the 
default for any platform. The default Windows MPM (mpm_winnt) is 
apparently governed by the ThreadsPerChild setting, which defaults to 
64, not 400 as I previously stated.


I really don't like the prospect of making the defaults platform specific, 
especially if the only rationale for that would be apache does it.  Why 
does apache allocate more connections on Windows anyway?


 



It uses a *very* different engine.

Maybe referring to apache is not ideal, although playing nicely with a 
very common client doesn't strike me as totally bogus either.


But what is the rationale for the current settings, or for anything else 
that might be proposed? I have yet to hear any. Is there anyone who 
thinks that 1000/2 for shared_buffers/max_fsm_pages is a good set of 
defaults?


Maybe we need to split this into two pieces, given Tom's legitimate 
concern about semaphore use. How about we increase the allowed range for 
shared_buffers and max_fsm_pages, as proposed in my patch, and leave the 
max_connections issue on the table? I also wondered if instead of first 
setting max_connections and then shared_buffers/max_fsm_pages, we should 
try to scale them in synch somehow.


cheers

andrew





---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PATCHES] default resource limits

2005-12-24 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Maybe we need to split this into two pieces, given Tom's legitimate 
 concern about semaphore use. How about we increase the allowed range for 
 shared_buffers and max_fsm_pages, as proposed in my patch, and leave the 
 max_connections issue on the table? I also wondered if instead of first 
 setting max_connections and then shared_buffers/max_fsm_pages, we should 
 try to scale them in synch somehow.

The existing initdb code actually does try to scale them in sync to some
extent --- take a closer look at the arguments being passed during the
max-connections test phase.  It won't choose a large max_connections
unless it can simultaneously get 5 times that many shared_buffers.
I think this probably needs to be more aggressive though.  In a
situation of limited SHMMAX it's probably more important to keep
shared_buffers as high as we can than to get a high max_connections.
We could think about increasing the 5x multiplier, adding Min and/or Max
limits, or some combination.

BTW, I fat-fingered the calculations I was doing last night --- the
actual shmem consumption in CVS tip seems to be more like 17K per
max_connection increment, assuming max_locks_per_connection = 64.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] default resource limits

2005-12-24 Thread Andrew Dunstan



Tom Lane wrote:


BTW, I fat-fingered the calculations I was doing last night --- the
actual shmem consumption in CVS tip seems to be more like 17K per
max_connection increment, assuming max_locks_per_connection = 64.

 




ITYM max_locks_per_transaction (which as the docs say is confusingly named).

So if we went to 256, say, as an upper limit on max_connections, that 
would account for an extra 2.6Mb of memory use - a pretty modest 
increase, really.



cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] default resource limits

2005-12-24 Thread Andrew Dunstan



Robert Treat wrote:

Maybe we should write something in to check if apache is installed if we're so 
concerned about that usage... 



Er, yeah, I'll get right on that. (Don't hold your breath.)

I already know that I set the connection limits 
lower on most of the installations I do (given than most installations are 
not production webservers).  



So do I. In fact, even on production web servers I usually use 
connection pooling and can rarely get an app to approach saturating a 
pool size of around 20 let alone 100. But then you and I know something 
about tuning Postgres.  What I am aiming for is something that is closer 
to the norm on out of the box configuration.


There is also the argument to be made that just 
because systems these days have more memory doesn't mean we have to use it. 

 



Just because we can run with very little memory doesn't mean we have to. 
What is the point of having lots of memory if you don't use it? We are 
talking defaults here. initdb will still scale down on resource-starved 
machines.


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] Merry Christmas!

2005-12-24 Thread Joshua D. Drake

Hello,

Just a small thank you to all the developers of PostgreSQL for making
2005 another incredible year for Command Prompt, Inc.

Command Prompt hopes to continue increasing it's community contributions
next year.

Merry Christmas everyone!

Sincerely,

Joshua D. Drake
President
Command Prompt, Inc.

--
The PostgreSQL Company - Command Prompt, Inc. 1.503.667.4564
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: PLphp, PLperl - http://www.commandprompt.com/


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-24 Thread Manfred Koizar
On Thu, 22 Dec 2005 10:40:24 -0500, Tom Lane [EMAIL PROTECTED]
wrote:
If you move items
from one page to the other in the opposite direction from the way the
scan is going, then it will miss those items.

AFAIU the (PG implementaion of the) LY method is designed to make
scans immune against problems caused by items moving right within the
same page and against page splits, i.e. items moving to a *new* right
sibling.  So making scans work with items moving to an *existing*
right sibling doesn't seem out of reach.

The code following this comment in _bt_restscan
/*
 * The item we're looking for moved right at least one page, so
 * move right.  We are careful here to pin and read-lock the next
 * non-dead page before releasing the current one.  This ensures
 * that a concurrent btbulkdelete scan cannot pass our position
 * --- if it did, it might be able to reach and delete our target
 * item before we can find it again.
 */
looks like it'd work for the case of page merging as well, as long as
we are careful to move items always from left to right.

BTW, if after having locked both pages we find that we have
super-exclusive locks, i.e. nobody else has pins on these pages, we
can reorganize much more agressively.  It might even be safe to move
items to the left page.  The parent page might need some special
handling, though.

Servus
 Manfred

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-24 Thread Kevin Brown
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  We already do something similar for page deletions.  Empty pages are not
  deleted right away, but they are marked with BTP_DEAD, and then deleted
  on a subsequent vacuum.  Or something like that, I don't remember the
  exact details.
 
 Right, and the reason for that is exactly that there might be a
 concurrent indexscan already in flight to the newly-dead page.
 We must wait to recycle the page until we are certain no such scans
 remain.
 
 It doesn't matter whether a concurrent indexscan visits the dead
 page or not, *because it's empty* and so there's nothing to miss.
 So there's no race condition.  But if you try to move valid data
 across pages then there is a race condition.

Hmm...

Well, REINDEX is apparently a very expensive operation right now.  But
how expensive would it be to go through the entire index and perform
the index page merge operation being discussed here, and nothing else?

If it's fast enough, might it be worthwhile to implement just this
alone as a separate maintenance command (e.g., VACUUM INDEX) that
acquires the appropriate lock (AccessExclusive, I'd expect) on the
index to prevent exactly the issues you're concerned about?

If it's fast enough even on large tables, it would be a nice
alternative to REINDEX, I'd think.


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PATCHES] default resource limits

2005-12-24 Thread Andrew Dunstan

[moved to -hackers]

Petr Jelinek said:
 Andrew Dunstan wrote:

 Just because we can run with very little memory doesn't mean we have to.
 What is the point of having lots of memory if you don't use it? We are
talking defaults here. initdb will still scale down on
 resource-starved  machines.


 Why not just give user few configs tuned to different things like mysql
  does ? Or better, make it initdb option so it will try to set
 higher/lower limits depending on type of config.



And what settings will be tried by distros that automatically run initdb on
first startup?

I don't mind having an initdb option that tunes the settings tried, but that
doesn't remove the need to choose some defaults.

I'm not sure that I think mysql's setup is a good example to follow.

cheers

andrew




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Fixing row comparison semantics

2005-12-24 Thread Alvaro Herrera
Martijn van Oosterhout wrote:

  src/tools/make_ctags is your friend...
 
 That just shows you where a symbol is defined, not where it's called
 from. When you change the parameters of a function, you need to make
 sure you found all the places that used it...
 
 IOW, it's good for going down the call tree, but not up it.

I use cscope for that.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-24 Thread Tom Lane
Manfred Koizar [EMAIL PROTECTED] writes:
 BTW, if after having locked both pages we find that we have
 super-exclusive locks, i.e. nobody else has pins on these pages, we
 can reorganize much more agressively.

No, we cannot.  I'm getting tired of explaining this.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-24 Thread Tom Lane
Kevin Brown [EMAIL PROTECTED] writes:
 Well, REINDEX is apparently a very expensive operation right now.  But
 how expensive would it be to go through the entire index and perform
 the index page merge operation being discussed here, and nothing else?
 If it's fast enough, might it be worthwhile to implement just this
 alone as a separate maintenance command (e.g., VACUUM INDEX) that
 acquires the appropriate lock (AccessExclusive, I'd expect) on the
 index to prevent exactly the issues you're concerned about?
 If it's fast enough even on large tables, it would be a nice
 alternative to REINDEX, I'd think.

This would work, but it's hard to tell if it'd be worthwhile short
of actually doing an implementation and field-testing it ...

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Happy Holidays

2005-12-24 Thread Andy Astor








To the PostgreSQL Community:



All of us at EnterpriseDB want to thank the community for
creating an amazing base upon which to build our new company. Its been a
busy year for us, going from zero to 70 people, raising venture capital, and
working hard to promote both PostgreSQL itself and EnterpriseDBs
additions. In the coming year, we will continue to sponsor and employ individuals,
fund and contribute projects, and look for other creative ways to help out. (Feel
free to contact me directly with any suggestions.) In the meantime, we wish
everyone and their families a fantastic holiday season, and an incredible 2006.



All the best,



 -- Andy 



-

Andy Astor, CEO
EnterpriseDB Corporation
777 New Durham
Road
Edison, NJ
 08817
Tel 732.331.1310
www.enterprisedb.com