Re: [HACKERS] [PERFORM] Hanging queries on dual CPU windows

2006-03-13 Thread Magnus Hagander
  Ok, I've coded up a patch that changes the code to use a 
 mutex instead.
 
 Are we asserting the problem is caused by the spinlock random 
 wake-up order?

Not asserting, more making a wild guess. Which I, as I said, no lnoger
really beleive in - but since the patch was already coded up it's worth
a try.

 I am not sure why this would fix the problem. If my memory 
 serves, a critical section might be a problem if one process 
 aborts unexpected while it is inside. Other waiting processes 
 can never have a chance to enter it (also have no chance to 
 handle SIGQUIT) -- so this patch may solve this.

A critical section only exists within a single process, so that realliy
doesn't apply. And if a thread crashes, the whole process exists.


 There is another suspect in 
 http://www.devisser-siderius.com/stack1.jpg,
 i.e., process 3 does shmctl. I once filed a server core dump 
 bug in win32 of reporting WSAEWOULDBLOCK.
 (http://archives.postgresql.org/pgsql-bugs/2006-02/msg00185.ph
 p). AFAICS, it is actually an mistranslated EINTR. There 
 seems some relation between these issues, but I didn't come 
 up with a complete theory of it.

There could well be. Except the link you sent pointed to a thread stuck
in pgwin32_waitforsinglesocket() insider pgwin32_send() - this is where
I beleive the problem is now.

I'm less-than-trusting the function names in the stacktrace after
examining some more. I'm suspecting process explorer can only see
non-static functions, and that the pg_queue_signal+0x120 actually
points into a different function. (really, pg_queue_signal cannot
possibly be 0x120 bytes machine code..) I bet it's just in
pg_signal_thread(), which is a perfectlyi normal place to block. It also
matches the behaviour I see on a completely fresh backend - which also
shows that pg_queue_signal+0x120.

A good thing to test would be to rebuild signal.c and socket.c without
any functions declared as static and see if the picture changes. (If
nothing else it would confirm this behaviour in process explorer)

Mvh,
 Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] [HACKERS] Automatic free space map filling

2006-03-13 Thread ITAGAKI Takahiro
Simon Riggs [EMAIL PROTECTED] wrote:

  Zeugswetter Andreas DCP SD [EMAIL PROTECTED] wrote:
   Ok, we cannot reuse a dead tuple. Maybe we can reuse the space of a dead
   tuple by reducing the tuple to it's header info.
  
  Attached patch realizes the concept of his idea. The dead tuples will be
  reduced to their headers are done by bgwriter.
 
 I'm interested in this patch but you need to say more about it. I get
 the general idea but it would be useful if you could give a full
 description of what this patch is trying to do and why.

OK, I try to explain the patch. Excuse me for a long writing.


* Purpose
  The basic idea is just reducing the dead tuple to it's header info,
suggested by Andreas. This is a lightweight per-page sweeping to reduce
the consumption of free space map and the necessity of VACUUM; i.e,
normal VACUUM is still needed occasionally.

  I think it is useful on heavy-update workloads. It showed 5-10% of
performance improvement on DBT-2 after 9 hours running *without* vacuum.
I don't know whether it is still effective with well-scheduled vacuum.

* Why does bgwriter do vacuum?
  Sweeping has cost, so non-backend process should do. Also, the page worth
vacuum are almost always dirty, because tuples on the page are just updated
or deleted. Bgwriter treats dirty pages, so I think it is a good place for
sweeping.

* Locking
  We must take super-exclusive-lock of the pages before vacuum. In the patch,
bgwriter tries to take exclusive-lock before it writes a page, and does
vacuum only if the lock is super-exclusive. Otherwise, it gives up and
writes the pages normally. This is an optimistic way, but I assume the
possibility is high because the most pages written by bgwriter are least
recently used (LRU).

* Keep the headers
  We cannot remove dead tuples completely in per-page sweep, because
references to the tuples from indexes still remains. We might keep only
line pointers (4 bytes), but it might lead line-pointer-bloat problems,
(http://archives.postgresql.org/pgsql-hackers/2006-03/msg00116.php).
so the headers (4+32 byte) should be left.

* Other twists and GUC variables in the patch
- Bgwriter cannot access the catalogs, so I added BM_RELATION hint bit
  to BufferDesc. Only relation pages will be swept. This is enabled by
  GUC variable 'bgvacuum_relation'.
- I changed bgwriter_lru_maxpages to be adjusted automatically. Backends
  won't do vacuum not to disturb their processing, so bgwriter should write
  most of dirty pages. ('bgvacuum_autotune')
- After sweepping, the page will be added to free space map. I made a simple
  replacement algorithm of free space map, that replaces the page with least
  spaces near the added one. ('bgvacuum_fsm')

* Issues
- If WAL is produced by sweeping a page, writing the page should be pended
  for a while, because flushing the WAL is needed before writing the page.
- Bgwriter writes pages in 4 contexts, background-writes for LRU, ALL,
  checkpoint and shutdown. In current patch, pages are swept in 3 contexts
  except shutdown, but it may be better to do only on LRU.

* Related discussions
- Real-Time Vacuum Possibility (Rod Taylor)
http://archives.postgresql.org/pgsql-hackers/2005-03/msg00518.php
| have the bgwriter take a look at the pages it has, and see if it can do
| any vacuum work based on pages it is about to send to disk
- Pre-allocated free space for row updating (like PCTFREE) (Satoshi Nagayasu)
http://archives.postgresql.org/pgsql-hackers/2005-08/msg01135.php
| light-weight repairing on a single page is needed to maintain free space
- Dead Space Map (Heikki Linnakangas)
http://archives.postgresql.org/pgsql-hackers/2006-02/msg01125.php
| vacuuming pages one by one as they're written by bgwriter


Thank you for reading till the last.
I'd like to hear your comments.

---
ITAGAKI Takahiro
NTT Cyber Space Laboratories



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [COMMITTERS] pgsql: Remove Jan Wieck`s name from copyrights, and put in standard

2006-03-13 Thread Peter Eisentraut
Greg Sabino Mullane wrote:
 I think everyone realizes at this point that the PGDG is not
 an official legal entity, but do we at least have a modern
 statement from Core as to what it is unofficially? In other
 words, the PostgreSQL Global Development Group is
 composed of 

Under international copyright law, the copyright is held by the authors 
of the work, no matter what you write into a copyright notice, if any.  
The only purpose of the copyright notices under discussion here is to 
notify the recipient of the file that this file belongs to the 
PostgreSQL source code, the authors of the code claim to have 
copyright, and you should check the license before doing anything 
further.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Where Can I Find The Code Segment For WAL Control?

2006-03-13 Thread Csaba Nagy
[Please use reply to all so the list is CC-d]

Charlie,

I guess what you're after is to make sure the WAL buffers are shipped to
the stand-by at the same time as they are committed to disk. In any
other case your desire to have the stand-by EXACTLY in sync with the
primary server will not gonna work.

But that would mean that the communication to the stand-by will become a
performance bottleneck, as all transactions are only finished after the
WAL records for them are synced to the disk. So if you want your
stand-by completely in sync with your primary, you will want that the
transactions finish only after their WAL records are pushed to the
stand-by too... and then if the communication to the stand-by fails, all
your transactions will wait after it, possibly causing the primary to
stop working properly. So now you have another point of failure, and
instead of making the setup safer, you make it unsafer. What I want to
say is that it is likely not feasible to keep the stand-by completely in
sync.

In practice it is enough to keep the standby NEARLY in sync with the
primary server. That means you will ship the WAL records asynchronously,
i.e. after they are written to the disk, and in a separate thread.

What I'm after is to have a thread which starts streaming the current
WAL file, and keeps streaming it as it grows. I'm not completely sure
how I'll implement that, but I guess it will need to do a loop and
transfer whatever records are available, and then sleep a few seconds if
it reaches the end. It must be prepared to stumble upon partially
written WAL records, and sleep on those too. On the stand-by end, the
current partial WAL will not be used unless the stand-by is fired up...

So I'm after a solution which makes sure the stand-by is as up to date
as possible, with a few seconds allowed gap in normal operation, and
possibly more if the communication channel has bandwidth problems and
the server is very busy. Usually if the server crashes, than there are
worse problems than the few seconds/minutes worth of lost transactions.
To name one, if the server crashes you will have for sure at least a few
minutes of downtime. At least for our application, downtime in a busy
period is actually worse than the lost data (that we can recover from
other logs)...

Cheers,
Csaba.

On Sun, 2006-03-12 at 02:50, 王宝兵 wrote:
  
 Csaba: 
 
  
 
 Firstly I must thank you for your help.Some of our designs are identical
 except the following: 
 
  
 
 - create a standby manager program which only needs to know how to 
 
 Access the primary server in order to create the standby (by connecting 
 
 To it through normal data base connections and using the above mentioned 
 
 Functions to stream the files); 
 
  
 
 In my opinion,if we create a standby manager program and run it as a daemon
 process,it will check the state of the WAL files of the Principal every few
 seconds.But there is a risk for data lost.For an instance,if the Principal
 has flushed its log buffer to the disk and the dirty data are also flushed
 immediately,but the standby manager program is running in its interval.Then
 the Principal fails.In this situation,the Principal has updated its database
 but the log segment hasn't been sent to the Mirror,because the time point
 for the standby manager program to check the WAL files hasn't come.And then
 these data are lost. 
 
  
 
 I think this situation will happen very probably in a big cooperation and it
 s very serious. 
 
  
 
 Perhaps I have misunderstood your opinion.If that,I apologize. 
 
  
 
 Charlie Wang 
 
  
 
 
 
 
 
 
 
 
 
 


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] [HACKERS] pg_freespacemap question

2006-03-13 Thread Alvaro Herrera
Mark Kirkwood wrote:
 Christopher Kings-Lynne wrote:
 The point here is that if tuples require 50 bytes, and there are 20
 bytes free on a page, pgstattuple counts 20 free bytes while FSM
 ignores the page.  Recording that space in the FSM will not improve
 matters, it'll just risk pushing out FSM records for pages that do
 have useful amounts of free space.
 
 Maybe an overloaded pgstattuple function that allows you to request FSM 
 behavior?
 
 That's a nice idea - could also do equivalently by adding an extra 
 column usable_free_space or some such, and calculating this using FSM 
 logic.

The current pgstattuple function scans the whole table, so I don't think
this is a good idea.  Re: the overloaded function, I think the behaviors
are different enough to merit a separate function, with a different
name.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] Hanging queries on dual CPU windows

2006-03-13 Thread Jan de Visser
On Sunday 12 March 2006 09:40, Magnus Hagander wrote:
 Looking a my system while testing this it still loooked like it was
 hanging on that plac ein the code, even though I saw no problems. So I'm
 not convinced we can actually trust the stacktrace from the non-default
 threads. So I don't think this patch will actually work :-( But it's
 worth a try.

I'm afraid you're right. Hangs again :(

jan


-- 
--
Jan de Visser                     [EMAIL PROTECTED]

                Baruk Khazad! Khazad ai-menu!
--

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] Hanging queries on dual CPU windows

2006-03-13 Thread Jan de Visser
On Monday 13 March 2006 09:26, Jan de Visser wrote:
 On Sunday 12 March 2006 09:40, Magnus Hagander wrote:
  Looking a my system while testing this it still loooked like it was
  hanging on that plac ein the code, even though I saw no problems. So I'm
  not convinced we can actually trust the stacktrace from the non-default
  threads. So I don't think this patch will actually work :-( But it's
  worth a try.

 I'm afraid you're right. Hangs again :(

I now have the toolchain set up, so if you want me to try stuff, please let me 
know. Resolving this is important to us.

On a whim, I replaced InitializeCriticalSection with 
InitializeCriticalSectionAndSpinCount, since MSDN told me that would be 
better for SMP. No joy.

jan

-- 
--
Jan de Visser                     [EMAIL PROTECTED]

                Baruk Khazad! Khazad ai-menu!
--

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Jonah H. Harris
On 3/13/06, Dave Cramer [EMAIL PROTECTED] wrote:
One of the purposes of this as I understand it is to allow clients toget back the generated key(s). I don't see enough of the syntax tosee if this is possible with the DB2 syntax below.
I believe it would be something like

CREATE SEQUENCE test_id_seq;
CREATE TABLE test_tbl (test_id bigint not null default nextval('test_id_seq'), test_name varchar(32), PRIMARY KEY(test_id));

To get the generated sequence:
SELECT test_id FROM NEW TABLE INSERT INTO test_tbl VALUES (nextval('test_id_seq'), 'Joe Blow');
The reason for NEW is because there is no OLD version of the record.

In contrast, something similar to:
SELECT test_name FROM OLD TABLE UPDATE test_tbl SET test_name = 'John Doe' WHERE test_id = 1;
would return Joe Blow

Whereas:
SELECT test_name FROM NEW TABLE UPDATE test_tbl SET test_name = 'John Doe' WHERE test_id = 1;
would return John Doe

Again, I haven't really used it, but have read over the docs
briefly. I'm just wondering if anyone has used it and
likes/dislikes it.

-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [HACKERS] DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Lukas Smith

Jonah H. Harris wrote:

Again, I haven't really used it, but have read over the docs briefly.  
I'm just wondering if anyone has used it and likes/dislikes it.


I guess you could get the same effect from a transaction. If there is 
much network overhead you could also write a stored procedure. This is 
obviously more direct. Due to caching I am not sure how much 
performance improvement there is in doing  the read/write in one statement.


IMHO this is only useful in fringe cases, can be implemented efficiently 
with existing syntax and so just adds useless complexity.


regards,
Lukas

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Fwd: DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Jonah H. Harris
On 3/13/06, Lukas Smith [EMAIL PROTECTED] wrote:


I guess you could get the same effect from a transaction. If there ismuch network overhead you could also write a stored procedure. This isobviously more direct. Due to caching I am not sure how much

performance improvement there is in doingthe read/write in one statement.
What are you talking about?
IMHO this is only useful in fringe cases, can be implemented efficiently
with existing syntax and so just adds useless complexity.

Show me an example of how efficiently you can get this syntax.
Let's see, I know I could get it using our current syntax too... 

SELECT test_id FROM insert_into_table('test_tbl', array['nextval(\'test_id_seq\')','test']) AS t1(test_id BIGINT);

Where insert_into_table is a general function that takes inputs, builds
the insert statement, executes the insert statement, builds a
selection, and returns the row. Of course, you could have a
*custom* function that only works for the test_tbl, that would make it
easy to return the next sequence id... or wait, you could write the
general function to go lookup the table definition, find out for itself
to do the nextval, and do more craziness ad nauseum.

In the end, how is this more efficient or easy than:

INSERT INTO test_tbl VALUES (nextval('test_id_seq'), 'John Doe') RETURNING test_id;
OR

SELECT test_id FROM NEW TABLE INSERT INTO test_tbl VALUES (nextval('test_id_seq'), 'John Doe');

Based on your statement, this should be really easy.
It's easy to make generalized statements, so let's see an example to
compare.

-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324




Re: [HACKERS] Fwd: DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Lukas Smith

Jonah H. Harris wrote:


In the end, how is this more efficient or easy than:

INSERT INTO test_tbl VALUES (nextval('test_id_seq'), 'John Doe') 
RETURNING test_id;

OR
SELECT test_id FROM NEW TABLE INSERT INTO test_tbl VALUES 
(nextval('test_id_seq'), 'John Doe');


Based on your statement, this should be really easy.  It's easy to make 
generalized statements, so let's see an example to compare.


Ah, I was just not clear on what you were comparing this against. So 
nevermind.


regards,
Lukas

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] Hanging queries on dual CPU windows

2006-03-13 Thread Magnus Hagander
   Looking a my system while testing this it still loooked 
 like it was 
   hanging on that plac ein the code, even though I saw no 
 problems. So 
   I'm not convinced we can actually trust the stacktrace from the 
   non-default threads. So I don't think this patch will 
 actually work 
   :-( But it's worth a try.
 
  I'm afraid you're right. Hangs again :(
 
 I now have the toolchain set up, so if you want me to try 
 stuff, please let me know. Resolving this is important to us.

Great. That'll certainly help - now you don't have to wait for binaries
from me.

What I'd be interested in seeing is new stackdumps from a version where
you:
1) Do *not* have the patch for mutexes applied
2) Have removed static from all the function devlarations in signal.c
and socket.c, bnoth in src/backend/port/win32.

If you can, it'd be interesting to see it from the pre-SP1 install as
well - once it hangs.


 On a whim, I replaced InitializeCriticalSection with 
 InitializeCriticalSectionAndSpinCount, since MSDN told me 
 that would be better for SMP. No joy.

No, that should make no difference - except possibly a tiny difference
in speed.


Do you have the ability to test 8.0 on the same machine? We did some
extensive modifications to the signal stuff between 8.0 and 8.1, it'd be
interesting to see if that changed things.

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Restoring a Full Cluster on a Different Architecture (32 x 64)

2006-03-13 Thread Rodrigo Hjort
Dear PostgreSQL Hackers,We got a PG 8.1 on a Debian 64 bits, which does a full backup (PITR) daily.Then we installed a Debian 32 bits (actually, it's on VMWare) and wanted to restore the previous PG cluster on it.
As there are a lot of indexes, specially GiST, pg_dump and pg_restore are not viable - will take a lot of time!Well, the fact is that we've got the message below on postmaster start attempt:
WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this program
is expecting. The results below are untrustworthy.As the architecture on both Linuxes are different (32 and 64 bits), I think PGDATA/global/pg_control might contains 64 bit data such that the 32 bits binary won't recognize or even mispell it. Am I right?
What could be done in order to fix it? Is there any kind of application to translate it or the only solution was to pg_dumpall and pg_restore the cluster?**
[EMAIL PROTECTED]:/tmp/lala/global$ uname -aLinux pga1 2.6.8-2-686 #1 Tue Aug 16 13:22:48 UTC 2005 i686 GNU/Linux[EMAIL PROTECTED]:/tmp/lala/global$ pg_controldata /var/lib/postgresql/8.1/main/WARNING: Calculated CRC checksum does not match value stored in file.
Either the file is corrupt, or it has a different layout than this programis expecting. The results below are untrustworthy.pg_control version number: 812Catalog version number: 200510211
Database system identifier: 4883914971069546458Database cluster state: in productionpg_control last modified: Wed 31 Dec 1969 09:00:00 PM BRTCurrent log file ID: 1142136269
Next log file segment: 0Latest checkpoint location: 1/30Prior checkpoint location: 1/2F71B630Latest checkpoint's REDO location: 1/2F71B5E0Latest checkpoint's UNDO location: 1/2F71B630
Latest checkpoint's TimeLineID: 0Latest checkpoint's NextXID: 0Latest checkpoint's NextOID: 1Latest checkpoint's NextMultiXactId: 36239847Latest checkpoint's NextMultiOffset: 1819439
Time of latest checkpoint: Wed 31 Dec 1969 09:00:11 PM BRTMaximum data alignment: 25Database block size: 0Blocks per segment of large relation: 8Bytes per WAL segment: 0
Maximum length of identifiers: 0Maximum columns in an index: 1093850759Date/time type storage: 64-bit integersMaximum length of locale name: 131072LC_COLLATE:LC_CTYPE:
**pgsql01:~# uname -aLinux pgsql01 2.6.8-11-em64t-p4-smp #1 SMP Mon Oct 3 00:07:51 CEST 2005 x86_64 GNU/Linux
pgsql01:~# /usr/lib/postgresql/8.1/bin/pg_controldata /pg/data/pg_control version number: 812Catalog version number: 200510211Database system identifier: 4883914971069546458
Database cluster state: in productionpg_control last modified: Mon Mar 13 14:19:42 2006Current log file ID: 1Next log file segment: 51Latest checkpoint location: 1/3289F8E0
Prior checkpoint location: 1/32827710Latest checkpoint's REDO location: 1/3289F8E0Latest checkpoint's UNDO location: 0/0Latest checkpoint's TimeLineID: 1Latest checkpoint's NextXID: 37253588
Latest checkpoint's NextOID: 1819439Latest checkpoint's NextMultiXactId: 11Latest checkpoint's NextMultiOffset: 25Time of latest checkpoint: Mon Mar 13 14:19:42 2006Maximum data alignment: 8
Database block size: 8192Blocks per segment of large relation: 131072Bytes per WAL segment: 16777216Maximum length of identifiers: 64Maximum columns in an index: 32
Date/time type storage: 64-bit integersMaximum length of locale name: 128LC_COLLATE: pt_BRLC_CTYPE: pt_BR**
Regards,Rodrigo HjortGTI - Projeto PostgreSQLCELEPAR - Cia de Informática do Paranáhttp://www.pr.gov.br



Re: [HACKERS] [PERFORM] Hanging queries on dual CPU windows

2006-03-13 Thread Magnus Hagander
 On Monday 13 March 2006 12:27, Magnus Hagander wrote:
  Great. That'll certainly help - now you don't have to wait for 
  binaries from me.
 
  What I'd be interested in seeing is new stackdumps from a version 
  where
  you:
  1) Do *not* have the patch for mutexes applied
  2) Have removed static from all the function devlarations in 
  signal.c and socket.c, bnoth in src/backend/port/win32.
 
 I did that, and the interesting thing is that: 
 1. It takes much longer to hang.

?! That shouldn't be related :-)

 2. Once it hangs, the stacktraces are the same.

Hmm. That's weird :-(
Did you do a make clean? Sometimes needed to get the port stuff in,
mingw messes up sometimes.

 3 (and this is the kicker). The thing starts working again 
 after a couple (+/-
 5) minutes ?

Interesting. And you get nothing in the logs? (pg_log / eventlog)


 1. can probably be explained by the fact that I didn't 
 compile with any optimization. Can you tell me what CFLAGS 
 the binary distro uses? 

You can use pg_config to see that.

2. I don't know (are there other tools I can use?), and 
Not really, but try the make clean.

 3. I frankly don't understand. I know 
 for sure that with the stock 8.1.3 it would not revive itself 
 (I let it running for a *long* time).

Very interesting.


  If you can, it'd be interesting to see it from the pre-SP1 
 install as 
  well - once it hangs.
 
 I've never seen a pre-SP1 install hang.

Oh, hang on, what I meant was with the post-SP1 hang but with stats
disabled. :-)


  Do you have the ability to test 8.0 on the same machine? We 
 did some 
  extensive modifications to the signal stuff between 8.0 and 
 8.1, it'd 
  be interesting to see if that changed things.
 
 I seem to remember we made ourselves dependend on 8.1 
 somehow, but will check.

Ok. Please do.

//Magnus

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Gavin Sherry
On Sun, 12 Mar 2006, Jonah H. Harris wrote:

 I was talking with Jonathan Gennick about the INS/UPD/DEL RETURNING stuff,
 and he recommended looking into the way DB2 handles similar functionality.
 After looking into it a bit, it's more inline with what Tom's suggestion was
 regarding a query from the operation rather than returning the values in the
 manner currently required.

 Here's DB2's syntax... does anyone have any familiarity with it?

 Simply put, it's sort-of like:

 SELECT * FROM (FINAL | NEW | OLD) TABLE (INSERT | UPDATE | DELETE)

 I'd like to hear from anyone that's used it to see if it really is better...
 logically it seems nicer, but I've never used it.

It works well for cases where you want to pass the result of an
insert/delete/update to another query. There was a paper on IBM developer
works on how they got the 7 or so queries in an order transaction in TPC-C
down to 3 queries and increased throughput impressively.

This doesn't solve the generated keys problem that the Java and probably
.NET interfaces have. Mind, RETURNING doesn't solve anything either.

I prefer this syntax to RETURNING. Then again, Oracle is a bigger target
than DB2 so... I'm not sure.

Thanks,

Gavin

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Tom Lane
Gavin Sherry [EMAIL PROTECTED] writes:
 On Sun, 12 Mar 2006, Jonah H. Harris wrote:
 SELECT * FROM (FINAL | NEW | OLD) TABLE (INSERT | UPDATE | DELETE)

 This doesn't solve the generated keys problem that the Java and probably
 .NET interfaces have. Mind, RETURNING doesn't solve anything either.

Why not?  AFAICS, either one lets you get at generated keys.

It's quite unclear to me what the difference is between FINAL and
NEW ... any clarification there?

The OLD idea is cute but I'm not sure how useful it really is.  They
seem to have missed a bet anyway: if I understand how this works, you
can't get values from both new and old row states in the UPDATE case.
The classification seems bogus for both INSERT and DELETE, too; neither
of them have more than one row state to deal with.

Also, is the front SELECT allowed to have its own WHERE, or is it
constrained to return exactly one row per inserted/updated/deleted row?
If it can have a WHERE then there's a syntactic ambiguity in
SELECT ... FROM NEW TABLE UPDATE ... WHERE ...

More generally, this syntax is problematic in that it's syntactically
possible to use SELECT FROM NEW TABLE ... as a sub-query, which seems
like a truly horrid idea from both semantics and implementation
perspectives.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Restoring a Full Cluster on a Different Architecture (32 x 64)

2006-03-13 Thread Martijn van Oosterhout
On Mon, Mar 13, 2006 at 02:56:00PM -0300, Rodrigo Hjort wrote:
 Dear PostgreSQL Hackers,
 
 We got a PG 8.1 on a Debian 64 bits, which does a full backup (PITR) daily.
 Then we installed a Debian 32 bits (actually, it's on VMWare) and wanted to
 restore the previous PG cluster on it.
 As there are a lot of indexes, specially GiST, pg_dump and pg_restore
 are not viable - will take a lot of time!

Can't be done. The differences in alignments, size, placement, etc will
make it completly. PostgreSQL doesn't even try to maintain a consistant
file format with different configure options...

pg_dump is the only way.
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


signature.asc
Description: Digital signature


Re: [HACKERS] Restoring a Full Cluster on a Different Architecture (32 x 64)

2006-03-13 Thread Jonah H. Harris
On 3/13/06, Rodrigo Hjort [EMAIL PROTECTED] wrote:
As
the architecture on both Linuxes are different (32 and 64 bits), I
think PGDATA/global/pg_control might contains 64 bit data such that
the 32 bits binary won't recognize or even mispell it. Am I right?

Yes, the platform architecture is key. You won't be able to read the 64-bit data files on a 32-bit box.
What could be done in order to fix it? Is there any kind of
application to translate it or the only solution was to pg_dumpall
and pg_restore the cluster?
Yes, dump and restore is the best way to go.
-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [HACKERS] DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Jonah H. Harris
On 3/13/06, Tom Lane [EMAIL PROTECTED] wrote:
Also, is the front SELECT allowed to have its own WHERE, or is itconstrained to return exactly one row per inserted/updated/deleted row?
If it can have a WHERE then there's a syntactic ambiguity inSELECT ... FROM NEW TABLE UPDATE ... WHERE ...
Yes, I believe it supports SELECT .. FROM NEW TABLE (UPDATE .. WHERE ..) WHERE

IBM's paper, Returning Modified Rows--SELECT Statements with Side Effects is here:
http://www.isys.ucl.ac.be/vldb04/eProceedings/contents/pdf/IND1P1.PDF


I'll look up more.-- Jonah H. Harris, Database Internals ArchitectEnterpriseDB Corporation732.331.1324


Re: [HACKERS] Restoring a Full Cluster on a Different Architecture (32 x 64)

2006-03-13 Thread Greg Stark

Rodrigo Hjort [EMAIL PROTECTED] writes:

 What could be done in order to fix it? Is there any kind of application to
 translate it or the only solution was to pg_dumpall and pg_restore the
 cluster?

Unfortunately pg_dump/pg_restore is going to be your only option here. The
database files are specific to the architecture and 32-bit and 64-bit linux
are different architectures. It's just as hard as moving from Sparc to IA32.
The only mechanism Postgres has it to do a dump and restore.

-- 
greg


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] Hanging queries on dual CPU windows

2006-03-13 Thread Merlin Moncure
 Do you have the ability to test 8.0 on the same machine? We did some
 extensive modifications to the signal stuff between 8.0 and 8.1, it'd be
 interesting to see if that changed things.

I had very similar behavior some weeks back on a machine that had not
been upgraded to 8.1.  It was a dual opteron on win2k server.  Some
simple queries (select 1 + 2) would work ok but anything that did real
work on tables would hang and the backend would not respond to
signals.  Only recourse was to end task from task mgr which cycled the
entire server with no data loss.

This may or may not be the same problem but it sounds similar.

Unfortunately the problem was extremely time sensitive and I could not
play with it much.  However, since this is pre-8.1 this argues against
Qingqing's signal changes (maybe).  I've since moved on to a linux
environment so my win32 contributions will diminish greatly :)

merlin

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Proposal for updatable views

2006-03-13 Thread Bernd Helmle



--On Sonntag, März 12, 2006 23:52:12 -0500 Neil Conway [EMAIL PROTECTED] 
wrote:



On Fri, 2006-03-10 at 11:21 +0100, Bernd Helmle wrote:

Please find attached a patch that implements SQL92-compatible updatable
views.


I'm currently reviewing this. Comments later...



ok


Please note that the patch isn't complete yet


Do you have a list of known TODO items?



The code needs to be teached to handle indexed array fields correctly, at 
the moment this causes the backend to crash.


And there's also a shift/reduce conflict, which needs to be fixed in 
gram.y. The code has some fragments around which aren't used anymore, so a 
cleanup is on my todo as well (however, some are already ifdef'ed out).


   Bernd




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] log_duration and log_statement

2006-03-13 Thread Guillaume Smet
Hello,

Here are some background information to explain our issue and request.
We are currently planning a migration from PostgreSQL 7.4 to
PostgreSQL 8.1. We work on a medium sized database (2GB) with a rather
important activity (12 millions queries a day with peaks up to 1000
queries/s).
We are analyzing the logs with a tool we developed (namely pgFouine
available on pgFoundry). We currently use the following configuration
for logging:
- log_min_duration_statement = 500 to log the slowest queries
- log_duration to log every query duration and have a global overview
of our database activity (used to generate this sort of graphs:
http://people.openwide.fr/~gsmet/postgresql/graphs.html ).
We cannot log every query as we already generate 1.2GB of logs a day
while only logging the text of one hundredth of the queries so we log
only the duration for the not so slow queries.

I didn't notice the log_duration behaviour has changed starting from
8.0 (thanks to oicu for pointing me the 8.0 release notes on
#postgresql) and what we did is not possible anymore with 8.x as
log_duration now only logs the duration for queries logged with
log_statement.

I think the former behaviour can be interesting in our case and
probably for many other people out there who use log analysis tools as
logging only slow queries is not enough to have an overview of the
database activity.
I was thinking about something like log_duration = 'none|logged|all'
which will allow us to switch between:
- none: we don't log the duration (=log_duration=off);
- logged: we log the duration only for logged queries (depending on
log_statement as for 8.0);
- all: we log every duration as 7.4 did before when log_duration was on.

Any comment on this?

Thanks in advance for considering my request.

Regards,

--
Guillaume

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Gavin Sherry
On Mon, 13 Mar 2006, Tom Lane wrote:

 Gavin Sherry [EMAIL PROTECTED] writes:
  On Sun, 12 Mar 2006, Jonah H. Harris wrote:
  SELECT * FROM (FINAL | NEW | OLD) TABLE (INSERT | UPDATE | DELETE)

  This doesn't solve the generated keys problem that the Java and probably
  .NET interfaces have. Mind, RETURNING doesn't solve anything either.

 Why not?  AFAICS, either one lets you get at generated keys.

There are a few different ways to get at generated keys from JDBC at
least. The case we cannot trivially deal with is when the code executes a
statement and then wants a result set of all generated keys. That is, it
doesn't register which generated keys it wants returned before the query
is executed.


 It's quite unclear to me what the difference is between FINAL and
 NEW ... any clarification there?

NEW returns the representation of the data which the statement creates;
FINAL is the final representation of the data, after AFTER triggers have
been applied.


 The OLD idea is cute but I'm not sure how useful it really is.  They
 seem to have missed a bet anyway: if I understand how this works, you
 can't get values from both new and old row states in the UPDATE case.
 The classification seems bogus for both INSERT and DELETE, too; neither
 of them have more than one row state to deal with.

Right, it's not as useful as our OLD.*, NEW.*.


 Also, is the front SELECT allowed to have its own WHERE, or is it
 constrained to return exactly one row per inserted/updated/deleted row?
 If it can have a WHERE then there's a syntactic ambiguity in
   SELECT ... FROM NEW TABLE UPDATE ... WHERE ...

That's definately ambiguous. The manual doesn't clarify and I do not have
DB2 installed locally.


 More generally, this syntax is problematic in that it's syntactically
 possible to use SELECT FROM NEW TABLE ... as a sub-query, which seems
 like a truly horrid idea from both semantics and implementation
 perspectives.

I cannot see any reference to whether this is allowed in DB2. The DB2
manual and other IBM apps use it extensively in named expressions. Ie,

WITH
foo as (SELECT FROM NEW TABLE(...)),
bar as (SELECT FROM OLD TABLE(...))
SELECT ... FROM foo, bar

It does say that a 'data change table reference' is simply a type of table
reference so I suppose it can occur in a sub query. The ability to have
INSERT ... RETURNING in a from clause would achieve most of this, I think.

Thanks,

Gavin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] DB2-style INS/UPD/DEL RETURNING

2006-03-13 Thread Simon Riggs
On Sun, 2006-03-12 at 11:11 -0500, Jonah H. Harris wrote:
 I was talking with Jonathan Gennick about the INS/UPD/DEL RETURNING
 stuff, and he recommended looking into the way DB2 handles similar
 functionality.  After looking into it a bit, it's more inline with
 what Tom's suggestion was regarding a query from the operation rather
 than returning the values in the manner currently required. 
 
 Here's DB2's syntax... does anyone have any familiarity with it?
 
 Simply put, it's sort-of like:
 
 SELECT * FROM (FINAL | NEW | OLD) TABLE (INSERT | UPDATE | DELETE)
 
 I'd like to hear from anyone that's used it to see if it really is
 better... logically it seems nicer, but I've never used it. 

Hmmm...well, IMHO either syntax is fairly contrived, but the DB2 syntax
does seem a more meaningful way of doing this. It is pretty obscure
though...most DB2 people don't know the above syntax because its new in
DB2 8.1

The DB2 syntax allows you to more easily do things like a simultaneous
copy-and-delete from a holding table into a main table, e.g.

INSERT INTO MAINTABLE
SELECT * FROM NEW TABLE (DELETE FROM HOLDINGTABLE WHERE ...)

Thats quite a nice performance trick I've used to save doing separate
INSERT and DELETE tasks on a busy table. 

The Oracle syntax reads less well for that type of task. 

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [PERFORM] Hanging queries on dual CPU windows

2006-03-13 Thread Jan de Visser
On Monday 13 March 2006 12:27, Magnus Hagander wrote:
 Great. That'll certainly help - now you don't have to wait for binaries
 from me.

 What I'd be interested in seeing is new stackdumps from a version where
 you:
 1) Do *not* have the patch for mutexes applied
 2) Have removed static from all the function devlarations in signal.c
 and socket.c, bnoth in src/backend/port/win32.

I did that, and the interesting thing is that: 
1. It takes much longer to hang.
2. Once it hangs, the stacktraces are the same.
3 (and this is the kicker). The thing starts working again after a couple (+/- 
5) minutes ?

1. can probably be explained by the fact that I didn't compile with any 
optimization. Can you tell me what CFLAGS the binary distro uses? 2. I don't 
know (are there other tools I can use?), and 3. I frankly don't understand. I 
know for sure that with the stock 8.1.3 it would not revive itself (I let it 
running for a *long* time).


 If you can, it'd be interesting to see it from the pre-SP1 install as
 well - once it hangs.

I've never seen a pre-SP1 install hang.


  On a whim, I replaced InitializeCriticalSection with
  InitializeCriticalSectionAndSpinCount, since MSDN told me
  that would be better for SMP. No joy.

 No, that should make no difference - except possibly a tiny difference
 in speed.


 Do you have the ability to test 8.0 on the same machine? We did some
 extensive modifications to the signal stuff between 8.0 and 8.1, it'd be
 interesting to see if that changed things.

I seem to remember we made ourselves dependend on 8.1 somehow, but will check.

jan




-- 
--
Jan de Visser                     [EMAIL PROTECTED]

                Baruk Khazad! Khazad ai-menu!
--

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Proposal for updatable views

2006-03-13 Thread Jaime Casanova
On 3/13/06, Bernd Helmle [EMAIL PROTECTED] wrote:


 --On Sonntag, März 12, 2006 23:52:12 -0500 Neil Conway [EMAIL PROTECTED]
 wrote:

  On Fri, 2006-03-10 at 11:21 +0100, Bernd Helmle wrote:
  Please find attached a patch that implements SQL92-compatible updatable
  views.
 
  I'm currently reviewing this. Comments later...
 

 ok

  Please note that the patch isn't complete yet
 
  Do you have a list of known TODO items?
 

There's a problem with CASTed expressions because it thinks (and with
reason) that they are functions expressions (and those are not
allowed) but with CAST you have to be flexible...

i was working on that but at the time i am very busy...

--
regards,
Jaime Casanova

What they (MySQL) lose in usability, they gain back in benchmarks, and that's
all that matters: getting the wrong answer really fast.
   Randal L. Schwartz

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Proposal for updatable views

2006-03-13 Thread Neil Conway
On Sun, 2006-03-12 at 23:39 +0800, William ZHANG wrote:
 Maybe you can fix it like UNIONJOIN.

Indeed, that is one option. Because the syntax is WITH [ LOCAL |
CASCADED ] CHECK OPTION, ISTM we'll actually need three new tokens:
WITH_LOCAL, WITH_CASCADED, and WITH_CHECK, which is even uglier :-( Per
a suggestion from Dennis Bjorklund, it might be cleaner to introduce a
lexer hack for the places where WITH can occur in a SelectStmt, which I
believe is just WITH TIME ZONE.

 But Tom said he want to remove the support for UNION JOIN and save the
 overhead

It would be unfortunate to revert the change, but I doubt the overhead
is very significant. Does anyone have any better suggestions for how to
resolve the problem? (My Bison-foo is weak, I have to confess...)

-Neil



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Proposal for updatable views

2006-03-13 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Sun, 2006-03-12 at 23:39 +0800, William ZHANG wrote:
 Maybe you can fix it like UNIONJOIN.

 Indeed, that is one option.

Not any more ;-)

 It would be unfortunate to revert the change, but I doubt the overhead
 is very significant. Does anyone have any better suggestions for how to
 resolve the problem? (My Bison-foo is weak, I have to confess...)

Worst case is we promote WITH to a fully reserved word.  While I don't
normally care for doing that, it *is* a reserved word per SQL99, and
offhand I don't see likely scenarios for someone using with as a table
or column or function name.  (Anyone know of a language in which with
is a noun or verb?)

A quick look at the grammar suggests that the key problem is the
opt_timezone production --- it might be that if we removed that in
favor of spelling out the alternatives at the call sites, the conflict
would go away.  bison-fu is all about postponing shift/reduce decisions
until you've seen enough to be sure ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Proposal for updatable views

2006-03-13 Thread William ZHANG
 A quick look at the grammar suggests that the key problem is the
 opt_timezone production --- it might be that if we removed that in
 favor of spelling out the alternatives at the call sites, the conflict
 would go away.  bison-fu is all about postponing shift/reduce decisions
 until you've seen enough to be sure ...
 
 regards, tom lane

Yes, if we can change opt_timezone and related production rules,
it is a better choice. 
---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster