Re: [HACKERS] SET TRANSACTION not compliant with SQL:2003

2008-03-13 Thread Simon Riggs
On Wed, 2008-03-12 at 15:51 -0400, Bruce Momjian wrote:
> Tom's comment on this from the patch queue is that the standard assume
> autocommit off, which affect some of your analysis below.

This isn't an important area for me, but I don't think we follow the
standard in the way we do it now and we should at least note that.

The standard specifically refers to "the next transaction", so the
setting of autocommit is irrelevant here.

> Simon Riggs wrote:
> > The SQL:2003 standard definition of SET TRANSACTION differs in major
> > ways from PostgreSQL's, which produces some interesting behaviour. 
> > 
> > We currently claim conformance, though this is not accurate.
> > 
> > ...
> > 
> > If a  that does not specify LOCAL is
> > executed, then
> > Case:
> > i) If an SQL-transaction is currently active, then an exception
> > condition is raised: invalid transaction
> >state ? active SQL-transaction.
> > 
> > ...
> > 
> > Case:
> > a) If LOCAL is not specified, then let TXN be the next SQL-transaction
> > for the SQL-agent.
> > b) Otherwise, let TXN be the branch of the active SQL-transaction at the
> > current SQL-connection.
> > 
> > 
> > The standard behaviour is that SET TRANSACTION defines the mode used in
> > the *next* transaction, not the current one. We should allow this
> > meaning, since programs written to spec will act differently with the
> > current implementation. We currently only change the *current*
> > transaction. Executing within the current transaction is supposed to
> > throw an error; that's probably too late to change, but the standard
> > does give some clues for other errors.
> > 
> > Proposed changes:
> > 
> > 1. Solo SET TRANSACTION statements produce no WARNING, nor do anything.
> > This isn't the way the SQL:2003 standard specifies it should work.
> > We should take the values from SET TRANSACTION and apply them to the
> > *next* transaction:
> > - these will apply to next TXN, unless specifically overridden during
> > the START TRANSACTION command
> > - these values apply for one transaction only, after which we revert
> > back to the session default.
> > 
> > 2. Duplicate calls to SET TRANSACTION are allowed within a transaction.
> > => Should be ERROR: Transaction mode already set.
> > 
> > postgres=# begin;
> > BEGIN
> > postgres=# set transaction read only;
> > SET
> > postgres=# set transaction read only;
> > SET
> > postgres=# commit;
> > COMMIT
> > 
> > 3. Multiple conflicting calls to SET TRANSACTION are allowed within a
> > transaction.
> > => Should be ERROR: Transaction mode already set.
> > 
> > postgres=# begin;
> > BEGIN
> > postgres=# set transaction isolation level read committed;
> > SET
> > postgres=# set transaction isolation level serializable;
> > SET
> > postgres=# commit;
> > COMMIT
> > 
> > 4. SET TRANSACTION can be called after a SAVEPOINT, i.e. it can be
> > called in a subtransaction.
> > => Should be ERROR: SET TRANSACTION must not be called in a
> > subtransaction.
> > (Calling SET TRANSACTION ISOLATION LEVEL already throws this error, so
> > change should be small)
> > 
> > 5. The standard uses the keyword LOCAL like this:
> > SET LOCAL TRANSACTION ...
> > which in this context means the part of a distributed (two-phased)
> > commit on this database.
> > We should accept, but ignore this keyword.
> > 
> > -- 
> >   Simon Riggs
> >   2ndQuadrant  http://www.2ndQuadrant.com
> > 
> > 
> > ---(end of broadcast)---
> > TIP 6: explain analyze is your friend
> 
-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing Transaction Start/End Contention

2008-03-13 Thread Simon Riggs
On Tue, 2008-03-11 at 20:23 -0400, Bruce Momjian wrote:

> Is this still a TODO?

I think so.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Proposal: new large object API

2008-03-13 Thread Tatsuo Ishii
I would like to propose new large object client side API for 8.4.

Currently we have:

  Oid lo_import(PGconn *conn, const char *filename);

But we do not have an API which imports a large object specifying the
object id. This is inconvenient and inconsistent since we already have
lo_create() and lo_open() which allow to specify the large object id.

So I propose to add new API:

  int lo_import_with_oid(PGconn *conn, Oid lobjId, const char 
*filename);

Another idea is changing the signature of lo_import:

  Oid lo_import(PGconn *conn, Oid lobjId, const char *filename);

which will be cleaner but break the backward compatibility.

Comments are welcome.
--
Tatsuo Ishii
SRA OSS, Inc. Japan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CSStorm occurred again by postgreSQL8.2

2008-03-13 Thread Simon Riggs
On Wed, 2008-03-12 at 20:13 -0400, Bruce Momjian wrote:
> Is this a TODO?  Tom's reply was:

The general topic, yes. The caveats still apply.

> > Nonsense.  Main transaction exit also takes an exclusive lock, and is
> > far more likely to be exercised in typical workloads than a
> > subtransaction abort.
> > 
> > In any case: there has still not been any evidence presented by anyone
> > that optimizing XidCacheRemoveRunningXids will help one bit.  Given the
> > difficulty of measuring any benefit from the last couple of
> > optimizations in this general area, I'm thinking that such evidence
> > will be hard to come by.  And we have got way more than enough on our
> > plates already.  Can we let go of this for 8.3, please?
> 
> ---
> 
> Simon Riggs wrote:
> > On Wed, 2006-09-13 at 21:45 -0400, Tom Lane wrote:
> > 
> > > Anyway, given that there's this one nonobvious gotcha, there might be
> > > others.  My recommendation is that we take this off the open-items list
> > > for 8.2 and revisit it in the 8.3 cycle when there's more time.
> > 
> > Well, its still 8.3 just...
> > 
> > As discussed in the other thread "Final Thoughts for 8.3 on LWLocking
> > and Scalability", XidCacheRemoveRunningXids() is now the only holder of
> > an X lock during normal processing, so I would like to remove it. 
> > Here's how:
> > 
> > Currently, we take the lock, remove the subxact and then shuffle down
> > all the other subxactIds so that the subxact cache is contiguous.
> > 
> > I propose that we simply zero out the subxact entry without re-arranging
> > the cache; this will be atomic, so we need not acquire an X lock. We
> > then increment ndeletedxids. When we enter a new subxact into the cache,
> > if ndeletedxids > 0 we scan the cache to find an InvalidTransactionId
> > that we can use, then decrement ndeletedxids. So ndeletedxids is just a
> > hint, not an absolute requirement. nxids then becomes the number of
> > cache entries and never goes down until EOXact. The subxact cache is no
> > longer in order, but then it doesn't need to be either.
> > 
> > When we take a snapshot we will end up taking a copy of zeroed cache
> > entries, so the snapshots will be slightly larger than previously.
> > Though still no larger than the max. The size reduction was not so large
> > as to make a significant difference across the whole array, so
> > scalability is the main issue to resolve.
> > 
> > The snapshots will be valid with no change, since InvalidTransactionId
> > will never match against any recorded Xid.
> > 
> > I would also like to make the size of the subxact cache configurable
> > with a parameter such as subtransaction_cache_size = 64 (default), valid
> > range 4-256.
> > 
> > -- 
> >   Simon Riggs
> >   2ndQuadrant  http://www.2ndQuadrant.com
> > 
> > 
> > ---(end of broadcast)---
> > TIP 9: In versions below 8.0, the planner will ignore your desire to
> >choose an index scan if your joining column's datatypes do not
> >match
> 
-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 

  PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nasty bug in heap_page_prune

2008-03-13 Thread Pavan Deolasee
On Thu, Mar 13, 2008 at 10:23 AM, Tom Lane <[EMAIL PROTECTED]> wrote:
>
>  But we could push them out earlier, ie,
>  before making the actual page changes in heap_page_prune.  This seems
>  safe since an unnecessary invalidation notice cannot break anything,
>  at worst it causes useless work.
>

If I understand correctly, that would let us call CacheInvalidateHeapTuple()
outside a critical section. Seems like a good idea to me.


>
>  I'm inclined to set this up as having heap_page_prune first call
>  a function named something like "BeginNontransactionalInvalidation",
>  then do its CacheInvalidateHeapTuple calls, then call
>  "EndNontransactionalInvalidation".  In the initial implementation
>  the first would just assert that the inval queue is empty, and the
>  second would push out the queue.  If we ever need to generalize
>  things then the code structure would be there to build a phony
>  subtransaction.
>


I wonder if we can have a separate list of non-transaction events in
InvalidationListHeader and broadcast those events irrespective of
transaction commit or abort. But we may still need the
"BeginNontransactionalInvalidation" and  "EndNontransactionalInvalidation"
markers to push these events to the non-transactional list.

>  Obviously this is a bit of a hack, but there's hardly anything about
>  VACUUM FULL that's not a bit of a hack ...
>

True :-) And I would personally prefer any hack than playing with left
over redirected line pointers in VACUUM FULL.


Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing Transaction Start/End Contention

2008-03-13 Thread Alvaro Herrera
Simon Riggs wrote:
> On Tue, 2008-03-11 at 20:23 -0400, Bruce Momjian wrote:
> 
> > Is this still a TODO?
> 
> I think so.

How about this wording:

"Review Simon's claims to improve performance"

;-)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing Transaction Start/End Contention

2008-03-13 Thread Mark Mielke

Alvaro Herrera wrote:

Simon Riggs wrote:
  

On Tue, 2008-03-11 at 20:23 -0400, Bruce Momjian wrote:


Is this still a TODO?
  

I think so.



How about this wording:

"Review Simon's claims to improve performance


What sort of evidence is usually compelling? It seems to me that this 
sort of change only benefits configurations with dozens or more CPUs/cores?


I ask, because I saw a few references to "I see no performance change - 
but then, I don't have the right hardware." It seems to me that it 
should be obvious that contention will only show up under very high 
concurrency? :-)


Cheers,
mark

--
Mark Mielke <[EMAIL PROTECTED]>



Re: [HACKERS] Ideas input sought for this year's SOC page

2008-03-13 Thread Jonah H. Harris
On Wed, Mar 12, 2008 at 1:19 PM, Josh Berkus <[EMAIL PROTECTED]> wrote:
>  ... with new ideas and projects appropriate for PostgreSQL 8.4/8.5.
>  Please make suggestions.  Thanks!

How about optimizing the SQL used by pgsnmpd, as well as extending it
to support traps?

-- 
Jonah H. Harris, Sr. Software Architect | phone: 732.331.1324
EnterpriseDB Corporation | fax: 732.331.1301
499 Thornall Street, 2nd Floor | [EMAIL PROTECTED]
Edison, NJ 08837 | http://www.enterprisedb.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] aclitem out/in don't "work"

2008-03-13 Thread Gregory Stark

Just noting what seems to me to be a bug, though perhaps not a consequential
one since we never dump out the contents of system tables and don't expect
people to be using aclitem for non-system tables.

The output of aclitem can't be read back back in as an aclitem:

 postgres=# select relacl::text from pg_class limit 1;
relacl   
 
  {=r/stark}
 (1 row)

 postgres=# select relacl::text::aclitem from pg_class limit 1;
 ERROR:  unrecognized key word: ""
 HINT:  ACL key word must be "group" or "user".


I can't look at fixing it right now but I'll poke my nose in a bit later when
I'm done what I'm doing.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Encoding problems with migration from 8.0.14 to 8.3.0 on Windows

2008-03-13 Thread Tom Lane
Meetesh Karia <[EMAIL PROTECTED]> writes:
> Additionally, here's what I get when I run your test below (my server 
> encoding is UTF-8):

> ltefull=# create table x (r varchar(255) unique);
> NOTICE:  CREATE TABLE / UNIQUE will create implicit index "x_r_key" for 
> table "x"
> CREATE TABLE
> ltefull=#
> ltefull=# set client_encoding=WIN1250;
> SET
> ltefull=# insert into x (r) values ('Daniel Brühl');
> INSERT 0 1
> ltefull=#
> ltefull=# insert into x (r) values ('Daniel Bruehl');
> ERROR:  duplicate key value violates unique constraint "x_r_key"

You said this was on Windows, right?

I was about to say "that should be impossible", until I looked at
varstr_cmp() and realized that whoever put in the WIN32/UTF8
special case omitted this part:

/*
 * In some locales strcoll() can claim that nonidentical strings are
 * equal.  Believing that would be bad news for a number of reasons,
 * so we follow Perl's lead and sort "equal" strings according to
 * strcmp().
 */
if (result == 0)
result = strcmp(a1p, a2p);

So we behave differently on Windows (with UTF8) than anywhere else.
This is pretty nasty, not least because it means that texteq is
inconsistent with other text comparison operators.

I think this is a "must fix" bug for 8.3.1, anyone disagree?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nasty bug in heap_page_prune

2008-03-13 Thread Tom Lane
"Pavan Deolasee" <[EMAIL PROTECTED]> writes:
> I wonder if we can have a separate list of non-transaction events in
> InvalidationListHeader and broadcast those events irrespective of
> transaction commit or abort.

Yeah, I started with that same idea.  But AFAICS there is no percentage
in postponing the broadcast until commit/abort; we may as well push the
messages out immediately.  The reason inval postpones transactional
messages until commit/abort is that that's when the invalidation
actually "takes effect" (or not) from the point of view of other
transactions; telling them to flush their caches earlier would be
useless.  For these nontransactional invalidations the inval is
effective immediately, and other sessions can reload their caches
as soon as we release buffer lock.  (Well, except that VACUUM FULL
is holding ex-lock on the whole table...)

Anyway, point is that that seems to be extra complication that
doesn't buy anything, and if anything puts the behavior further
away from what it should ideally be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reducing Transaction Start/End Contention

2008-03-13 Thread Tom Lane
Mark Mielke <[EMAIL PROTECTED]> writes:
> Alvaro Herrera wrote:
>> How about this wording:
>> "Review Simon's claims to improve performance

> What sort of evidence is usually compelling? It seems to me that this 
> sort of change only benefits configurations with dozens or more CPUs/cores?

The main point in my mind was that that analysis was based on the code
as it then stood.  Florian's work to reduce ProcArrayLock contention
might have invalidated some or all of the ideas.  So it needs a fresh
look.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [ADMIN] Encoding problems with migration from 8.0.14 to 8.3.0 on Windows

2008-03-13 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> I think this is a "must fix" bug for 8.3.1, anyone disagree?

Agreed.

It seems we should collect cases like this for the regression tests. The only
one I was aware of previously was the Turkish one.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Nasty bug in heap_page_prune

2008-03-13 Thread Pavan Deolasee
On Thu, Mar 13, 2008 at 9:22 PM, Tom Lane <[EMAIL PROTECTED]> wrote:

>
>  Yeah, I started with that same idea.  But AFAICS there is no percentage
>  in postponing the broadcast until commit/abort; we may as well push the
>  messages out immediately.

Seems like a good plan to me.

Thanks,
Pavan

-- 
Pavan Deolasee
EnterpriseDB http://www.enterprisedb.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [ADMIN] Encoding problems with migration from 8.0.14 to 8.3.0 on Windows

2008-03-13 Thread Meetesh Karia
Thanks for the confirmation of this.  For now I'll continue to use the C 
locale and I'll switch with 8.3.1.


Meetesh

Gregory Stark wrote:

"Tom Lane" <[EMAIL PROTECTED]> writes:

  

I think this is a "must fix" bug for 8.3.1, anyone disagree?



Agreed.

It seems we should collect cases like this for the regression tests. The only
one I was aware of previously was the Turkish one.

  


Re: [HACKERS] aclitem out/in don't "work"

2008-03-13 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> The output of aclitem can't be read back back in as an aclitem:

>  postgres=# select relacl::text from pg_class limit 1;
> relacl   
>  
>   {=r/stark}
>  (1 row)

>  postgres=# select relacl::text::aclitem from pg_class limit 1;
>  ERROR:  unrecognized key word: ""
>  HINT:  ACL key word must be "group" or "user".

relacl is aclitem[] not aclitem.  The error message could perhaps be
improved, but I'm not sure that it's an urgent problem...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] aclitem out/in don't "work"

2008-03-13 Thread Gregory Stark
"Tom Lane" <[EMAIL PROTECTED]> writes:

> Gregory Stark <[EMAIL PROTECTED]> writes:
>> The output of aclitem can't be read back back in as an aclitem:
>
>>  postgres=# select relacl::text from pg_class limit 1;
>> relacl   
>>  
>>   {=r/stark}
>>  (1 row)
>
>>  postgres=# select relacl::text::aclitem from pg_class limit 1;
>>  ERROR:  unrecognized key word: ""
>>  HINT:  ACL key word must be "group" or "user".
>
> relacl is aclitem[] not aclitem.  The error message could perhaps be
> improved, but I'm not sure that it's an urgent problem...

oh, doh.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for db level triggers

2008-03-13 Thread James Mansion

James Mansion wrote:

In usage:

AFTER START clears counters and flags.
UPDATE triggers on data set counters and flags.
BEFORE COMMIT examines the counters and flags and performs any final 
validation or

adjustments (or external events such as sending a MoM message)

I'd like to point out also that AFTER CONNECT is a good opportunity to 
CREATE TEMP TABLE (be
nice if a global temp table definition could be persisted and 
automatically duplicated into each session, but

never mind).

And if we use data update triggers to insert into a DELETE ROWS temp 
table or an in-memory data
structure, the BEFORE COMMIT trigger is the place to do a bulk copy into 
real table(s) or combine

rows into a BLOb in an audit table.

James


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PROC_VACUUM_FOR_WRAPAROUND doesn't work expectedly

2008-03-13 Thread ITAGAKI Takahiro
I found autovacuum can be canceled by blocked backends even if the vacuum
is for preventing XID wraparound in 8.3.0 and HEAD. Autovacuum sets
PROC_VACUUM_FOR_WRAPAROUND flag just before vacuum, but the flag will be
cleared at the beginning of vacuum; PROC_VACUUM_FOR_WRAPAROUND is not set
during the vacuum.

The sequence is below:
vacuum()
 -> CommitTransactionCommand()
 -> ProcArrayEndTransaction()
 -> proc->vacuumFlags &= ~PROC_VACUUM_STATE_MASK;
 -> vacuum_rel()

PROC_VACUUM_STATE_MASK is defined as (0x0E), that is including
PROC_VACUUM_FOR_WRAPAROUND (0x08). The wraparound flag is cleared
before vacuum tasks.

I tried to make a patch to exclude PROC_VACUUM_FOR_WRAPAROUND
from PROC_VACUUM_STATE_MASK and make autovacuum workers to clear
PROC_VACUUM_FOR_WRAPAROUND by themselves. Is it a reasonable solution?


Index: src/backend/postmaster/autovacuum.c
===
--- src/backend/postmaster/autovacuum.c (HEAD)
+++ src/backend/postmaster/autovacuum.c (working copy)
@@ -2163,6 +2163,12 @@
PG_END_TRY();
 
/* the PGPROC flags are reset at the next end of transaction */
+   if (tab->at_wraparound)
+   {
+   LWLockAcquire(ProcArrayLock, LW_EXCLUSIVE);
+   MyProc->vacuumFlags &= ~PROC_VACUUM_FOR_WRAPAROUND;
+   LWLockRelease(ProcArrayLock);
+   }
 
/* be tidy */
pfree(tab);
Index: src/include/storage/proc.h
===
--- src/include/storage/proc.h  (HEAD)
+++ src/include/storage/proc.h  (working copy)
@@ -45,7 +45,7 @@
 #definePROC_VACUUM_FOR_WRAPAROUND 0x08 /* set by 
autovac only */
 
 /* flags reset at EOXact */
-#definePROC_VACUUM_STATE_MASK (0x0E)
+#definePROC_VACUUM_STATE_MASK (PROC_IN_VACUUM | 
PROC_IN_ANALYZE)
 
 /*
  * Each backend has a PGPROC struct in shared memory.  There is also a list of

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS]COPY issue(gsoc project)

2008-03-13 Thread longlong
2008/3/12, Neil Conway <[EMAIL PROTECTED]>:
>
> I don't see why creating index entries in bulk has anything to do with
> COPY vs. INSERT: if a lot of rows are being loaded into the table in a
> single command, it would be a win to create the index entries in bulk,
> regardless of whether COPY or INSERT ... SELECT is being used.
>
> In any case, the "create indexes in bulk" hasn't actually been
> implemented in mainline Postgres...
>
> I mentioned pgloader just as an example of an existing implementation of
> the "error recovery in COPY" idea. The issues with doing an
> implementation of error recovery in the backend that I see are:
>
> * in order to be sure that you can recover from an error, you
>   need to abort the current subtransaction
>
> * starting and committing a subtransaction for every row of the COPY
>   would be too expensive
>
> * therefore, start and commit a subtransaction for every "n" rows
>   of input. If an error occurs, you lose at most "n-1" rows of
>   valid input, which you need to backup and reinsert. There are
>   various approaches to choosing "n" (statically, based on the
>   error rate of previous batches in the same load, etc.).
>
>
> -Neil
>
>
> i think this is a better idea.
from *NikhilS *
http://archives.postgresql.org/pgsql-hackers/2007-12/msg00584.php
But instead of using a per insert or a batch insert substraction, I am
thinking that we can start off a subtraction and continue it till we
encounter a failure. The moment an error is encountered, since we have the
offending (already in heap) tuple around, we can call a simple_heap_delete
on the same and commit (instead of aborting) this subtransaction after doing
some minor cleanup. This current input data row can also be logged into a
bad file. Recall that we need to only handle those errors in which the
simple_heap_insert is successful, but the index insertion or the after row
insert trigger causes an error. The rest of the load then can go ahead with
the start of a new subtransaction.
the simplest thing are often the best.
i think it's hard to implement or some other deficiency since you want
subtransaction or every "n" rows.

you have mentioned that the 'n' can be changed according when and where the
error happened in thread *"Re: VLDB Features" .this is like some *mechanisms
in tcp Congestion Control.but you can't ignore the time wasted in
subtransaction before it encounters an error especially when the 'n' is big.

i don't know the cost of a subtransaction(begin and commit) and  an copy
line reading(CopyReadLine()) exactly. so i just calculate the number of
subtransactions.
f(n)=(1-(1-p)^n)**m+m/n
m is the number of lines. n is the subtransaction lines. p is the
possibility of each row encounters an error.
*big 'n' can reduce the number of subtransaction(m/n), but also increase the
*possibility of *having a error. unless the p is extremely small, choosing a
big 'n' is a big mistake.
in fact the errors always get together (my experience), the situation may be
a little better.

however,  the idea(from NikhilS) that i start with is the perfect solution.
yes i have seen in the email archives in thread *"Re: VLDB Features" and i
notice* some disagreements about commit problems and etc. this won't be a
problem since so many similar problems have been solved in pg.


[HACKERS] Data Recovery feature

2008-03-13 Thread Amit jain
Hi,

What are the Data Recovery feature available in postgreSQL apart from
pg_restore and PITR.
If my Database has corrupted its not starting which feature i can use for
Data Recovery ?
Any help willbe highly appreciated.