[HACKERS] Online base backup from the hot-standby

2011-08-05 Thread Jun Ishiduka
 I will provide a patch which can exeute pg_start/stop_backup
 including to solve above comment and conditions in next stage.
 Then please review.

done.


* Procedure

1. Call pg_start_backup('x') on the standby.
2. Take a backup of the data dir.
3. Call pg_stop_backup() on the standby.
4. Copy the control file on the standby to the backup.
5. Check whether the control file is status during hot standby with 
pg_controldata.
   - If the standby promote between 3. and 4., the backup can not recovery.
  - pg_control is that Minimum recovery ending location is equals 0/0.
  - backup-end record is not written.

* Not correspond yet

  * full_page_write = off
- If the primary is full_page_write = off, archive recovery may not act 
   normally. Therefore the standby may need to check whether 
full_page_write
   = off to WAL.


Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka@po.ntts.co.jp



standby_online_backup_05.patch
Description: Binary data


standby_online_backup_doc.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Online base backup from the hot-standby

2011-08-05 Thread Cédric Villemain
2011/8/5 Jun Ishiduka ishizuka@po.ntts.co.jp:
 I will provide a patch which can exeute pg_start/stop_backup
 including to solve above comment and conditions in next stage.
 Then please review.

 done.

great !



 * Procedure

 1. Call pg_start_backup('x') on the standby.
 2. Take a backup of the data dir.
 3. Call pg_stop_backup() on the standby.
 4. Copy the control file on the standby to the backup.
 5. Check whether the control file is status during hot standby with 
 pg_controldata.
   - If the standby promote between 3. and 4., the backup can not recovery.
      - pg_control is that Minimum recovery ending location is equals 0/0.
      - backup-end record is not written.

 * Not correspond yet

  * full_page_write = off
    - If the primary is full_page_write = off, archive recovery may not act
       normally. Therefore the standby may need to check whether 
 full_page_write
       = off to WAL.

Isn't having a standby make the full_page_write = on in all case
(bypass configuration) ?


 
 Jun Ishizuka
 NTT Software Corporation
 TEL:045-317-7018
 E-Mail: ishizuka@po.ntts.co.jp
 


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers





-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: display of object comments

2011-08-05 Thread Robert Haas
On Thu, Aug 4, 2011 at 10:24 PM, Josh Kupershmidt schmi...@gmail.com wrote:
 A few existing backslash commands, such as \dn and maybe \db, don't
 exactly follow this precedent. Not sure if we want to bother adjusting
 the existing commands to be consistent in this regard. Defining
 typical usage is pretty wishy-washy, so I'm not real inclined to try
 messing with the existing commands.

Me neither.

 For foreign data wrappers, foreign servers, and foreign tables, I am
 wondering if there is any particular rule we should adhere to in terms
 of where the description shows up in the output column list.  It
 doesn't seem entirely consistent the way you've done it here, but
 maybe you've put more thought into it than I have.

 Hrm, what wasn't consistent? I intended to just put the Description
 column at the end of the outputs for \dew, \des, and \det, which seems
 to be the way other commands handle this. Though now that I double
 check, I notice that the verbose modes of these commands include extra
 columns which come after Description, and it might be better to
 force Description to stay at the end in those cases, the way that
 \dT[+] and \dFt[+] do. Though perhaps you're complaining about
 something different -- \dL isn't forcing Description to the end in
 verbose mode.

Oh, I see.  I was right - you have thought about this more than I have.  :-)

I guess my vote is to make the SQL/MED stuff show the description only
in verbose mode, and always at the end; and revise what we did with
\dL to put the description at the end even in verbose mode.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reduce WAL logging of INSERT SELECT

2011-08-05 Thread Simon Riggs
On Thu, Aug 4, 2011 at 11:07 PM, Bruce Momjian br...@momjian.us wrote:
 Simon Riggs wrote:
 On Thu, Aug 4, 2011 at 10:46 PM, Bruce Momjian br...@momjian.us wrote:

  Right. ?I brought up SELECT INTO because you could make the argument
  that INSERT ... SELECT is not a utility command like the other ones and
  therefore can't be done easily, but CREATE TABLE AS is internal SELECT
  INTO and implemented in execMain.c, which I think is where INSERT ...
  SELECT would also be implemented.

 What you should be asking is whether the optimisation would be
 effective for INSERT SELECT, or even test it.

 My observation is that the optimisation is only effective for very
 large loads that cause I/O. As RAM sizes get bigger, I'm inclined to
 remove the optimisation and make it optional since it is ineffective
 in many cases and negative benefit for smaller cases.

 I am confused how generating WAL traffic that is larger than the heap
 file we are fsync'ing can possibly be slower.

I'm telling you what I know to be true as an assistance to you.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plperl crash with Debian 6 (64 bit), pl/perlu, libwww and https

2011-08-05 Thread Andrew Dunstan



On 08/04/2011 11:23 PM, Alex Hunsaker wrote:

On Thu, Aug 4, 2011 at 19:40, Andrew Dunstanand...@dunslane.net  wrote:


Let's slow down a bit. Nobody that we know of has encountered the problem
Tom's referring to, over all the years plperlu has been available. The
changes you're proposing have the potential to downgrade the usefulness of
plperlu considerably without fixing anything that's known to be an actual
problem. Instead of fixing a problem caused by using LWP you could well make
LWP totally unusable from plperlu.

Well, im not sure about it making LWP totally unusable... You could
always use statement_timeout if you were worried about it blocking
^_^.




Making users set statement_timeout would be a degradation in utility. 
For one thing it means you'd never be able to get back and handle an 
unresponsiveness reply. And it would be extra work.


(I don't use LWP in any plperlu code AFAIR, but I do use other things 
that could well want to set signals. At the very least a change like 
this would mandate a LOT of extra testing by my clients.)




And it still won't do a thing about signal handlers installed by C code.

And plperlu would be the tip of the iceberg. What about all the other PLs,
not to mention non-PL loadable modules?

Maybe the answer is to re-issue the appropriate pqsignals() instead of
doing the perl variant?

For PL/Perl(u) we could still disallow any signals the postmaster
uses, from my quick look that would be: HUP, INT, TERM, QUIT, ALRM,
PIPE, USR1, USR2, FPE. All we would need to do is restore ALRM.

Or am I too paranoid about someone shooting themselves in the foot via
USR1? (BTW you can set signals in plperl, but you can't call alarm()
or kill())




This whole thing is a massive over-reaction to a problem we almost 
certainly know how to fix fairly simply and relatively painlessly, and 
attempts (unsuccessfully, at least insofar as comprehensiveness is 
concerned) to fix a problem nobody's actually reported having AFAIK.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] GetSnapshotData() comments

2011-08-05 Thread Robert Haas
I think that the first sentence, in the following comment within
GetSnapshotData(), is not quite right, because at the time this is
executed, we already hold ProcArrayLock, which is the only lock that
gets grabbed here:

/*
 * If we're in recovery then snapshot data comes from a different place,
 * so decide which route we take before grab the lock. It is
possible for
 * recovery to end before we finish taking snapshot, and for newly
 * assigned transaction ids to be added to the procarray. Xmax cannot
 * change while we hold ProcArrayLock, so those newly added transaction
 * ids would be filtered away, so we need not be concerned about them.
 */
snapshot-takenDuringRecovery = RecoveryInProgress();

Having thought it over for a bit, I believe that the code is correct
and it's only the comment that needs to be updated.  If we were to set
snapshot-takenDuringRecovery before acquiring ProcArrayLock, then the
following sequence of events might occur:

T1: Enter GetSnapshotData().  Set snapshot-takenDuringRecovery = true.
Recovery ends
T2: Begin, get an XID.
T3: Begin, get an XID.
T3: Commit, advancing latestCompletedXid.
T1: Acquire ProcArrayLock and use the in recovery path, missing T2's
XID (because it's not in the subxip[] array).
T1: Do some stuff with the snapshot... not seeing T2's XID...
T2: Commit
T1: Do some stuff with the snapshot... now seeing T2's XID...

I think if we just delete the first sentence of the comment, the rest
is all correct.

A little further down, there is this comment:

/*
 * Spin over procArray checking xid, xmin, and
subxids.  The goal is
 * to gather all active xids, find the lowest xmin,
and try to record
 * subxids. During recovery no xids will be assigned,
so all normal
 * backends can be ignored, nor are there any VACUUMs
running. All
 * prepared transaction xids are held in
KnownAssignedXids, so these
 * will be seen without needing to loop through procs here.
 */

...but this code is only executed when recovery is not in progress.
So I feel like everything after try to record subxids should either
be removed, or relocated to the following else block, where the
recovery path is discussed in detail.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-05 Thread Tom Lane
I wrote:
 Ahh ... you know what, never mind about stack traces, let's just see if
 the attached patch doesn't fix it.

On reflection, that patch would only fix the issue for pg_class, and
that's not the only catalog that gets consulted during relcache reloads.
I think we'd better do it as attached, instead.

regards, tom lane

diff --git a/src/backend/utils/cache/relcache.c b/src/backend/utils/cache/relcache.c
index 81cea8b..337300f 100644
*** a/src/backend/utils/cache/relcache.c
--- b/src/backend/utils/cache/relcache.c
*** RelationCacheInvalidate(void)
*** 2159,2164 
--- 2159,2169 
  	List	   *rebuildList = NIL;
  	ListCell   *l;
  
+ 	/*
+ 	 * Reload relation mapping data before starting to reconstruct cache.
+ 	 */
+ 	RelationMapInvalidateAll();
+ 
  	/* Phase 1 */
  	hash_seq_init(status, RelationIdCache);
  
*** RelationCacheInvalidate(void)
*** 2184,2189 
--- 2189,2204 
  		else
  		{
  			/*
+ 			 * If it's a mapped relation, immediately update its rd_node in
+ 			 * case its relfilenode changed.  We must do this during phase 1
+ 			 * in case the relation is consulted during rebuild of other
+ 			 * relcache entries in phase 2.  It's safe since consulting the
+ 			 * map doesn't involve any access to relcache entries.
+ 			 */
+ 			if (RelationIsMapped(relation))
+ RelationInitPhysicalAddr(relation);
+ 
+ 			/*
  			 * Add this entry to list of stuff to rebuild in second pass.
  			 * pg_class_oid_index goes on the front of rebuildFirstList, other
  			 * nailed indexes on the back, and everything else into
*** RelationCacheInvalidate(void)
*** 2209,2219 
  	 */
  	smgrcloseall();
  
- 	/*
- 	 * Reload relation mapping data before starting to reconstruct cache.
- 	 */
- 	RelationMapInvalidateAll();
- 
  	/* Phase 2: rebuild the items found to need rebuild in phase 1 */
  	foreach(l, rebuildFirstList)
  	{
--- 2224,2229 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Further news on Clang - spurious warnings

2011-08-05 Thread Robert Haas
On Thu, Aug 4, 2011 at 4:49 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Can we please commit a fix for this problem?

Done.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] per-column FDW options, v5

2011-08-05 Thread Robert Haas
On Fri, Jul 29, 2011 at 10:09 AM, Robert Haas robertmh...@gmail.com wrote:
 2011/7/29 Shigeru Hanada shigeru.han...@gmail.com:
 Here is a rebased version of per-column FDW options patch.  I've
 proposed this patch in last CF, but it was marked as returned with
 feedback.  So I would like to propose in next CF 2011-09.  I already
 moved CF item into new topic SQL/MED of CF 2011-09.

 I did a quick review of this patch and it looks good to me, modulo
 some quibbles with the wording of the documentation and comments and a
 couple of minor stylistic nitpicks.  Barring objections, I'll fix this
 up a bit and commit it.

Done.

Incidentally, I notice that if you do:

\d some_foreign_table

...the table-level options are not displayed.  We probably ought to do
something about that...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Further news on Clang - spurious warnings

2011-08-05 Thread Peter Geoghegan
On 5 August 2011 17:12, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Aug 4, 2011 at 4:49 PM, Peter Geoghegan pe...@2ndquadrant.com wrote:
 Can we please commit a fix for this problem?

 Done.

Thanks Robert.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache

2011-08-05 Thread Kohei KaiGai
BTW, what is the current status of this patch?
The status of contrib/sepgsql part is unclear for me, although we agreed that
syscache is suitable mechanism for security labels.

Thanks,

2011/7/22 Kohei KaiGai kai...@kaigai.gr.jp:
 2011/7/22 Yeb Havinga yebhavi...@gmail.com:
 On 2011-07-22 11:55, Kohei Kaigai wrote:

 2) Also I thought if it could work to not remember tcontext is valid, but
 instead remember the consequence,
 which is that it is replaced by unlabeled. It makes the avc_cache
 struct shorter and the code somewhat
 simpler.

 Here is a reason why we hold tcontext, even if it is not valid.
 The hash key of avc_cache is combination of scontext, tcontext and tclass.
 Thus, if we replaced an invalid
 tcontext by unlabeled context, it would always make cache mishit and
 performance loss.

 I see that now, thanks.

 I have no further comments, and I think that the patch in it's current
 status is ready for committer.

 Thanks for your reviewing.

 The attached patch is a revised one according to your suggestion to
 include fallback for 'unlabeled' label within sepgsql_avc_lookup().
 And I found a noise in regression test results, so eliminated it from v5.
 --
 KaiGai Kohei kai...@kaigai.gr.jp

-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Further news on Clang - spurious warnings

2011-08-05 Thread Peter Geoghegan
Now, apart from the Flex warning, there are just 3 warnings left. They
all look like this:

repl_gram.y:106:30: warning: implicit conversion from enumeration type
'enum ReplNodeTag' to different enumeration type 'NodeTag' (aka 'enum
NodeTag') [-Wconversion]
(yyval.node) = (Node *)
makeNode(IdentifySystemCmd);

^~~
../../../src/include/nodes/nodes.h:475:64: note: expanded from:
#define makeNode(_type_)((_type_ *)
newNode(sizeof(_type_),T_##_type_))
 ^
scratch space:180:1: note: expanded from:
T_IdentifySystemCmd
^
../../../src/include/nodes/nodes.h:452:19: note: expanded from:
_result-type = (tag); \
  ~  ^~~
Attached patch fixes all 3 warnings with an explicit cast, so the
number of warnings with Clang is the same number as GCC 4.5 - 1. On
GCC 4.6, there are still quite a few -Wunused-but-set-variable
warnings left despite an effort to eradicate them. Perhaps I should
look into that next.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
diff --git a/src/include/nodes/nodes.h b/src/include/nodes/nodes.h
index d8bc6b8..5a87f92 100644
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -472,7 +472,7 @@ extern PGDLLIMPORT Node *newNodeMacroHolder;
 #endif   /* __GNUC__ */
 
 
-#define makeNode(_type_)		((_type_ *) newNode(sizeof(_type_),T_##_type_))
+#define makeNode(_type_)		((_type_ *) newNode(sizeof(_type_), (NodeTag) T_##_type_))
 #define NodeSetTag(nodeptr,t)	(((Node*)(nodeptr))-type = (t))
 
 #define IsA(nodeptr,_type_)		(nodeTag(nodeptr) == T_##_type_)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Further news on Clang - spurious warnings

2011-08-05 Thread Tom Lane
Peter Geoghegan pe...@2ndquadrant.com writes:
 Now, apart from the Flex warning, there are just 3 warnings left. They
 all look like this:

 repl_gram.y:106:30: warning: implicit conversion from enumeration type
 'enum ReplNodeTag' to different enumeration type 'NodeTag' (aka 'enum
 NodeTag') [-Wconversion]

 Attached patch fixes all 3 warnings with an explicit cast,

This patch is a truly horrid idea, as it eliminates the error checking
the compiler is trying to provide, and does so globally not only in the
trouble spots.

If I were trying to get rid of this warning, I'd be wondering why
ReplNodeTag is a distinct enum in the first place.  Surely it does not
make sense to be using the Node mechanisms on something that isn't
conforming to the standard node numbering.  If these nodes ever got
passed to anything else in the system, they'd get misinterpreted.
So I'm inclined to think that clang has pointed out a real issue,
rather than something we ought to paper over.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache

2011-08-05 Thread Robert Haas
On Fri, Aug 5, 2011 at 2:36 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 BTW, what is the current status of this patch?

I think it's waiting for me to get around to reviewing it.  Sorry,
been busy :-(

 The status of contrib/sepgsql part is unclear for me, although we agreed that
 syscache is suitable mechanism for security labels.

I thought we agreed the opposite - viz, you need a custom cache.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reduce WAL logging of INSERT SELECT

2011-08-05 Thread Jeff Davis
On Thu, 2011-08-04 at 18:07 -0400, Bruce Momjian wrote:
 I am confused how generating WAL traffic that is larger than the heap
 file we are fsync'ing can possibly be slower.  Are you just throwing out
 an idea to try to make me prove it?

That's worded in a slightly confusing way, but here is the trade-off:

1. If you are using WAL, then regardless of what your transaction does,
only the WAL needs to be fsync'd at commit time. Conveniently, that's
being written sequentially, so it's a single fairly cheap fsync; and all
the data page changes are deferred, collected together, and fsync'd at
checkpoint time (rather than commit time). The cost is that you
double-write the data.

2. If you aren't using WAL, you need to fsync every data file the
transaction touched, which are probably not localized with other
activity. Also, the _entire_ data files needs to be sync'd, so perhaps
many other transactions have made changes to one data file all over, and
it may require _many_ seeks to accomplish the one fsync. The benefit is
that you don't double-write the data.

So, fundamentally, WAL is (in the OLTP case, where a transaction is much
shorter than a checkpoint interval) a big performance _win_, because it
allows us to do nice sequential writing in a single place for all
activities of all transactions; and defer all those random writes to
data pages until the next checkpoint. So we shouldn't treat WAL like a
cost burden that we want to avoid in every case we can.

But in the data load case (where many checkpoints may happen during a
single transaction anyway), it happens that avoiding WAL is a
performance win, because the seeks are not the dominant cost.

Regards,
Jeff Davis


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2011-08-05 Thread Peter Eisentraut
On tor, 2011-08-04 at 16:15 -0400, Alvaro Herrera wrote:
  Have you considered just cataloging NOT NULL constraints as CHECK
  constraints and teaching the reverse parser to convert x CHECK (x IS
  NOT NULL) to x NOT NULL.
 
 Hmm, no, I admit I haven't.  The current approach was suggested very
 early in the history of this patch.

Well, the early patch thought this was a small problem.  Now we know
it's a big problem, so it might be better to solve it terms of another
big problem that is already solved. :-)

  It seems to me that we're adding a whole
  lot of hoopla here that is essentially identical to the existing CHECK
  constraint support (it must be, per SQL standard), for no additional
  functionality.
 
 Yeah, perhaps you're right.  The main reason they were considered
 separately is that we wanted to have them to be optimized via
 pg_attribute.attnotnull, but my patch does away with the need for that
 because it is maintained separately anyway.

Hmm, OK, but in any case you could have kept attnotnull and treated it
as a kind of optimization that indicates whether you can derive
not-nullability from existing CHECK constraints (which you can easily do
in enough cases).

 Before embarking on rewriting this patch from scratch, I would like to
 know what's your opinion (or the SQL standard's) on the fact that this
 patch separated the PRIMARY KEY from NOT NULL constraints, so that they
 don't act exactly alike (to wit, the not-nullness of a PK does not
 inherit while the one from a NOT NULL constraint does).

The SQL standard deals with inheritance in terms of composite types,
which don't have constraints, so that doesn't give any guidance.

That said, I think the substitutability property of object-oriented
systems, namely that you can use a child object in place of a parent
object, requires in principle that we inherit all constraints (by
default, at least).  We don't inherit primary keys because of
implementation issues with indexes, but at some point in the future we
should fix that.  So to some degree, inheriting the not-null property of
primary keys while not inheriting the rest of it is a bit wrong, but it
would appear to be a step in the right direction, and changing
established behavior seems a bit gratuitous to me.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-05 Thread daveg
On Fri, Aug 05, 2011 at 12:10:31PM -0400, Tom Lane wrote:
 I wrote:
  Ahh ... you know what, never mind about stack traces, let's just see if
  the attached patch doesn't fix it.
 
 On reflection, that patch would only fix the issue for pg_class, and
 that's not the only catalog that gets consulted during relcache reloads.
 I think we'd better do it as attached, instead.
 
   regards, tom lane

Should this be applied in addition to the earlier patch, or to replace it?

-dg
 
-- 
David Gould   da...@sonic.net  510 536 1443510 282 0869
If simplicity worked, the world would be overrun with insects.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] error: could not find pg_class tuple for index 2662

2011-08-05 Thread Tom Lane
daveg da...@sonic.net writes:
 Should this be applied in addition to the earlier patch, or to replace it?

Apply it instead of the earlier one.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2011-08-05 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On tor, 2011-08-04 at 16:15 -0400, Alvaro Herrera wrote:
 Yeah, perhaps you're right.  The main reason they were considered
 separately is that we wanted to have them to be optimized via
 pg_attribute.attnotnull, but my patch does away with the need for that
 because it is maintained separately anyway.

 Hmm, OK, but in any case you could have kept attnotnull and treated it
 as a kind of optimization that indicates whether you can derive
 not-nullability from existing CHECK constraints (which you can easily do
 in enough cases).

Yes.  I thought that was how we were going to do it, and I'm rather
distressed to hear of attnotnull going away.  Even if there were not a
performance reason to keep it (and I'll bet there is), you can be sure
that removing that column will break a lot of client-side code.  See
recent complaints about Robert removing relistemp, which has only been
around for a release or two.  attnotnull goes back to the beginning,
more or less.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cataloguing NOT NULL constraints

2011-08-05 Thread Alvaro Herrera
Excerpts from Tom Lane's message of vie ago 05 21:23:41 -0400 2011:
 Peter Eisentraut pete...@gmx.net writes:
  On tor, 2011-08-04 at 16:15 -0400, Alvaro Herrera wrote:
  Yeah, perhaps you're right.  The main reason they were considered
  separately is that we wanted to have them to be optimized via
  pg_attribute.attnotnull, but my patch does away with the need for that
  because it is maintained separately anyway.
 
  Hmm, OK, but in any case you could have kept attnotnull and treated it
  as a kind of optimization that indicates whether you can derive
  not-nullability from existing CHECK constraints (which you can easily do
  in enough cases).
 
 Yes.  I thought that was how we were going to do it, and I'm rather
 distressed to hear of attnotnull going away.  Even if there were not a
 performance reason to keep it (and I'll bet there is), you can be sure
 that removing that column will break a lot of client-side code.  See
 recent complaints about Robert removing relistemp, which has only been
 around for a release or two.  attnotnull goes back to the beginning,
 more or less.

Err, obviously I didn't express myself very well.  I am not removing the
column.  What I tried to say is that we no longer need to optimize the
representation of NOT NULL as separate entities from CHECK constraints,
because attnotnull is maintained separately from the pg_constraint
entries.  In other words, from that point of view, representing NOT NULL
as CHECK is not a problem from a performance POV, because it is already
taken care of by letting attnotnull continue to represent them as a
cache.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reduce WAL logging of INSERT SELECT

2011-08-05 Thread Bruce Momjian
Jeff Davis wrote:
 On Thu, 2011-08-04 at 18:07 -0400, Bruce Momjian wrote:
  I am confused how generating WAL traffic that is larger than the heap
  file we are fsync'ing can possibly be slower.  Are you just throwing out
  an idea to try to make me prove it?
 
 That's worded in a slightly confusing way, but here is the trade-off:
 
 1. If you are using WAL, then regardless of what your transaction does,
 only the WAL needs to be fsync'd at commit time. Conveniently, that's
 being written sequentially, so it's a single fairly cheap fsync; and all
 the data page changes are deferred, collected together, and fsync'd at
 checkpoint time (rather than commit time). The cost is that you
 double-write the data.
 
 2. If you aren't using WAL, you need to fsync every data file the
 transaction touched, which are probably not localized with other
 activity. Also, the _entire_ data files needs to be sync'd, so perhaps
 many other transactions have made changes to one data file all over, and
 it may require _many_ seeks to accomplish the one fsync. The benefit is
 that you don't double-write the data.
 
 So, fundamentally, WAL is (in the OLTP case, where a transaction is much
 shorter than a checkpoint interval) a big performance _win_, because it
 allows us to do nice sequential writing in a single place for all
 activities of all transactions; and defer all those random writes to
 data pages until the next checkpoint. So we shouldn't treat WAL like a
 cost burden that we want to avoid in every case we can.
 
 But in the data load case (where many checkpoints may happen during a
 single transaction anyway), it happens that avoiding WAL is a
 performance win, because the seeks are not the dominant cost.

Well, if the table is created in the same transaction (which is the only
case under consideration), no other sessions can write to the table so
you are just writing the entire table on commit, rather than to the WAL.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reduce WAL logging of INSERT SELECT

2011-08-05 Thread Gokulakannan Somasundaram

 However, for small operations it's a net loss - you avoid writing a WAL
 record, but you have to fsync() the heap instead. If you only modify a few
 rows, the extra fsync (or fsyncs if there are indexes too) is more expensive
 than writing the WAL.

 We'd need a heuristic to decide whether to write WAL or fsync at the end.
 For regular INSERTs, UPDATEs and DELETEs, you have the planner's estimate of
 number of rows affected. Another thing we should do is move the fsync call
 from the end of COPY (and other such operations) to the end of transaction.
 That way if you do e.g one COPY followed by a bunch of smaller INSERTs or
 UPDATEs, you only need to fsync the files once.


Have you thought about recovery, especially when the page size is greater
than the disk block size(  4K ). With WAL, there is a way to restore the
pages to the original state, during recovery, if there are partial page
writes. Is it possible to do the same with direct fsync without WAL?

Gokul.


Re: [HACKERS] [v9.1] sepgsql - userspace access vector cache

2011-08-05 Thread Kohei KaiGai
2011/8/5 Robert Haas robertmh...@gmail.com:
 On Fri, Aug 5, 2011 at 2:36 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 BTW, what is the current status of this patch?

 I think it's waiting for me to get around to reviewing it.  Sorry,
 been busy :-(

Thanks for your efforts.

 The status of contrib/sepgsql part is unclear for me, although we agreed that
 syscache is suitable mechanism for security labels.

 I thought we agreed the opposite - viz, you need a custom cache.

Sorry, I missed to append not just after syscache is 
Your explanation is right.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers