Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Michael Paquier
On Sun, Feb 23, 2014 at 6:09 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Michael Paquier escribió:
  Hi all,
 
  Please find attached a patch to dump pageinspect to 1.2. This simply
  changes page_header to use the new internal datatype pg_lsn instead of
 text.

 Uhm.  Does this crash if you pg_upgrade a server that has 1.1 installed?

Oops, yes, you're right; it is obviously broken. Re-thinking it, bumping
this module just to change an argument type does not seem to be worth the
code complication. Thoughts?
Regards,
-- 
Michael


Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Andres Freund
On 2014-02-24 17:53:31 +0900, Michael Paquier wrote:
 On Sun, Feb 23, 2014 at 6:09 AM, Alvaro Herrera 
 alvhe...@2ndquadrant.comwrote:
 
  Michael Paquier escribió:
   Hi all,
  
   Please find attached a patch to dump pageinspect to 1.2. This simply
   changes page_header to use the new internal datatype pg_lsn instead of
  text.
 
  Uhm.  Does this crash if you pg_upgrade a server that has 1.1 installed?
 
 Oops yes you're right., it is obviously broken. Just re-thinking about
 that, dumping this module just to change an argument type does not seem to
 be worth the code complication. Thoughts?

It seems simple to support both the old and the new type. page_header()
(which IIRC is the only thing returning an LSN) likely uses
get_call_result_type(), so you can just check what it's expecting: either
support both types or error out if it's the old extension version.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path

2014-02-24 Thread MauMau

From: Rajeev rastogi rajeev.rast...@huawei.com
Please find the attached modified patch.

Thanks; I reviewed it and marked this patch as Ready for Committer.

Regards
MauMau





Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Tomas Vondra
On 7.2.2014 00:47, Andrew Dunstan wrote:
 
 On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
  Should I make new version of patch? Right now it's placed on github.
 May be Andrew wants to change something?

 
 
 Attached are updated patches.
 
 Apart from the things Teodor has fixed, this includes
 
  * switching to using text representation in jsonb send/recv
  * implementation of jsonb_array_elements_text that we need now we have
json_array_elements_text
  * some code fixes requested in code reviews, plus some other tidying
and refactoring.
 
 cheers

Hi,

I'm slightly uncertain if this is the current version of the patches, or
whether I should look at
https://github.com/feodor/postgres/tree/jsonb_and_hstore which contains
slightly modified code.

Anyway, the only thing I noticed in the v10 version so far is a slight
difference in naming: while we have json_to_hstore/hstore_to_json, we have
jsonb2hstore/hstore2jsonb. I propose changing these to
jsonb_to_hstore/hstore_to_jsonb.

This may not be needed if the implicit casts go through.

regards
Tomas




Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Oleg Bartunov
Yes, the repository you mentioned is the latest version of our
development. It contains fixes for various issues found by Andres, but we
are waiting for Andrew, who is working on the jsonb stuff.

On Mon, Feb 24, 2014 at 5:34 PM, Tomas Vondra t...@fuzzy.cz wrote:
 On 7.2.2014 00:47, Andrew Dunstan wrote:

 On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
  Should I make new version of patch? Right now it's placed on github.
 May be Andrew wants to change something?



 Attached are updated patches.

 Apart from the things Teodor has fixed, this includes

  * switching to using text representation in jsonb send/recv
  * implementation of jsonb_array_elements_text that we need now we have
json_array_elements_text
  * some code fixes requested in code reviews, plus some other tidying
and refactoring.

 cheers

 Hi,

 I'm slightly uncertain if this is the current version of the patches, or
 whether I should look at
 https://github.com/feodor/postgres/tree/jsonb_and_hstore which contains
 slightly modified code.

 Anyway, the only thing I noticed in the v10 version so far is slight
 difference in naming - while we have json_to_hstore/hstore_to_json, we
 have jsonb2hstore/hstore2jsonb. I propose to change this to
 jsonb_to_hstore/hstore_to_jsonb.

 May not be needed if the implicit casts go through.

 regards
 Tomas






Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Alvaro Herrera
Andres Freund escribió:
 On 2014-02-24 17:53:31 +0900, Michael Paquier wrote:
  On Sun, Feb 23, 2014 at 6:09 AM, Alvaro Herrera 
  alvhe...@2ndquadrant.comwrote:
  
   Michael Paquier escribió:
Hi all,
   
Please find attached a patch to dump pageinspect to 1.2. This
simply changes page_header to use the new internal datatype
pg_lsn instead of text.
  
   Uhm.  Does this crash if you pg_upgrade a server that has 1.1
   installed?
  
  Oops yes you're right., it is obviously broken. Just re-thinking
  about that, dumping this module just to change an argument type does
  not seem to be worth the code complication. Thoughts?
 
 It seem simple to support both, old and new type. page_header() (which
 IIRC is the only thing returning an LSN) likely uses
 get_call_result_type(). So you can just check what it's
 expecting. Either support both types or error out if it's the old
 extension version.

Yeah, erroring out seems good enough.  Particularly if you add a hint
saying please upgrade the extension.
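
For illustration, a rough sketch of that check inside page_header() -- this is
not the actual pageinspect code, and it assumes the lsn column is the first
output column and that funcapi.h and catalog/pg_type.h are included:

    TupleDesc   tupdesc;

    /* Find out what tuple descriptor the caller's declaration expects. */
    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");

    /* In the 1.1 extension definition the lsn column was still text. */
    if (tupdesc->attrs[0]->atttypid == TEXTOID)     /* 9.x TupleDesc layout */
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("page_header() was called with an old function definition"),
                 errhint("Update the extension with ALTER EXTENSION pageinspect UPDATE.")));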

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Merlin Moncure
On Mon, Feb 24, 2014 at 12:20 AM, Josh Berkus j...@agliodbs.com wrote:
 All,

 Here's a draft cleanup on the JSON section of the Datatype docs.  Since
 there's been a bunch of incremental patches on this, I just did a diff
 against HEAD.

 I looked over json-functions a bit, but am not clear on what needs to
 change there; the docs are pretty similar to other sections of
 Functions, and if they're complex it's because of the sheer number of
 JSON-related functions.

 Anyway, this version of datatypes introduces a comparison table, which I
 think should make things a bit clearer for users.

I still find the phrasing "jsonb is more efficient for most purposes" to
be a bit off.  Basically, the text json type is faster for the
serialization/deserialization pattern (not just document preservation),
and jsonb is preferred when storing json and doing repeated subdocument
accesses.

merlin




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-24 Thread Robert Haas
On Wed, Feb 19, 2014 at 8:06 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Feb 20, 2014 at 9:43 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Thu, Feb 20, 2014 at 2:47 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 2/5/14, 1:31 PM, Robert Haas wrote:
 On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut pete...@gmx.net wrote:
 Perhaps this type should be called pglsn, since it's an
 implementation-specific detail and not a universal concept like int,
 point, or uuid.

 If we're going to do that, I suggest pg_lsn rather than pglsn.  We
 already have pg_node_tree, so using underscores for separation would
 be more consistent.

 Yes, that's a good precedent in multiple ways.
 Here are updated patches to use pg_lsn instead of pglsn...

 OK, so I think this stuff is all committed now, with assorted changes.
  Thanks for your work on this.
 Thanks!
 Oops, it looks like I am coming after the battle (time difference does
 not help). I'll be more careful to test such patches on 32b platforms
 as well in the future.
 After re-reading the code, I found two incorrect comments in the new
 regression tests. Patch fixing them is attached.

Thanks, committed.  But I left out the whitespace change you included.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-24 Thread Andres Freund
On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
 On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund and...@2ndquadrant.com wrote:

 +   /*
 +* XXX: It's impolite to ignore our argument and keep decoding until 
 the
 +* current position.
 +*/
 
 Eh, what?

So, the background here is that I was thinking of allowing the caller to
specify a limit on the number of returned rows. For the SQL interface that
sounds like a good idea. I am just not so sure anymore that allowing an
LSN to be specified as the limit is sufficient. Maybe simply allow
limiting the number of changes, and check every time a transaction has
been replayed?

It's all trivial code-wise; I am just wondering about the interface most
users would want.

 +* We misuse the original meaning of SnapshotData's xip and
 subxip fields
 +* to make the more fitting for our needs.
 [...]
 +* XXX: Do we want extra fields instead of misusing existing
 ones instead?
 
 If we're going to do this, then it surely needs to be documented in
 snapshot.h.  On the second question, you're not the first hacker to
 want to abuse the meanings of the existing fields; SnapshotDirty
 already does it.  It's tempting to think we need a more principled
 approach to this, like what we've done with Node i.e. typedef enum ...
 SnapshotType; and then a separate struct definition for each kind, all
 beginning with SnapshotType type.

Hm, essentially that's what the ->satisfies pointer already is, right?

There's already documentation of the extra fields in snapbuild, but I
understand you'd rather have them moved?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)

2014-02-24 Thread Pavel Stehule
2014-02-23 21:32 GMT+01:00 Andres Freund and...@2ndquadrant.com:

 Hi,

 On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
  There is relative few very long ProcArrayLocks lwlocks
 
  This issue is very pathologic on fast computers with more than 8 CPU.
 This
  issue was detected after migration from 8.4 to 9.2. (but tested with same
  result on 9.0)  I see it on devel 9.4 today actualized.
 
  When I moved PREPARE from cycle, then described issues is gone. But when
 I
  use a EXECUTE IMMEDIATELY, then the issue is back. So it looks it is
  related to planner, ...

 In addition to the issue Jeff mentioned, I'd suggest trying the same
 workload with repeatable read. That can do *wonders* because of the
 reduced number of snapshots.


I tested it, and it doesn't help.

Is there some patch related to this issue that I could test?


 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Merlin Moncure
On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I still find the phrasing as jsonb is more efficient for most
 purposes to be a bit off  Basically, the text json type is faster for
 serialization/deserialization pattern (not just document preservation)
 and jsonb is preferred when storing json and doing repeated
subdocument accesses.

Hm, I'm going to withdraw that.  I had done some testing of simple
deserialization (cast to text and the like) and noted that jsonb was as
much as 5x slower.  However, I just did some checking on
json[b]_populate_recordset, and it's pretty much a wash.

merlin




Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)

2014-02-24 Thread Andres Freund
On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
 354246.00 93.0%  s_lock                  /usr/lib/postgresql/9.2/bin/postgres
  10503.00  2.8%  LWLockRelease           /usr/lib/postgresql/9.2/bin/postgres
   8802.00  2.3%  LWLockAcquire           /usr/lib/postgresql/9.2/bin/postgres
    828.00  0.2%  _raw_spin_lock          [kernel.kallsyms]
    559.00  0.1%  _raw_spin_lock_irqsave  [kernel.kallsyms]
    340.00  0.1%  switch_mm               [kernel.kallsyms]
    305.00  0.1%  poll_schedule_timeout   [kernel.kallsyms]
    274.00  0.1%  native_write_msr_safe   [kernel.kallsyms]
    257.00  0.1%  _raw_spin_lock_irq      [kernel.kallsyms]
    238.00  0.1%  apic_timer_interrupt    [kernel.kallsyms]
    236.00  0.1%  __schedule              [kernel.kallsyms]
    213.00  0.1%  HeapTupleSatisfiesMVCC
 
 With systemtap I got list of spin locks
 
 light weight locks
 lockname   mode  countavg (time)
 DynamicLocks  Exclusive   2804   1025
 DynamicLocks Shared106130
ProcArrayLock  Exclusive 63 963551
ProcArrayLock Shared 50   4160
 LockMgrLocks  Exclusive 18159
  IndividualLock   Exclusive  2  7
 
 There is relative few very long ProcArrayLocks lwlocks

It's odd that there are so many exclusive ProcArrayLock acquisitions...
A hierarchical profile would be interesting. I'd suggest compiling
postgres with -fno-omit-frame-pointer and doing a profile with perf.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-24 Thread Christian Kruse
Hi,

attached you will find a new version of the patch, removing the ctid
text but leaving the ctid itself in the message.

On 23/02/14 11:14, Amit Kapila wrote:
  In general, why I am suggesting to restrict display of newly added
  context for the case it is added to ensure that it doesn't get started
  displaying in other cases where it might make sense or might not
  make sense.
 
  Anything failing while inside an XactLockTableWait() et al. will benefit
  from the context. In which scenarios could that be unneccessary?
 
 This path is quite deep, so I have not verified that whether
 this new context can make sense for all error cases.
 For example, in below path (error message), it doesn't seem to
 make any sense (I have tried to generate it through debugger,
 actual message will vary).
 
 XactLockTableWait->SubTransGetParent->SimpleLruReadPage_ReadOnly->SimpleLruReadPage->SlruReportIOError
 
 ERROR:  could not access status of transaction 927
 DETAIL:  Could not open file pg_subtrans/: No error.
 CONTEXT:  relation public.t1
 tuple ctid (0,2)

To be honest, I don't like the idea of setting up this error context only
for log_lock_wait messages. That sounds unnecessarily complex to me, and I
think the few cases where this message doesn't add value (and thus is
useless) don't justify such complexity.

 I have not checked that, but the reason I mentioned about database name
 is due to display of database oid in similar message, please see the
 message below. I think in below context we get it from lock tag and
 I think for the patch case also, it might be better to display, but not
 a mandatory thing. Consider it as a suggestion which if you also feel
 good, then do it, else ignore it.
 
 LOG:  process 4656 still waiting for ExclusiveLock on tuple (0,1) of
 relation 57513 of database 12045 after 1046.000 ms

After thinking a bit about this, I added the database name to the context
message; see the attached patch.
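
Roughly, it is the usual errcontext callback pattern; here is a simplified
sketch (the struct and callback names below are placeholders, not necessarily
what the attached patch uses):

typedef struct XactLockTableWaitLockInfo
{
    Relation    rel;
    ItemPointer ctid;
} XactLockTableWaitLockInfo;

/* errcontext callback; it only runs if an error is actually reported */
static void
XactLockTableWaitErrorContextCb(void *arg)
{
    XactLockTableWaitLockInfo *info = (XactLockTableWaitLockInfo *) arg;

    errcontext("while waiting for tuple (%u,%u) in relation \"%s\" of database \"%s\"",
               ItemPointerGetBlockNumber(info->ctid),
               ItemPointerGetOffsetNumber(info->ctid),
               RelationGetRelationName(info->rel),
               get_database_name(MyDatabaseId));
}

void
XactLockTableWaitWithInfo(Relation rel, HeapTuple tup, TransactionId xid)
{
    ErrorContextCallback callback;
    XactLockTableWaitLockInfo info;

    info.rel = rel;
    info.ctid = &tup->t_self;

    /* push the callback for the duration of the wait, pop it afterwards */
    callback.callback = XactLockTableWaitErrorContextCb;
    callback.arg = &info;
    callback.previous = error_context_stack;
    error_context_stack = &callback;

    XactLockTableWait(xid);

    error_context_stack = callback.previous;
}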

   Currently I simply display the whole tuple with truncating long
   fields. This is plain easy, no distinction necessary. I did some code
   reading and it seems somewhat complex to get the PK columns and it
   seems that we need another lock for this, too - maybe not the best
   idea when we are already in locking trouble, do you disagree?
 
  I don't think you need to take another lock for this, it must already
  have appropriate lock by that time. There should not be any problem
  in calling relationHasPrimaryKey except that currently it is static, you
  need to expose it.
 
  Other locations that deal with similar things (notably
  ExecBuildSlotValueDescription()) doesn't either. I don't think this
  patch should introduce it then.
 
 Displaying whole tuple doesn't seem to be the most right way
 to get debug information and especially in this case we are
 already displaying tuple offset(ctid) which is unique identity
 to identify a tuple. It seems to me that it is sufficient to display
 unique value of tuple if present.
 I understand that there is no clear issue here, so may be if others also
 share their opinion then it will be quite easy to take a call.

That would be nice. I haven't changed it yet.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a771ccb..a04414e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithErrorContext(Relation rel, HeapTuple tup, MultiXactId multi,
+	MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2703,8 +2705,8 @@ l1:
 		if (infomask & HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWaitWithErrorContext(relation, tp, (MultiXactId) xwait,
+	  MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2730,7 +2732,7 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithInfo(relation, tp, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3255,8 +3257,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, &remain,
-			infomask);
+			MultiXactIdWaitWithErrorContext(relation, oldtup,
+	   (MultiXactId) xwait, mxact_status, &remain, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3330,7 +3332,7 @@ l2:
 			else
 			{
 /* wait 

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Shigeru Hanada
Hi Kaigai-san,

Sorry to leave the thread for a while.

2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp:
 (1) Interface to add alternative paths in addition to built-in join paths

 This patch adds add_join_path_hook on add_paths_to_joinrel to allow
 extensions to provide alternative scan path in addition to the built-in
 join paths. Custom-scan path being added is assumed to perform to scan
 on a (virtual) relation that is a result set of joining relations.
 My concern is its arguments to be pushed. This hook is declared as follows:

 /* Hook for plugins to add custom join path, in addition to default ones */
 typedef void (*add_join_path_hook_type)(PlannerInfo *root,
 RelOptInfo *joinrel,
 RelOptInfo *outerrel,
 RelOptInfo *innerrel,
 JoinType jointype,
 SpecialJoinInfo *sjinfo,
 List *restrictlist,
 List *mergeclause_list,
 SemiAntiJoinFactors *semifactors,
 Relids param_source_rels,
 Relids extra_lateral_rels);
 extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;

 Likely, its arguments upper than restrictlist should be informed to 
 extensions,
 because these are also arguments of add_paths_to_joinrel().
 However, I'm not 100% certain how about other arguments should be informed.
 Probably, it makes sense to inform param_source_rels and extra_lateral_rels
 to check whether the path is sensible for parameterized paths.
 On the other hand, I doubt whether mergeclause_list is usuful to deliver.
 (It may make sense if someone tries to implement their own merge-join
 implementation??)

 I'd like to seem idea to improve the current interface specification.

I've read the code path that adds custom join paths again, and felt that
providing semifactors seems unnecessary for the first cut, because it is
used only in initial_cost_nestloop (final_cost_nestloop receives
semifactors but does not use them), and external modules would not become
that smart before the 9.5 development cycle.  It seems complex enough to
postpone determining whether it's essential for add_join_path_hook.  Do
you have any concrete use case for the parameter?

mergeclause_list and param_source_rels seem a little easier to use, but in
any case it should be documented how to use those parameters.

IMHO, a minimal interface seems better than a fully-fledged but half-baked
one, especially in the first cut.
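
For what it's worth, an extension would presumably consume the proposed hook
roughly like this (a sketch against the signature quoted above;
create_my_join_path() is a placeholder for whatever builds the custom path):

static add_join_path_hook_type prev_add_join_path_hook = NULL;

static void
my_add_join_path(PlannerInfo *root,
                 RelOptInfo *joinrel,
                 RelOptInfo *outerrel,
                 RelOptInfo *innerrel,
                 JoinType jointype,
                 SpecialJoinInfo *sjinfo,
                 List *restrictlist,
                 List *mergeclause_list,
                 SemiAntiJoinFactors *semifactors,
                 Relids param_source_rels,
                 Relids extra_lateral_rels)
{
    /* keep any previously installed hook working */
    if (prev_add_join_path_hook)
        prev_add_join_path_hook(root, joinrel, outerrel, innerrel, jointype,
                                sjinfo, restrictlist, mergeclause_list,
                                semifactors, param_source_rels,
                                extra_lateral_rels);

    /*
     * Build a custom path for this join and offer it to the planner;
     * add_path() simply discards it if it cannot beat the built-in paths.
     */
    /* add_path(joinrel, (Path *) create_my_join_path(root, joinrel, ...)); */
}

void
_PG_init(void)
{
    prev_add_join_path_hook = add_join_path_hook;
    add_join_path_hook = my_add_join_path;
}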

-- 
Shigeru HANADA




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-24 Thread Robert Haas
On Wed, Feb 19, 2014 at 8:27 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Feb 20, 2014 at 8:55 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-19 12:47:40 -0500, Robert Haas wrote:
 On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  Yes, that's a good precedent in multiple ways.
  Here are updated patches to use pg_lsn instead of pglsn...

 OK, so I think this stuff is all committed now, with assorted changes.
  Thanks for your work on this.

 cool, thanks you two.

 I wonder if pg_stat_replication shouldn't be updated to use it as well?
 SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows
 that as names that are possible candidates for conversion.
 I was sure to have forgotten some views or functions in the previous
 patch... Please find attached a patch making pg_stat_replication use
 pg_lsn instead of the existing text fields.
 Regards,

Committed.  Do we want to do anything about pageinspect?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Shigeru Hanada
2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp:
 Folks,

 Let me remind the custom-scan patches; that is a basis feature of
 remote join of postgres_fdw, cache-only scan, (upcoming) GPU
 acceleration feature or various alternative ways to scan/join relations.
 Unfortunately, small amount of discussion we could have in this commit
 fest, even though Hanada-san volunteered to move the patches into
 ready for committer state at the CF-Nov.

I found some cosmetic flaws and a missing .gitignore entry in the patches.
Please see the attached patch for details.

-- 
Shigeru HANADA


custom_scan_cosmetic_fix.patch
Description: Binary data



Re: [HACKERS] GiST support for inet datatypes

2014-02-24 Thread Bruce Momjian
On Wed, Feb 19, 2014 at 06:37:42PM -0500, Tom Lane wrote:
 Emre Hasegeli e...@hasegeli.com writes:
  [ cites bug #5705 ]
 
 Hm.  I had forgotten how thoroughly broken btree_gist's inet and cidr
 opclasses are.  There was discussion at the time of just ripping them
 out despite the compatibility hit.  We didn't do it, but if we had
 then life would be simpler now.
 
 Perhaps it would be acceptable to drop the btree_gist implementation
 and teach pg_upgrade to refuse to upgrade if the old database contains
 any such indexes.  I'm not sure that solves the problem, though, because
 I think pg_upgrade will still fail if the opclass is in the old database
 even though unused (because you'll get the complaint about multiple
 default opclasses anyway).  The only simple way to avoid that is to not
 have btree_gist loaded at all in the old DB, and I doubt that's an
 acceptable upgrade restriction.  It would require dropping indexes of
 the other types supported by btree_gist, which would be a pretty hard
 sell since those aren't broken.
 
 Really what we'd need here is for btree_gist to be upgraded to a version
 that doesn't define gist_inet_ops (or at least doesn't mark it as default)
 *before* pg_upgrading to a server version containing the proposed new
 implementation.  Not sure how workable that is.  Could we get away with
 having pg_upgrade unset the default flag in the old database?
 (Ick, but there are no very good alternatives here ...)

pg_upgrade has _never_ modified the old cluster, and I am hesitant to do
that now.  Can we make the changes in the new cluster, or in pg_dump
when in binary upgrade mode?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] PoC: Partial sort

2014-02-24 Thread Robert Haas
On Wed, Feb 19, 2014 at 1:39 PM, Marti Raudsepp ma...@juffo.org wrote:
 On Wed, Feb 12, 2014 at 11:54 PM, Marti Raudsepp ma...@juffo.org wrote:
 With partial-sort-basic-1 and this fix on the same test suite, the
 planner overhead is now a more manageable 0.5% to 1.3%; one test is
 faster by 0.5%.

 Ping, Robert or anyone, does this overhead seem bearable or is that
 still too much?

 Do these numbers look conclusive enough or should I run more tests?

Tom should really be the one to comment on this, I think.  I read
through the patch quickly and it looks much less scary than the early
versions, but it's not obvious to me whether the remaining overhead is
enough to worry about.  I'd need to spend more time studying it to
form a really sound opinion on that topic, and unfortunately I don't
have that time right now.

I think it'd be interesting to try to determine specifically where
that overhead is coming from.  Pick the test case where it's the worst
(1.3%) and do a perf with and without the patch and look at the
difference in the call graph.  It's possible we could have changes on
that order of magnitude just from more or less fortuitous code layout
decisions as code shifts around, but it's also possible that there's a
real effect there we should think harder about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2014-02-24 Thread Andres Freund
Hi,

I took a quick peek at this, and noticed the following things:
* I am pretty sure this patch doesn't compile anymore after the latest
  set of releases.
* This definitely should include isolationtester tests actually
  performing concurrent ALTER TABLEs. All that's currently there is
  tests that the locklevel isn't too high, but not that it actually works.
* So far no DDL uses ShareRowExclusiveLocks, why do so now? Not sure if
  there aren't relevant cases with foreign key checks (which afair
  *do* acquire SRE locks).
* Why is AddConstraint so complicated when it always uses SRE locks?
* Are you sure AlterConstraint is generally safe without an AEL? It
  should be safe to change whether something is deferred, but not
  necessarily whether it's deferrable?
* Why does ChangeOwner need an AEL?
* You changed several places to take out lower locks, which in itself is
  fine, but doesn't that lead to a lock upgrade hazard if a later step
  acquires a stronger lock? Or do we take out a strong enough lock from
  the beginning?
* There's no explanation why the EOXact TupleDesc stuff is needed?

That's it for now,

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Merlin Moncure
On Mon, Feb 24, 2014 at 9:08 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I still find the phrasing as jsonb is more efficient for most
 purposes to be a bit off  Basically, the text json type is faster for
 serialization/deserialization pattern (not just document preservation)
 and jsonb is preferred when storing json and doing repeated
subdocument accesses.

 Hm, I'm going to withdraw that.  I had done some testing of simple
 deserialization (cast to text and the like) and noted that jsonb was
 as much as 5x slower.  However, I just did some checking on
 json[b]_populate_recordset though and it's pretty much a wash.

[sorry for noise on this].

Here's the use case coverage as I see it today:

CASE:                 json     jsonb    hstore
Static document:      yes      poor     poor
Precise document:     yes      no       no
Serialization:        yes      no       no
Deserialization:      poor***  yes*     no
Repeated Access:      poor     yes      yes
Manipulation:         no       no**     yes
GIST/GIN searching:   no       no**     yes

notes:
* jsonb gets 'yes' for deserialization assuming andrew's 'two level'
deserialization fix goes in (otherwise 'poor').
** jsonb can't do this today, but presumably will be able to soon
*** 'poor' unless json type also gets the deserialization fix, then 'yes'.
 hstore can deserialize hstore format, but will rely on json/jsonb
for deserializing json

'Static document' represents edge cases where the json is opaque to the
database but performance matters -- for example, large map polygons.
'Precise document' represents cases where whitespace or key order is
important.

Peter asked upthread how to access the various features.  Well, today,
it basically means a bit of nimble casting to different structures
depending on which particular features are important to you, which
IMNSHO is not bad at all as long as we understand that most people who
rely on jsonb will also need hstore for its searching and operators.
Down the line, when hstore and jsonb are more fleshed out, it's going to
come down to an API style choice.

merlin




Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication

2014-02-24 Thread Andres Freund
Hi,

On 2014-02-21 14:15:09 +0100, Christian Kruse wrote:
 +/* --
 + * pgstat_fetch_stat_local_beentry() -
 + *
 + *  Like pgstat_fetch_stat_beentry() but with local addtions (like xid and
 + *  xmin values of the backend)

s/local addtions/locally computed additions/

 +/* --
 + * LocalPgBackendStatus
 + *
 + * When we build the backend status array, we use LocalPgBackendStatus to be
 + * able to add new values to the struct when needed without adding new fields
 + * to the shared memory. It contains the backend status as a first member.
 + * --
 + */
 +typedef struct LocalPgBackendStatus
 +{
 + /*
 +  * Local version of the backend status entry
 +  */
 + PgBackendStatus backendStatus;
 +
 + /*
 +  * The xid of the current transaction if available, InvalidTransactionId
 +  * if not
 +  */
 + TransactionId backend_xid;
 +
 + /*
 +  * The xmin of the current session if available, InvalidTransactionId
 +  * if not
 +  */
 + TransactionId backend_xmin;
 +} LocalPgBackendStatus;
 +

Those are sentences, so they should include a . at the end.

I think this is ready for committer.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-02-24 Thread Dean Rasheed
On 20 February 2014 01:48, Florian Pflug f...@phlo.org wrote:
 On Jan29, 2014, at 13:45 , Florian Pflug f...@phlo.org wrote:
 In fact, I'm
 currently leaning towards just forbidding non-strict forward transition
 function with strict inverses, and adding non-NULL counters to the
 aggregates that then require them. It's really only the SUM() aggregates
 that are affected by this, I think.

 I finally got around to doing that, and the results aren't too bad. The
 attached patches required that the strictness settings of the forward and
 reverse transition functions agree, and employ exactly the same NULL-skipping
 logic we always had.

 The only aggregates seriously affected by that change were SUM(int2) and
 SUM(int4).

 The SUM, AVG and STDDEV aggregates which use NumericAggState where
 already mostly prepared for this - all they required were a few adjustments
 to correctly handle the last non-NULL, non-NaN input being removed, and a few
 additional PG_ARGISNULL calls for the inverse transition functions since 
 they're
 now non-strict. I've also modified them to unconditionally allocate the state
 at the first call, instead upon seeing the first non-NULL input, but that 
 isn't
 strictly required. But without that, the state can have three classes of 
 values -
 SQL-NULL, NULL pointer and valid pointer, and that's just confusing...

 SUM(int2) and SUM(int4) now simply use the same transition functions as
 AVG(int2) and AVG(int4), which use an int8 array to track the sum of the 
 inputs
 and the number of inputs, plus a new final function int2int4_sum(). 
 Previously,
 they used a single int8 as their state type.

 Since I was touching the code anyway, I removed some unnecessary inverse
 transition functions - namely, int8_avg_accum_inv and numeric_avg_accum_inv. 
 These
 are completely identical to their non-avg cousins - the only difference 
 between
 the corresponding forward transition functions is whether they request 
 computation
 of sumX2 (i.e. the sum of squares of the inputs) or not.

 I haven't yet updated the docs - it'll do that if and when there's consensus
 about whether this is the way to go or not.


I haven't looked at this in any detail yet, but that seems much neater
to me. It seems perfectly sensible that the forward and inverse
transition functions should have the same strictness settings, and
enforcing that keeps the logic simple, as well as hopefully making it
easier to document.
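
As a plain-C illustration (not the patch code itself) of why the non-NULL
counter makes the inverse transition workable for SUM(): the state tracks both
the running sum and how many non-NULL inputs it has absorbed, so after
removals the final function can still distinguish "no non-NULL inputs" (SQL
NULL) from "the sum happens to be zero":

#include <stdbool.h>
#include <stdint.h>

typedef struct SumState
{
    int64_t sum;        /* running sum of the non-NULL inputs       */
    int64_t count;      /* how many non-NULL inputs are in that sum */
} SumState;

/* forward transition: fold one input into the state */
void
sum_accum(SumState *state, bool isnull, int32_t value)
{
    if (!isnull)
    {
        state->sum += value;
        state->count++;
    }
}

/* inverse transition: remove one input that was added earlier */
void
sum_accum_inv(SumState *state, bool isnull, int32_t value)
{
    if (!isnull)
    {
        state->sum -= value;
        state->count--;
    }
}

/* final function: SUM() over zero non-NULL inputs is NULL, not 0 */
bool
sum_final(const SumState *state, int64_t *result)
{
    if (state->count == 0)
        return false;   /* caller maps this to SQL NULL */
    *result = state->sum;
    return true;
}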

It's a shame that more transition functions cannot be made strict,
when they actually ignore null values, but I think trying to solve
that can be regarded as outside the scope of this patch.

Regards,
Dean




Re: [HACKERS] [SQL] Comparison semantics of CHAR data type

2014-02-24 Thread Bruce Momjian
On Fri, Feb 14, 2014 at 05:02:26PM -0500, Bruce Momjian wrote:
 On Thu, Feb 13, 2014 at 09:47:01PM -0500, Bruce Momjian wrote:
  On Wed, Oct 16, 2013 at 02:17:11PM -0400, Bruce Momjian wrote:
 You can see the UTF8 case is fine because \n is considered greater
 than space, but in the C locale, where \n is less than space, the
 false return value shows the problem with
 internal_bpchar_pattern_compare() trimming the string and first
 comparing on lengths.  This is exactly the problem you outline, where
 space trimming assumes everything is less than a space.

For collations other than C some of those issues that have to do with
string comparisons might simply be hidden, depending on how strcoll()
handles inputs off different lengths: If strcoll() applies implicit
space padding to the shorter value, there won't be any visible
difference in ordering between bpchar and varchar values.  If strcoll()
does not apply such space padding, the right-trimming of bpchar values
causes very similar issues even in a en_US collation.
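
(For illustration only, outside the quoted discussion: a standalone C program
showing the reversal described above, with plain strcmp() standing in for the
C-locale comparison.)

#include <stdio.h>
#include <string.h>

int
main(void)
{
    const char *nl_value = "ab\n";  /* char(3) value whose third byte is \n */
    const char *sp_value = "ab ";   /* 'ab' stored in char(3), space padded */

    /*
     * Compared as-is (text/varchar semantics in the C locale):
     * '\n' (10) < ' ' (32), so "ab\n" sorts before "ab ".
     */
    printf("untrimmed: %d\n", strcmp(nl_value, sp_value));  /* negative */

    /*
     * Compared the bpchar way, after right-trimming the padded value to
     * "ab": the shorter string sorts first, so "ab\n" now sorts after it,
     * i.e. the opposite ordering.
     */
    printf("trimmed:   %d\n", strcmp(nl_value, "ab"));      /* positive */

    return 0;
}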
  
  I have added the attached C comment to explain the problem, and added a
  TODO item to fix it if we ever break binary upgrading.
  
  Does anyone think this warrants a doc mention?
 
 I have done some more thinking on this and I found a way to document
 this, which reduces our need to actually fix it some day.  I am afraid
 the behavioral change needed to fix this might break so many
 applications that the fix will never be done, though I will keep the
 TODO item until I get more feedback on that.  Patch attached.

Patch applied.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Row-security on updatable s.b. views

2014-02-24 Thread Dean Rasheed
On 13 February 2014 04:12, Craig Ringer cr...@2ndquadrant.com wrote:
 On 02/11/2014 08:19 PM, Yeb Havinga wrote:

 I compared output of psql -ef of the minirim.sql script posted earlier
 in http://www.postgresql.org/message-id/52f54927.1040...@gmail.com
 between v4 and v7.

 Not everything is ok.

 +psql:/home/m/minirim2.sql:409: ERROR:  attribute 6 has wrong type
 +DETAIL:  Table has type tsrange, but query expects _confidentialitycode.

 This looks like an issue with attribute transformations made when
 applying security barrier quals.

 +psql:/home/m/minirim2.sql:439: connection to server was lost

 That's dying with:

 Program received signal SIGABRT, Aborted.
 0x003f02c359e9 in __GI_raise (sig=sig@entry=6) at 
 ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 56return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
 (gdb) bt
 #0  0x003f02c359e9 in __GI_raise (sig=sig@entry=6) at 
 ../nptl/sysdeps/unix/sysv/linux/raise.c:56
 #1  0x003f02c370f8 in __GI_abort () at abort.c:90
 #2  0x00758d9d in ExceptionalCondition 
 (conditionName=conditionName@entry=0x8b7bf0 !(!bms_is_empty(upper_varnos)),
 errorType=errorType@entry=0x78e89c FailedAssertion, 
 fileName=fileName@entry=0x8b78d7 subselect.c, 
 lineNumber=lineNumber@entry=1394) at assert.c:54
 #3  0x0061de2b in convert_EXISTS_sublink_to_join (root=optimized 
 out, sublink=optimized out, under_not=under_not@entry=0 '\000', 
 available_rels=0x117d038)
 at subselect.c:1394
 #4  0x0061efa9 in pull_up_sublinks_qual_recurse 
 (root=root@entry=0x117d058, node=0x117a0f8, 
 jtlink1=jtlink1@entry=0x7fff6d97f708,
 available_rels1=available_rels1@entry=0x117d038, 
 jtlink2=jtlink2@entry=0x0, available_rels2=available_rels2@entry=0x0) at 
 prepjointree.c:391
 #5  0x0061f339 in pull_up_sublinks_jointree_recurse 
 (root=root@entry=0x117d058, jtnode=0x117a040, 
 relids=relids@entry=0x7fff6d97f758) at prepjointree.c:208
 #6  0x0062066f in pull_up_sublinks (root=root@entry=0x117d058) at 
 prepjointree.c:147
 #7  0x0061967e in subquery_planner (glob=0x10c3fb8, 
 parse=parse@entry=0x1179af8, parent_root=parent_root@entry=0x1182fd0, 
 hasRecursion=hasRecursion@entry=0 '\000',
 `
 #8  0x005fc093 in set_subquery_pathlist (root=root@entry=0x1182fd0, 
 rel=rel@entry=0x1179370, rti=rti@entry=4, rte=rte@entry=0x1183980) at 
 allpaths.c:1209
 #9  0x005fc54e in set_rel_size (root=root@entry=0x1182fd0, 
 rel=0x1179370, rti=rti@entry=4, rte=0x1183980) at allpaths.c:265
 #10 0x005fc62e in set_base_rel_sizes (root=root@entry=0x1182fd0) at 
 allpaths.c:180
 #11 0x005fd584 in make_one_rel (root=root@entry=0x1182fd0, 
 joinlist=joinlist@entry=0x1179560) at allpaths.c:138
 #12 0x0061617a in query_planner (root=root@entry=0x1182fd0, 
 tlist=tlist@entry=0x11771c8, qp_callback=qp_callback@entry=0x616fd6 
 standard_qp_callback,
 qp_extra=qp_extra@entry=0x7fff6d97fa00) at planmain.c:236
 #13 0x006182f2 in grouping_planner (root=root@entry=0x1182fd0, 
 tuple_fraction=tuple_fraction@entry=0) at planner.c:1280
 #14 0x00619a11 in subquery_planner (glob=glob@entry=0x10c3fb8, 
 parse=parse@entry=0x10c3d30, parent_root=parent_root@entry=0x0,
 hasRecursion=hasRecursion@entry=0 '\000', tuple_fraction=0, 
 subroot=subroot@entry=0x7fff6d97fb58) at planner.c:574
 #15 0x00619c1f in standard_planner (parse=0x10c3d30, 
 cursorOptions=0, boundParams=0x0) at planner.c:211
 #16 0x00619e35 in planner (parse=parse@entry=0x10c3d30, 
 cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at 
 planner.c:139
 #17 0x0068c64b in pg_plan_query (querytree=0x10c3d30, 
 cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at 
 postgres.c:759
 #18 0x0068c6d3 in pg_plan_queries (querytrees=optimized out, 
 cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at 
 postgres.c:818
 #19 0x0068c932 in exec_simple_query 
 (query_string=query_string@entry=0x10c2d78 SELECT * FROM hl7.test;) at 
 postgres.c:983
 #20 0x0068e46e in PostgresMain (argc=optimized out, 
 argv=argv@entry=0x10cd1f0, dbname=0x10cd058 regress, username=optimized 
 out) at postgres.c:4003
 #21 0x006419a7 in BackendRun (port=port@entry=0x10c7e50) at 
 postmaster.c:4080
 #22 0x006432c2 in BackendStartup (port=port@entry=0x10c7e50) at 
 postmaster.c:3769
 #23 0x00643526 in ServerLoop () at postmaster.c:1580
 #24 0x0064467c in PostmasterMain (argc=argc@entry=4, 
 argv=argv@entry=0x10a8750) at postmaster.c:1235
 #25 0x005d692c in main (argc=4, argv=0x10a8750) at main.c:205
 (gdb)


 It's crashing while pulling up the query over emp (hl7.employee) and
 part (hl7.participation).

 Given the simplicity of what the row-security code its self is doing,
 I'm wondering if this is a case that isn't handled in updatable s.b.
 views. I'll look into it.


I'm not sure how much further you've got 

Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Josh Berkus
On 02/24/2014 07:08 AM, Merlin Moncure wrote:
 On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure mmonc...@gmail.com wrote:
 I still find the phrasing as jsonb is more efficient for most
 purposes to be a bit off  Basically, the text json type is faster for
 serialization/deserialization pattern (not just document preservation)
 and jsonb is preferred when storing json and doing repeated
 subdocument accesses.
 
 Hm, I'm going to withdraw that.  I had done some testing of simple
 deserialization (cast to text and the like) and noted that jsonb was
 as much as 5x slower.  However, I just did some checking on
 json[b]_populate_recordset though and it's pretty much a wash.

Aside from that, I want our docs to make a strong endorsement of using
jsonb over json for most users.  jsonb will continue to be developed and
improved in the future; it is very unlikely that json will.  Maybe
that's what I should say rather than anything about efficiency.

In other words: having an ambiguous, complex evaluation of json vs.
jsonb does NOT benefit most users.  The result will be some users
choosing json and then pitching a fit when they want jsonb in 9.5 and have
to rewrite all their tables.

Mind you, we'll need to fix the slow deserialization, though.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-24 Thread Robert Haas
On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris j...@wizmail.org wrote:
 On 09/02/14 17:11, Jeremy Harris wrote:
 On 06/02/14 18:21, Jeff Janes wrote:
   Did you try sorting already-sorted, reverse
 sorted, or pipe-organ shaped data sets?  We will also need to test it on
 strings.  I usually use md5(random()::text) to generate strings for such
 purposes, at least for a first pass.


 Attached is version 2 of the patch, which fixes the performance on
 constant-input.

 Having beaten on this some more I'm prepared to abandon it.

 The wallclock time, for random input, drifts up at larger N
 (compared to the existing code) despite the number of comparisons
 being consistently less.

 Run under cachegrind, it takes about N/10 last-level cache misses,
 all for the new item being introduced to the heap.  The existing
 code takes none at all.

Can you explain this further?  This seems like an important clue that
could be useful when trying to optimize this code, but I'm a little
unclear which part of the operation has more cache misses with your
changes and why.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Bruce Momjian
On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
 As others, I am concerned about people being confused when funny-looking
 trigger names suddenly appearing when you disable all table triggers.
 
 What I ended up doing is to create a user and internal section when
 displaying disabled triggers:
 
   Disabled user triggers:
   check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE PROCEDURE 
 trigf()
   Disabled internal triggers:
   RI_ConstraintTrigger_c_16409 AFTER INSERT ON orders FROM customer 
 NOT DEF ...
   RI_ConstraintTrigger_c_16410 AFTER UPDATE ON orders FROM customer 
 NOT DEF ...
   
 I kept the Triggers section unchanged, showing only user triggers.  I
 also updated the code to handle 8.3+ servers.
 
 Patch attached.

Patch applied.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-24 Thread Robert Haas
On Mon, Feb 24, 2014 at 9:48 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
 On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund and...@2ndquadrant.com 
 wrote:

 +   /*
 +* XXX: It's impolite to ignore our argument and keep decoding until 
 the
 +* current position.
 +*/

 Eh, what?

 So, the background here is that I was thinking of allowing to specify a
 limit for the number of returned rows. For the sql interface that sounds
 like a good idea. I am just not so sure anymore that allowing to specify
 a LSN as a limit is sufficient. Maybe simply allow to limit the number
 of changes and check everytime a transaction has been replayed?

The last idea there seems pretty sound, but ...

 It's all trivial codewise, I am just wondering about the interface most
 users would want.

...I can't swear it meets this criterion.

 +* We misuse the original meaning of SnapshotData's xip and
 subxip fields
 +* to make the more fitting for our needs.
 [...]
 +* XXX: Do we want extra fields instead of misusing existing
 ones instead?

 If we're going to do this, then it surely needs to be documented in
 snapshot.h.  On the second question, you're not the first hacker to
 want to abuse the meanings of the existing fields; SnapshotDirty
 already does it.  It's tempting to think we need a more principled
 approach to this, like what we've done with Node i.e. typedef enum ...
 SnapshotType; and then a separate struct definition for each kind, all
 beginning with SnapshotType type.

 Hm, essentially that's what the ->satisfies pointer already is, right?

Sorta, yeah.  But with nodes, you can change the whole struct
definition for each type.
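
Something in this shape, purely as an illustration (plain C, not actual
PostgreSQL definitions; the real SnapshotData is one struct whose fields get
reinterpreted):

#include <stdint.h>

typedef uint32_t TransactionId;     /* stand-in for the real typedef */

typedef enum SnapshotType
{
    SNAPSHOT_MVCC,
    SNAPSHOT_DIRTY,
    SNAPSHOT_HISTORIC_MVCC          /* the decoding case discussed here */
} SnapshotType;

/* every snapshot kind starts with its tag, like Node's NodeTag */
typedef struct MVCCSnapshotData
{
    SnapshotType type;
    TransactionId xmin;
    TransactionId xmax;
    TransactionId *xip;             /* in-progress XIDs, the usual meaning */
    uint32_t    xcnt;
} MVCCSnapshotData;

typedef struct HistoricSnapshotData
{
    SnapshotType type;
    /* fields with their own decoding-specific meaning, documented here */
    TransactionId *committed_xids;
    uint32_t    committed_cnt;
} HistoricSnapshotData;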

 There's already documentation of the extra fields in snapbuild, but I
 understand you'd rather have them moved?

Yeah, I think it needs to be documented where SnapshotData is defined.
 There may be reason to mention it again, or in more detail,
elsewhere.  But there should be some mention of it there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] old warning in docs

2014-02-24 Thread Bruce Momjian
On Sat, Feb 15, 2014 at 07:34:20PM -0500, Bruce Momjian wrote:
 On Sat, Feb 15, 2014 at 07:19:46PM -0500, Robert Haas wrote:
  On Thu, Feb 13, 2014 at 10:55 AM, Bruce Momjian br...@momjian.us wrote:
   I have created the attached patch which removes many of the pre-8.0
   references, and trims some of the 8.1-8.3 references.  There are
   probably some of these that should be kept, but it is easier to show you
   all the possibilities and we can trim down the removal list based on
   feedback.
  
  The changes to lobj.sgml seem pointless.  I would also vote for
  keeping xindex.sgml as-is.  The rest of the changes seem like
  improvements.
 
 OK, done.  Updated patch attached.

Patch applied.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)

2014-02-24 Thread Pavel Stehule
2014-02-24 16:09 GMT+01:00 Andres Freund and...@2ndquadrant.com:

 On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
  354246.00 93.0%  s_lock                  /usr/lib/postgresql/9.2/bin/postgres
   10503.00  2.8%  LWLockRelease           /usr/lib/postgresql/9.2/bin/postgres
    8802.00  2.3%  LWLockAcquire           /usr/lib/postgresql/9.2/bin/postgres
     828.00  0.2%  _raw_spin_lock          [kernel.kallsyms]
     559.00  0.1%  _raw_spin_lock_irqsave  [kernel.kallsyms]
     340.00  0.1%  switch_mm               [kernel.kallsyms]
     305.00  0.1%  poll_schedule_timeout   [kernel.kallsyms]
     274.00  0.1%  native_write_msr_safe   [kernel.kallsyms]
     257.00  0.1%  _raw_spin_lock_irq      [kernel.kallsyms]
     238.00  0.1%  apic_timer_interrupt    [kernel.kallsyms]
     236.00  0.1%  __schedule              [kernel.kallsyms]
     213.00  0.1%  HeapTupleSatisfiesMVCC
 
  With systemtap I got list of spin locks
 
  light weight locks
  lockname   mode  countavg (time)
  DynamicLocks  Exclusive   2804   1025
  DynamicLocks Shared106130
 ProcArrayLock  Exclusive 63 963551
 ProcArrayLock Shared 50   4160
  LockMgrLocks  Exclusive 18159
   IndividualLock   Exclusive  2  7
 
  There is relative few very long ProcArrayLocks lwlocks

 It's odd that there are so many exclusive acquisition
 ProcArrayLocks... A hierarchical profile would be interesting. I'd
 suggest compiling postgres with -fno-omit-frame-pointer and doing a
 profile with perf.


I have no experience with perf, so maybe this is not what you want:

-  19.59%   postmaster  postgres
   - s_lock
  - 55.06% LWLockAcquire
 + 99.84% TransactionIdIsInProgress
  - 44.63% LWLockRelease
 + 99.91% TransactionIdIsInProgress
-  13.84%   postmaster  postgres
   - tas
  - 97.97% s_lock
 + 55.01% LWLockAcquire
 + 44.99% LWLockRelease
  - 1.10% LWLockAcquire
 + 99.89% TransactionIdIsInProgress
  - 0.93% LWLockRelease
 + 99.93% TransactionIdIsInProgress



-  19.59%   postmaster  postgres
   - s_lock
  - 55.06% LWLockAcquire
 - 99.84% TransactionIdIsInProgress
  HeapTupleSatisfiesMVCC
  heap_hot_search_buffer
  index_fetch_heap
  index_getnext
  get_actual_variable_range
  ineq_histogram_selectivity
  scalarineqsel
  mergejoinscansel
  cached_scansel
  initial_cost_mergejoin
  try_mergejoin_path
  sort_inner_and_outer
  add_paths_to_joinrel
  make_join_rel
  make_rels_by_clause_joins
  join_search_one_level
  standard_join_search
  make_rel_from_joinlist
  make_one_rel
  query_planner
  grouping_planner
  subquery_planner
  standard_planner
  planner
  pg_plan_query
  pg_plan_queries
  BuildCachedPlan
  GetCachedPlan
  exec_bind_message
  PostgresMain
  ExitPostmaster
  BackendStartup
  ServerLoop
  PostmasterMain
  startup_hacks

regards

Pavel


Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem

2014-02-24 Thread Bruce Momjian
On Mon, Feb 17, 2014 at 11:14:33AM -0500, Bruce Momjian wrote:
 On Sun, Feb 16, 2014 at 09:26:47PM -0500, Robert Haas wrote:
   So, would anyone like me to create patches for any of these items before
   we hit 9.4 beta?  We have added autovacuum_work_mem, and increasing
   work_mem and maintenance_work_mem by 4x is a simple operation.  Not sure
   about the others.  Or do we just keep this all for 9.5?
  
  I don't think anyone objected to increasing the defaults for work_mem
  and maintenance_work_mem by 4x, and a number of people were in favor,
  so I think we should go ahead and do that.  If you'd like to do the
  honors, by all means!
 
 OK, patch attached.

Patch applied.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Andres Freund
On 2014-02-24 12:45:12 -0500, Bruce Momjian wrote:
 On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
  As others, I am concerned about people being confused when funny-looking
  trigger names suddenly appearing when you disable all table triggers.
  
  What I ended up doing is to create a user and internal section when
  displaying disabled triggers:
  
  Disabled user triggers:
  check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE PROCEDURE 
  trigf()
  Disabled internal triggers:
  RI_ConstraintTrigger_c_16409 AFTER INSERT ON orders FROM customer 
  NOT DEF ...
  RI_ConstraintTrigger_c_16410 AFTER UPDATE ON orders FROM customer 
  NOT DEF ...
  
  I kept the Triggers section unchanged, showing only user triggers.  I
  also updated the code to handle 8.3+ servers.
  
  Patch attached.
 
 Patch applied.

Thanks. It'd have been nice, though, to mention Fabrízio in the commit
message as the patch's author.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Bruce Momjian
On Mon, Feb 24, 2014 at 07:09:29PM +0100, Andres Freund wrote:
 On 2014-02-24 12:45:12 -0500, Bruce Momjian wrote:
  On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
   As others, I am concerned about people being confused when funny-looking
   trigger names suddenly appearing when you disable all table triggers.
   
   What I ended up doing is to create a user and internal section when
   displaying disabled triggers:
   
 Disabled user triggers:
 check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE PROCEDURE 
   trigf()
 Disabled internal triggers:
 RI_ConstraintTrigger_c_16409 AFTER INSERT ON orders FROM customer 
   NOT DEF ...
 RI_ConstraintTrigger_c_16410 AFTER UPDATE ON orders FROM customer 
   NOT DEF ...
 
   I kept the Triggers section unchanged, showing only user triggers.  I
   also updated the code to handle 8.3+ servers.
   
   Patch attached.
  
  Patch applied.
 
 Thanks. It'd have been nice tho, to mention Fabrízio in the commit
 message as the patch's author.

Uh, I was thinking of that, but I basically rewrote the patch from
scratch and changed its visible behavior, so I was worried about perhaps
blaming him if it introduced a bug.  I should have said original patch
by ..., but because so much of it was new, I didn't bother.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Andres Freund
On 2014-02-24 13:16:39 -0500, Bruce Momjian wrote:
 On Mon, Feb 24, 2014 at 07:09:29PM +0100, Andres Freund wrote:
  On 2014-02-24 12:45:12 -0500, Bruce Momjian wrote:
   On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
As others, I am concerned about people being confused when funny-looking
trigger names suddenly appear when you disable all table triggers.

What I ended up doing is to create a user and internal section when
displaying disabled triggers:

Disabled user triggers:
check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE 
PROCEDURE trigf()
Disabled internal triggers:
RI_ConstraintTrigger_c_16409 AFTER INSERT ON orders FROM 
customer NOT DEF ...
RI_ConstraintTrigger_c_16410 AFTER UPDATE ON orders FROM 
customer NOT DEF ...

I kept the Triggers section unchanged, showing only user triggers.  I
also updated the code to handle 8.3+ servers.

Patch attached.
   
   Patch applied.
  
  Thanks. It'd have been nice tho, to mention Fabrízio in the commit
  message as the patch's author.
 
 Uh, I was thinking of that, but I basically rewrote the patch from
 scratch and changed its visible behavior, so I was worried about perhaps
 blaming him if it introduced a bug.  I should have said original patch
 by ..., but because so much of it was new, I didn't bother.

It just seems nicer to relatively new contributors to mention their names
when they try to contribute.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Fabrízio de Royes Mello
On Mon, Feb 24, 2014 at 3:23 PM, Andres Freund and...@2ndquadrant.com
wrote:
   Thanks. It'd have been nice tho, to mention Fabrízio in the commit
   message as the patch's author.
 
  Uh, I was thinking of that, but I basically rewrote the patch from
  scratch and changed its visible behavior, so I was worried about perhaps
  blaming him if it introduced a bug.  I should have said original patch
  by ..., but because so much of it was new, I didn't bother.

 It just seems nicer to relatively new contributors to mention their names
 when they try to contribute.


Hey guys, I'm not worried about it... to me the most important thing is the
improvement and the learning. So I'm happy to help in some way.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Bruce Momjian
On Mon, Feb 24, 2014 at 07:23:50PM +0100, Andres Freund wrote:
 On 2014-02-24 13:16:39 -0500, Bruce Momjian wrote:
  On Mon, Feb 24, 2014 at 07:09:29PM +0100, Andres Freund wrote:
   On 2014-02-24 12:45:12 -0500, Bruce Momjian wrote:
On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
 As others, I am concerned about people being confused when funny-looking
 trigger names suddenly appear when you disable all table triggers.
 
 What I ended up doing is to create a user and internal section when
 displaying disabled triggers:
 
   Disabled user triggers:
   check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE 
 PROCEDURE trigf()
   Disabled internal triggers:
   RI_ConstraintTrigger_c_16409 AFTER INSERT ON orders FROM 
 customer NOT DEF ...
   RI_ConstraintTrigger_c_16410 AFTER UPDATE ON orders FROM 
 customer NOT DEF ...
   
 I kept the Triggers section unchanged, showing only user triggers.  
 I
 also updated the code to handle 8.3+ servers.
 
 Patch attached.

Patch applied.
   
   Thanks. It'd have been nice tho, to mention Fabrízio in the commit
   message as the patch's author.
  
  Uh, I was thinking of that, but I basically rewrote the patch from
  scratch and changed its visible behavior, so I was worried about perhaps
  blaming him if it introduced a bug.  I should have said original patch
  by ..., but because so much of it was new, I didn't bother.
 
 It just seems nicer to relatively new contributors to mention their names
 when they try to contribute.

Agreed.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Andrew Dunstan


On 02/24/2014 11:06 AM, Merlin Moncure wrote:

On Mon, Feb 24, 2014 at 9:08 AM, Merlin Moncure mmonc...@gmail.com wrote:

On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure mmonc...@gmail.com wrote:

I still find the phrasing that "jsonb is more efficient for most
purposes" to be a bit off.  Basically, the text json type is faster for the
serialization/deserialization pattern (not just document preservation)
and jsonb is preferred when storing json and doing repeated
subdocument accesses.

Hm, I'm going to withdraw that.  I had done some testing of simple
deserialization (casts to text and the like) and noted that jsonb was
as much as 5x slower.  However, I just did some checking on
json[b]_populate_recordset and it's pretty much a wash.
[sorry for noise on this].

Here's the use case coverage as I see it today:

CASE:                  json      jsonb     hstore
Static document:       yes       poor      poor
Precise document:      yes       no        no
Serialization:         yes       no        no
Deserialization:       poor***   yes*      no
Repeated Access:       poor      yes       yes
Manipulation:          no        no**      yes
GIST/GIN searching:    no        no**      yes

notes:
*    jsonb gets 'yes' for deserialization assuming Andrew's 'two level'
     deserialization fix goes in (otherwise 'poor').
**   jsonb can't do this today, but presumably will be able to soon
***  'poor' unless the json type also gets the deserialization fix, then 'yes'.
**** hstore can deserialize hstore format, but will rely on json/jsonb
     for deserializing json

'Static document' represents edge cases where the json is opaque to
the database but performance matters -- for example large map polygons.
'Precise document' represents cases where whitespace or key order is important.

Peter asked upthread how to access the various features.  Well, today,
it basically means a bit of nimble casting to different structures
depending on which particular features are important to you, which
IMNSHO is not bad at all as long as we understand that most people who
rely on jsonb will also need hstore for its searching and operators.
Down the line when hstore and jsonb are more fleshed out it's going to
come down to an API style choice.





Frankly, a lot of the above doesn't make much sense to me. WTF is 
'Manipulation'?


Unless I see much more actual info on the tests being conducted it's 
just about impossible to comment. The performance assessment at this 
stage is simply anecdotal as far as I'm concerned.


populate_record() is likely to be a *very* poor point of comparison 
anyway; I would expect the performance numbers to be dominated by the 
input function calls for the object's component types, and that's going 
to be the same in both cases. If you want to prove something here you'll 
need to supply profiling numbers showing where it spends its time in 
each case.


Having had my schedule very seriously disrupted by the storm in the US 
South East a week or so ago, I am finally getting back to being able to 
devote some time to jsonb. I hope to have new patches available today or 
tomorrow at the latest.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] typemode for variable types

2014-02-24 Thread Bruce Momjian
On Sun, Feb 23, 2014 at 07:50:08AM +0330, Mohsen SM wrote:
 Hello.
 I have a new type similar to varchar.
 I want to find out how I can calculate the typmod
 and where I must calculate the typmod for this type.

Well, typmods are type-specific, so there is no official way to
calculate it.  I would look at how an existing type uses typmod and copy
that.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] typemode for variable types

2014-02-24 Thread Alvaro Herrera
Bruce Momjian escribió:
 On Sun, Feb 23, 2014 at 07:50:08AM +0330, Mohsen SM wrote:
  Hello.
  I have a new type similar to varchar.
  I want to find out how I can calculate the typmod
  and where I must calculate the typmod for this type.
 
 Well, typmods are type-specific, so there is no official way to
 calculate it.  I would look at how an existing type uses typmod and copy
 that.

Our system is pretty neat.  See a complex example here:
https://github.com/postgis/postgis/blob/svn-trunk/postgis/gserialized_typmod.c

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)

2014-02-24 Thread Jeff Janes
On Mon, Feb 24, 2014 at 7:02 AM, Pavel Stehule pavel.steh...@gmail.com wrote:




 2014-02-23 21:32 GMT+01:00 Andres Freund and...@2ndquadrant.com:

 Hi,

 On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
  There are relatively few very long ProcArrayLock lwlocks.
 
  This issue is very pathologic on fast computers with more than 8 CPUs. The
  issue was detected after migration from 8.4 to 9.2 (but tested with the same
  result on 9.0). I also see it on a devel 9.4 build updated today.
 
  When I moved PREPARE out of the loop, the described issue was gone. But
  when I use EXECUTE IMMEDIATELY, the issue is back. So it looks like it is
  related to the planner, ...

 In addition to the issue Jeff mentioned, I'd suggest trying the same
 workload with repeatable read. That can do *wonders* because of the
 reduced number of snapshots.


 I tested it, and it doesn't help.

 Is there some patch that I can test related to this issue?


This is the one that I was referring to:

http://www.postgresql.org/message-id/11927.1384199...@sss.pgh.pa.us

Cheers,

Jeff


Re: [HACKERS] typemode for variable types

2014-02-24 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Bruce Momjian escribió:
 Well, typmods are type-specific, so there is no official way to
 calculate it.  I would look at how an existing type uses typmod and copy
 that.

 Our system is pretty neat.  See a complex example here:
 https://github.com/postgis/postgis/blob/svn-trunk/postgis/gserialized_typmod.c

One other point is that if you do consult the varchar functions as
an example, be aware that there's an offset of 4 in their definition
of the typmod (eg, for varchar(3) the stored typmod is 7).  This is
entirely for legacy reasons so there's no good reason to duplicate it
in a new custom-made type.  Except for the rule that negative values
mean unspecified typmod (which you have to support), you can
define the contents of the typmod value however you want.
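
To make that concrete, here is a minimal sketch of a typmod input/output pair
for a hypothetical varchar-like type (the function names, the error wording,
and the simple length-in-parentheses display format are assumptions for
illustration only, not taken from any existing type):

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"

PG_FUNCTION_INFO_V1(mytype_typmodin);
PG_FUNCTION_INFO_V1(mytype_typmodout);

/* Parse the modifiers given in parentheses, e.g. mytype(42), into an int32. */
Datum
mytype_typmodin(PG_FUNCTION_ARGS)
{
    ArrayType  *ta = PG_GETARG_ARRAYTYPE_P(0);
    int32      *tl;
    int         n;

    tl = ArrayGetIntegerTypmods(ta, &n);
    if (n != 1 || tl[0] < 1)
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid type modifier for mytype")));

    /* store the declared length as-is; no legacy "+ 4" offset */
    PG_RETURN_INT32(tl[0]);
}

/* Turn the stored typmod back into display form; negative means unspecified. */
Datum
mytype_typmodout(PG_FUNCTION_ARGS)
{
    int32       typmod = PG_GETARG_INT32(0);
    char       *res = palloc(32);

    if (typmod < 0)
        res[0] = '\0';
    else
        snprintf(res, 32, "(%d)", typmod);

    PG_RETURN_CSTRING(res);
}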

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
On 2014-02-20 13:25:35 +, Greg Stark wrote:
 rmgr: Heaplen (rec/tot):235/   267, tx:5943845, lsn:
 FD/2F0A3640, prev FD/2F0A3600, bkp: , desc: insert: rel
 1663/16385/212653; tid 13065/2

  lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 |
 t_ctid   | t_infomask2 | t_infomask | t_hoff |
 ++--++-+-+--++-+++-
   2 |   3424 |1 |232 | 5943845 |  728896 |0 |
 (13065,2)  |  32 |   4419 | 32 |
   3 |   3152 |1 |272 | 5943849 | 5943879 |0 |
 (13065,4)  |   49184 |   9475 | 32 |
   4 |   2864 |1 |287 | 5943879 | 5943880 |0 |
 (13065,7)  |   49184 |   9475 | 32 |
   7 |   2576 |1 |287 | 5943880 |   0 |0 |
 (13065,7)  |   32800 |  10499 | 32 |

Any chance you could send those again, with less linebreak damage? They
are really hard to read this way...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Alvaro Herrera
Here's a reformatted copy.  I think this is the same bug as Peter G.
reported in 
http://www.postgresql.org/message-id/CAM3SWZTMQiCi5PV5OWHb+bYkUcnCk=o67w0csswpvv7xfuc...@mail.gmail.com

I have a hunch that this is related to the heap_lock_updated business.
I haven't investigated yet.


Greg Stark wrote:
 I have a database where a couple of rows don't appear in index scans
 but do appear in sequential scans. It looks like the same problem as
 Peter reported but this is a different database. I've extracted all
 the xlogdump records and below are the ones I think are relevant. You
 can see that lp 2 gets a few HOT updates and concurrently has someone
 create a MultiXact NO KEY UPDATE lock while one of those HOT updates
 is pending but not committed. The net result seems to be that the ctid
 update chain got broken. The index of course points to the head of the
 HOT chain so it doesn't find the live tail whereas the sequential scan
 picks it up.
 
 I don't see any evidence of MultiXactId wraparound, the members run to
 001F and the offsets run to 000B. This is on a standby that's been
 activated but afaik that shouldn't change these files any more right?
 
 rmgr: Heaplen (rec/tot):235/   267, tx:5943845, lsn: 
 FD/2F0A3640, prev FD/2F0A3600, bkp: , desc: insert: rel 
 1663/16385/212653; tid 13065/2
 rmgr: Transaction len (rec/tot): 12/44, tx:5943845, lsn: 
 FD/2F0A8178, prev FD/2F0A8148, bkp: , desc: commit: 2014-02-19 
 20:41:23.698513 UTC
 rmgr: Heaplen (rec/tot): 25/57, tx:5943847, lsn: 
 FD/2F0AA440, prev FD/2F0AA3F8, bkp: , desc: lock 5943847: rel 
 1663/16385/212653; tid 13065/2 LOCK_ONLY KEYSHR_LOCK
 rmgr: Transaction len (rec/tot): 12/44, tx:5943847, lsn: 
 FD/2F0AA480, prev FD/2F0AA440, bkp: , desc: commit: 2014-02-19 
 20:41:23.713969 UTC
 rmgr: Heaplen (rec/tot):291/   323, tx:5943849, lsn: 
 FD/2F0ADFC0, prev FD/2F0ADF90, bkp: , desc: hot_update: rel 
 1663/16385/212653; tid 13065/2 xmax 5943849 ; new tid 13065/3 xmax 0
 rmgr: Heap2   len (rec/tot): 25/57, tx:5943851, lsn: 
 FD/2F0AE450, prev FD/2F0AE408, bkp: , desc: lock updated: xmax 5943851 
 msk 000a; rel 1663/16385/212653; tid 13065/3
 rmgr: MultiXact   len (rec/tot): 28/60, tx:5943851, lsn: 
 FD/2F0AE490, prev FD/2F0AE450, bkp: , desc: create mxid 728896 offset 
 1632045 nmembers 2: 5943849 (nokeyupd) 5943851 (keysh)
 rmgr: Heaplen (rec/tot): 25/57, tx:5943851, lsn: 
 FD/2F0AE4D0, prev FD/2F0AE490, bkp: , desc: lock 728896: rel 
 1663/16385/212653; tid 13065/2 IS_MULTI EXCL_LOCK
 rmgr: Transaction len (rec/tot): 12/44, tx:5943849, lsn: 
 FD/2F0AE510, prev FD/2F0AE4D0, bkp: , desc: commit: 2014-02-19 
 20:41:23.744989 UTC
 rmgr: Transaction len (rec/tot): 12/44, tx:5943851, lsn: 
 FD/2F0AE570, prev FD/2F0AE540, bkp: , desc: commit: 2014-02-19 
 20:41:23.746820 UTC
 rmgr: Heaplen (rec/tot):306/   338, tx:5943879, lsn: 
 FD/2F103788, prev FD/2F103758, bkp: , desc: hot_update: rel 
 1663/16385/212653; tid 13065/3 xmax 5943879 ; new tid 13065/4 xmax 0
 rmgr: Transaction len (rec/tot): 12/44, tx:5943879, lsn: 
 FD/2F1038E0, prev FD/2F103788, bkp: , desc: commit: 2014-02-19 
 20:41:24.580827 UTC
 rmgr: Heaplen (rec/tot):306/   338, tx:5943880, lsn: 
 FD/2F103910, prev FD/2F1038E0, bkp: , desc: hot_update: rel 
 1663/16385/212653; tid 13065/4 xmax 5943880 ; new tid 13065/7 xmax 0
 rmgr: Transaction len (rec/tot): 12/44, tx:5943880, lsn: 
 FD/2F106070, prev FD/2F106030, bkp: , desc: commit: 2014-02-19 
 20:41:24.617048 UTC
 
 
  lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 |  t_ctid   | t_infomask2 | t_infomask | t_hoff |
 ----+--------+----------+--------+---------+---------+----------+-----------+-------------+------------+--------+-
   2 |   3424 |        1 |    232 | 5943845 |  728896 |        0 | (13065,2) |          32 |       4419 |     32 |
   3 |   3152 |        1 |    272 | 5943849 | 5943879 |        0 | (13065,4) |       49184 |       9475 |     32 |
   4 |   2864 |        1 |    287 | 5943879 | 5943880 |        0 | (13065,7) |       49184 |       9475 |     32 |
   7 |   2576 |        1 |    287 | 5943880 |       0 |        0 | (13065,7) |       32800 |      10499 |     32 |
 
 
 -- 
 greg
 
 
 -- 
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Andrew Dunstan


On 02/24/2014 02:15 PM, Andrew Dunstan wrote:



Having had my schedule very seriously disrupted by the storm in the US 
South East a week or so ago, I am finally getting back to being able 
to devote some time to jsonb. I hope to have new patches available 
today or tomorrow at the latest.






Update to this: A recent commit caused an unfortunate merge conflict in 
the hstore code that I have asked Teodor to resolve. I can't post new 
clean patches until that's been done.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
Hi,

On 2014-02-24 17:55:14 -0300, Alvaro Herrera wrote:
 Greg Stark wrote:
  I have a database where a couple of rows don't appear in index scans
  but do appear in sequential scans. It looks like the same problem as
  Peter reported but this is a different database. I've extracted all
  the xlogdump records and below are the ones I think are relevant. You
  can see that lp 2 gets a few HOT updates and concurrently has someone
  create a MultiXact NO KEY UPDATE lock while one of those HOT updates
  is pending but not committed.

Per se the sequence of records doesn't look bad (even though I am not
happy that we log intermediate and final rows first, and only then the
start of the chain).

  The net result seems to be that the ctid
  update chain got broken. The index of course points to the head of the
  HOT chain so it doesn't find the live tail whereas the sequential scan
  picks it up.

Yea, that's the problem.

  rmgr: Heaplen (rec/tot):291/   323, tx:5943849, lsn: 
  FD/2F0ADFC0, prev FD/2F0ADF90, bkp: , desc: hot_update: rel 
  1663/16385/212653; tid 13065/2 xmax 5943849 ; new tid 13065/3 xmax 0
  rmgr: Heap2   len (rec/tot): 25/57, tx:5943851, lsn: 
  FD/2F0AE450, prev FD/2F0AE408, bkp: , desc: lock updated: xmax 5943851 
  msk 000a; rel 1663/16385/212653; tid 13065/3
  rmgr: MultiXact   len (rec/tot): 28/60, tx:5943851, lsn: 
  FD/2F0AE490, prev FD/2F0AE450, bkp: , desc: create mxid 728896 offset 
  1632045 nmembers 2: 5943849 (nokeyupd) 5943851 (keysh)
  rmgr: Heaplen (rec/tot): 25/57, tx:5943851, lsn: 
  FD/2F0AE4D0, prev FD/2F0AE490, bkp: , desc: lock 728896: rel 
  1663/16385/212653; tid 13065/2 IS_MULTI EXCL_LOCK

   lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 |  t_ctid   | t_infomask2 | t_infomask | t_hoff |
  ----+--------+----------+--------+---------+---------+----------+-----------+-------------+------------+--------+-
    2 |   3424 |        1 |    232 | 5943845 |  728896 |        0 | (13065,2) |          32 |       4419 |     32 |
    3 |   3152 |        1 |    272 | 5943849 | 5943879 |        0 | (13065,4) |       49184 |       9475 |     32 |
    4 |   2864 |        1 |    287 | 5943879 | 5943880 |        0 | (13065,7) |       49184 |       9475 |     32 |
    7 |   2576 |        1 |    287 | 5943880 |       0 |        0 | (13065,7) |       32800 |      10499 |     32 |

Those together explain the story. Note this bit:

static void
heap_xlog_lock(XLogRecPtr lsn, XLogRecord *record)
{
...
    HeapTupleHeaderClearHotUpdated(htup);
    HeapTupleHeaderSetXmax(htup, xlrec->locking_xid);
    HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
    /* Make sure there is no forward chain link in t_ctid */
    htup->t_ctid = xlrec->target.tid;
...
}

So, the replay of FD/2F0AE4D0 breaks the ctid chain *and* unsets the
HOT_UPDATED flag.

Which means fkey locks have never properly worked across SR/crash
recovery.

Haven't thought about how to fix it yet; I hope I won't have to (hint hint).

We somehow need to have a policy of testing changes to the WAL format
without full_page_writes. They hide bugs in replay far, far too often.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
On 2014-02-24 22:17:31 +0100, Andres Freund wrote:
 Those together explain the story. Note this bit:
 
 static void
 heap_xlog_lock(XLogRecPtr lsn, XLogRecord *record)
 {
 ...
     HeapTupleHeaderClearHotUpdated(htup);
     HeapTupleHeaderSetXmax(htup, xlrec->locking_xid);
     HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
     /* Make sure there is no forward chain link in t_ctid */
     htup->t_ctid = xlrec->target.tid;
 ...
 }
 
 So, the replay of FD/2F0AE4D0 breaks the ctid chain *and* unsets the
 HOT_UPDATED flag.

Some quick archeology shows that the HeapTupleHeaderClearHotUpdated()
was in the original HOT commit (282d2a03d) and clearing of t_ctid was in
the original commit implementing FOR SHARE (bedb78d38). Both look like
they are copied from other places, I don't immediately see any need for
them here...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.7

2014-02-24 Thread Robert Haas
On Mon, Feb 24, 2014 at 10:11 AM, Andres Freund and...@2ndquadrant.com wrote:
 Changes in this version include:
 * changed slot error handling log by introducing ephemeral slots which
   get dropped on errors. This is the biggest change.
 * added quoting in the test_decoding output plugin
 * closing of a tight race condition during slot creation where WAL could
   have been removed
 * comment and other adjustments, many of them noticed by robert

I did another read-through of this this afternoon, focusing on the
stuff you changed and parts I hadn't looked at carefully yet.
Comments below.

Documentation needs to be updated for pg_stat_replication view.

I still think pg_create_logical_replication_slot should be in slotfuncs.c.

 /* Size of an indirect datum that contains an indirect TOAST pointer */
 #define INDIRECT_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(struct
varatt_indirect))

+/* Size of an indirect datum that contains a standard TOAST pointer */
+#define INDIRECT_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(struct
varatt_indirect))

Isn't the new hunk a duplicate of the existing definition, except for
a one-word change to the comment?

I don't think the completely-unsecured directory operations in
test_decoding_regsupport.c are acceptable.  Tom fought tooth and nail
to make sure that similar capabilities in adminpack carried meaningful
security restrictions.

+   /*
+    * Check whether there are, possibly unconnected, logical slots that refer
+    * to the to-be-dropped database. The database lock we are holding
+    * prevents the creation of new slots using the database.
+    */
+   if (ReplicationSlotsCountDBSlots(db_id, &nslots, &nslots_active))
+       ereport(ERROR,
+               (errcode(ERRCODE_OBJECT_IN_USE),
+                errmsg("database \"%s\" is used in a logical decoding slot",
+                       dbname),
+                errdetail("There are %d slot(s), %d of them active",
+                          nslots, nslots_active)));

What are you going to do when we get around to supporting this on a
standby?  Whatever the answer is, maybe add a TODO comment.

+ * loop for now..
+* more than twice..

Extra periods.

+* The replicatio slot mechanism is used to prevent removal of required

Typo.

+
+   /*
+* GetRunningTransactionData() acquired ProcArrayLock, we must release
+* it. We can do that before inserting the WAL record because
+* ProcArrayApplyRecoveryInfo can recheck the commit status using the
+* clog. If we're doing logical replication we can't do that though, so
+* hold the lock for a moment longer.
+*/
+   if (wal_level < WAL_LEVEL_LOGICAL)
+   LWLockRelease(ProcArrayLock);
+
recptr = LogCurrentRunningXacts(running);

+   /* Release lock if we kept it longer ... */
+   if (wal_level >= WAL_LEVEL_LOGICAL)
+   LWLockRelease(ProcArrayLock);
+

This seems unfortunate.  The comment should clearly explain why it's necessary.

+   /*
+* Startup logical state, needs to be setup now so we have proper data
+* during restore.
+*/
+   StartupReorderBuffer();

Should add blank line before this.

+   CheckPointSnapBuild();
+   CheckpointLogicalRewriteHeap();

Shouldn't the capitalization be consistent?

-   heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
+   if (IsSystemRelation(scan->rs_rd)
+       || RelationIsAccessibleInLogicalDecoding(scan->rs_rd))
+       heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
+   else
+       heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalDataXmin);

Instead of changing the callers of heap_page_prune_opt() in this way,
I think it might be better to change heap_page_prune_opt() to take
only the first two of its current three parameters; everybody's just
passing RecentGlobalXmin right now anyway. Then, we could change the
first check in heap_page_prune_opt() to check first whether
PageIsPrunable(page, RecentGlobalDataXmin).  If not, give up.  If so,
then check that (!IsSystemRelation(scan->rs_rd) &&
!RelationIsAccessibleInLogicalDecoding(scan->rs_rd)) ||
PageIsPrunable(page, RecentGlobalXmin)).  The advantage of this is
that we avoid code duplication, and we avoid checking a couple of
conditions if pd_prune_xmin is very recent.
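
Roughly, the restructured check could look something like the sketch below
(assuming the third parameter is dropped and that RecentGlobalXmin and
RecentGlobalDataXmin are visible in pruneheap.c; the rest of the pruning
logic is elided, so this is only the shape of the test, not a tested patch):

/* needs catalog/catalog.h, storage/bufmgr.h, storage/bufpage.h, utils/rel.h */
void
heap_page_prune_opt(Relation relation, Buffer buffer)
{
    Page        page = BufferGetPage(buffer);

    /* Cheapest test first: nothing is prunable even against the data-only horizon. */
    if (!PageIsPrunable(page, RecentGlobalDataXmin))
        return;

    /*
     * Catalog-like relations must honor the conservative horizon as well, so
     * for them additionally require prunability against RecentGlobalXmin.
     */
    if ((IsSystemRelation(relation) ||
         RelationIsAccessibleInLogicalDecoding(relation)) &&
        !PageIsPrunable(page, RecentGlobalXmin))
        return;

    /* ... the existing free-space and conditional-lock logic follows ... */
}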

-   if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval || forceSyncCommit)
+   if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval || forceSyncCommit ||
+       XLogLogicalInfoActive())

Mmph.  Is this really necessary?  If so, why?  The comments could elucidate.

+   bool fail_softly = slot->data.persistency == RS_EPHEMERAL;

This should be contingent on whether we're being called in the error
pathway, not the slot type.  I think you should pass a bool.


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Peter Geoghegan
On Mon, Feb 24, 2014 at 1:17 PM, Andres Freund and...@2ndquadrant.com wrote:
 We somehow need to have a policy of testing changes to the WAL format
 without full_page_writes. They hide bugs in replay far, far too often.

What's the easiest way to get atomic page writes at the FS level on
your laptop? ZFS or some data journaling FS, I suppose.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.7

2014-02-24 Thread Andres Freund
Hi,

On 2014-02-24 17:06:53 -0500, Robert Haas wrote:
 I still think pg_create_logical_replication_slot should be in slotfuncs.c.

Ok, I don't feel too strongly, so I can change it. I wanted to keep
logical/ stuff out of slotfuncs.c, but there's not really a strong
reason for that.

 I don't think the completely-unsecured directory operations in
 test_decoding_regsupport.c are acceptable.  Tom fought tooth and nail
 to make sure that similar capabilities in adminpack carried meaningful
 security restrictions.

I actually thought they'd be too ugly to live and we'd remove them
pre-commit.

There's no security problem though afaics, since they aren't actually
created, and you need to be superuser to create C functions.

 +   /*
 +    * Check whether there are, possibly unconnected, logical slots that refer
 +    * to the to-be-dropped database. The database lock we are holding
 +    * prevents the creation of new slots using the database.
 +    */
 +   if (ReplicationSlotsCountDBSlots(db_id, &nslots, &nslots_active))
 +       ereport(ERROR,
 +               (errcode(ERRCODE_OBJECT_IN_USE),
 +                errmsg("database \"%s\" is used in a logical decoding slot",
 +                       dbname),
 +                errdetail("There are %d slot(s), %d of them active",
 +                          nslots, nslots_active)));
 
 What are you going to do when we get around to supporting this on a
 standby?  Whatever the answer is, maybe add a TODO comment.

I think it should actually mostly work out, anybody actively connected
to a slot will be kicked of (normal HS mechanisms)... But the slot would
currently live on which obviously isn't nice.

Will add TODO.

 +   /*
 +* GetRunningTransactionData() acquired ProcArrayLock, we must release
 +* it. We can do that before inserting the WAL record because
 +* ProcArrayApplyRecoveryInfo can recheck the commit status using the
 +* clog. If we're doing logical replication we can't do that though, 
 so
 +* hold the lock for a moment longer.
 +*/
 +   if (wal_level < WAL_LEVEL_LOGICAL)
 +   LWLockRelease(ProcArrayLock);
 +
 recptr = LogCurrentRunningXacts(running);
 
 +   /* Release lock if we kept it longer ... */
 +   if (wal_level >= WAL_LEVEL_LOGICAL)
 +   LWLockRelease(ProcArrayLock);
 +

 This seems unfortunate.  The comment should clearly explain why it's 
 necessary.

There's another (existing) comment on top of the function giving a bit
more context, but I'll expand.

I'd actually prefer to remove that special case altogether; I don't
have much trust in those codepaths for HS... But that's not an argument
I want to fight out right now.

 -   heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
 +   if (IsSystemRelation(scan->rs_rd)
 +       || RelationIsAccessibleInLogicalDecoding(scan->rs_rd))
 +       heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
 +   else
 +       heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalDataXmin);
 
 Instead of changing the callers of heap_page_prune_opt() in this way,
 I think it might be better to change heap_page_prune_opt() to take
 only the first two of its current three parameters; everybody's just
 passing RecentGlobalXmin right now anyway.

Sounds like a plan.

 -   if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval || forceSyncCommit)
 +   if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval || forceSyncCommit ||
 +       XLogLogicalInfoActive())
 
 Mmph.  Is this really necessary?  If so, why?  The comments could elucidate.

We could get rid of it by (optionally) adding information about the
database oid to compact commit, but that'd increase the size of the
record.

 +   bool fail_softly = slot->data.persistency == RS_EPHEMERAL;
 
 This should be contingent on whether we're being called in the error
 pathway, not the slot type.  I think you should pass a bool.

Why? I had it that way at first, but for persistent slots this won't be
called in error pathways as we won't drop there.

 There are a bunch of places where you're testing IsSystemRelation() ||
 RelationIsAccessibleInLogicalDecoding().  Maybe we need a macro
 encapsulating that test, with a name chose to explain the point of it.

Sounds like an idea.
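
For concreteness, one possible shape for such a macro (the name is made up
here, purely for illustration):

/* true if pruning this relation must use the conservative RecentGlobalXmin horizon */
#define RelationRequiresConservativeXmin(relation) \
    (IsSystemRelation(relation) || \
     RelationIsAccessibleInLogicalDecoding(relation))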

  It seems to be indicating, roughly, whether the relation should
 participate in RecentGlobalXmin or RecentGlobalDataXmin.  But is there
 any point at all of separating those when !XLogLogicalInfoActive()?
 The test expands to:
 
 IsSystemRelation() || (XLogLogicalInfoActive() &&
 RelationNeedsWAL(relation) && (IsCatalogRelation(relation) ||
 RelationIsUsedAsCatalogTable(relation)))
 
 So basically this is all tables created in pg_catalog during initdb
 plus all TOAST tables in the system.  If 

Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
On 2014-02-24 15:05:37 -0800, Peter Geoghegan wrote:
 On Mon, Feb 24, 2014 at 1:17 PM, Andres Freund and...@2ndquadrant.com wrote:
  We somehow need to have a policy of testing changes to the WAL format
  without full_page_writes. They hide bugs in replay far, far too often.
 
 What's the easiest way to get atomic page writes at the FS level on
 your laptop? ZFS or some data journaling FS, I suppose.

TBH I don't care about torn pages during normal testing. I don't want to
suggest disabling it for real workloads with real data, just that it's
important to do so during development/testing of WAL related code,
because otherwise it will hide/fixup many errors.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Peter Geoghegan
On Mon, Feb 24, 2014 at 3:17 PM, Andres Freund and...@2ndquadrant.com wrote:
 TBH I don't care about torn pages during normal testing. I don't want to
 suggest disabling it for real workloads with real data, just that it's
 important to do so during development/testing of WAL related code,
 because otherwise it will hide/fixup many errors.

Sure, but you might want to account for torn pages anyway.
Particularly if you're interested in some degree of automation, as we
all seem to be.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
On 2014-02-24 15:20:13 -0800, Peter Geoghegan wrote:
 On Mon, Feb 24, 2014 at 3:17 PM, Andres Freund and...@2ndquadrant.com wrote:
  TBH I don't care about torn pages during normal testing. I don't want to
  suggest disabling it for real workloads with real data, just that it's
  important to do so during development/testing of WAL related code,
  because otherwise it will hide/fixup many errors.
 
 Sure, but you might want to account for torn pages anyway.
 Particularly if you're interested in some degree of automation, as we
 all seem to be.

Hm, well. I have to admit, if a test machine crashes, I'd just rebuild
the cluster. Unless I am working on crash safety testing in which case
I'd probably not have full_page_writes disabled...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

2014-02-24 Thread Haribabu Kommi
On Tue, Feb 25, 2014 at 10:44 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 Thanks for your testing,

  Getting some compilation warnings while compiling the extension and also
  I am not able to load the extension because of undefined symbol
  get_restriction_qual_cost.
 
 It seems to me you applied only part-1 portion of the custom-scan patches.

 get_restriction_qual_cost() is re-defined as an extern function (it was
 static) in the part-2 portion (ctidscan), so that extensions can estimate
 the cost of qualifiers. Also, bms_to_string() and bms_from_string() are
 added in the part-3 portion (postgres_fdw) to carry a bitmapset in the
 copyObject manner.

 It may make sense to include the above supplemental changes in the cache-
 scan feature as well, until either of them gets upstreamed.


Thanks for the information. I will apply the other patches also and start
testing.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-24 Thread Michael Paquier
On Tue, Feb 25, 2014 at 12:41 AM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Feb 19, 2014 at 8:27 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  On Thu, Feb 20, 2014 at 8:55 AM, Andres Freund and...@2ndquadrant.com
 wrote:
  On 2014-02-19 12:47:40 -0500, Robert Haas wrote:
  On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
  michael.paqu...@gmail.com wrote:
   Yes, that's a good precedent in multiple ways.
   Here are updated patches to use pg_lsn instead of pglsn...
 
  OK, so I think this stuff is all committed now, with assorted changes.
   Thanks for your work on this.
 
  cool, thanks you two.
 
  I wonder if pg_stat_replication shouldn't be updated to use it as well?
  SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows
  that as names that are possible candidates for conversion.
  I was sure to have forgotten some views or functions in the previous
  patch... Please find attached a patch making pg_stat_replication use
  pg_lsn instead of the existing text fields.
  Regards,

 Committed.  Do we want to do anything about pageinspect?

Thanks. We're dealing with that on another thread, I'll send an updated
patch there.
-- 
Michael


Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Alvaro Herrera
While we're messing with this, I wonder if there's any way to have
infomask and infomask2 displayed in hex format rather than plain int
without having to specify that in every query.  I'm not well known for
being able to do such conversions off the top of my head ...

(Not that it's this patch's responsibility to do that.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL: better default ciphersuite

2014-02-24 Thread Peter Eisentraut
committed


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Minor comment improvements in tablecmds.c

2014-02-24 Thread Etsuro Fujita
This is a small patch to improve comments in tablecmds.c.  Please find
attached a patch.

Thanks,

Best regards,
Etsuro Fujita
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 08b037e..ed9d206 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -2468,7 +2468,8 @@ RenameConstraint(RenameStmt *stmt)
 }
 
 /*
- * Execute ALTER TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE RENAME
+ * Execute ALTER TABLE/INDEX/SEQUENCE/VIEW/MATERIALIZED VIEW/FOREIGN TABLE
+ * RENAME
  */
 Oid
 RenameRelation(RenameStmt *stmt)
@@ -2476,8 +2477,9 @@ RenameRelation(RenameStmt *stmt)
Oid relid;
 
/*
-* Grab an exclusive lock on the target table, index, sequence or view,
-* which we will NOT release until end of transaction.
+* Grab an exclusive lock on the target table, index, sequence, view,
+* materialized view, or foreign table, which we will NOT release until
+* end of transaction.
 *
	 * Lock level used here should match RenameRelationInternal, to avoid lock
	 * escalation.
@@ -2520,8 +2522,9 @@ RenameRelationInternal(Oid myrelid, const char *newrelname, bool is_internal)
Oid namespaceId;
 
/*
-* Grab an exclusive lock on the target table, index, sequence or view,
-* which we will NOT release until end of transaction.
+* Grab an exclusive lock on the target table, index, sequence, view,
+* materialized view, or foreign table, which we will NOT release until
+* end of transaction.
 */
targetrelation = relation_open(myrelid, AccessExclusiveLock);
namespaceId = RelationGetNamespace(targetrelation);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2014-02-24 Thread Rajeev rastogi
On 04 February 2014 14:38, Myself wrote:

 
 On 4th February 2014, Christian kruse Wrote:
  On 04/02/14 12:38, Fujii Masao wrote:
   ISTM that the phrase Request queue is not used much around the
 lock.
   Using the phrase wait queue or Simon's suggestion sound better to
  at least me.
   Thought?
 
  Sounds reasonable to me. Attached patch changes messages to the
  following:
 
  Process holding the lock: A. Wait queue: B.
  Processes holding the lock: A, B. Wait queue: C.
 
 This looks good to me also.

I have tested the revised patch and found it ready to be committed.

I am marking this as Ready for Committer.

Thanks and Regards,
Kumar Rajeev Rastogi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Kouhei Kaigai
  /* Hook for plugins to add custom join path, in addition to default ones */
  typedef void (*add_join_path_hook_type)(PlannerInfo *root,
                                          RelOptInfo *joinrel,
                                          RelOptInfo *outerrel,
                                          RelOptInfo *innerrel,
                                          JoinType jointype,
                                          SpecialJoinInfo *sjinfo,
                                          List *restrictlist,
                                          List *mergeclause_list,
                                          SemiAntiJoinFactors *semifactors,
                                          Relids param_source_rels,
                                          Relids extra_lateral_rels);
  extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;
 
  Likely, the arguments above restrictlist should be passed to
  extensions, because these are also arguments of add_paths_to_joinrel().
  However, I'm not 100% certain which of the other arguments should be passed.
  Probably, it makes sense to pass param_source_rels and
  extra_lateral_rels, to check whether the path is sensible for parameterized
  paths.
  On the other hand, I doubt whether mergeclause_list is useful to deliver.
  (It may make sense if someone tries to implement their own merge-join
  implementation??)
 
  I'd like to see ideas to improve the current interface specification.
 
 I've read the code path to add a custom join again, and felt that providing
 semifactors seems not necessary for the first cut, because it is used only in
 initial_cost_nestloop (final_cost_nestloop receives semifactors but
 it is not used there), and an external module would not become that smart
 before the 9.5 development cycle.  It seems complex enough to postpone
 determining whether it's essential for add_join_path_hook.
  Do you have any concrete use case for the parameter?
 
The reason why I asked the question above is that I haven't been 100% certain
about its usage. Indeed, semifactors is used only in a limited way, and it is
quite easy for an extension to reproduce it later (using clauselist_selectivity)
if the extension wants this factor. So, I agree with removing the semifactors
here.

 mergeclause_list and param_source_rels seem little easier to use, but
 anyway it should be documented how to use those parameters.

The mergeclause_list might not be sufficient for an extension to determine
whether its own mergejoin is applicable here. See the comment below, which
is at the head of select_mergejoin_clauses.

|  * *mergejoin_allowed is normally set to TRUE, but it is set to FALSE if
|  * this is a right/full join and there are nonmergejoinable join clauses.
|  * The executor's mergejoin machinery cannot handle such cases, so we have
|  * to avoid generating a mergejoin plan.  (Note that this flag does NOT
|  * consider whether there are actually any mergejoinable clauses.  This is
|  * correct because in some cases we need to build a clauseless mergejoin.
|  * Simply returning NIL is therefore not enough to distinguish safe from
|  * unsafe cases.)
| 
It says that mergejoin_clause == NIL is not a sufficient check to determine
whether the mergejoin logic is applicable to the target join.
So, either of these is probably an option for an extension that tries to
implement its own mergejoin logic: (1) passing both mergejoin_allowed and
mergeclause_list as arguments of the hook, or (2) re-defining
select_mergejoin_clauses() as an extern function to reproduce the variables on
demand. Which one is preferable?

BTW, I found a duplicate clause_sides_match_join() definition, that is
invoked at select_mergejoin_clauses(), in joinpath.c and analyzejoins.c.
Either of them should be eliminated, I think.

For param_source_rels and extra_lateral_rels, I'll put source code comments
around add_join_path_hook.
The earlier half of try_(nestloop|hashjoin|mergejoin)_path is probably useful
as an example of an extension implementation.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: Shigeru Hanada [mailto:shigeru.han...@gmail.com]
 Sent: Tuesday, February 25, 2014 12:41 AM
 To: Kohei KaiGai
 Cc: Kaigai, Kouhei(海外, 浩平); Stephen Frost; Jim Mlodgenski; Robert Haas;
 Tom Lane; PgHacker; Peter Eisentraut
 Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
 
 Hi Kaigai-san,
 
 Sorry to leave the thread for a while.
 
 2014-02-23 22:24 GMT+09:00 Kohei KaiGai kai...@kaigai.gr.jp:
  (1) Interface to add alternative paths in addition to built-in join
  paths
 
  This patch adds add_join_path_hook on add_paths_to_joinrel to allow
  extensions to provide alternative scan path in addition to the
  built-in join paths. Custom-scan path being added is assumed to
  perform to scan on a (virtual) relation that is a result set of joining
 relations.
  My concern is its arguments to be pushed. This hook is declared as follows:
 
  /* Hook 

Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Michael Paquier
On Mon, Feb 24, 2014 at 11:34 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

 Andres Freund escribió:
  On 2014-02-24 17:53:31 +0900, Michael Paquier wrote:
   On Sun, Feb 23, 2014 at 6:09 AM, Alvaro Herrera 
alvhe...@2ndquadrant.com wrote:
  
Michael Paquier escribió:
 Hi all,

 Please find attached a patch to dump pageinspect to 1.2. This
 simply changes page_header to use the new internal datatype
 pg_lsn instead of text.
   
Uhm.  Does this crash if you pg_upgrade a server that has 1.1
installed?
   
   Oops yes you're right., it is obviously broken. Just re-thinking
   about that, dumping this module just to change an argument type does
   not seem to be worth the code complication. Thoughts?
 
  It seem simple to support both, old and new type. page_header() (which
  IIRC is the only thing returning an LSN) likely uses
  get_call_result_type(). So you can just check what it's
  expecting. Either support both types or error out if it's the old
  extension version.

 Yeah, erroring out seems good enough.  Particularly if you add a hint
 saying please upgrade the extension.
Done this way by checking the type OID of TupleDesc returned by
get_call_result_type. Supporting both formats just adds complexity for
no real gain.
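
For reference, the shape of that check is roughly as follows (a sketch rather
than the patch text; the helper name is made up, and the test simply rejects
the pre-1.2 signature whose first output column was text):

#include "postgres.h"
#include "funcapi.h"
#include "catalog/pg_type.h"

static void
check_page_header_signature(FunctionCallInfo fcinfo)
{
    TupleDesc   tupdesc;

    if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
        elog(ERROR, "return type must be a row type");

    /* the 1.1 definition declared the first output column (lsn) as text */
    if (tupdesc->attrs[0]->atttypid == TEXTOID)
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("page_header() has an obsolete return type"),
                 errhint("Update the pageinspect extension to version 1.2.")));
}
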
-- 
Michael
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index 0e267eb..ee78cb2 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -4,8 +4,8 @@ MODULE_big	= pageinspect
 OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.1.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.2.sql pageinspect--1.0--1.1.sql \
+	pageinspect--1.1--1.2.sql pageinspect--unpackaged--1.0.sql
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/pageinspect/pageinspect--1.1--1.2.sql b/contrib/pageinspect/pageinspect--1.1--1.2.sql
new file mode 100644
index 000..5e23ca4
--- /dev/null
+++ b/contrib/pageinspect/pageinspect--1.1--1.2.sql
@@ -0,0 +1,18 @@
+/* contrib/pageinspect/pageinspect--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pageinspect UPDATE TO '1.2'" to load this file. \quit
+
+DROP FUNCTION page_header(bytea);
+CREATE FUNCTION page_header(IN page bytea,
+OUT lsn pg_lsn,
+OUT checksum smallint,
+OUT flags smallint,
+OUT lower smallint,
+OUT upper smallint,
+OUT special smallint,
+OUT pagesize smallint,
+OUT version smallint,
+OUT prune_xid xid)
+AS 'MODULE_PATHNAME', 'page_header'
+LANGUAGE C STRICT;
diff --git a/contrib/pageinspect/pageinspect--1.1.sql b/contrib/pageinspect/pageinspect--1.1.sql
deleted file mode 100644
index 22a47d5..000
--- a/contrib/pageinspect/pageinspect--1.1.sql
+++ /dev/null
@@ -1,107 +0,0 @@
-/* contrib/pageinspect/pageinspect--1.1.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION pageinspect" to load this file. \quit
-
---
--- get_raw_page()
---
-CREATE FUNCTION get_raw_page(text, int4)
-RETURNS bytea
-AS 'MODULE_PATHNAME', 'get_raw_page'
-LANGUAGE C STRICT;
-
-CREATE FUNCTION get_raw_page(text, text, int4)
-RETURNS bytea
-AS 'MODULE_PATHNAME', 'get_raw_page_fork'
-LANGUAGE C STRICT;
-
---
--- page_header()
---
-CREATE FUNCTION page_header(IN page bytea,
-OUT lsn text,
-OUT checksum smallint,
-OUT flags smallint,
-OUT lower smallint,
-OUT upper smallint,
-OUT special smallint,
-OUT pagesize smallint,
-OUT version smallint,
-OUT prune_xid xid)
-AS 'MODULE_PATHNAME', 'page_header'
-LANGUAGE C STRICT;
-
---
--- heap_page_items()
---
-CREATE FUNCTION heap_page_items(IN page bytea,
-OUT lp smallint,
-OUT lp_off smallint,
-OUT lp_flags smallint,
-OUT lp_len smallint,
-OUT t_xmin xid,
-OUT t_xmax xid,
-OUT t_field3 int4,
-OUT t_ctid tid,
-OUT t_infomask2 integer,
-OUT t_infomask integer,
-OUT t_hoff smallint,
-OUT t_bits text,
-OUT t_oid oid)
-RETURNS SETOF record
-AS 'MODULE_PATHNAME', 'heap_page_items'
-LANGUAGE C STRICT;
-
---
--- bt_metap()
---
-CREATE FUNCTION bt_metap(IN relname text,
-OUT magic int4,
-OUT version int4,
-OUT root int4,
-OUT level int4,
-OUT fastroot int4,
-OUT fastlevel int4)
-AS 'MODULE_PATHNAME', 'bt_metap'
-LANGUAGE C STRICT;
-
---
--- bt_page_stats()
---
-CREATE FUNCTION bt_page_stats(IN relname text, IN blkno int4,
-OUT blkno int4,
-OUT type "char",
-OUT live_items int4,
-OUT dead_items int4,
-OUT avg_item_size int4,
-OUT page_size int4,
-OUT free_size int4,
-OUT btpo_prev int4,
-OUT btpo_next int4,
-OUT btpo int4,
-OUT btpo_flags int4)
-AS 'MODULE_PATHNAME', 'bt_page_stats'
-LANGUAGE C STRICT;
-
---
--- bt_page_items()
---
-CREATE FUNCTION bt_page_items(IN relname text, IN blkno int4,
-OUT itemoffset smallint,
-OUT ctid tid,
-

Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Michael Paquier
On Tue, Feb 25, 2014 at 10:02 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 While we're messing with this, I wonder if there's any way to have
 infomask and infomask2 displayed in hex format rather than plain int
 without having to specify that in every query.  I'm not well known for
 being able to do such conversions off the top of my head ...
Something like calling DirectFunctionCall1 with to_hex and returning the
infomask fields as text values instead of integers? This looks
straightforward.
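
As a rough sketch of that idea (the helper name is made up; to_hex32 is the C
function behind SQL to_hex(int)):

#include "postgres.h"
#include "fmgr.h"
#include "utils/builtins.h"

/* Render an infomask word as a hex text Datum instead of an int4 Datum. */
static Datum
infomask_to_hex_datum(uint16 infomask)
{
    return DirectFunctionCall1(to_hex32, Int32GetDatum((int32) infomask));
}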

 (Not that it's this patch's responsibility to do that.)
Definitely a different patch.

Don't you think that this would break existing applications? I see
more flexibility in keeping them as they are now, as integers; users can
always tune their queries to do this post-processing with to_hex
themselves, as they've (always?) been doing.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Bit data type header reduction in some cases

2014-02-24 Thread Haribabu Kommi
Hi,

It's regarding the Todo item "Bit data type header reduction in some
cases". The header contains two parts: 1) the varlena header, which is
automatically converted from a 4-byte to a 1-byte header for small
data, and 2) the bit-length header, called bit_len, which stores the actual
bit length and is 4 bytes in size. I want to reduce this bit_len size to 1
byte in some cases, similar to the varlena header. With this change the size
of the column is reduced by 3 bytes, which gives a very good decrease in disk
usage.

I am planning to change it based on the total length of the bits data. If
the number of bits is less than 0x7F then a one-byte header can be chosen
instead of a 4-byte header. When reading the data, the header size can be
determined in the same way.
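
A minimal sketch of that kind of short/long length header, modelled loosely
on the 1-byte vs 4-byte varlena header (the macro and function names, and the
use of the high bit of the first byte as the long-format flag, are assumptions
for illustration only):

#define BITLEN_SHORT_MAX    0x7F    /* fits in one byte with the flag bit clear */
#define BITLEN_IS_SHORT(p)  ((((const unsigned char *) (p))[0] & 0x80) == 0)

/* number of header bytes needed for a given bit length */
static int
bitlen_header_size(int bitlen)
{
    return (bitlen <= BITLEN_SHORT_MAX) ? 1 : 4;
}

/* write the bit length in short (1-byte) or long (4-byte) form */
static void
bitlen_write(unsigned char *p, int bitlen)
{
    if (bitlen <= BITLEN_SHORT_MAX)
        p[0] = (unsigned char) bitlen;              /* 0xxxxxxx */
    else
    {
        p[0] = 0x80 | ((bitlen >> 24) & 0x7F);      /* 1xxxxxxx plus 3 more bytes */
        p[1] = (bitlen >> 16) & 0xFF;
        p[2] = (bitlen >> 8) & 0xFF;
        p[3] = bitlen & 0xFF;
    }
}

/* read it back, using the flag bit to pick the format */
static int
bitlen_read(const unsigned char *p)
{
    if (BITLEN_IS_SHORT(p))
        return p[0];
    return ((p[0] & 0x7F) << 24) | (p[1] << 16) | (p[2] << 8) | p[3];
}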

The problem I am thinking of is that it can cause problems for old databases
having bit columns when they
upgrade to a newer version without using pg_dumpall. Is there any way to
handle this problem? Or is there a better way to handle the problem
itself? Please let me know your suggestions.

Regards,
Hari Babu
Fujitsu Australia