[HACKERS] Bit data type header reduction in some cases

2014-02-24 Thread Haribabu Kommi
Hi,

It's regarding a Todo item of "Bit data type header reduction" in some
cases. The header contains two parts. 1)  The varlena header is
automatically converted to 1 byte header from 4 bytes in case of small
data. 2) The bit length header called "bit_len" to store the actual bit
length which is of 4 bytes in size. I want to reduce this bit_len size to 1
byte in some cases as similar to varlena header. With this change the size
of the column reduced by 3 bytes, thus shows very good decrease in disk
usage.

I am planning to the change it based on the total length of bits data. If
the number of bits is less than 0x7F then one byte header can be chosen
instead of 4 byte header. During reading of the data, the similar way it
can be calculated.

The problem I am thinking is, as it can cause problems to old databases
having bit columns when they
upgrade to a newer version without using pg_dumpall. Is there anyway to
handle this problem? Or Is there any better way to handle the problem
itself? please let me know your suggestions.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Michael Paquier
On Tue, Feb 25, 2014 at 10:02 AM, Alvaro Herrera
 wrote:
> While we're messing with this, I wonder if there's any way to have
> infomask and infomask2 displayed in hex format rather than plain int
> without having to specify that in every query.  I'm not well known for
> being able to do such conversions off the top of my head ...
Something like calling DirectFunctionCall1 with to_hex and return the
infomask fields as text values instead of integers? This looks
straight-forward.

> (Not that it's this patch' responsibility to do that.)
Definitely a different patch.

Don't you think that this would break existing applications? I see
more flexibility to keep them as they are now, as integers, users can
always tune their queries to do this post-processing with to_hex for
them as they've (always?) been doing.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Michael Paquier
On Mon, Feb 24, 2014 at 11:34 PM, Alvaro Herrera
 wrote:
>
> Andres Freund escribió:
> > On 2014-02-24 17:53:31 +0900, Michael Paquier wrote:
> > > On Sun, Feb 23, 2014 at 6:09 AM, Alvaro Herrera 
> > > wrote:
> > >
> > > > Michael Paquier escribió:
> > > > > Hi all,
> > > > >
> > > > > Please find attached a patch to dump pageinspect to 1.2. This
> > > > > simply changes page_header to use the new internal datatype
> > > > > pg_lsn instead of text.
> > > >
> > > > Uhm.  Does this crash if you pg_upgrade a server that has 1.1
> > > > installed?
> > > >
> > > Oops yes you're right., it is obviously broken. Just re-thinking
> > > about that, dumping this module just to change an argument type does
> > > not seem to be worth the code complication. Thoughts?
> >
> > It seem simple to support both, old and new type. page_header() (which
> > IIRC is the only thing returning an LSN) likely uses
> > get_call_result_type(). So you can just check what it's
> > expecting. Either support both types or error out if it's the old
> > extension version.
>
> Yeah, erroring out seems good enough.  Particularly if you add a hint
> saying "please upgrade the extension".
Done this way by checking the type OID of TupleDesc returned by
get_call_result_type. Supporting both formats just adds complexity for
no real gain.
-- 
Michael
diff --git a/contrib/pageinspect/Makefile b/contrib/pageinspect/Makefile
index 0e267eb..ee78cb2 100644
--- a/contrib/pageinspect/Makefile
+++ b/contrib/pageinspect/Makefile
@@ -4,8 +4,8 @@ MODULE_big	= pageinspect
 OBJS		= rawpage.o heapfuncs.o btreefuncs.o fsmfuncs.o
 
 EXTENSION = pageinspect
-DATA = pageinspect--1.1.sql pageinspect--1.0--1.1.sql \
-	pageinspect--unpackaged--1.0.sql
+DATA = pageinspect--1.2.sql pageinspect--1.0--1.1.sql \
+	pageinspect--1.1--1.2.sql pageinspect--unpackaged--1.0.sql
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/pageinspect/pageinspect--1.1--1.2.sql b/contrib/pageinspect/pageinspect--1.1--1.2.sql
new file mode 100644
index 000..5e23ca4
--- /dev/null
+++ b/contrib/pageinspect/pageinspect--1.1--1.2.sql
@@ -0,0 +1,18 @@
+/* contrib/pageinspect/pageinspect--1.1--1.2.sql */
+
+-- complain if script is sourced in psql, rather than via ALTER EXTENSION
+\echo Use "ALTER EXTENSION pageinspect UPDATE TO 1.2" to load this file. \quit
+
+DROP FUNCTION page_header(bytea);
+CREATE FUNCTION page_header(IN page bytea,
+OUT lsn pg_lsn,
+OUT checksum smallint,
+OUT flags smallint,
+OUT lower smallint,
+OUT upper smallint,
+OUT special smallint,
+OUT pagesize smallint,
+OUT version smallint,
+OUT prune_xid xid)
+AS 'MODULE_PATHNAME', 'page_header'
+LANGUAGE C STRICT;
diff --git a/contrib/pageinspect/pageinspect--1.1.sql b/contrib/pageinspect/pageinspect--1.1.sql
deleted file mode 100644
index 22a47d5..000
--- a/contrib/pageinspect/pageinspect--1.1.sql
+++ /dev/null
@@ -1,107 +0,0 @@
-/* contrib/pageinspect/pageinspect--1.1.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION pageinspect" to load this file. \quit
-
---
--- get_raw_page()
---
-CREATE FUNCTION get_raw_page(text, int4)
-RETURNS bytea
-AS 'MODULE_PATHNAME', 'get_raw_page'
-LANGUAGE C STRICT;
-
-CREATE FUNCTION get_raw_page(text, text, int4)
-RETURNS bytea
-AS 'MODULE_PATHNAME', 'get_raw_page_fork'
-LANGUAGE C STRICT;
-
---
--- page_header()
---
-CREATE FUNCTION page_header(IN page bytea,
-OUT lsn text,
-OUT checksum smallint,
-OUT flags smallint,
-OUT lower smallint,
-OUT upper smallint,
-OUT special smallint,
-OUT pagesize smallint,
-OUT version smallint,
-OUT prune_xid xid)
-AS 'MODULE_PATHNAME', 'page_header'
-LANGUAGE C STRICT;
-
---
--- heap_page_items()
---
-CREATE FUNCTION heap_page_items(IN page bytea,
-OUT lp smallint,
-OUT lp_off smallint,
-OUT lp_flags smallint,
-OUT lp_len smallint,
-OUT t_xmin xid,
-OUT t_xmax xid,
-OUT t_field3 int4,
-OUT t_ctid tid,
-OUT t_infomask2 integer,
-OUT t_infomask integer,
-OUT t_hoff smallint,
-OUT t_bits text,
-OUT t_oid oid)
-RETURNS SETOF record
-AS 'MODULE_PATHNAME', 'heap_page_items'
-LANGUAGE C STRICT;
-
---
--- bt_metap()
---
-CREATE FUNCTION bt_metap(IN relname text,
-OUT magic int4,
-OUT version int4,
-OUT root int4,
-OUT level int4,
-OUT fastroot int4,
-OUT fastlevel int4)
-AS 'MODULE_PATHNAME', 'bt_metap'
-LANGUAGE C STRICT;
-
---
--- bt_page_stats()
---
-CREATE FUNCTION bt_page_stats(IN relname text, IN blkno int4,
-OUT blkno int4,
-OUT type "char",
-OUT live_items int4,
-OUT dead_items int4,
-OUT avg_item_size int4,
-OUT page_size int4,
-OUT free_size int4,
-OUT btpo_prev int4,
-OUT btpo_next int4,
-OUT btpo int4,
-OUT btpo_flags int4)
-AS 'MODULE_PATHNAME', 'bt_page_stats'
-LANGUAGE C STRICT;
-
---
--- bt_page_items()
---
-CREATE FUNCTION bt_page_items(IN relname text, IN blkno int4,
-OUT itemo

Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Kouhei Kaigai
> > /* Hook for plugins to add custom join path, in addition to default
> > ones */ typedef void (*add_join_path_hook_type)(PlannerInfo *root,
> > RelOptInfo *joinrel,
> > RelOptInfo *outerrel,
> > RelOptInfo *innerrel,
> > JoinType jointype,
> > SpecialJoinInfo *sjinfo,
> > List *restrictlist,
> > List *mergeclause_list,
> > SemiAntiJoinFactors *semifactors,
> > Relids param_source_rels,
> > Relids extra_lateral_rels);
> > extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;
> >
> > Likely, its arguments upper than restrictlist should be informed to
> > extensions, because these are also arguments of add_paths_to_joinrel().
> > However, I'm not 100% certain how about other arguments should be informed.
> > Probably, it makes sense to inform param_source_rels and
> > extra_lateral_rels to check whether the path is sensible for parameterized
> paths.
> > On the other hand, I doubt whether mergeclause_list is usuful to deliver.
> > (It may make sense if someone tries to implement their own merge-join
> > implementation??)
> >
> > I'd like to seem idea to improve the current interface specification.
> 
> I've read the code path to add custom join again, and felt that providing
> semifactors seems not necessary for the first cut, because it is used in
> only initial_cost_nestloop (final_cost_nestloop receives semifactors but
> it is not used there), and external module would not become so smart before
> 9.5 development cycle.  It seems enough complex to postpone determinig
> whether it's essential for add_join_path_hook.
>  Do you have any concrete use case for the parameter?
> 
The reason why I asked the question above is, I haven't been 100% certain
about its usage. Indeed, semifactors is applied on a limited usage, but
quite easy to reproduce by extension later (using clauselist_selectivity)
if extension wants this factor. So, I agree with removing the semifactors
here.

> mergeclause_list and param_source_rels seem little easier to use, but
> anyway it should be documented how to use those parameters.
>
The mergeclause_list might not be sufficient for extensions to determine
whether its own mergejoin is applicable here. See the comment below; that
is on the head of select_mergejoin_clauses.

|  * *mergejoin_allowed is normally set to TRUE, but it is set to FALSE if
|  * this is a right/full join and there are nonmergejoinable join clauses.
|  * The executor's mergejoin machinery cannot handle such cases, so we have
|  * to avoid generating a mergejoin plan.  (Note that this flag does NOT
|  * consider whether there are actually any mergejoinable clauses.  This is
|  * correct because in some cases we need to build a clauseless mergejoin.
|  * Simply returning NIL is therefore not enough to distinguish safe from
|  * unsafe cases.)
| 
It says, mergejoin_clause == NIL is not a sufficient check to determine
whether the mergejoin logic is applicable on the target join.
So, either of them is probably an option for extension that tries to implement
their own mergejoin logic; (1) putting both of mergejoin_allowed and
mergeclause_list as arguments of the hook, or (2) re-definition of
select_mergejoin_clauses() as extern function to reproduce the variables on
demand. Which one is more preferable?

BTW, I found a duplicate clause_sides_match_join() definition, that is
invoked at select_mergejoin_clauses(), in joinpath.c and analyzejoins.c.
Either of them should be eliminated, I think.

For param_source_rels and extra_lateral_rels, I'll put source code comments
around add_join_path_hook.
Earlier half of try_(nestloop|hashjoin|mergejoin)_path is probably useful
as example of extension implementation.

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei 


> -Original Message-
> From: Shigeru Hanada [mailto:shigeru.han...@gmail.com]
> Sent: Tuesday, February 25, 2014 12:41 AM
> To: Kohei KaiGai
> Cc: Kaigai, Kouhei(海外, 浩平); Stephen Frost; Jim Mlodgenski; Robert Haas;
> Tom Lane; PgHacker; Peter Eisentraut
> Subject: Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)
> 
> Hi Kaigai-san,
> 
> Sorry to leave the thread for a while.
> 
> 2014-02-23 22:24 GMT+09:00 Kohei KaiGai :
> > (1) Interface to add alternative paths in addition to built-in join
> > paths
> >
> > This patch adds "add_join_path_hook" on add_paths_to_joinrel to allow
> > extensions to provide alternative scan path in addition to the
> > built-in join paths. Custom-scan path being added is assumed to
> > perform to scan on a (virtual) relation that is a result set of joining
> relations.
> > My concern is its arguments to be p

Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2014-02-24 Thread Rajeev rastogi
On 04 February 2014 14:38, Myself wrote:

> 
> On 4th February 2014, Christian kruse Wrote:
> > On 04/02/14 12:38, Fujii Masao wrote:
> > > ISTM that the phrase "Request queue" is not used much around the
> lock.
> > > Using the phrase "wait queue" or Simon's suggestion sound better to
> > at least me.
> > > Thought?
> >
> > Sounds reasonable to me. Attached patch changes messages to the
> > following:
> >
> > Process holding the lock: A. Wait queue: B.
> > Processes holding the lock: A, B. Wait queue: C.
> 
> This looks good to me also.

I have tested the revised patch and found ready to be committed.

I am marking this as "Ready for Committer".

Thanks and Regards,
Kumar Rajeev Rastogi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Minor comment improvements in tablecmds.c

2014-02-24 Thread Etsuro Fujita
This is a small patch to improve comments in tablecmds.c.  Please find
attached a patch.

Thanks,

Best regards,
Etsuro Fujita
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 08b037e..ed9d206 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -2468,7 +2468,8 @@ RenameConstraint(RenameStmt *stmt)
 }
 
 /*
- * Execute ALTER TABLE/INDEX/SEQUENCE/VIEW/FOREIGN TABLE RENAME
+ * Execute ALTER TABLE/INDEX/SEQUENCE/VIEW/MATERIALIZED VIEW/FOREIGN TABLE
+ * RENAME
  */
 Oid
 RenameRelation(RenameStmt *stmt)
@@ -2476,8 +2477,9 @@ RenameRelation(RenameStmt *stmt)
Oid relid;
 
/*
-* Grab an exclusive lock on the target table, index, sequence or view,
-* which we will NOT release until end of transaction.
+* Grab an exclusive lock on the target table, index, sequence, view,
+* materialized view, or foreign table, which we will NOT release until
+* end of transaction.
 *
 * Lock level used here should match RenameRelationInternal, to avoid 
lock
 * escalation.
@@ -2520,8 +2522,9 @@ RenameRelationInternal(Oid myrelid, const char 
*newrelname, bool is_internal)
Oid namespaceId;
 
/*
-* Grab an exclusive lock on the target table, index, sequence or view,
-* which we will NOT release until end of transaction.
+* Grab an exclusive lock on the target table, index, sequence, view,
+* materialized view, or foreign table, which we will NOT release until
+* end of transaction.
 */
targetrelation = relation_open(myrelid, AccessExclusiveLock);
namespaceId = RelationGetNamespace(targetrelation);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SSL: better default ciphersuite

2014-02-24 Thread Peter Eisentraut
committed


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Alvaro Herrera
While we're messing with this, I wonder if there's any way to have
infomask and infomask2 displayed in hex format rather than plain int
without having to specify that in every query.  I'm not well known for
being able to do such conversions off the top of my head ...

(Not that it's this patch' responsibility to do that.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-24 Thread Michael Paquier
On Tue, Feb 25, 2014 at 12:41 AM, Robert Haas  wrote:

> On Wed, Feb 19, 2014 at 8:27 PM, Michael Paquier
>  wrote:
> > On Thu, Feb 20, 2014 at 8:55 AM, Andres Freund 
> wrote:
> >> On 2014-02-19 12:47:40 -0500, Robert Haas wrote:
> >>> On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
> >>>  wrote:
> >>> >> Yes, that's a good precedent in multiple ways.
> >>> > Here are updated patches to use pg_lsn instead of pglsn...
> >>>
> >>> OK, so I think this stuff is all committed now, with assorted changes.
> >>>  Thanks for your work on this.
> >>
> >> cool, thanks you two.
> >>
> >> I wonder if pg_stat_replication shouldn't be updated to use it as well?
> >> SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows
> >> that as names that are possible candidates for conversion.
> > I was sure to have forgotten some views or functions in the previous
> > patch... Please find attached a patch making pg_stat_replication use
> > pg_lsn instead of the existing text fields.
> > Regards,
>
> Committed.  Do we want to do anything about pageinspect?
>
Thanks. We're dealing with that on another thread, I'll send an updated
patch there.
-- 
Michael


Re: contrib/cache_scan (Re: [HACKERS] What's needed for cache-only table scan?)

2014-02-24 Thread Haribabu Kommi
On Tue, Feb 25, 2014 at 10:44 AM, Kouhei Kaigai wrote:

> Thanks for your testing,
>
> > Getting some compilation warnings while compiling the extension and also
> > I am not able to load the extension because of undefined symbol
> > "get_restriction_qual_cost".
> >
> It seems to me you applied only part-1 portion of the custom-scan patches.
>
> The get_restriction_qual_cost() is re-defined as extern function, from
> static one, on the part-2 portion (ctidscan) to estimate cost value of the
> qualifiers in extension. Also, bms_to_string() and bms_from_string() are
> added on the part-3 portion (postgres_fdw) portion to carry a bitmap-set
> according to the copyObject manner.
>
> It may make sense to include the above supplemental changes in the cache-
> scan feature also, until either of them getting upstreamed.


Thanks for the information, I will apply other patches also and start
testing.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
On 2014-02-24 15:20:13 -0800, Peter Geoghegan wrote:
> On Mon, Feb 24, 2014 at 3:17 PM, Andres Freund  wrote:
> > TBH I don't care about torn pages during normal testing. I don't want to
> > suggest disabling it for real workloads with real data, just that it's
> > important to do so during development/testing of WAL related code,
> > because otherwise it will hide/fixup many errors.
> 
> Sure, but you might want to account for torn pages anyway.
> Particularly if you're interested in some degree of automation, as we
> all seem to be.

Hm, well. I have to admit, if a test machine crashes, I'd just rebuild
the cluster. Unless I am working on crash safety testing in which case
I'd probably not have full_page_writes disabled...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Peter Geoghegan
On Mon, Feb 24, 2014 at 3:17 PM, Andres Freund  wrote:
> TBH I don't care about torn pages during normal testing. I don't want to
> suggest disabling it for real workloads with real data, just that it's
> important to do so during development/testing of WAL related code,
> because otherwise it will hide/fixup many errors.

Sure, but you might want to account for torn pages anyway.
Particularly if you're interested in some degree of automation, as we
all seem to be.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
On 2014-02-24 15:05:37 -0800, Peter Geoghegan wrote:
> On Mon, Feb 24, 2014 at 1:17 PM, Andres Freund  wrote:
> > We somehow need to have a policy of testing changes to the WAL format
> > without full_page_writes. They hide bugs in replay far, far too often.
> 
> What's the easiest way to get atomic page writes at the FS level on
> your laptop? ZFS or some data journaling FS, I suppose.

TBH I don't care about torn pages during normal testing. I don't want to
suggest disabling it for real workloads with real data, just that it's
important to do so during development/testing of WAL related code,
because otherwise it will hide/fixup many errors.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.7

2014-02-24 Thread Andres Freund
Hi,

On 2014-02-24 17:06:53 -0500, Robert Haas wrote:
> I still think pg_create_logical_replication_slot should be in slotfuncs.c.

Ok, I don't feel too strongly, so I can change it. I wanted to keep
logical/ stuff out of slotfuncs.c, but there's not really a strong
reason for that.

> I don't think the completely-unsecured directory operations in
> test_decoding_regsupport.c are acceptable.  Tom fought tooth and nail
> to make sure that similar capabilities in adminpack carried meaningful
> security restrictions.

I actually thought they'd be too ugly to live and we'd remove them
pre-commit.

There's no security problem though afaics, since they aren't actually
created, and you need to be superuser to create C functions.

> /*
> +* Check whether there are, possibly unconnected, logical
> slots that refer
> +* to the to-be-dropped database. The database lock we are holding
> +* prevents the creation of new slots using the database.
> +*/
> +   if (ReplicationSlotsCountDBSlots(db_id, &nslots, &nslots_active))
> +   ereport(ERROR,
> +   (errcode(ERRCODE_OBJECT_IN_USE),
> +errmsg("database \"%s\" is used in a
> logical decoding slot",
> +   dbname),
> +errdetail("There are %d slot(s), %d
> of them active",
> +  nslots, nslots_active)));
> 
> What are you going to do when we get around to supporting this on a
> standby?  Whatever the answer is, maybe add a TODO comment.

I think it should actually mostly work out, anybody actively connected
to a slot will be kicked of (normal HS mechanisms)... But the slot would
currently live on which obviously isn't nice.

Will add TODO.

> +   /*
> +* GetRunningTransactionData() acquired ProcArrayLock, we must release
> +* it. We can do that before inserting the WAL record because
> +* ProcArrayApplyRecoveryInfo can recheck the commit status using the
> +* clog. If we're doing logical replication we can't do that though, 
> so
> +* hold the lock for a moment longer.
> +*/
> +   if (wal_level < WAL_LEVEL_LOGICAL)
> +   LWLockRelease(ProcArrayLock);
> +
> recptr = LogCurrentRunningXacts(running);
> 
> +   /* Release lock if we kept it longer ... */
> +   if (wal_level >= WAL_LEVEL_LOGICAL)
> +   LWLockRelease(ProcArrayLock);
> +

> This seems unfortunate.  The comment should clearly explain why it's 
> necessary.

There's another (existing) comment ontop of the function giving a bit
more context, but I'll expand.

I'd actually prefer to remove that special case alltogether, I don't
have much trust in those codepaths for HS... But that's not an argument
I want to fight out right nwo.

> -   heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
> +   if (IsSystemRelation(scan->rs_rd)
> +   || RelationIsAccessibleInLogicalDecoding(scan->rs_rd))
> +   heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
> +   else
> +   heap_page_prune_opt(scan->rs_rd, buffer, 
> RecentGlobalDataXmin);
> 
> Instead of changing the callers of heap_page_prune_opt() in this way,
> I think it might be better to change heap_page_prune_opt() to take
> only the first two of its current three parameters; everybody's just
> passing RecentGlobalXmin right now anyway.

Sounds like a plan.

> -   if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval ||
> forceSyncCommit)
> +   if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval ||
> forceSyncCommit ||
> +   XLogLogicalInfoActive())
> 
> Mmph.  Is this really necessary?  If so, why?  The comments could elucidate.

We could get rid of it by (optionally) adding information about the
database oid to compact commit, but that'd increase the size of the
record.

> +   bool fail_softly = slot->data.persistency == RS_EPHEMERAL;
> 
> This should be contingent on whether we're being called in the error
> pathway, not the slot type.  I think you should pass a bool.

Why? I had it that way at first, but for persistent slots this won't be
called in error pathways as we won't drop there.

> There are a bunch of places where you're testing IsSystemRelation() ||
> RelationIsAccessibleInLogicalDecoding().  Maybe we need a macro
> encapsulating that test, with a name chose to explain the point of it.

Sounds like a idea.

>  It seems to be indicating, roughly, whether the relation should
> participate in RecentGlobalXmin or RecentGlobalDataXmin.  But is there
> any point at all of separating those when !XLogLogicalInfoActive()?
> The test expands to:
> 
> IsSystemRelation() || (XLogLogicalInfoActive() &&
> RelationNeedsWAL(relation) && (IsCatalogRelation(relation) ||
> RelationIsUsedAsCatalogTable(relation)))
> 
> So basically this is

Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Peter Geoghegan
On Mon, Feb 24, 2014 at 1:17 PM, Andres Freund  wrote:
> We somehow need to have a policy of testing changes to the WAL format
> without full_page_writes. They hide bugs in replay far, far too often.

What's the easiest way to get atomic page writes at the FS level on
your laptop? ZFS or some data journaling FS, I suppose.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.7

2014-02-24 Thread Robert Haas
On Mon, Feb 24, 2014 at 10:11 AM, Andres Freund  wrote:
> Changes in this version include:
> * changed slot error handling log by introducing "ephermal" slots which
>   get dropped on errors. This is the biggest change.
> * added quoting in the test_decoding output plugin
> * closing of a tight race condition during slot creation where WAL could
>   have been removed
> * comment and other adjustments, many of them noticed by robert

I did another read-through of this this afternoon, focusing on the
stuff you changed and parts I hadn't looked at carefully yet.
Comments below.

Documentation needs to be updated for pg_stat_replication view.

I still think pg_create_logical_replication_slot should be in slotfuncs.c.

 /* Size of an indirect datum that contains an indirect TOAST pointer */
 #define INDIRECT_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(struct
varatt_indirect))

+/* Size of an indirect datum that contains a standard TOAST pointer */
+#define INDIRECT_POINTER_SIZE (VARHDRSZ_EXTERNAL + sizeof(struct
varatt_indirect))

Isn't the new hunk a duplicate of the existing definition, except for
a one-word change to the comment?

I don't think the completely-unsecured directory operations in
test_decoding_regsupport.c are acceptable.  Tom fought tooth and nail
to make sure that similar capabilities in adminpack carried meaningful
security restrictions.

/*
+* Check whether there are, possibly unconnected, logical
slots that refer
+* to the to-be-dropped database. The database lock we are holding
+* prevents the creation of new slots using the database.
+*/
+   if (ReplicationSlotsCountDBSlots(db_id, &nslots, &nslots_active))
+   ereport(ERROR,
+   (errcode(ERRCODE_OBJECT_IN_USE),
+errmsg("database \"%s\" is used in a
logical decoding slot",
+   dbname),
+errdetail("There are %d slot(s), %d
of them active",
+  nslots, nslots_active)));

What are you going to do when we get around to supporting this on a
standby?  Whatever the answer is, maybe add a TODO comment.

+ * loop for now..
+* more than twice..

Extra periods.

+* The replicatio slot mechanism is used to prevent removal of required

Typo.

+
+   /*
+* GetRunningTransactionData() acquired ProcArrayLock, we must release
+* it. We can do that before inserting the WAL record because
+* ProcArrayApplyRecoveryInfo can recheck the commit status using the
+* clog. If we're doing logical replication we can't do that though, so
+* hold the lock for a moment longer.
+*/
+   if (wal_level < WAL_LEVEL_LOGICAL)
+   LWLockRelease(ProcArrayLock);
+
recptr = LogCurrentRunningXacts(running);

+   /* Release lock if we kept it longer ... */
+   if (wal_level >= WAL_LEVEL_LOGICAL)
+   LWLockRelease(ProcArrayLock);
+

This seems unfortunate.  The comment should clearly explain why it's necessary.

+   /*
+* Startup logical state, needs to be setup now so we have proper data
+* during restore.
+*/
+   StartupReorderBuffer();

Should add blank line before this.

+   CheckPointSnapBuild();
+   CheckpointLogicalRewriteHeap();

Shouldn't the capitalization be consistent?

-   heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
+   if (IsSystemRelation(scan->rs_rd)
+   || RelationIsAccessibleInLogicalDecoding(scan->rs_rd))
+   heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalXmin);
+   else
+   heap_page_prune_opt(scan->rs_rd, buffer, RecentGlobalDataXmin);

Instead of changing the callers of heap_page_prune_opt() in this way,
I think it might be better to change heap_page_prune_opt() to take
only the first two of its current three parameters; everybody's just
passing RecentGlobalXmin right now anyway. Then, we could change the
first check in heap_page_prune_opt() to check first whether
PageIsPrunable(page, RecentGlobalDataXmin).  If not, give up.  If so,
then check that (!IsSystemRelation(scan->rs_rd) &&
!RelationIsAccessibleInLogicalDecoding(scan->rs_rd)) ||
PageIsPrunable(page, RecentGlobalXmin)).  The advantage of this is
that we avoid code duplication, and we avoid checking a couple of
conditions if pd_prune_xmin is very recent.

-   if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval ||
forceSyncCommit)
+   if (nrels > 0 || nmsgs > 0 || RelcacheInitFileInval ||
forceSyncCommit ||
+   XLogLogicalInfoActive())

Mmph.  Is this really necessary?  If so, why?  The comments could elucidate.

+   bool fail_softly = slot->data.persistency == RS_EPHEMERAL;

This should be contingent on whether we're being called in the error
pathway, not the slot type.  I think you should pass

Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
On 2014-02-24 22:17:31 +0100, Andres Freund wrote:
> Those together explain the story. Note this bit:
> 
> static void
> heap_xlog_lock(XLogRecPtr lsn, XLogRecord *record)
> {
> ...
> HeapTupleHeaderClearHotUpdated(htup);
> HeapTupleHeaderSetXmax(htup, xlrec->locking_xid);
> HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
> /* Make sure there is no forward chain link in t_ctid */
> htup->t_ctid = xlrec->target.tid;
> ...
> }
> 
> So, the replay of FD/2F0AE4D0 breaks the ctid chain *and* unsets the
> HOT_UPDATED flag.

Some quick archeology shows that the HeapTupleHeaderClearHotUpdated()
was in the original HOT commit (282d2a03d) and clearing of t_ctid was in
the original commit implementing FOR SHARE (bedb78d38). Both look like
they are copied from other places, I don't immediately see any need for
them here...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
Hi,

On 2014-02-24 17:55:14 -0300, Alvaro Herrera wrote:
> Greg Stark wrote:
> > I have a database where a a couple rows don't appear in index scans
> > but do appear in sequential scans. It looks like the same problem as
> > Peter reported but this is a different database. I've extracted all
> > the xlogdump records and below are the ones I think are relevant. You
> > can see that lp 2 gets a few HOT updates and concurrently has someone
> > create a MultiXact NO KEY UPDATE lock while one of those HOT updates
> > is pending but not committed.

Per se the sequence of records doesn't look bad (even though I am not
happy that we log intermediate and final rows first, and only then the
start of the chain).

> > The net result seems to be that the ctid
> > update chain got broken. The index of course points to the head of the
> > HOT chain so it doesn't find the live tail whereas the sequential scan
> > picks it up.

Yea, that's the problem.

> > rmgr: Heaplen (rec/tot):291/   323, tx:5943849, lsn: 
> > FD/2F0ADFC0, prev FD/2F0ADF90, bkp: , desc: hot_update: rel 
> > 1663/16385/212653; tid 13065/2 xmax 5943849 ; new tid 13065/3 xmax 0
> > rmgr: Heap2   len (rec/tot): 25/57, tx:5943851, lsn: 
> > FD/2F0AE450, prev FD/2F0AE408, bkp: , desc: lock updated: xmax 5943851 
> > msk 000a; rel 1663/16385/212653; tid 13065/3
> > rmgr: MultiXact   len (rec/tot): 28/60, tx:5943851, lsn: 
> > FD/2F0AE490, prev FD/2F0AE450, bkp: , desc: create mxid 728896 offset 
> > 1632045 nmembers 2: 5943849 (nokeyupd) 5943851 (keysh)
> > rmgr: Heaplen (rec/tot): 25/57, tx:5943851, lsn: 
> > FD/2F0AE4D0, prev FD/2F0AE490, bkp: , desc: lock 728896: rel 
> > 1663/16385/212653; tid 13065/2 IS_MULTI EXCL_LOCK

> >  lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 |   t_ctid  
> >  | t_infomask2 | t_infomask | t_hoff |
> > ++--++-+-+--++-+++-
> >   2 |   3424 |1 |232 | 5943845 |  728896 |0 | (13065,2) 
> >  |  32 |   4419 | 32 |
> >   3 |   3152 |1 |272 | 5943849 | 5943879 |0 | (13065,4) 
> >  |   49184 |   9475 | 32 |
> >   4 |   2864 |1 |287 | 5943879 | 5943880 |0 | (13065,7) 
> >  |   49184 |   9475 | 32 |
> >   7 |   2576 |1 |287 | 5943880 |   0 |0 | (13065,7) 
> >  |   32800 |  10499 | 32 |

Those together explain the story. Note this bit:

static void
heap_xlog_lock(XLogRecPtr lsn, XLogRecord *record)
{
...
HeapTupleHeaderClearHotUpdated(htup);
HeapTupleHeaderSetXmax(htup, xlrec->locking_xid);
HeapTupleHeaderSetCmax(htup, FirstCommandId, false);
/* Make sure there is no forward chain link in t_ctid */
htup->t_ctid = xlrec->target.tid;
...
}

So, the replay of FD/2F0AE4D0 breaks the ctid chain *and* unsets the
HOT_UPDATED flag.

Which means fkey locks have never properly worked across SR/crash
recovery.

Haven't thought about how to fix it yet, I hope won't have to (hint hint).

We somehow need to have a policy of testing changes to the WAL format
without full_page_writes. They hide bugs in replay far, far too often.


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Andrew Dunstan


On 02/24/2014 02:15 PM, Andrew Dunstan wrote:



Having had my schedule very seriously disrupted by the storm in the US 
South East a week or so ago, I am finally getting back to being able 
to devote some time to jsonb. I hope to have new patches available 
today or tomorrow at the latest.






Update to this: A recent commit caused an unfortunate merge conflict in 
the hstore code that I have asked Teodor to resolve. I can't post new 
clean patches until that's been done.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Alvaro Herrera
Here's a reformatted copy.  I think this is the same bug as Peter G.
reported in 
http://www.postgresql.org/message-id/CAM3SWZTMQiCi5PV5OWHb+bYkUcnCk=o67w0csswpvv7xfuc...@mail.gmail.com

I have a hunch that this is related to the heap_lock_updated business.
I haven't investigated yet.


Greg Stark wrote:
> I have a database where a a couple rows don't appear in index scans
> but do appear in sequential scans. It looks like the same problem as
> Peter reported but this is a different database. I've extracted all
> the xlogdump records and below are the ones I think are relevant. You
> can see that lp 2 gets a few HOT updates and concurrently has someone
> create a MultiXact NO KEY UPDATE lock while one of those HOT updates
> is pending but not committed. The net result seems to be that the ctid
> update chain got broken. The index of course points to the head of the
> HOT chain so it doesn't find the live tail whereas the sequential scan
> picks it up.
> 
> I don't see any evidence of MultiXactId wraparound, the members run to
> 001F and the offsets run to 000B. This is on a standby that's been
> activated but afaik that shouldn't change these files any more right?
> 
> rmgr: Heaplen (rec/tot):235/   267, tx:5943845, lsn: 
> FD/2F0A3640, prev FD/2F0A3600, bkp: , desc: insert: rel 
> 1663/16385/212653; tid 13065/2
> rmgr: Transaction len (rec/tot): 12/44, tx:5943845, lsn: 
> FD/2F0A8178, prev FD/2F0A8148, bkp: , desc: commit: 2014-02-19 
> 20:41:23.698513 UTC
> rmgr: Heaplen (rec/tot): 25/57, tx:5943847, lsn: 
> FD/2F0AA440, prev FD/2F0AA3F8, bkp: , desc: lock 5943847: rel 
> 1663/16385/212653; tid 13065/2 LOCK_ONLY KEYSHR_LOCK
> rmgr: Transaction len (rec/tot): 12/44, tx:5943847, lsn: 
> FD/2F0AA480, prev FD/2F0AA440, bkp: , desc: commit: 2014-02-19 
> 20:41:23.713969 UTC
> rmgr: Heaplen (rec/tot):291/   323, tx:5943849, lsn: 
> FD/2F0ADFC0, prev FD/2F0ADF90, bkp: , desc: hot_update: rel 
> 1663/16385/212653; tid 13065/2 xmax 5943849 ; new tid 13065/3 xmax 0
> rmgr: Heap2   len (rec/tot): 25/57, tx:5943851, lsn: 
> FD/2F0AE450, prev FD/2F0AE408, bkp: , desc: lock updated: xmax 5943851 
> msk 000a; rel 1663/16385/212653; tid 13065/3
> rmgr: MultiXact   len (rec/tot): 28/60, tx:5943851, lsn: 
> FD/2F0AE490, prev FD/2F0AE450, bkp: , desc: create mxid 728896 offset 
> 1632045 nmembers 2: 5943849 (nokeyupd) 5943851 (keysh)
> rmgr: Heaplen (rec/tot): 25/57, tx:5943851, lsn: 
> FD/2F0AE4D0, prev FD/2F0AE490, bkp: , desc: lock 728896: rel 
> 1663/16385/212653; tid 13065/2 IS_MULTI EXCL_LOCK
> rmgr: Transaction len (rec/tot): 12/44, tx:5943849, lsn: 
> FD/2F0AE510, prev FD/2F0AE4D0, bkp: , desc: commit: 2014-02-19 
> 20:41:23.744989 UTC
> rmgr: Transaction len (rec/tot): 12/44, tx:5943851, lsn: 
> FD/2F0AE570, prev FD/2F0AE540, bkp: , desc: commit: 2014-02-19 
> 20:41:23.746820 UTC
> rmgr: Heaplen (rec/tot):306/   338, tx:5943879, lsn: 
> FD/2F103788, prev FD/2F103758, bkp: , desc: hot_update: rel 
> 1663/16385/212653; tid 13065/3 xmax 5943879 ; new tid 13065/4 xmax 0
> rmgr: Transaction len (rec/tot): 12/44, tx:5943879, lsn: 
> FD/2F1038E0, prev FD/2F103788, bkp: , desc: commit: 2014-02-19 
> 20:41:24.580827 UTC
> rmgr: Heaplen (rec/tot):306/   338, tx:5943880, lsn: 
> FD/2F103910, prev FD/2F1038E0, bkp: , desc: hot_update: rel 
> 1663/16385/212653; tid 13065/4 xmax 5943880 ; new tid 13065/7 xmax 0
> rmgr: Transaction len (rec/tot): 12/44, tx:5943880, lsn: 
> FD/2F106070, prev FD/2F106030, bkp: , desc: commit: 2014-02-19 
> 20:41:24.617048 UTC
> 
> 
>  lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 |   t_ctid   
> | t_infomask2 | t_infomask | t_hoff |
> ++--++-+-+--++-+++-
>   2 |   3424 |1 |232 | 5943845 |  728896 |0 | (13065,2)  
> |  32 |   4419 | 32 |
>   3 |   3152 |1 |272 | 5943849 | 5943879 |0 | (13065,4)  
> |   49184 |   9475 | 32 |
>   4 |   2864 |1 |287 | 5943879 | 5943880 |0 | (13065,7)  
> |   49184 |   9475 | 32 |
>   7 |   2576 |1 |287 | 5943880 |   0 |0 | (13065,7)  
> |   32800 |  10499 | 32 |
> 
> 
> -- 
> greg
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Another possible corruption bug in 9.3.2 or possibly a known MultiXact problem?

2014-02-24 Thread Andres Freund
On 2014-02-20 13:25:35 +, Greg Stark wrote:
> rmgr: Heaplen (rec/tot):235/   267, tx:5943845, lsn:
> FD/2F0A3640, prev FD/2F0A3600, bkp: , desc: insert: rel
> 1663/16385/212653; tid 13065/2

>  lp | lp_off | lp_flags | lp_len | t_xmin  | t_xmax  | t_field3 |
> t_ctid   | t_infomask2 | t_infomask | t_hoff |
> ++--++-+-+--++-+++-
>   2 |   3424 |1 |232 | 5943845 |  728896 |0 |
> (13065,2)  |  32 |   4419 | 32 |
>   3 |   3152 |1 |272 | 5943849 | 5943879 |0 |
> (13065,4)  |   49184 |   9475 | 32 |
>   4 |   2864 |1 |287 | 5943879 | 5943880 |0 |
> (13065,7)  |   49184 |   9475 | 32 |
>   7 |   2576 |1 |287 | 5943880 |   0 |0 |
> (13065,7)  |   32800 |  10499 | 32 |

Any chance you could send those again, with less linebreak damage? They
are really hard to read this way...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] typemode for variable types

2014-02-24 Thread Tom Lane
Alvaro Herrera  writes:
> Bruce Momjian escribió:
>> Well, typmods are type-specific, so there is no official way to
>> calculate it.  I would look at how an existing type uses typmod and copy
>> that.

> Our system is pretty neat.  See a complex example here:
> https://github.com/postgis/postgis/blob/svn-trunk/postgis/gserialized_typmod.c

One other point is that if you do consult the varchar functions as
an example, be aware that there's an offset of 4 in their definition
of the typmod (eg, for varchar(3) the stored typmod is 7).  This is
entirely for legacy reasons so there's no good reason to duplicate it
in a new custom-made type.  Except for the rule that negative values
mean "unspecified typmod" (which you have to support), you can
define the contents of the typmod value however you want.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)

2014-02-24 Thread Jeff Janes
On Mon, Feb 24, 2014 at 7:02 AM, Pavel Stehule wrote:

>
>
>
> 2014-02-23 21:32 GMT+01:00 Andres Freund :
>
> Hi,
>>
>> On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
>> > There is relative few very long ProcArrayLocks lwlocks
>> >
>> > This issue is very pathologic on fast computers with more than 8 CPU.
>> This
>> > issue was detected after migration from 8.4 to 9.2. (but tested with
>> same
>> > result on 9.0)  I see it on devel 9.4 today actualized.
>> >
>> > When I moved PREPARE from cycle, then described issues is gone. But
>> when I
>> > use a EXECUTE IMMEDIATELY, then the issue is back. So it looks it is
>> > related to planner, ...
>>
>> In addition to the issue Jeff mentioned, I'd suggest trying the same
>> workload with repeatable read. That can do *wonders* because of the
>> reduced number of snapshots.
>>
>>
> I tested it, and it doesn't help.
>
> Is there some patch, that I can test related to this issue?
>

This is the one that I was referring to:

http://www.postgresql.org/message-id/11927.1384199...@sss.pgh.pa.us

Cheers,

Jeff


Re: [HACKERS] typemode for variable types

2014-02-24 Thread Alvaro Herrera
Bruce Momjian escribió:
> On Sun, Feb 23, 2014 at 07:50:08AM +0330, Mohsen SM wrote:
> > Hello.
> > I have a new type similar to varchar.
> > I want to fine how did I can to calculate typemod
> > and where must I calculate typemod for this type.
> 
> Well, typmods are type-specific, so there is no official way to
> calculate it.  I would look at how an existing type uses typmod and copy
> that.

Our system is pretty neat.  See a complex example here:
https://github.com/postgis/postgis/blob/svn-trunk/postgis/gserialized_typmod.c

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] typemode for variable types

2014-02-24 Thread Bruce Momjian
On Sun, Feb 23, 2014 at 07:50:08AM +0330, Mohsen SM wrote:
> Hello.
> I have a new type similar to varchar.
> I want to fine how did I can to calculate typemod
> and where must I calculate typemod for this type.

Well, typmods are type-specific, so there is no official way to
calculate it.  I would look at how an existing type uses typmod and copy
that.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Andrew Dunstan


On 02/24/2014 11:06 AM, Merlin Moncure wrote:

On Mon, Feb 24, 2014 at 9:08 AM, Merlin Moncure  wrote:

On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure  wrote:

I still find the phrasing "as jsonb is more efficient for most
purposes" to be a bit off  Basically, the text json type is faster for
serialization/deserialization pattern (not just document preservation)
and jsonb is preferred when storing json and doing repeated
subdocument accesses.

Hm, I'm going to withdraw that.  I had done some testing of simple
deserialization (cast to text and the like) and noted that jsonb was
as much as 5x slower.  However, I just did some checking on
json[b]_populate_recordset though and it's pretty much a wash.

[sorry for noise on this].

Here's the use case coverage as I see it today:

CASE:jsonjsonb hstore
Static document: yes  poor  poor
Precise document:yes  nono
Serialization:   yes  nono
Deserialization: poor***  yes*  no
Repeated Access: poor yes   yes
Manipulation:no   no**  yes
GIST/GIN searching:  no   no**  yes

notes:
* jsonb gets 'yes' for deserialization assuming andrew's 'two level'
deserialization fix goes in (otherwise 'poor').
** jsonb can't do this today, but presumably will be able to soon
*** 'poor' unless json type also gets the deserialization fix, then 'yes'.
 hstore can deserialize hstore format, but will rely on json/jsonb
for deserializing json

'Static document' represents edge cases where the json is opaque to
the database but performance -- for example large map polygons.
'Precise document' represents cases where whitespace or key order is important.

Peter asked upthread how to access the various features.  Well, today,
it basically means a bit of nimble casting to different structures
depending on which particular features are important to you, which
IMNSHO is not bad at all as long as we understand that most people who
rely on jsonb will also need hstore for its searching and operators.
Down the line when hstore and jsonb are more flushed out it's going to
come down to an API style choice.





Frankly, a lot of the above doesn't make much sense to me. WTF is 
"Manipulation'?


Unless I see much more actual info on the tests being conducted it's 
just about impossible to comment. The performance assessment at this 
stage is simply anecdotal as far as I'm concerned.


populate_record() is likely to be a *very* poor point of comparison 
anyway, I would expect the performance numbers to be dominated by the 
input function calls for the object's component types, and that's going 
to be the same in both cases. If you want to prove something here you'll 
need to supply profiling numbers showing where it spends its time in 
each case.


Having had my schedule very seriously disrupted by the storm in the US 
South East a week or so ago, I am finally getting back to being able to 
devote some time to jsonb. I hope to have new patches available today or 
tomorrow at the latest.



cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Bruce Momjian
On Mon, Feb 24, 2014 at 07:23:50PM +0100, Andres Freund wrote:
> On 2014-02-24 13:16:39 -0500, Bruce Momjian wrote:
> > On Mon, Feb 24, 2014 at 07:09:29PM +0100, Andres Freund wrote:
> > > On 2014-02-24 12:45:12 -0500, Bruce Momjian wrote:
> > > > On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
> > > > > As others, I am concerned about people being confused when 
> > > > > funny-looking
> > > > > trigger names suddenly appearing when you disable all table triggers.
> > > > > 
> > > > > What I ended up doing is to create a user and internal section when
> > > > > displaying disabled triggers:
> > > > > 
> > > > >   Disabled user triggers:
> > > > >   check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE 
> > > > > PROCEDURE trigf()
> > > > >   Disabled internal triggers:
> > > > >   "RI_ConstraintTrigger_c_16409" AFTER INSERT ON orders FROM 
> > > > > customer NOT DEF ...
> > > > >   "RI_ConstraintTrigger_c_16410" AFTER UPDATE ON orders FROM 
> > > > > customer NOT DEF ...
> > > > >   
> > > > > I kept the "Triggers" section unchanged, showing only user triggers.  
> > > > > I
> > > > > also updated the code to handle 8.3+ servers.
> > > > > 
> > > > > Patch attached.
> > > > 
> > > > Patch applied.
> > > 
> > > Thanks. It'd have been nice tho, to mention Fabrízio in the commit
> > > message as the patch's author.
> > 
> > Uh, I was thinking of that, but I basically rewrote the patch from
> > scratch and changed its visible behavior, so I was worried about perhaps
> > blaming him if it introduced a bug.  I should have said "original patch
> > by ...", but because so much of it was new, I didn't bother.
> 
> I just seems nicer to relatively new contributors to mention their names
> when they try to contribute.

Agreed.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Fabrízio de Royes Mello
On Mon, Feb 24, 2014 at 3:23 PM, Andres Freund 
wrote:
> > > Thanks. It'd have been nice tho, to mention Fabrízio in the commit
> > > message as the patch's author.
> >
> > Uh, I was thinking of that, but I basically rewrote the patch from
> > scratch and changed its visible behavior, so I was worried about perhaps
> > blaming him if it introduced a bug.  I should have said "original patch
> > by ...", but because so much of it was new, I didn't bother.
>
> I just seems nicer to relatively new contributors to mention their names
> when they try to contribute.
>

Hey guys, I'm not worried about it... to me the most important thing is the
improvement and the learning. So I'm happy to help in some way.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Andres Freund
On 2014-02-24 13:16:39 -0500, Bruce Momjian wrote:
> On Mon, Feb 24, 2014 at 07:09:29PM +0100, Andres Freund wrote:
> > On 2014-02-24 12:45:12 -0500, Bruce Momjian wrote:
> > > On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
> > > > As others, I am concerned about people being confused when funny-looking
> > > > trigger names suddenly appearing when you disable all table triggers.
> > > > 
> > > > What I ended up doing is to create a user and internal section when
> > > > displaying disabled triggers:
> > > > 
> > > > Disabled user triggers:
> > > > check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE 
> > > > PROCEDURE trigf()
> > > > Disabled internal triggers:
> > > > "RI_ConstraintTrigger_c_16409" AFTER INSERT ON orders FROM 
> > > > customer NOT DEF ...
> > > > "RI_ConstraintTrigger_c_16410" AFTER UPDATE ON orders FROM 
> > > > customer NOT DEF ...
> > > > 
> > > > I kept the "Triggers" section unchanged, showing only user triggers.  I
> > > > also updated the code to handle 8.3+ servers.
> > > > 
> > > > Patch attached.
> > > 
> > > Patch applied.
> > 
> > Thanks. It'd have been nice tho, to mention Fabrízio in the commit
> > message as the patch's author.
> 
> Uh, I was thinking of that, but I basically rewrote the patch from
> scratch and changed its visible behavior, so I was worried about perhaps
> blaming him if it introduced a bug.  I should have said "original patch
> by ...", but because so much of it was new, I didn't bother.

I just seems nicer to relatively new contributors to mention their names
when they try to contribute.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Bruce Momjian
On Mon, Feb 24, 2014 at 07:09:29PM +0100, Andres Freund wrote:
> On 2014-02-24 12:45:12 -0500, Bruce Momjian wrote:
> > On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
> > > As others, I am concerned about people being confused when funny-looking
> > > trigger names suddenly appearing when you disable all table triggers.
> > > 
> > > What I ended up doing is to create a user and internal section when
> > > displaying disabled triggers:
> > > 
> > >   Disabled user triggers:
> > >   check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE PROCEDURE 
> > > trigf()
> > >   Disabled internal triggers:
> > >   "RI_ConstraintTrigger_c_16409" AFTER INSERT ON orders FROM customer 
> > > NOT DEF ...
> > >   "RI_ConstraintTrigger_c_16410" AFTER UPDATE ON orders FROM customer 
> > > NOT DEF ...
> > >   
> > > I kept the "Triggers" section unchanged, showing only user triggers.  I
> > > also updated the code to handle 8.3+ servers.
> > > 
> > > Patch attached.
> > 
> > Patch applied.
> 
> Thanks. It'd have been nice tho, to mention Fabrízio in the commit
> message as the patch's author.

Uh, I was thinking of that, but I basically rewrote the patch from
scratch and changed its visible behavior, so I was worried about perhaps
blaming him if it introduced a bug.  I should have said "original patch
by ...", but because so much of it was new, I didn't bother.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Andres Freund
On 2014-02-24 12:45:12 -0500, Bruce Momjian wrote:
> On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
> > As others, I am concerned about people being confused when funny-looking
> > trigger names suddenly appearing when you disable all table triggers.
> > 
> > What I ended up doing is to create a user and internal section when
> > displaying disabled triggers:
> > 
> > Disabled user triggers:
> > check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE PROCEDURE 
> > trigf()
> > Disabled internal triggers:
> > "RI_ConstraintTrigger_c_16409" AFTER INSERT ON orders FROM customer 
> > NOT DEF ...
> > "RI_ConstraintTrigger_c_16410" AFTER UPDATE ON orders FROM customer 
> > NOT DEF ...
> > 
> > I kept the "Triggers" section unchanged, showing only user triggers.  I
> > also updated the code to handle 8.3+ servers.
> > 
> > Patch attached.
> 
> Patch applied.

Thanks. It'd have been nice tho, to mention Fabrízio in the commit
message as the patch's author.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Auto-tuning work_mem and maintenance_work_mem

2014-02-24 Thread Bruce Momjian
On Mon, Feb 17, 2014 at 11:14:33AM -0500, Bruce Momjian wrote:
> On Sun, Feb 16, 2014 at 09:26:47PM -0500, Robert Haas wrote:
> > > So, would anyone like me to create patches for any of these items before
> > > we hit 9.4 beta?  We have added autovacuum_work_mem, and increasing
> > > work_mem and maintenance_work_mem by 4x is a simple operation.  Not sure
> > > about the others.  Or do we just keep this all for 9.5?
> > 
> > I don't think anyone objected to increasing the defaults for work_mem
> > and maintenance_work_mem by 4x, and a number of people were in favor,
> > so I think we should go ahead and do that.  If you'd like to do the
> > honors, by all means!
> 
> OK, patch attached.

Patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)

2014-02-24 Thread Pavel Stehule
2014-02-24 16:09 GMT+01:00 Andres Freund :

> On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
> >354246.00 93.0% s_lock
> > /usr/lib/postgresql/9.2/bin/postgres
> > 10503.00  2.8% LWLockRelease
> >  /usr/lib/postgresql/9.2/bin/postgres
> >  8802.00  2.3% LWLockAcquire
> >  /usr/lib/postgresql/9.2/bin/postgres
> >   828.00  0.2% _raw_spin_lock
> > [kernel.kallsyms]
> >   559.00  0.1% _raw_spin_lock_irqsave
> > [kernel.kallsyms]
> >   340.00  0.1% switch_mm
> >  [kernel.kallsyms]
> >   305.00  0.1% poll_schedule_timeout
> >  [kernel.kallsyms]
> >   274.00  0.1% native_write_msr_safe
> >  [kernel.kallsyms]
> >   257.00  0.1% _raw_spin_lock_irq
> > [kernel.kallsyms]
> >   238.00  0.1% apic_timer_interrupt
> > [kernel.kallsyms]
> >   236.00  0.1% __schedule
> > [kernel.kallsyms]
> >   213.00  0.1% HeapTupleSatisfiesMVCC
> >
> > With systemtap I got list of spin locks
> >
> > light weight locks
> > lockname   mode  countavg (time)
> > DynamicLocks  Exclusive   2804   1025
> > DynamicLocks Shared106130
> >ProcArrayLock  Exclusive 63 963551
> >ProcArrayLock Shared 50   4160
> > LockMgrLocks  Exclusive 18159
> >  IndividualLock   Exclusive  2  7
> >
> > There is relative few very long ProcArrayLocks lwlocks
>
> It's odd that there are so many exclusive acquisition
> ProcArrayLocks... A hierarchical profile would be interesting. I'd
> suggest compiling postgres with -fno-omit-frame-pointer and doing a
> profile with perf.
>
>
I had no experience with perf, so maybe it is not what you want

-  19.59%   postmaster  postgres
   - s_lock
  - 55.06% LWLockAcquire
 + 99.84% TransactionIdIsInProgress
  - 44.63% LWLockRelease
 + 99.91% TransactionIdIsInProgress
-  13.84%   postmaster  postgres
   - tas
  - 97.97% s_lock
 + 55.01% LWLockAcquire
 + 44.99% LWLockRelease
  - 1.10% LWLockAcquire
 + 99.89% TransactionIdIsInProgress
  - 0.93% LWLockRelease
 + 99.93% TransactionIdIsInProgress



-  19.59%   postmaster  postgres
   - s_lock
  - 55.06% LWLockAcquire
 - 99.84% TransactionIdIsInProgress
  HeapTupleSatisfiesMVCC
  heap_hot_search_buffer
  index_fetch_heap
  index_getnext
  get_actual_variable_range
  ineq_histogram_selectivity
  scalarineqsel
  mergejoinscansel
  cached_scansel
  initial_cost_mergejoin
  try_mergejoin_path
  sort_inner_and_outer
  add_paths_to_joinrel
  make_join_rel
  make_rels_by_clause_joins
  join_search_one_level
  standard_join_search
  make_rel_from_joinlist
  make_one_rel
  query_planner
  grouping_planner
  subquery_planner
  standard_planner
  planner
  pg_plan_query
  pg_plan_queries
  BuildCachedPlan
  GetCachedPlan
  exec_bind_message
  PostgresMain
  ExitPostmaster
  BackendStartup
  ServerLoop
  PostmasterMain
  startup_hacks

regards

Pavel


Greetings,
>
> Andres Freund
>
> --
>  Andres Freund http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] old warning in docs

2014-02-24 Thread Bruce Momjian
On Sat, Feb 15, 2014 at 07:34:20PM -0500, Bruce Momjian wrote:
> On Sat, Feb 15, 2014 at 07:19:46PM -0500, Robert Haas wrote:
> > On Thu, Feb 13, 2014 at 10:55 AM, Bruce Momjian  wrote:
> > > I have created the attached patch which removes many of the pre-8.0
> > > references, and trims some of the 8.1-8.3 references.  There are
> > > probably some of these that should be kept, but it is easier to show you
> > > all the possibilities and we can trim down the removal list based on
> > > feedback.
> > 
> > The changes to lobj.sgml seem pointless.  I would also vote for
> > keeping xindex.sgml as-is.  The rest of the changes seem like
> > improvements.
> 
> OK, done.  Updated patch attached.

Patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-24 Thread Robert Haas
On Mon, Feb 24, 2014 at 9:48 AM, Andres Freund  wrote:
> On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
>> On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund  
>> wrote:
>
>> +   /*
>> +* XXX: It's impolite to ignore our argument and keep decoding until 
>> the
>> +* current position.
>> +*/
>>
>> Eh, what?
>
> So, the background here is that I was thinking of allowing to specify a
> limit for the number of returned rows. For the sql interface that sounds
> like a good idea. I am just not so sure anymore that allowing to specify
> a LSN as a limit is sufficient. Maybe simply allow to limit the number
> of changes and check everytime a transaction has been replayed?

The last idea there seems like pretty sound, but ...

> It's all trivial codewise, I am just wondering about the interface most
> users would want.

...I can't swear it meets this criterion.

>> +* We misuse the original meaning of SnapshotData's xip and
>> subxip fields
>> +* to make the more fitting for our needs.
>> [...]
>> +* XXX: Do we want extra fields instead of misusing existing
>> ones instead?
>>
>> If we're going to do this, then it surely needs to be documented in
>> snapshot.h.  On the second question, you're not the first hacker to
>> want to abuse the meanings of the existing fields; SnapshotDirty
>> already does it.  It's tempting to think we need a more principled
>> approach to this, like what we've done with Node i.e. typedef enum ...
>> SnapshotType; and then a separate struct definition for each kind, all
>> beginning with SnapshotType type.
>
> Hm, essentially that's what the ->satisfies pointer already is, right?

Sorta, yeah.  But with nodes, you can change the whole struct
definition for each type.

> There's already documentation of the extra fields in snapbuild, but I
> understand you'd rather have them moved?

Yeah, I think it needs to be documented where SnapshotData is defined.
 There may be reason to mention it again, or in more detail,
elsewhere.  But there should be some mention of it there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql should show disabled internal triggers

2014-02-24 Thread Bruce Momjian
On Wed, Feb 12, 2014 at 09:04:45PM -0500, Bruce Momjian wrote:
> As others, I am concerned about people being confused when funny-looking
> trigger names suddenly appearing when you disable all table triggers.
> 
> What I ended up doing is to create a user and internal section when
> displaying disabled triggers:
> 
>   Disabled user triggers:
>   check_update BEFORE UPDATE ON orders FOR EACH ROW EXECUTE PROCEDURE 
> trigf()
>   Disabled internal triggers:
>   "RI_ConstraintTrigger_c_16409" AFTER INSERT ON orders FROM customer 
> NOT DEF ...
>   "RI_ConstraintTrigger_c_16410" AFTER UPDATE ON orders FROM customer 
> NOT DEF ...
>   
> I kept the "Triggers" section unchanged, showing only user triggers.  I
> also updated the code to handle 8.3+ servers.
> 
> Patch attached.

Patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-24 Thread Robert Haas
On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris  wrote:
> On 09/02/14 17:11, Jeremy Harris wrote:
>> On 06/02/14 18:21, Jeff Janes wrote:
>>>   Did you try sorting already-sorted, reverse
>>> sorted, or pipe-organ shaped data sets?  We will also need to test it on
>>> strings.  I usually use md5(random()::text) to generate strings for such
>>> purposes, at least for a first pass.
>>
>>
>> Attached is version 2 of the patch, which fixes the performance on
>> constant-input.
>
> Having beaten on this some more I'm prepared to abandon it.
>
> The wallclock time, for random input, drifts up at larger N
> (compared to the existing code) despite the number of comparisons
> being consistently less.
>
> Run under cachegrind, it takes about N/10 last-level cache misses,
> all for the new item being introduced to the heap.  The existing
> code takes none at all.

Can you explain this further?  This seems like an important clue that
could be useful when trying to optimize this code, but I'm a little
unclear which part of the operation has more cache misses with your
changes and why.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Josh Berkus
On 02/24/2014 07:08 AM, Merlin Moncure wrote:
> On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure  wrote:
>> I still find the phrasing "as jsonb is more efficient for most
>> purposes" to be a bit off  Basically, the text json type is faster for
>> serialization/deserialization pattern (not just document preservation)
>> and jsonb is preferred when storing json and doing repeated
>> subdocument accesses.
> 
> Hm, I'm going to withdraw that.  I had done some testing of simple
> deserialization (cast to text and the like) and noted that jsonb was
> as much as 5x slower.  However, I just did some checking on
> json[b]_populate_recordset though and it's pretty much a wash.

Aside from that, I want our docs to make a strong endorsement of using
jsonb over json for most users.  jsonb will continue to be developed and
improved in the future; it is very unlikely that json will.  Maybe
that's what I should say rather than anything about efficiency.

In other words: having an ambiguous, complex evaluation of json vs.
jsonb does NOT benefit most users.  The result will be some users
choosing json and then pitching fit when they want jsonb in 9.5 and have
to rewrite all their tables.

Mind you, we'll need to fix the slow deserialization, though.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-security on updatable s.b. views

2014-02-24 Thread Dean Rasheed
On 13 February 2014 04:12, Craig Ringer  wrote:
> On 02/11/2014 08:19 PM, Yeb Havinga wrote:
>
>> I compared output of psql -ef of the minirim.sql script posted earlier
>> in http://www.postgresql.org/message-id/52f54927.1040...@gmail.com
>> between v4 and v7.
>>
>> Not everything is ok.
>
>> +psql:/home/m/minirim2.sql:409: ERROR:  attribute 6 has wrong type
>> +DETAIL:  Table has type tsrange, but query expects _confidentialitycode.
>
> This looks like an issue with attribute transformations made when
> applying security barrier quals.
>
>> +psql:/home/m/minirim2.sql:439: connection to server was lost
>
> That's dying with:
>
>> Program received signal SIGABRT, Aborted.
>> 0x003f02c359e9 in __GI_raise (sig=sig@entry=6) at 
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> 56return INLINE_SYSCALL (tgkill, 3, pid, selftid, sig);
>> (gdb) bt
>> #0  0x003f02c359e9 in __GI_raise (sig=sig@entry=6) at 
>> ../nptl/sysdeps/unix/sysv/linux/raise.c:56
>> #1  0x003f02c370f8 in __GI_abort () at abort.c:90
>> #2  0x00758d9d in ExceptionalCondition 
>> (conditionName=conditionName@entry=0x8b7bf0 "!(!bms_is_empty(upper_varnos))",
>> errorType=errorType@entry=0x78e89c "FailedAssertion", 
>> fileName=fileName@entry=0x8b78d7 "subselect.c", 
>> lineNumber=lineNumber@entry=1394) at assert.c:54
>> #3  0x0061de2b in convert_EXISTS_sublink_to_join (root=> out>, sublink=, under_not=under_not@entry=0 '\000', 
>> available_rels=0x117d038)
>> at subselect.c:1394
>> #4  0x0061efa9 in pull_up_sublinks_qual_recurse 
>> (root=root@entry=0x117d058, node=0x117a0f8, 
>> jtlink1=jtlink1@entry=0x7fff6d97f708,
>> available_rels1=available_rels1@entry=0x117d038, 
>> jtlink2=jtlink2@entry=0x0, available_rels2=available_rels2@entry=0x0) at 
>> prepjointree.c:391
>> #5  0x0061f339 in pull_up_sublinks_jointree_recurse 
>> (root=root@entry=0x117d058, jtnode=0x117a040, 
>> relids=relids@entry=0x7fff6d97f758) at prepjointree.c:208
>> #6  0x0062066f in pull_up_sublinks (root=root@entry=0x117d058) at 
>> prepjointree.c:147
>> #7  0x0061967e in subquery_planner (glob=0x10c3fb8, 
>> parse=parse@entry=0x1179af8, parent_root=parent_root@entry=0x1182fd0, 
>> hasRecursion=hasRecursion@entry=0 '\000',
> `
>> #8  0x005fc093 in set_subquery_pathlist (root=root@entry=0x1182fd0, 
>> rel=rel@entry=0x1179370, rti=rti@entry=4, rte=rte@entry=0x1183980) at 
>> allpaths.c:1209
>> #9  0x005fc54e in set_rel_size (root=root@entry=0x1182fd0, 
>> rel=0x1179370, rti=rti@entry=4, rte=0x1183980) at allpaths.c:265
>> #10 0x005fc62e in set_base_rel_sizes (root=root@entry=0x1182fd0) at 
>> allpaths.c:180
>> #11 0x005fd584 in make_one_rel (root=root@entry=0x1182fd0, 
>> joinlist=joinlist@entry=0x1179560) at allpaths.c:138
>> #12 0x0061617a in query_planner (root=root@entry=0x1182fd0, 
>> tlist=tlist@entry=0x11771c8, qp_callback=qp_callback@entry=0x616fd6 
>> ,
>> qp_extra=qp_extra@entry=0x7fff6d97fa00) at planmain.c:236
>> #13 0x006182f2 in grouping_planner (root=root@entry=0x1182fd0, 
>> tuple_fraction=tuple_fraction@entry=0) at planner.c:1280
>> #14 0x00619a11 in subquery_planner (glob=glob@entry=0x10c3fb8, 
>> parse=parse@entry=0x10c3d30, parent_root=parent_root@entry=0x0,
>> hasRecursion=hasRecursion@entry=0 '\000', tuple_fraction=0, 
>> subroot=subroot@entry=0x7fff6d97fb58) at planner.c:574
>> #15 0x00619c1f in standard_planner (parse=0x10c3d30, 
>> cursorOptions=0, boundParams=0x0) at planner.c:211
>> #16 0x00619e35 in planner (parse=parse@entry=0x10c3d30, 
>> cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at 
>> planner.c:139
>> #17 0x0068c64b in pg_plan_query (querytree=0x10c3d30, 
>> cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at 
>> postgres.c:759
>> #18 0x0068c6d3 in pg_plan_queries (querytrees=, 
>> cursorOptions=cursorOptions@entry=0, boundParams=boundParams@entry=0x0) at 
>> postgres.c:818
>> #19 0x0068c932 in exec_simple_query 
>> (query_string=query_string@entry=0x10c2d78 "SELECT * FROM hl7.test;") at 
>> postgres.c:983
>> #20 0x0068e46e in PostgresMain (argc=, 
>> argv=argv@entry=0x10cd1f0, dbname=0x10cd058 "regress", username=> out>) at postgres.c:4003
>> #21 0x006419a7 in BackendRun (port=port@entry=0x10c7e50) at 
>> postmaster.c:4080
>> #22 0x006432c2 in BackendStartup (port=port@entry=0x10c7e50) at 
>> postmaster.c:3769
>> #23 0x00643526 in ServerLoop () at postmaster.c:1580
>> #24 0x0064467c in PostmasterMain (argc=argc@entry=4, 
>> argv=argv@entry=0x10a8750) at postmaster.c:1235
>> #25 0x005d692c in main (argc=4, argv=0x10a8750) at main.c:205
>> (gdb)
>
>
> It's crashing while pulling up the query over "emp" (hl7.employee) and
> "part" (hl7.participation).
>
> Given the simplicity of what the row-security code its self is doing,
> I'm wondering if this is a case that isn'

Re: [HACKERS] [SQL] Comparison semantics of CHAR data type

2014-02-24 Thread Bruce Momjian
On Fri, Feb 14, 2014 at 05:02:26PM -0500, Bruce Momjian wrote:
> On Thu, Feb 13, 2014 at 09:47:01PM -0500, Bruce Momjian wrote:
> > On Wed, Oct 16, 2013 at 02:17:11PM -0400, Bruce Momjian wrote:
> > > > > You can see the UTF8 case is fine because \n is considered greater
> > > > > than space, but in the C locale, where \n is less than space, the
> > > > > false return value shows the problem with
> > > > > internal_bpchar_pattern_compare() trimming the string and first
> > > > > comparing on lengths.  This is exactly the problem you outline, where
> > > > > space trimming assumes everything is less than a space.
> > > > 
> > > > For collations other than C some of those issues that have to do with
> > > > string comparisons might simply be hidden, depending on how strcoll()
> > > > handles inputs off different lengths: If strcoll() applies implicit
> > > > space padding to the shorter value, there won't be any visible
> > > > difference in ordering between bpchar and varchar values.  If strcoll()
> > > > does not apply such space padding, the right-trimming of bpchar values
> > > > causes very similar issues even in a en_US collation.
> > 
> > I have added the attached C comment to explain the problem, and added a
> > TODO item to fix it if we ever break binary upgrading.
> > 
> > Does anyone think this warrants a doc mention?
> 
> I have done some more thinking on this and I found a way to document
> this, which reduces our need to actually fix it some day.  I am afraid
> the behavioral change needed to fix this might break so many
> applications that the fix will never be done, though I will keep the
> TODO item until I get more feedback on that.  Patch attached.

Patch applied.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-02-24 Thread Dean Rasheed
On 20 February 2014 01:48, Florian Pflug  wrote:
> On Jan29, 2014, at 13:45 , Florian Pflug  wrote:
>> In fact, I'm
>> currently leaning towards just forbidding non-strict forward transition
>> function with strict inverses, and adding non-NULL counters to the
>> aggregates that then require them. It's really only the SUM() aggregates
>> that are affected by this, I think.
>
> I finally got around to doing that, and the results aren't too bad. The
> attached patches required that the strictness settings of the forward and
> reverse transition functions agree, and employ exactly the same NULL-skipping
> logic we always had.
>
> The only aggregates seriously affected by that change were SUM(int2) and
> SUM(int4).
>
> The SUM, AVG and STDDEV aggregates which use NumericAggState where
> already mostly prepared for this - all they required were a few adjustments
> to correctly handle the last non-NULL, non-NaN input being removed, and a few
> additional PG_ARGISNULL calls for the inverse transition functions since 
> they're
> now non-strict. I've also modified them to unconditionally allocate the state
> at the first call, instead upon seeing the first non-NULL input, but that 
> isn't
> strictly required. But without that, the state can have three classes of 
> values -
> SQL-NULL, NULL pointer and valid pointer, and that's just confusing...
>
> SUM(int2) and SUM(int4) now simply use the same transition functions as
> AVG(int2) and AVG(int4), which use an int8 array to track the sum of the 
> inputs
> and the number of inputs, plus a new final function int2int4_sum(). 
> Previously,
> they used a single int8 as their state type.
>
> Since I was touching the code anyway, I removed some unnecessary inverse
> transition functions - namely, int8_avg_accum_inv and numeric_avg_accum_inv. 
> These
> are completely identical to their non-avg cousins - the only difference 
> between
> the corresponding forward transition functions is whether they request 
> computation
> of sumX2 (i.e. the sum of squares of the inputs) or not.
>
> I haven't yet updated the docs - it'll do that if and when there's consensus
> about whether this is the way to go or not.
>

I haven't looked at this in any detail yet, but that seems much neater
to me. It seems perfectly sensible that the forward and inverse
transition functions should have the same strictness settings, and
enforcing that keeps the logic simple, as well as hopefully making it
easier to document.

It's a shame that more transition functions cannot be made strict,
when they actually ignore null values, but I think trying to solve
that can be regarded as outside the scope of this patch.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication

2014-02-24 Thread Andres Freund
Hi,

On 2014-02-21 14:15:09 +0100, Christian Kruse wrote:
> +/* --
> + * pgstat_fetch_stat_local_beentry() -
> + *
> + *  Like pgstat_fetch_stat_beentry() but with local addtions (like xid and
> + *  xmin values of the backend)

s/local addtions/locally computed addititions/

> +/* --
> + * LocalPgBackendStatus
> + *
> + * When we build the backend status array, we use LocalPgBackendStatus to be
> + * able to add new values to the struct when needed without adding new fields
> + * to the shared memory. It contains the backend status as a first member.
> + * --
> + */
> +typedef struct LocalPgBackendStatus
> +{
> + /*
> +  * Local version of the backend status entry
> +  */
> + PgBackendStatus backendStatus;
> +
> + /*
> +  * The xid of the current transaction if available, InvalidTransactionId
> +  * if not
> +  */
> + TransactionId backend_xid;
> +
> + /*
> +  * The xmin of the current session if available, InvalidTransactionId
> +  * if not
> +  */
> + TransactionId backend_xmin;
> +} LocalPgBackendStatus;
> +

Those are sentences, so they should include a . at the end.

I think this is ready for committer.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Merlin Moncure
On Mon, Feb 24, 2014 at 9:08 AM, Merlin Moncure  wrote:
> On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure  wrote:
>> I still find the phrasing "as jsonb is more efficient for most
>> purposes" to be a bit off  Basically, the text json type is faster for
>> serialization/deserialization pattern (not just document preservation)
>> and jsonb is preferred when storing json and doing repeated
>>subdocument accesses.
>
> Hm, I'm going to withdraw that.  I had done some testing of simple
> deserialization (cast to text and the like) and noted that jsonb was
> as much as 5x slower.  However, I just did some checking on
> json[b]_populate_recordset though and it's pretty much a wash.

[sorry for noise on this].

Here's the use case coverage as I see it today:

CASE:jsonjsonb hstore
Static document: yes  poor  poor
Precise document:yes  nono
Serialization:   yes  nono
Deserialization: poor***  yes*  no
Repeated Access: poor yes   yes
Manipulation:no   no**  yes
GIST/GIN searching:  no   no**  yes

notes:
* jsonb gets 'yes' for deserialization assuming andrew's 'two level'
deserialization fix goes in (otherwise 'poor').
** jsonb can't do this today, but presumably will be able to soon
*** 'poor' unless json type also gets the deserialization fix, then 'yes'.
 hstore can deserialize hstore format, but will rely on json/jsonb
for deserializing json

'Static document' represents edge cases where the json is opaque to
the database but performance -- for example large map polygons.
'Precise document' represents cases where whitespace or key order is important.

Peter asked upthread how to access the various features.  Well, today,
it basically means a bit of nimble casting to different structures
depending on which particular features are important to you, which
IMNSHO is not bad at all as long as we understand that most people who
rely on jsonb will also need hstore for its searching and operators.
Down the line when hstore and jsonb are more flushed out it's going to
come down to an API style choice.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2014-02-24 Thread Andres Freund
Hi,

I took a quick peek at this, and noticed the following things:
* I am pretty sure this patch doesn't compile anymore after the latest
  set of releases.
* This definitely should include isolationtester tests actually
  performing concurrent ALTER TABLEs. All that's currently there is
  tests that the locklevel isn't too high, but not that it actually works.
* So far no DDL uses ShareRowExclusiveLocks, why do so now? Not sure if
  there aren't relevant cases for with foreign key checks (which afair
  *do* acquire SRE locks).
* Why is AddConstraint "so complicated" when it's all used SRE locks?
* Are you sure AlterConstraint is generally safe without an AEL? It
  should be safe to change whether something is deferred, but not
  necessarily whether it's deferrable?
* Why does ChangeOwner need AEL?
* You changed several places to take out lower locks, which in itself is
  fine, but doesn't that lead to lock upgrade hazard if a later step
  acquires a stronger lock? Or do we take out a strong enough lock from
  the beginning.
* There's no explanation why the EOXact TupleDesc stuff is needed?

That's it for now,

Andres

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PoC: Partial sort

2014-02-24 Thread Robert Haas
On Wed, Feb 19, 2014 at 1:39 PM, Marti Raudsepp  wrote:
> On Wed, Feb 12, 2014 at 11:54 PM, Marti Raudsepp  wrote:
>> With partial-sort-basic-1 and this fix on the same test suite, the
>> planner overhead is now a more manageable 0.5% to 1.3%; one test is
>> faster by 0.5%.
>
> Ping, Robert or anyone, does this overhead seem bearable or is that
> still too much?
>
> Do these numbers look conclusive enough or should I run more tests?

Tom should really be the one to comment on this, I think.  I read
through the patch quickly and it looks much less scary than the early
versions, but it's not obvious to me whether the remaining overhead is
enough to worry about.  I'd need to spend more time studying it to
form a really sound opinion on that topic, and unfortunately I don't
have that time right now.

I think it'd be interesting to try to determine specifically where
that overhead is coming from.  Pick the test case where it's the worst
(1.3%) and do a "perf" with and without the patch and look at the
difference in the call graph.  It's possible we could have changes on
that order of magnitude just from more or less fortuitous code layout
decisions as code shifts around, but it's also possible that there's a
real effect there we should think harder about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-24 Thread Bruce Momjian
On Wed, Feb 19, 2014 at 06:37:42PM -0500, Tom Lane wrote:
> Emre Hasegeli  writes:
> > [ cites bug #5705 ]
> 
> Hm.  I had forgotten how thoroughly broken btree_gist's inet and cidr
> opclasses are.  There was discussion at the time of just ripping them
> out despite the compatibility hit.  We didn't do it, but if we had
> then life would be simpler now.
> 
> Perhaps it would be acceptable to drop the btree_gist implementation
> and teach pg_upgrade to refuse to upgrade if the old database contains
> any such indexes.  I'm not sure that solves the problem, though, because
> I think pg_upgrade will still fail if the opclass is in the old database
> even though unused (because you'll get the complaint about multiple
> default opclasses anyway).  The only simple way to avoid that is to not
> have btree_gist loaded at all in the old DB, and I doubt that's an
> acceptable upgrade restriction.  It would require dropping indexes of
> the other types supported by btree_gist, which would be a pretty hard
> sell since those aren't broken.
> 
> Really what we'd need here is for btree_gist to be upgraded to a version
> that doesn't define gist_inet_ops (or at least doesn't mark it as default)
> *before* pg_upgrading to a server version containing the proposed new
> implementation.  Not sure how workable that is.  Could we get away with
> having pg_upgrade unset the default flag in the old database?
> (Ick, but there are no very good alternatives here ...)

pg_upgrade has _never_ modified the old cluster, and I am hesitant to do
that now.  Can we make the changes in the new cluster, or in pg_dump
when in binary upgrade mode?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Shigeru Hanada
2014-02-23 22:24 GMT+09:00 Kohei KaiGai :
> Folks,
>
> Let me remind the custom-scan patches; that is a basis feature of
> remote join of postgres_fdw, cache-only scan, (upcoming) GPU
> acceleration feature or various alternative ways to scan/join relations.
> Unfortunately, small amount of discussion we could have in this commit
> fest, even though Hanada-san volunteered to move the patches into
> "ready for committer" state at the CF-Nov.

I found some cosmetic flaw and .gitignore leak in the patches.  Please
see attached a patch for details.

-- 
Shigeru HANADA


custom_scan_cosmetic_fix.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-02-24 Thread Shigeru Hanada
Hi Kaigai-san,

Sorry to leave the thread for a while.

2014-02-23 22:24 GMT+09:00 Kohei KaiGai :
> (1) Interface to add alternative paths in addition to built-in join paths
>
> This patch adds "add_join_path_hook" on add_paths_to_joinrel to allow
> extensions to provide alternative scan path in addition to the built-in
> join paths. Custom-scan path being added is assumed to perform to scan
> on a (virtual) relation that is a result set of joining relations.
> My concern is its arguments to be pushed. This hook is declared as follows:
>
> /* Hook for plugins to add custom join path, in addition to default ones */
> typedef void (*add_join_path_hook_type)(PlannerInfo *root,
> RelOptInfo *joinrel,
> RelOptInfo *outerrel,
> RelOptInfo *innerrel,
> JoinType jointype,
> SpecialJoinInfo *sjinfo,
> List *restrictlist,
> List *mergeclause_list,
> SemiAntiJoinFactors *semifactors,
> Relids param_source_rels,
> Relids extra_lateral_rels);
> extern PGDLLIMPORT add_join_path_hook_type add_join_path_hook;
>
> Likely, its arguments upper than restrictlist should be informed to 
> extensions,
> because these are also arguments of add_paths_to_joinrel().
> However, I'm not 100% certain how about other arguments should be informed.
> Probably, it makes sense to inform param_source_rels and extra_lateral_rels
> to check whether the path is sensible for parameterized paths.
> On the other hand, I doubt whether mergeclause_list is usuful to deliver.
> (It may make sense if someone tries to implement their own merge-join
> implementation??)
>
> I'd like to seem idea to improve the current interface specification.

I've read the code path to add custom join again, and felt that
providing semifactors seems not necessary for the first cut, because
it is used in only initial_cost_nestloop (final_cost_nestloop receives
semifactors but it is not used there), and external module would not
become so smart before 9.5 development cycle.  It seems enough complex
to postpone determinig  whether it's essential for add_join_path_hook.
 Do you have any concrete use case for the parameter?

mergeclause_list and param_source_rels seem little easier to use, but
anyway it should be documented how to use those parameters.

IMHO, minimal interface seems better than fully-fledged but half-baked
one, especially in the first-cut.

-- 
Shigeru HANADA


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-24 Thread Robert Haas
On Wed, Feb 19, 2014 at 8:27 PM, Michael Paquier
 wrote:
> On Thu, Feb 20, 2014 at 8:55 AM, Andres Freund  wrote:
>> On 2014-02-19 12:47:40 -0500, Robert Haas wrote:
>>> On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
>>>  wrote:
>>> >> Yes, that's a good precedent in multiple ways.
>>> > Here are updated patches to use pg_lsn instead of pglsn...
>>>
>>> OK, so I think this stuff is all committed now, with assorted changes.
>>>  Thanks for your work on this.
>>
>> cool, thanks you two.
>>
>> I wonder if pg_stat_replication shouldn't be updated to use it as well?
>> SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows
>> that as names that are possible candidates for conversion.
> I was sure to have forgotten some views or functions in the previous
> patch... Please find attached a patch making pg_stat_replication use
> pg_lsn instead of the existing text fields.
> Regards,

Committed.  Do we want to do anything about pageinspect?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show relation and tuple infos of a lock to acquire

2014-02-24 Thread Christian Kruse
Hi,

attached you will find a new version of the patch, removing the ctid
text but leaving the ctid itself in the message.

On 23/02/14 11:14, Amit Kapila wrote:
> >> In general, why I am suggesting to restrict display of newly added
> >> context for the case it is added to ensure that it doesn't get started
> >> displaying in other cases where it might make sense or might not
> >> make sense.
> >
> > Anything failing while inside an XactLockTableWait() et al. will benefit
> > from the context. In which scenarios could that be unneccessary?
> 
> This path is quite deep, so I have not verified that whether
> this new context can make sense for all error cases.
> For example, in below path (error message), it doesn't seem to
> make any sense (I have tried to generate it through debugger,
> actual message will vary).
> 
> XactLockTableWait->SubTransGetParent->SimpleLruReadPage_ReadOnly->SimpleLruReadPage->SlruReportIOError
> 
> ERROR:  could not access status of transaction 927
> DETAIL:  Could not open file "pg_subtrans/": No error.
> CONTEXT:  relation "public"."t1"
> tuple ctid (0,2)

To be honest, I don't like the idea of setting up this error context
only for log_lock_wait messages. This sounds unnecessary complex to me
and I think that in the few cases where this message doesn't add a
value (and thus is useless) don't justify such complexity.

> I have not checked that, but the reason I mentioned about database name
> is due to display of database oid in similar message, please see the
> message below. I think in below context we get it from lock tag and
> I think for the patch case also, it might be better to display, but not
> a mandatory thing. Consider it as a suggestion which if you also feel
> good, then do it, else ignore it.
> 
> LOG:  process 4656 still waiting for ExclusiveLock on tuple (0,1) of relation 
> 57
> 513 of database 12045 after 1046.000 ms

After thinking a bit about this I added the database name to the
context message, see attached patch.

> >> > Currently I simply display the whole tuple with truncating long
> >> > fields. This is plain easy, no distinction necessary. I did some code
> >> > reading and it seems somewhat complex to get the PK columns and it
> >> > seems that we need another lock for this, too - maybe not the best
> >> > idea when we are already in locking trouble, do you disagree?
> >>
> >> I don't think you need to take another lock for this, it must already
> >> have appropriate lock by that time. There should not be any problem
> >> in calling relationHasPrimaryKey except that currently it is static, you
> >> need to expose it.
> >
> > Other locations that deal with similar things (notably
> > ExecBuildSlotValueDescription()) doesn't either. I don't think this
> > patch should introduce it then.
> 
> Displaying whole tuple doesn't seem to be the most right way
> to get debug information and especially in this case we are
> already displaying tuple offset(ctid) which is unique identity
> to identify a tuple. It seems to me that it is sufficient to display
> unique value of tuple if present.
> I understand that there is no clear issue here, so may be if others also
> share their opinion then it will be quite easy to take a call.

That would be nice. I didn't change it, yet.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index a771ccb..a04414e 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -107,6 +107,8 @@ static TransactionId MultiXactIdGetUpdateXid(TransactionId xmax,
 		uint16 t_infomask);
 static void MultiXactIdWait(MultiXactId multi, MultiXactStatus status,
 int *remaining, uint16 infomask);
+static void MultiXactIdWaitWithErrorContext(Relation rel, HeapTuple tup, MultiXactId multi,
+	MultiXactStatus status, int *remaining, uint16 infomask);
 static bool ConditionalMultiXactIdWait(MultiXactId multi,
 		   MultiXactStatus status, int *remaining,
 		   uint16 infomask);
@@ -2703,8 +2705,8 @@ l1:
 		if (infomask & HEAP_XMAX_IS_MULTI)
 		{
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, MultiXactStatusUpdate,
-			NULL, infomask);
+			MultiXactIdWaitWithErrorContext(relation, &tp, (MultiXactId) xwait,
+	  MultiXactStatusUpdate, NULL, infomask);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -2730,7 +2732,7 @@ l1:
 		else
 		{
 			/* wait for regular transaction to end */
-			XactLockTableWait(xwait);
+			XactLockTableWaitWithInfo(relation, &tp, xwait);
 			LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
 
 			/*
@@ -3255,8 +3257,8 @@ l2:
 			int			remain;
 
 			/* wait for multixact */
-			MultiXactIdWait((MultiXactId) xwait, mxact_status, &remain,
-			infomask);
+			MultiXactIdWaitWithErrorContext(relation, &oldtup,
+	   (MultiXactId) xwait, mxact_status, &remain, infomask);
 			

Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)

2014-02-24 Thread Andres Freund
On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
>354246.00 93.0% s_lock
> /usr/lib/postgresql/9.2/bin/postgres
> 10503.00  2.8% LWLockRelease
>  /usr/lib/postgresql/9.2/bin/postgres
>  8802.00  2.3% LWLockAcquire
>  /usr/lib/postgresql/9.2/bin/postgres
>   828.00  0.2% _raw_spin_lock
> [kernel.kallsyms]
>   559.00  0.1% _raw_spin_lock_irqsave
> [kernel.kallsyms]
>   340.00  0.1% switch_mm
>  [kernel.kallsyms]
>   305.00  0.1% poll_schedule_timeout
>  [kernel.kallsyms]
>   274.00  0.1% native_write_msr_safe
>  [kernel.kallsyms]
>   257.00  0.1% _raw_spin_lock_irq
> [kernel.kallsyms]
>   238.00  0.1% apic_timer_interrupt
> [kernel.kallsyms]
>   236.00  0.1% __schedule
> [kernel.kallsyms]
>   213.00  0.1% HeapTupleSatisfiesMVCC
> 
> With systemtap I got list of spin locks
> 
> light weight locks
> lockname   mode  countavg (time)
> DynamicLocks  Exclusive   2804   1025
> DynamicLocks Shared106130
>ProcArrayLock  Exclusive 63 963551
>ProcArrayLock Shared 50   4160
> LockMgrLocks  Exclusive 18159
>  IndividualLock   Exclusive  2  7
> 
> There is relative few very long ProcArrayLocks lwlocks

It's odd that there are so many exclusive acquisition
ProcArrayLocks... A hierarchical profile would be interesting. I'd
suggest compiling postgres with -fno-omit-frame-pointer and doing a
profile with perf.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Merlin Moncure
On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure  wrote:
> I still find the phrasing "as jsonb is more efficient for most
> purposes" to be a bit off  Basically, the text json type is faster for
> serialization/deserialization pattern (not just document preservation)
> and jsonb is preferred when storing json and doing repeated
>subdocument accesses.

Hm, I'm going to withdraw that.  I had done some testing of simple
deserialization (cast to text and the like) and noted that jsonb was
as much as 5x slower.  However, I just did some checking on
json[b]_populate_recordset though and it's pretty much a wash.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] often PREPARE can generate high load (and sometimes minutes long unavailability)

2014-02-24 Thread Pavel Stehule
2014-02-23 21:32 GMT+01:00 Andres Freund :

> Hi,
>
> On 2014-02-23 20:04:39 +0100, Pavel Stehule wrote:
> > There is relative few very long ProcArrayLocks lwlocks
> >
> > This issue is very pathologic on fast computers with more than 8 CPU.
> This
> > issue was detected after migration from 8.4 to 9.2. (but tested with same
> > result on 9.0)  I see it on devel 9.4 today actualized.
> >
> > When I moved PREPARE from cycle, then described issues is gone. But when
> I
> > use a EXECUTE IMMEDIATELY, then the issue is back. So it looks it is
> > related to planner, ...
>
> In addition to the issue Jeff mentioned, I'd suggest trying the same
> workload with repeatable read. That can do *wonders* because of the
> reduced number of snapshots.
>
>
I tested it, and it doesn't help.

Is there some patch, that I can test related to this issue?


> Greetings,
>
> Andres Freund
>
> --
>  Andres Freund http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>


Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-24 Thread Andres Freund
On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
> On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund  wrote:

> +   /*
> +* XXX: It's impolite to ignore our argument and keep decoding until 
> the
> +* current position.
> +*/
> 
> Eh, what?

So, the background here is that I was thinking of allowing to specify a
limit for the number of returned rows. For the sql interface that sounds
like a good idea. I am just not so sure anymore that allowing to specify
a LSN as a limit is sufficient. Maybe simply allow to limit the number
of changes and check everytime a transaction has been replayed?

It's all trivial codewise, I am just wondering about the interface most
users would want.

> +* We misuse the original meaning of SnapshotData's xip and
> subxip fields
> +* to make the more fitting for our needs.
> [...]
> +* XXX: Do we want extra fields instead of misusing existing
> ones instead?
> 
> If we're going to do this, then it surely needs to be documented in
> snapshot.h.  On the second question, you're not the first hacker to
> want to abuse the meanings of the existing fields; SnapshotDirty
> already does it.  It's tempting to think we need a more principled
> approach to this, like what we've done with Node i.e. typedef enum ...
> SnapshotType; and then a separate struct definition for each kind, all
> beginning with SnapshotType type.

Hm, essentially that's what the ->satisfies pointer already is, right?

There's already documentation of the extra fields in snapbuild, but I
understand you'd rather have them moved?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-24 Thread Robert Haas
On Wed, Feb 19, 2014 at 8:06 PM, Michael Paquier
 wrote:
> On Thu, Feb 20, 2014 at 9:43 AM, Michael Paquier
>  wrote:
>> On Thu, Feb 20, 2014 at 2:47 AM, Robert Haas  wrote:
>>> On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
>>>  wrote:
 On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut  wrote:
> On 2/5/14, 1:31 PM, Robert Haas wrote:
>> On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut  wrote:
>>> Perhaps this type should be called pglsn, since it's an
>>> implementation-specific detail and not a universal concept like int,
>>> point, or uuid.
>>
>> If we're going to do that, I suggest pg_lsn rather than pglsn.  We
>> already have pg_node_tree, so using underscores for separation would
>> be more consistent.
>
> Yes, that's a good precedent in multiple ways.
 Here are updated patches to use pg_lsn instead of pglsn...
>>>
>>> OK, so I think this stuff is all committed now, with assorted changes.
>>>  Thanks for your work on this.
>> Thanks!
>> Oops, it looks like I am coming after the battle (time difference does
>> not help). I'll be more careful to test such patches on 32b platforms
>> as well in the future.
> After re-reading the code, I found two incorrect comments in the new
> regression tests. Patch fixing them is attached.

Thanks, committed.  But I left out the whitespace change you included.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Merlin Moncure
On Mon, Feb 24, 2014 at 12:20 AM, Josh Berkus  wrote:
> All,
>
> Here's a draft cleanup on the JSON section of the Datatype docs.  Since
> there's been a bunch of incremental patches on this, I just did a diff
> against HEAD.
>
> I looked over json-functions a bit, but am not clear on what needs to
> change there; the docs are pretty similar to other sections of
> Functions, and if they're complex it's because of the sheer number of
> JSON-related functions.
>
> Anyway, this version of datatypes introduces a comparison table, which I
> think should make things a bit clearer for users.

I still find the phrasing "as jsonb is more efficient for most
purposes" to be a bit off  Basically, the text json type is faster for
serialization/deserialization pattern (not just document preservation)
and jsonb is preferred when storing json and doing repeated
subdocument accesses.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Alvaro Herrera
Andres Freund escribió:
> On 2014-02-24 17:53:31 +0900, Michael Paquier wrote:
> > On Sun, Feb 23, 2014 at 6:09 AM, Alvaro Herrera 
> > wrote:
> > 
> > > Michael Paquier escribió:
> > > > Hi all,
> > > >
> > > > Please find attached a patch to dump pageinspect to 1.2. This
> > > > simply changes page_header to use the new internal datatype
> > > > pg_lsn instead of text.
> > >
> > > Uhm.  Does this crash if you pg_upgrade a server that has 1.1
> > > installed?
> > >
> > Oops yes you're right., it is obviously broken. Just re-thinking
> > about that, dumping this module just to change an argument type does
> > not seem to be worth the code complication. Thoughts?
> 
> It seem simple to support both, old and new type. page_header() (which
> IIRC is the only thing returning an LSN) likely uses
> get_call_result_type(). So you can just check what it's
> expecting. Either support both types or error out if it's the old
> extension version.

Yeah, erroring out seems good enough.  Particularly if you add a hint
saying "please upgrade the extension".

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Oleg Bartunov
Yes, the repository you mentioned is the last version of our
development. It contains various fixes of issues by Andres, but we are
waiting Andrew, who is working on jsonb stuff.

On Mon, Feb 24, 2014 at 5:34 PM, Tomas Vondra  wrote:
> On 7.2.2014 00:47, Andrew Dunstan wrote:
>>
>> On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
>>>  Should I make new version of patch? Right now it's placed on github.
>>> May be Andrew wants to change something?
>>>
>>
>>
>> Attached are updated patches.
>>
>> Apart from the things Teodor has fixed, this includes
>>
>>  * switching to using text representation in jsonb send/recv
>>  * implementation of jsonb_array_elements_text that we need now we have
>>json_array_elements_text
>>  * some code fixes requested in code reviews, plus some other tidying
>>and refactoring.
>>
>> cheers
>
> Hi,
>
> I'm slightly uncertain if this is the current version of the patches, or
> whether I should look at
> https://github.com/feodor/postgres/tree/jsonb_and_hstore which contains
> slightly modified code.
>
> Anyway, the only thing I noticed in the v10 version so far is slight
> difference in naming - while we have json_to_hstore/hstore_to_json, we
> have jsonb2hstore/hstore2jsonb. I propose to change this to
> jsonb_to_hstore/hstore_to_jsonb.
>
> May not be needed if the implicit casts go through.
>
> regards
> Tomas
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-24 Thread Tomas Vondra
On 7.2.2014 00:47, Andrew Dunstan wrote:
> 
> On 02/05/2014 10:36 AM, Teodor Sigaev wrote:
>>  Should I make new version of patch? Right now it's placed on github.
>> May be Andrew wants to change something?
>>
> 
> 
> Attached are updated patches.
> 
> Apart from the things Teodor has fixed, this includes
> 
>  * switching to using text representation in jsonb send/recv
>  * implementation of jsonb_array_elements_text that we need now we have
>json_array_elements_text
>  * some code fixes requested in code reviews, plus some other tidying
>and refactoring.
> 
> cheers

Hi,

I'm slightly uncertain if this is the current version of the patches, or
whether I should look at
https://github.com/feodor/postgres/tree/jsonb_and_hstore which contains
slightly modified code.

Anyway, the only thing I noticed in the v10 version so far is slight
difference in naming - while we have json_to_hstore/hstore_to_json, we
have jsonb2hstore/hstore2jsonb. I propose to change this to
jsonb_to_hstore/hstore_to_jsonb.

May not be needed if the implicit casts go through.

regards
Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [review] PostgreSQL Service on Windows does not start if data directory given is relative path

2014-02-24 Thread MauMau

From: "Rajeev rastogi" 
Please find the attached modified patch.

Thanks, reviewed and made this patch ready for committer.

Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Andres Freund
On 2014-02-24 17:53:31 +0900, Michael Paquier wrote:
> On Sun, Feb 23, 2014 at 6:09 AM, Alvaro Herrera 
> wrote:
> 
> > Michael Paquier escribió:
> > > Hi all,
> > >
> > > Please find attached a patch to dump pageinspect to 1.2. This simply
> > > changes page_header to use the new internal datatype pg_lsn instead of
> > text.
> >
> > Uhm.  Does this crash if you pg_upgrade a server that has 1.1 installed?
> >
> Oops yes you're right., it is obviously broken. Just re-thinking about
> that, dumping this module just to change an argument type does not seem to
> be worth the code complication. Thoughts?

It seem simple to support both, old and new type. page_header() (which
IIRC is the only thing returning an LSN) likely uses
get_call_result_type(). So you can just check what it's
expecting. Either support both types or error out if it's the old
extension version.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dump pageinspect to 1.2: page_header with pg_lsn datatype

2014-02-24 Thread Michael Paquier
On Sun, Feb 23, 2014 at 6:09 AM, Alvaro Herrera wrote:

> Michael Paquier escribió:
> > Hi all,
> >
> > Please find attached a patch to dump pageinspect to 1.2. This simply
> > changes page_header to use the new internal datatype pg_lsn instead of
> text.
>
> Uhm.  Does this crash if you pg_upgrade a server that has 1.1 installed?
>
Oops yes you're right., it is obviously broken. Just re-thinking about
that, dumping this module just to change an argument type does not seem to
be worth the code complication. Thoughts?
Regards,
-- 
Michael