Re: [HACKERS] KNN searches support for SP-GiST [GSOC'14]

2014-08-20 Thread Heikki Linnakangas

On 08/20/2014 03:35 AM, Vladislav Sterzhanov wrote:

Hi there, pg-Hackers!
Here I go with the patch which brings in the possibility to perform
nearest-neighbour searches on SP-GiST indexes (as of now it includes
implementations for quad and kd trees). Pre-reviewed by my GSoC mentor
Alexander Korotkov. A sample benchmarking script is included in the
attachment (it dumps the current geonames archive and runs several searches
on the (latitude, longitude) points), which demonstrates the dramatic
improvements over plain searches and sorting.


Cool!

I think this needs a high-level description in access/spgist/README on 
how this works. You can probably copy-paste the similar description from 
gist's README, because it's the same design. If I understand correctly, 
the support functions return distances between the nodes and the query, 
and the SP-GiST code uses those distances to return the rows in order. 
An important detail there is that the distance returned for an inner 
node is a lower bound of the distance of any node in that branch.


A binary heap would be a better data structure than a red-black tree for 
the queue of nodes to visit/return. I know that GiST also uses a 
red-black tree for the same thing, but that's no reason not to do better 
here. There is a binary heap implementation in the backend too (see 
src/backend/lib/binaryheap.c), so it's not any more difficult to use.
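
Roughly, the queue could be built on binaryheap like this; the item struct
and its fields are only an illustration, not what the patch actually uses:

    #include "postgres.h"
    #include "lib/binaryheap.h"

    /* Hypothetical queue entry: an inner node still to visit, or a heap
     * tuple ready to be returned, tagged with its distance to the query.
     * For inner nodes the distance is the lower bound discussed above. */
    typedef struct SpGistSearchItem
    {
        double      distance;
        bool        isLeaf;
        /* block number / heap TID etc. would follow here */
    } SpGistSearchItem;

    /* binaryheap keeps the element that compares "largest" on top, so the
     * comparison is inverted to pop the smallest distance first.  The heap
     * would be created with binaryheap_allocate(capacity, searchitem_cmp,
     * NULL); note the capacity is fixed, so a real patch has to grow or
     * rebuild the heap as items are added. */
    static int
    searchitem_cmp(Datum a, Datum b, void *arg)
    {
        SpGistSearchItem *ia = (SpGistSearchItem *) DatumGetPointer(a);
        SpGistSearchItem *ib = (SpGistSearchItem *) DatumGetPointer(b);

        if (ia->distance < ib->distance)
            return 1;
        if (ia->distance > ib->distance)
            return -1;
        return 0;
    }

    /* Best-first traversal: pop the nearest pending item; leaves are
     * returned to the caller, inner nodes are expanded and their children
     * pushed with their lower-bound distances. */
    static void
    knn_scan_sketch(binaryheap *queue)
    {
        while (!binaryheap_empty(queue))
        {
            SpGistSearchItem *item = (SpGistSearchItem *)
                DatumGetPointer(binaryheap_remove_first(queue));

            if (item->isLeaf)
            {
                /* emit the tuple: its distance is now exact and minimal */
            }
            else
            {
                /* compute children's lower bounds, binaryheap_add() each */
            }
        }
    }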


Does the code handle ties correctly? The distance function is just an 
approximation, so it's possible that there are two items with same 
distance, but are not equal according to the ordering operator. As an 
extreme example, if you hack the distance function to always return 0.0, 
do you still get correct results (albeit slowly)?


The new suppLen field in spgConfigOut is not mentioned in the docs. Why 
not support pass-by-value supplementary values?


A couple of quick comments on the code:


@@ -137,8 +138,8 @@ spg_quad_picksplit(PG_FUNCTION_ARGS)
 {
spgPickSplitIn *in = (spgPickSplitIn *) PG_GETARG_POINTER(0);
spgPickSplitOut *out = (spgPickSplitOut *) PG_GETARG_POINTER(1);
-   int i;
-   Point  *centroid;
+   int i;
+   Point *centroid;

 #ifdef USE_MEDIAN
/* Use the median values of x and y as the centroid point */


Please remove all unnecessary whitespace changes like this.


@@ -213,14 +215,19 @@ spg_quad_inner_consistent(PG_FUNCTION_ARGS)
}

Assert(in->nNodes == 4);
+   
+   if (in->norderbys > 0)
+   {
+   out->distances = palloc(in->nNodes * sizeof (double *));
+   }


I think that should be "sizeof(double)".


+   *distances = (double *) malloc(norderbys * sizeof (double *));


No mallocs allowed in the backend, should be palloc.
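
Putting those two points together, the allocation would end up as something
like this (variable names taken from the quoted line):

    *distances = (double *) palloc(norderbys * sizeof(double));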

- Heikki




[HACKERS] August commitfest

2014-08-20 Thread Heikki Linnakangas
Per the schedule agreed on in the Developer Meeting in Ottawa, the 
August commitfest began five days ago.


We don't seem to have a commitfest manager, and the commitfest has so 
far failed to manage itself. To get things going, I'm picking up the 
Commitfest Manager Mace I found from behind the dumpster now.


- Heikki




Re: [HACKERS] August commitfest

2014-08-20 Thread David Rowley
On Wed, Aug 20, 2014 at 7:17 PM, Heikki Linnakangas  wrote:

> Per the schedule agreed on in the Developer Meeting in Ottawa, the August
> commitfest began five days ago.
>
> We don't seem to have a commitfest manager, and the commitfest has so far
> failed to manage itself. To get things going, I'm picking up the Commitfest
> Manager Mace I found from behind the dumpster now.
>
>
Thanks Heikki,
I was wondering if you knew what the plan is for the remaining ready for
committer patches from the June commitfest?

Regards

David Rowley


Re: [HACKERS] New Model For Role Attributes and Fine Grained Permssions

2014-08-20 Thread Heikki Linnakangas

On 08/19/2014 04:27 AM, Brightwell, Adam wrote:

Hi All,

This is a "proof-of-concept" patch for a new model around role attributes
and fine grained permissions meant to alleviate the current over dependence
on superuser.


Hmm. How does this get us any closer to fine-grained permissions? I 
guess we need this, so that you can grant/revoke the permissions, but I 
thought the hard part is defining what the fine-grained permissions are, 
in a way that you can't escalate yourself to full superuser through any 
of them.


The new syntax could be useful even if we never get around to do 
anything about current superuser checks, so I'm going to give this a 
quick review just on its own merits.


Please add documentation. That's certainly required before this can be 
committed, but it'll make reviewing the syntax much easier too.



This is not yet complete and only serves as a proof-of-concept at this
point, but I wanted to share it in the hopes of receiving comments,
suggestions and general feedback.

The general gist of this patch is as follows:

* New system catalog "pg_permission" that relates role id's to permissions.

* New syntax.
   - GRANT <permission> TO <role>
   - REVOKE <permission> FROM <role>
where <permission> is one of an enumerated set of values, such as "CREATE ROLE"
or "CREATE DATABASE".


The syntax for GRANT/REVOKE is quite complicated already. Do we want to 
overload it even more? Also keep in mind that the SQL standard specifies 
GRANT/REVOKE, so we have to be careful to not clash with the SQL 
standard syntax, or any conceivable future syntax the SQL committee 
might add to it (although we have plenty of PostgreSQL extensions in it 
already). For example, this is currently legal:


GRANT CREATE ALL ON TABLESPACE <tablespace> TO <role>

Is that too close to the new syntax, to cause confusion? Does the new 
syntax work for all the more fine-grained permissions you have in mind? 
I'm particularly worried of conflicts with this existing syntax:


GRANT role_name [, ...] TO role_name [, ...] [ WITH ADMIN OPTION ]

- Heikki




Re: [HACKERS] August commitfest

2014-08-20 Thread Heikki Linnakangas

On 08/20/2014 10:42 AM, David Rowley wrote:

On Wed, Aug 20, 2014 at 7:17 PM, Heikki Linnakangas 
wrote:



Per the schedule agreed on in the Developer Meeting in Ottawa, the August
commitfest began five days ago.

We don't seem to have a commitfest manager, and the commitfest has so far
failed to manage itself. To get things going, I'm picking up the Commitfest
Manager Mace I found from behind the dumpster now.



Thanks Heikki,
I was wondering if you knew what the plan is for the remaining ready for
committer patches from the June commitfest?


There is no plan. I just moved them to the August commitfest, hopefully 
they will actually get committed or rejected by a committer.


I'm doing a quick pass through the list of patches now, and after that 
I'll start nagging people to get stuff reviewed/committed. (There are a 
bunch of patches that are in Waiting on author state, and no activity 
for months; I'm marking them directly as Returned with feedback.)


- Heikki





Re: [HACKERS] [PATCH] introduce XLogLockBlockRangeForCleanup()

2014-08-20 Thread Heikki Linnakangas

On 07/07/2014 11:46 AM, Abhijit Menon-Sen wrote:

At 2014-07-07 14:02:15 +0530, amit.khande...@enterprisedb.com wrote:


Other than some minor comments as mentioned below, I don't have any
more issues, it looks all good.


Thank you. (Updated patch attached.)


I don't think the new GetBufferWithoutRelcache function is in line with 
the existing ReadBuffer API. I think it would be better to add a new 
ReadBufferMode - RBM_CACHE_ONLY? - that only returns the buffer if it's 
already in cache, and InvalidBuffer otherwise.
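
Just to illustrate the idea -- RBM_CACHE_ONLY doesn't exist yet, and the redo
code would presumably go through XLogReadBufferExtended rather than plain
ReadBufferExtended, but a caller would look roughly like:

    buf = ReadBufferExtended(rel, MAIN_FORKNUM, blkno, RBM_CACHE_ONLY, NULL);
    if (BufferIsValid(buf))
    {
        LockBufferForCleanup(buf);
        /* ... do the cleanup work ... */
        UnlockReleaseBuffer(buf);
    }
    /* else the block isn't in shared buffers, so there's nothing to lock */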


- Heikki





Re: [HACKERS] August commitfest

2014-08-20 Thread Heikki Linnakangas
I have cleaned up the Commitfest of patches that have been in Waiting 
for Author state for weeks or more. I believe the list is now an 
accurate representation of the actual state of the patches.


Patch authors
-

Please check if a patch of yours is in Waiting for Author state. It 
means that you need to amend the patch and post a new version. If you do 
nothing, the patch will be marked returned with feedback.


Also, please pick a patch or two, submitted by other people, and review 
them.


Reviewers
-

Please review patches. Every little helps.

- Heikki





Re: [HACKERS] [PATCH] Fix search_path default value separator.

2014-08-20 Thread Heikki Linnakangas

On 08/15/2014 04:58 PM, Bruce Momjian wrote:

On Fri, Aug 15, 2014 at 10:40:59PM +0900, Fujii Masao wrote:

Heh.  I'm not particularly averse to changing this, but I guess I
don't see any particular benefit of changing it either.  Either comma
or comma-space is a legal separator, so why worry about it?


This change might cause me to update the existing documents (which
I need to maintain in my company) including the output example of
default search_path. If the change is for the improvement, I'd be
happy to do that, but it seems not.

Also there might be some PostgreSQL extensions whose regression tests show
the default search_path. This patch would make their developers spend time
updating those tests. I'm sure that they would be fine with that if it were
for an improvement. But it's not.


Well, we rename GUCs often for clarity too, so I don't see adjusting
white-space as something to avoid either.  It is always about short-term
adjustments vs. long-term clarity.


I think this is an improvement, although a really minor one. Although 
Robert & Fujii questioned if this is worth it, I didn't hear anyone 
objecting strongly, so committed.


- Heikki





Re: [HACKERS] better atomics - v0.5

2014-08-20 Thread Heikki Linnakangas

Hi Andres,

Are you planning to continue working on this? Summarizing the discussion 
so far:


* Robert listed a bunch of little cleanup tasks 
(http://www.postgresql.org/message-id/ca+tgmozshvvqjakul6p3kdvzvpibtgkzoti3m+fvvjg5v+x...@mail.gmail.com). 
Amit posted yet more detailed comments.


* We talked about changing the file layout. I think everyone is happy 
with your proposal here: 
http://www.postgresql.org/message-id/20140702131729.gp21...@awork2.anarazel.de, 
with an overview description of what goes where 
(http://www.postgresql.org/message-id/CA+TgmoYuxfsm2dSy48=jgutrh1mozrvmjlhgqrfku7_wpv-...@mail.gmail.com)


* Talked about nested spinlocks. The consensus seems to be to (continue 
to) forbid nested spinlocks, but allow atomic operations while holding a 
spinlock. When atomics are emulated with spinlocks, it's OK to acquire 
the emulation spinlock while holding another spinlock.


* Talked about whether emulating atomics with spinlocks is a good idea. 
You posted performance results showing that at least with the patch to
use atomics to implement LWLocks, the emulation performs more or less 
the same as the current spinlock-based implementation. I believe 
everyone was more or less satisfied with that.
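
(For anyone catching up on the thread, the emulation point can be pictured
roughly as follows; the names are illustrative only, not the patch's actual
pg_atomic_* API:)

    typedef struct
    {
        slock_t     mutex;      /* SpinLockInit() at initialization */
        uint32      value;
    } emulated_atomic_uint32;

    static uint32
    emulated_fetch_add(emulated_atomic_uint32 *a, uint32 add_)
    {
        uint32      old;

        SpinLockAcquire(&a->mutex);     /* the internal emulation spinlock */
        old = a->value;
        a->value = old + add_;
        SpinLockRelease(&a->mutex);
        return old;
    }

The consensus above is that taking this internal mutex while the caller
already holds some other spinlock is acceptable, even though nesting two
ordinary spinlocks remains forbidden.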



So ISTM we have consensus that the approach to spinlock emulation and 
nesting stuff is OK. To finish the patch, you'll just need to address 
the file layout and the laundry list of other little details that have 
been raised.


- Heikki





Re: [HACKERS] implement subject alternative names support for SSL connections

2014-08-20 Thread Heikki Linnakangas

On 07/25/2014 07:10 PM, Alexey Klyukin wrote:

Greetings,

I'd like to propose a patch for checking subject alternative names entry in
the SSL certificate for DNS names during SSL authentication.


Thanks! I just ran into this missing feature last week, while working on 
my SSL test suite. So +1 for having the feature.


This patch needs to be rebased over current master branch, thanks to my 
refactoring that moved all OpenSSL-specific stuff to be-secure-openssl.c.


- Heikki





Re: Compute attr_needed for child relations (was Re: [HACKERS] inherit support for foreign tables)

2014-08-20 Thread Etsuro Fujita

Hi Ashutosh,

(2014/08/14 22:30), Ashutosh Bapat wrote:

On Thu, Aug 14, 2014 at 10:05 AM, Etsuro Fujita
mailto:fujita.ets...@lab.ntt.co.jp>> wrote:

(2014/08/08 18:51), Etsuro Fujita wrote:
 >>> (2014/06/30 22:48), Tom Lane wrote:
  I wonder whether it isn't time to change that.  It was coded like that
  originally only because calculating the values would've been a waste of
  cycles at the time.  But this is at least the third place where it'd be
  useful to have attr_needed for child rels.

 > I've revised the patch.

There was a problem with the previous patch, which will be described
below.  Attached is the updated version of the patch addressing that.



Here are some more comments


Thank you for the further review!


attr_needed also has the attributes used in the restriction clauses as
seen in distribute_qual_to_rels(), so, it looks unnecessary to call
pull_varattnos() on the clauses in baserestrictinfo in functions
check_selective_binary_conversion(), remove_unused_subquery_outputs(),
check_index_only().


IIUC, I think it's *necessary* to do that in those functions, since the 
attributes used in the restriction clauses in baserestrictinfo are not 
added to attr_needed due to the following code in distribute_qual_to_rels:


/*
 * If it's a join clause (either naturally, or because delayed by
 * outer-join rules), add vars used in the clause to targetlists of their
 * relations, so that they will be emitted by the plan nodes that scan
 * those relations (else they won't be available at the join node!).
 *
 * Note: if the clause gets absorbed into an EquivalenceClass then this
 * may be unnecessary, but for now we have to do it to cover the case
 * where the EC becomes ec_broken and we end up reinserting the original
 * clauses into the plan.
 */
if (bms_membership(relids) == BMS_MULTIPLE)
{
    List       *vars = pull_var_clause(clause,
                                       PVC_RECURSE_AGGREGATES,
                                       PVC_INCLUDE_PLACEHOLDERS);

    add_vars_to_targetlist(root, vars, relids, false);
    list_free(vars);
}


Although in case of RTE_RELATION the 0th entry in attr_needed corresponds
to FirstLowInvalidHeapAttributeNumber + 1, it's always safer to use
RelOptInfo::min_attr, in case get_relation_info() wants to change that
assumption or somewhere down the line some other part of the code wants to
change the attr_needed information. It may be unlikely that it would
change, but it is better to stick to the comments in RelOptInfo:

    AttrNumber  min_attr;       /* smallest attrno of rel (often <0) */
    AttrNumber  max_attr;       /* largest attrno of rel */
    Relids     *attr_needed;    /* array indexed [min_attr .. max_attr] */


Good point!  Attached is the revised version of the patch.

Thanks,

Best regards,
Etsuro Fujita
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***
*** 799,806  check_selective_binary_conversion(RelOptInfo *baserel,
  	}
  
  	/* Collect all the attributes needed for joins or final output. */
! 	pull_varattnos((Node *) baserel->reltargetlist, baserel->relid,
!    &attrs_used);
  
  	/* Add all the attributes used by restriction clauses. */
  	foreach(lc, baserel->baserestrictinfo)
--- 799,810 
  	}
  
  	/* Collect all the attributes needed for joins or final output. */
! 	for (i = baserel->min_attr; i <= baserel->max_attr; i++)
! 	{
! 		if (!bms_is_empty(baserel->attr_needed[i - baserel->min_attr]))
! 			attrs_used = bms_add_member(attrs_used,
! 		i - FirstLowInvalidHeapAttributeNumber);
! 	}
  
  	/* Add all the attributes used by restriction clauses. */
  	foreach(lc, baserel->baserestrictinfo)
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
***
*** 577,588  set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
  		childrel->has_eclass_joins = rel->has_eclass_joins;
  
  		/*
! 		 * Note: we could compute appropriate attr_needed data for the child's
! 		 * variables, by transforming the parent's attr_needed through the
! 		 * translated_vars mapping.  However, currently there's no need
! 		 * because attr_needed is only examined for base relations not
! 		 * otherrels.  So we just leave the child's attr_needed empty.
  		 */
  
  		/*
  		 * Compute the child's size.
--- 577,585 
  		childrel->has_eclass_joins = rel->has_eclass_joins;
  
  		/*
! 		 * Compute the child's attr_needed.
  		 */
+ 		adjust_appendrel_attr_needed(rel, childrel, appinfo);
  
  		/*
  		 * Compute the child's size.
***
*** 2173,2178  remove_unused_subquery_outputs(Query *subquery, RelOptInfo *rel)
--- 2170,2176 
  {
  	Bitmapset  *attrs_used = NULL;
  	ListCell   *lc;
+ 	int			i;
  
  	/*
  	 * Do nothing if subquery has UNION/INTERSECT/EXCEPT: in principle we
***
**

Re: [HACKERS] better atomics - v0.5

2014-08-20 Thread Andres Freund
Hi,

On 2014-08-20 12:43:05 +0300, Heikki Linnakangas wrote:
> Are you planning to continue working on this?

Yes, I am! I've recently been on holiday and now I'm busy with catching
up with everything that has happened since. I hope to have a next
version ready early next week.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] inherit support for foreign tables

2014-08-20 Thread Etsuro Fujita

Hi Noah,

Thank you for the review!

(2014/07/02 11:23), Noah Misch wrote:

On Fri, Jun 20, 2014 at 05:04:06PM +0900, Kyotaro HORIGUCHI wrote:

Attached is the rebased patch of v11 up to the current master.


I've been studying this patch.

SELECT FOR UPDATE on the inheritance parent fails with a can't-happen error
condition, even when SELECT FOR UPDATE on the child foreign table alone would
have succeeded.


To fix this, I've modified the planner and executor so that the planner 
adds wholerow, in addition to ctid and tableoid, as resjunk output columns 
to the plan for an inheritance tree that contains foreign tables; the
executor still uses ctid and tableoid in the EPQ processing for regular 
child tables as before, but uses wholerow and tableoid for child foreign 
tables.  Patch attached.  This is based on 
the patch [1].



The patch adds zero tests.  Add principal tests to postgres_fdw.sql.  Also,
create a child foreign table in foreign_data.sql; this will make dump/reload
tests of the regression database exercise an inheritance tree that includes a
foreign table.


Done.  I used your tests as reference.  Thanks!


The inheritance section of ddl.sgml should mention child foreign tables, at
least briefly.  Consider mentioning it in the partitioning section, too.


Done.


Your chosen ANALYZE behavior is fair, but the messaging from a database-wide
ANALYZE VERBOSE needs work:

INFO:  analyzing "test_foreign_inherit.parent"
INFO:  "parent": scanned 1 of 1 pages, containing 1 live rows and 0 dead rows; 
1 rows in sample, 1 estimated total rows
INFO:  analyzing "test_foreign_inherit.parent" inheritance tree
WARNING:  relcache reference leak: relation "child" not closed
WARNING:  relcache reference leak: relation "tchild" not closed
WARNING:  relcache reference leak: relation "parent" not closed

Please arrange to omit the 'analyzing "tablename" inheritance tree' message,
since this ANALYZE actually skipped that task.  (The warnings obviously need a
fix, too.)  I do find it awkward that adding a foreign table to an inheritance
tree will make autovacuum stop collecting statistics on that inheritance tree,
but I can't think of a better policy.


I think it would be better if this were handled in the same way as an 
inheritance tree that turns out to be a single table without any 
descendants in acquire_inherited_sample_rows().  That would still 
output the message as shown below, but I think that would be more 
consistent with the existing code.  The patch works that way.  (The warnings 
are also fixed.)


INFO:  analyzing "public.parent"
INFO:  "parent": scanned 0 of 0 pages, containing 0 live rows and 0 dead 
rows; 0 rows in sample, 0 estimated total rows

INFO:  analyzing "public.parent" inheritance tree
INFO:  analyzing "pg_catalog.pg_authid"
INFO:  "pg_authid": scanned 1 of 1 pages, containing 1 live rows and 0 
dead rows; 1 rows in sample, 1 estimated total rows



The rest of these review comments are strictly observations; they're not
requests for you to change the patch, but they may deserve more discussion.


I'd like to give my opinions on those things later on.

Sorry for the long delay.

[1] http://www.postgresql.org/message-id/53f4707c.8030...@lab.ntt.co.jp

Best regards,
Etsuro Fujita
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
***
*** 115,120  static void fileGetForeignRelSize(PlannerInfo *root,
--- 115,125 
  static void fileGetForeignPaths(PlannerInfo *root,
  	RelOptInfo *baserel,
  	Oid foreigntableid);
+ static ForeignPath *fileReparameterizeForeignPath(PlannerInfo *root,
+   RelOptInfo *baserel,
+   Oid foreigntableid,
+   ForeignPath *path,
+   Relids required_outer);
  static ForeignScan *fileGetForeignPlan(PlannerInfo *root,
     RelOptInfo *baserel,
     Oid foreigntableid,
***
*** 143,149  static bool check_selective_binary_conversion(RelOptInfo *baserel,
  static void estimate_size(PlannerInfo *root, RelOptInfo *baserel,
  			  FileFdwPlanState *fdw_private);
  static void estimate_costs(PlannerInfo *root, RelOptInfo *baserel,
! 			   FileFdwPlanState *fdw_private,
  			   Cost *startup_cost, Cost *total_cost);
  static int file_acquire_sample_rows(Relation onerel, int elevel,
  		 HeapTuple *rows, int targrows,
--- 148,154 
  static void estimate_size(PlannerInfo *root, RelOptInfo *baserel,
  			  FileFdwPlanState *fdw_private);
  static void estimate_costs(PlannerInfo *root, RelOptInfo *baserel,
! 			   FileFdwPlanState *fdw_private, List *join_conds,
  			   Cost *startup_cost, Cost *total_cost);
  static int file_acquire_sample_rows(Relation onerel, int elevel,
  		 HeapTuple *rows, int targrows,
***
*** 161,166  file_fdw_handler(PG_FUNCTION_ARGS)
--- 166,172 
  
  	fdwroutine->GetForeignRelSize = fileGetForeignRelSize;
  	fdwroutine->GetForeignPaths = fileGetForeignPaths;
+ 	fdwroutine->Reparam

Re: [HACKERS] August commitfest

2014-08-20 Thread Etsuro Fujita

Hi Heikki,

(2014/08/20 17:50), Heikki Linnakangas wrote:

I have cleaned up the Commitfest of patches that have been in Waiting
for Author state for weeks or more. I believe the list is now an
accurate representation of the actual state of the patches.


Thank you for the work!

I changed the state of the following patch from "Returned with Feedback" 
to "Needs Review".


https://commitfest.postgresql.org/action/patch_view?id=1386


Patch authors
-



Also, please pick a patch or two, submitted by other people, and review
them.


Will do.

Thanks,

Best regards,
Etsuro Fujita




Re: [HACKERS] add line number as prompt option to psql

2014-08-20 Thread Jeevan Chalke
Hi,

I have reviewed this:

I have initialized cur_lineno to UINTMAX - 2 and then observed the following
behaviour to check wrap-around.

postgres=# \set PROMPT1 '%/[%l]%R%# '
postgres[18446744073709551613]=# \set PROMPT2 '%/[%l]%R%# '
postgres[18446744073709551613]=# select
postgres[18446744073709551614]-# a
postgres[18446744073709551615]-# ,
postgres[0]-# b
postgres[1]-# from
postgres[2]-# dual;

It is wrapping to 0, whereas line numbers always start at 1. Any issues?

I would like to ignore this, as UINTMAX lines are too many for an input
buffer to hold. The chances of hitting this are almost nil.


However, I think you need to use UINT64_FORMAT while printing a uint64
number. Currently it is %u, which wraps around at UINT_MAX.
See how pset.lineno is displayed.
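
I.e. something along these lines (the exact variable name in the patch may
differ):

    char        buf[32];

    snprintf(buf, sizeof(buf), UINT64_FORMAT, pset.cur_lineno);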

Apart from this, I didn't see any issues in my testing.

Patch applies cleanly. make/make install/initdb/make check all are well.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-08-20 Thread Heikki Linnakangas

On 07/20/2014 07:17 PM, Tomas Vondra wrote:

On 19.7.2014 20:24, Tomas Vondra wrote:

On 13.7.2014 21:32, Tomas Vondra wrote:

The current patch only implements this for tuples in the main
hash table, not for skew buckets. I plan to do that, but it will
require separate chunks for each skew bucket (so we can remove it
without messing with all of them). The chunks for skew buckets
should be smaller (I'm thinking about ~4kB), because there'll be
more of them, but OTOH those buckets are for frequent values so the
chunks should not remain empty.


I've looked into extending the dense allocation to the skew buckets,
and I think we shouldn't do that. I got about 50% of the changes and
then just threw it out because it turned out quite pointless.

The amount of memory for skew buckets is limited to 2% of work mem,
so even with 100% overhead it'll use ~4% of work mem. So there's
pretty much nothing to gain here. So the additional complexity
introduced by the dense allocation seems pretty pointless.

I'm not entirely happy with the code allocating some memory densely
and some using traditional palloc, but it certainly seems cleaner
than the code I had.

So I think the patch is mostly OK as is.


Attached is v4 of the patch, mostly with minor improvements and cleanups
(removed MemoryContextStats, code related to skew buckets).


Can you remind me what kind of queries this is supposed to help with? 
Could you do some performance testing to demonstrate the effect? And 
also a worst-case scenario.


At a quick glance, I think you're missing a fairly obvious trick in 
ExecHashIncreaseNumBatches: if a chunk contains only a single tuple, you 
can avoid copying it to a new chunk and just link the old chunk to the 
new list instead. Not sure if ExecHashIncreaseNumBatches is called often 
enough for that to be noticeable, but at least it should help in extreme 
cases where you push around huge > 100MB tuples.
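
In ExecHashIncreaseNumBatches() that could look roughly like the following;
the field names are guesses at the patch's chunk structure rather than
actual code:

    /* chunk holds a single (presumably huge) tuple that stays in the
     * current batch: relink the chunk instead of copying the tuple */
    if (oldchunk->ntuples == 1 && tuple_stays_in_current_batch)
    {
        oldchunk->next = hashtable->chunks;
        hashtable->chunks = oldchunk;
        continue;
    }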


- Heikki




Re: [HACKERS] Hokey wrong versions of libpq in apt.postgresql.org

2014-08-20 Thread Greg Stark
On Sat, Aug 9, 2014 at 2:23 AM, Tom Lane  wrote:
> If the Debian guidelines think that only SO major version need
> be considered, they're wrong, at least for the way we've been treating
> that.

The Debian approach is that you should have precisely one installed
copy of a library for each soname. I guess there's no particular
reason you can't have multiple versions in the repository (possiby
built by different source packages) but users will only be able to
install one of them. This follows from there being a single /usr/lib
and ldconfig picking a single version for each soname.


-- 
greg




Re: [HACKERS] [PATCH] Incremental backup: add backup profile to base backup

2014-08-20 Thread Heikki Linnakangas
I think this has had enough review for a WIP patch. I'm marking this as 
Returned with Feedback in the commitfest because:


* should use LSNs instead of an md5
* this doesn't do anything useful on its own, hence we would need to see 
the whole solution before committing
* not clear how this would be extended if you wanted to do something more 
fine-grained than file-level diffs.


But please feel free to continue discussing those items.

On 08/18/2014 03:04 AM, Marco Nenciarini wrote:

Hi Hackers,

This is the first piece of file level incremental backup support, as
described on wiki page https://wiki.postgresql.org/wiki/Incremental_backup

It is not yet complete, but I wish to share it on the list to receive
comments and suggestions.

The point of the patch is adding an option to pg_basebackup and
replication protocol BASE_BACKUP command to generate a backup_profile file.

When taking a full backup with pg_basebackup, the user can request
Postgres to generate a backup_profile file through the --profile option
(-B short option, which I've arbitrarily picked because both -P and
-p were already taken)

At the moment the backup profile consists of a file with one line per
file detailing modification time, md5, size, tablespace and path
relative to tablespace root (PGDATA or the tablespace)

To calculate the md5 checksum I've used the md5 code present in pgcrypto
contrib as the code in src/include/libpq/md5.h is not suitable for large
files. Since a core feature cannot depend on a piece of contrib, I've
moved the files

contrib/pgcrypto/md5.c
contrib/pgcrypto/md5.h

to

src/backend/utils/hash/md5.c
src/include/utils/md5.h

changing the pgcrypto extension to use them.

There are still some TODOs:

* User documentation

* Remove the pg_basebackup code duplication I've introduced with the
ReceiveAndUnpackTarFileToDir function, which is almost the same as
ReceiveAndUnpackTarFile but does not expect to handle a tablespace. It
instead simply extracts a tar stream into a destination directory. The
latter could probably be rewritten using the former as a component, but it
needs some adjustment to the "progress reporting" part, which is not
present at the moment in ReceiveAndUnpackTarFileToDir.

* Add header section to backup_profile file which at the moment contains
only the body part. I'm thinking to change the original design and put
the whole backup label as header, which is IMHO clearer and well known.
I would use something like:

START WAL LOCATION: 0/E28 (file 0001000E)
CHECKPOINT LOCATION: 0/E60
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2014-08-14 18:54:01 CEST
LABEL: pg_basebackup base backup
END LABEL

I've attached the current patch based on master.

Any comment will be appreciated.

Regards,
Marco




--
- Heikki




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Greg Stark
On Tue, Aug 19, 2014 at 10:42 PM, Alvaro Herrera
 wrote:
> Is there a way to create a link to a file which only exists as an open
> file descriptor?   If there was, you could create a temp file, open an
> fd, then delete the file.  That would remove the issue with files being
> leaked due to failures of various kinds.


Sort of. On recent Linuxen you can create a file with open(O_TMPFILE)
then use linkat(2) to create a link for it only once it's fully
written.

-- 
greg




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Greg Stark
c.f.:

   O_TMPFILE (since Linux 3.11)
  Create an unnamed temporary file.  The pathname argument
  specifies a directory; an unnamed inode will be created in
  that directory's filesystem.  Anything written to the
  resulting file will be lost when the last file descriptor is
  closed, unless the file is given a name.

  O_TMPFILE must be specified with one of O_RDWR or O_WRONLY
  and, optionally, O_EXCL.  If O_EXCL is not specified, then
  linkat(2) can be used to link the temporary file into the
  filesystem, making it permanent, using code like the
  following:

  char path[PATH_MAX];
  fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
  S_IRUSR | S_IWUSR);

  /* File I/O on 'fd'... */

  snprintf(path, PATH_MAX,  "/proc/self/fd/%d", fd);
  linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
  AT_SYMLINK_FOLLOW);

  In this case, the open() mode argument determines the file
  permission mode, as with O_CREAT.

  Specifying O_EXCL in conjunction with O_TMPFILE prevents a
  temporary file from being linked into the filesystem in the
  above manner.  (Note that the meaning of O_EXCL in this case
  is different from the meaning of O_EXCL otherwise.)

  There are two main use cases for O_TMPFILE:

  *  Improved tmpfile(3) functionality: race-free creation of
 temporary files that (1) are automatically deleted when
 closed; (2) can never be reached via any pathname; (3) are
 not subject to symlink attacks; and (4) do not require the
 caller to devise unique names.

  *  Creating a file that is initially invisible, which is then
 populated with data and adjusted to have appropriate
 filesystem attributes (chown(2), chmod(2), fsetxattr(2),
 etc.)  before being atomically linked into the filesystem
 in a fully formed state (using linkat(2) as described
 above).

  O_TMPFILE requires support by the underlying filesystem; only
  a subset of Linux filesystems provide that support.  In the
  initial implementation, support was provided in the ext2,
  ext3, ext4, UDF, Minix, and shmem filesystems.  XFS support
  was added in Linux 3.15.




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-08-20 Thread Tomas Vondra
On 20 Srpen 2014, 14:05, Heikki Linnakangas wrote:
> On 07/20/2014 07:17 PM, Tomas Vondra wrote:
>> On 19.7.2014 20:24, Tomas Vondra wrote:
>>> On 13.7.2014 21:32, Tomas Vondra wrote:
 The current patch only implements this for tuples in the main
 hash table, not for skew buckets. I plan to do that, but it will
 require separate chunks for each skew bucket (so we can remove it
 without messing with all of them). The chunks for skew buckets
 should be smaller (I'm thinking about ~4kB), because there'll be
 more of them, but OTOH those buckets are for frequent values so the
 chunks should not remain empty.
>>>
>>> I've looked into extending the dense allocation to the skew buckets,
>>> and I think we shouldn't do that. I got about 50% of the changes and
>>> then just threw it out because it turned out quite pointless.
>>>
>>> The amount of memory for skew buckets is limited to 2% of work mem,
>>> so even with 100% overhead it'll use ~4% of work mem. So there's
>>> pretty much nothing to gain here. So the additional complexity
>>> introduced by the dense allocation seems pretty pointless.
>>>
>>> I'm not entirely happy with the code allocating some memory densely
>>> and some using traditional palloc, but it certainly seems cleaner
>>> than the code I had.
>>>
>>> So I think the patch is mostly OK as is.
>>
>> Attached is v4 of the patch, mostly with minor improvements and cleanups
>> (removed MemoryContextStats, code related to skew buckets).
>
> Can you remind me what kind of queries this is supposed to help with?
> Could you do some performance testing to demonstrate the effect? And
> also a worst-case scenario.

The dense allocation? That was not really directed at any specific
queries, it was about lowering hashjoin memory requirements in general.

First to make it more correct with respect to work_mem (hashjoin does not
account for the palloc overhead, so the actual memory consumption might be
much higher than work_mem).

Second to compensate for the memory for more buckets due to
NTUP_PER_BUCKET, which is tweaked in the 'hashjoin dynamic nbuckets'
patch.

There are some numbers / more detailed analysis in
http://www.postgresql.org/message-id/a17d6217fe0c9e459cb45cb764ad727c.squir...@sq.gransy.com

>
> At a quick glance, I think you're missing a fairly obvious trick in
> ExecHashIncreaseNumBatches: if a chunk contains only a single tuple, you
> can avoid copying it to a new chunk and just link the old chunk to the
> new list instead. Not sure if ExecHashIncreaseNumBatches is called often
> enough for that to be noticeable, but at least it should help in extreme
> cases where you push around huge > 100MB tuples.

Yeah, I thought about that too, but it seemed like a rare corner case. But
maybe you're right - it will also eliminate the 'fluctuation' (allocating
100MB chunk, which might get us over work_mem, ...). Not sure if this is
going to help, but it's easy enough to fix I guess.

The other optimization I'm thinking about is that when increasing number
of batches, the expectation is only about 1/2 the chunks will be
necessary. So instead of freeing the chunk, we might keep it and reuse it
later. That might lower the palloc overhead a bit (but it's already very
low, so I don't think that's measurable difference).

regards
Tomas







Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Alvaro Herrera
Greg Stark wrote:

>   char path[PATH_MAX];
>   fd = open("/path/to/dir", O_TMPFILE | O_RDWR,
>   S_IRUSR | S_IWUSR);
> 
>   /* File I/O on 'fd'... */
> 
>   snprintf(path, PATH_MAX,  "/proc/self/fd/%d", fd);
>   linkat(AT_FDCWD, path, AT_FDCWD, "/path/for/file",
>   AT_SYMLINK_FOLLOW);

Hmm, the real trick here is linkat(... "/proc/self/foobar"), not the
O_TMPFILE: you can have an open file descriptor to an "invisible" file
simply by creating a normal file and unlinking it.  I looked at linkat()
yesterday but the idea of using /proc/self didn't occur to me.  Nasty
trick :-(  It seems linkat() is quite a bit more portable than
O_TMPFILE, fortunately ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Andres Freund
On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
> MauMau wrote:
> 
> > With that said, copying to a temporary file like .tmp and
> > renaming it to  sounds worthwhile even as a basic copy
> > utility.  I want to avoid copying to a temporary file with a fixed
> > name like _copy.tmp, because some advanced utility may want to run
> > multiple instances of pg_copy to copy several files into the same
> > directory simultaneously.  However, I'm afraid multiple .tmp
> > files might continue to occupy disk space after canceling copy or
> > power failure in some use cases, where the copy of the same file
> > won't be retried.  That's also the reason why I chose to not use a
> > temporary file like cp/copy.
> 
> Is there a way to create a link to a file which only exists as an open
> file descriptor?   If there was, you could create a temp file, open an
> fd, then delete the file.  That would remove the issue with files being
> leaked due to failures of various kinds.

Isn't this a solution looking for a problem? We're using tempfiles in
dozens of other places and I really don't see why this is the place to
stop doing so. Just copy to .tmp and move it into place. If things
crash during that, the command will be repeated shortly afterwards again
*anyway*. Let's not get into platform specific games here.
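
I.e. the plain, portable pattern; a rough sketch (error handling mostly
elided, and the path variable names are assumptions):

    char        tmppath[MAXPGPATH];
    int         fd;

    snprintf(tmppath, sizeof(tmppath), "%s.tmp", dstpath);

    /* overwrites any stale .tmp left behind by an earlier crash */
    fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    /* ... copy the WAL segment into fd, then fsync(fd) and close(fd) ... */

    /* atomically move the finished copy into place */
    if (rename(tmppath, dstpath) != 0)
    {
        /* report the error and exit non-zero so archiving retries */
    }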

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Alvaro Herrera
Andres Freund wrote:
> On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
> > MauMau wrote:
> > 
> > > With that said, copying to a temporary file like .tmp and
> > > renaming it to  sounds worthwhile even as a basic copy
> > > utility.  I want to avoid copying to a temporary file with a fixed
> > > name like _copy.tmp, because some advanced utility may want to run
> > > multiple instances of pg_copy to copy several files into the same
> > > directory simultaneously.  However, I'm afraid multiple .tmp
> > > files might continue to occupy disk space after canceling copy or
> > > power failure in some use cases, where the copy of the same file
> > > won't be retried.  That's also the reason why I chose to not use a
> > > temporary file like cp/copy.
> > 
> > Is there a way to create a link to a file which only exists as an open
> > file descriptor?   If there was, you could create a temp file, open an
> > fd, then delete the file.  That would remove the issue with files being
> > leaked due to failures of various kinds.
> 
> Isn't this a solution looking for a problem? We're using tempfiles in
> dozens of other places and I really don't see why this is the place to
> stop doing so. Just copy to .tmp and move it into place. If things
> crash during that, the command will be repeated shortly afterwards again
> *anyway*. Let's not get into platform specific games here.

The issue is what happens if there's a crash while the temp file is in
the middle of being filled.  I agree there are bigger problems to solve
in this area though.  Yes, I agree that a fixed name such as _copy.tmp
is a really bad choice for several reasons.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Greg Stark
On Wed, Aug 20, 2014 at 2:27 PM, Alvaro Herrera
 wrote:
> Hmm, the real trick here is linkat(... "/proc/self/foobar"), not the
> O_TMPFILE: you can have an open file descriptor to an "invisible" file
> simply by creating a normal file and unlinking it.  I looked at linkat()
> yesterday but the idea of using /proc/self didn't occur to me.  Nasty
> trick :-(  It seems linkat() is quite a bit more portable than
> O_TMPFILE, fortunately ...

Supposedly linkat(2) on Linux refuses to create a link to a file that
was opened normally and had its last link removed with unlink(2) due
to concerns that this would create security holes.


-- 
greg




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Andres Freund
On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
> Andres Freund wrote:
> > On 2014-08-19 17:42:06 -0400, Alvaro Herrera wrote:
> > > MauMau wrote:
> > > 
> > > > With that said, copying to a temporary file like .tmp and
> > > > renaming it to  sounds worthwhile even as a basic copy
> > > > utility.  I want to avoid copying to a temporary file with a fixed
> > > > name like _copy.tmp, because some advanced utility may want to run
> > > > multiple instances of pg_copy to copy several files into the same
> > > > directory simultaneously.  However, I'm afraid multiple .tmp
> > > > files might continue to occupy disk space after canceling copy or
> > > > power failure in some use cases, where the copy of the same file
> > > > won't be retried.  That's also the reason why I chose to not use a
> > > > temporary file like cp/copy.
> > > 
> > > Is there a way to create a link to a file which only exists as an open
> > > file descriptor?   If there was, you could create a temp file, open an
> > > fd, then delete the file.  That would remove the issue with files being
> > > leaked due to failures of various kinds.
> > 
> > Isn't this a solution looking for a problem? We're using tempfiles in
> > dozens of other places and I really don't see why this is the place to
> > stop doing so. Just copy to .tmp and move it into place. If things
> > crash during that, the command will be repeated shortly afterwards again
> > *anyway*. Let's not get into platform specific games here.
> 
> The issue is what happens if there's a crash while the temp file is in
> the middle of being filled.

The archive command will be run again in a couple of seconds and will
remove the half-filled temp file.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Alvaro Herrera
Greg Stark wrote:
> On Wed, Aug 20, 2014 at 2:27 PM, Alvaro Herrera
>  wrote:
> > Hmm, the real trick here is linkat(... "/proc/self/foobar"), not the
> > O_TMPFILE: you can have an open file descriptor to an "invisible" file
> > simply by creating a normal file and unlinking it.  I looked at linkat()
> > yesterday but the idea of using /proc/self didn't occur to me.  Nasty
> > trick :-(  It seems linkat() is quite a bit more portable than
> > O_TMPFILE, fortunately ...
> 
> Supposedly linkat(2) on Linux refuses to create a link to a file that
> was opened normally and had its last link removed with unlink(2) due
> to concerns that this would create security holes.

Sigh.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Tom Lane
Andres Freund  writes:
> On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
>> Andres Freund wrote:
>>> Isn't this a solution looking for a problem? We're using tempfiles in
>>> dozens of other places and I really don't see why this is the place to
>>> stop doing so. Just copy to .tmp and move it into place.

>> The issue is what happens if there's a crash while the temp file is in
>> the middle of being filled.

> The archive command will be run again in a couple of seconds and will
> remove the half-filled temp file.

Alternatively, you could use the process PID as part of the temp file
name; which is probably a good idea anyway.

regards, tom lane




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Andres Freund
On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-08-20 09:50:56 -0400, Alvaro Herrera wrote:
> >> Andres Freund wrote:
> >>> Isn't this a solution looking for a problem? We're using tempfiles in
> >>> dozens of other places and I really don't see why this is the place to
> >>> stop doing so. Just copy to .tmp and move it into place.
> 
> >> The issue is what happens if there's a crash while the temp file is in
> >> the middle of being filled.
> 
> > The archive command will be run again in a couple of seconds and will
> > remove the half-filled temp file.
> 
> Alternatively, you could use the process PID as part of the temp file
> name; which is probably a good idea anyway.

I think that's actually worse, because nothing will clean up those
unless you explicitly scan for all .$pid files, and somehow
kill them.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_basebackup vs. Windows and tablespaces

2014-08-20 Thread Amit Kapila
On Wed, Aug 20, 2014 at 12:12 PM, Dilip kumar 
wrote:
>
> I have reviewed the patch and did not find any major comments.

Thanks for the review.

> There are some comments I would like to share with you
>
>
>
> 1.  Rebase the patch to current GIT head.

Done.

>
> 2.  +  * Construct symlink file
>
> +  */
>
> +  initStringInfo(&symlinkfbuf);
>
> I think the declaration and initialization of the symlinkfbuf string can
> be moved under the #ifdef WIN32 compile time macro; for other platforms
> it's simply allocated and freed, which can be avoided.

Agreed, I have changed the patch as per your suggestion.
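
For illustration, the reshuffle looks roughly like this (only the
initStringInfo call is from the quoted hunk, the surrounding context is
sketched):

#ifdef WIN32
	StringInfoData symlinkfbuf;

	initStringInfo(&symlinkfbuf);
#endif

	/* ... tar extraction; only the WIN32 path appends to symlinkfbuf ... */

#ifdef WIN32
	pfree(symlinkfbuf.data);
#endif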

>
> 3.  +  /*
>     +  * native windows utilites are not able create symlinks while
>     +  * extracting files from tar.
>     +  */
>
> Rephrase the above sentence and fix spelling mistake (utilities are not
> able to create)

Done.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


extend_basebackup_to_include_symlink_v2.patch
Description: Binary data



Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Tom Lane
Andres Freund  writes:
> On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
>> Alternatively, you could use the process PID as part of the temp file
>> name; which is probably a good idea anyway.

> I think that's actually worse, because nothing will clean up those
> unless you explicitly scan for all .$pid files, and somehow
> kill them.

True.  As long as the copy command is prepared to get rid of a
pre-existing target file, using a fixed .tmp extension should be fine.

regards, tom lane




Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-20 Thread Tom Lane
Josh Berkus  writes:
> On 08/15/2014 04:19 PM, Tom Lane wrote:
>> Personally I'd prefer to go to the all-lengths approach, but a large
>> part of that comes from a subjective assessment that the hybrid approach
>> is too messy.  Others might well disagree.

> ... So, that extraction test is about 1% *slower* than the basic Tom Lane
> lengths-only patch, and still 80% slower than original JSONB.  And it's
> the same size as the lengths-only version.

Since it's looking like this might be the direction we want to go, I took
the time to flesh out my proof-of-concept patch.  The attached version
takes care of cosmetic issues (like fixing the comments), and includes
code to avoid O(N^2) penalties in findJsonbValueFromContainer and
JsonbIteratorNext.  I'm not sure whether those changes will help
noticeably on Josh's test case; for me, they seemed worth making, but
they do not bring the code back to full speed parity with the all-offsets
version.  But as we've been discussing, it seems likely that those costs
would be swamped by compression and I/O considerations in most scenarios
with large documents; and of course for small documents it hardly matters.
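
(To spell out where the O(N^2) risk comes from: with the all-lengths layout
only each child's length is stored, so a child's byte offset has to be
recovered by summing the preceding lengths. A naive helper like the sketch
below is O(N) per lookup, hence O(N^2) over a whole container, unless the
iterator carries a running offset as it walks the entries; JBE_LEN here is a
stand-in, not the patch's actual macro.)

    static uint32
    child_offset_naive(const JEntry *children, int i)
    {
        uint32      off = 0;
        int         j;

        for (j = 0; j < i; j++)
            off += JBE_LEN(children[j]);    /* hypothetical length accessor */
        return off;
    }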

Even if we don't go this way, there are parts of this patch that would
need to get committed.  I found for instance that convertJsonbArray and
convertJsonbObject have insufficient defenses against overflowing the
overall length field for the array or object.

For my own part, I'm satisfied with the patch as attached (modulo the
need to teach pg_upgrade about the incompatibility).  There remains the
question of whether to take this opportunity to add a version ID to the
binary format.  I'm not as excited about that idea as I originally was;
having now studied the code more carefully, I think that any expansion
would likely happen by adding more type codes and/or commandeering the
currently-unused high-order bit of JEntrys.  We don't need a version ID
in the header for that.  Moreover, if we did have such an ID, it would be
notationally painful to get it to most of the places that might need it.

regards, tom lane

diff --git a/src/backend/utils/adt/jsonb.c b/src/backend/utils/adt/jsonb.c
index 2fd87fc..456011a 100644
*** a/src/backend/utils/adt/jsonb.c
--- b/src/backend/utils/adt/jsonb.c
*** jsonb_from_cstring(char *json, int len)
*** 196,207 
  static size_t
  checkStringLen(size_t len)
  {
! 	if (len > JENTRY_POSMASK)
  		ereport(ERROR,
  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
   errmsg("string too long to represent as jsonb string"),
   errdetail("Due to an implementation restriction, jsonb strings cannot exceed %d bytes.",
! 		   JENTRY_POSMASK)));
  
  	return len;
  }
--- 196,207 
  static size_t
  checkStringLen(size_t len)
  {
! 	if (len > JENTRY_LENMASK)
  		ereport(ERROR,
  (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
   errmsg("string too long to represent as jsonb string"),
   errdetail("Due to an implementation restriction, jsonb strings cannot exceed %d bytes.",
! 		   JENTRY_LENMASK)));
  
  	return len;
  }
diff --git a/src/backend/utils/adt/jsonb_util.c b/src/backend/utils/adt/jsonb_util.c
index 04f35bf..e47eaea 100644
*** a/src/backend/utils/adt/jsonb_util.c
--- b/src/backend/utils/adt/jsonb_util.c
***
*** 26,40 
   * in MaxAllocSize, and the number of elements (or pairs) must fit in the bits
   * reserved for that in the JsonbContainer.header field.
   *
!  * (the total size of an array's elements is also limited by JENTRY_POSMASK,
   * but we're not concerned about that here)
   */
  #define JSONB_MAX_ELEMS (Min(MaxAllocSize / sizeof(JsonbValue), JB_CMASK))
  #define JSONB_MAX_PAIRS (Min(MaxAllocSize / sizeof(JsonbPair), JB_CMASK))
  
! static void fillJsonbValue(JEntry *array, int index, char *base_addr,
  			   JsonbValue *result);
! static bool	equalsJsonbScalarValue(JsonbValue *a, JsonbValue *b);
  static int	compareJsonbScalarValue(JsonbValue *a, JsonbValue *b);
  static Jsonb *convertToJsonb(JsonbValue *val);
  static void convertJsonbValue(StringInfo buffer, JEntry *header, JsonbValue *val, int level);
--- 26,41 
   * in MaxAllocSize, and the number of elements (or pairs) must fit in the bits
   * reserved for that in the JsonbContainer.header field.
   *
!  * (the total size of an array's elements is also limited by JENTRY_LENMASK,
   * but we're not concerned about that here)
   */
  #define JSONB_MAX_ELEMS (Min(MaxAllocSize / sizeof(JsonbValue), JB_CMASK))
  #define JSONB_MAX_PAIRS (Min(MaxAllocSize / sizeof(JsonbPair), JB_CMASK))
  
! static void fillJsonbValue(JEntry *children, int index,
! 			   char *base_addr, uint32 offset,
  			   JsonbValue *result);
! static bool equalsJsonbScalarValue(JsonbValue *a, JsonbValue *b);
  static int	compareJsonbScalarValue(JsonbValue *a, JsonbValue *b);
  static Jsonb *convertToJsonb(JsonbValue *val);
  static void convertJsonbValue(StringInfo buffer, JEntry *header, JsonbValue *val, int level);
*

Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED

2014-08-20 Thread Thom Brown
On 17 August 2014 21:45, Fabrízio de Royes Mello 
wrote:

>
> On Mon, Jul 28, 2014 at 2:24 PM, Fabrízio de Royes Mello <
> fabriziome...@gmail.com> wrote:
> >
> >
> > On Mon, Jul 28, 2014 at 1:41 PM, Christoph Berg  wrote:
> > >
> > > Re: Fabrízio de Royes Mello 2014-07-28
> 
> > > > There are something that should I do on this patch yet?
> > >
> > > I haven't got around to have a look at the newest incarnation yet, but
> > > I plan to do that soonish. (Of course that shouldn't stop others from
> > > doing that as well if they wish.)
> > >
> >
> > Thanks!
> >
>
> Updated version.
>

Hi Fabrizio,

+  This form changes the table persistence type from unlogged to
permanent or
+  from unlogged to permanent (see ).

Shouldn't this read "unlogged to permanent or from permanent to unlogged"?


"ERROR:  table test is not permanent"

Perhaps this would be better as "table test is unlogged" as "permanent"
doesn't match the term used in the DDL syntax.


Gave the patch a quick test-drive on a replicated instance, and it appears
to operate as described.
-- 
Thom


[HACKERS] Patching for increasing the number of columns

2014-08-20 Thread Mayeul Kauffmann

Hello,

I am trying to patch the server source to increase the number of columns 
above 1600. I'm not planning to commit this but as suggested elsewhere 
[1], someone might suggest a configure option based on this.
I came up with a patch which seems to work (see below), but 3 of the 136 
tests fail.


I understand some will question db design, but, as written elsewhere, 
"What would be most helpful though is if the answer to this question 
stop being an attack on the business requirement analysis, database 
design skills, and/or sanity of the requester" [1]


I based my attempts on these discussions:
http://www.postgresql.org/message-id/200512221633.jbmgxwm13...@candle.pha.pa.us
http://www.postgresql.org/message-id/8914.1289620...@sss.pgh.pa.us
http://dba.stackexchange.com/questions/40137/in-postgresql-is-it-possible-to-change-the-maximum-number-of-columns-a-table-ca

I build this on Ubuntu 14.04, 64 bits. Bash session follows:

==
sudo apt-get install flex
sudo apt-get install bison build-essential
sudo apt-get install libreadline6-dev
sudo apt-get install zlib1g-dev
sudo apt-get install libossp-uuid-dev

version=3_64  # change this if you want to build several versions of postgres in parallel

# see also "MODIFY THIS TOO" below

echo "current version is"  $version
mkdir -p ~/bin/postgresql_9.3.4
cd ~/bin/postgresql_9.3.4
wget ftp://ftp.postgresql.org/pub/source/v9.3.4/postgresql-9.3.4.tar.bz2
mkdir -p ~/bin/postgresql_9.3.4/patched_$version
tar -xvf postgresql-9.3.*.tar.bz2 -C ~/bin/postgresql_9.3.4/patched_$version
cd patched_$version/postgresql-9.3.*

# use kate (KDE) or your preferred text editor:
kate src/include/access/htup_details.h

# See: http://dba.stackexchange.com/questions/40137/in-postgresql-is-it-possible-to-change-the-maximum-number-of-columns-a-table-ca

# Replace this:
#define MaxTupleAttributeNumber 1664    /* 8 * 208 */
# by this: (the '#' sign before 'define' should be included)
#define MaxTupleAttributeNumber 6656    /* 32 * 208 */
# or this:
#define MaxTupleAttributeNumber 13312    /* 64 * 208 */

# Replace this:
#define MaxHeapAttributeNumber 1600    /* 8 * 200 */
# by this: (the '#' sign before 'define' should be included)
#define MaxHeapAttributeNumber 6400    /* 32 * 200 */
# or this:
#define MaxHeapAttributeNumber 12800    /* 64 * 200 */


# See: http://www.postgresql.org/message-id/8914.1289620...@sss.pgh.pa.us suggests this:  uint16 t_hoff;
# Replace this:  (in TWO PLACES)  (near lines 148 and 523. If you miss one, PostgreSQL segfaults.)

  uint8 t_hoff; /* sizeof header incl. bitmap, padding */
# by this: (in TWO PLACES)
  uint32 t_hoff; /* sizeof header incl. bitmap, padding */
# or by this:(in TWO PLACES)
uint64 t_hoff; /* sizeof header incl. bitmap, padding */

# Save and close htup_details.h
# (TODO: write the above as a command-line patch)
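
# An alternative to the hand edit above: an untested sed sketch for the
# "32 * ..." variant (not from the original mail; the exact whitespace in
# htup_details.h may differ, so double-check the result by hand):
sed -i -e 's/\(MaxTupleAttributeNumber[[:space:]]*\)1664/\16656/' \
       -e 's/\(MaxHeapAttributeNumber[[:space:]]*\)1600/\16400/' \
       -e 's/uint8\([[:space:]]*t_hoff;\)/uint32\1/' \
       src/include/access/htup_details.h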

./configure --with-blocksize=32 --prefix=/usr/local/pgsql_patched_$version

make
make check

# join ... FAILED
# select_views ... FAILED
# without_oid  ... FAILED
# 
#  3 of 136 tests failed. FIXME
# 
(not sure whether I can attach the log and diff of the test here).

I launched the server anyway and logged in with pgadmin3. I created a 
few tables with 2000 integer fields or so. Performed a few insert, 
select, update and join without any issue.
So at least basic joins work. And in pgadmin3, the "has OIDs" property 
of the tables I created is not checked.


Just to be sure, I ran all the tests again with 'make check', without 
any patch and without raising the blocksize (configure option), and this 
time all the tests passed (NO failures).


Would anyone have some hint or advice?
Thank you!
Best regards,
Mayeul


[1] http://www.postgresql.org/message-id/8914.1289620...@sss.pgh.pa.us

PS: and since it's my first post here: thank you all so much for this 
wonderful DBMS :-)



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch status: delta relations in AFTER triggers

2014-08-20 Thread David Fetter
On Wed, Aug 20, 2014 at 07:29:39AM -0700, Kevin Grittner wrote:
> Heikki Linnakangas  wrote:
> 
> > What's the status of the "delta relations in AFTER triggers"
> > patch? I saw you had a lively discussion with Amit in the last
> > two weeks, but I couldn't tell if it's still "Needs review" or
> > should be "Waiting on Author" ?
> 
> The patch that David offered to use the tuplestores in C should 
> probably be updated to show both direct use based on what is in 
> TriggerData and SPI use.  I will leave that to David to submit; it 
> doesn't need to be in the same patch-set as the rest; it might, in 
> fact, be better to offer it separately once the API is solid (i.e., 
> after commit of the patch-sets I've just described.)

I'd be happy to.  Is there a repo I can refer to, or should I wait to
start until those patch sets are submitted to -hackers?

> Thanks for picking up the job of CF manager, BTW.  It's a dirty
> job, but somebody's got to do it.

Indeed, and thanks for wrangling this, Heikki.

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patching for increasing the number of columns

2014-08-20 Thread Tom Lane
Mayeul Kauffmann  writes:
> I am trying to patch the server source to increase the number of columns 
> above 1600. I'm not planning to commit this but as suggested elsewhere 
> [1], someone might suggest a configure option based on this.
> I came up with a patch which seems to work (see below), but 3 of the 136 
> tests fail.

You would have to show us the actual failure diffs to get much useful
comment, but in general increasing the size of tuple headers could
easily lead to changes in plan choices, which would affect output
row ordering (not to mention EXPLAIN results).  This is particularly
the case if you're testing on a 64-bit machine, since the maxalign'd
size of the header would go from 24 to 32 bytes ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GSoC2014] Patch ALTER TABLE ... SET LOGGED

2014-08-20 Thread Fabrízio de Royes Mello
On Wed, Aug 20, 2014 at 12:35 PM, Thom Brown  wrote:
>
> Hi Fabrizio,
>
> +  This form changes the table persistence type from unlogged to
permanent or
> +  from unlogged to permanent (see ).
>
> Shouldn't this read "unlogged to permanent or from permanent to unlogged"?
>

Fixed.


> "ERROR:  table test is not permanent"
>
> Perhaps this would be better as "table test is unlogged" as "permanent"
doesn't match the term used in the DDL syntax.
>

Fixed.


> Gave the patch a quick test-drive on a replicated instance, and it
appears to operate as described.
>

Thanks for your review.

Att,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/doc/src/sgml/ref/alter_table.sgml b/doc/src/sgml/ref/alter_table.sgml
index 69a1e14..397b4e6 100644
--- a/doc/src/sgml/ref/alter_table.sgml
+++ b/doc/src/sgml/ref/alter_table.sgml
@@ -59,16 +59,17 @@ ALTER TABLE [ IF EXISTS ] name
 ENABLE ALWAYS RULE rewrite_rule_name
 CLUSTER ON index_name
 SET WITHOUT CLUSTER
+SET {LOGGED | UNLOGGED}
 SET WITH OIDS
 SET WITHOUT OIDS
 SET ( storage_parameter = value [, ... ] )
+SET TABLESPACE new_tablespace
 RESET ( storage_parameter [, ... ] )
 INHERIT parent_table
 NO INHERIT parent_table
 OF type_name
 NOT OF
 OWNER TO new_owner
-SET TABLESPACE new_tablespace
 REPLICA IDENTITY {DEFAULT | USING INDEX index_name | FULL | NOTHING}
 
 and table_constraint_using_index is:
@@ -447,6 +448,20 @@ ALTER TABLE [ IF EXISTS ] name

 

+SET {LOGGED | UNLOGGED}
+
+ 
+  This form changes the table persistence type from unlogged to permanent or
+  from permanent to unlogged (see ).
+ 
+ 
+  Changing the table persistence type acquires an ACCESS EXCLUSIVE lock
+  and rewrites the table contents and associated indexes into new disk files.
+ 
+
+   
+
+   
 SET WITH OIDS
 
  
diff --git a/src/backend/commands/cluster.c b/src/backend/commands/cluster.c
index b1c411a..7f497c7 100644
--- a/src/backend/commands/cluster.c
+++ b/src/backend/commands/cluster.c
@@ -574,7 +574,8 @@ rebuild_relation(Relation OldHeap, Oid indexOid, bool verbose)
 	heap_close(OldHeap, NoLock);
 
 	/* Create the transient table that will receive the re-ordered data */
-	OIDNewHeap = make_new_heap(tableOid, tableSpace, false,
+	OIDNewHeap = make_new_heap(tableOid, tableSpace,
+			   OldHeap->rd_rel->relpersistence,
 			   AccessExclusiveLock);
 
 	/* Copy the heap data into the new table in the desired order */
@@ -601,7 +602,7 @@ rebuild_relation(Relation OldHeap, Oid indexOid, bool verbose)
  * data, then call finish_heap_swap to complete the operation.
  */
 Oid
-make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
+make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, char relpersistence,
 			  LOCKMODE lockmode)
 {
 	TupleDesc	OldHeapDesc;
@@ -613,7 +614,6 @@ make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
 	Datum		reloptions;
 	bool		isNull;
 	Oid			namespaceid;
-	char		relpersistence;
 
 	OldHeap = heap_open(OIDOldHeap, lockmode);
 	OldHeapDesc = RelationGetDescr(OldHeap);
@@ -636,16 +636,10 @@ make_new_heap(Oid OIDOldHeap, Oid NewTableSpace, bool forcetemp,
 	if (isNull)
 		reloptions = (Datum) 0;
 
-	if (forcetemp)
-	{
+	if (relpersistence == RELPERSISTENCE_TEMP)
 		namespaceid = LookupCreationNamespace("pg_temp");
-		relpersistence = RELPERSISTENCE_TEMP;
-	}
 	else
-	{
 		namespaceid = RelationGetNamespace(OldHeap);
-		relpersistence = OldHeap->rd_rel->relpersistence;
-	}
 
 	/*
 	 * Create the new heap, using a temporary name in the same namespace as
@@ -1146,6 +1140,7 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
 	Oid			relfilenode1,
 relfilenode2;
 	Oid			swaptemp;
+	char		swaprelpersistence;
 	CatalogIndexState indstate;
 
 	/* We need writable copies of both pg_class tuples. */
@@ -1177,6 +1172,10 @@ swap_relation_files(Oid r1, Oid r2, bool target_is_pg_class,
 		relform1->reltablespace = relform2->reltablespace;
 		relform2->reltablespace = swaptemp;
 
+		swaprelpersistence = relform1->relpersistence;
+		relform1->relpersistence = relform2->relpersistence;
+		relform2->relpersistence = swaprelpersistence;
+
 		/* Also swap toast links, if we're swapping by links */
 		if (!swap_toast_by_content)
 		{
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 5130d51..a49e66f 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -147,6 +147,7 @@ ExecRefreshMatView(RefreshMatViewStmt *stmt, const char *queryString,
 	DestReceiver *dest;
 	bool		concurrent;
 	LOCKMODE	lockmode;
+	char		relpersistence;
 
 	/* Determine strength of lock needed. */
 	concurrent = stmt->concurrent;
@@ -233,9 +234,15 @@

Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-20 Thread Robert Haas
On Mon, Aug 18, 2014 at 11:02 AM, Baker, Keith [OCDUS Non-J&J]
 wrote:
> My proof of concept code (steps a though e below) avoided any reading or 
> writing to the pipe (and associated handling of SIGPIPE), it just relied on 
> postmaster open of PIPE with ENXIO to indicate all is clear.

I'm not following.

> Robert, Assuming an algorithm choice is agreed upon in the near future, would 
> you be the logical choice to implement the change?
> I am happy to help, especially with any QNX-specific aspects, but don't want 
> to step on anyone's toes.

I'm unlikely to have time to work on this in the immediate future, but
I may be able to help review.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patching for increasing the number of columns

2014-08-20 Thread Mayeul Kauffmann

Tom wrote:
> You would have to show us the actual failure diffs to get much useful 
> comment, but in general increasing the size of tuple headers could 
> easily lead to

> changes in plan choices

Thank you Tom. So there is some hope! In effect the query plan is 
different for the join and the view tests. The result set is different 
only for the 'without_oid' test.
A side question: Are these tests comprehensive, or should I run other 
tests just to be sure? Hints on where to find those tests are welcome.

Thanks!
(diff below)
Mayeul
*** ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/expected/join.out 2014-03-17 19:35:47.0 +
--- ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/results/join.out 2014-08-20 15:40:56.248603754 +0100

***
*** 2791,2814 
join int4_tbl i1 on b.thousand = f1
right join int4_tbl i2 on i2.f1 = b.tenthous
order by 1;
!QUERY PLAN
! ------------------------------------------------------------------------------
   Sort
 Sort Key: b.unique1
!->  Nested Loop Left Join
!  ->  Seq Scan on int4_tbl i2
   ->  Nested Loop Left Join
 Join Filter: (b.unique1 = 42)
 ->  Nested Loop
   ->  Nested Loop
 ->  Seq Scan on int4_tbl i1
!->  Index Scan using tenk1_thous_tenthous on tenk1 b
!  Index Cond: ((thousand = i1.f1) AND (i2.f1 = tenthous))
   ->  Index Scan using tenk1_unique1 on tenk1 a
 Index Cond: (unique1 = b.unique2)
 ->  Index Only Scan using tenk1_thous_tenthous on tenk1 c
   Index Cond: (thousand = a.thousand)
! (15 rows)

  select b.unique1 from
tenk1 a join tenk1 b on a.unique1 = b.unique2
--- 2791,2818 
join int4_tbl i1 on b.thousand = f1
right join int4_tbl i2 on i2.f1 = b.tenthous
order by 1;
!   QUERY PLAN
! ---------------------------------------------------------------------------
   Sort
 Sort Key: b.unique1
!->  Hash Right Join
!  Hash Cond: (b.tenthous = i2.f1)
   ->  Nested Loop Left Join
 Join Filter: (b.unique1 = 42)
 ->  Nested Loop
   ->  Nested Loop
 ->  Seq Scan on int4_tbl i1
!->  Bitmap Heap Scan on tenk1 b
!  Recheck Cond: (thousand = i1.f1)
!  ->  Bitmap Index Scan on tenk1_thous_tenthous
!Index Cond: (thousand = i1.f1)
   ->  Index Scan using tenk1_unique1 on tenk1 a
 Index Cond: (unique1 = b.unique2)
 ->  Index Only Scan using tenk1_thous_tenthous on tenk1 c
   Index Cond: (thousand = a.thousand)
!  ->  Hash
!->  Seq Scan on int4_tbl i2
! (19 rows)

  select b.unique1 from
tenk1 a join tenk1 b on a.unique1 = b.unique2

==

*** ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/expected/select_views_1.out 2014-03-17 19:35:47.0 +
--- ~/bin/postgresql_9.3.4/patched_3_64/postgresql-9.3.4/src/test/regress/results/select_views.out 2014-08-20 15:41:01.212603532 +0100

***
*** 1413,1423 
 WHERE f_leak(cnum) AND ymd >= '2011-10-01' AND ymd < '2011-11-01';
QUERY PLAN
--
!  Nested Loop
!Join Filter: (l.cid = r.cid)
 ->  Seq Scan on credit_usage r
   Filter: ((ymd >= '10-01-2011'::date) AND (ymd < '11-01-2011'::date))
!->  Materialize
   ->  Subquery Scan on l
 Filter: f_leak(l.cnum)
 ->  Hash Join
--- 1413,1423 
 WHERE f_leak(cnum) AND ymd >= '2011-10-01' AND ymd < '2011-11-01';
QUERY PLAN
--
!  Hash Join
!Hash Cond: (r.cid = l.cid)
 ->  Seq Scan on credit_usage r
   Filter: ((ymd >= '10-01-2011'::date) AND (ymd < '11-01-2011'::date))
!->  Hash
   ->  Subquery Scan on l
 Filter: f_leak(l.cnum)
 ->  Hash Join
***
*** 1446,1456 

   Subquery Scan on my_credit_card_usage_secure
 Filter: f_leak(my_credit_card_usage_secure.cnum)
!->  Nested Loop
!  Join Filter: (l.cid = r.cid)
   ->  Seq Scan on credit_usage r
 Filter: ((ymd >= '10-01-2011'::date) AND (ymd < '11-01-2011'::date))
!

Re: [HACKERS] Patch status: delta relations in AFTER triggers

2014-08-20 Thread Kevin Grittner
David Fetter  wrote:
> On Wed, Aug 20, 2014 at 07:29:39AM -0700, Kevin Grittner wrote:

>> The patch that David offered to use the tuplestores in C should
>> probably be updated to show both direct use based on what is in
>> TriggerData and SPI use.  I will leave that to David to submit; it 
>> doesn't need to be in the same patch-set as the rest; it might, in
>> fact, be better to offer it separately once the API is solid (i.e.,
>> after commit of the patch-sets I've just described.)
>
> I'd be happy to.  Is there a repo I can refer to, or should I wait to
> start until those patch sets are submitted to -hackers?

It would be best to wait for the patches to be posted, or possibly
for them to be committed.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch status: delta relations in AFTER triggers

2014-08-20 Thread David Fetter
On Wed, Aug 20, 2014 at 09:59:11AM -0700, Kevin Grittner wrote:
> David Fetter  wrote:
> > On Wed, Aug 20, 2014 at 07:29:39AM -0700, Kevin Grittner wrote:
> 
> >> The patch that David offered to use the tuplestores in C should
> >> probably be updated to show both direct use based on what is in
> >> TriggerData and SPI use.  I will leave that to David to submit; it
> >> doesn't need to be in the same patch-set as the rest; it might,
> >> in fact, be better to offer it separately once the API is solid
> >> (i.e., after commit of the patch-sets I've just described.)
> >
> > I'd be happy to.  Is there a repo I can refer to, or should I wait
> > to start until those patch sets are submitted to -hackers?
> 
> It would be best to wait for the patches to be posted, or possibly
> for them to be committed.

Works for me. :)

Cheers,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patching for increasing the number of columns

2014-08-20 Thread Tom Lane
Mayeul Kauffmann  writes:
> Tom wrote:
>> You would have to show us the actual failure diffs to get much useful 
>> comment, but in general increasing the size of tuple headers could 
>> easily lead to changes in plan choices

> Thank you Tom. So there is some hope! In effect the query plan is 
> different for the join and the view tests. The result set is different 
> only for the 'without_oid' test.

Hm.  I think the without_oid test is not showing that anything is broken;
what it's testing is whether a table with oids is physically bigger (more
pages) than one without oids but the same data.  It's not implausible
that your change results in the same number of tuples fitting onto a page
in both cases.  It'd be worth doing the math to make sure that makes
sense.  Not sure if there's an easy way to change the table schema so that
you get different physical sizes in both cases.

The other tests aren't showing any functional issue either AFAICS.
The change away from a nestloop plan in join.out is a bit annoying,
because IIRC that test is specifically intended to check nestloop
parameter management; but that just means the test is brittle.

> A side question: Are these tests comprehensive, or should I run other 
> tests just to be sure? Hints on where to find those tests are welcome.

No, they're not comprehensive, and no, we don't have more :-(

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] After switching primary server while using replication slot.

2014-08-20 Thread Robert Haas
On Tue, Aug 19, 2014 at 6:25 AM, Fujii Masao  wrote:
> On Mon, Aug 18, 2014 at 11:16 PM, Sawada Masahiko  
> wrote:
>> Hi all,
>> After switching the primary server while using replication slots, the
>> standby server will not be able to connect to the new primary server.
>> Imagine this situation, if primary server has two ASYNC standby
>> servers, also use each replication slots.
>> And the one standby(A) apply WAL without problems. But another one
>> standby(B) has stopped after connected to primary server.
>> (or sending WAL is too delayed)
>>
>> In this situation, the standby(B) has not received WAL segment file
>> while stopping itself.
>> And the primary server can not remove WAL segments which has not been
>> received to all standby.
>> Therefore the primary server have to keep the WAL segment file which
>> has not been received to all standby.
>> But standby(A) can do checkpoint itself, and then it's possible to
>> recycle WAL segments.
>> The number of WAL segments on each server is different.
>> (Standby(A) has fewer WAL files than the primary server.)
>> After the primary server crashes and standby(A) is promoted to primary,
>> we can try to connect standby(B) to standby(A) as a new standby server.
>> But it will fail because the standby(A) server might not have the WAL
>> segment files that standby(B) requires.
>
> This sounds valid concern.
>
>> To resolve this situation, I think that we should make the master server
>> notify all standby servers about the removal of WAL segments.
>> The standby servers would then recycle WAL segment files based on that information.
>>
>> Thought?
>
> How does the server recycle WAL files after it's promoted from the
> standby to master?
> It does that as it likes? If yes, your approach would not be enough.
>
> The approach prevents unexpected removal of WAL files while the standby
> is running. But after the standby is promoted to master, it might recycle
> needed WAL files immediately. So another standby may still fail to retrieve
> the required WAL file after the promotion.
>
> ISTM that, in order to address this, we might need to log all the replication
> slot activities and replicate them to the standby. I'm not sure if this
> breaks the design of replication slot at all, though.

Yuck.

I believe that the reason why replication slots are not currently
replicated is because we had the idea that the standby could have
slots that don't exist on the master, for cascading replication.  I'm
not sure that works yet, but I think Andres definitely had it in mind
in the original design.

It seems to me that if every machine needs to keep not only the WAL it
requires for itself, but also the WAL that any other machine in the
replication hierarchy might need, that pretty much sucks.  Suppose
you have a master with 10 standbys, and each standby has 10 cascaded
standbys.  If one of those standbys goes down, do we really want all
100 other machines to keep copies of all the WAL?  That seems rather
unfortunate, since it's likely that only a few of those many standbys
are machines to which we would consider failing over.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] failures on barnacle (CLOBBER_CACHE_RECURSIVELY) because of memory leaks

2014-08-20 Thread Tomas Vondra
Hi,

On 13.8.2014 19:17, Tomas Vondra wrote:
> On 13.8.2014 17:52, Tom Lane wrote:
>
>> * I'm a bit dubious about testing -DRANDOMIZE_ALLOCATED_MEMORY in the
>> same build as -DCLOBBER_CACHE_RECURSIVELY, because each of these is
>> darned expensive and it's not clear you'd learn anything by running
>> them both together.  I think you might be better advised to run two
>> separate buildfarm critters with those two options, and thereby perhaps
>> get turnaround in something less than 80 days.
> 
> OK, I removed this for barnacle/addax/mite, let's see what's the impact.
> 
> FWIW We have three other animals running with CLOBBER_CACHE_ALWAYS +
> RANDOMIZE_ALLOCATED_MEMORY, and it takes ~20h per branch. But maybe the
> price when combined with CLOBBER_CACHE_RECURSIVELY is much higher.
> 
>> * It'd likely be a good idea to take out the TestUpgrade and TestDecoding
>> modules from the config too.  Otherwise, we won't be seeing barnacle's
>> next report before 2015, judging from the runtime of the check step
>> compared to some of the other slow buildfarm machines.  (I wonder whether
>> there's an easy way to skip the installcheck step, as that's going to
>> require a much longer run than it can possibly be worth too.)
> 
> OK, I did this too.
> 
> I stopped the already running test on addax and started the test on
> barnacle again. Let's see in a few days/weeks/months what is the result.

It seems to be running much faster (probably after removing the
randomization), and apparently it passed the create_view tests without
crashing, but then crashed at 'constraints' (again, because of OOM).

  PortalMemory: 8192 total in 1 blocks; 7880 free (0 chunks); 312 used
PortalHeapMemory: 1024 total in 1 blocks; 840 free (0 chunks); 184 used
  ExecutorState: 769654952 total in 103 blocks; 114984 free (296 chunks); 769539968 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used

I suppose we don't expect 760MB ExecutorState here. Also, there's ~60MB
MessageContext.

It's still running, so I'm attaching the relevant part of the log (again,
with MemoryContextStats for backends with VSS >= 512MB).

FWIW, it's running against a844c29966d7c0cd6a457e9324f175349bb12df0.

regards
Tomas
[53f203cf.585e:35] LOG:  statement: INSERT INTO CHECK_TBL VALUES (1);
[53f203cf.5862:12] LOG:  statement: CREATE VIEW ro_view8 AS SELECT a, b FROM base_tbl ORDER BY a OFFSET 1;
[53f203cf.5862:13] LOG:  statement: CREATE VIEW ro_view9 AS SELECT a, b FROM base_tbl ORDER BY a LIMIT 1;
[53f203cf.5862:14] LOG:  statement: CREATE VIEW ro_view10 AS SELECT 1 AS a;
[53f203cf.5862:15] LOG:  statement: CREATE VIEW ro_view11 AS SELECT b1.a, b2.b FROM base_tbl b1, base_tbl b2;
[53f203cf.5862:16] LOG:  statement: CREATE VIEW ro_view12 AS SELECT * FROM generate_series(1, 10) AS g(a);
[53f203cf.5862:17] LOG:  statement: CREATE VIEW ro_view13 AS SELECT a, b FROM (SELECT * FROM base_tbl) AS t;
[53f203cf.5862:18] LOG:  statement: CREATE VIEW rw_view14 AS SELECT ctid, a, b FROM base_tbl;
TopMemoryContext: 61792 total in 9 blocks; 4264 free (12 chunks); 57528 used
  CFuncHash: 8192 total in 1 blocks; 1640 free (0 chunks); 6552 used
  TopTransactionContext: 8192 total in 1 blocks; 6208 free (23 chunks); 1984 used
SPI Exec: 50323456 total in 15 blocks; 2568088 free (79 chunks); 47755368 used
  CachedPlanSource: 3072 total in 2 blocks; 672 free (0 chunks); 2400 used
SPI Proc: 8192 total in 1 blocks; 8088 free (0 chunks); 104 used
  MessageContext: 4194304 total in 10 blocks; 1120840 free (130 chunks); 3073464 used
  Operator class cache: 8192 total in 1 blocks; 1640 free (0 chunks); 6552 used
  smgr relation table: 24576 total in 2 blocks; 13872 free (3 chunks); 10704 used
  TransactionAbortContext: 32768 total in 1 blocks; 32736 free (0 chunks); 32 used
  Portal hash: 8192 total in 1 blocks; 1640 free (0 chunks); 6552 used
  PortalMemory: 8192 total in 1 blocks; 7880 free (0 chunks); 312 used
PortalHeapMemory: 1024 total in 1 blocks; 840 free (0 chunks); 184 used
  ExecutorState: 243261440 total in 38 blocks; 3436120 free (250 chunks); 239825320 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
ExprContext: 0 total in 0 blocks; 0 free (0 chunks); 0 used
  Relcache by OID: 8192 total in 1 blocks; 592 free (0 chunks); 7600 used
  CacheMemoryContext: 516096 total in 6 blocks; 260744 free (49 chunks); 255352 used
fkeys2p_i: 1024 total in 1 blocks; 144 free (0 chunks); 880 used
fkeys2_i: 1024 total in 1 blocks; 8 free (0 chunks); 1016 used
EventTriggerCache: 8192 total in 1 blocks; 8160 free (114 chunks); 32 used
  Event Trigger Cache: 8192 total in 1 blocks; 3712 free (0 chunks); 4480 used
pg_auth_members_member_role_index: 1024 total in 1 blocks; 8 free (0 chunks); 1016 used
pg_authid_oid_index: 1024 total in 1 blocks; 88 free (0 chunks); 936 used
pg_authid_rolname_index: 1024 total in 1 blocks; 144 free (0 chunks); 880 used
pg_database_oid_index: 1024 total in 1 blocks; 88

Re: [HACKERS] 9.5: Memory-bounded HashAgg

2014-08-20 Thread Robert Haas
On Sun, Aug 17, 2014 at 1:17 PM, Tomas Vondra  wrote:
> Being able to batch inner and outer relations in a matching way is
> certainly one of the reasons why hashjoin uses that particular scheme.
> There are other reasons, though - for example being able to answer 'Does
> this group belong to this batch?' quickly, and automatically update
> number of batches.
>
> I'm not saying the lookup is extremely costly, but I'd be very surprised
> if it was as cheap as modulo on a 32-bit integer. Not saying it's the
> dominant cost here, but memory bandwidth is quickly becoming one of the
> main bottlenecks.

Well, I think you're certainly right that a hash table lookup is more
expensive than modulo on a 32-bit integer; so much is obvious.  But if
the load factor is not too large, I think that it's not a *lot* more
expensive, so it could be worth it if it gives us other advantages.
As I see it, the advantage of Jeff's approach is that it doesn't
really matter whether our estimates are accurate or not.  We don't
have to decide at the beginning how many batches to do, and then
possibly end up using too much or too little memory per batch if we're
wrong; we can let the amount of memory actually used during execution
determine the number of batches.  That seems good.  Of course, a hash
join can increase the number of batches on the fly, but only by
doubling it, so you might go from 4 batches to 8 when 5 would really
have been enough.  And a hash join also can't *reduce* the number of
batches on the fly, which might matter a lot.  Getting the number of
batches right avoids I/O, which is a lot more expensive than CPU.

>> But the situation here isn't comparable, because there's only one
>> input stream.  I'm pretty sure we'll want to keep track of which
>> transition states we've spilled due to lack of memory as opposed to
>> those which were never present in the table at all, so that we can
>> segregate the unprocessed tuples that pertain to spilled transition
>> states from the ones that pertain to a group we haven't begun yet.
>
> Why would that be necessary or useful? I don't see a reason for tracking
> that / segregating the tuples.

Suppose there are going to be three groups: A, B, C.  Each is an
array_agg(), and they're big, so only one of them will fit in work_mem at
a time.  However, we don't know that at the beginning, either because
we don't write the code to try or because we do write that code but
our cardinality estimates are way off; instead, we're under the
impression that all three will fit in work_mem.  So we start reading
tuples.  We see values for A and B, but we don't see any values for C
because those all occur later in the input.  Eventually, we run short
of memory and cut off creation of new groups.  Any tuples for C are
now going to get written to a tape from which we'll later reread them.
After a while, even that proves insufficient and we spill the
transition state for B to disk.  Any further tuples that show up for C
will need to be written to tape as well.  We continue processing and
finish group A.

Now it's time to do batch #2.  Presumably, we begin by reloading the
serialized transition state for group B.  To finish group B, we must
look at all the tuples that might possibly fall in that group.  If all
of the remaining tuples are on a single tape, we'll have to read all
the tuples in group B *and* all the tuples in group C; we'll
presumably rewrite the tuples that are not part of this batch onto a
new tape, which we'll then process in batch #3.  But if we took
advantage of the first pass through the input to put the tuples for
group B on one tape and the tuples for group C on another tape, we can
be much more efficient - just read the remaining tuples for group B,
not mixed with anything else, and then read a separate tape for group
C.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-08-20 Thread Robert Haas
On Tue, Aug 19, 2014 at 7:08 AM, Amit Kapila  wrote:
> On Fri, Aug 15, 2014 at 12:55 AM, Robert Haas  wrote:
>>
>> On Mon, Aug 11, 2014 at 12:59 AM, Amit Kapila 
>> wrote:
>> > 1.
>> > +Number of parallel connections to perform the operation. This
>> > option will enable the vacuum
>> > +operation to run on parallel connections, at a time one table
>> > will
>> > be operated on one connection.
>> >
>> > a. How about describing w.r.t asynchronous connections
>> > instead of parallel connections?
>>
>> I don't think "asynchronous" is a good choice of word.
>
> Agreed.
>
>>Maybe "simultaneous"?
>
> Not sure. How about *concurrent* or *multiple*?

multiple isn't right, but we could say concurrent.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-20 Thread
Robert and Tom,

Sorry for any confusion, I will try to clarify.

Here is the progression of events as I recall them:
- My Initial QNX 6.5 port proposal lacked a robust replacement for the existing 
System V shared memory locking mechanism, a show stopper.
- Robert proposed a nice set of possible alternatives for locking (to enable an 
all POSIX shared memory solution for future platforms).
- Tom and Robert seemed to agree that a combination of file-based locking plus 
pipe-based locking should be sufficiently robust on platforms without Sys V 
shared memory (e.g., QNX).
- I coded a proof-of-concept patch (fcntl + PIPE) which appeared to work on QNX 
(steps a through e).
- Robert countered with an 11 step algorithm (all in the postmaster)
- Tom suggested elimination of steps 5,6,7,8, and 11 (and swapping order 9 and 
10)

I was just taking a step back to ask what gaps existed in the proof-of-concept 
patch (steps a through e).
Is there a scenario it fails to cover, prompting the seemingly more complex 11 
step algorithm (which added writing data to the pipe and handling of SIGPIPE)?

I am willing to attempt coding of the set of changes for a QNX port (option for 
new locking and all POSIX shared memory, plus a few minor QNX-specific tweaks), 
provided you and Tom are satisfied that the show stoppers have been 
sufficiently addressed.

Please let me know if more discussion is required, or if it would be reasonable 
for me (or someone else of your choosing) to work on the coding effort (perhaps 
targeted for 9.5?)
If on the other hand it has been decided that a QNX port is not in the cards, I 
would like to know (I hope that is not the case given the progress made, but no 
point in wasting anyone's time).

Thanks again for your time, effort, patience, and coaching.

Keith Baker


> -Original Message-
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Wednesday, August 20, 2014 12:26 PM
> To: Baker, Keith [OCDUS Non-J&J]
> Cc: Tom Lane; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
> 
> On Mon, Aug 18, 2014 at 11:02 AM, Baker, Keith [OCDUS Non-J&J]
>  wrote:
> > My proof of concept code (steps a though e below) avoided any reading or
> writing to the pipe (and associated handling of SIGPIPE), it just relied on
> postmaster open of PIPE with ENXIO to indicate all is clear.
> 
> I'm not following.
> 
> > Robert, Assuming an algorithm choice is agreed upon in the near future,
> would you be the logical choice to implement the change?
> > I am happy to help, especially with any QNX-specific aspects, but don't
> want to step on anyone's toes.
> 
> I'm unlikely to have time to work on this in the immediate future, but I may
> be able to help review.
> 
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL
> Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-20 Thread Josh Berkus
On 08/20/2014 08:29 AM, Tom Lane wrote:
> Since it's looking like this might be the direction we want to go, I took
> the time to flesh out my proof-of-concept patch.  The attached version
> takes care of cosmetic issues (like fixing the comments), and includes
> code to avoid O(N^2) penalties in findJsonbValueFromContainer and
> JsonbIteratorNext

OK, will test.

This means we need a beta3, no?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-20 Thread Tom Lane
Josh Berkus  writes:
> This means we need a beta3, no?

If we change the on-disk format, I'd say so.  So we don't want to wait
around too long before deciding.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patching for increasing the number of columns

2014-08-20 Thread Mayeul Kauffmann


On 20/08/14 18:17, Tom Lane wrote:
> Hm. I think the without_oid test is not showing that anything is broken;


> The other tests aren't showing any functional issue either AFAICS.

Thanks a lot Tom! That's very helpful.
I have written more details and some basic SQL tests in the wiki of the 
application (LimeSurvey) which requires this:


http://manual.limesurvey.org/Instructions_for_increasing_the_maximum_number_of_columns_in_PostgreSQL_on_Linux

I will give update here or on that wiki (where most relevant) should I 
find issues while testing.


Cheers,

mayeulk



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Verbose output of pg_dump not show schema name

2014-08-20 Thread Fabrízio de Royes Mello
On Wed, Aug 20, 2014 at 2:43 AM, Michael Paquier 
wrote:
>
> I had a look at this patch, and here are a couple of comments:
> 1) Depending on how ArchiveEntry is called to register an object to
> dump, namespace may be NULL, but that is not the case for
> namespace->dobj.name, so you could get the namespace name at the top
> of the functions that have their verbose output improved with something
> like this:
> const char *namespace = tbinfo->dobj.namespace ?
>tbinfo->dobj.namespace->dobj.name : NULL;
> And then simplify the message output as follows:
> if (namespace)
>write_msg("blah \"%s\".\"%s\" blah", namespace, classname);
> else
>write_msg("blah \"%s\" blah", classname);
> You can as well safely remove the checks on namespace->dobj.name.

Ok

> 2) I don't think that this is correct:
> -   ahlog(AH, 1, "processing data for table \"%s\"\n",
> - te->tag);
> +   ahlog(AH, 1, "processing data for table \"%s\".\"%s\"\n",
> + AH->currSchema, te->tag);
> There are some code paths where AH->currSchema is set to NULL, and I
> think that you should use te->namespace instead.

Yes, you are correct!


> 3) Changing only this message is not enough. The following verbose
> messages need to be changed too for consistency:
> - pg_dump: creating $tag $object
> - pg_dump: setting owner and privileges for [blah]
>
> I have been pondering as well about doing similar modifications to the
> error message paths, but it did not seem worth it as this patch is
> aimed only for the verbose output. Btw, I have basically fixed those
> issues while doing the review, and finished with the attached patch.
> Fabrizio, is this new version fine for you?
>

Is fine to me.

I just changed "if (tbinfo->dobj.namespace != NULL)" to "if
(tbinfo->dobj.namespace)".

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index 3aebac8..07cc10e 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -546,8 +546,13 @@ RestoreArchive(Archive *AHX)
 		/* Both schema and data objects might now have ownership/ACLs */
 		if ((te->reqs & (REQ_SCHEMA | REQ_DATA)) != 0)
 		{
-			ahlog(AH, 1, "setting owner and privileges for %s %s\n",
-  te->desc, te->tag);
+			/* Show namespace if available */
+			if (te->namespace)
+ahlog(AH, 1, "setting owner and privileges for %s \"%s\".\"%s\"\n",
+	  te->desc, te->namespace, te->tag);
+			else
+ahlog(AH, 1, "setting owner and privileges for %s \"%s\"\n",
+	  te->desc, te->tag);
 			_printTocEntry(AH, te, ropt, false, true);
 		}
 	}
@@ -621,7 +626,13 @@ restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
 
 	if ((reqs & REQ_SCHEMA) != 0)		/* We want the schema */
 	{
-		ahlog(AH, 1, "creating %s %s\n", te->desc, te->tag);
+		/* Show namespace if available */
+		if (te->namespace)
+			ahlog(AH, 1, "creating %s \"%s\".\"%s\"\n",
+  te->desc, te->namespace, te->tag);
+		else
+			ahlog(AH, 1, "creating %s \"%s\"\n", te->desc, te->tag);
+
 
 		_printTocEntry(AH, te, ropt, false, false);
 		defnDumped = true;
@@ -713,8 +724,13 @@ restore_toc_entry(ArchiveHandle *AH, TocEntry *te,
 	_becomeOwner(AH, te);
 	_selectOutputSchema(AH, te->namespace);
 
-	ahlog(AH, 1, "processing data for table \"%s\"\n",
-		  te->tag);
+	/* Show namespace if available */
+	if (te->namespace)
+		ahlog(AH, 1, "processing data for table \"%s\".\"%s\"\n",
+			  te->namespace, te->tag);
+	else
+		ahlog(AH, 1, "processing data for table \"%s\"\n",
+			  te->tag);
 
 	/*
 	 * In parallel restore, if we created the table earlier in
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 5c0f95f..ea32b42 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -1383,6 +1383,8 @@ dumpTableData_copy(Archive *fout, void *dcontext)
 {
 	TableDataInfo *tdinfo = (TableDataInfo *) dcontext;
 	TableInfo  *tbinfo = tdinfo->tdtable;
+	const char *namespace = tbinfo->dobj.namespace ?
+		tbinfo->dobj.namespace->dobj.name : NULL;
 	const char *classname = tbinfo->dobj.name;
 	const bool	hasoids = tbinfo->hasoids;
 	const bool	oids = tdinfo->oids;
@@ -1400,7 +1402,16 @@ dumpTableData_copy(Archive *fout, void *dcontext)
 	const char *column_list;
 
 	if (g_verbose)
-		write_msg(NULL, "dumping contents of table %s\n", classname);
+	{
+		/* Print namespace information if available */
+		if (namespace)
+			write_msg(NULL, "dumping contents of table \"%s\".\"%s\"\n",
+	  namespace,
+	  classname);
+		else
+			write_msg

[HACKERS] documentation update for doc/src/sgml/func.sgml

2014-08-20 Thread Andreas 'ads' Scherbaum


Hi,

attached is a small patch which updates doc/src/sgml/func.sgml. The 
change explains that functions like round() and others might behave 
differently depending on your operating system (because of rint(3)) and 
that this is according to an IEEE standard. It also points out that #.5 
is not always rounded up, as one might expect from a mathematical point of view.
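
To see the effect from C (a minimal illustration, not part of the attached 
patch; rint(3) follows the current rounding mode, which defaults to 
round-to-nearest-even on IEEE 754 systems):

#include <math.h>
#include <stdio.h>

int main(void)
{
    /* halfway cases go to the nearest even value, not always up */
    printf("%.1f %.1f %.1f\n", rint(0.5), rint(1.5), rint(2.5));
    /* typically prints: 0.0 2.0 2.0 */
    return 0;
}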



Regards,

--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 13c71af..da30991 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -924,6 +924,25 @@
 

 
+   
+For functions like round(), log() and sqrt() which run against
+either fixed-precision (NUMERIC) or floating-point numbers (e.g. REAL),
+note that the results of these operations will differ according
+to the input type due to rounding. This is most observable with
+round(), which can end up rounding down as well as up for
+any #.5 value. We use the
+http://en.wikipedia.org/w/index.php?title=IEEE_floating_point&oldid=622007055#Rounding_rules";>IEEE's rules
+for rounding floating-point numbers which can be machine-specific.
+The bitwise operators work only on integral data types, whereas
+the others are available for all numeric data types.
+  
+
+  
+The bitwise operators are also available for the bit string types
+bit and bit varying, as
+shown in .
+   
+
   
  shows functions for
 generating random numbers.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-20 Thread Alvaro Herrera
Baker, Keith [OCDUS Non-J&J] wrote:

> Please let me know if more discussion is required, or if it would be
> reasonable for me (or someone else of your choosing) to work on the
> coding effort (perhaps targeted for 9.5?)
> If on the other hand it has been decided that a QNX port is not in the
> cards, I would like to know (I hope that is not the case given the
> progress made, but no point in wasting anyone's time).

As I recall, other than the postmaster startup interlock, the other
major missing item you mentioned is SA_RESTART.  That could well turn
out to be a showstopper, so I suggest you study that in more depth.

Are there other major items missing?  Did you have to use
configure --disable-spinlocks for instance?

What's your compiler, and what are the underlying hardware platforms you
want to support?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-20 Thread Josh Berkus
On 08/20/2014 08:29 AM, Tom Lane wrote:
> Josh Berkus  writes:
>> On 08/15/2014 04:19 PM, Tom Lane wrote:
>>> Personally I'd prefer to go to the all-lengths approach, but a large
>>> part of that comes from a subjective assessment that the hybrid approach
>>> is too messy.  Others might well disagree.
> 
>> ... So, that extraction test is about 1% *slower* than the basic Tom Lane
>> lengths-only patch, and still 80% slower than original JSONB.  And it's
>> the same size as the lengths-only version.
> 
> Since it's looking like this might be the direction we want to go, I took
> the time to flesh out my proof-of-concept patch.  The attached version
> takes care of cosmetic issues (like fixing the comments), and includes
> code to avoid O(N^2) penalties in findJsonbValueFromContainer and
> JsonbIteratorNext.  I'm not sure whether those changes will help
> noticeably on Josh's test case; for me, they seemed worth making, but
> they do not bring the code back to full speed parity with the all-offsets
> version.  But as we've been discussing, it seems likely that those costs
> would be swamped by compression and I/O considerations in most scenarios
> with large documents; and of course for small documents it hardly matters.

Table sizes and extraction times are unchanged from the prior patch
based on my workload.

We should be comparing all-lengths vs length-and-offset maybe using
another workload as well ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-20 Thread
Alvaro,

Thanks for your interest and questions.
At this point I have created a proof-of-concept QNX 6.5 port which appears to 
work on the surface (passes regression tests), but needs to be deemed 
"production-quality".

To work around the lack of SA_RESTART, I added QNX-specific retry macros to port.h.
With these macros in place "make check" runs cleanly (it fails in many places 
without them).

+#if defined(__QNX__)
+/* QNX does not support sigaction SA_RESTART. We must retry interrupted calls (EINTR) */
+/* Helper macros, used to build our retry macros */
+#define PG_RETRY_EINTR3(exp,val,type) ({ type _tmp_rc; do _tmp_rc = (exp); while (_tmp_rc == (val) && errno == EINTR); _tmp_rc; })
+#define PG_RETRY_EINTR(exp) PG_RETRY_EINTR3(exp,-1L,long int)
+#define PG_RETRY_EINTR_FILE(exp) PG_RETRY_EINTR3(exp,NULL,FILE *)
+/* override calls known to return EINTR when interrupted */
+#define close(a) PG_RETRY_EINTR(close(a))
+#define fclose(a) PG_RETRY_EINTR(fclose(a))
+#define fdopen(a,b) PG_RETRY_EINTR_FILE(fdopen(a,b))
+#define fopen(a,b) PG_RETRY_EINTR_FILE(fopen(a,b))
+#define freopen(a,b,c) PG_RETRY_EINTR_FILE(freopen(a,b,c))
+#define fseek(a,b,c) PG_RETRY_EINTR(fseek(a,b,c))
+#define fseeko(a,b,c) PG_RETRY_EINTR(fseeko(a,b,c))
+#define ftruncate(a,b) PG_RETRY_EINTR(ftruncate(a,b))
+#define lseek(a,b,c) PG_RETRY_EINTR(lseek(a,b,c))
+#define open(a,b,...) ({ int _tmp_rc; do _tmp_rc = open(a,b,##__VA_ARGS__); while (_tmp_rc == (-1) && errno == EINTR); _tmp_rc; })
+#define shm_open(a,b,c) PG_RETRY_EINTR(shm_open(a,b,c))
+#define stat(a,b) PG_RETRY_EINTR(stat(a,b))
+#define unlink(a) PG_RETRY_EINTR(unlink(a))
... (Macros for read and write are similar but slightly longer, so I omit them here)...
+#endif /* __QNX__ */
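
(For illustration only, not from the original mail: the omitted read/write 
wrappers would presumably follow the same pattern, with ssize_t as the 
return type; the macro name below is hypothetical.)

#define PG_RETRY_EINTR_SSIZE(exp) ({ ssize_t _tmp_rc; do _tmp_rc = (exp); while (_tmp_rc == -1 && errno == EINTR); _tmp_rc; })
#define read(a,b,c) PG_RETRY_EINTR_SSIZE(read(a,b,c))
#define write(a,b,c) PG_RETRY_EINTR_SSIZE(write(a,b,c))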

Here is what I used for configure; I am open to suggestions:
./configure --without-readline --disable-thread-safety

I am targeting QNX 6.5 on x86, using gcc 4.4.2.

Also, I have an issue to work out for locale support, but expect I can solve 
that.

Keith Baker

> -Original Message-
> From: Alvaro Herrera [mailto:alvhe...@2ndquadrant.com]
> Sent: Wednesday, August 20, 2014 4:16 PM
> To: Baker, Keith [OCDUS Non-J&J]
> Cc: Robert Haas; Tom Lane; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL
> 
> Baker, Keith [OCDUS Non-J&J] wrote:
> 
> > Please let me know if more discussion is required, or if it would be
> > reasonable for me (or someone else of your choosing) to work on the
> > coding effort (perhaps targeted for 9.5?) If on the other hand it has
> > been decided that a QNX port is not in the cards, I would like to know
> > (I hope that is not the case given the progress made, but no point in
> > wasting anyone's time).
> 
> As I recall, other than the postmaster startup interlock, the other major
> missing item you mentioned is SA_RESTART.  That could well turn out to be a
> showstopper, so I suggest you study that in more depth.
> 
> Are there other major items missing?  Did you have to use configure --
> disable-spinlocks for instance?
> 
> What's your compiler, and what are the underlying hardware platforms you
> want to support?
> 
> --
> Álvaro Herrerahttp://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb format is pessimal for toast compression

2014-08-20 Thread Arthur Silva
What data are you using right now Josh?

There's the github archive http://www.githubarchive.org/
Here's some sample data https://gist.github.com/igrigorik/2017462




--
Arthur Silva



On Wed, Aug 20, 2014 at 6:09 PM, Josh Berkus  wrote:

> On 08/20/2014 08:29 AM, Tom Lane wrote:
> > Josh Berkus  writes:
> >> On 08/15/2014 04:19 PM, Tom Lane wrote:
> >>> Personally I'd prefer to go to the all-lengths approach, but a large
> >>> part of that comes from a subjective assessment that the hybrid
> approach
> >>> is too messy.  Others might well disagree.
> >
> >> ... So, that extraction test is about 1% *slower* than the basic Tom
> Lane
> >> lengths-only patch, and still 80% slower than original JSONB.  And it's
> >> the same size as the lengths-only version.
> >
> > Since it's looking like this might be the direction we want to go, I took
> > the time to flesh out my proof-of-concept patch.  The attached version
> > takes care of cosmetic issues (like fixing the comments), and includes
> > code to avoid O(N^2) penalties in findJsonbValueFromContainer and
> > JsonbIteratorNext.  I'm not sure whether those changes will help
> > noticeably on Josh's test case; for me, they seemed worth making, but
> > they do not bring the code back to full speed parity with the all-offsets
> > version.  But as we've been discussing, it seems likely that those costs
> > would be swamped by compression and I/O considerations in most scenarios
> > with large documents; and of course for small documents it hardly
> matters.
>
> Table sizes and extraction times are unchanged from the prior patch
> based on my workload.
>
> We should be comparing all-lengths vs length-and-offset maybe using
> another workload as well ...
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
>


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Bruce Momjian
On Wed, Aug 20, 2014 at 10:36:40AM -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
> >> Alternatively, you could use the process PID as part of the temp file
> >> name; which is probably a good idea anyway.
> 
> > I think that's actually worse, because nothing will clean up those
> > unless you explicitly scan for all .$pid files, and somehow
> > kill them.
> 
> True.  As long as the copy command is prepared to get rid of a
> pre-existing target file, using a fixed .tmp extension should be fine.

Well, then we are back to this comment by MauMau:

> With that said, copying to a temporary file like .tmp and
> renaming it to  sounds worthwhile even as a basic copy utility.
> I want to avoid copying to a temporary file with a fixed name like
> _copy.tmp, because some advanced utility may want to run multiple
> instances of pg_copy to copy several files into the same directory
> simultaneously.  However, I'm afraid multiple .tmp files might
> continue to occupy disk space after canceling copy or power failure in
> some use cases, where the copy of the same file won't be retried.
> That's also the reason why I chose to not use a temporary file like
> cp/copy.

Do we want cases where the same directory is used by multiple pg_copy
processes?  I can't imagine how that setup would make sense.

I am thinking pg_copy should emit a warning message when it removes an
old temp file.  This might alert people that something odd is happening
if they see the message often.

The pid-extension idea would work as pg_copy can test all pid extension
files to see if the pid is still active.  However, that assumes that the
pid is running on the local machine and not on another machine that has
NFS-mounted this directory, so maybe this is a bad idea, but again, we
are back to the idea that only one process should be writing into this
directory.
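
(A minimal sketch of that liveness test, for illustration only; the function 
name is hypothetical, and as noted above the check says nothing about writers 
on other machines that have NFS-mounted the directory:)

#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>

/* true if a process with this PID appears to exist on this machine */
static bool
pid_is_active(pid_t pid)
{
    return kill(pid, 0) == 0 || errno == EPERM;
}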

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-08-20 Thread Andres Freund
On 2014-08-20 18:58:05 -0400, Bruce Momjian wrote:
> On Wed, Aug 20, 2014 at 10:36:40AM -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On 2014-08-20 10:19:33 -0400, Tom Lane wrote:
> > >> Alternatively, you could use the process PID as part of the temp file
> > >> name; which is probably a good idea anyway.
> > 
> > > I think that's actually worse, because nothing will clean up those
> > > unless you explicitly scan for all .$pid files, and somehow
> > > kill them.
> > 
> > True.  As long as the copy command is prepared to get rid of a
> > pre-existing target file, using a fixed .tmp extension should be fine.
> 
> Well, then we are back to this comment by MauMau:

> > With that said, copying to a temporary file like .tmp and
> > renaming it to  sounds worthwhile even as a basic copy utility.
> > I want to avoid copying to a temporary file with a fixed name like
> > _copy.tmp, because some advanced utility may want to run multiple
> > instances of pg_copy to copy several files into the same directory
> > simultaneously.  However, I'm afraid multiple .tmp files might
> > continue to occupy disk space after canceling copy or power failure in
> > some use cases, where the copy of the same file won't be retried.
> > That's also the reason why I chose to not use a temporary file like
> > cp/copy.
> 
> Do we want cases where the same directory is used by multiple pg_copy
> processes?  I can't imagine how that setup would make sense.

I don't think anybody is proposing the _copy.tmp approach. We've just
argued about the risk of .tmp. And I argued - and others seem to
agree - that the space usage problem isn't really relevant, because archive
commands and such are rerun after failure and can then clean up the temp
file again.

> I am thinking pg_copy should emit a warning message when it removes an
> old temp file.  This might alert people that something odd is happening
> if they see the message often.

Don't really see a point in this. If the archive command or such failed,
that will already have been logged. I'd expect this to be implemented by
passing O_CREAT | O_TRUNC to open(), nothing else.
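
A minimal sketch of that approach (illustration only, not the actual
pg_copy code; the file names, buffer size, and error handling here are
made up):

/*
 * Copy src to dst by writing to a fixed-name temporary file and renaming
 * it into place.  O_CREAT | O_TRUNC silently reuses any stale temp file
 * left behind by a previous failed run.
 */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

static int
copy_file_atomically(const char *src, const char *dst)
{
    char        tmppath[1024];
    char        buf[65536];
    ssize_t     nread;
    int         srcfd;
    int         tmpfd;

    snprintf(tmppath, sizeof(tmppath), "%s.tmp", dst);

    if ((srcfd = open(src, O_RDONLY, 0)) < 0)
        return -1;
    if ((tmpfd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600)) < 0)
        return -1;

    while ((nread = read(srcfd, buf, sizeof(buf))) > 0)
    {
        if (write(tmpfd, buf, nread) != nread)
            return -1;
    }

    /* flush to disk before exposing the file under its final name */
    if (nread < 0 || fsync(tmpfd) < 0 || close(tmpfd) < 0 || close(srcfd) < 0)
        return -1;
    return rename(tmppath, dst);
}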

> The pid-extension idea would work as pg_copy can test all pid extension
> files to see if the pid is still active.  However, that assumes that the
> pid is running on the local machine and not on another machine that has
> NFS-mounted this directory, so maybe this is a bad idea, but again, we
> are back to the idea that only one process should be writing into this
> directory.

I don't actually think we should assume that. There very well could be
one process running an archive command, using differently prefixed file
names or such.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes

2014-08-20 Thread Alvaro Herrera
Alvaro Herrera wrote:

> So here's v16, rebased on top of 9bac66020.  As far as I am concerned,
> this is the last version before I start renaming everything to BRIN and
> then commit.

FWIW in case you or others have interest, here's the diff between your
patch and v16.  Also, for illustrative purposes, the diff between
versions yours and mine of the code that got moved to mmpageops.c
because it's difficult to see it from the partial patch.  (There's
nothing to do with that partial diff other than read it directly.)

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/contrib/pageinspect/mmfuncs.c
--- b/contrib/pageinspect/mmfuncs.c
***
*** 29,35 
  PG_FUNCTION_INFO_V1(minmax_page_type);
  PG_FUNCTION_INFO_V1(minmax_page_items);
  PG_FUNCTION_INFO_V1(minmax_metapage_info);
- PG_FUNCTION_INFO_V1(minmax_revmap_array_data);
  PG_FUNCTION_INFO_V1(minmax_revmap_data);
  
  typedef struct mm_column_state
--- 29,34 
***
*** 388,394  minmax_revmap_data(PG_FUNCTION_ARGS)
  	values[0] = Int64GetDatum((uint64) 0);
  
  	/* Extract (possibly empty) list of TIDs in this page. */
! 	for (i = 0; i < REGULAR_REVMAP_PAGE_MAXITEMS; i++)
  	{
  		ItemPointer	tid;
  
--- 387,393 
  	values[0] = Int64GetDatum((uint64) 0);
  
  	/* Extract (possibly empty) list of TIDs in this page. */
! 	for (i = 0; i < REVMAP_PAGE_MAXITEMS; i++)
  	{
  		ItemPointer	tid;
  
*** a/src/backend/access/minmax/Makefile
--- b/src/backend/access/minmax/Makefile
***
*** 12,17  subdir = src/backend/access/minmax
  top_builddir = ../../../..
  include $(top_builddir)/src/Makefile.global
  
! OBJS = minmax.o mmrevmap.o mmtuple.o mmxlog.o mmsortable.o
  
  include $(top_srcdir)/src/backend/common.mk
--- 12,17 
  top_builddir = ../../../..
  include $(top_builddir)/src/Makefile.global
  
! OBJS = minmax.o mmpageops.o mmrevmap.o mmtuple.o mmxlog.o mmsortable.o
  
  include $(top_srcdir)/src/backend/common.mk
*** a/src/backend/access/minmax/minmax.c
--- b/src/backend/access/minmax/minmax.c
***
*** 15,45 
   */
  #include "postgres.h"
  
- #include "access/htup_details.h"
  #include "access/minmax.h"
  #include "access/minmax_internal.h"
  #include "access/minmax_page.h"
! #include "access/minmax_revmap.h"
! #include "access/minmax_tuple.h"
  #include "access/minmax_xlog.h"
  #include "access/reloptions.h"
  #include "access/relscan.h"
- #include "access/xlogutils.h"
  #include "catalog/index.h"
- #include "catalog/pg_operator.h"
- #include "commands/vacuum.h"
  #include "miscadmin.h"
  #include "pgstat.h"
  #include "storage/bufmgr.h"
  #include "storage/freespace.h"
- #include "storage/indexfsm.h"
- #include "storage/lmgr.h"
- #include "storage/smgr.h"
- #include "utils/datum.h"
- #include "utils/lsyscache.h"
- #include "utils/memutils.h"
  #include "utils/rel.h"
- #include "utils/syscache.h"
  
  
  /*
--- 15,33 
   */
  #include "postgres.h"
  
  #include "access/minmax.h"
  #include "access/minmax_internal.h"
  #include "access/minmax_page.h"
! #include "access/minmax_pageops.h"
  #include "access/minmax_xlog.h"
  #include "access/reloptions.h"
  #include "access/relscan.h"
  #include "catalog/index.h"
  #include "miscadmin.h"
  #include "pgstat.h"
  #include "storage/bufmgr.h"
  #include "storage/freespace.h"
  #include "utils/rel.h"
  
  
  /*
***
*** 75,93  static MMBuildState *initialize_mm_buildstate(Relation idxRel,
  static bool terminate_mm_buildstate(MMBuildState *state);
  static void summarize_range(MMBuildState *mmstate, Relation heapRel,
  BlockNumber heapBlk);
- static bool mm_doupdate(Relation idxrel, BlockNumber pagesPerRange,
- 			mmRevmapAccess *rmAccess, BlockNumber heapBlk,
- 			Buffer oldbuf, OffsetNumber oldoff,
- 			const MMTuple *origtup, Size origsz,
- 			const MMTuple *newtup, Size newsz,
- 			bool samepage, bool *extended);
- static void mm_doinsert(Relation idxrel, BlockNumber pagesPerRange,
- 			mmRevmapAccess *rmAccess, Buffer *buffer, BlockNumber heapblkno,
- 			MMTuple *tup, Size itemsz, bool *extended);
- static Buffer mm_getinsertbuffer(Relation irel, Buffer oldbuf, Size itemsz,
-    bool *extended);
  static void form_and_insert_tuple(MMBuildState *mmstate);
- static Size mm_page_get_freespace(Page page);
  
  
  /*
--- 63,69 
***
*** 123,128  mminsert(PG_FUNCTION_ARGS)
--- 99,105 
  	rmAccess = mmRevmapAccessInit(idxRel, &pagesPerRange);
  
  restart:
+ 	CHECK_FOR_INTERRUPTS();
  	heapBlk = ItemPointerGetBlockNumber(heaptid);
  	/* normalize the block number to be the first block in the range */
  	heapBlk = (heapBlk / pagesPerRange) * pagesPerRange;
***
*** 155,161  restart:
  
  		addValue = index_getprocinfo(idxRel, keyno + 1,
  	 MINMAX_PROCNUM_ADDVALUE);
- 
  		result = FunctionCall5Coll(addValue,
     idxRel->rd_indcollation[keyno],
     PointerGetDatum(mmdesc),
--- 132,137

Re: [HACKERS] Verbose output of pg_dump not show schema name

2014-08-20 Thread Michael Paquier
On Thu, Aug 21, 2014 at 5:11 AM, Fabrízio de Royes Mello
 wrote:
> I just changed "if (tbinfo->dobj.namespace != NULL)" to "if
> (tbinfo->dobj.namespace)".
Fine for me. I am marking this patch as ready for committer.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental backup: add backup profile to base backup

2014-08-20 Thread Bruce Momjian
On Mon, Aug 18, 2014 at 04:05:07PM +0300, Heikki Linnakangas wrote:
> But more to the point, I thought the consensus was to use the
> highest LSN of all the blocks in the file, no? That's essentially
> free to calculate (if you have to read all the data anyway), and
> isn't vulnerable to collisions.

The highest-LSN approach allows you to read only the tail part of each
8k block.  Assuming 512-byte storage sector sizes, you only have to read
1/8 of the file.

Now, the problem is that you lose kernel prefetch, but maybe
posix_fadvise() would fix that problem.
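
As a rough illustration of the per-block LSN scan (a sketch only, not
code from any patch; it assumes the standard 8kB block size and a
native-byte-order relation segment, and it skips error handling - note
that pd_lsn lives at the start of each page header, so the sketch reads
the head of each block):

#include <stdint.h>
#include <stdio.h>

#define BLCKSZ 8192

/*
 * Return the highest page LSN in a relation segment by reading only the
 * pd_lsn field (the first 8 bytes of each page header).
 */
static uint64_t
highest_page_lsn(FILE *f)
{
    uint32_t    pd_lsn[2];      /* xlogid, xrecoff */
    uint64_t    lsn;
    uint64_t    max_lsn = 0;
    long        blkno = 0;

    while (fseek(f, blkno * (long) BLCKSZ, SEEK_SET) == 0 &&
           fread(pd_lsn, sizeof(uint32_t), 2, f) == 2)
    {
        lsn = ((uint64_t) pd_lsn[0] << 32) | pd_lsn[1];
        if (lsn > max_lsn)
            max_lsn = lsn;
        blkno++;
    }
    return max_lsn;
}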

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-20 Thread Andres Freund
Hi,

On 2014-08-20 21:21:41 +, Baker, Keith [OCDUS Non-J&J] wrote:
> To work around lack of SA_RESTART, I added QNX-specific retry macros to port.h.
> With these macros in place "make check" runs cleanly (fails in many places
> without them).
> 
> +#if defined(__QNX__)
> +/* QNX does not support sigaction SA_RESTART. We must retry interrupted 
> calls (EINTR) */
> +/* Helper macros, used to build our retry macros */
> +#define PG_RETRY_EINTR3(exp,val,type) ({ type _tmp_rc; do _tmp_rc = 
> (exp); while (_tmp_rc == (val) && errno == EINTR); _tmp_rc; })
> +#define PG_RETRY_EINTR(exp) PG_RETRY_EINTR3(exp,-1L,long int)
> +#define PG_RETRY_EINTR_FILE(exp) PG_RETRY_EINTR3(exp,NULL,FILE *)
> +/* override calls known to return EINTR when interrupted */
> +#define close(a) PG_RETRY_EINTR(close(a))
> +#define fclose(a) PG_RETRY_EINTR(fclose(a))
> +#define fdopen(a,b) PG_RETRY_EINTR_FILE(fdopen(a,b))
> +#define fopen(a,b) PG_RETRY_EINTR_FILE(fopen(a,b))
> +#define freopen(a,b,c) PG_RETRY_EINTR_FILE(freopen(a,b,c))
> +#define fseek(a,b,c) PG_RETRY_EINTR(fseek(a,b,c))
> +#define fseeko(a,b,c) PG_RETRY_EINTR(fseeko(a,b,c))
> +#define ftruncate(a,b) PG_RETRY_EINTR(ftruncate(a,b))
> +#define lseek(a,b,c) PG_RETRY_EINTR(lseek(a,b,c))
> +#define open(a,b,...) ({ int _tmp_rc; do _tmp_rc = 
> open(a,b,##__VA_ARGS__); while (_tmp_rc == (-1) && errno == EINTR); _tmp_rc; 
> })
> +#define shm_open(a,b,c) PG_RETRY_EINTR(shm_open(a,b,c))
> +#define stat(a,b) PG_RETRY_EINTR(stat(a,b))
> +#define unlink(a) PG_RETRY_EINTR(unlink(a))
> ... (Macros for read and write are similar but slightly longer, so I omit 
> them here)...
> +#endif   /* __QNX__ */

I think this is a horrible way to go and unlikely to succeed. You're
surely going to miss calls and it's going to need to be maintained
continuously. We'll miss adding things which will then only break under
load, which most people won't be able to generate under QNX.

The only reasonable way to fake kernel SA_RESTART support is to do so in
$platform's libc, in the syscall wrapper.

> Here is what I used for configure, I am open to suggestions:
> ./configure --without-readline --disable-thread-safety

Why is the --disable-thread-safety needed?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Incremental backup: add backup profile to base backup

2014-08-20 Thread Claudio Freire
On Wed, Aug 20, 2014 at 8:24 PM, Bruce Momjian  wrote:
> On Mon, Aug 18, 2014 at 04:05:07PM +0300, Heikki Linnakangas wrote:
>> But more to the point, I thought the consensus was to use the
>> highest LSN of all the blocks in the file, no? That's essentially
>> free to calculate (if you have to read all the data anyway), and
>> isn't vulnerable to collisions.
>
> The highest-LSN approach allows you to read only the tail part of each
> 8k block.  Assuming 512-byte storage sector sizes, you only have to read
> 1/8 of the file.
>
> Now, the problem is that you lose kernel prefetch, but maybe
> posix_fadvise() would fix that problem.

Sequential reads of 512-byte blocks or 8k blocks take the same amount
of time on rotating media (if they're scheduled right). Maybe not on
SSD media.

Not only that, the kernel will read in 4k blocks instead of 8k (at least on Linux).

So the benefit is dubious.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal to add a QNX 6.5 port to PostgreSQL

2014-08-20 Thread Andres Freund
On 2014-07-25 18:29:53 -0400, Tom Lane wrote:
> > * QNX lacks sigaction SA_RESTART: I modified "src/include/port.h" 
> > to define macros to retry system calls upon EINTR (open,read,write,...) 
> > when compiled on QNX
> 
> That's pretty scary too.  For one thing, such macros would affect every
> call site whether it's running with SA_RESTART or not.  Do you really
> need it?  It looks to me like we just turn off HAVE_POSIX_SIGNALS if
> you don't have SA_RESTART.  Maybe that code has bit-rotted by now, but
> it did work at one time.

I have pretty much no trust that we're maintaining
!HAVE_POSIX_SIGNAL, and none that we have the capability of doing so. I
seriously doubt there are any !HAVE_POSIX_SIGNAL animals, and
873ab97219caabeb2f7b390268a4fe01e2b7518c makes it pretty darn unlikely
that we have much chance of finding such mistakes during development.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Verbose output of pg_dump not show schema name

2014-08-20 Thread Fabrízio de Royes Mello
On Wed, Aug 20, 2014 at 8:24 PM, Michael Paquier 
wrote:
>
> On Thu, Aug 21, 2014 at 5:11 AM, Fabrízio de Royes Mello
>  wrote:
> > I just changed "if (tbinfo->dobj.namespace != NULL)" to "if
> > (tbinfo->dobj.namespace)".
> Fine for me. I am marking this patch as ready for committer.
>

Thanks!

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] PQgetssl() and alternative SSL implementations

2014-08-20 Thread Bruce Momjian
On Tue, Aug 19, 2014 at 03:26:56PM -0400, Stephen Frost wrote:
> I think there's been some improvement since I last had to go through the
> pain of setting this all up, and some of it is undoubtably OpenSSL's
> fault, but there's definitely quite a bit more we could be doing to make
> SSL support easier.  I'm hopeful that I'll be able to spend more time on
> this in the future but it's not a priority currently.

I know I updated the docs on this in 2013:

commit fa4add50c4ea97c48881fa8cb3863df80141643c
Author: Bruce Momjian 
Date:   Fri Dec 6 09:42:08 2013 -0500

docs: clarify SSL certificate authority chain docs

Previously, the requirements of how intermediate certificates were
handled and their chain to root certificates was unclear.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQgetssl() and alternative SSL implementations

2014-08-20 Thread Bruce Momjian
On Tue, Aug 19, 2014 at 03:47:17PM -0400, Stephen Frost wrote:
> * Robert Haas (robertmh...@gmail.com) wrote:
> > BTW, if we're beating on libpq, I wonder if we shouldn't consider
> > bumping the soversion at some point.  I mean, I know that we
> > technically don't need to do that if we're only *adding* functions and
> > not changing any of the existing stuff in backward-incompatible ways,
> > but we might *want* to make some backward-incompatible changes at some
> > point, and I think there's a decent argument that any patch in this
> > are is already doing that at least to PQgetSSL().  Maybe this would be
> > a good time to think if there's anything else we want to do that
> > would, either by itself or in combination, justify a bump.
> 
> I'm not a big fan of doing it for this specific item, though it's
> technically an API breakage (which means we should actually have
> libpq2-dev packages, make everything that build-deps on libpq-dev
> update to build-dep on libpq2-dev, have libpq6, etc..).  If there are
> other backwards-incompatible things we wish to do, then I agree that
> it'd be good to do them all at the same time (perhaps in conjunction
> with 10.0...).  This is the part where I wish we had been keeping an
> updated list of things we want to change (like on the wiki..).

We have one; it's called "Wire Protocol Changes":

https://wiki.postgresql.org/wiki/Todo

Unfortunately, the subsection link doesn't work on Firefox.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] add line number as prompt option to psql

2014-08-20 Thread Sawada Masahiko
On Wed, Aug 20, 2014 at 9:00 PM, Jeevan Chalke
 wrote:
> Hi,
>
> I have reviewed this:
>
> I have initialized cur_lineno to UINTMAX - 2 and then observed the following
> behaviour to check wrap-around.
>
> postgres=# \set PROMPT1 '%/[%l]%R%# '
> postgres[18446744073709551613]=# \set PROMPT2 '%/[%l]%R%# '
> postgres[18446744073709551613]=# select
> postgres[18446744073709551614]-# a
> postgres[18446744073709551615]-# ,
> postgres[0]-# b
> postgres[1]-# from
> postgres[2]-# dual;
>
> It is wrapping to 0, whereas line numbers always start with 1. Any issues?
>
> I would like to ignore this, as UINTMAX lines are too much for an input
> buffer to hold. There is almost no chance of hitting this.
>
>
> However, I think you need to use UINT64_FORMAT while printing a uint64
> number. Currently it is %u, which wraps around at UINT_MAX.
> See how pset.lineno is displayed.
>
> Apart from this, I didn't see any issues in my testing.
>
> Patch applies cleanly. make/make install/initdb/make check all are well.
>

Thank you for reviewing the patch!
The attached patch is the latest version.
I modified the output format of cur_lineno.
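
As a small illustration of that point (a sketch only, not the actual
patch; the helper and variable names here are made up, while
UINT64_FORMAT and strlcat() are existing PostgreSQL facilities):

#include "postgres_fe.h"

/* Append a 64-bit line counter to the prompt without wrapping at UINT_MAX */
static void
append_line_number(char *destination, size_t destlen, uint64 cur_lineno)
{
    char        numbuf[32];

    snprintf(numbuf, sizeof(numbuf), UINT64_FORMAT, cur_lineno);
    strlcat(destination, numbuf, destlen);
}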

Regards,

---
Sawada Masahiko


psql-line-number_v7.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-08-20 Thread Amit Kapila
On Thu, Aug 21, 2014 at 12:04 AM, Robert Haas  wrote:
>
> On Tue, Aug 19, 2014 at 7:08 AM, Amit Kapila 
wrote:
> > On Fri, Aug 15, 2014 at 12:55 AM, Robert Haas 
wrote:
> >>
> >> On Mon, Aug 11, 2014 at 12:59 AM, Amit Kapila 
> >> wrote:
> >> >
> >> > a. How about describing w.r.t asynchronous connections
> >> > instead of parallel connections?
> >>
> >> I don't think "asynchronous" is a good choice of word.
> >
> > Agreed.
> >
> >>Maybe "simultaneous"?
> >
> > Not sure. How about *concurrent* or *multiple*?
>
> multiple isn't right, but we could say concurrent.

I also find concurrent more appropriate.
Dilip, could you please change it to concurrent in the doc updates,
variables, and functions, unless you see any objection.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: Compute attr_needed for child relations (was Re: [HACKERS] inherit support for foreign tables)

2014-08-20 Thread Ashutosh Bapat
On Wed, Aug 20, 2014 at 3:25 PM, Etsuro Fujita 
wrote:

> Hi Ashutosh,
>
>
> (2014/08/14 22:30), Ashutosh Bapat wrote:
>
>> On Thu, Aug 14, 2014 at 10:05 AM, Etsuro Fujita
>> mailto:fujita.ets...@lab.ntt.co.jp>> wrote:
>>
>> (2014/08/08 18:51), Etsuro Fujita wrote:
>>  >>> (2014/06/30 22:48), Tom Lane wrote:
>>   I wonder whether it isn't time to change that.  It was coded
>> like that
>>   originally only because calculating the values would've been a
>> waste of
>>   cycles at the time.  But this is at least the third place
>> where it'd be
>>   useful to have attr_needed for child rels.
>>
>>  > I've revised the patch.
>>
>> There was a problem with the previous patch, which will be described
>> below.  Attached is the updated version of the patch addressing that.
>>
>
>  Here are some more comments
>>
>
> Thank you for the further review!
>
>
>  attr_needed also has the attributes used in the restriction clauses as
>> seen in distribute_qual_to_rels(), so it looks unnecessary to call
>> pull_varattnos() on the clauses in baserestrictinfo in functions
>> check_selective_binary_conversion(), remove_unused_subquery_outputs(),
>> check_index_only().
>>
>
> IIUC, I think it's *necessary* to do that in those functions since the
> attributes used in the restriction clauses in baserestrictinfo are not
> added to attr_needed due the following code in distribute_qual_to_rels.
>
>
That's right. Thanks for pointing that out.


> /*
>  * If it's a join clause (either naturally, or because delayed by
>  * outer-join rules), add vars used in the clause to targetlists of
> their
>  * relations, so that they will be emitted by the plan nodes that scan
>  * those relations (else they won't be available at the join node!).
>  *
>  * Note: if the clause gets absorbed into an EquivalenceClass then this
>  * may be unnecessary, but for now we have to do it to cover the case
>  * where the EC becomes ec_broken and we end up reinserting the
> original
>  * clauses into the plan.
>  */
> if (bms_membership(relids) == BMS_MULTIPLE)
> {
> List   *vars = pull_var_clause(clause,
>PVC_RECURSE_AGGREGATES,
>PVC_INCLUDE_PLACEHOLDERS);
>
> add_vars_to_targetlist(root, vars, relids, false);
> list_free(vars);
>
> }
>
>  Although in case of RTE_RELATION, the 0th entry in attr_needed
>> corresponds to FirstLowInvalidHeapAttributeNumber + 1, it's always safer
>> to use RelOptInfo::min_attr, in case get_relation_info() wants to
>> change that assumption or somewhere down the line some other part of the code
>> wants to change the attr_needed information. It may be unlikely that it
>> would change, but it is better to stick to the comments in RelOptInfo
>>   443 AttrNumber  min_attr;   /* smallest attrno of rel (often
>> <0) */
>>   444 AttrNumber  max_attr;   /* largest attrno of rel */
>>   445 Relids *attr_needed;/* array indexed [min_attr ..
>> max_attr] */
>>
>
> Good point!  Attached is the revised version of the patch.
>
>
If the patch is not in the commit-fest, can you please add it there? From
my side, the review is done and it should be marked "ready for committer",
unless somebody else wants to review.


>
> Thanks,
>
> Best regards,
> Etsuro Fujita
>



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Support for N synchronous standby servers

2014-08-20 Thread Michael Paquier
On Fri, Aug 15, 2014 at 9:28 PM, Fujii Masao  wrote:
> You added check_synchronous_standby_num() as the GUC check function for
> synchronous_standby_num, and checked that there. But that seems to be
wrong.
> You can easily see the following error messages even if
synchronous_standby_num
> is smaller than max_wal_senders. The point is that synchronous_standby_num
> should be located before max_wal_senders in postgresql.conf.
>
> LOG:  invalid value for parameter "synchronous_standby_num": 0
> DETAIL:  synchronous_standby_num cannot be higher than max_wal_senders.
I am not sure what I can do here, so I am removing this check in the code,
and simply adding a note in the docs that a value of _num higher than
max_wal_senders does not have much meaning.

> I still think that it's strange that replication can be async even when
> s_s_num is larger than zero. That is, I think that the transaction must
> wait for s_s_num sync standbys whether s_s_names is empty or not.
> OTOH, if s_s_num is zero, replication must be async whether s_s_names
> is empty or not. At least for me, it's intuitive to use s_s_num primarily
> to control the sync mode. Of course, other hackers may have different
> thoughts, so we need to keep our ear open for them.
Sure, the compromise looks to be what you propose, and I am fine with that.

> In the above design, one problem is that the number of parameters
> that those who want to set up only one sync replication need to change is
> incremented by one. That is, they need to change s_s_num additionally.
> If we are really concerned about this, we can treat a value of -1 in
> s_s_num as the special value, which allows us to control sync replication
> only by s_s_names as we do now. That is, if s_s_names is empty,
> replication would be async. Otherwise, only one standby with
> high-priority in s_s_names becomes sync one. Probably the default of
> s_s_num should be -1. Thought?

Taking into account those comments, attached is a patch doing the following
things depending on the values of _num and _names:
- If _num = -1 and _names is empty, all the standbys are considered as
async (same behavior as 9.1~, and default).
- If _num = -1 and _names has at least one item, wait for one standby, even
if it is not connected at the time of commit. If one node is found as sync,
other standbys listed in _names with higher priority than the sync one are
in potential state (same as existing behavior).
- If _num = 0, all the standbys are async, whatever the values in _names.
Priority is enforced to 0 for all the standbys. SyncStandbysDefined is set
to false in this case.
- If _num > 0, must wait for _num standbys whatever the values in _names
The default value of _num is -1. Documentation has been updated in
consequence.
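
For illustration, the settings would combine roughly like this (examples
only, based on the behavior described above; the standby names are made
up):

# default: every standby is asynchronous
synchronous_standby_num = -1
synchronous_standby_names = ''

# existing behavior: wait for the single highest-priority listed standby
synchronous_standby_num = -1
synchronous_standby_names = 'node_a, node_b'

# wait for two standbys at every commit, whatever is in _names
synchronous_standby_num = 2
synchronous_standby_names = 'node_a, node_b, node_c'

# force asynchronous replication even though standbys are listed
synchronous_standby_num = 0
synchronous_standby_names = 'node_a, node_b'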

> The source code comments at the top of syncrep.c need to be updated.
> It's worth checking whether there are other comments to be updated.
Done. I have updated some comments in places other than the header.
Regards,
-- 
Michael
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 2586,2597  include_dir 'conf.d'
  Specifies a comma-separated list of standby names that can support
  synchronous replication, as described in
  .
! At any one time there will be at most one active synchronous standby;
! transactions waiting for commit will be allowed to proceed after
! this standby server confirms receipt of their data.
! The synchronous standby will be the first standby named in this list
! that is both currently connected and streaming data in real-time
! (as shown by a state of streaming in the
  
  pg_stat_replication view).
  Other standby servers appearing later in this list represent potential
--- 2586,2598 
  Specifies a comma-separated list of standby names that can support
  synchronous replication, as described in
  .
! At any one time there will be at a number of active synchronous standbys
! defined by , transactions
! waiting for commit will be allowed to proceed after those standby
! servers confirm receipt of their data. The synchronous standbys will be
! the first entries named in this list that are both currently connected
! and streaming data in real-time (as shown by a state of
! streaming in the
  
  pg_stat_replication view).
  Other standby servers appearing later in this list represent potential
***
*** 2627,2632  include_dir 'conf.d'
--- 2628,2685 

   
  
+  
+   synchronous_standby_num (integer)
+   
+synchronous_standby_num configuration parameter
+   
+   
+   
+
+ Specifies the number of standbys that support
+ synchronous replication.
+
+
+ Default value is -1. In this case, if
+  is empty all the
+ standby node

Re: [HACKERS] pg_receivexlog --status-interval add fsync feedback

2014-08-20 Thread furuyao
> When replication slot is not specified in pg_receivexlog, the flush
> location in the feedback message always indicates invalid. So there seems
> to be no need to send the feedback as soon as fsync is issued, in that
> case.
> How should this option work when replication slot is not specified?

Thanks for the review!

The patch does not currently check whether -S is specified.

The use case when a replication slot is not specified:

because it still performs fsync, that is not the original intention;
it corresponds to synchronous_commit being set to remote_write.

To call the user's attention to this, the following documentation is appended:
"If you want to report the flush position to the server, you should use the -S option."

Regards,

--
Furuya Osamu


pg_receivexlog-fsync-feedback-v4.patch
Description: pg_receivexlog-fsync-feedback-v4.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Implementing parallel sequence doubts

2014-08-20 Thread Haribabu Kommi
Hi Hackers,

Implementation of "Parallel Sequence Scan"

Approach:

1."Parallel Sequence Scan" can achieved by using the background
workers doing the job of actual sequence scan including the
qualification check also.

2. Planner generates the parallel scan plan by checking the possible
criteria of the table to do a parallel scan and generates the tasks
(range of blocks).

3. In the executor Init phase, Try to copy the necessary data required
by the workers and start the workers.

4. In the executor run phase, just get the tuples which are sent by
the workers and process them further in the plan node execution.

some of the problems i am thinking:

1. Data structures that are required to be copied from the backend to the
worker include currentTransactionState, Snapshot, GUC, ComboCID, Estate,
etc.

I see some problems in copying the "Estate" data structure into shared
memory because it contains so many pointers. Some infrastructure is
needed to copy these data structures into shared memory.

If the backend takes care of locking the relation and there are no sub
plans involved in the qual and targetlist, is the estate really
required in the worker to achieve the parallel scan?

Is it possible to initialize the qual, targetlist and projinfo with
a local "Estate" structure created in the worker, with the help of the
"plan" structure received from the backend?

Or

If the "estate" is required in the worker for processing, is
there any better way to share backend data structures with
the workers?

Any suggestions?

Regards,
Hari Babu
Fujitsu Australia


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP Patch for GROUPING SETS phase 1

2014-08-20 Thread Heikki Linnakangas

On 08/13/2014 09:43 PM, Atri Sharma wrote:

Sorry, forgot to attach the patch for fixing cube in contrib, which breaks
since we now reserve "cube" keyword. Please find attached the same.


Ugh, that will make everyone using the cube extension unhappy. After 
this patch, they will have to quote contrib's cube type and functions 
every time.


I think we should bite the bullet and rename the extension, and its 
"cube" type and functions. For an application, having to suddenly quote 
it has the same effect as renaming it; you'll have to find all the 
callers and change them. And in the long-run, it's clearly better to 
have an unambiguous name.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Parallel Sequence Scan doubts

2014-08-20 Thread Haribabu Kommi
Corrected subject.

Hi Hackers,

Implementation of "Parallel Sequence Scan"

Approach:

1."Parallel Sequence Scan" can achieved by using the background
workers doing the job of actual sequence scan including the
qualification check also.

2. Planner generates the parallel scan plan by checking the possible
criteria of the table to do a parallel scan and generates the tasks
(range of blocks).

3. In the executor Init phase, Try to copy the necessary data required
by the workers and start the workers.

4. In the executor run phase, just get the tuples which are sent by
the workers and process them further in the plan node execution.
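
A hedged sketch of the worker-launch part of step 1 (illustration only;
everything other than the dynamic background worker API itself - the
library, function, and argument names - is made up):

#include "postgres.h"

#include "miscadmin.h"
#include "postmaster/bgworker.h"

/*
 * Start one dynamic background worker for a range of blocks; range_arg
 * could be, e.g., an index into a task array in shared memory.
 */
static bool
launch_range_scan_worker(Datum range_arg, BackgroundWorkerHandle **handle)
{
    BackgroundWorker worker;

    memset(&worker, 0, sizeof(worker));
    worker.bgw_flags = BGWORKER_SHMEM_ACCESS |
        BGWORKER_BACKEND_DATABASE_CONNECTION;
    worker.bgw_start_time = BgWorkerStart_ConsistentState;
    worker.bgw_restart_time = BGW_NEVER_RESTART;
    snprintf(worker.bgw_library_name, BGW_MAXLEN, "parallel_seqscan");
    snprintf(worker.bgw_function_name, BGW_MAXLEN, "range_scan_worker_main");
    snprintf(worker.bgw_name, BGW_MAXLEN, "parallel seqscan worker");
    worker.bgw_main_arg = range_arg;
    worker.bgw_notify_pid = MyProcPid;

    return RegisterDynamicBackgroundWorker(&worker, handle);
}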

Some of the problems I am thinking about:

1. Data structures that are required to be copied from the backend to the
worker include currentTransactionState, Snapshot, GUC, ComboCID, Estate,
etc.

I see some problems in copying the "Estate" data structure into shared
memory because it contains so many pointers. Some infrastructure is
needed to copy these data structures into shared memory.

If the backend takes care of locking the relation and there are no sub
plans involved in the qual and targetlist, is the estate really
required in the worker to achieve the parallel scan?

Is it possible to initialize the qual, targetlist and projinfo with
a local "Estate" structure created in the worker, with the help of the
"plan" structure received from the backend?

Or

If the "estate" is required in the worker for processing, is
there any better way to share backend data structures with
the workers?

Any suggestions?

Regards,
Hari Babu
Fujitsu Australia


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-08-20 Thread Dilip kumar
On 21 August 2014 08:31, Amit Kapila Wrote,

> >>
> > >Not sure. How about *concurrent* or *multiple*?
> >
> >multiple isn't right, but we could say concurrent.
>I also find concurrent more appropriate.
>Dilip, could you please change it to concurrent in doc updates,
>variables, functions unless you see any objection for the same.

OK, I will take care of that along with the other comment fixes.

Regards,
Dilip Kumar