Re: [HACKERS] better atomics - v0.5

2014-07-11 Thread Amit Kapila
On Sat, Jul 12, 2014 at 1:26 AM, Martijn van Oosterhout 
wrote:
> On Thu, Jul 10, 2014 at 08:46:55AM +0530, Amit Kapila wrote:
> > As per my understanding of the general theory around barriers,
> > read and write are defined to avoid reordering due to compiler and
> > full memory barriers are defined to avoid reordering due to processors.
> > There are some processors that support instructions for memory
> > barriers with acquire, release and fence semantics.
>
> Not exactly, both compilers and processors can rearrange loads and
> stores.  Because the architecture most developers work on (x86) has
> such a strong memory model (it goes to a lot of effort to hide
> reordering), people are under the impression that it's only the compiler
> you need to worry about, but that's not true for any other
> architecture.

I am not saying that we need barriers only to avoid reordering by the
compiler; rather, my point was that we need to avoid reordering by both
the compiler and the processor, though the ways to achieve each require
different APIs.

> Memory barriers/fences are on the whole discouraged if you can use
> acquire/release semantics because the latter are faster and much easier
> to optimise for.

Do all processors support instructions for memory barriers with acquire,
release semantics?

> I strongly recommend the "Atomic Weapons" talk on the C11 memory model
> to help understand how they work. As a bonus it includes correct
> implementations for the major architectures.

Yes, it would be good if we can implement this as per the C11 memory
model, and that is what I am referring to while reviewing this patch.
However, even if that is not possible, or is difficult to achieve in all
cases, it is worth having a discussion about the memory model used by the
APIs currently implemented in the patch.
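
For reference, here is a minimal, self-contained C11 sketch of the
distinction being discussed (acquire/release ordering attached to the
atomic operation itself versus a standalone full fence).  This is just
the standard <stdatomic.h> interface, not code from the patch:

    #include <stdatomic.h>

    atomic_int ready = 0;
    int        payload = 0;

    void
    producer(void)
    {
        payload = 42;
        /* release: earlier stores cannot be reordered past this store */
        atomic_store_explicit(&ready, 1, memory_order_release);
    }

    int
    consumer(void)
    {
        /* acquire: later loads cannot be reordered before this load */
        if (atomic_load_explicit(&ready, memory_order_acquire))
            return payload;     /* guaranteed to observe 42 */
        return -1;
    }

    void
    producer_with_fence(void)
    {
        payload = 42;
        /* full fence: stronger (and typically costlier) than release */
        atomic_thread_fence(memory_order_seq_cst);
        atomic_store_explicit(&ready, 1, memory_order_relaxed);
    }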

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] things I learned from working on memory allocation

2014-07-11 Thread Amit Kapila
On Fri, Jul 11, 2014 at 11:15 PM, Robert Haas  wrote:
> On Thu, Jul 10, 2014 at 1:05 AM, Amit Kapila 
wrote:
> > If there is a noticeable impact, then do you think having separate
> > file/infrastructure for parallel sort can help, basically non-parallel
> > and parallel sort will have some common and some specific parts.
> > The above layer will call the parallel/non-parallel functions based on
> > operation need.
>
> The tuplesort.c already handles so many cases already that adding yet
> another axis of variability is intimidating.  But it may work out that
> there's no better option.

For using the new allocator, one idea is that it works seamlessly with the
current memory allocation routines (palloc, pfree, repalloc, ...); another
could be to have separate routines (shm_palloc, shm_pfree, ...) for
allocation from the new allocator.  I think having separate routines means
all the call sites for allocation/deallocation need to be aware of the
operation type (parallel or non-parallel); however, it can avoid the
overhead for non-parallel cases.
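
To make that trade-off concrete, here is a minimal, self-contained sketch
of the explicit-dispatch idea (all names are hypothetical, malloc() stands
in for palloc(), and nothing below is from the patch):

    #include <stdlib.h>
    #include <stdbool.h>

    /* Stub standing in for a DSM-based allocator. */
    static void *
    shm_palloc_stub(size_t size)
    {
        return malloc(size);    /* real code would carve from a DSM segment */
    }

    typedef struct AllocContext
    {
        bool    in_shared_memory;   /* true when the sort runs in a DSM segment */
    } AllocContext;

    /* The "seamless" variant: one entry point, but every allocation --
     * including the non-parallel case -- pays the branch.  With separate
     * shm_palloc()/palloc() routines, call sites choose explicitly and the
     * non-parallel path stays untouched. */
    static void *
    context_alloc(AllocContext *cxt, size_t size)
    {
        return cxt->in_shared_memory ? shm_palloc_stub(size) : malloc(size);
    }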

I think it is a bit difficult to say which approach ("with allocator" or
"directly use dsm") will turn out to be better, as both have their pros
and cons as listed by you; however, I feel that if we directly use dsm, we
might need to change more core logic than with an allocator.

I believe we will face the same question again if somebody wants to
parallelize joins, so one way to decide could be which approach will lead
to less change in core logic/design.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Crash on backend exit w/ OpenLDAP [2.4.24, 2.4.31]

2014-07-11 Thread Noah Misch
On Thu, Jun 19, 2014 at 01:01:34PM -0400, Tom Lane wrote:
> Noah Misch  writes:
> > On Thu, Jun 12, 2014 at 05:02:19PM -0400, Noah Misch wrote:
> >> You can cause the at-exit crash by building PostgreSQL against OpenLDAP
> >> 2.4.31, connecting with LDAP authentication, and issuing "LOAD 'dblink'".
> 
> >> 4. Detect older OpenLDAP versions at runtime, just before we would 
> >> otherwise
> >> initialize OpenLDAP, and raise an error.  Possibly make the same check at
> >> compile time, for packager convenience.
> 
> > Having pondered this some more, I lean toward the following conservative 
> > fix.
> > Add to all supported branches a test case that triggers the crash and a
> > configure-time warning if the OpenLDAP version falls in the vulnerable 
> > range.
> > So long as those who build from source monitor either "configure" output or
> > test suite failures, they'll have the opportunity to head off the problem.
> 
> +1 for a configure warning, but I share your concern that it's likely to
> go unnoticed (sometimes I wish autoconf were not so chatty...).

> I'm not sure about the practicality of adding a test case --- how will we
> test that if no LDAP server is at hand?

Merely attempting an LDAP connection (to a closed port, for example)
initializes the library far enough to trigger the problem.  Here's a patch
implementing the warning and test case.  The test case is based on the one I
posted upthread, modified to work with installcheck, work with non-LDAP
builds, and close a race condition.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
diff --git a/configure b/configure
index 481096c..e2850e4 100755
--- a/configure
+++ b/configure
@@ -9464,6 +9464,17 @@ fi
 
 fi
 
+# PGAC_LDAP_SAFE
+# --
+# PostgreSQL sometimes loads libldap_r and plain libldap into the same
+# process.  Check for OpenLDAP versions known not to tolerate doing so; assume
+# non-OpenLDAP implementations are safe.  The dblink test suite exercises the
+# hazardous interaction directly.
+
+
+
+
+
 if test "$with_ldap" = yes ; then
   if test "$PORTNAME" != "win32"; then
  for ac_header in ldap.h
@@ -9480,6 +9491,47 @@ fi
 
 done
 
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for compatible LDAP implementation" >&5
+$as_echo_n "checking for compatible LDAP implementation... " >&6; }
+if ${pgac_cv_ldap_safe+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+#include <ldap.h>
+#if !defined(LDAP_VENDOR_VERSION) || \
+ (defined(LDAP_API_FEATURE_X_OPENLDAP) && \
+  LDAP_VENDOR_VERSION >= 20424 && LDAP_VENDOR_VERSION <= 20431)
+choke me
+#endif
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv_ldap_safe=yes
+else
+  pgac_cv_ldap_safe=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_ldap_safe" >&5
+$as_echo "$pgac_cv_ldap_safe" >&6; }
+
+if test "$pgac_cv_ldap_safe" != yes; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: WARNING:
+*** With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, each backend
+*** process that loads libpq (via WAL receiver, dblink, or postgres_fdw) and
+*** also uses LDAP will crash on exit." >&5
+$as_echo "$as_me: WARNING:
+*** With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, each backend
+*** process that loads libpq (via WAL receiver, dblink, or postgres_fdw) and
+*** also uses LDAP will crash on exit." >&2;}
+fi
   else
  for ac_header in winldap.h
 do :
diff --git a/configure.in b/configure.in
index c938a5d..9f324f0 100644
--- a/configure.in
+++ b/configure.in
@@ -1096,10 +1096,39 @@ if test "$with_libxslt" = yes ; then
   AC_CHECK_HEADER(libxslt/xslt.h, [], [AC_MSG_ERROR([header file <libxslt/xslt.h> is required for XSLT support])])
 fi
 
+# PGAC_LDAP_SAFE
+# --
+# PostgreSQL sometimes loads libldap_r and plain libldap into the same
+# process.  Check for OpenLDAP versions known not to tolerate doing so; assume
+# non-OpenLDAP implementations are safe.  The dblink test suite exercises the
+# hazardous interaction directly.
+
+AC_DEFUN([PGAC_LDAP_SAFE],
+[AC_CACHE_CHECK([for compatible LDAP implementation], [pgac_cv_ldap_safe],
+[AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
+[#include <ldap.h>
+#if !defined(LDAP_VENDOR_VERSION) || \
+ (defined(LDAP_API_FEATURE_X_OPENLDAP) && \
+  LDAP_VENDOR_VERSION >= 20424 && LDAP_VENDOR_VERSION <= 20431)
+choke me
+#endif], [])],
+[pgac_cv_ldap_safe=yes],
+[pgac_cv_ldap_safe=no])])
+
+if test "$pgac_cv_ldap_safe" != yes; then
+  AC_MSG_WARN([
+*** With OpenLDAP versions 2.4.24 through 2.4.31, inclusive, each backend
+*** process that loads libpq (via WAL receiver, dblink, or postgres_fdw) and
+*** also uses LDAP will crash on exit.])
+fi])
+
+
+
 if test "$with_ldap" = yes ; then
   if test "$PORTNAME" != "win32"; then
  AC_CHECK_HEADERS(ldap.h, [],
   [AC_MSG_ERROR([header file  i

Re: [HACKERS] Allow multi-byte characters as escape in SIMILAR TO and SUBSTRING

2014-07-11 Thread Jeff Davis
On Fri, 2014-07-11 at 11:51 -0400, Tom Lane wrote:
> Jeff Davis  writes:
> > Attached is a small patch to $SUBJECT.
> > In master, only single-byte characters are allowed as an escape. Of
> > course, with the patch it must still be a single character, but it may
> > be multi-byte.
> 
> I'm concerned about the performance cost of this patch.  Have you done
> any measurements about what kind of overhead you are putting on the
> inner loop of similar_escape?

I didn't consider this very performance critical, because this is
looping through the pattern, which I wouldn't expect to be a long
string. On my machine using en_US.UTF-8, the difference is imperceptible
for a SIMILAR TO ... ESCAPE query.

I was able to see about a 2% increase in runtime when using the
similar_escape function directly. I made a 10M tuple table and did:

explain analyze
  select 
similar_escape('','#')
 from t;

which was the worst reasonable case I could think of. (It appears that
selecting from a table is faster than from generate_series. I'm curious
what you use when testing the performance of an individual function at
the SQL level.)

> At the very least, please don't call GetDatabaseEncoding() over again
> every single time through the inner loop.  More generally, why do we
> need to use pg_encoding_verifymb?  The input data is presumably validly
> encoded.  ISTM the significantly cheaper pg_mblen() would be more
> appropriate.

Thank you. Using the non-verifying variants reduces the penalty in the
above test to 1%.

If needed, we could optimize further through code specialization, as
like_escape() does. Though I think like_escape() is specialized just
because MatchText() is specialized; and MatchText is specialized because
it operates on the actual data, not just the pattern. So I don't see a
reason to specialize similar_escape().
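
To illustrate why the non-verifying approach is cheap (this is just a
generic UTF-8 example, not PostgreSQL code): when the input is already
known to be validly encoded, the character length follows from the lead
byte alone, with no need to inspect the continuation bytes:

    #include <stddef.h>

    /* Byte length of the UTF-8 character starting at *s, assuming the
     * string is validly encoded -- the same assumption pg_mblen() makes. */
    static size_t
    utf8_char_len(const unsigned char *s)
    {
        if (*s < 0x80)
            return 1;               /* ASCII */
        if ((*s & 0xE0) == 0xC0)
            return 2;
        if ((*s & 0xF0) == 0xE0)
            return 3;
        return 4;                   /* 0xF0..0xF4 lead bytes */
    }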

Regards,
Jeff Davis

*** a/src/backend/utils/adt/regexp.c
--- b/src/backend/utils/adt/regexp.c
***
*** 688,698  similar_escape(PG_FUNCTION_ARGS)
  		elen = VARSIZE_ANY_EXHDR(esc_text);
  		if (elen == 0)
  			e = NULL;			/* no escape character */
! 		else if (elen != 1)
! 			ereport(ERROR,
! 	(errcode(ERRCODE_INVALID_ESCAPE_SEQUENCE),
! 	 errmsg("invalid escape string"),
!   errhint("Escape string must be empty or one character.")));
  	}
  
  	/*--
--- 688,703 
  		elen = VARSIZE_ANY_EXHDR(esc_text);
  		if (elen == 0)
  			e = NULL;			/* no escape character */
! 		else
! 		{
! 			int escape_mblen = pg_mbstrlen_with_len(e, elen);
! 
! 			if (escape_mblen > 1)
! ereport(ERROR,
! 		(errcode(ERRCODE_INVALID_ESCAPE_SEQUENCE),
! 		 errmsg("invalid escape string"),
! 		 errhint("Escape string must be empty or one character.")));
! 		}
  	}
  
  	/*--
***
*** 722,781  similar_escape(PG_FUNCTION_ARGS)
  
  	while (plen > 0)
  	{
! 		char		pchar = *p;
  
! 		if (afterescape)
  		{
! 			if (pchar == '"' && !incharclass)	/* for SUBSTRING patterns */
! *r++ = ((nquotes++ % 2) == 0) ? '(' : ')';
! 			else
  			{
  *r++ = '\\';
  *r++ = pchar;
  			}
! 			afterescape = false;
! 		}
! 		else if (e && pchar == *e)
! 		{
! 			/* SQL99 escape character; do not send to output */
! 			afterescape = true;
  		}
! 		else if (incharclass)
  		{
! 			if (pchar == '\\')
  *r++ = '\\';
! 			*r++ = pchar;
! 			if (pchar == ']')
! incharclass = false;
! 		}
! 		else if (pchar == '[')
! 		{
! 			*r++ = pchar;
! 			incharclass = true;
! 		}
! 		else if (pchar == '%')
! 		{
! 			*r++ = '.';
! 			*r++ = '*';
! 		}
! 		else if (pchar == '_')
! 			*r++ = '.';
! 		else if (pchar == '(')
! 		{
! 			/* convert to non-capturing parenthesis */
! 			*r++ = '(';
! 			*r++ = '?';
! 			*r++ = ':';
! 		}
! 		else if (pchar == '\\' || pchar == '.' ||
!  pchar == '^' || pchar == '$')
! 		{
! 			*r++ = '\\';
! 			*r++ = pchar;
  		}
- 		else
- 			*r++ = pchar;
- 		p++, plen--;
  	}
  
  	*r++ = ')';
--- 727,813 
  
  	while (plen > 0)
  	{
! 		char	pchar = *p;
! 		int		mblen = pg_mblen(p);
  
! 		if (mblen == 1)
  		{
! 			if (afterescape)
! 			{
! if (pchar == '"' && !incharclass)	/* for SUBSTRING patterns */
! 	*r++ = ((nquotes++ % 2) == 0) ? '(' : ')';
! else
! {
! 	*r++ = '\\';
! 	*r++ = pchar;
! }
! afterescape = false;
! 			}
! 			else if (e && pchar == *e)
! 			{
! /* SQL99 escape character; do not send to output */
! afterescape = true;
! 			}
! 			else if (incharclass)
! 			{
! if (pchar == '\\')
! 	*r++ = '\\';
! *r++ = pchar;
! if (pchar == ']')
! 	incharclass = false;
! 			}
! 			else if (pchar == '[')
! 			{
! *r++ = pchar;
! incharclass = true;
! 			}
! 			else if (pchar == '%')
! 			{
! *r++ = '.';
! *r++ = '*';
! 			}
! 			else if (pchar == '_')
! *r++ = '.';
! 			else if (pchar == '(')
! 			{
! /* convert to non-capturing parenthesis */
! *r++ = '(';
! *r++ 

Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-07-11 Thread Robert Haas
On Thu, Jul 10, 2014 at 2:52 AM, Tomonari Katsumata
 wrote:
> Several couple weeks ago, I posted a mail to pgsql-doc.
> http://www.postgresql.org/message-id/53992ff8.2060...@po.ntts.co.jp
>
> In this thread, I concluded that it's better to
> round up the value which is less than its unit.
> Current behavior (rounding down) has undesirable setting risk,
> because some GUCs have special meaning for 0.
>
> And then I made a patch for this.
> Please check the attached patch.

Thanks for the patch.  Please add it here:

https://commitfest.postgresql.org/action/commitfest_view/open

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] better atomics - v0.5

2014-07-11 Thread Martijn van Oosterhout
On Thu, Jul 10, 2014 at 08:46:55AM +0530, Amit Kapila wrote:
> As per my understanding of the general theory around barriers,
> read and write are defined to avoid reordering due to compiler and
> full memory barriers are defined to avoid reordering due to processors.
> There are some processors that support instructions for memory
> barriers with acquire, release and fence semantics.

Not exactly, both compilers and processors can rearrange loads and
stores.  Because the architecture most developers work on (x86) has
such a strong memory model (it goes to a lot of effort to hide
reordering), people are under the impression that it's only the compiler
you need to worry about, but that's not true for any other
architecture.

Memory barriers/fences are on the whole discouraged if you can use
acquire/release semantics because the latter are faster and much easier
to optimise for.

I strongly recommend the "Atomic Weapons" talk on the C11 memory model
to help understand how they work. As a bonus it includes correct
implementations for the major architectures.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> He who writes carelessly confesses thereby at the very outset that he does
> not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




[HACKERS] Over-optimization in ExecEvalWholeRowVar

2014-07-11 Thread Tom Lane
This example in the regression database is a simplified version of a bug
I was shown off-list:

regression=# select (
select q from
( select 1,2,3 where f1>0
  union all
  select 4,5,6.0 where f1<=0
) q
)
from int4_tbl;
ERROR:  record type has not been registered

The reason for the problem is that ExecEvalWholeRowVar is designed
to bless the Var's record type (via assign_record_type_typmod) just
once per query.  That works fine, as long as the source tuple slot
is the exact same TupleTableSlot throughout the query.  But in the
above example, we have an Append plan node for the UNION ALL, which
will pass back its children's result slots directly --- so at the
level where "q" is being evaluated, we see first the first leaf SELECT's
output slot (which gets blessed) and then the second leaf SELECT's
output slot (which doesn't).  That leads to generating a composite
Datum with typmod -1 ... oops.

I can reproduce this bug in all active branches.  Probably the reason
it hasn't been identified long since is that in most scenarios the
planner won't use a whole-row Var at all in cases like this, but will
prefer to build RowExprs.  (In particular, the inconsistent data types
of the last sub-select output columns are necessary to trigger the bug
in this example.)

The easy fix is to move the blessing code from ExecEvalWholeRowVar
into ExecEvalWholeRowFast.  (ExecEvalWholeRowSlow doesn't need it
since it doesn't deal with RECORD cases.)  That'll add a usually-useless
comparison or two per row cycle, which is unlikely to be measurable.

A bigger-picture problem is that we have other once-per-query tests in
ExecEvalWholeRowVar, and ExecEvalScalarVar too, whose validity should be
questioned given this bug.  I think they are all right, though, because
those tests are concerned with validating the source rowtype, which should
not change across Append children.  The bug here is because we're assuming
not just that the input rowtype is abstractly the same across all rows,
but that the exact same TupleTableSlot is used to return every source row.

regards, tom lane




Re: [HACKERS] Minmax indexes

2014-07-11 Thread Claudio Freire
On Fri, Jul 11, 2014 at 3:47 PM, Greg Stark  wrote:
> On Fri, Jul 11, 2014 at 6:00 PM, Claudio Freire  
> wrote:
>> Marking as read-only is ok, or emitting a NOTICE so that if anyone
>> changes those parameters that change the shape of the index, they know
>> it needs a rebuild would be OK too. Both mechanisms work for me.
>
> We don't actually have any of these mechanisms. They wouldn't be bad
> things to have but I don't think we should gate adding new types of
> indexes on adding them. In particular, the index could just hard code
> a value for these parameters and having them be parameterized is
> clearly better even if that doesn't produce all the warnings or
> rebuild things automatically or whatever.


No, I agree, it's just a nice to have.

But at least the docs should mention it.




Re: [HACKERS] Minmax indexes

2014-07-11 Thread Greg Stark
On Fri, Jul 11, 2014 at 6:00 PM, Claudio Freire  wrote:
> Marking as read-only is ok, or emitting a NOTICE so that if anyone
> changes those parameters that change the shape of the index, they know
> it needs a rebuild would be OK too. Both mechanisms work for me.

We don't actually have any of these mechanisms. They wouldn't be bad
things to have but I don't think we should gate adding new types of
indexes on adding them. In particular, the index could just hard code
a value for these parameters and having them be parameterized is
clearly better even if that doesn't produce all the warnings or
rebuild things automatically or whatever.

-- 
greg




Re: [HACKERS] RLS Design

2014-07-11 Thread Stephen Frost
Robert,

On Friday, July 11, 2014, Robert Haas  wrote:

> On Fri, Jul 11, 2014 at 4:55 AM, Stephen Frost wrote:
> > My feeling at the moment is that having them be per-table makes sense and
> > we'd still have flexibility to change later if we had some compelling
> reason
> > to do so.
>
> I don't think you can really change it later.  If policies are
> per-table, then you could have a policy p1 on table t1 and also on
> table t2; if they become global objects, then you can't have p1 in two
> places.  I hope I'm not beating a dead horse here, but changing syntax
> after it's been released is very, very hard.


Fair enough. My thinking was we'd come up with a way to map them (eg:
table_policy), but I do agree that changing it later would really suck and
having them be per-table makes a lot of sense.


> But that's not an argument against doing it this way; I think
> per-table policies are probably simpler and better here.  It means,
> for example, that policies need not have their own permissions and
> ownership structure - they're part of the table, just like a
> constraint, trigger, or rule, and the table owner's permissions
> control.  I like that, and I think our users will, too.


Agreed, and I believe this is more-or-less what I had proposed up-thread
(not at a computer at the moment). I hope to have a chance to review and
update the design and flesh out the catalog definition this weekend.

Thanks!

Stephen


Re: [HACKERS] things I learned from working on memory allocation

2014-07-11 Thread Robert Haas
On Thu, Jul 10, 2014 at 1:05 AM, Amit Kapila  wrote:
> On Tue, Jul 8, 2014 at 1:27 AM, Robert Haas  wrote:
>> 6. In general, I'm worried that it's going to be hard to keep the
>> overhead of parallel sort from leaking into the non-parallel case.
>> With the no-allocator approach, every place that uses
>> GetMemoryChunkSpace() or repalloc() or pfree() will have to handle the
>> DSM and non-DSM cases differently, which isn't great for either
>> performance or maintainability.  And even with an allocator, the
>> SortTuple array will need to use relative pointers in a DSM; that
>> might burden the non-DSM case.
>
> I think you are right in saying that there can be a performance impact
> for non-parallel case due to pfree() (or other similar calls) and/or due
> to usage of relative pointers, but can it have measurable impact during
> actual sort operation?

Benchmarking seems to indicate that it indeed can.

> If there is a noticeable impact, then do you think having separate
> file/infrastructure for parallel sort can help, basically non-parallel
> and parallel sort will have some common and some specific parts.
> The above layer will call the parallel/non-parallel functions based on
> operation need.

The tuplesort.c already handles so many cases already that adding yet
another axis of variability is intimidating.  But it may work out that
there's no better option.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] RLS Design

2014-07-11 Thread Robert Haas
On Fri, Jul 11, 2014 at 4:55 AM, Stephen Frost  wrote:
> On Thursday, July 10, 2014, Robert Haas  wrote:
>> On Wed, Jul 9, 2014 at 2:13 AM, Stephen Frost  wrote:
>> > Yes, this would be possible (and is nearly identical to the original
>> > patch, except that this includes per-role considerations), however, my
>> > thinking is that it'd be simpler to work with policy names rather than
>> > sets of quals, to use when mapping to roles, and they would potentially
>> > be useful later for other things (eg: for setting up which policies
>> > should be applied when, or which should be OR' or AND"d with other
>> > policies, or having groups of policies, etc).
>>
>> Hmm.  I guess that's reasonable.  Should the policy be a per-table
>> object (like rules, constraints, etc.) instead of a global object?
>>
>> You could do:
>>
>> ALTER TABLE table_name ADD POLICY policy_name (quals);
>> ALTER TABLE table_name POLICY FOR role_name IS policy_name;
>> ALTER TABLE table_name DROP POLICY policy_name;
>
> Right, I was thinking they would be per table as they would specifically
> provide a name for a set of quals, and quals are naturally table-specific. I
> don't see a need to have them be global- that had been brought up before
> with the notion of applications picking their policy, but we could also add
> that later through another term (eg: contexts) which would then map to
> policies or similar. We could even extend policies to be global by mapping
> existing per-table ones to be global if we really needed to...
>
> My feeling at the moment is that having them be per-table makes sense and
> we'd still have flexibility to change later if we had some compelling reason
> to do so.

I don't think you can really change it later.  If policies are
per-table, then you could have a policy p1 on table t1 and also on
table t2; if they become global objects, then you can't have p1 in two
places.  I hope I'm not beating a dead horse here, but changing syntax
after it's been released is very, very hard.

But that's not an argument against doing it this way; I think
per-table policies are probably simpler and better here.  It means,
for example, that policies need not have their own permissions and
ownership structure - they're part of the table, just like a
constraint, trigger, or rule, and the table owner's permissions
control.  I like that, and I think our users will, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Supporting Windows SChannel as OpenSSL replacement

2014-07-11 Thread Alvaro Herrera
Heikki Linnakangas wrote:

> I did again the refactoring you did back in 2006, patch attached.
> One thing I did differently: I moved the raw, non-encrypted,
> read/write functions to separate functions: pqsecure_raw_read and
> pqsecure_raw_write. Those functions encapsulate the SIGPIPE
> handling. The OpenSSL code implements a custom BIO, which calls to
> pqsecure_raw_read/write to do the low-level I/O.  Similarly in the
> server-side, there are be_tls_raw_read and pg_tls_raw_write
> functions, which do the
> prepare_for_client_read()/client_read_ended() dance, so that the SSL
> implementation doesn't need to know about that.

I'm skimming over this patch (0001).  There are some issues:

* You duplicated the long comment under the IDENTIFICATION tag that was
  in be-secure.c; it's now both in that file and also in
  be-secure-openssl.c.  I think it should be removed from be-secure.c.
  Also, the hardcoded DH params are duplicated in be-secure.c, but they
  belong in -openssl.c only now.

* There is some mixup regarding USE_SSL and USE_OPENSSL in be-secure.c.
  I think anything that's OpenSSL-specific should be in
  be-secure-openssl.c only; any new SSL implementation will need to
  implement all those functions.  For instance, be_tls_init().
  I imagine that if we select any SSL implementation, USE_SSL would get
  defined, and each SSL implementation would additionally define its own
  symbol.  Unless the idea is to get rid of USE_OPENSSL completely, and
  use only the Makefile bit to decide which implementation to use?  If
  so, then USE_OPENSSL as a preprocessor symbol is useless ...

* ssl_renegotiation_limit is also duplicated.  But removing this one is
  probably not going to be as easy as deleting a line from be-secure.c,
  because guc.c depends on that one.  I think that variable should be
  defined in be-secure.c (i.e. delete it from -openssl) and make sure
  that all SSL implementations enforce it on their own somehow.

The DISABLE_SIGPIPE thingy looks wrong in pqsecure_write.  I think it
should be like this:

ssize_t
pqsecure_write(PGconn *conn, const void *ptr, size_t len)
{
    ssize_t     n;

#ifdef USE_SSL
    if (conn->ssl_in_use)
    {
        DECLARE_SIGPIPE_INFO(spinfo);

        DISABLE_SIGPIPE(conn, spinfo, return -1);

        n = pgtls_write(conn, ptr, len);

        RESTORE_SIGPIPE(spinfo);
    }
    else
#endif   /* USE_SSL */
    {
        n = pqsecure_raw_write(conn, ptr, len);
    }

    return n;
}

You are missing the restore call, and I moved the declaration inside the
ssl_in_use block since otherwise it's not useful.  The other path is
fine since pqsecure_raw_write disables and restores the flag by itself.
Also, you're missing DECLARE/DISABLE/RESTORE in the ssl_in_use block in
pqsecure_read.  (The original code does not have that code in the
non-SSL path.  I assume, without checking, that that's okay.)  I also
assume without checking that all SSL implementations would be fine with
this SIGPIPE handling.

Another thing that seems wrong is the REMEMBER_EPIPE stuff.  The
fe-secure-openssl.c code should be setting the flag, but AFAICS only the
non-SSL code is doing it.


Thanks for working on this -- I'm sure many distributors will be happy
to be able to enable other, less license-broken TLS implementations.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-11 Thread Tomas Vondra
On 10.7.2014 21:33, Tomas Vondra wrote:
> On 9.7.2014 16:07, Robert Haas wrote:
>> On Tue, Jul 8, 2014 at 5:16 PM, Tomas Vondra  wrote:
>>> Thinking about this a bit more, do we really need to build the hash
>>> table on the first pass? Why not to do this:
>>>
>>> (1) batching
>>> - read the tuples, stuff them into a simple list
>>> - don't build the hash table yet
>>>
>>> (2) building the hash table
>>> - we have all the tuples in a simple list, batching is done
>>> - we know exact row count, can size the table properly
>>> - build the table
>>
>> We could do this, and in fact we could save quite a bit of memory if
>> we allocated say 1MB chunks and packed the tuples in tightly instead
>> of palloc-ing each one separately.  But I worry that rescanning the
>> data to build the hash table would slow things down too much.
> 
> I did a quick test of how much memory we could save by this. The 
> attached patch densely packs the tuples into 32kB chunks (1MB seems
> way too much because of small work_mem values, but I guess this might
> be tuned based on number of tuples / work_mem size ...).

It turns out getting this working properly will be quite complicated. The
patch was only a quick attempt to see how much overhead there is, and
didn't solve one important detail - batching.

The problem is that when increasing the number of batches, we need to
get rid of the tuples from one of them. That means calling pfree() on
the tuples written to a temporary file, and that's not possible with the
dense allocation.
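
For illustration, a minimal self-contained sketch of the dense-allocation
idea (a bump allocator over fixed 32kB chunks; the names are made up and
this is not the patch).  It also shows why a per-tuple pfree() is
impossible: tuples are packed back to back with no per-tuple header, so
memory can only be released one whole chunk at a time:

    #include <stdlib.h>

    #define CHUNK_SIZE (32 * 1024)          /* 32kB, as in the experiment */

    typedef struct Chunk
    {
        struct Chunk *next;
        size_t        used;
        char          data[CHUNK_SIZE];
    } Chunk;

    /* Allocate 'size' bytes (assumed <= CHUNK_SIZE) from the chunk list,
     * starting a new chunk when the current one is full. */
    static void *
    chunk_alloc(Chunk **head, size_t size)
    {
        Chunk  *c = *head;

        if (c == NULL || c->used + size > CHUNK_SIZE)
        {
            c = malloc(sizeof(Chunk));
            if (c == NULL)
                return NULL;
            c->next = *head;
            c->used = 0;
            *head = c;
        }
        c->used += size;
        return c->data + c->used - size;
    }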


1) copy into new chunks (dead end)
--

The first idea I had was to just "copy" everything into new chunks and
then throw away the old ones, but this way we might end up using 150% of
work_mem on average (when the two batches are about 1/2 the data each),
and in an extreme case up to 200% of work_mem (all tuples having the
same key and thus falling into the same batch). So I don't think that's
really a good approach.


2) walking through the tuples sequentially
--

The other option is not to walk the tuples through the buckets, but to
walk through the chunks - we know the tuples are stored as
HashJoinTuple/MinimalTuple, so it shouldn't be that difficult. And by
walking the tuples in the order they're stored, we may just rearrange
the tuples in already allocated memory. And maybe it could be even
faster than the current code, because of the sequential access to the
memory (as opposed to the random access through buckets) and CPU caches.
So I like this approach - it's simple, clean and shouldn't add any
overhead (memory or CPU).


3) batch-aware chunks
-

I also think a "batch-aware" chunks might work. Say we're starting with
nbatch=N. Instead of allocating everything in a single chunk, we'll
allocate the tuples from the chunks according to a "hypothetical batch
number" - what batch would the tuple belong to if we had (nbatch=N*2).
So we'd have two chunks (or sequences of chunks), and we'd allocate the
tuples into them.

Then if we actually need to increase the number of batches, we know we
can just free one of the chunks, because all of its tuples are from the
'wrong' batch, and redistribute the remaining tuples into the new
chunks (according to the new hypothetical batch numbers).

I'm not sure how much overhead this would add, but I think it can be
done quite efficiently, and it shouldn't have any impact at all if we
don't do any additional batching (i.e. if the initial estimates are ok).

Any other ideas how to tackle this?

regards
Tomas




Re: [HACKERS] add line number as prompt option to psql

2014-07-11 Thread Alvaro Herrera
Sawada Masahiko wrote:

> As you said, if the line number reached UINT_MAX then that case would
> be too strange anyway.
> I think INT_MAX is enough for the line number.

My point is not whether 2 billion is a better number than 4 billion as a
maximum value.  My point is that wraparound of signed int is, I think,
not even defined in C, whereas wraparound of unsigned int is well
defined.  cur_line should be declared as unsigned int.  I don't trust
that INT_MAX+2 arithmetic.
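
For what it's worth, the difference in two lines (standard C semantics,
not code from the patch):

    #include <limits.h>

    void
    wraparound_demo(void)
    {
        unsigned int u = UINT_MAX;
        int          i = INT_MAX;

        u = u + 1;      /* well defined: wraps around to 0 */
        i = i + 1;      /* undefined behaviour: signed overflow */
    }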

Please don't use cur_line as a name for a global variable.  Something
like PSQLLineNumber seems more appropriate if it's going to be exposed
through prompt.h.  However, note that MainLoop() keeps state in local
variables and notes that it is reentrant; what happens to your cur_line
when a file is read by \i and similar?  I wonder if it should be part of
PsqlScanStateData instead ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] add line number as prompt option to psql

2014-07-11 Thread Sawada Masahiko
On Fri, Jul 11, 2014 at 11:01 PM, Alvaro Herrera
 wrote:
> Jeevan Chalke wrote:
>
>> On Fri, Jul 11, 2014 at 3:13 PM, Sawada Masahiko 
>> wrote:
>
>> > And the line number should be switched to 1 when line number has
>> > reached to INT_MAX?
>>
>> Yes, when it goes beyond INT_MAX, wrap around to 1.
>>
>> BTW, I wonder, can't we simply use unsigned int instead?
>
> That was my thought also: let the variable be unsigned, and have it wrap
> around normally.  So once you reach UINT_MAX, the next line number is
> zero (instead of getting stuck at UINT_MAX, which would be rather
> strange).  Anyway I don't think anyone is going to reach the UINT_MAX
> limit ... I mean that would be one hell of a query, wouldn't it.  If
> your query is upwards of a million lines, surely you are in deep trouble
> already.
>
> Does your text editor handle files longer than 4 billion lines?
>

As you said, if the line number reached UINT_MAX then that case would
be too strange anyway.
I think INT_MAX is enough for the line number.

The v5 patch which Jeevan created seems good.
One point, though: the patch did not apply cleanly to HEAD (a hunk in the
doc file failed), so I have revised it and attached the result.

Regards,

---
Sawada Masahiko


psql-line-number_v5.patch
Description: Binary data



Re: [HACKERS] Minmax indexes

2014-07-11 Thread Claudio Freire
On Thu, Jul 10, 2014 at 4:20 PM, Alvaro Herrera
 wrote:
> Claudio Freire wrote:
>> On Wed, Jul 9, 2014 at 6:16 PM, Alvaro Herrera  
>> wrote:
>> > Another thing I noticed is that version 8 of the patch blindly believed
>> > the "pages_per_range" declared in catalogs.  This meant that if somebody
>> > did "alter index foo set pages_per_range=123" the index would
>> > immediately break (i.e. return corrupted results when queried).  I have
>> > fixed this by storing the pages_per_range value used to construct the
>> > index in the metapage.  Now if you do the ALTER INDEX thing, the new
>> > value is only used when the index is recreated by REINDEX.
>>
>> This seems a lot like parameterizing.
>
> I don't understand what that means -- care to elaborate?

We've been talking about bloom filters, and how their shape differs
according to the parameters of the bloom filter (number of hashes,
hash type, etc).

But after seeing this case of pages_per_range, I noticed it's an
effective-enough mechanism. Like:

CREATE INDEX ix_blah ON some_table USING bloom (somecol)
  WITH (BLOOM_HASHES=15, BLOOM_BUCKETS=1024, PAGES_PER_RANGE=64);

Marking as read-only is ok, or emitting a NOTICE so that if anyone
changes those parameters that change the shape of the index, they know
it needs a rebuild would be OK too. Both mechanisms work for me.




Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread David Johnston
On Fri, Jul 11, 2014 at 12:12 PM, Michael Banck  wrote:

> On Fri, Jul 11, 2014 at 11:07:21AM -0400, Tom Lane wrote:
> > David G Johnston  writes:
> > > Benedikt Grundmann wrote
> > >> That is it possible to tell the planner that index is off limits
> > >> i.e.
> > >> don't ever generate a plan using it?
> >
> > > Catalog hacking could work but not recommended (nor do I know the
> > > proper
> > > commands and limitations).  Do you need the database/table to accept
> > > writes
> > > during the testing period?
> >
> > Hacking pg_index.indisvalid could work, given a reasonably recent PG.
> > I would not try it in production until I'd tested it ;-)
>
> I wonder whether this should be exposed at the SQL level?  Hacking
> pg_index is left to superusers, but the creator of an index (or the
> owner of the schema) might want to experiment with disabling indices
> while debugging query plans as well.
>
> Turns out this is already in the TODO, Steve Singer has requested this
> (in particular, "ALTER TABLE ...  ENABLE|DISABLE INDEX ...") in
>
> http://www.postgresql.org/message-id/87hbegz5ir@cbbrowne.afilias-int.info
> (as linked to from the TODO wiki page), but the neighboring discussion
> was mostly about FK constraints.
>
> Thoughts?
>
>
> Michael
>

Apparently work is ongoing to allow EXPLAIN to calculate the impact a
particular index has on table writes.  What is needed is a mechanism to
temporarily remove the impact of specific indexes on reads without
having to disable the index for writing.  Ideally this would work on a
per-query basis, so altering the catalog doesn't make sense.  I know we
do not want traditional planner hints, but in the spirit of the existing
enable_indexscan GUC there should be a
"disable_readofindex='table1.index1,table1.index2,table2.index1'" GUC
capability that would allow for session, user, or system-level control of
which indexes are to be used during table reads.

David J.


Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread Michael Banck
On Fri, Jul 11, 2014 at 11:07:21AM -0400, Tom Lane wrote:
> David G Johnston  writes:
> > Benedikt Grundmann wrote
> >> That is it possible to tell the planner that index is off limits
> >> i.e.
> >> don't ever generate a plan using it?
> 
> > Catalog hacking could work but not recommended (nor do I know the
> > proper
> > commands and limitations).  Do you need the database/table to accept
> > writes
> > during the testing period?
> 
> Hacking pg_index.indisvalid could work, given a reasonably recent PG.
> I would not try it in production until I'd tested it ;-)

I wonder whether this should be exposed at the SQL level?  Hacking
pg_index is left to superusers, but the creator of an index (or the
owner of the schema) might want to experiment with disabling indices
while debugging query plans as well.

Turns out this is already in the TODO, Steve Singer has requested this
(in particular, "ALTER TABLE ...  ENABLE|DISABLE INDEX ...") in
http://www.postgresql.org/message-id/87hbegz5ir@cbbrowne.afilias-int.info
(as linked to from the TODO wiki page), but the neighboring discussion
was mostly about FK constraints.

Thoughts?


Michael




Re: [HACKERS] Allow multi-byte characters as escape in SIMILAR TO and SUBSTRING

2014-07-11 Thread Tom Lane
Jeff Davis  writes:
> Attached is a small patch to $SUBJECT.
> In master, only single-byte characters are allowed as an escape. Of
> course, with the patch it must still be a single character, but it may
> be multi-byte.

I'm concerned about the performance cost of this patch.  Have you done
any measurements about what kind of overhead you are putting on the
inner loop of similar_escape?

At the very least, please don't call GetDatabaseEncoding() over again
every single time through the inner loop.  More generally, why do we
need to use pg_encoding_verifymb?  The input data is presumably validly
encoded.  ISTM the significantly cheaper pg_mblen() would be more
appropriate.

regards, tom lane




Re: [HACKERS] Allow multi-byte characters as escape in SIMILAR TO and SUBSTRING

2014-07-11 Thread Jeff Davis
On Fri, 2014-07-11 at 14:41 +0900, Fujii Masao wrote:
> Could you add the patch into next CF?

Sure. The patch is so small I was thinking about committing it in a few
days (assuming no complaints), but I'm in no hurry.

> The patch doesn't contain the change of the document. But I think that
> it's better to document what character is allowed as escape in LIKE,
> SIMILAR TO and SUBSTRING.

It should be assumed that multi-byte characters are allowed nearly
everywhere, and we should document the places where only single-byte
characters are allowed.

Regards,
Jeff Davis






Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread Tom Lane
Andres Freund  writes:
> On 2014-07-11 11:20:08 -0400, Tom Lane wrote:
>> If you're talking about SnapshotNow hazards, I think the risk would be
>> minimal, and probably no worse than cases that the system will cause
>> by itself.

> Yes, SnapshotNow. I could reproduce it causing 'spurious' HOT updates
> and missing index inserts a while back. And I don't think it's
> comparable with normal modifications. Those either have a modification
> blocking lock or use heap_inplace...

I still think the risk is minimal, but if the OP was worried about this
he could take out an AccessExclusive lock on the parent table for long
enough to commit the pg_index change.

regards, tom lane




Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread Andres Freund
On 2014-07-11 11:20:08 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2014-07-11 11:07:21 -0400, Tom Lane wrote:
> >> Hacking pg_index.indisvalid could work, given a reasonably recent PG.
> >> I would not try it in production until I'd tested it ;-)
> 
> > Works, but IIRC can cause problems at least < 9.4 because concurrent
> > cache builds might miss the pg_index row...
> 
> If you're talking about SnapshotNow hazards, I think the risk would be
> minimal, and probably no worse than cases that the system will cause
> by itself.

Yes, SnapshotNow. I could reproduce it causing 'spurious' HOT updates
and missing index inserts a while back. And I don't think it's
comparable with normal modifications. Those either have a modification
blocking lock or use heap_inplace...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread Tom Lane
Andres Freund  writes:
> On 2014-07-11 11:07:21 -0400, Tom Lane wrote:
>> Hacking pg_index.indisvalid could work, given a reasonably recent PG.
>> I would not try it in production until I'd tested it ;-)

> Works, but IIRC can cause problems at least < 9.4 because concurrent
> cache builds might miss the pg_index row...

If you're talking about SnapshotNow hazards, I think the risk would be
minimal, and probably no worse than cases that the system will cause
by itself.

regards, tom lane




Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread Andres Freund
On 2014-07-11 11:07:21 -0400, Tom Lane wrote:
> David G Johnston  writes:
> > Benedikt Grundmann wrote
> >> That is it possible to tell the planner that index is off limits i.e.
> >> don't ever generate a plan using it?
> 
> > Catalog hacking could work but not recommended (nor do I know the proper
> > commands and limitations).  Do you need the database/table to accept writes
> > during the testing period?
> 
> Hacking pg_index.indisvalid could work, given a reasonably recent PG.
> I would not try it in production until I'd tested it ;-)

Works, but IIRC can cause problems at least < 9.4 because concurrent
cache builds might miss the pg_index row...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread Tom Lane
David G Johnston  writes:
> Benedikt Grundmann wrote
>> That is it possible to tell the planner that index is off limits i.e.
>> don't ever generate a plan using it?

> Catalog hacking could work but not recommended (nor do I know the proper
> commands and limitations).  Do you need the database/table to accept writes
> during the testing period?

Hacking pg_index.indisvalid could work, given a reasonably recent PG.
I would not try it in production until I'd tested it ;-)

regards, tom lane




Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-07-11 Thread Tom Lane
David Rowley  writes:
> On Fri, Jul 11, 2014 at 1:11 PM, Tom Lane  wrote:
>> Hm ... actually, there might be a better answer: what about transforming
>> WHERE (x,y) NOT IN (SELECT provably-not-null-values FROM ...)
>> to
>> WHERE  AND x IS NOT NULL AND y IS NOT NULL

> I think this is the way to go.

> I'll try and get some time soon to look into adding the IS NOT NULL quals,
> unless you were thinking of looking again?

Go for it, I've got a lot of other stuff on my plate.

regards, tom lane




Re: [HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread David G Johnston
Benedikt Grundmann wrote
> That is it possible to tell the planner that index is off limits i.e.
> don't
> ever generate a plan using it?
> 
> Rationale:  Schema changes on big tables.  I might have convinced myself /
> strong beliefs that for all queries that I need to be fast the planner
> does
> not need to use a given index (e.g. other possible plans are fast enough).
> However if I just drop the index and it turns out I'm wrong I might be in
> a
> world of pain because it might just take way to long to recreate the
> index.
> 
> I know that I can use pg_stat* to figure out if an index is used at all.
> But in the presense of multiple indices and complex queries the planner
> might prefer the index-to-be-dropped but the difference to the
> alternatives
> available is immaterial.
> 
> The current best alternative we have is to test such changes on a testing
> database that gets regularly restored from production.  However at least
> in
> our case we simply don't know all possible queries (and logging all of
> them
> is not an option).
> 
> Cheers,
> 
> Bene

Worth double-checking in test but...

BEGIN;
DROP INDEX ...;
EXPLAIN ANALYZE SELECT ...
ROLLBACK;

Index dropping is transactional so your temporary action lasts until you
abort said transaction.

Though given your knowledge limitations this really isn't an improvement...

Catalog hacking could work but not recommended (nor do I know the proper
commands and limitations).  Do you need the database/table to accept writes
during the testing period?

You can avoid all indexes, but not a named subset, using a configuration
parameter.

David J.








Re: [HACKERS] add line number as prompt option to psql

2014-07-11 Thread Alvaro Herrera
Jeevan Chalke wrote:

> On Fri, Jul 11, 2014 at 3:13 PM, Sawada Masahiko 
> wrote:

> > And the line number should be switched to 1 when line number has
> > reached to INT_MAX?
> 
> Yes, when it goes beyond INT_MAX, wrap around to 1.
> 
> BTW, I wonder, can't we simply use unsigned int instead?

That was my thought also: let the variable be unsigned, and have it wrap
around normally.  So once you reach UINT_MAX, the next line number is
zero (instead of getting stuck at UINT_MAX, which would be rather
strange).  Anyway I don't think anyone is going to reach the UINT_MAX
limit ... I mean that would be one hell of a query, wouldn't it.  If
your query is upwards of a million lines, surely you are in deep trouble
already.

Does your text editor handle files longer than 4 billion lines?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Pg_upgrade and toast tables bug discovered

2014-07-11 Thread Bruce Momjian
On Fri, Jul 11, 2014 at 09:48:06AM -0400, Bruce Momjian wrote:
> > Uh, why does this need to be in ALTER TABLE?  Can't this be part of
> > table creation done by pg_dump?
> 
> Uh, I think you need to read the thread.  We have to delay the toast
> creation part so we don't use an oid that will later be required by
> another table from the old cluster.  This has to be done after all
> tables have been created.
> 
> We could have pg_dump spit out those ALTER lines at the end of the dump,
> but it seems simpler to do it in pg_upgrade.
> 
> Even if we have pg_dump create all the tables that require pre-assigned
> TOAST oids first, then the other tables that _might_ need a TOAST table,
> those later tables might create a toast oid that matches a later
> non-TOAST-requiring table, so I don't think that fixes the problem.

What would be nice is if I could mark just the tables that will need
toast tables created in that later phase (those tables that didn't have
a toast table in the old cluster, but need one in the new cluster). 
However, I can't see where to store that or how to pass that back into
pg_upgrade.  I don't see a logical place in pg_class to put it.
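
For what it's worth, here is a rough libpq sketch of the "no-op ALTER
TABLE" loop described above.  The pg_class filter and the particular ALTER
form are placeholders rather than pg_upgrade code; the idea relies on the
observation above that ALTER TABLE re-checks whether a TOAST table is
needed:

    #include <stdio.h>
    #include <libpq-fe.h>

    static void
    create_missing_toast_tables(PGconn *conn)
    {
        PGresult   *res;
        int         i;

        /* find plain tables that currently have no TOAST table */
        res = PQexec(conn,
                     "SELECT oid::regclass FROM pg_class "
                     "WHERE relkind = 'r' AND reltoastrelid = 0");
        if (PQresultStatus(res) != PGRES_TUPLES_OK)
        {
            fprintf(stderr, "query failed: %s", PQerrorMessage(conn));
            PQclear(res);
            return;
        }

        for (i = 0; i < PQntuples(res); i++)
        {
            char        cmd[1024];
            PGresult   *alt;

            /* placeholder no-op ALTER; SET STATISTICS on a column was the
             * suggestion above, which would also need a column name */
            snprintf(cmd, sizeof(cmd),
                     "ALTER TABLE %s SET (autovacuum_enabled = true)",
                     PQgetvalue(res, i, 0));
            alt = PQexec(conn, cmd);
            PQclear(alt);
        }
        PQclear(res);
    }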

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Pg_upgrade and toast tables bug discovered

2014-07-11 Thread Bruce Momjian
On Fri, Jul 11, 2014 at 12:18:40AM -0400, Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > On Thu, Jul 10, 2014 at 06:38:26PM -0400, Bruce Momjian wrote:
> 
> > > I have thought some more on this.  I thought I would need to open
> > > pg_class in C and do complex backend stuff, but I now realize I can do
> > > it from libpq, and just call ALTER TABLE and I think that always
> > > auto-checks if a TOAST table is needed.  All I have to do is query
> > > pg_class from libpq, then construct ALTER TABLE commands for each item,
> > > and it will optionally create the TOAST table if needed.  I just have to
> > > use a no-op ALTER TABLE command, like SET STATISTICS.
> > 
> > Attached is the backend part of the patch.  I will work on the
> > pg_upgrade/libpq/ALTER TABLE part later.
> 
> Uh, why does this need to be in ALTER TABLE?  Can't this be part of
> table creation done by pg_dump?

Uh, I think you need to read the thread.  We have to delay the toast
creation part so we don't use an oid that will later be required by
another table from the old cluster.  This has to be done after all
tables have been created.

We could have pg_dump spit out those ALTER lines at the end of the dump,
but it seems simpler to do it in pg_upgrade.

Even if we have pg_dump create all the tables that require pre-assigned
TOAST oids first, then the other tables that _might_ need a TOAST table,
those later tables might create a toast oid that matches a later
non-TOAST-requiring table, so I don't think that fixes the problem.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Minmax indexes

2014-07-11 Thread Alvaro Herrera
Fujii Masao wrote:
> On Thu, Jul 10, 2014 at 6:16 AM, Alvaro Herrera
>  wrote:

> > Here's a new version of this patch, which is more generic the original
> > versions, and similar to what you describe.
> 
> I've not read the discussion so far at all, but I found the problem
> when I played with this patch. Sorry if this has already been discussed.
> 
> =# create table test as select num from generate_series(1,10) num;
> SELECT 10
> =# create index testidx on test using minmax (num);
> CREATE INDEX
> =# alter table test alter column num type text;
> ERROR:  could not determine which collation to use for string comparison
> HINT:  Use the COLLATE clause to set the collation explicitly.

Ah, yes, I need to pass down collation OIDs to comparison functions.
That's marked as XXX in various places in the code.  Sorry I forgot to
mention that.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-07-11 Thread Rahila Syed
Thank you for review.

>So, you're compressing backup blocks one by one. I wonder if that's the
>right idea and if we shouldn't instead compress all of them in one run to
>increase the compression ratio.
The idea behind compressing blocks one by one was to keep the code as
similar to the original as possible.
For instance, the easiest change I could think of is: if we compress all
backup blocks of a WAL record together, the WAL record format below might
change from

Fixed-size header (XLogRecord struct)
 rmgr-specific data
 BkpBlock
 backup block data
 BkpBlock
 backup block data


to

 Fixed-size header (XLogRecord struct)
 rmgr-specific data
 BkpBlock
 BkpBlock
 backup blocks data
...

But at the same time, it may be worth giving it a try to see if there is a
significant improvement in compression.

>So why aren't we compressing the hole here instead of compressing the
>parts that the current logic deems to be filled with important information?
The entire full-page image in the WAL record is compressed.  The unimportant
part of the full-page image, i.e. the hole, is not WAL-logged in the original
code.  This patch compresses the entire full-page image, hole included.  This
could be optimized by omitting the hole from the compressed FPW (in case the
hole is filled with non-zeros), as the original uncompressed FPW does, but
that would require changing the BkpBlock structure.
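
To make the hole-skipping idea concrete, here is a minimal sketch (not from
the patch; BLCKSZ is assumed to be 8192 and the helper name is made up) of
assembling a hole-free image before handing it to whichever compressor is
configured:

#include <string.h>

#define BLCKSZ 8192				/* assumed block size for this sketch */

/* Copy the data before and after the hole into a scratch buffer and return
 * the number of bytes the caller should compress. */
static int
assemble_image_without_hole(const char *page, int hole_offset,
							int hole_length, char *scratch)
{
	int			tail_len = BLCKSZ - (hole_offset + hole_length);

	memcpy(scratch, page, hole_offset);
	memcpy(scratch + hole_offset, page + hole_offset + hole_length, tail_len);
	return hole_offset + tail_len;
}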

>This should be named 'compress_full_page_writes' or so, even if a
>temporary guc. There's the 'full_page_writes' guc and I see little
>reaason to deviate from its name.

Yes. This will be renamed to full_page_compression according to suggestions
earlier in the discussion.


Thank you,

Rahila Syed


On Fri, Jul 11, 2014 at 12:00 PM, Andres Freund 
wrote:

> On 2014-07-04 19:27:10 +0530, Rahila Syed wrote:
> > + /* Allocates memory for compressed backup blocks according to the
> compression
> > + * algorithm used.Once per session at the time of insertion of
> first XLOG
> > + * record.
> > + * This memory stays till the end of session. OOM is handled by
> making the
> > + * code proceed without FPW compression*/
> > + static char *compressed_pages[XLR_MAX_BKP_BLOCKS];
> > + static bool compressed_pages_allocated = false;
> > + if (compress_backup_block != BACKUP_BLOCK_COMPRESSION_OFF &&
> > + compressed_pages_allocated!= true)
> > + {
> > + size_t buffer_size = VARHDRSZ;
> > + int j;
> > + if (compress_backup_block ==
> BACKUP_BLOCK_COMPRESSION_SNAPPY)
> > + buffer_size +=
> snappy_max_compressed_length(BLCKSZ);
> > + else if (compress_backup_block ==
> BACKUP_BLOCK_COMPRESSION_LZ4)
> > + buffer_size += LZ4_compressBound(BLCKSZ);
> > + else if (compress_backup_block ==
> BACKUP_BLOCK_COMPRESSION_PGLZ)
> > + buffer_size += PGLZ_MAX_OUTPUT(BLCKSZ);
> > + for (j = 0; j < XLR_MAX_BKP_BLOCKS; j++)
> > + {   compressed_pages[j] = (char *) malloc(buffer_size);
> > + if(compressed_pages[j] == NULL)
> > + {
> > +
> compress_backup_block=BACKUP_BLOCK_COMPRESSION_OFF;
> > + break;
> > + }
> > + }
> > + compressed_pages_allocated = true;
> > + }
>
> Why not do this in InitXLOGAccess() or similar?
>
> >   /*
> >* Make additional rdata chain entries for the backup blocks, so
> that we
> >* don't need to special-case them in the write loop.  This
> modifies the
> > @@ -1015,11 +1048,32 @@ begin:;
> >   rdt->next = &(dtbuf_rdt2[i]);
> >   rdt = rdt->next;
> >
> > + if (compress_backup_block != BACKUP_BLOCK_COMPRESSION_OFF)
> > + {
> > + /* Compress the backup block before including it in rdata
> chain */
> > + rdt->data = CompressBackupBlock(page, BLCKSZ -
> bkpb->hole_length,
> > +
>   compressed_pages[i], &(rdt->len));
> > + if (rdt->data != NULL)
> > + {
> > + /*
> > +  * write_len is the length of compressed
> block and its varlena
> > +  * header
> > +  */
> > + write_len += rdt->len;
> > + bkpb->hole_length = BLCKSZ - rdt->len;
> > + /*Adding information about compression in
> the backup block header*/
> > +
> bkpb->block_compression=compress_backup_block;
> > + rdt->next = NULL;
> > + continue;
> > + }
> > + }
> > +
>
> So, you're compressing backup blocks one by one. I wonder if that's the
> right idea and if we shouldn't instead compress all of them in one run to
> increase the compression ratio.
>
>
> > +/*
> >   * Get a pointer to the

Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-07-11 Thread David Rowley
On Fri, Jul 11, 2014 at 1:11 PM, Tom Lane  wrote:

> I wrote:
> > We could no doubt fix this by also insisting that the left-side vars
> > be provably not null, but that's going to make the patch even slower
> > and even less often applicable.  I'm feeling discouraged about whether
> > this is worth doing in this form.
>
>
:-( seems I didn't do my analysis very well on that one.


>  Hm ... actually, there might be a better answer: what about transforming
>
>WHERE (x,y) NOT IN (SELECT provably-not-null-values FROM ...)
>
> to
>
>WHERE  AND x IS NOT NULL AND y IS NOT NULL
>
> ?
>
>
I think this is the way to go.
It's basically what I had to do with the WIP patch I have here for SEMI
JOIN removal: when an IN() or EXISTS-type join can be removed thanks to the
existence of a foreign key, the NULL values still need to be filtered out.

Perhaps it would be possible for a future patch to check get_attnotnull and
remove these again in eval_const_expressions, if the column can't be null.
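
For illustration only (this is not the committed change), roughly what
constructing those IS NOT NULL quals in the planner could look like, assuming
a list of the left-hand expressions is already at hand:

#include "postgres.h"
#include "nodes/primnodes.h"
#include "nodes/pg_list.h"

/* Sketch: wrap each left-hand expression in an IS NOT NULL test and append
 * the tests to an existing qual list. */
static List *
add_not_null_quals(List *quals, List *left_exprs)
{
	ListCell   *lc;

	foreach(lc, left_exprs)
	{
		NullTest   *nt = makeNode(NullTest);

		nt->arg = (Expr *) lfirst(lc);
		nt->nulltesttype = IS_NOT_NULL;
		nt->argisrow = false;	/* per-column test, not a whole-row test */
		quals = lappend(quals, nt);
	}
	return quals;
}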

Thanks for taking the time to fix up the weirdness with the NATURAL joins
and also making use of the join condition to prove not null-ability.

I'll try and get some time soon to look into adding the IS NOT NULL quals,
unless you were thinking of looking again?

Regards

David Rowley


> Of course this would require x/y not being volatile, but if they are,
> we're not going to get far with optimizing the query anyhow.
>
> regards, tom lane
>


Re: [HACKERS] add line number as prompt option to psql

2014-07-11 Thread Jeevan Chalke
Hi,


On Fri, Jul 11, 2014 at 3:13 PM, Sawada Masahiko 
wrote:

> >
> >
>
> To make sure I understand correctly: you mean it is incorrect that the line
> number stays unchanged once it has reached INT_MAX?
>

As per my thinking yes.


> And that the line number should wrap around to 1 when it reaches INT_MAX?
>

Yes, when it goes beyond INT_MAX, wrap around to 1.

BTW, I wonder, can't we simply use unsigned int instead?

Also, what is the behaviour of the "LINE n" marker in error messages in the
case of such a wrap-around?
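
(As a tiny illustration of the wrap-around being discussed, with a plain int
counter; the function name is made up and this is not the patch itself:)

#include <limits.h>

/* Advance the prompt's line counter, wrapping back to 1 instead of
 * overflowing into negative numbers once INT_MAX is reached. */
static int
advance_line_number(int cur_line)
{
	return (cur_line >= INT_MAX) ? 1 : cur_line + 1;
}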


>
> >
> > Or much better, simply get rid of newline, and use cur_line itself, will
> > this work well for you?
>
> this is better.  I will change code to this.
> Thanks.
> I will fix it.
>


Meanwhile, I have tried this.  A patch is attached; please let me know what
you think of it.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 255e8ca..030f4d0 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3298,6 +3298,11 @@ testdb=> INSERT INTO my_table VALUES (:'content');
   
 
   
+%l
+The current line number
+  
+
+  
 %digits
 
 
diff --git a/src/bin/psql/mainloop.c b/src/bin/psql/mainloop.c
index c3aff20..675b550 100644
--- a/src/bin/psql/mainloop.c
+++ b/src/bin/psql/mainloop.c
@@ -8,6 +8,7 @@
 #include "postgres_fe.h"
 #include "mainloop.h"
 
+#include <limits.h>
 
 #include "command.h"
 #include "common.h"
@@ -58,6 +59,7 @@ MainLoop(FILE *source)
 	pset.cur_cmd_source = source;
 	pset.cur_cmd_interactive = ((source == stdin) && !pset.notty);
 	pset.lineno = 0;
+	cur_line = 1;
 
 	/* Create working state */
 	scan_state = psql_scan_create();
@@ -225,6 +227,7 @@ MainLoop(FILE *source)
 		{
 			PsqlScanResult scan_result;
 			promptStatus_t prompt_tmp = prompt_status;
+			char *tmp = line;
 
 			scan_result = psql_scan(scan_state, query_buf, &prompt_tmp);
 			prompt_status = prompt_tmp;
@@ -235,6 +238,30 @@ MainLoop(FILE *source)
 exit(EXIT_FAILURE);
 			}
 
+			/* 
+			 * Increase current line number counter with the new lines present
+			 * in the line buffer
+			 */
+			while (*tmp != '\0' && scan_result != PSCAN_INCOMPLETE)
+			{
+if (*(tmp++) == '\n')
+	cur_line++;
+			}
+
+			/* The one new line is always added to tail of query_buf */
+			if (scan_result != PSCAN_INCOMPLETE)
+cur_line++;
+
+			/*
+			 * If we overflow, then we start at INT_MIN and move towards 0.  So
+			 * to get +ve wrap-around line number we have to add INT_MAX + 2 to
+			 * this number.  We add 2 due to the fact that we have difference
+			 * of 1 in absolute value of INT_MIN and INT_MAX and another 1 as
+			 * line number starts at one and not at zero.
+			 */
+			if (cur_line < 0)
+cur_line += INT_MAX + 2;
+
 			/*
 			 * Send command if semicolon found, or if end of line and we're in
 			 * single-line mode.
@@ -256,6 +283,7 @@ MainLoop(FILE *source)
 /* execute query */
 success = SendQuery(query_buf->data);
 slashCmdStatus = success ? PSQL_CMD_SEND : PSQL_CMD_ERROR;
+cur_line = 1;
 
 /* transfer query to previous_buf by pointer-swapping */
 {
@@ -303,6 +331,7 @@ MainLoop(FILE *source)
  query_buf : previous_buf);
 
 success = slashCmdStatus != PSQL_CMD_ERROR;
+cur_line = 1;
 
 if ((slashCmdStatus == PSQL_CMD_SEND || slashCmdStatus == PSQL_CMD_NEWEDIT) &&
 	query_buf->len == 0)
diff --git a/src/bin/psql/prompt.c b/src/bin/psql/prompt.c
index 26fca04..6a62e5f 100644
--- a/src/bin/psql/prompt.c
+++ b/src/bin/psql/prompt.c
@@ -44,6 +44,7 @@
  *		in prompt2 -, *, ', or ";
  *		in prompt3 nothing
  * %x - transaction status: empty, *, !, ? (unknown or no connection)
+ * %l - the line number
  * %? - the error code of the last query (not yet implemented)
  * %% - a percent sign
  *
@@ -229,6 +230,9 @@ get_prompt(promptStatus_t status)
 		}
 	break;
 
+case 'l':
+	sprintf(buf, "%d", cur_line);
+	break;
 case '?':
 	/* not here yet */
 	break;
diff --git a/src/bin/psql/prompt.h b/src/bin/psql/prompt.h
index 4d2f7e3..f1f80d2 100644
--- a/src/bin/psql/prompt.h
+++ b/src/bin/psql/prompt.h
@@ -22,4 +22,7 @@ typedef enum _promptStatus
 
 char	   *get_prompt(promptStatus_t status);
 
+/* Current line number */
+int			cur_line;
+
 #endif   /* PROMPT_H */



Re: [HACKERS] Missing autocomplete for CREATE DATABASE

2014-07-11 Thread Magnus Hagander
On Fri, Jul 11, 2014 at 12:01 AM, Vik Fearing  wrote:
> On 07/10/2014 09:32 PM, Magnus Hagander wrote:
>> It seems psql is missing autocomplete entries for LC_COLLATE and
>> LC_CTYPE for the CREATE DATABASE command. Attached patch adds that.
>>
>> I doubt this is important enough to backpatch - thoughts?
>
> It won't apply to current head, but otherwise I see no problem with it.

Meh, thanks for pointing that out. I git-fetch:ed from the wrong repository :)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-11 Thread Simon Riggs
On 11 July 2014 10:23, Tomas Vondra  wrote:
> On 11 Červenec 2014, 9:27, Simon Riggs wrote:
>> On 9 July 2014 18:54, Tomas Vondra  wrote:
>>
>>> (1) size the buckets for NTUP_PER_BUCKET=1 (and use whatever number
>>> of batches this requires)
>>
>> If we start off by assuming NTUP_PER_BUCKET = 1, how much memory does
>> it save to recalculate the hash bucket at 10 instead?
>> Resizing sounds like it will only be useful of we only just overflow our
>> limit.
>>
>> If we release next version with this as a hardcoded change, my
>> understanding is that memory usage for hash joins will leap upwards,
>> even if the run time of queries reduces. It sounds like we need some
>> kind of parameter to control this. "We made it faster" might not be
>> true if we run this on servers that are already experiencing high
>> memory pressure.
>
> Sure. We certainly don't want to make things worse for environments with
> memory pressure.
>
> The current implementation has two issues regarding memory:
>
> (1) It does not include buckets into used memory, i.e. it's not included
> into work_mem (so we may overflow work_mem). I plan to fix this, to make
> work_mem a bit more correct, as it's important for cases with
> NTUP_PER_BUCKET=1.
>
> (2) There's a significant palloc overhead, because of allocating each
> tuple separately - see my message from yesterday, where I observed the
> batch memory context to get 1.4GB memory for 700MB of tuple data. By
> densely packing the tuples, I got down to ~700MB (i.e. pretty much no
> overhead).
>
> The palloc overhead seems to be 20B (on x86_64) per tuple, and eliminating
> this it more than compensates for ~8B per tuple, required for
> NTUP_PER_BUCKET=1. And fixing (1) makes it more correct / predictable.
>
> It also improves the issue that palloc overhead is not counted into
> work_mem at all (that's why I got ~1.4GB batch context with work_mem=1GB).
>
> So in the end this should give us much lower memory usage for hash joins,
> even if we switch to NTUP_PER_BUCKET=1 (although that's pretty much
> independent change). Does that seem reasonable?

Yes, that alleviates my concern. Thanks.

> Regarding the tunable to control this - I certainly don't want another GUC
> no one really knows how to set right. And I think it's unnecessary thanks
> to the palloc overhead / work_mem accounting fix, described above.

Agreed, nor does anyone.

> The one thing I'm not sure about is what to do in case of reaching the
> work_mem limit (which should only happen with underestimated row count /
> row width) - how to decide whether to shrink the hash table or increment
> the number of batches. But this is not exclusive to NTUP_PER_BUCKET=1, it
> may happen with whatever NTUP_PER_BUCKET value you choose.
>
> The current code does not support resizing at all, so it always increments
> the number of batches, but I feel an "interleaved" approach might be more
> appropriate (nbuckets/2, nbatches*2, nbuckets/2, nbatches*2, ...). It'd be
> nice to have some cost estimates ('how expensive is a rescan' vs. 'how
> expensive is a resize'), but I'm not sure how to get that.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] No exact/lossy block information in EXPLAIN ANALYZE for a bitmap heap scan

2014-07-11 Thread Fujii Masao
On Fri, Jul 11, 2014 at 5:45 PM, Etsuro Fujita
 wrote:
> I've noticed that EXPLAIN ANALYZE shows no information on exact/lossy
> blocks for a bitmap heap scan when both the numbers of exact/lossy pages
> retrieved by the node are zero.  Such an example is shown below.  I
> think it would be better to suppress the 'Heap Blocks' line in that
> case, based on the same idea of the 'Buffers' line.  Patch attached.

The patch looks good to me. Barring any objection, I will commit this both
in HEAD and 9.4.

Regards,

-- 
Fujii Masao




[HACKERS] [PATCH] Fix search_path default value separator.

2014-07-11 Thread Christoph Martin
Hi

I noticed a minor inconsistency with the search_path separator used in the
default configuration.

The schemas of any search_path set using `SET search_path TO...` are
separated by ", " (comma, space), while the default value is only separated
by "," (comma).

The attached patch against master changes the separator of the default
value to be consistent with the usual comma-space separators, and updates
the documentation of `SHOW search_path;` accordingly.

This massive three-byte change passes all 144 tests of make check.

Regards,

Christoph
From 0f52f107af59f560212bff2bda74e643d63687f0 Mon Sep 17 00:00:00 2001
From: Christoph Martin 
Date: Fri, 11 Jul 2014 08:52:39 +0200
Subject: [PATCH] Fix search_path default value separator.

When setting the search_path with `SET search_path TO...`, the schemas
of the resulting search_path as reported by `SHOW search_path` are
separated by a comma followed by a single space. This patch applies the
same format to the default search_path, and updates the documentation
accordingly.
---
 doc/src/sgml/ddl.sgml | 2 +-
 src/backend/utils/misc/guc.c  | 2 +-
 src/backend/utils/misc/postgresql.conf.sample | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3b7fff4..2273616 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -1746,7 +1746,7 @@ SHOW search_path;
 
  search_path
 --
- "$user",public
+ "$user", public
 
 The first element specifies that a schema with the same name as
 the current user is to be searched.  If no such schema exists,
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 3a31a75..a3f1051 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2915,7 +2915,7 @@ static struct config_string ConfigureNamesString[] =
 			GUC_LIST_INPUT | GUC_LIST_QUOTE
 		},
 		&namespace_search_path,
-		"\"$user\",public",
+		"\"$user\", public",
 		check_search_path, assign_search_path, NULL
 	},
 
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 3f3e706..df98b02 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -499,7 +499,7 @@
 
 # - Statement Behavior -
 
-#search_path = '"$user",public'		# schema names
+#search_path = '"$user", public'	# schema names
 #default_tablespace = ''		# a tablespace name, '' uses the default
 #temp_tablespaces = ''			# a list of tablespace names, '' uses
 	# only default tablespace
-- 
2.0.1




Re: [HACKERS] Allowing join removals for more join types

2014-07-11 Thread David Rowley
On Wed, Jul 9, 2014 at 12:59 PM, Tom Lane  wrote:

> David Rowley  writes:
> > On 9 July 2014 09:27, Tom Lane  wrote:
> >> On review it looks like analyzejoins.c would possibly benefit from an
> >> earlier fast-path check as well.
>
> > Do you mean for non-subqueries? There already is a check to see if the
> > relation has no indexes.
>
> Oh, sorry, that was a typo: I meant to write pathnode.c.  Specifically,
> we could skip the translate_sub_tlist step.  Admittedly that's not
> hugely expensive, but as long as we have the infrastructure for a quick
> check it might be worth doing.
>
>
> >> TBH I find the checks for FOR UPDATE and volatile functions to be
> >> questionable as well.
>
> > Well, that's a real tough one for me as I only added that based on what
> you
> > told me here:
> >> I doubt you should drop a subquery containing FOR UPDATE, either.
> >> That's a side effect, just as much as a volatile function would be.
>
> Hah ;-).  But the analogy to qual pushdown hadn't occurred to me at the
> time.
>
>
Ok, I've removed the check for volatile functions and FOR UPDATE.


> > As far as I know the FOR UPDATE check is pretty much void as of now
> anyway,
> > since the current state of query_is_distinct_for() demands that there's
> > either a DISTINCT, GROUP BY or just a plain old aggregate without any
> > grouping, which will just return a single row, neither of these will
> allow
> > FOR UPDATE anyway.
>
> True.
>
> > So the effort here should be probably be more focused on if we should
> allow
> > the join removal when the subquery contains volatile functions. We should
> > probably think fairly hard on this now as I'm still planning on working
> on
> > INNER JOIN removals at some point and I'm thinking we should likely be
> > consistent between the 2 types of join for when it comes to FOR UPDATE
> and
> > volatile functions, and I'm thinking right now that for INNER JOINs that,
> > since they're INNER that we could remove either side of the join. In that
> > case maybe it would be harder for the user to understand why their
> volatile
> > function didn't get executed.
>
> Color me dubious.  In exactly what circumstances would it be valid to
> suppress an inner join involving a sub-select?
>
>
Hmm, I probably didn't think this through enough before commenting, as I
don't actually have any plans for subselects with INNER JOINs.  That said, I
guess there are cases that could be removed: anything that queries a single
table with a foreign key matching the join condition, where the subquery does
not filter or group the results.  Obviously something about the query would
have to prevent it from being flattened, perhaps some window functions...


I've attached an updated patch which puts in some fast path code for
subquery type joins. I'm really not too sure on a good name for this
function. I've ended up with query_supports_distinctness() which I'm not
that keen on, but I didn't manage to come up with anything better.

Regards

David Rowley


subquery_leftjoin_removal_v1.6.patch
Description: Binary data



Re: [HACKERS] add line number as prompt option to psql

2014-07-11 Thread Sawada Masahiko
On Fri, Jul 11, 2014 at 4:23 PM, Jeevan Chalke
 wrote:
> Hi,
>
> A.
>
> However, this introduced a new bug. As I said, when the number of lines in
> the editor reaches INT_MAX, the counter starts giving negative numbers. You
> tried to overcome this issue by adding a "< 0" check, but I think the value
> is again set incorrectly: you are setting it to INT_MAX - 1, which forces
> every new line to show INT_MAX - 1 as its line number, which is incorrect.
>
> postgres=# \set PROMPT1 '%/[%l]%R%# '
> postgres[1]=# \set PROMPT2 '%/[%l]%R%# '
> postgres[1]=# \e
> postgres[2147483646]-# limit
> postgres[2147483646]-# 1;
>
>relname
> --
>  pg_statistic
> (1 row)
>
> postgres[1]=# \e
> postgres[2147483646]-# =
> postgres[2147483646]-# '
> postgres[2147483646]'# abc
> postgres[2147483646]'# '
> postgres[2147483646]-# ;
>  relname
> -
> (0 rows)
>
>
> postgres[1]=# select
> relname
> from
> pg_class
> where
> relname
> =
> '
> abc
> '
> ;
>
>
> Again, to reproduce that, I simply initialized newline to INT_MAX - 2.
> Please don't take me wrong, but it seems that your unit testing is not
> enough. I discovered the above issue by doing exactly the same steps I did
> when reviewing the previous patch; if you had tested your new patch with
> those steps, I guess you would have caught it yourself.
>

To make sure I understand correctly: you mean it is incorrect that the line
number stays unchanged once it has reached INT_MAX?
And that the line number should wrap around to 1 when it reaches INT_MAX?

>
> B.
>
> + /* Calculate the line number */
> + if (scan_result != PSCAN_INCOMPLETE)
> + {
> + /* The one new line is always added to tail of query_buf
> */
> + newline = (newline != 0) ? (newline + 1) : 1;
> + cur_line += newline;
> + }
>
> Here in above code changes, in any case you are adding 1 to newline. i.e.
> when it is 0 you are setting it 1 (+1) and when > 0 you are setting nl + 1
> (again +1).
> So why can't you simply use"
> if (scan_result != PSCAN_INCOMPLETE)
> cur_line += (newline + 1);
>
> Or better, why can't you initialize newline with 1 itself and then directly
> assign cur_line with newline. That will eliminate above entire code block,
> isn't it?
> Or much better, simply get rid of newline, and use cur_line itself, will
> this work well for you?

This is better.  I will change the code to do this.

>
> C. Typos:
> 1.
> /* Count the number of new line for calculate ofline number */
>
> Missing space between 'of' and 'line'.
> However better improve that to something like (just suggestion):
> "Count the number of new lines to correctly adjust current line number"
>
> 2.
> /* Avoid cur_line and newline exceeds the INT_MAX */
>
> You are saying avoid cur_line AND newline, but there is no adjustment for
> newline in the code following the comment.
>
> Thanks
> --
> Jeevan B Chalke
> Principal Software Engineer, Product Development
> EnterpriseDB Corporation
> The Enterprise PostgreSQL Company
>

Thanks.
I will fix it.

-- 
Regards,

---
Sawada Masahiko




Re: [HACKERS] Securing "make check" (CVE-2014-0067)

2014-07-11 Thread Christoph Berg
Re: To Bruce Momjian 2014-07-11 <20140711093923.ga3...@msg.df7cb.de>
> Re: Bruce Momjian 2014-07-08 <20140708202114.gd9...@momjian.us>
> > > > > I believe pg_upgrade itself still needs a fix. While it's not a
> > > > > security problem to put the socket in $CWD while upgrading (it is
> > > > > using -c unix_socket_permissions=0700), this behavior is pretty
> > > > > unexpected, and does fail if your $CWD is > 107 bytes.
> > > > > 
> > > > > In f545d233ebce6971b6f9847680e48b679e707d22 Peter fixed the pg_ctl
> > > > > perl tests to avoid that problem, so imho it would make even more
> > > > > sense to fix pg_upgrade which could also fail in production.
> > > > 
> > > > +1.  Does writing that patch interest you?
> > > 
> > > I'll give it a try once I've finished this CF review.
> > 
> > OK.  Let me know if you need help.
> 
> Here's the patch. Proposed commit message:
> 
> Create pg_upgrade sockets in temp directories
> 
> pg_upgrade used to use the current directory for UNIX sockets to
> access the old/new cluster.  This fails when the current path is
> > 107 bytes.  Fix by reusing the tempdir code from pg_regress
> introduced in be76a6d39e2832d4b88c0e1cc381aa44a7f86881.  For cleanup,
> we need to remember up to two directories.

Uh... now really.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/
diff --git a/contrib/pg_upgrade/option.c b/contrib/pg_upgrade/option.c
index b81010a..121a3d4 100644
--- a/contrib/pg_upgrade/option.c
+++ b/contrib/pg_upgrade/option.c
@@ -14,6 +14,7 @@
 
 #include "pg_upgrade.h"
 
+#include 
 #include 
 #include 
 #ifdef WIN32
@@ -26,6 +27,13 @@ static void check_required_directory(char **dirpath, char **configpath,
    char *envVarName, char *cmdLineOption, char *description);
 #define FIX_DEFAULT_READ_ONLY "-c default_transaction_read_only=false"
 
+#ifdef HAVE_UNIX_SOCKETS
+/* make_temp_sockdir() is invoked at most twice from pg_upgrade.c via get_sock_dir() */
+#define MAX_TEMPDIRS 2
+static int n_tempdirs = 0;	/* actual number of directories created */
+static const char *temp_sockdir[MAX_TEMPDIRS];
+#endif
+
 
 UserOpts	user_opts;
 
@@ -396,6 +404,86 @@ adjust_data_dir(ClusterInfo *cluster)
 	check_ok();
 }
 
+#ifdef HAVE_UNIX_SOCKETS
+/*
+ * Remove the socket temporary directories.  pg_ctl waits for postmaster
+ * shutdown, so we expect the directory to be empty, unless we are interrupted
+ * by a signal, in which case the postmaster will clean up the sockets, but
+ * there's a race condition with us removing the directory.  Ignore errors;
+ * leaking a (usually empty) temporary directory is unimportant.  This can run
+ * from a signal handler.  The code is not acceptable in a Windows signal
+ * handler (see initdb.c:trapsig()), but Windows is not a HAVE_UNIX_SOCKETS
+ * platform.
+ */
+static void remove_temp(void)
+{
+	int			i;
+
+	for (i = 0; i < n_tempdirs; i++)
+	{
+		Assert(temp_sockdir[i]);
+		rmdir(temp_sockdir[i]);
+	}
+}
+
+/*
+ * Signal handler that calls remove_temp() and reraises the signal.
+ */
+static void
+signal_remove_temp(int signum)
+{
+	remove_temp();
+
+	pqsignal(signum, SIG_DFL);
+	raise(signum);
+}
+
+/*
+ * Create a temporary directory suitable for the server's Unix-domain socket.
+ * The directory will have mode 0700 or stricter, so no other OS user can open
+ * our socket to exploit it independently from the auth method used.  Most
+ * systems constrain the length of socket paths well below _POSIX_PATH_MAX, so
+ * we place the directory under /tmp rather than relative to the possibly-deep
+ * current working directory.
+ *
+ * Use a temporary directory so that no connections arrive other than those
+ * pg_upgrade initiates itself.  Compared to using the compiled-in
+ * DEFAULT_PGSOCKET_DIR, this also permits pg_upgrade to work in builds that
+ * relocate it to a directory not writable to the cluster owner.
+ */
+static const char *
+make_temp_sockdir(void)
+{
+	char	   *template = strdup("/tmp/pg_upgrade-XXXXXX");
+
+	Assert(n_tempdirs < MAX_TEMPDIRS);
+	temp_sockdir[n_tempdirs] = mkdtemp(template);
+	if (temp_sockdir[n_tempdirs] == NULL)
+	{
+		fprintf(stderr, _("%s: could not create directory \"%s\": %s\n"),
+os_info.progname, template, strerror(errno));
+		exit(2);
+	}
+
+	/*
+	 * Remove the directories during clean exit.  Register the handler only
+	 * once, though.
+	 */
+	if (n_tempdirs == 0)
+		atexit(remove_temp);
+
+	/*
+	 * Remove the directory before dying to the usual signals.  Omit SIGQUIT,
+	 * preserving it as a quick, untidy exit.
+	 */
+	pqsignal(SIGHUP, signal_remove_temp);
+	pqsignal(SIGINT, signal_remove_temp);
+	pqsignal(SIGPIPE, signal_remove_temp);
+	pqsignal(SIGTERM, signal_remove_temp);
+
+	return temp_sockdir[n_tempdirs++];
+}
+#endif   /* HAVE_UNIX_SOCKETS */
 
 /*
  * get_sock_dir
@@ -418,10 +506,8 @@ get_sock_dir(ClusterInfo *cluster, bool live_check)
 	{
 		if (!live_check)
 		{
-			/* Use the current directory for the socket */
-			cluster->sockdir = pg_malloc(MAXPGPATH);
-			if (!getcwd(cluster->sockdir, MAXPGP

Re: [HACKERS] Securing "make check" (CVE-2014-0067)

2014-07-11 Thread Christoph Berg
Re: Bruce Momjian 2014-07-08 <20140708202114.gd9...@momjian.us>
> > > > I believe pg_upgrade itself still needs a fix. While it's not a
> > > > security problem to put the socket in $CWD while upgrading (it is
> > > > using -c unix_socket_permissions=0700), this behavior is pretty
> > > > unexpected, and does fail if your $CWD is > 107 bytes.
> > > > 
> > > > In f545d233ebce6971b6f9847680e48b679e707d22 Peter fixed the pg_ctl
> > > > perl tests to avoid that problem, so imho it would make even more
> > > > sense to fix pg_upgrade which could also fail in production.
> > > 
> > > +1.  Does writing that patch interest you?
> > 
> > I'll give it a try once I've finished this CF review.
> 
> OK.  Let me know if you need help.

Here's the patch. Proposed commit message:

Create pg_upgrade sockets in temp directories

pg_upgrade used to use the current directory for UNIX sockets to
access the old/new cluster.  This fails when the current path is
> 107 bytes.  Fix by reusing the tempdir code from pg_regress
introduced in be76a6d39e2832d4b88c0e1cc381aa44a7f86881.  For cleanup,
we need to remember up to two directories.

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] pg_receivexlog and replication slots

2014-07-11 Thread Andres Freund
On 2014-07-11 11:18:58 +0200, Magnus Hagander wrote:
> On Fri, Jul 11, 2014 at 11:14 AM, Andres Freund  
> wrote:
> > On 2014-07-11 11:08:48 +0200, Magnus Hagander wrote:
> >> Is there a particular reason why pg_receivexlog only supports using
> >> manually created slots but pg_recvlogical supports creating and
> >> dropping them? Wouldn't it be good for consistency there?
> >
> > I've added it to recvlogical because logical decoding isn't usable
> > without slots. I'm not sure what you'd want to create/drop a slot from
> > receivexlog for, but I'm not adverse to adding the capability.
> 
> I was mostly thinking for consistency, really. Using slots with
> pg_receivexlog makes quite a bit of sense, even though it's useful
> without it. But having the ability to create and drop (with compatible
> commandline arguments of course) them directly when you set it up
> would certainly be more convenient.

Ok. Do you plan to take care of it? If so, I'd be fine with backpatching
it. I'm not likely to get to it right now :(

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-11 Thread Tomas Vondra
On 11 Červenec 2014, 9:27, Simon Riggs wrote:
> On 9 July 2014 18:54, Tomas Vondra  wrote:
>
>> (1) size the buckets for NTUP_PER_BUCKET=1 (and use whatever number
>> of batches this requires)
>
> If we start off by assuming NTUP_PER_BUCKET = 1, how much memory does
> it save to recalculate the hash bucket at 10 instead?
> Resizing sounds like it will only be useful of we only just overflow our
> limit.
>
> If we release next version with this as a hardcoded change, my
> understanding is that memory usage for hash joins will leap upwards,
> even if the run time of queries reduces. It sounds like we need some
> kind of parameter to control this. "We made it faster" might not be
> true if we run this on servers that are already experiencing high
> memory pressure.

Sure. We certainly don't want to make things worse for environments with
memory pressure.

The current implementation has two issues regarding memory:

(1) It does not include buckets into used memory, i.e. it's not included
into work_mem (so we may overflow work_mem). I plan to fix this, to make
work_mem a bit more correct, as it's important for cases with
NTUP_PER_BUCKET=1.

(2) There's a significant palloc overhead, because of allocating each
tuple separately - see my message from yesterday, where I observed the
batch memory context to get 1.4GB memory for 700MB of tuple data. By
densely packing the tuples, I got down to ~700MB (i.e. pretty much no
overhead).

The palloc overhead seems to be 20B (on x86_64) per tuple, and eliminating
it more than compensates for the ~8B per tuple required for
NTUP_PER_BUCKET=1. And fixing (1) makes it more correct / predictable.

It also improves the issue that palloc overhead is not counted into
work_mem at all (that's why I got ~1.4GB batch context with work_mem=1GB).

So in the end this should give us much lower memory usage for hash joins,
even if we switch to NTUP_PER_BUCKET=1 (although that's pretty much
independent change). Does that seem reasonable?
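
To illustrate the dense packing idea (this is only a toy sketch with made-up
names; the real patch's chunk layout and accounting differ): tuples are
copied into large chunks instead of being allocated one by one, so the
per-allocation overhead disappears.

#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE	(32 * 1024)

typedef struct TupleChunk
{
	struct TupleChunk *next;
	size_t		used;
	char		data[CHUNK_SIZE];
} TupleChunk;

/* Copy one tuple into the current chunk, starting a new chunk when the
 * current one is full.  Tuples larger than CHUNK_SIZE are not handled. */
static char *
dense_copy(TupleChunk **chunks, const void *tuple, size_t len)
{
	TupleChunk *chunk = *chunks;
	char	   *dest;

	if (chunk == NULL || chunk->used + len > CHUNK_SIZE)
	{
		chunk = malloc(sizeof(TupleChunk));
		if (chunk == NULL)
			return NULL;
		chunk->next = *chunks;
		chunk->used = 0;
		*chunks = chunk;
	}
	dest = chunk->data + chunk->used;
	memcpy(dest, tuple, len);
	chunk->used += len;
	return dest;
}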

Regarding the tunable to control this - I certainly don't want another GUC
no one really knows how to set right. And I think it's unnecessary thanks
to the palloc overhead / work_mem accounting fix, described above.

The one thing I'm not sure about is what to do in case of reaching the
work_mem limit (which should only happen with underestimated row count /
row width) - how to decide whether to shrink the hash table or increment
the number of batches. But this is not exclusive to NTUP_PER_BUCKET=1, it
may happen with whatever NTUP_PER_BUCKET value you choose.

The current code does not support resizing at all, so it always increments
the number of batches, but I feel an "interleaved" approach might be more
appropriate (nbuckets/2, nbatches*2, nbuckets/2, nbatches*2, ...). It'd be
nice to have some cost estimates ('how expensive is a rescan' vs. 'how
expensive is a resize'), but I'm not sure how to get that.
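
Purely to illustrate the alternation (not a proposal of actual code), the
interleaved policy could be as simple as:

#include <stdbool.h>

/* When the hash table exceeds its memory budget, alternate between halving
 * the bucket array and doubling the number of batches. */
static void
grow_interleaved(int *nbuckets, int *nbatches, bool *shrink_buckets_next)
{
	if (*shrink_buckets_next && *nbuckets > 1)
		*nbuckets /= 2;
	else
		*nbatches *= 2;
	*shrink_buckets_next = !*shrink_buckets_next;
}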

regards
Tomas





[HACKERS] Incorrect comment in postgres_fdw.c

2014-07-11 Thread Etsuro Fujita
I think the following comment for store_returning_result() in
postgres_fdw.c is not right.

/* PGresult must be released before leaving this function. */

I think the PGresult should not be released before leaving that function *on
success*.

(I guess the comment has been copied and pasted from that for
get_remote_estimate().)

Thanks,

Best regards,
Etsuro Fujita
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 7dd43a9..f328833 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2257,7 +2257,6 @@ static void
 store_returning_result(PgFdwModifyState *fmstate,
 	   TupleTableSlot *slot, PGresult *res)
 {
-	/* PGresult must be released before leaving this function. */
 	PG_TRY();
 	{
 		HeapTuple	newtup;



Re: [HACKERS] pg_receivexlog and replication slots

2014-07-11 Thread Magnus Hagander
On Fri, Jul 11, 2014 at 11:14 AM, Andres Freund  wrote:
> On 2014-07-11 11:08:48 +0200, Magnus Hagander wrote:
>> Is there a particular reason why pg_receivexlog only supports using
>> manually created slots but pg_recvlogical supports creating and
>> dropping them? Wouldn't it be good for consistency there?
>
> I've added it to recvlogical because logical decoding isn't usable
> without slots. I'm not sure what you'd want to create/drop a slot from
> receivexlog for, but I'm not averse to adding the capability.

I was mostly thinking for consistency, really. Using slots with
pg_receivexlog makes quite a bit of sense, even though it's useful
without it. But having the ability to create and drop (with compatible
commandline arguments of course) them directly when you set it up
would certainly be more convenient.
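
(For reference, a rough libpq sketch of issuing the slot-creation command
over a physical replication connection; the connection string and slot name
are made up, and this is not pg_receivexlog's code:)

#include <stdio.h>
#include "libpq-fe.h"

int
main(void)
{
	PGconn	   *conn = PQconnectdb("host=localhost user=replicator "
								   "replication=true");
	PGresult   *res;

	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		return 1;
	}

	/* the walsender grammar quoted below accepts exactly this command */
	res = PQexec(conn, "CREATE_REPLICATION_SLOT myslot PHYSICAL");
	if (PQresultStatus(res) != PGRES_TUPLES_OK)
		fprintf(stderr, "could not create slot: %s", PQerrorMessage(conn));

	PQclear(res);
	PQfinish(conn);
	return 0;
}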


>> I'm guessing it's related to not being exposed over the replication
>> protocol?
>
> It's exposed:
> create_replication_slot:
> /* CREATE_REPLICATION_SLOT slot PHYSICAL */
> K_CREATE_REPLICATION_SLOT IDENT K_PHYSICAL

Hmm. You know, it would help if I actually looked at a 9.4 version of
the file when I check for functions of this kind :) Thanks!

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] pg_receivexlog and replication slots

2014-07-11 Thread Andres Freund
On 2014-07-11 11:08:48 +0200, Magnus Hagander wrote:
> Is there a particular reason why pg_receivexlog only supports using
> manually created slots but pg_recvlogical supports creating and
> dropping them? Wouldn't it be good for consistency there?

I've added it to recvlogical because logical decoding isn't usable
without slots. I'm not sure what you'd want to create/drop a slot from
receivexlog for, but I'm not averse to adding the capability.

> I'm guessing it's related to not being exposed over the replication
> protocol?

It's exposed:
create_replication_slot:
/* CREATE_REPLICATION_SLOT slot PHYSICAL */
K_CREATE_REPLICATION_SLOT IDENT K_PHYSICAL

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] pg_receivexlog and replication slots

2014-07-11 Thread Magnus Hagander
Is there a particular reason why pg_receivexlog only supports using
manually created slots but pg_recvlogical supports creating and
dropping them? Wouldn't it be good for consistency there?

I'm guessing it's related to not being exposed over the replication
protocol? We had a discussion earlier that I remember about being able
to use an "auto-drop" slot in pg_basebackup, but this would be
different - this would be about creating and dropping a regular
physical replication slot...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] RLS Design

2014-07-11 Thread Stephen Frost
On Thursday, July 10, 2014, Robert Haas  wrote:

> On Wed, Jul 9, 2014 at 2:13 AM, Stephen Frost  > wrote:
> > Yes, this would be possible (and is nearly identical to the original
> > patch, except that this includes per-role considerations), however, my
> > thinking is that it'd be simpler to work with policy names rather than
> > sets of quals, to use when mapping to roles, and they would potentially
> > be useful later for other things (eg: for setting up which policies
> > should be applied when, or which should be OR' or AND"d with other
> > policies, or having groups of policies, etc).
>
> Hmm.  I guess that's reasonable.  Should the policy be a per-table
> object (like rules, constraints, etc.) instead of a global object?
>
> You could do:
>
> ALTER TABLE table_name ADD POLICY policy_name (quals);
> ALTER TABLE table_name POLICY FOR role_name IS policy_name;
> ALTER TABLE table_name DROP POLICY policy_name;
>

Right, I was thinking they would be per table as they would specifically
provide a name for a set of quals, and quals are naturally table-specific.
I don't see a need to have them be global; that had been brought up before
with the notion of applications picking their policy, but we could also add
that later through another term (e.g. contexts) which would then map to
policies or similar. We could even extend policies to be global by mapping
existing per-table ones to global ones if we really needed to...

My feeling at the moment is that having them be per-table makes sense and
we'd still have flexibility to change later if we had some compelling
reason to do so.

Thanks!

Stephen


[HACKERS] No exact/lossy block information in EXPLAIN ANALYZE for a bitmap heap scan

2014-07-11 Thread Etsuro Fujita
I've noticed that EXPLAIN ANALYZE shows no information on exact/lossy
blocks for a bitmap heap scan when both the numbers of exact/lossy pages
retrieved by the node are zero.  Such an example is shown below.  I
think it would be better to suppress the 'Heap Blocks' line in that
case, based on the same idea of the 'Buffers' line.  Patch attached.

postgres=# explain (analyze, verbose, buffers) select * from test where
id < 10;
QUERY PLAN
--
 Bitmap Heap Scan on public.test  (cost=4.29..8.31 rows=1 width=29)
(actual time=0.006..0.006 rows=0 loops=1)
   Output: id, inserted, data
   Recheck Cond: (test.id < 10)
   Heap Blocks:
   Buffers: shared hit=2
   ->  Bitmap Index Scan on test_pkey  (cost=0.00..4.29 rows=1 width=0)
(actual time=0.003..0.003 rows=0 loops=1)
 Index Cond: (test.id < 10)
 Buffers: shared hit=2
 Planning time: 0.118 ms
 Execution time: 0.027 ms
(10 rows)

Thanks,

Best regards,
Etsuro Fujita
*** a/src/backend/commands/explain.c
--- b/src/backend/commands/explain.c
***
*** 1937,1949  show_tidbitmap_info(BitmapHeapScanState *planstate, ExplainState *es)
  	}
  	else
  	{
! 		appendStringInfoSpaces(es->str, es->indent * 2);
! 		appendStringInfoString(es->str, "Heap Blocks:");
! 		if (planstate->exact_pages > 0)
! 			appendStringInfo(es->str, " exact=%ld", planstate->exact_pages);
! 		if (planstate->lossy_pages > 0)
! 			appendStringInfo(es->str, " lossy=%ld", planstate->lossy_pages);
! 		appendStringInfoChar(es->str, '\n');
  	}
  }
  
--- 1937,1952 
  	}
  	else
  	{
! 		if (planstate->exact_pages > 0 || planstate->lossy_pages > 0)
! 		{
! 			appendStringInfoSpaces(es->str, es->indent * 2);
! 			appendStringInfoString(es->str, "Heap Blocks:");
! 			if (planstate->exact_pages > 0)
! appendStringInfo(es->str, " exact=%ld", planstate->exact_pages);
! 			if (planstate->lossy_pages > 0)
! appendStringInfo(es->str, " lossy=%ld", planstate->lossy_pages);
! 			appendStringInfoChar(es->str, '\n');
! 		}
  	}
  }
  



Re: [HACKERS] [REVIEW] Re: Fix xpath() to return namespace definitions

2014-07-11 Thread Ali Akbar
Greetings,

Attached is a modified patch that handles xmlCopyNode returning NULL.  The
patch is larger because xmlerrcxt must be passed to xml_xmlnodetoxmltype
(xmlerrcxt is needed for calling xml_ereport).

From the libxml2 source, it turns out that the other cases in which
xmlCopyNode can return NULL cannot happen here.  So in this patch I assume
that the only remaining case is memory exhaustion.

But I found a bug in libxml2's code: because we call xmlCopyNode with 1 as
the second parameter, xmlCopyNode internally calls xmlStaticCopyNode
recursively via xmlStaticCopyNodeList, and xmlStaticCopyNodeList does not
check whether the return value of xmlStaticCopyNode is NULL.  So if that
recursion ever returns NULL (perhaps because of memory exhaustion), it will
segfault.

Knowing about this bug in libxml2's code, is this patch still acceptable?
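
For reference, a small standalone libxml2 sketch of the defensive pattern
(independent of the backend's error-context machinery; the function name is
made up):

#include <libxml/tree.h>

/* Serialize a node together with the namespace definitions it inherits from
 * its ancestors; return NULL instead of crashing if the copy fails. */
static xmlBufferPtr
dump_node_with_namespaces(xmlNodePtr node)
{
	xmlNodePtr	copy = xmlCopyNode(node, 1);	/* 1 = recursive copy */
	xmlBufferPtr buf;

	if (copy == NULL)			/* treat as out of memory */
		return NULL;

	buf = xmlBufferCreate();
	if (buf != NULL)
		xmlNodeDump(buf, NULL, copy, 0, 1);

	xmlFreeNode(copy);
	return buf;
}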

Thanks,
Ali Akbar
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
***
*** 141,149  static bool print_xml_decl(StringInfo buf, const xmlChar *version,
  			   pg_enc encoding, int standalone);
  static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg,
  		  bool preserve_whitespace, int encoding);
! static text *xml_xmlnodetoxmltype(xmlNodePtr cur);
  static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate);
  #endif   /* USE_LIBXML */
  
  static StringInfo query_to_xml_internal(const char *query, char *tablename,
--- 141,151 
  			   pg_enc encoding, int standalone);
  static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg,
  		  bool preserve_whitespace, int encoding);
! static text *xml_xmlnodetoxmltype(xmlNodePtr cur,
! 	   PgXmlErrorContext *xmlerrcxt);
  static int xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate,
! 	   PgXmlErrorContext *xmlerrcxt);
  #endif   /* USE_LIBXML */
  
  static StringInfo query_to_xml_internal(const char *query, char *tablename,
***
*** 3595,3620  SPI_sql_row_to_xmlelement(int rownum, StringInfo result, char *tablename,
   * return value otherwise)
   */
  static text *
! xml_xmlnodetoxmltype(xmlNodePtr cur)
  {
  	xmltype*result;
  
  	if (cur->type == XML_ELEMENT_NODE)
  	{
  		xmlBufferPtr buf;
  
  		buf = xmlBufferCreate();
  		PG_TRY();
  		{
! 			xmlNodeDump(buf, NULL, cur, 0, 1);
  			result = xmlBuffer_to_xmltype(buf);
  		}
  		PG_CATCH();
  		{
  			xmlBufferFree(buf);
  			PG_RE_THROW();
  		}
  		PG_END_TRY();
  		xmlBufferFree(buf);
  	}
  	else
--- 3597,3636 
   * return value otherwise)
   */
  static text *
! xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
  {
  	xmltype*result;
  
  	if (cur->type == XML_ELEMENT_NODE)
  	{
  		xmlBufferPtr buf;
+ 		xmlNodePtr cur_copy;
  
  		buf = xmlBufferCreate();
+ 
+ 		/* the result of xmlNodeDump won't contain namespace definitions
+ 		 * from parent nodes, but xmlCopyNode duplicates a node along
+ 		 * with its required namespace definitions.
+ 		 */
+ 		cur_copy = xmlCopyNode(cur, 1);
+ 
+ 		if (cur_copy == NULL)
+ 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
+ 		"could not serialize xml");
+ 
  		PG_TRY();
  		{
! 			xmlNodeDump(buf, NULL, cur_copy, 0, 1);
  			result = xmlBuffer_to_xmltype(buf);
  		}
  		PG_CATCH();
  		{
+ 			xmlFreeNode(cur_copy);
  			xmlBufferFree(buf);
  			PG_RE_THROW();
  		}
  		PG_END_TRY();
+ 		xmlFreeNode(cur_copy);
  		xmlBufferFree(buf);
  	}
  	else
***
*** 3656,3662  xml_xmlnodetoxmltype(xmlNodePtr cur)
   */
  static int
  xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate)
  {
  	int			result = 0;
  	Datum		datum;
--- 3672,3679 
   */
  static int
  xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
! 	   ArrayBuildState **astate,
! 	   PgXmlErrorContext *xmlerrcxt)
  {
  	int			result = 0;
  	Datum		datum;
***
*** 3678,3684  xml_xpathobjtoxmlarray(xmlXPathObjectPtr xpathobj,
  
  	for (i = 0; i < result; i++)
  	{
! 		datum = PointerGetDatum(xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i]));
  		*astate = accumArrayResult(*astate, datum,
     false, XMLOID,
     CurrentMemoryContext);
--- 3695,3702 
  
  	for (i = 0; i < result; i++)
  	{
! 		datum = PointerGetDatum(xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i],
! 	 xmlerrcxt));
  		*astate = accumArrayResult(*astate, datum,
     false, XMLOID,
     CurrentMemoryContext);
***
*** 3882,3890  xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
  		 * Extract the results as requested.
  		 */
  		if (res_nitems != NULL)
! 			*res_nitems = xml_xpathobjtoxmlarray(xpathobj, astate);
  		else
! 			(void) xml_xpathobjtoxmlarray(xpathobj, astate);
  	}
  	PG_CATCH();
  	{
--- 3900,3908 
  		 * Extract the results as requested.
  		 */
  		if (res_nitems != NULL)
! 			*res_nitems = xml_xpa

[HACKERS] Is there a way to temporarily disable a index

2014-07-11 Thread Benedikt Grundmann
That is, is it possible to tell the planner that an index is off limits,
i.e. that it should never generate a plan using it?

Rationale: schema changes on big tables.  I might have convinced myself that
for all the queries I need to be fast, the planner does not need to use a
given index (e.g. the other possible plans are fast enough).  However, if I
just drop the index and it turns out I'm wrong, I might be in a world of pain
because it might take way too long to recreate the index.

I know that I can use the pg_stat* views to figure out whether an index is
used at all.  But in the presence of multiple indexes and complex queries,
the planner might prefer the index-to-be-dropped even though the difference
from the available alternatives is immaterial.

The best alternative we currently have is to test such changes on a testing
database that gets restored regularly from production.  However, at least in
our case, we simply don't know all the possible queries (and logging all of
them is not an option).

Cheers,

Bene


Re: [HACKERS] tweaking NTUP_PER_BUCKET

2014-07-11 Thread Simon Riggs
On 9 July 2014 18:54, Tomas Vondra  wrote:

> (1) size the buckets for NTUP_PER_BUCKET=1 (and use whatever number
> of batches this requires)

If we start off by assuming NTUP_PER_BUCKET = 1, how much memory does
it save to recalculate the hash bucket at 10 instead?
Resizing sounds like it will only be useful if we only just overflow our limit.
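
For a rough sense of the numbers (a back-of-envelope sketch, not taken from
the patch): the bucket array itself is just an array of pointers whose length
is the tuple estimate divided by NTUP_PER_BUCKET, rounded up to a power of
two.

#include <stdio.h>

static size_t
bucket_array_bytes(double ntuples, int ntup_per_bucket)
{
	size_t		nbuckets = 1;

	while (nbuckets < ntuples / ntup_per_bucket)
		nbuckets *= 2;
	return nbuckets * sizeof(void *);
}

int
main(void)
{
	/* on a 64-bit build, 10M tuples give ~128 MB of buckets at 1 per
	 * bucket and ~8 MB at 10 per bucket */
	printf("%zu\n", bucket_array_bytes(10e6, 1));
	printf("%zu\n", bucket_array_bytes(10e6, 10));
	return 0;
}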

If we release next version with this as a hardcoded change, my
understanding is that memory usage for hash joins will leap upwards,
even if the run time of queries reduces. It sounds like we need some
kind of parameter to control this. "We made it faster" might not be
true if we run this on servers that are already experiencing high
memory pressure.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] add line number as prompt option to psql

2014-07-11 Thread Jeevan Chalke
Hi,

Found new issues with latest patch:


> Thank you for reviewing the patch with variable cases.
> I have revised the patch, and attached latest patch.
>
> > A:
> > Will you please explain the idea behind these changes ?
> My thinking about the newline added to the tail of query_buf was wrong.
> The latest patch does not include the changes related to that.
>
> Thanks.


> > B:
> I added the condition of cur_line < 0.
>
>
A.

However, this introduced a new bug.  As I said, when the number of lines in
the editor reaches INT_MAX, the counter starts giving negative numbers.  You
tried to overcome this issue by adding a "< 0" check, but I think the value
is again set incorrectly: you are setting it to INT_MAX - 1, which forces
every new line to show INT_MAX - 1 as its line number, which is incorrect.

postgres=# \set PROMPT1 '%/[%l]%R%# '
postgres[1]=# \set PROMPT2 '%/[%l]%R%# '
postgres[1]=# \e
postgres[2147483646]-# limit
postgres[2147483646]-# 1;
   relname
--
 pg_statistic
(1 row)

postgres[1]=# \e
postgres[2147483646]-# =
postgres[2147483646]-# '
postgres[2147483646]'# abc
postgres[2147483646]'# '
postgres[2147483646]-# ;
 relname
-
(0 rows)

postgres[1]=# select
relname
from
pg_class
where
relname
=
'
abc
'
;


Again, to reproduce that, I simply initialized newline to INT_MAX - 2.
Please don't take me wrong, but it seems that your unit testing is not
enough.  I discovered the above issue by doing exactly the same steps I did
when reviewing the previous patch; if you had tested your new patch with
those steps, I guess you would have caught it yourself.


B.

+ /* Calculate the line number */
+ if (scan_result != PSCAN_INCOMPLETE)
+ {
+ /* The one new line is always added to tail of query_buf
*/
+ newline = (newline != 0) ? (newline + 1) : 1;
+ cur_line += newline;
+ }

In the above code changes you are adding 1 to newline in every case, i.e.
when it is 0 you set it to 1 (+1), and when it is > 0 you set it to
newline + 1 (again +1).
So why can't you simply use:
if (scan_result != PSCAN_INCOMPLETE)
cur_line += (newline + 1);

Or better, why can't you initialize newline to 1 and then assign cur_line
directly from newline?  That would eliminate the entire code block above,
wouldn't it?
Or, better still, simply get rid of newline and use cur_line itself; would
that work for you?


C. Typos:
1.
/* Count the number of new line for calculate ofline number */

Missing space between 'of' and 'line'.
However, it would be better to improve it to something like this (just a suggestion):
"Count the number of new lines to correctly adjust current line number"

2.
/* Avoid cur_line and newline exceeds the INT_MAX */

You are saying avoid cur_line AND newline, but there is no adjustment for
newline in the code following the comment.

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] inherit support for foreign tables

2014-07-11 Thread Etsuro Fujita

(2014/07/11 15:50), Etsuro Fujita wrote:

(2014/07/10 18:12), Shigeru Hanada wrote:



IIUC, you mean that tableoid can't be retrieved when a foreign table
is accessed via parent table,


No.  What I want to say is that tableoid *can* be retrieved when a
foreign table is accessed via the parent table.


In fact, you can do that with v13 [1], but I plan to change the way it is
fixed (see [2]).


Thanks,

[1] http://www.postgresql.org/message-id/53b10914.2010...@lab.ntt.co.jp
[2] http://www.postgresql.org/message-id/53b2188b.4090...@lab.ntt.co.jp

Best regards,
Etsuro Fujita




Re: [HACKERS] Minmax indexes

2014-07-11 Thread Simon Riggs
On 10 July 2014 00:13, Alvaro Herrera  wrote:
> Josh Berkus wrote:
>> On 07/09/2014 02:16 PM, Alvaro Herrera wrote:
>> > The way it works now, each opclass needs to have three support
>> > procedures; I've called them getOpers, maybeUpdateValues, and compare.
>> > (I realize these names are pretty bad, and will be changing them.)
>>
>> I kind of like "maybeUpdateValues".  Very ... NoSQL-ish.  "Maybe update
>> the values, maybe not."  ;-)
>
> :-)  Well, that's exactly what happens.  If we insert a new tuple into
> the table, and the existing summarizing tuple (to use Peter's term)
> already covers it, then we don't need to update the index tuple at all.
> What this name doesn't say is what values are to be maybe-updated.

There are lots of functions that maybe-do-things, that's just modular
programming. Not sure we need to prefix things with maybe to explain
that, otherwise we'd have maybeXXX everywhere.

More descriptive name would be MaintainIndexBounds() or similar.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Minmax indexes

2014-07-11 Thread Simon Riggs
On 9 July 2014 23:54, Peter Geoghegan  wrote:
> On Wed, Jul 9, 2014 at 2:16 PM, Alvaro Herrera  
> wrote:
>> All this being said, I'm sticking to the name "Minmax indexes".  There
>> was a poll in pgsql-advocacy
>> http://www.postgresql.org/message-id/53a0b4f8.8080...@agliodbs.com
>> about a new name, but there were no suggestions supported by more than
>> one person.  If a brilliant new name comes up, I'm open to changing it.
>
> How about "summarizing indexes"? That seems reasonably descriptive.

-1 for another name change. That boat sailed some months back.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

