Re: [HACKERS] issue log message to suggest VACUUM FULL if a table is nearly empty

2014-03-26 Thread Robert Haas
On Sun, Mar 9, 2014 at 5:28 PM, Wang, Jing ji...@fast.au.fujitsu.com wrote:
 Enclosed is the patch to implement the requirement that issue log message to
 suggest VACUUM FULL if a table is nearly empty.

 The requirement comes from the Postgresql TODO list.

 If the relpage of the table  RELPAGES_VALUES_THRESHOLD(default 1000) then
 the table is considered to be large enough.

 If the free_space/total_space  FREESPACE_PERCENTAGE_THRESHOLD(default 0.5)
 then the table is considered to have large numbers of unused rows.

I'm not sure that we want people to automatically VF a table just
because it's 2x bloated.  Doesn't it depend on the table size?  And in
sort of a funny way, too, like, if the tables is small, 2x bloat is
not wasting much disk space, but getting rid of it is probably easy,
so maybe you should - but if the table is a terabyte, even 50% bloat
might be pretty intolerable, but whether it makes sense to try to get
rid of it depends on your access pattern.  I'm not really too sure
whether it makes sense to try to make an automated recommendation
here, or maybe only in egregious cases.

 The free_space is calculated by reading the details from the FSM pages. This
 may increase the IO, but expecting very less FSM pages thus it shouldn't
 cause

The free space map can show more or less than the real amount of free
space, can't it?  I worry about making a recommendation that might
turn out to be wildly inaccurate...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] New parameter RollbackError to control rollback behavior on error

2014-03-26 Thread Michael Paquier
Hi all,

The behavior of rollback when an error occurs on an handle is
controlled by extending Protocol with $PROTNUM-[0|1|2] where:
- 0 = let the application handle rollbacks
- 1 = rollback whole transaction when an error occurs
- 2 = rollback only statement that failed
Using such an extension is somewhat awkward as a single string is used
for two settings... The proposed attached patch adds a new parameter
called RollbackError that allows to control the behavior of rollback
on error with a different parameter.

For backward-compatibility purposes, this patch does not break the old
grammar of Protocol: it just gives the priority to RollbackError if
both Protocol and RollbackError are set for a connection. Regression
tests to test RollbackError and combinations of RollbackError/Protocol
are added in the patch in the existing test error-rollback (which has
needed some refactoring, older tests are not impacted). Docs are
included as well.

I thought first about including that in my cleanup work for 9.4, but
as this actually does not break the driver it may be worth adding it
directly to master, explaining the patch attached here. Comments
welcome. Note that if there are objections I do not mind adding that
for the work that would be merged later to 9.4 builds.

Regards,
-- 
Michael
diff --git a/connection.c b/connection.c
index 8601cb7..9407619 100644
--- a/connection.c
+++ b/connection.c
@@ -288,7 +288,8 @@ CC_conninfo_init(ConnInfo *conninfo, UInt4 option)
conninfo-bytea_as_longvarbinary = -1;
conninfo-use_server_side_prepare = -1;
conninfo-lower_case_identifier = -1;
-   conninfo-rollback_on_error = -1;
+   conninfo-rollback_error_value = -1;
+   conninfo-is_rollback_error = -1;
conninfo-force_abbrev_connstr = -1;
conninfo-bde_environment = -1;
conninfo-fake_mss = -1;
@@ -341,7 +342,8 @@ CC_copy_conninfo(ConnInfo *ci, const ConnInfo *sci)
CORR_VALCPY(bytea_as_longvarbinary);
CORR_VALCPY(use_server_side_prepare);
CORR_VALCPY(lower_case_identifier);
-   CORR_VALCPY(rollback_on_error);
+   CORR_VALCPY(rollback_error_value);
+   CORR_VALCPY(is_rollback_error);
CORR_VALCPY(force_abbrev_connstr);
CORR_VALCPY(bde_environment);
CORR_VALCPY(fake_mss);
@@ -2733,7 +2735,7 @@ CC_send_query_append(ConnectionClass *self, const char 
*query, QueryInfo *qi, UD
BOOLignore_abort_on_conn = ((flag  IGNORE_ABORT_ON_CONN) != 0),
create_keyset = ((flag  CREATE_KEYSET) != 0),
issue_begin = ((flag  GO_INTO_TRANSACTION) != 0  
!CC_is_in_trans(self)),
-   rollback_on_error, query_rollback, end_with_commit;
+   rollback_error_value, query_rollback, end_with_commit;
 
const char  *wq;
charswallow, *ptr;
@@ -2844,13 +2846,13 @@ CC_send_query_append(ConnectionClass *self, const char 
*query, QueryInfo *qi, UD
return res;
}
 
-   rollback_on_error = (flag  ROLLBACK_ON_ERROR) != 0;
+   rollback_error_value = (flag  ROLLBACK_ON_ERROR) != 0;
end_with_commit = (flag  END_WITH_COMMIT) != 0;
 #definereturn DONT_CALL_RETURN_FROM_HERE???
consider_rollback = (issue_begin || (CC_is_in_trans(self)  
!CC_is_in_error_trans(self)) || strnicmp(query, begin, 5) == 0);
-   if (rollback_on_error)
-   rollback_on_error = consider_rollback;
-   query_rollback = (rollback_on_error  !end_with_commit  
PG_VERSION_GE(self, 8.0));
+   if (rollback_error_value)
+   rollback_error_value = consider_rollback;
+   query_rollback = (rollback_error_value  !end_with_commit  
PG_VERSION_GE(self, 8.0));
if (!query_rollback  consider_rollback  !end_with_commit)
{
if (stmt)
@@ -3341,7 +3343,7 @@ cleanup:
CC_on_abort(self, CONN_DEAD);
retres = NULL;
}
-   if (rollback_on_error  CC_is_in_trans(self)  
!discard_next_savepoint)
+   if (rollback_error_value  CC_is_in_trans(self)  
!discard_next_savepoint)
{
charcmd[64];
 
diff --git a/connection.h b/connection.h
index e92ff61..453ed5b 100644
--- a/connection.h
+++ b/connection.h
@@ -287,7 +287,16 @@ typedef struct
chardatabase[MEDIUM_REGISTRY_LEN];
charusername[MEDIUM_REGISTRY_LEN];
pgNAME  password;
+
+   /*
+* Protocol number, this can be used as well to define the behavior
+* of rollback in case of error.
+*/
charprotocol[SMALL_REGISTRY_LEN];
+
+   /* Value of RollbackError */
+   charrollback_error[SMALL_REGISTRY_LEN];
+
charport[SMALL_REGISTRY_LEN];
charsslmode[16];
charonlyread[SMALL_REGISTRY_LEN];
@@ -308,7 +317,19 @@ typedef struct
signed char bytea_as_longvarbinary;
signed char 

[HACKERS] Conditional jump or move depends on uninitialised value(s) within tsginidx.c

2014-03-26 Thread Peter Geoghegan
I see this when I run the regression tests with valgrind (memcheck) on
master's tip:

2014-03-25 22:38:40.850 PDT 31570 LOG:  statement: SET enable_seqscan=OFF;
2014-03-25 22:38:40.859 PDT 31570 LOG:  statement: SELECT count(*)
FROM test_tsvector WHERE a @@ 'wr|qh';
==31570== Conditional jump or move depends on uninitialised value(s)
==31570==at 0x8A58F4: gin_tsquery_triconsistent (tsginidx.c:329)
==31570==by 0x8F8659: FunctionCall7Coll (fmgr.c:1471)
==31570==by 0x488F20: directTriConsistentFn (ginlogic.c:97)
==31570==by 0x480026: keyGetItem (ginget.c:1094)
==31570==by 0x480191: scanGetItem (ginget.c:1182)
==31570==by 0x481A11: gingetbitmap (ginget.c:1791)
==31570==by 0x8F7F7E: FunctionCall2Coll (fmgr.c:1324)
==31570==by 0x4C8406: index_getbitmap (indexam.c:651)
==31570==by 0x663A46: MultiExecBitmapIndexScan (nodeBitmapIndexscan.c:89)
==31570==by 0x64C516: MultiExecProcNode (execProcnode.c:550)
==31570==by 0x662944: BitmapHeapNext (nodeBitmapHeapscan.c:104)
==31570==by 0x657739: ExecScanFetch (execScan.c:82)
==31570==  Uninitialised value was created by a stack allocation
==31570==at 0x8A585B: gin_tsquery_triconsistent (tsginidx.c:299)

It looks like a recheck stack variable isn't every being set within
TS_execute_ternary() (which has a pointer to that variable on the
stack) - ultimately, the checkcondition_gin() callback will set the
flag, but only to 'true' (iff that's appropriate). When that doesn't
happen, it just contains a garbage uninitialized value, and yet
evidently control flow depends on that value.

I propose that we initialize the variable to false, since there
appears to be a tacit assumption that that is already the case, as
with the plain consistent GIN support function in the same file.
Attached patch does just that.

-- 
Peter Geoghegan
*** a/src/backend/utils/adt/tsginidx.c
--- b/src/backend/utils/adt/tsginidx.c
*** gin_tsquery_triconsistent(PG_FUNCTION_AR
*** 305,311 
  	/* int32	nkeys = PG_GETARG_INT32(3); */
  	Pointer*extra_data = (Pointer *) PG_GETARG_POINTER(4);
  	GinLogicValue res = GIN_FALSE;
! 	bool		recheck;
  
  	/* The query requires recheck only if it involves weights */
  	if (query-size  0)
--- 305,311 
  	/* int32	nkeys = PG_GETARG_INT32(3); */
  	Pointer*extra_data = (Pointer *) PG_GETARG_POINTER(4);
  	GinLogicValue res = GIN_FALSE;
! 	bool		recheck = false;
  
  	/* The query requires recheck only if it involves weights */
  	if (query-size  0)

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Still something fishy in the fastpath lock stuff

2014-03-26 Thread Heikki Linnakangas

On 03/26/2014 06:59 AM, Robert Haas wrote:

On Tue, Mar 25, 2014 at 11:58 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Robert Haas robertmh...@gmail.com writes:

I really think we
should change that rule; it leads to ugly code, and bugs.


No doubt, but are we prepared to believe that we have working compiler
barriers other than volatile?  (Which, I will remind you, is in the C
standard.  Unlike compiler barriers.)


I'm prepared to believe that it would take some time to be certain
that we've got that right on all compilers we support.  The most
common ones are easily dealt with, I think, but it might be harder to
verify the behavior on more obscure compilers.  But I'd rather deal
with bullet-proofing SpinLockAcquire() and SpinLockRelease() *in
general* than deal with bullet-proofing every call site individually,
which is what we have to do right now.


+1. To support porting to new compilers, we can fall back to e.g calling 
a dummy function through a function pointer, if we don't have a proper 
implementation.


Aside from the correctness issue: A while ago, while working on the 
xloginsert slots stuff, I looked at the assembler that gcc generated for 
ReserverXLogInsertLocation(). That function is basically:


SpinLockAcquire()
modify and read a few variables in shared memory
SpinLockRelease()
fairly expensive calculations based on values read

Gcc decided to move the release of the lock so that it became:

SpinLockAcquire()
modify and read a few variables in shared memory
fairly expensive calculations based on values read
SpinLockRelease()

I went through some effort while writing the patch to make sure the 
spinlock is held for as short duration as possible. But gcc undid that 
to some extent :-(.


I tried to put a compiler barrier there and run some short performance 
tests, but it didn't make any noticeable difference. So it doesn't seem 
matter in practice - maybe the CPU reorders things anyway, or the 
calculations are not expensive enough to matter after all.


I just looked at the assembler generated for LWLockAcquire, and a 
similar thing is happening there. The code is:


...
641 /* We are done updating shared state of the lock itself. */
642 SpinLockRelease(lock-mutex);
643
644 TRACE_POSTGRESQL_LWLOCK_ACQUIRE(T_NAME(l), T_ID(l), mode);
645
646 /* Add lock to list of locks held by this backend */
647 held_lwlocks[num_held_lwlocks++] = l;
...

gcc reordered part of the held_lwlocks assignment after releasing the 
spinlock:


.L21:
.loc 1 647 0
movslq  num_held_lwlocks(%rip), %rax
addq$16, %r12
.LVL23:
.loc 1 652 0
testl   %ebx, %ebx
.loc 1 642 0
movb$0, (%r14)  --- SpinLockRelease
.loc 1 652 0
leal-1(%rbx), %r13d
.loc 1 647 0
leal1(%rax), %edx
movq%r14, held_lwlocks(,%rax,8)
.LVL24:
movl%edx, num_held_lwlocks(%rip)

The held_lwlocks assignment was deliberately put outside the 
spinlock-protected section, to minimize the time the spinlock is held, 
but the compiler moved some of it back in: the basic block .L21.


A compiler barrier would avoid that kind of reordering. Not using 
volatile would also allow the compiler to reorder and optimize the 
instructions *within* the spinlock-protected block, which might shave a 
few instructions while a spinlock is held. Granted, spinlock-protected 
blocks are small so there isn't much room for optimization, but still.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2014-03-26 Thread Kyotaro HORIGUCHI
Hello, I could see reparameterize for foreign path to work
effectively thanks to your advice. The key point was setting
use_remote_estimate to false and existence of another child to
get parameterized path in prior stages.

The overall patch was applied on HEAD and compiled cleanly except
for a warning.

 analyze.c: In function ‘acquire_inherited_sample_rows’:
 analyze.c:1461: warning: unused variable ‘saved_rel’


As for postgres-fdw, the point I felt uneasy in previous patch
was fixed already:) - which was coping with omission of
ReparameterizeForeignPath.

And for file-fdw, you made a change to re-create foreignscan node
instead of the previous copy-and-modify. Is the reason you did it
that you considered the cost of 're-checking whether to
selectively perform binary conversion' is low enough? Or other
reasons?

Finally, although I insist the necessity of the warning for child
foreign tables on alter table, if you belive it to be put off,
I'll compromise by putting a note to CF-app that last judgement
is left to committer.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Conditional jump or move depends on uninitialised value(s) within tsginidx.c

2014-03-26 Thread Heikki Linnakangas

On 03/26/2014 09:21 AM, Peter Geoghegan wrote:

It looks like a recheck stack variable isn't every being set within
TS_execute_ternary() (which has a pointer to that variable on the
stack) - ultimately, the checkcondition_gin() callback will set the
flag, but only to 'true' (iff that's appropriate). When that doesn't
happen, it just contains a garbage uninitialized value, and yet
evidently control flow depends on that value.

I propose that we initialize the variable to false, since there
appears to be a tacit assumption that that is already the case, as
with the plain consistent GIN support function in the same file.
Attached patch does just that.


Yep, fixed. Thanks!

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New parameter RollbackError to control rollback behavior on error

2014-03-26 Thread Heikki Linnakangas

On 03/26/2014 08:39 AM, Michael Paquier wrote:

Hi all,

The behavior of rollback when an error occurs on an handle is
controlled by extending Protocol with $PROTNUM-[0|1|2] where:
- 0 = let the application handle rollbacks
- 1 = rollback whole transaction when an error occurs
- 2 = rollback only statement that failed
Using such an extension is somewhat awkward as a single string is used
for two settings... The proposed attached patch adds a new parameter
called RollbackError that allows to control the behavior of rollback
on error with a different parameter.


Great!

Since we're designing a new user interface for this, let's try to make 
it as user-friendly as possible:


* I'm not too fond of the RollbackError name. It sounds like an error 
while rolling back. I googled around and found out that DataDirect's 
proprietary driver has the same option, and they call it 
TransactionErrorBehavior. See 
http://blogs.datadirect.com/2013/07/solution-unexpected-postgres-current-transaction-aborted-error.html.


* Instead of using 0-2 as the values, let's give them descriptive names. 
Something like none, RollbackTransaction, RollbackStatement. 
(actually, we'll probably want to also allow the integers, to keep the 
connection string short, as there is a size limit on that)



I thought first about including that in my cleanup work for 9.4, but
as this actually does not break the driver it may be worth adding it
directly to master, explaining the patch attached here. Comments
welcome. Note that if there are objections I do not mind adding that
for the work that would be merged later to 9.4 builds.


Yeah, let's get this into the master branch before your big 9.4 cleanup 
work.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New parameter RollbackError to control rollback behavior on error

2014-03-26 Thread Michael Paquier
On Wed, Mar 26, 2014 at 3:39 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 Hi all,

 The behavior of rollback when an error occurs on an handle is
 controlled by extending Protocol with $PROTNUM-[0|1|2]...
My apologies. This message was sent to the wrong mailing list and was
dedicated to odbc.
Once again sorry for that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New parameter RollbackError to control rollback behavior on error

2014-03-26 Thread Hiroshi Inoue

Hi Michael,

Isn't it an ODBC issue?

regards,
Hiroshi Inoue

(2014/03/26 15:39), Michael Paquier wrote:

Hi all,

The behavior of rollback when an error occurs on an handle is
controlled by extending Protocol with $PROTNUM-[0|1|2] where:
- 0 = let the application handle rollbacks
- 1 = rollback whole transaction when an error occurs
- 2 = rollback only statement that failed
Using such an extension is somewhat awkward as a single string is used
for two settings... The proposed attached patch adds a new parameter
called RollbackError that allows to control the behavior of rollback
on error with a different parameter.

For backward-compatibility purposes, this patch does not break the old
grammar of Protocol: it just gives the priority to RollbackError if
both Protocol and RollbackError are set for a connection. Regression
tests to test RollbackError and combinations of RollbackError/Protocol
are added in the patch in the existing test error-rollback (which has
needed some refactoring, older tests are not impacted). Docs are
included as well.

I thought first about including that in my cleanup work for 9.4, but
as this actually does not break the driver it may be worth adding it
directly to master, explaining the patch attached here. Comments
welcome. Note that if there are objections I do not mind adding that
for the work that would be merged later to 9.4 builds.

Regards,



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Only first XLogRecData is visible to rm_desc with WAL_DEBUG

2014-03-26 Thread Heikki Linnakangas

On 03/25/2014 08:05 PM, Andres Freund wrote:

On 2014-03-25 10:49:54 -0700, Robert Haas wrote:

On Tue, Mar 25, 2014 at 12:30 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

I've found WAL_DEBUG quite useful in the past, when working on
scalability, and have indeed wished for it to be
compiled-in-by-default.

I don't know whether I'm the only one, though.


You are not.  I would rather have it fixed than removed, if possible.  I
don't really care too much about getting a performance hit to palloc the
records, really; being able to actually read what's happening is much
more useful.


I find it useful too, but I think pg_xlogdump can serve the same purpose.

One thing missing from pg_xlogdump is the capability to keep tracking the
inserted WAL, instead of dumping to the end of WAL and stopping there. If we
add an option to pg_xlogdump to poll the WAL instead of bailing out at an
error, I think it's a good replacement.


Well, one nice thing about wal_debug is that the WAL is at that point
still associated with the session that generated it.  But I grant
that's not a huge issue.  How much work are we talking about to fix
this, though?


It's not entirely trivial, we'd essentially need to copy the loop in
CopyXLogRecordToWAL(). And do so while still holding the
WALInsertLock().


Oh, no, there's no need to do it while holding WALInsertLock. It's quite 
trivial, actually, see attached. So it's not difficult to fix this if we 
want to.


I just committed a patch to add a -f/--follow flag to pg_xlogdump. That 
seems very handy, even if we decide to fix the wal_debug code. It 
doesn't require compiling with wal_debug, and pg_xlogdump allows 
filtering by rmgr id etc.


- Heikki
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index b573185..3b28905 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -1262,8 +1262,20 @@ begin:;
 		xlog_outrec(buf, rechdr);
 		if (rdata-data != NULL)
 		{
+			StringInfoData recordbuf;
+
+			/*
+			 * We have to piece together the WAL record data from the
+			 * XLogRecData entries, so that we can pass it to the rm_desc
+			 * function as one contiguous chunk.
+			 */
+			initStringInfo(recordbuf);
+			for (;rdata != NULL; rdata = rdata-next)
+appendBinaryStringInfo(recordbuf, rdata-data, rdata-len);
+
 			appendStringInfoString(buf,  - );
-			RmgrTable[rechdr-xl_rmid].rm_desc(buf, rechdr-xl_info, rdata-data);
+			RmgrTable[rechdr-xl_rmid].rm_desc(buf, rechdr-xl_info, recordbuf.data);
+			pfree(recordbuf.data);
 		}
 		elog(LOG, %s, buf.data);
 		pfree(buf.data);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Store Extension Options

2014-03-26 Thread Fabrízio de Royes Mello
On Thu, Mar 13, 2014 at 5:35 PM, Robert Haas robertmh...@gmail.com wrote:

 Well, I'm not sure that's really any big deal, but I'm not wedded to
 the label idea.  My principal concern is: I'm opposed to allowing
 unvalidated options into the database.  I think it should be a
 requirement that if the validator can't be found and called, then the
 reloption is no good and you just reject it.  So, if we go with the
 reloptions route, I'd want to see pg_register_option_namespace go away
 in favor of some solution that preserves that property.  One thing I
 kind of like about the LABEL approach is that it applies to virtually
 every object type, meaning that we might not have to repeat this
 discussion when (as seems inevitable) people want to be able to do
 things to schemas or tablespaces or roles.  But I don't hold that
 position so strongly as to be unwilling to entertain any other
 options.


During the last days I thought about this discussion and to use SECLABELs
sounds weird to me. Here in Brazil we call this kind of thing 'gambiarra'.
Because we'll try to use something that born with a very well defined
purpose to another purpose. Personally I don't like that.

If we think more about SECLABELs, in a more abstract way, it is just a
'property' about database objects. And the same is COMMENTs. Both SECLABEL
and COMMENT provide a way to store something about objects.

Maybe we can think about a new object on top of COMMENT and SECLABELs. An
object called 'PROPERTY' or 'OPTION'. And COMMENTs and SECLABELs can be
just a kind of this new object, and we must introduce a new kind callled
'CUSTOM'.

I thought about this because representation (syntax) of SECLABELs and
COMMENTs are similar. They describe something about objects, they have
local and shared catalog.

This is just something that occurred to me. Maybe I'm completely wrong. Or
not!

Comments?

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Only first XLogRecData is visible to rm_desc with WAL_DEBUG

2014-03-26 Thread Robert Haas
On Wed, Mar 26, 2014 at 5:08 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 Oh, no, there's no need to do it while holding WALInsertLock. It's quite
 trivial, actually, see attached. So it's not difficult to fix this if we
 want to.

Well is there any reason not to just commit that patch?  I mean, it
seems odd to rip this out if the fix is straightforward and already
written.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Duplicated row after promote in synchronous streaming replication

2014-03-26 Thread Dang Minh Huong
Hi all,

I'm using PostgreSQL 9.1.10 for my HA project and have found this problem.

I did (multiple times) the following sequence in my primary/standby synchronous 
replication environment,

1. Update rows in a table (which have primary key constraint column) in active 
DB

2. Stop active DB

3. Promote standby DB

4. Confirm the updated table in promoted standby (new primary) and found that, 
there's a duplicate updated row (number of row was increased).

I think it is a replication bug but wonder if it was fixed yet. 
Can somebody help me?

I'm not yet confirm PostgreSQL source, but here is my investigation result.

Updated table before promoted were HOT update (index file was not changed).

After promote i continue update that duplicated row (it returned two row 
updated), and confirm with pg_filedump, i found the duplicated row and only one 
is related to primary key index constraint.

Compare with old active DB, i saw that after promote line pointer of updated 
row (duplicated row) is broken into two line pointer, the new one is related to 
primary index constraint and the other is not related to. Some thing like 
below, 

Old active DB:
ctid(0,3)-ctid(0,6)-ctid(0,7)

New active DB (after promote and update):
ctid(0,3)-ctid(0,9)
ctid(0,7)-ctid(0,10)

ctid(0,10) is not related to primary key index constraint.

Is something was wrong in redo log in standby DB? Or line pointer in HOT update 
feature?

Thanks and best regards,
Huong,

Re: [HACKERS] Useless Replica Identity: NOTHING noise from psql \d

2014-03-26 Thread Bruce Momjian
On Mon, Mar 24, 2014 at 09:07:07PM -0400, Bruce Momjian wrote:
  In the INDEX case, should the output mention specifically which index
  is being considered?
 
 Ah, good idea.  Updated patch attached.  The output is now:
 
   test= \d+ test
Table public.test
Column |  Type   | Modifiers | Storage | Stats target | Description
   +-+---+-+--+-
x  | integer | not null  | plain   |  |
   Indexes:
   test_pkey PRIMARY KEY, btree (x) REPLICA IDENTITY
   i_test2 btree (x)
 --   Replica Identity: USING INDEX test_pkey
   Has OIDs: no
 
 However, now that I look at it, it seems redundant as REPLICA IDENTITY
 is already marked on the actual index.  Ideas?

Hearing nothing, I have gone back to the previous patch that just marks
replica identity as USING INDEX;  applied patch attached.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
new file mode 100644
index a194ce7..21bbdf8
*** a/src/bin/psql/describe.c
--- b/src/bin/psql/describe.c
*** describeOneTableDetails(const char *sche
*** 2345,2358 
  			printTableAddFooter(cont, buf.data);
  		}
  
! 		if ((tableinfo.relkind == 'r' || tableinfo.relkind == 'm') 
! 			tableinfo.relreplident != 'd'  tableinfo.relreplident != 'i')
  		{
  			const char *s = _(Replica Identity);
  
  			printfPQExpBuffer(buf, %s: %s,
  			  s,
! 			  tableinfo.relreplident == 'n' ? NOTHING : FULL);
  			printTableAddFooter(cont, buf.data);
  		}
  
--- 2345,2363 
  			printTableAddFooter(cont, buf.data);
  		}
  
! 		if (verbose  (tableinfo.relkind == 'r' || tableinfo.relkind == 'm') 
! 			strcmp(schemaname, pg_catalog) != 0)
  		{
  			const char *s = _(Replica Identity);
  
  			printfPQExpBuffer(buf, %s: %s,
  			  s,
! 			  tableinfo.relreplident == 'd' ? DEFAULT :
! 			  tableinfo.relreplident == 'f' ? FULL :
! 			  tableinfo.relreplident == 'i' ? USING INDEX :
! 			  tableinfo.relreplident == 'n' ? NOTHING :
! 			  ???);
! 
  			printTableAddFooter(cont, buf.data);
  		}
  
diff --git a/src/test/regress/expected/create_table_like.out b/src/test/regress/expected/create_table_like.out
new file mode 100644
index 5f29b39..feb6c93
*** a/src/test/regress/expected/create_table_like.out
--- b/src/test/regress/expected/create_table_like.out
*** CREATE TABLE ctlt12_storage (LIKE ctlt1
*** 115,120 
--- 115,121 
   a  | text | not null  | main |  | 
   b  | text |   | extended |  | 
   c  | text |   | external |  | 
+ Replica Identity: DEFAULT
  Has OIDs: no
  
  CREATE TABLE ctlt12_comments (LIKE ctlt1 INCLUDING COMMENTS, LIKE ctlt2 INCLUDING COMMENTS);
*** CREATE TABLE ctlt12_comments (LIKE ctlt1
*** 125,130 
--- 126,132 
   a  | text | not null  | extended |  | A
   b  | text |   | extended |  | B
   c  | text |   | extended |  | C
+ Replica Identity: DEFAULT
  Has OIDs: no
  
  CREATE TABLE ctlt1_inh (LIKE ctlt1 INCLUDING CONSTRAINTS INCLUDING COMMENTS) INHERITS (ctlt1);
*** NOTICE:  merging constraint ctlt1_a_che
*** 140,145 
--- 142,148 
  Check constraints:
  ctlt1_a_check CHECK (length(a)  2)
  Inherits: ctlt1
+ Replica Identity: DEFAULT
  Has OIDs: no
  
  SELECT description FROM pg_description, pg_constraint c WHERE classoid = 'pg_constraint'::regclass AND objoid = c.oid AND c.conrelid = 'ctlt1_inh'::regclass;
*** Check constraints:
*** 162,167 
--- 165,171 
  ctlt3_a_check CHECK (length(a)  5)
  Inherits: ctlt1,
ctlt3
+ Replica Identity: DEFAULT
  Has OIDs: no
  
  CREATE TABLE ctlt13_like (LIKE ctlt3 INCLUDING CONSTRAINTS INCLUDING COMMENTS INCLUDING STORAGE) INHERITS (ctlt1);
*** Check constraints:
*** 177,182 
--- 181,187 
  ctlt1_a_check CHECK (length(a)  2)
  ctlt3_a_check CHECK (length(a)  5)
  Inherits: ctlt1
+ Replica Identity: DEFAULT
  Has OIDs: no
  
  SELECT description FROM pg_description, pg_constraint c WHERE classoid = 'pg_constraint'::regclass AND objoid = c.oid AND c.conrelid = 'ctlt13_like'::regclass;
*** Indexes:
*** 198,203 
--- 203,209 
  ctlt_all_expr_idx btree ((a || b))
  Check constraints:
  ctlt1_a_check CHECK (length(a)  2)
+ Replica Identity: DEFAULT
  Has OIDs: no
  
  SELECT c.relname, objsubid, description FROM pg_description, pg_index i, pg_class c WHERE classoid = 'pg_class'::regclass AND objoid = i.indexrelid AND c.oid = i.indexrelid AND i.indrelid = 'ctlt_all'::regclass ORDER BY c.relname, objsubid;
diff --git a/src/test/regress/expected/inherit.out b/src/test/regress/expected/inherit.out
new 

Re: [HACKERS] Duplicated row after promote in synchronous streaming replication

2014-03-26 Thread Thom Brown
On 26 March 2014 15:08, Dang Minh Huong kakalo...@gmail.com wrote:
 Hi all,

 I'm using PostgreSQL 9.1.10 for my HA project and have found this problem.

 I did (multiple times) the following sequence in my primary/standby
 synchronous replication environment,

 1. Update rows in a table (which have primary key constraint column) in
 active DB

 2. Stop active DB

 3. Promote standby DB

 4. Confirm the updated table in promoted standby (new primary) and found
 that, there's a duplicate updated row (number of row was increased).

 I think it is a replication bug but wonder if it was fixed yet.
 Can somebody help me?

 I'm not yet confirm PostgreSQL source, but here is my investigation result.

 Updated table before promoted were HOT update (index file was not changed).

 After promote i continue update that duplicated row (it returned two row
 updated), and confirm with pg_filedump, i found the duplicated row and only
 one is related to primary key index constraint.

 Compare with old active DB, i saw that after promote line pointer of updated
 row (duplicated row) is broken into two line pointer, the new one is related
 to primary index constraint and the other is not related to. Some thing like
 below,

 Old active DB:
 ctid(0,3)-ctid(0,6)-ctid(0,7)

 New active DB (after promote and update):
 ctid(0,3)-ctid(0,9)
 ctid(0,7)-ctid(0,10)

 ctid(0,10) is not related to primary key index constraint.

 Is something was wrong in redo log in standby DB? Or line pointer in HOT
 update feature?

It sounds like you're hitting a bug that was introduced in that
exact minor version, and has since been fixed:

http://www.postgresql.org/docs/9.1/static/release-9-1-11.html

You should update to the latest minor version, then re-base your
standbys from the primary.

-- 
Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Useless Replica Identity: NOTHING noise from psql \d

2014-03-26 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Mon, Mar 24, 2014 at 09:07:07PM -0400, Bruce Momjian wrote:
   In the INDEX case, should the output mention specifically which index
   is being considered?
  
  Ah, good idea.  Updated patch attached.  The output is now:
  
  test= \d+ test
   Table public.test
   Column |  Type   | Modifiers | Storage | Stats target | Description
  +-+---+-+--+-
   x  | integer | not null  | plain   |  |
  Indexes:
  test_pkey PRIMARY KEY, btree (x) REPLICA IDENTITY
  i_test2 btree (x)
  -- Replica Identity: USING INDEX test_pkey
  Has OIDs: no
  
  However, now that I look at it, it seems redundant as REPLICA IDENTITY
  is already marked on the actual index.  Ideas?
 
 Hearing nothing, I have gone back to the previous patch that just marks
 replica identity as USING INDEX;  applied patch attached.

Not opposed to this, but it seems a bit strange; REPLICA IDENTITY is a
property of the table, not of any individual index.  I think we should
lose the token in the Indexes section.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Useless Replica Identity: NOTHING noise from psql \d

2014-03-26 Thread Bruce Momjian
On Wed, Mar 26, 2014 at 12:20:07PM -0300, Alvaro Herrera wrote:
 Bruce Momjian wrote:
  On Mon, Mar 24, 2014 at 09:07:07PM -0400, Bruce Momjian wrote:
In the INDEX case, should the output mention specifically which index
is being considered?
   
   Ah, good idea.  Updated patch attached.  The output is now:
   
 test= \d+ test
  Table public.test
  Column |  Type   | Modifiers | Storage | Stats target | Description
 +-+---+-+--+-
  x  | integer | not null  | plain   |  |
 Indexes:
 test_pkey PRIMARY KEY, btree (x) REPLICA IDENTITY
 i_test2 btree (x)
   --   Replica Identity: USING INDEX test_pkey
 Has OIDs: no
   
   However, now that I look at it, it seems redundant as REPLICA IDENTITY
   is already marked on the actual index.  Ideas?
  
  Hearing nothing, I have gone back to the previous patch that just marks
  replica identity as USING INDEX;  applied patch attached.
 
 Not opposed to this, but it seems a bit strange; REPLICA IDENTITY is a
 property of the table, not of any individual index.  I think we should
 lose the token in the Indexes section.

That is an interesting idea.  It would mean that \d table would not show
anything about replica identity, because right now it does:

test= \d test
 Table public.test
 Column |  Type   | Modifiers
+-+---
 x  | integer | not null
Indexes:
test_pkey PRIMARY KEY, btree (x) REPLICA IDENTITY

That seems logical.  So under the new plan, \d would show:

test= \d test
 Table public.test
 Column |  Type   | Modifiers
+-+---
 x  | integer | not null
Indexes:
test_pkey PRIMARY KEY, btree (x)

and \d+ would show:

test= \d+ test
 Table public.test
 Column |  Type   | Modifiers | Storage | Stats target | Description
+-+---+-+--+-
 x  | integer | not null  | plain   |  |
Indexes:
test_pkey PRIMARY KEY, btree (x)
Replica Identity: USING INDEX test_pkey
Has OIDs: no

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small regression adjustment

2014-03-26 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 It occurred to me after having to change I think 9 files to clean up a 
 small mess in the jsonb regression tests the other day that we might 
 usefully expose the inputdir and outputdir to psql as variables when 
 running pg_regress. Then we might be able to do thing like this, quite 
 independent of location:

 \set datafile :inputdir/data/mystuff.data
 COPY mytable FROM :'datafile';

If we could get rid of the run-time-generated-test-file facility
altogether, I could get excited about this; but just getting rid of
the COPY special cases isn't enough for that.  Looking at
convert_sourcefiles_in, it seems like we'd also need solutions for
these dynamic substitutions:

replace_string(line, @testtablespace@, testtablespace);
replace_string(line, @libdir@, dlpath);
replace_string(line, @DLSUFFIX@, DLSUFFIX);

At least this one seems rather difficult to fix in this fashion:

output/create_function_1.source:83:ERROR:  could not find function 
nosuchsymbol in file @libdir@/regress@DLSUFFIX@

(I'm a bit inclined to think that we could dispense with @DLSUFFIX@
altogether; explicit use of the platform's library suffix has been
deprecated for at least a decade.  But the others are harder.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Useless Replica Identity: NOTHING noise from psql \d

2014-03-26 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Wed, Mar 26, 2014 at 12:20:07PM -0300, Alvaro Herrera wrote:

  Not opposed to this, but it seems a bit strange; REPLICA IDENTITY is a
  property of the table, not of any individual index.  I think we should
  lose the token in the Indexes section.
 
 That is an interesting idea.  It would mean that \d table would not show
 anything about replica identity, because right now it does:
 
   test= \d test
Table public.test
Column |  Type   | Modifiers
   +-+---
x  | integer | not null
   Indexes:
   test_pkey PRIMARY KEY, btree (x) REPLICA IDENTITY
 
 That seems logical.

Hmm.  It seems to me that to make this more compact we could keep the
current token in the index line if it's INDEX, and not display the
Replica Identity: line at all; and if it's something other than index
and different from the default value, then print Replica Identity in
both \d and \d+.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Useless Replica Identity: NOTHING noise from psql \d

2014-03-26 Thread Bruce Momjian
On Wed, Mar 26, 2014 at 12:53:32PM -0300, Alvaro Herrera wrote:
 Bruce Momjian wrote:
  On Wed, Mar 26, 2014 at 12:20:07PM -0300, Alvaro Herrera wrote:
 
   Not opposed to this, but it seems a bit strange; REPLICA IDENTITY is a
   property of the table, not of any individual index.  I think we should
   lose the token in the Indexes section.
  
  That is an interesting idea.  It would mean that \d table would not show
  anything about replica identity, because right now it does:
  
  test= \d test
   Table public.test
   Column |  Type   | Modifiers
  +-+---
   x  | integer | not null
  Indexes:
  test_pkey PRIMARY KEY, btree (x) REPLICA IDENTITY
  
  That seems logical.
 
 Hmm.  It seems to me that to make this more compact we could keep the
 current token in the index line if it's INDEX, and not display the
 Replica Identity: line at all; and if it's something other than index
 and different from the default value, then print Replica Identity in
 both \d and \d+.

OK. Tom's original complaint was about showing the default state in \d:

http://www.postgresql.org/message-id/12303.1387038...@sss.pgh.pa.us

though that example was for an odd case where a system table didn't use
the default value.

The attached patch matches your suggestion.  It is basically back to
what the code originally had, except it skips system tables, and shows
??? for invalid values.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
new file mode 100644
index 21bbdf8..d1447fe
*** a/src/bin/psql/describe.c
--- b/src/bin/psql/describe.c
*** describeOneTableDetails(const char *sche
*** 2345,2360 
  			printTableAddFooter(cont, buf.data);
  		}
  
! 		if (verbose  (tableinfo.relkind == 'r' || tableinfo.relkind == 'm') 
  			strcmp(schemaname, pg_catalog) != 0)
  		{
  			const char *s = _(Replica Identity);
  
  			printfPQExpBuffer(buf, %s: %s,
  			  s,
- 			  tableinfo.relreplident == 'd' ? DEFAULT :
  			  tableinfo.relreplident == 'f' ? FULL :
- 			  tableinfo.relreplident == 'i' ? USING INDEX :
  			  tableinfo.relreplident == 'n' ? NOTHING :
  			  ???);
  
--- 2345,2363 
  			printTableAddFooter(cont, buf.data);
  		}
  
! 		if ((tableinfo.relkind == 'r' || tableinfo.relkind == 'm') 
! 			/*
! 			 * No need to display default values;  we already display a
! 			 * REPLICA IDENTITY marker on indexes.
! 			 */
! 			tableinfo.relreplident != 'd'  tableinfo.relreplident != 'i' 
  			strcmp(schemaname, pg_catalog) != 0)
  		{
  			const char *s = _(Replica Identity);
  
  			printfPQExpBuffer(buf, %s: %s,
  			  s,
  			  tableinfo.relreplident == 'f' ? FULL :
  			  tableinfo.relreplident == 'n' ? NOTHING :
  			  ???);
  

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Only first XLogRecData is visible to rm_desc with WAL_DEBUG

2014-03-26 Thread Heikki Linnakangas

On 03/26/2014 04:51 PM, Robert Haas wrote:

On Wed, Mar 26, 2014 at 5:08 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

Oh, no, there's no need to do it while holding WALInsertLock. It's quite
trivial, actually, see attached. So it's not difficult to fix this if we
want to.


Well is there any reason not to just commit that patch?  I mean, it
seems odd to rip this out if the fix is straightforward and already
written.


I guess so. Committed..

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Missing pfree in logical_heap_rewrite_flush_mappings()

2014-03-26 Thread Ants Aasma
It seems to me that when flushing logical mappings to disk, each
mapping file leaks the buffer used to pass the mappings to XLogInsert.
Also, it seems consistent to allocate that buffer in the RewriteState
memory context. Patch attached.

Regards,
Ants Aasma
-- 
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de
diff --git a/src/backend/access/heap/rewriteheap.c b/src/backend/access/heap/rewriteheap.c
index 4cf07ea..ae439e8 100644
--- a/src/backend/access/heap/rewriteheap.c
+++ b/src/backend/access/heap/rewriteheap.c
@@ -897,7 +897,7 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
 
 		/* write all mappings consecutively */
 		len = src-num_mappings * sizeof(LogicalRewriteMappingData);
-		waldata = palloc(len);
+		waldata = MemoryContextAlloc(state-rs_cxt, len);
 		waldata_start = waldata;
 
 		/*
@@ -943,6 +943,7 @@ logical_heap_rewrite_flush_mappings(RewriteState state)
 		/* write xlog record */
 		XLogInsert(RM_HEAP2_ID, XLOG_HEAP2_REWRITE, rdata);
 
+		pfree(waldata);
 	}
 	Assert(state-rs_num_rewrite_mappings == 0);
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] small regression adjustment

2014-03-26 Thread Andrew Dunstan


On 03/26/2014 11:37 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

It occurred to me after having to change I think 9 files to clean up a
small mess in the jsonb regression tests the other day that we might
usefully expose the inputdir and outputdir to psql as variables when
running pg_regress. Then we might be able to do thing like this, quite
independent of location:
 \set datafile :inputdir/data/mystuff.data
 COPY mytable FROM :'datafile';

If we could get rid of the run-time-generated-test-file facility
altogether, I could get excited about this; but just getting rid of
the COPY special cases isn't enough for that.  Looking at
convert_sourcefiles_in, it seems like we'd also need solutions for
these dynamic substitutions:

 replace_string(line, @testtablespace@, testtablespace);
 replace_string(line, @libdir@, dlpath);
 replace_string(line, @DLSUFFIX@, DLSUFFIX);

At least this one seems rather difficult to fix in this fashion:

output/create_function_1.source:83:ERROR:  could not find function nosuchsymbol in file 
@libdir@/regress@DLSUFFIX@

(I'm a bit inclined to think that we could dispense with @DLSUFFIX@
altogether; explicit use of the platform's library suffix has been
deprecated for at least a decade.  But the others are harder.)




Well, maybe we should change dfmgr.c to stop outputting the DLSUFFIX in 
its error message.


I haven't tried with the other two. I will when I get a spare moment.

But even if we find it too troublesome to get rid if the substitution 
part altogether, I think minimizing the need for it would still be worth 
doing. It would help extension authors, for example, who are most likely 
to want to use it to load data files for testing their extensions.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Missing pfree in logical_heap_rewrite_flush_mappings()

2014-03-26 Thread Stephen Frost
* Ants Aasma (a...@cybertec.at) wrote:
 It seems to me that when flushing logical mappings to disk, each
 mapping file leaks the buffer used to pass the mappings to XLogInsert.
 Also, it seems consistent to allocate that buffer in the RewriteState
 memory context. Patch attached.

Hmm, yeah, it does look that way.  Why bother pfree'ing it here though
instead of letting it be cleaned up with state-rs_cxt in
end_heap_rewrite()?

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] small regression adjustment

2014-03-26 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 03/26/2014 11:37 AM, Tom Lane wrote:
 At least this one seems rather difficult to fix in this fashion:
 
 output/create_function_1.source:83:ERROR:  could not find function 
 nosuchsymbol in file @libdir@/regress@DLSUFFIX@
 
 (I'm a bit inclined to think that we could dispense with @DLSUFFIX@
 altogether; explicit use of the platform's library suffix has been
 deprecated for at least a decade.  But the others are harder.)

 Well, maybe we should change dfmgr.c to stop outputting the DLSUFFIX in 
 its error message.

If it weren't in the input command, it wouldn't be in the message either,
I think.  It's the @libdir@ part of that that's problematic.

 But even if we find it too troublesome to get rid if the substitution 
 part altogether, I think minimizing the need for it would still be worth 
 doing. It would help extension authors, for example, who are most likely 
 to want to use it to load data files for testing their extensions.

Yeah, I suppose getting down to one file needing a replacement would still
be a significant improvement over the current situation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Missing pfree in logical_heap_rewrite_flush_mappings()

2014-03-26 Thread Andres Freund
On 2014-03-26 12:49:41 -0400, Stephen Frost wrote:
 * Ants Aasma (a...@cybertec.at) wrote:
  It seems to me that when flushing logical mappings to disk, each
  mapping file leaks the buffer used to pass the mappings to XLogInsert.
  Also, it seems consistent to allocate that buffer in the RewriteState
  memory context. Patch attached.

Good catch. There's actually no need for explicitly using the context,
we're in the appropriate one. The only other MemoryContextAlloc() caller
in there should be converted to a palloc as well.

 Hmm, yeah, it does look that way.  Why bother pfree'ing it here though
 instead of letting it be cleaned up with state-rs_cxt in
 end_heap_rewrite()?

For a somewhat large relation (say a pg_attribute in a db with lots of
tables), this can actually get to be a somewhat significant amount of
memory. It *will* currently already get cleaned up with the context, but
we can easily do better.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Missing pfree in logical_heap_rewrite_flush_mappings()

2014-03-26 Thread Stephen Frost
* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2014-03-26 12:49:41 -0400, Stephen Frost wrote:
  * Ants Aasma (a...@cybertec.at) wrote:
   It seems to me that when flushing logical mappings to disk, each
   mapping file leaks the buffer used to pass the mappings to XLogInsert.
   Also, it seems consistent to allocate that buffer in the RewriteState
   memory context. Patch attached.
 
 Good catch. There's actually no need for explicitly using the context,
 we're in the appropriate one. The only other MemoryContextAlloc() caller
 in there should be converted to a palloc as well.

Hrm..?  I don't think that's right when it's called from
end_heap_rewrite().  Perhaps we should be switching to state-rs_cxt
while in end_heap_rewrite() also though?

  Hmm, yeah, it does look that way.  Why bother pfree'ing it here though
  instead of letting it be cleaned up with state-rs_cxt in
  end_heap_rewrite()?
 
 For a somewhat large relation (say a pg_attribute in a db with lots of
 tables), this can actually get to be a somewhat significant amount of
 memory. It *will* currently already get cleaned up with the context, but
 we can easily do better.

Ok, so perhaps we should go ahead and pfree() this as we go.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Missing pfree in logical_heap_rewrite_flush_mappings()

2014-03-26 Thread Andres Freund
On 2014-03-26 13:41:27 -0400, Stephen Frost wrote:
 * Andres Freund (and...@2ndquadrant.com) wrote:
  On 2014-03-26 12:49:41 -0400, Stephen Frost wrote:
   * Ants Aasma (a...@cybertec.at) wrote:
It seems to me that when flushing logical mappings to disk, each
mapping file leaks the buffer used to pass the mappings to XLogInsert.
Also, it seems consistent to allocate that buffer in the RewriteState
memory context. Patch attached.
  
  Good catch. There's actually no need for explicitly using the context,
  we're in the appropriate one. The only other MemoryContextAlloc() caller
  in there should be converted to a palloc as well.
 
 Hrm..?  I don't think that's right when it's called from
 end_heap_rewrite().

You're right. Apprently I shouldn't look at patches while watching a
keynote ;)

 Perhaps we should be switching to state-rs_cxt
 while in end_heap_rewrite() also though?

I think just applying Aant's patch is fine.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-03-26 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 I've attached an updated invtrans_strictstrict_base patch which has the
 feature removed.

What is the state of play on this patch?  Is the latest version what's in
http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org
plus this sub-patch?  Is everybody reasonably happy with it?  I don't
see it marked ready for committer in the CF app, but time is running
out.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-03-26 Thread David Rowley
On Thu, Mar 27, 2014 at 7:33 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrowle...@gmail.com writes:
  I've attached an updated invtrans_strictstrict_base patch which has the
  feature removed.

 What is the state of play on this patch?  Is the latest version what's in

 http://www.postgresql.org/message-id/64f96fd9-64d1-40b9-8861-e61820292...@phlo.org
 plus this sub-patch?  Is everybody reasonably happy with it?  I don't
 see it marked ready for committer in the CF app, but time is running
 out.


As far as I know the only concern left was around the extra stats in the
explain output, which I removed in the patch I attached in the previous
email.

The invtrans_strictstrict_base.patch in my previous email replaces the
invtrans_strictstrict_base_038070.patch in that Florian sent here
http://www.postgresql.org/message-id/64F96FD9-64D1-40B9-8861-E6182029220B@phlo.orgall
of the other patches are unchanged so it's save to use Florian's
latest
ones

Perhaps Dean can confirm that there's nothing else outstanding?



 regards, tom lane



Re: [HACKERS] Minimum supported version of Python?

2014-03-26 Thread Tom Lane
I wrote:
 Andres Freund and...@2ndquadrant.com writes:
 If there's a refcounting bug inside python somewhere (which is easy to
 trigger in python's C interface), it could be excerbated by that change,
 since it frees/compiles functions more frequently. But I'd very much
 like more evidence of this...

 I think it's not a refcount issue, or at least not solely that.  As best
 I can tell, there's a stack clobber involved, because gdb can't make sense
 of the stack after the exception hits.  I've been trying to localize it
 more closely, but it's slow going because Apple's copy of python doesn't
 include debug symbols.

Fortunately, Apple still has the source code for that package archived at
www.opensource.apple.com, so after a bit of hair-pulling I was able to
build a version with debug symbols.  And (may I have the envelope please)
you're right: it is a refcount issue.  Take a look at what
PLy_modify_tuple does with plntup, and note that the TD[\new\] is not a
dictionary error is intentionally triggered by the plpython_trigger test.
So we have a prematurely-freed dictionary item apparently available for
recycling even though it's still part of the calling function's parsetree.
It's still like that in HEAD, too.

Will fix it shortly.  I wonder though if there are any more thinkos like
this one :-(

BTW, isn't plstr totally dead code?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why MarkBufferDirtyHint doesn't increment shared_blks_dirtied

2014-03-26 Thread Robert Haas
On Mon, Mar 24, 2014 at 9:02 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 MarkBufferDirty() always increment BufferUsage counter
 (shared_blks_dirtied) for dirty blocks whenever it dirties any
 block, whereas same is not true for MarkBufferDirtyHint().
 Is there any particular reason for not incrementing
 shared_blks_dirtied in MarkBufferDirtyHint()?

Hmm, I think that's a bug, dating back to this commit:

commit 2254367435fcc4a31cc3b6d8324e33c5c30f265a
Author: Robert Haas rh...@postgresql.org
Date:   Wed Feb 22 20:33:05 2012 -0500

Make EXPLAIN (BUFFERS) track blocks dirtied, as well as those written.

Also expose the new counters through pg_stat_statements.

Patch by me.  Review by Fujii Masao and Greg Smith.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Conditional jump or move depends on uninitialised value(s) within jsonfuncs.c

2014-03-26 Thread Peter Geoghegan
I found another bug of this category using Valgrind (memcheck). When
the standard regression tests are run against master's git tip, I see
the following:

2014-03-26 12:58:21.246 PDT 28882 LOG:  statement: select * from
json_to_record('{a:1,b:foo,c:bar}',true)
as x(a int, b text, d text);
==28882== Conditional jump or move depends on uninitialised value(s)
==28882==at 0x837610: populate_record_worker (jsonfuncs.c:2240)
==28882==by 0x836D42: json_to_record (jsonfuncs.c:2032)
==28882==by 0x650096: ExecMakeTableFunctionResult (execQual.c:2164)
==28882==by 0x6713D9: FunctionNext (nodeFunctionscan.c:94)
==28882==by 0x657799: ExecScanFetch (execScan.c:82)
==28882==by 0x657808: ExecScan (execScan.c:132)
==28882==by 0x67172E: ExecFunctionScan (nodeFunctionscan.c:266)
==28882==by 0x64C314: ExecProcNode (execProcnode.c:426)
==28882==by 0x64A08D: ExecutePlan (execMain.c:1475)
==28882==by 0x6481FC: standard_ExecutorRun (execMain.c:308)
==28882==by 0x648073: ExecutorRun (execMain.c:256)
==28882==by 0x7B837B: PortalRunSelect (pquery.c:946)
==28882==  Uninitialised value was created by a heap allocation
==28882==at 0x91CE4B: MemoryContextAlloc (mcxt.c:585)
==28882==by 0x8371F0: populate_record_worker (jsonfuncs.c:2157)
==28882==by 0x836D42: json_to_record (jsonfuncs.c:2032)
==28882==by 0x650096: ExecMakeTableFunctionResult (execQual.c:2164)
==28882==by 0x6713D9: FunctionNext (nodeFunctionscan.c:94)
==28882==by 0x657799: ExecScanFetch (execScan.c:82)
==28882==by 0x657808: ExecScan (execScan.c:132)
==28882==by 0x67172E: ExecFunctionScan (nodeFunctionscan.c:266)
==28882==by 0x64C314: ExecProcNode (execProcnode.c:426)
==28882==by 0x64A08D: ExecutePlan (execMain.c:1475)
==28882==by 0x6481FC: standard_ExecutorRun (execMain.c:308)
==28882==by 0x648073: ExecutorRun (execMain.c:256)

(It's worth noting that I pass --track-origins=yes to Valgrind to get
extra information about where the uninitialized memory came from when
this happens).

It looks to me like there is an oversight within
populate_record_worker() that sees not all codepaths initialize
my_extra's ColumnIOData array. Attached patch has this happen as part
of fcinfo-flinfo-fn_extra initialization, which results in all 4
such instances of this warning not appearing in subsequent runs
(ncolumns is also assigned).

There are still some other warnings not related to json appearing in
this set of results. By volume, they relate primarily to SP-GiST
picksplits, and GinFormTuple(). There was some discussion of at least
the former issue before [1], but I guess no one has since picked it
up.

The other things I see include:

==10833== 7 errors in context 2 of 2:
==10833== Invalid read of size 8
==10833==at 0x4C2E478: memcpy@@GLIBC_2.14 (mc_replace_strmem.c:882)
==10833==by 0x460496: heap_fill_tuple (heaptuple.c:248)
==10833==by 0x4617E4: heap_form_tuple (heaptuple.c:716)
==10833==by 0x6588B2: ExecCopySlotTuple (execTuples.c:573)
==10833==by 0x658BF9: ExecMaterializeSlot (execTuples.c:765)
==10833==by 0x66E91D: ExecInsert (nodeModifyTable.c:179)
==10833==by 0x66FCEF: ExecModifyTable (nodeModifyTable.c:1024)
==10833==by 0x64C242: ExecProcNode (execProcnode.c:377)
==10833==by 0x64A08D: ExecutePlan (execMain.c:1475)
==10833==by 0x6481FC: standard_ExecutorRun (execMain.c:308)
==10833==by 0x648073: ExecutorRun (execMain.c:256)
==10833==by 0x7B710C: ProcessQuery (pquery.c:185)
==10833==  Address 0x6ab9028 is 6,344 bytes inside a block of size 8,192 alloc'd
==10833==at 0x4C2C934: malloc (vg_replace_malloc.c:291)
==10833==by 0x91AA46: AllocSetAlloc (aset.c:853)
==10833==by 0x91D2A6: palloc (mcxt.c:657)
==10833==by 0x6841B4: initStringInfo (stringinfo.c:50)
==10833==by 0x7B5FD5: PostgresMain (postgres.c:3914)
==10833==by 0x738E4B: BackendRun (postmaster.c:4089)
==10833==by 0x73858B: BackendStartup (postmaster.c:3778)
==10833==by 0x734D06: ServerLoop (postmaster.c:1569)
==10833==by 0x734369: PostmasterMain (postmaster.c:1222)
==10833==by 0x696333: main (main.c:203)

and:

==10833== 1 errors in context 1 of 2:
==10833== Invalid read of size 8
==10833==at 0x4C2E39E: memcpy@@GLIBC_2.14 (mc_replace_strmem.c:882)
==10833==by 0x7FD1D7: datumCopy (datum.c:132)
==10833==by 0x7146EE: evaluate_expr (clauses.c:4583)
==10833==by 0x713AB3: evaluate_function (clauses.c:4122)
==10833==by 0x712F7D: simplify_function (clauses.c:3761)
==10833==by 0x710A29: eval_const_expressions_mutator (clauses.c:2455)
==10833==by 0x69B734: expression_tree_mutator (nodeFuncs.c:2508)
==10833==by 0x712999: eval_const_expressions_mutator (clauses.c:3414)
==10833==by 0x69B911: expression_tree_mutator (nodeFuncs.c:2557)
==10833==by 0x712999: eval_const_expressions_mutator (clauses.c:3414)
==10833==by 0x7103D7: eval_const_expressions (clauses.c:2297)
==10833==by 0x6F7EC7: 

Re: [HACKERS] Duplicated row after promote in synchronous streaming replication

2014-03-26 Thread Dang Minh Huong
2014/03/27 0:18、Thom Brown t...@linux.com のメッセージ:

 On 26 March 2014 15:08, Dang Minh Huong kakalo...@gmail.com wrote:
 Hi all,
 
 I'm using PostgreSQL 9.1.10 for my HA project and have found this problem.
 
 I did (multiple times) the following sequence in my primary/standby
 synchronous replication environment,
 
 1. Update rows in a table (which have primary key constraint column) in
 active DB
 
 2. Stop active DB
 
 3. Promote standby DB
 
 4. Confirm the updated table in promoted standby (new primary) and found
 that, there's a duplicate updated row (number of row was increased).
 
 I think it is a replication bug but wonder if it was fixed yet.
 Can somebody help me?
 
 I'm not yet confirm PostgreSQL source, but here is my investigation result.
 
 Updated table before promoted were HOT update (index file was not changed).
 
 After promote i continue update that duplicated row (it returned two row
 updated), and confirm with pg_filedump, i found the duplicated row and only
 one is related to primary key index constraint.
 
 Compare with old active DB, i saw that after promote line pointer of updated
 row (duplicated row) is broken into two line pointer, the new one is related
 to primary index constraint and the other is not related to. Some thing like
 below,
 
 Old active DB:
 ctid(0,3)-ctid(0,6)-ctid(0,7)
 
 New active DB (after promote and update):
 ctid(0,3)-ctid(0,9)
 ctid(0,7)-ctid(0,10)
 
 ctid(0,10) is not related to primary key index constraint.
 
 Is something was wrong in redo log in standby DB? Or line pointer in HOT
 update feature?
 
 It sounds like you're hitting a bug that was introduced in that
 exact minor version, and has since been fixed:
 
 http://www.postgresql.org/docs/9.1/static/release-9-1-11.html
 

Thanks for your prompt response. I will confirm and revision-up if it is needed.

 You should update to the latest minor version, then re-base your
 standbys from the primary.
 
 -- 
 Thom


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Conditional jump or move depends on uninitialised value(s) within jsonfuncs.c

2014-03-26 Thread Andrew Dunstan


On 03/26/2014 05:01 PM, Peter Geoghegan wrote:

I found another bug of this category using Valgrind (memcheck). When
the standard regression tests are run against master's git tip, I see
the following:

2014-03-26 12:58:21.246 PDT 28882 LOG:  statement: select * from
json_to_record('{a:1,b:foo,c:bar}',true)
as x(a int, b text, d text);



Thanks. Your fix committed.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of IF NOT EXISTS to others CREATE statements

2014-03-26 Thread Fabrízio de Royes Mello
On Sun, Mar 2, 2014 at 1:04 AM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:

 On Sat, Mar 1, 2014 at 7:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 
  =?ISO-8859-1?Q?Fabr=EDzio_de_Royes_Mello?= fabriziome...@gmail.com
writes:
   On Sat, Mar 1, 2014 at 2:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   [ re schema upgrade scenarios ]
   Why wouldn't COR semantics answer that requirement just as well, if
not
   better?
 
   Just because it will replace the object content... and in some cases
this
   cannot happen because it will regress the schema to an old version.
 
  That argument seems awfully darn flimsy.

 Sorry, I know my use case is very specific...

 We don't have this feature is a strong argument just because we can
implement COR instead? Or maybe just we don't want to add more complexity
to source code?

 The complexity to source code added by this feature is minimal, but the
result is very useful, and can be used for many tools (i.e. rails
migrations, python alembic, doctrine, and others)


  In any case, given the existence of DO it's simple to code up
  create-if-not-exists behavior with a couple lines of plpgsql; that seems
  to me to be a sufficient answer for corner cases.  create-or-replace is
  not equivalently fakable if the system doesn't supply the functionality.
 

 You are completely right.

 But we already have DROP ... IF EXISTS, then I think if we would have
CREATE ... IF NOT EXISTS (the inverse behavior) will be very natural...
and I agree in implement CREATE OR REPLACE too.


Hi all,

Sorry to return with this thread, but I think we missed something during
the review.

In 17th August 2013 [1] I added more code to patch [2]:

- CREATE SEQUENCE [ IF NOT EXISTS ]
- CREATE DOMAIN [ IF NOT EXISTS ]
- CREATE EVENT TRIGGER [ IF NOT EXISTS ]
- CREATE ROLE [ IF NOT EXISTS ]

Seems that no one reviewed this part or was rejected with others?

Regards,

[1] https://commitfest.postgresql.org/action/patch_view?id=1133
[2] http://www.postgresql.org/message-id/520fe6d4.8050...@timbira.com.br

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] doPickSplit stack buffer overflow in XLogInsert?

2014-03-26 Thread Peter Geoghegan
On Wed, Nov 27, 2013 at 9:10 AM, Noah Misch n...@leadboat.com wrote:
 The threat is that rounding the read size up to the next MAXALIGN would cross
 into an unreadable memory page, resulting in a SIGSEGV.  Every palloc chunk
 has MAXALIGN'd size under the hood, so the excess read of toDelete cannot
 cause a SIGSEGV.  For a stack variable, it depends on the ABI.  I'm not aware
 of an ABI where the four bytes past the end of this stack variable could be
 unreadable, which is not to claim I'm well-read on the topic.  We should fix
 this in due course on code hygiene grounds, but I would not back-patch it.

Attached patch silences the Invalid read of size n complaints of
Valgrind. I agree with your general thoughts around backpatching. Note
that the patch addresses a distinct complaint from Kevin's, as
Valgrind doesn't take issue with the invalid reads past the end of
spgxlogPickSplit variables on the stack.

-- 
Peter Geoghegan
*** a/src/backend/access/spgist/spgdoinsert.c
--- b/src/backend/access/spgist/spgdoinsert.c
*** moveLeafs(Relation index, SpGistState *s
*** 412,419 
  
  	/* Locate the tuples to be moved, and count up the space needed */
  	i = PageGetMaxOffsetNumber(current-page);
! 	toDelete = (OffsetNumber *) palloc(sizeof(OffsetNumber) * i);
! 	toInsert = (OffsetNumber *) palloc(sizeof(OffsetNumber) * (i + 1));
  
  	size = newLeafTuple-size + sizeof(ItemIdData);
  
--- 412,419 
  
  	/* Locate the tuples to be moved, and count up the space needed */
  	i = PageGetMaxOffsetNumber(current-page);
! 	toDelete = (OffsetNumber *) palloc0(MAXALIGN(sizeof(OffsetNumber) * i));
! 	toInsert = (OffsetNumber *) palloc0(MAXALIGN(sizeof(OffsetNumber) * (i + 1)));
  
  	size = newLeafTuple-size + sizeof(ItemIdData);
  
*** doPickSplit(Relation index, SpGistState
*** 722,731 
  	n = max + 1;
  	in.datums = (Datum *) palloc(sizeof(Datum) * n);
  	heapPtrs = (ItemPointerData *) palloc(sizeof(ItemPointerData) * n);
! 	toDelete = (OffsetNumber *) palloc(sizeof(OffsetNumber) * n);
! 	toInsert = (OffsetNumber *) palloc(sizeof(OffsetNumber) * n);
  	newLeafs = (SpGistLeafTuple *) palloc(sizeof(SpGistLeafTuple) * n);
! 	leafPageSelect = (uint8 *) palloc(sizeof(uint8) * n);
  
  	xlrec.node = index-rd_node;
  	STORE_STATE(state, xlrec.stateSrc);
--- 722,731 
  	n = max + 1;
  	in.datums = (Datum *) palloc(sizeof(Datum) * n);
  	heapPtrs = (ItemPointerData *) palloc(sizeof(ItemPointerData) * n);
! 	toDelete = (OffsetNumber *) palloc0(MAXALIGN(sizeof(OffsetNumber) * n));
! 	toInsert = (OffsetNumber *) palloc0(MAXALIGN(sizeof(OffsetNumber) * n));
  	newLeafs = (SpGistLeafTuple *) palloc(sizeof(SpGistLeafTuple) * n);
! 	leafPageSelect = (uint8 *) palloc0(MAXALIGN(sizeof(uint8) * n));
  
  	xlrec.node = index-rd_node;
  	STORE_STATE(state, xlrec.stateSrc);

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2014-03-26 Thread Etsuro Fujita
Hi Horiguchi-san,

(2014/03/26 17:14), Kyotaro HORIGUCHI wrote:
 The overall patch was applied on HEAD and compiled cleanly except
 for a warning.
 
 analyze.c: In function ‘acquire_inherited_sample_rows’:
 analyze.c:1461: warning: unused variable ‘saved_rel’

I've fixed this in the latest version (v8) of the patch.

 And for file-fdw, you made a change to re-create foreignscan node
 instead of the previous copy-and-modify. Is the reason you did it
 that you considered the cost of 're-checking whether to
 selectively perform binary conversion' is low enough? Or other
 reasons?

The reason is that we get the result of the recheck from
path-fdw_private.  Sorry, I'd forgotten it.  So, I modified the code to
simply call create_foreignscan_path().

 Finally, although I insist the necessity of the warning for child
 foreign tables on alter table, if you belive it to be put off,
 I'll compromise by putting a note to CF-app that last judgement
 is left to committer.

OK  So, if there are no objections of other, I'll mark this patch as
ready for committer and do that.

Thanks,

Best regards,
Etsuro Fujita


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] B-Tree support function number 3 (strxfrm() optimization)

2014-03-26 Thread Peter Geoghegan
I have thought a little further about my proposal to have inner B-Tree
pages store strxfrm() blobs [1]; my earlier conclusion was that this
could not be trusted when equality was indicated [2] due to the
Hungarian problem [3]. As I noted earlier, that actually doesn't
really matter, because we may have missed what's really so useful
about strxfrm() blobs: We *can* trust a non-zero result as a proxy for
what we would have gotten with a proper bttextcmp(), even if a
strcmp() only looks at the first byte of a blob for each of two text
strings being compared. Here is what the strxfrm() blob looks like for
some common English words when put through glibc's strxfrm() with the
en_US.UTF-8 collation (I wrote a wrapper function to do this and
output bytea a couple of years ago):

[local]/postgres=# select strxfrm_test('Yellow');
strxfrm_test

 \x241017171a220109090909090901100909090909
(1 row)

[local]/postgres=# select strxfrm_test('Red');
   strxfrm_test
--
 \x1d100f0109090901100909
(1 row)

[local]/postgres=# select strxfrm_test('Orange');
strxfrm_test

 \x1a1d0c1912100109090909090901100909090909
(1 row)

[local]/postgres=# select strxfrm_test('Green');
 strxfrm_test
--
 \x121d101019010909090909011009090909
(1 row)

[local]/postgres=# select strxfrm_test('White');
 strxfrm_test
--
 \x2213141f10010909090909011009090909
(1 row)

[local]/postgres=# select strxfrm_test('Black');
 strxfrm_test
--
 \x0d170c0e16010909090909011009090909
(1 row)

Obviously, all of these blobs, while much larger than the original
string still differ in their first byte. It's almost as if they're
intended to be truncated.

The API I envisage is a new support function 3 that operator class
authors may optionally provide. Within the support function a text
varlena argument is passed, or whatever the type that relates to the
opclass happens to be. It returns a Datum to the core system. That
datum is always what is actually passed to their sort support routine
(B-Tree support function number 2) iff a support function 3 is
provided (you must provide number 2 if you provide number 3, but not
vice-versa). In respect of support-function-3-providing opclasses, the
core system is entitled to take the sort support return value as a
proxy for what a proper support function 1 call would indicate iff the
sort support routine does not return 0 in respect of two
support-function-3 blobs (typically strxfrm() blobs truncated at 8
bytes for convenient storage as pass-by-value Datums). Otherwise, a
proper call to support function 1, with fully-formed text arguments is
required.

I see at least two compelling things we can do with these blobs:

1. Store them as pseudo-columns of inner B-Tree IndexTuples (not leaf
pages). Naturally, inner pages are very heterogeneous, so only having
8 bytes is very probably an excellent trade-off there. Typically 1-3%
of B-Tree pages are inner pages, so the bloat risk seems acceptable.

2. When building a SortTuple array within TupleSort, we can store far
more of these truncated blobs in memory than we can proper strings. if
SortTuple.datum1 (the first column to sort on among tuples being
sorted, which is currently stored in memory as an optimization) was
just an 8 byte truncated strxfrm() blob, and not a pointer to a text
string in memory, that would be pretty great for performance for
several reasons. So just as with B-Tree inner pages, for SortTuples
there can be a pseudo leading key - we need only compare additional
sort keys/heap_getattr() when we need a tie-breaker, when those 8
bytes aren't enough to reach a firm conclusion.

It doesn't just stop with strxfrm() blobs, though. Why couldn't we
create blobs that can be used as reliable proxies for numerics, that
are just integers? Sure, you need to reserve a bit to indicate an
inability to represent the original value, and possibly work out other
details like that, but the only negative thing you can say about that,
or indeed applying these techniques to any operator class is that it
might not help in certain worst cases (mostly contrived cases). Still,
the overhead of doing that bit of extra work is surely quite low
anyway - at worst, a extra few instructions wasted per comparison -
making these techniques likely quite profitable for the vast majority
of real-world applications.

Does anyone else want to pick this idea up and run with it? I don't
think I'll have time for it.

[1] 
http://www.postgresql.org/message-id/CAM3SWZTcXrdDZSpA11qZXiyo4_jtxwjaNdZpnY54yjzq7d64=a...@mail.gmail.com

[2] 
http://www.postgresql.org/message-id/cam3swzs7wewrbmrgci9_ycx49ug6ugqn2xngjg3zq5v8lbd...@mail.gmail.com

[3] 

Re: [HACKERS] Why MarkBufferDirtyHint doesn't increment shared_blks_dirtied

2014-03-26 Thread Amit Kapila
On Thu, Mar 27, 2014 at 1:39 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Mar 24, 2014 at 9:02 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 MarkBufferDirty() always increment BufferUsage counter
 (shared_blks_dirtied) for dirty blocks whenever it dirties any
 block, whereas same is not true for MarkBufferDirtyHint().
 Is there any particular reason for not incrementing
 shared_blks_dirtied in MarkBufferDirtyHint()?

 Hmm, I think that's a bug, dating back to this commit:

 commit 2254367435fcc4a31cc3b6d8324e33c5c30f265a

Right.
Do you think the fix attached in my previous mail is appropriate?
http://www.postgresql.org/message-id/CAA4eK1KQQSpNmfxg8Cg3-JZD23Q4Ee3iCsuLZGekH=dnagp...@mail.gmail.com


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minimum supported version of Python?

2014-03-26 Thread Peter Eisentraut
On Tue, 2014-03-18 at 18:37 -0400, Tom Lane wrote:
 After further experimentation, I've found that 2.3 does pass the
 regression
 tests if one installs the separately-available cdecimal module.  So my
 complaint reduces to the fact that our Requirements documentation
 doesn't mention the need for this.  Do you have an objection to adding
 something there?

You backpatched this change, but that can't be right, because the
feature that requires the cdecimal module was added in 9.4
(7919398bac8bacd75ec5d763ce8b15ffaaa3e071).



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Few new warnings have been introduced in windows build (jsonb_util.c)

2014-03-26 Thread Amit Kapila
Few new warnings have been introduced in windows build.


1e:\workspace\postgresql\master\postgresql\src\backend\utils\adt\jsonb_util.c(802):
warning C4715: 'JsonbIteratorNext' : not all control paths return a
value
1e:\workspace\postgresql\master\postgresql\src\backend\utils\adt\jsonb_util.c(1042):
warning C4715: 'JsonbDeepContains' : not all control paths return a
value
1e:\workspace\postgresql\master\postgresql\src\backend\utils\adt\jsonb_util.c(1175):
warning C4715: 'compareJsonbScalarValue' : not all control paths
return a value

These are quite similar to what we have fixed some time back.
Attached patch fixes these warnings. I am not sure about
fix of warning in 'JsonbIteratorNext', as there is no invalid value
for JSONB processing.



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


silence_warning_jsonb-v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers