Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Amit Kapila
On Tue, Feb 4, 2014 at 12:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Amit Kapila amit.kapil...@gmail.com writes:
 In the function where it is used, it seems to me that it sets DateStyle
 to ISO if it is not already ISO, and configure_remote_session() will set it
 to ISO initially. So basically even if the value of DateStyle is junk, it
 will just do what we wanted, i.e. set it to ISO. Is the failure due to the
 fact that it does not expect DateStyle to be ISO, and due to the junk value
 of this parameter it sets the same to ISO?

 Meh.  It might be that the DateStyle usage in postgres_fdw would
 accidentally fail to malfunction if it saw a bogus value of the variable.
 But it's hard to believe that this would be true of MainLWLockArray.

That's true, but for me it's failing because MainLWLockArray contains a
junk value. Now the point to look into is why it's passing on some of the
buildfarm machines. I will study this more; if you have anything specific
in mind, do let me know.

Can I get the details of the machine environments on which it is passing,
such as the Windows version and MSVC version? It might give some clue.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2014-02-04 Thread Christian Kruse
Hi,

On 04/02/14 12:38, Fujii Masao wrote:
 ISTM that the phrase "Request queue" is not used much around the lock.
 Using the phrase "wait queue" or Simon's suggestion sounds better, at
 least to me.
 Thought?

Sounds reasonable to me. Attached patch changes messages to the following:

Process holding the lock: A. Wait queue: B.
Processes holding the lock: A, B. Wait queue: C.
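For illustration, a soft-deadlock report would then come out roughly as
follows (PIDs invented; a sketch of the intended output, not a captured log):

    LOG:  process 12345 avoided deadlock for ShareLock on transaction 1023
          by rearranging queue order after 1000.045 ms
    DETAIL:  Processes holding the lock: 12337, 12341. Wait queue: 12345.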

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/doc/src/sgml/sources.sgml b/doc/src/sgml/sources.sgml
index 881b0c3..aa20807 100644
--- a/doc/src/sgml/sources.sgml
+++ b/doc/src/sgml/sources.sgml
@@ -251,6 +251,15 @@ ereport(ERROR,
</listitem>
<listitem>
 <para>
+ <function>errdetail_log_plural(const char *fmt_singular, const char
+ *fmt_plural, unsigned long n, ...)</function> is like
+ <function>errdetail_log</>, but with support for various plural forms of
+ the message.
+ For more information see <xref linkend="nls-guidelines">.
+</para>
+   </listitem>
+   <listitem>
+<para>
  <function>errhint(const char *msg, ...)</function> supplies an optional
  <quote>hint</quote> message; this is to be used when offering suggestions
  about how to fix the problem, as opposed to factual details about
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index fb449a8..9620f6e 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -1209,13 +1209,23 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 		 */
 		if (log_lock_waits && deadlock_state != DS_NOT_YET_CHECKED)
 		{
-			StringInfoData buf;
+			StringInfoData buf,
+		lock_waiters_sbuf,
+		lock_holders_sbuf;
 			const char *modename;
 			long		secs;
 			int			usecs;
 			long		msecs;
+			SHM_QUEUE  *procLocks;
+			PROCLOCK   *proclock;
+			bool		first_holder = true,
+		first_waiter = true;
+			int			lockHoldersNum = 0;
 
 			initStringInfo(&buf);
+			initStringInfo(&lock_waiters_sbuf);
+			initStringInfo(&lock_holders_sbuf);
+
 			DescribeLockTag(&buf, &locallock->tag.lock);
 			modename = GetLockmodeName(locallock->tag.lock.locktag_lockmethodid,
 	   lockmode);
@@ -1225,10 +1235,67 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
 			msecs = secs * 1000 + usecs / 1000;
 			usecs = usecs % 1000;
 
+			/*
+			 * we loop over the lock's procLocks to gather a list of all
+			 * holders and waiters. Thus we will be able to provide more
+			 * detailed information for lock debugging purposes.
+			 *
+			 * lock->procLocks contains all processes which hold or wait for
+			 * this lock.
+			 */
+
+			LWLockAcquire(partitionLock, LW_SHARED);
+
+			procLocks = &(lock->procLocks);
+			proclock = (PROCLOCK *) SHMQueueNext(procLocks, procLocks,
+												 offsetof(PROCLOCK, lockLink));
+
+			while (proclock)
+			{
+				/*
+				 * we are a waiter if myProc->waitProcLock == proclock; we are
+				 * a holder if it is NULL or something different
+				 */
+				if (proclock->tag.myProc->waitProcLock == proclock)
+				{
+					if (first_waiter)
+					{
+						appendStringInfo(&lock_waiters_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_waiter = false;
+					}
+					else
+						appendStringInfo(&lock_waiters_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+				}
+				else
+				{
+					if (first_holder)
+					{
+						appendStringInfo(&lock_holders_sbuf, "%d",
+										 proclock->tag.myProc->pid);
+						first_holder = false;
+					}
+					else
+						appendStringInfo(&lock_holders_sbuf, ", %d",
+										 proclock->tag.myProc->pid);
+
+					lockHoldersNum++;
+				}
+
+				proclock = (PROCLOCK *) SHMQueueNext(procLocks, &proclock->lockLink,
+													 offsetof(PROCLOCK, lockLink));
+			}
+
+			LWLockRelease(partitionLock);
+
 			if (deadlock_state == DS_SOFT_DEADLOCK)
 ereport(LOG,
 						(errmsg("process %d avoided deadlock for %s on %s by rearranging queue order after %ld.%03d ms",
-								MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.",
+											   "Processes holding the lock: %s. Wait queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			else if (deadlock_state == DS_HARD_DEADLOCK)
 			{
 /*
@@ -1240,13 +1307,19 @@ ProcSleep(LOCALLOCK *locallock, LockMethod lockMethodTable)
  */
 ereport(LOG,
 						(errmsg("process %d detected deadlock while waiting for %s on %s after %ld.%03d ms",
-								MyProcPid, modename, buf.data, msecs, usecs)));
+								MyProcPid, modename, buf.data, msecs, usecs),
+						 (errdetail_log_plural("Process holding the lock: %s. Wait queue: %s.",
+											   "Processes holding the lock: %s. Wait queue: %s.",
+											   lockHoldersNum, lock_holders_sbuf.data, lock_waiters_sbuf.data))));
 			}
 
 			if (myWaitStatus == STATUS_WAITING)
 ereport(LOG,
 						(errmsg("process %d still waiting for 

Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-02-04 Thread Andres Freund
On 2014-02-03 17:51:20 -0800, Peter Geoghegan wrote:
 On Sun, Feb 2, 2014 at 6:00 AM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-02-01 19:47:29 -0800, Peter Geoghegan wrote:
  Here are the results of a benchmark on Nathan Boley's 64-core, 4
  socket server: 
  http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/amd-4-socket-rwlocks/
 
  That's interesting. The maximum number of what you see here (~293125)
  is markedly lower than what I can get.
 
  ... poke around ...
 
  Hm, that's partially because you're using pgbench without -M prepared if
  I see that correctly. The bottleneck in that case is primarily memory
  allocation. But even after that I am getting higher
  numbers: ~342497.
 
  Trying to nail down the difference, it oddly seems to be your
  max_connections=80 vs my 100. The profile in both cases is markedly
  different, with way more spinlock contention at 80. All in
  Pin/UnpinBuffer().
 
 I updated this benchmark, with your BufferDescriptors alignment patch
 [1] applied on top of master (while still not using -M prepared in
 order to keep the numbers comparable). So once again, that's:
 
 http://postgres-benchmarks.s3-website-us-east-1.amazonaws.com/amd-4-socket-rwlocks/
 
 It made a bigger, fairly noticeable difference, but not so big a
 difference as you describe here. Are you sure that you saw this kind
 of difference with only 64 clients, as you mentioned elsewhere [1]
 (perhaps you fat-fingered [1] -- -cj is ambiguous)? Obviously
 max_connections is still 80 in the above. Should I have gone past 64
 clients to see the problem? The best numbers I see with the [1] patch
 applied on master are only ~327809 for -S 10, 64 clients. Perhaps I've
 misunderstood.

That's likely -M prepared.  It was with -c 64 -j 64...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Show process IDs of processes holding a lock; show relation and tuple infos of a lock to acquire

2014-02-04 Thread Rajeev rastogi

On 4th February 2014, Christian Kruse wrote:
 On 04/02/14 12:38, Fujii Masao wrote:
  ISTM that the phrase "Request queue" is not used much around the lock.
  Using the phrase "wait queue" or Simon's suggestion sounds better, at
  least to me.
  Thought?
 
 Sounds reasonable to me. Attached patch changes messages to the
 following:
 
 Process holding the lock: A. Wait queue: B.
 Processes holding the lock: A, B. Wait queue: C.

This looks good to me also.

Thanks and Regards,
Kumar Rajeev Rastogi

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Andres Freund
Hi,

On 2014-02-04 10:23:14 +0900, Michael Paquier wrote:
 On Tue, Feb 4, 2014 at 10:10 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Michael Paquier michael.paqu...@gmail.com writes:
  Please find attached a patch implementing lsn as a datatype, based on
  the one Robert wrote a couple of years ago.
 
  Patch contains regression tests as well as a bit of documentation.
  Perhaps this is too late for 9.4, so if there are no objections I'll
  simply add this patch to the next commit fest in June for 9.5.
 
  I may have lost count, but aren't a bunch of the affected functions new
  in 9.4?  If so, there's a good argument to be made that we should get
  this in now, rather than waiting and having an API change for those
  functions in 9.5.

Yes, that sounds sensible.

 + /*----------
 +  *  Relational operators for LSNs
 +  *----------*/

Isn't it just operators? They aren't really relational...


 *** 302,307  extern struct varlena *pg_detoast_datum_packed(struct 
 varlena * datum);
 --- 303,309 
   #define PG_RETURN_CHAR(x)    return CharGetDatum(x)
   #define PG_RETURN_BOOL(x)    return BoolGetDatum(x)
   #define PG_RETURN_OID(x)     return ObjectIdGetDatum(x)
 + #define PG_RETURN_LSN(x)     return LogSeqNumGetDatum(x)
   #define PG_RETURN_POINTER(x) return PointerGetDatum(x)
   #define PG_RETURN_CSTRING(x) return CStringGetDatum(x)
   #define PG_RETURN_NAME(x)    return NameGetDatum(x)
 *** a/src/include/postgres.h
 --- b/src/include/postgres.h
 ***
 *** 484,489  typedef Datum *DatumPtr;
 --- 484,503 
   #define ObjectIdGetDatum(X) ((Datum) SET_4_BYTES(X))
   
   /*
 +  * DatumGetLogSeqNum
 +  *  Returns log sequence number of a datum.
 +  */
 + 
 + #define DatumGetLogSeqNum(X) ((XLogRecPtr) GET_8_BYTES(X))

I am not a fan of LogSeqNum. I think at this point fewer people
understand that than LSN. There's also no reason to invent a third term
for LSNs. We'd have LSN, XLogRecPtr, and LogSeqNum.

 *** a/src/backend/replication/slotfuncs.c
 --- b/src/backend/replication/slotfuncs.c
 ***
 *** 141,148  pg_get_replication_slots(PG_FUNCTION_ARGS)
   boolactive;
   Oid database;
   const char *slot_name;
 - 
 - charrestart_lsn_s[MAXFNAMELEN];
   int i;
   
    SpinLockAcquire(&slot->mutex);
 --- 141,146 

Unrelated change.


Looks reasonable on a first look. Thanks!

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andres Freund
On 2014-02-04 02:10:47 -0500, Tom Lane wrote:
 Amit Kapila amit.kapil...@gmail.com writes:
  In the function where it is used, it seems to me that it sets DateStyle
  to ISO if it is not already ISO, and configure_remote_session() will set it
  to ISO initially. So basically even if the value of DateStyle is junk, it
  will just do what we wanted, i.e. set it to ISO. Is the failure due to the
  fact that it does not expect DateStyle to be ISO, and due to the junk value
  of this parameter it sets the same to ISO?
 
 Meh.  It might be that the DateStyle usage in postgres_fdw would
 accidentally fail to malfunction if it saw a bogus value of the variable.
 But it's hard to believe that this would be true of MainLWLockArray.

There's not that much lwlock usage in contrib. It's just
pg_stat_statements and pg_buffercache. Neither has tests... So it very
well could be that breakage simply hasn't been observed.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Michael Paquier
On Tue, Feb 4, 2014 at 6:15 PM, Andres Freund and...@2ndquadrant.com wrote:

 + /*----------
 +  *  Relational operators for LSNs
 +  *----------*/

 Isn't it just operators? They aren't really relational...
Operators for LSNs?

 *** 302,307  extern struct varlena *pg_detoast_datum_packed(struct 
 varlena * datum);
 --- 303,309 
    #define PG_RETURN_CHAR(x)    return CharGetDatum(x)
    #define PG_RETURN_BOOL(x)    return BoolGetDatum(x)
    #define PG_RETURN_OID(x)     return ObjectIdGetDatum(x)
  + #define PG_RETURN_LSN(x)     return LogSeqNumGetDatum(x)
    #define PG_RETURN_POINTER(x) return PointerGetDatum(x)
    #define PG_RETURN_CSTRING(x) return CStringGetDatum(x)
    #define PG_RETURN_NAME(x)    return NameGetDatum(x)
 *** a/src/include/postgres.h
 --- b/src/include/postgres.h
 ***
 *** 484,489  typedef Datum *DatumPtr;
 --- 484,503 
   #define ObjectIdGetDatum(X) ((Datum) SET_4_BYTES(X))

   /*
 +  * DatumGetLogSeqNum
 +  *  Returns log sequence number of a datum.
 +  */
 +
 + #define DatumGetLogSeqNum(X) ((XLogRecPtr) GET_8_BYTES(X))

  I am not a fan of LogSeqNum. I think at this point fewer people
 understand that than LSN. There's also no reason to invent a third term
 for LSNs. We'd have LSN, XLogRecPtr, and LogSeqNum.
So let's go with DatumGetLSN and LSNGetDatum instead...
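For concreteness, the renamed macros would presumably look like this (a
sketch based on the DatumGetLogSeqNum definition quoted above, assuming the
type stays an 8-byte pass-by-value datum):

    #define DatumGetLSN(X)   ((XLogRecPtr) GET_8_BYTES(X))
    #define LSNGetDatum(X)   ((Datum) SET_8_BYTES(X))

    #define PG_GETARG_LSN(n) DatumGetLSN(PG_GETARG_DATUM(n))
    #define PG_RETURN_LSN(x) return LSNGetDatum(x)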

 *** a/src/backend/replication/slotfuncs.c
 --- b/src/backend/replication/slotfuncs.c
 ***
 *** 141,148  pg_get_replication_slots(PG_FUNCTION_ARGS)
   boolactive;
   Oid database;
   const char *slot_name;
 -
 - charrestart_lsn_s[MAXFNAMELEN];
   int i;

    SpinLockAcquire(&slot->mutex);
 --- 141,146 

 Unrelated change.
Funnily, the patch attached in my previous mail did not include all
the diffs; it is an error with filterdiff, which I use to generate
context diff patches... My original branch includes the following
diffs as well in slotfuncs.c for the second patch:
diff --git a/src/backend/replication/slotfuncs.c
b/src/backend/replication/slotfuncs.c
index 98a860e..68ecdcd 100644
--- a/src/backend/replication/slotfuncs.c
+++ b/src/backend/replication/slotfuncs.c
@@ -141,8 +141,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
boolactive;
Oid database;
const char *slot_name;
-
-   charrestart_lsn_s[MAXFNAMELEN];
int i;

SpinLockAcquire(&slot->mutex);
@@ -164,9 +162,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)

memset(nulls, 0, sizeof(nulls));

-   snprintf(restart_lsn_s, sizeof(restart_lsn_s), "%X/%X",
-(uint32) (restart_lsn >> 32), (uint32) restart_lsn);
-
i = 0;
values[i++] = CStringGetTextDatum(slot_name);
if (database == InvalidOid)
@@ -180,7 +175,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
else
nulls[i++] = true;
if (restart_lsn != InvalidTransactionId)
-   values[i++] = CStringGetTextDatum(restart_lsn_s);
+   values[i++] = restart_lsn;
else
nulls[i++] = true;

Anything else?
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Andres Freund
On 2014-02-04 19:17:51 +0900, Michael Paquier wrote:
 On Tue, Feb 4, 2014 at 6:15 PM, Andres Freund and...@2ndquadrant.com wrote:
 
  + /*----------
  +  *  Relational operators for LSNs
  +  *----------*/
 
  Isn't it just operators? They aren't really relational...
 Operators for LSNs?

Fine with me.

  + #define DatumGetLogSeqNum(X) ((XLogRecPtr) GET_8_BYTES(X))
 
  I am not a fan of LogSeqNum. I think at this point fewer people
  understand that than LSN. There's also no reason to invent a third term
  for LSNs. We'd have LSN, XLogRecPtr, and LogSeqNum.
 So let's go with DatumGetLSN and LSNGetDatum instead...

Sup.

  *** a/src/backend/replication/slotfuncs.c
  --- b/src/backend/replication/slotfuncs.c
  ***
  *** 141,148  pg_get_replication_slots(PG_FUNCTION_ARGS)
boolactive;
Oid database;
const char *slot_name;
  -
  - charrestart_lsn_s[MAXFNAMELEN];
int i;
 
 	SpinLockAcquire(&slot->mutex);
  --- 141,146 
 
  Unrelated change.
 Funnily, the patch attached in my previous mail did not include all
 the diffs; it is an error with filterdiff, which I use to generate
 context diff patches... My original branch includes the following

Ah, then it makes more sense.

 diffs as well in slotfuncs.c for the second patch:
 diff --git a/src/backend/replication/slotfuncs.c
 b/src/backend/replication/slotfuncs.c
 index 98a860e..68ecdcd 100644
 --- a/src/backend/replication/slotfuncs.c
 +++ b/src/backend/replication/slotfuncs.c
 @@ -141,8 +141,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 boolactive;
 Oid database;
 const char *slot_name;
 -
 -   charrestart_lsn_s[MAXFNAMELEN];
 int i;
 
 SpinLockAcquire(&slot->mutex);
 @@ -164,9 +162,6 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 
 memset(nulls, 0, sizeof(nulls));
 
 -   snprintf(restart_lsn_s, sizeof(restart_lsn_s), "%X/%X",
 -(uint32) (restart_lsn >> 32), (uint32) restart_lsn);
 -
 i = 0;
 values[i++] = CStringGetTextDatum(slot_name);
 if (database == InvalidOid)
 @@ -180,7 +175,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 else
 nulls[i++] = true;
 if (restart_lsn != InvalidTransactionId)
 -   values[i++] = CStringGetTextDatum(restart_lsn_s);
 +   values[i++] = restart_lsn;
 else
 nulls[i++] = true;

Isn't that missing a LSNGetDatum()? Also, isn't it lacking the
corresponding pg_proc change?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] PostgreSQL fails to start on Windows if it crashes after tablespace creation

2014-02-04 Thread MauMau
I'm sorry, I'm replying to an older mail, because I lost your latest mail by 
mistake.



Ah. Sorry, I missed that part. As NTFS junctions and symbolic links are
different (although they behave similarly), there seems to be only a minor
inconvenience related to a misleading error message, i.e.


You are right.  Fixed.

Regards
MauMau


remove_tblspc_symlink_v4.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] postgres.exe fails to start on Windows Server 2012 due to ASLR

2014-02-04 Thread MauMau

From: Craig Ringer cr...@2ndquadrant.com

I completely agree; just saying that any installer can set the key.

I'm convinced that setting this flag is appropriate, at least while Pg
relies on having the shared memory segment mapped in the same zone in
every executable. Just pointing out that there's a workaround in the
mean time.


Please don't mind, I didn't misunderstand your intent.  I think we should 
apply this in the next minor release to avoid unnecessary confusion -- more 
new users will use PostgreSQL on Windows 8/2012 and hit this problem.


I added this patch to the CommitFest.

Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2014-02-04 Thread Robert Haas
On Sun, Feb 2, 2014 at 10:15 PM, Etsuro Fujita
fujita.ets...@lab.ntt.co.jp wrote:
 Allowing ALTER COLUMN SET STORAGE on foreign tables would make sense if, for
 example, SELECT * INTO local_table FROM foreign_table created a new
 local table whose columns have the storage types associated with those of the
 foreign table?

Seems like a pretty weak argument.  It's not that we can't find
strange corner cases where applying SET STORAGE to a foreign table
does something; it's that they *are* strange corner cases.  The
options as we normally understand them just aren't sensible in
this context, and a good deal of work has been put into an alternative
options framework, which is what authors of FDWs ought to be using.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] postgres.exe fails to start on Windows Server 2012 due to ASLR

2014-02-04 Thread Craig Ringer
On 02/04/2014 07:28 PM, MauMau wrote:

 
 Please don't mind, I didn't misunderstand your intent.  I think we
 should apply this in the next minor release to avoid unnecessary
 confusion -- more new users would use PostgreSQL on Windows 8/2012 and
 hit this problem.
 
 I added this patch to the CommitFest.

It's really a bugfix suitable for backpatching IMO.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Michael Paquier
On Tue, Feb 4, 2014 at 7:22 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-04 19:17:51 +0900, Michael Paquier wrote:
 @@ -180,7 +175,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
 else
 nulls[i++] = true;
 if (restart_lsn != InvalidTransactionId)
 -   values[i++] = CStringGetTextDatum(restart_lsn_s);
 +   values[i++] = restart_lsn;
 else
 nulls[i++] = true;

 Isn't that missing a LSNGetDatum()?
Oops yes. Will fix.

 Also, isn't it lacking the corresponding pg_proc change?
restart_lsn is the 6th argument of pg_get_replication_slots, and the
list of arguments of this function is already changed like that in my
patch:
{25,25,26,16,28,25} => {25,25,26,16,28,3220}
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Andres Freund
On 2014-02-04 21:04:13 +0900, Michael Paquier wrote:
 On Tue, Feb 4, 2014 at 7:22 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-02-04 19:17:51 +0900, Michael Paquier wrote:
  @@ -180,7 +175,7 @@ pg_get_replication_slots(PG_FUNCTION_ARGS)
  else
  nulls[i++] = true;
  if (restart_lsn != InvalidTransactionId)
  -   values[i++] = CStringGetTextDatum(restart_lsn_s);
  +   values[i++] = restart_lsn;
  else
  nulls[i++] = true;
 
  Isn't that missing a LSNGetDatum()?
 Oops yes. Will fix.
 
  Also, isn't it lacking the corresponding pg_proc change?
 restart_lsn is the 6th argument of pg_get_replication_slots, and the
 list of arguments of this function is already changed like that in my
 patch:
 {25,25,26,16,28,25} => {25,25,26,16,28,3220}
 Regards,

Ok.

I think the patch should also adapt pageinspect...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Misaligned BufferDescriptors causing major performance problems on AMD

2014-02-04 Thread Andres Freund
On 2014-02-04 00:38:19 +0100, Andres Freund wrote:
   A quick hack (attached) making BufferDescriptor 64-byte aligned indeed
   restored performance across all max_connections settings. It's not
   surprising that a misaligned buffer descriptor causes problems -
   there'll be plenty of false sharing of the spinlocks otherwise. Curious
   that the Intel machine isn't hurt much by this.
 
  I think that is explained here:
 
  http://www.agner.org/optimize/blog/read.php?i=142&v=t
 
  With Sandy Bridge, "Misaligned memory operands [are] handled efficiently."
 
 No, I don't think so. Those improvements afair refer to unaligned
 accesses as in accessing a 4-byte variable at address % 4 != 0.

So, Christian did some benchmarking on the Intel machine, and his
results were also lower than mine, and I've since confirmed that it's
also possible to reproduce the alignment problems on the Intel machine.

Which imo means fixing this got more important...
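(To make the kind of fix being discussed concrete, here is a minimal sketch
of one way to force the alignment; the union and accessor names are
hypothetical, not taken from the posted patch:)

    /* Hypothetical sketch: pad each buffer descriptor to a full cache line
     * so that two descriptors' spinlocks can never share a line, no matter
     * where ShmemAlloc() happens to place the array. */
    #define CACHE_LINE_SIZE 64

    typedef union BufferDescPadded
    {
        BufferDesc  desc;
        char        pad[CACHE_LINE_SIZE];  /* assumes sizeof(BufferDesc) <= 64 */
    } BufferDescPadded;

    #define GetBufferDescriptor(id) (&BufferDescriptors[(id)].desc)

The array itself would additionally have to start on a 64-byte boundary for
the padding to line up.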

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] specifying repeatable read in PGOPTIONS

2014-02-04 Thread Andres Freund
Hi,

I recently had the need to bury the used isolation level in the
connection string, but it turns out that doesn't work that well...

PGOPTIONS='-c default_transaction_isolation=serializable' \
psql ... -c "SHOW default_transaction_isolation"
works well enough, but
PGOPTIONS='-c default_transaction_isolation=repeatable read' \
psql ... -c "SHOW default_transaction_isolation"
doesn't, because of the whitespace. I couldn't come up with any adequate
quoting.

I'd like to propose adding aliases with dashes instead of spaces to the
isolation_level_options array. I'd even like to backport it, because this
makes benchmarking across versions unnecessarily hard.

Additionally we might want to think about a bit better quoting support
for such options?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] could not create IPv6 socket (AI_ADDRCONFIG)

2014-02-04 Thread Alvaro Herrera
Tom Lane wrote:
 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
  Hello, I have often seen inquiries about a log message from the
  PostgreSQL server.
  LOG:  could not create IPv6 socket: Address family not supported by 
  protocol
 
 That's merely a harmless log message.


 If we're concerned about users worrying about log messages from
 this, I'd rather see us downgrade those log messages to DEBUG level
 than risk breaking the code with behaviors that were proven to be
 a bad idea a decade ago.  But TBH I see no strong need to do anything
 here.

How about just adding a HINT?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] could not create IPv6 socket (AI_ADDRCONFIG)

2014-02-04 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 Hello, I have often seen inquiries about a log message from the
 PostgreSQL server.
 LOG:  could not create IPv6 socket: Address family not supported by protocol

 That's merely a harmless log message.

 How about just adding a HINT?

Hmm ... maybe, but how would you phrase the hint exactly?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-02-04 02:10:47 -0500, Tom Lane wrote:
 Meh.  It might be that the DateStyle usage in postgres_fdw would
 accidentally fail to malfunction if it saw a bogus value of the variable.
 But it's hard to believe that this would be true of MainLWLockArray.

 There's not that much lwlock usage in contrib. It's just
 pg_stat_statements and pg_buffercache. Neither has tests... So it very
 well could be that breakage simply hasn't been observed.

Hm, you're right --- I'd have thought there were more of those.

Ugh.  This problem was bad enough when I thought that it would only lead
to link-time errors detectable in the buildfarm.  If it can lead to errors
only observable at runtime --- and maybe not obvious even then --- then
I think we *have to* do something about it.  By that I mean that we must
get rid of the need to manually plaster PGDLLIMPORT on global variables.

Anybody with a Windows build environment want to test the #define extern
trick?
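(For readers without the full context: the trick under discussion is roughly
the following, sketched and untested; it relies on the fact that a
self-referential macro is not re-expanded, so the definition is legal C.)

    /* hypothetical sketch: when compiling code outside the backend on
     * Windows, make every extern variable declaration in the headers carry
     * __declspec(dllimport) without annotating each one by hand */
    #if defined(WIN32) && !defined(BUILDING_DLL)
    #define extern extern PGDLLIMPORT
    #endif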

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 10:43 AM, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-02-04 02:10:47 -0500, Tom Lane wrote:

Meh.  It might be that the DateStyle usage in postgres_fdw would
accidentally fail to malfunction if it saw a bogus value of the variable.
But it's hard to believe that this would be true of MainLWLockArray.

There's not that much lwlock usage in contrib. It's just
pg_stat_statements and pg_buffercache. Neither has tests... So it very
well could be that breakage simply hasn't been observed.

Hm, you're right --- I'd have thought there were more of those.

Ugh.  This problem was bad enough when I thought that it would only lead
to link-time errors detectable in the buildfarm.  If it can lead to errors
only observable at runtime --- and maybe not obvious even then --- then
I think we *have to* do something about it.  By that I mean that we must
get rid of the need to manually plaster PGDLLIMPORT on global variables.

Anybody with a Windows build environment want to test the #define extern
trick?

We have details on how to build with Mingw/Msys on Windows on an Amazon 
VM (http://wiki.postgresql.org/wiki/Building_With_MinGW), which is either 
free or very cheap. Do I need to give instructions on how to do this for 
MSVC builds too? It's really not terribly hard.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [doc patch] extra_float_digits and casting from real to numeric

2014-02-04 Thread Christoph Berg
Re: To Tom Lane 2014-01-08 20140108094017.ga20...@msgid.df7cb.de
 What about this patch to mention this gotcha more explicitly in the
 documentation?
 
 diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
 new file mode 100644
 index 0386330..968f4a7
 *** a/doc/src/sgml/datatype.sgml
 --- b/doc/src/sgml/datatype.sgml
 *** NUMERIC
 *** 689,694 
 --- 689,697 
<literal>0</literal>, the output is the same on every platform
supported by PostgreSQL.  Increasing it will produce output that
more accurately represents the stored value, but may be unportable.
+   Casts to other numeric datatypes and the <literal>to_char</literal>
+   function are not affected by this setting; it affects only the text
+   representation.
</para>
  </note>
   

Anyone for that patch?

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] specifying repeatable read in PGOPTIONS

2014-02-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 PGOPTIONS='-c default_transaction_isolation=serializable' \
 psql ... -c "SHOW default_transaction_isolation"
 works well enough, but
 PGOPTIONS='-c default_transaction_isolation=repeatable read' \
 psql ... -c "SHOW default_transaction_isolation"
 doesn't, because of the whitespace. I couldn't come up with any adequate
 quoting.

 I'd like to propose adding aliases with dashes instead of spaces to the
 isolation_level_options array. I'd even like to backport it, because this
 makes benchmarking across versions unnecessarily hard.

-1.  This is not a general solution to the problem.  There are other
GUCs for which people might want spaces in the value.

 Additionally we might want to think about a bit better quoting support
 for such options?

Yeah.  See pg_split_opts(), which explicitly acknowledges that it'll fall
down for space-containing options.  Not sure what the most appropriate
quoting convention would be there, but I'm sure we can think of something.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andres Freund
On February 4, 2014 5:06:52 PM CET, Andrew Dunstan and...@dunslane.net wrote:

On 02/04/2014 10:43 AM, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2014-02-04 02:10:47 -0500, Tom Lane wrote:
 Meh.  It might be that the DateStyle usage in postgres_fdw would
 accidentally fail to malfunction if it saw a bogus value of the variable.
 But it's hard to believe that this would be true of MainLWLockArray.
 There's not that much lwlock usage in contrib. It's just
 pg_stat_statements and pg_buffercache. Neither has tests... So it very
 well could be that breakage simply hasn't been observed.
 Hm, you're right --- I'd have thought there were more of those.

 Ugh.  This problem was bad enough when I thought that it would only lead
 to link-time errors detectable in the buildfarm.  If it can lead to errors
 only observable at runtime --- and maybe not obvious even then --- then
 I think we *have to* do something about it.  By that I mean that we must
 get rid of the need to manually plaster PGDLLIMPORT on global variables.

 Anybody with a Windows build environment want to test the #define extern
 trick?

We have details on how to build with Mingw/Msys on Windows on an Amazon
VM (http://wiki.postgresql.org/wiki/Building_With_MinGW), which is either
free or very cheap. Do I need to give instructions on how to do this for
MSVC builds too? It's really not terribly hard.

Err. It might not be very hard, but it certainly is time consuming. And that 
for people who don't care about Windows.

If there were usable, regularly refreshed instances out there, it'd be 
slightly less bad. But this is still by far the most annoying and intrusive 
platform to care about.

Andres


-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Erik Rijkers
CentOS Release 6.5 (final)
AMD FX(tm)-8120 Eight-Core
2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 x86_64 
x86_64 GNU/Linux
memory: 8GB

I am testing nested hstore, on a server with both with these patches:
  jsonb-9.patch.gz
  nested-hstore-9.patch.gz

One of the first tries brings down the server (gobbles up too much memory, I 
think).

When I run:


select version();

drop table if exists t_1000 ;
create table t_1000 (
    id serial
  , hs hstore primary key
  , hs2 hstore
);
insert into t_1000 (hs,hs2)
  select
     ( '[' || i || ','  || i || ']' )::hstore
  ,  ( '{' || i || '=>' || i || '}' )::hstore
  from generate_series(1, 1000) as f(i)
;
-


I get:

$ time psql -af nestedhs.sql
\timing on
Timing is on.
select version();
 version
--
 PostgreSQL 9.4devel_nested_hstore_20140204_0814_00d4f2a on 
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.2, 64-bit
(1 row)

Time: 1.288 ms
drop   table if exists t_1000 ;
DROP TABLE
Time: 0.808 ms
create table t_1000 (
    id serial
  , hs hstore primary key
  , hs2 hstore
);
CREATE TABLE
Time: 111.397 ms
insert into t_1000 (hs,hs2)
  select
    ( '[' || i || ','  || i || ']' )::hstore
  , ( '{' || i || '=>' || i || '}' )::hstore
  from generate_series(1, 1000) as f(i)
;
psql:nestedhs.sql:14: connection to server was lost

real5m4.780s
user0m0.005s
sys 0m0.009s

logging:

2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was 
terminated by signal 9: Killed
2014-02-04 10:34:25.854 CET 29133 DETAIL:  Failed process was running: insert 
into t_1000 (hs,hs2)
select
 ( '[' || i || ','  || i || ']' )::hstore
  ,  ( '{' || i || '=>' || i || '}' )::hstore
from generate_series(1, 1000) as f(i)
;
2014-02-04 10:34:25.884 CET 29133 LOG:  terminating any other active server 
processes
2014-02-04 10:34:28.541 CET 29133 LOG:  all server processes terminated; 
reinitializing
2014-02-04 10:34:30.002 CET 29534 LOG:  database system was interrupted; last 
known up at 2014-02-04 10:30:42 CET
2014-02-04 10:34:30.933 CET 29535 FATAL:  the database system is in recovery 
mode
2014-02-04 10:34:31.150 CET 29534 LOG:  database system was not properly shut 
down; automatic recovery in progress
2014-02-04 10:34:31.344 CET 29534 LOG:  redo starts at 1/B1CC92F8
2014-02-04 10:34:46.681 CET 29534 LOG:  unexpected pageaddr 1/86F4A000 in log 
segment 0000000100000001000000CC, offset
16031744
2014-02-04 10:34:46.681 CET 29534 LOG:  redo done at 1/CCF49F50
2014-02-04 10:34:52.039 CET 29133 LOG:  database system is ready to accept 
connections



(and btw, I end up with a large but unusable table:

testdb=# \dt+ t_1000
   List of relations
 Schema |  Name  | Type  |  Owner   |  Size  | Description
--------+--------+-------+----------+--------+-------------
 public | t_1000 | table | aardvark | 291 MB |
(1 row)

testdb=# select count(*) from t_1000;
 count
---
 0
(1 row)

)

Main .conf settings:

setting  |   current_setting
--------------------------+----------------------------------------------
 autovacuum   | off
 port | 6541
 shared_buffers   | 512MB
 effective_cache_size | 2GB
 work_mem | 50MB
 maintenance_work_mem | 1GB
 checkpoint_segments  | 20
 server_version   | 9.4devel_nested_hstore_20140204_0814_00d4f2a
 pg_postmaster_start_time | 2014-02-04 10:12 (uptime: 0d 0h 33m 42s)
 data_checksums   | off

 feature_id |    feature_name    | is_supported | is_verified_by | comments
------------+--------------------+--------------+----------------+----------
 PKG100 | project name   | YES  | ej | nested_hstore
 PKG101 | patched| YES  | ej | YES
 PKG102 | patch file | YES  | ej |
/home/aardvark/download/pgpatches/0094/nested_hstore/20140130/jsonb-9.patch 
   +
||  ||
/home/aardvark/download/pgpatches/0094/nested_hstore/20140130/nested-hstore-9.patch
 PKG103 | build time | YES  | ej | 2014-02-04 
08:19:13.600371+01
 PKG104 | server_version | YES  | ej | 
9.4devel_nested_hstore_20140204_0814_00d4f2a
 PKG105 | server_version_num | YES  | ej | 90400
 PKG106 | port   | YES  | ej | 6541
 PKG110 | commit hash| YES  | ej 

Re: [HACKERS] specifying repeatable read in PGOPTIONS

2014-02-04 Thread Andres Freund
On 2014-02-04 11:36:22 -0500, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  PGOPTIONS='-c default_transaction_isolation=serializable' \
  psql ... -c "SHOW default_transaction_isolation"
  works well enough, but
  PGOPTIONS='-c default_transaction_isolation=repeatable read' \
  psql ... -c "SHOW default_transaction_isolation"
  doesn't, because of the whitespace. I couldn't come up with any adequate
  quoting.
 
  I'd like to propose adding aliases with dashes instead of spaces to the
  isolation_level_options array. I'd even like to backport it, because this
  makes benchmarking across versions unnecessarily hard.
 
 -1.  This is not a general solution to the problem.  There are other
 GUCs for which people might want spaces in the value.

Sure, I didn't say it was. But I don't see any other values that are
likely being passed via PGOPTIONS that frequently contain spaces. Sure,
you can generate a search_path that does so, but that's just asking for
problems. Most other GUCs that can contain spaces are
PGC_SIGHUP/POSTMASTER.
And having to use quoting just makes it awkward to use from shell. Since
all the other option values take care not to force using spaces, I see
little reason not to do so here as well.

  Additionally we might want to think about a bit better quoting support
  for such options?
 
 Yeah.  See pg_split_opts(), which explicitly acknowledges that it'll fall
 down for space-containing options.  Not sure what the most appropriate
 quoting convention would be there, but I'm sure we can think of something.

No argument against introducing it. What about simply allowing escaping
of the next character using \?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extension_control_path

2014-02-04 Thread David E. Wheeler
On Jan 30, 2014, at 10:06 AM, Sergey Muraviov sergey.k.murav...@gmail.com 
wrote:

 Now it looks fine for me.

Just as another data point, I recently submitted pgTAP to the Homebrew 
project. This is the build-from-source system for OS X, used by a lot of web 
developers. In my build script, I originally had

   depends_on :postgresql

Which means, “require any version of PostgreSQL.” But then tests failed on OS X 
Server, which includes a system-distributed PostgreSQL. Homebrew installs 
everything in /usr/local, and not only does it disallow installing anything 
outside of that directory, it doesn’t have any permissions to do so. The 
install failed, of course, because extensions want to install in 
$PGROOT/share/extensions. For now, I had to change it to

   depends_on 'postgresql'

A subtle difference that means, “require the latest version of the 
Homebrew-built PostgreSQL in /usr/local.”

However, if extension_control_path was supported, I could change it back to 
requiring any Postgres and install pgTAP somewhere under /usr/local, as 
required for Homebrew. Then all the user would have to do to use it with their 
preferred Postgres would be to set extension_control_path.

In other words, I am strongly in favor of this patch, as it gives distribution 
systems a lot more flexibility (for better and for worse) in determining where 
extensions should be installed.
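For instance, a Homebrew-installed extension could then be picked up by any
Postgres with a postgresql.conf line along these lines (a hypothetical value;
the exact GUC syntax is whatever the patch settles on):

    extension_control_path = '/usr/local/share/postgresql/extension:$system'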

My $0.02.

Best,

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] specifying repeatable read in PGOPTIONS

2014-02-04 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-02-04 11:36:22 -0500, Tom Lane wrote:
 -1.  This is not a general solution to the problem.  There are other
 GUCs for which people might want spaces in the value.

 Sure, I didn't say it was. But I don't see any oother values that are
 likely being passed via PGOPTIONS that frequently contain spaces.

application_name --- weren't we just reading about people passing entire
command lines there?  (They must be using some other way of setting it
currently, but PGOPTIONS doesn't seem like an implausible source.)

 Yeah.  See pg_split_opts(), which explicitly acknowledges that it'll fall
 down for space-containing options.  Not sure what the most appropriate
 quoting convention would be there, but I'm sure we can think of something.

 No argument against introducing it. What about simply allowing escaping
 of the next character using \?

The same thought had occurred to me.  Since it'll typically already be
inside some levels of quoting, any quoted-string convention seems like
it'd be a pain to use.  But a straight backslash-escapes-the-next-char
thing wouldn't be too awful, I think.
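To illustrate the convention, an escape-aware splitter in the style of
pg_split_opts() could look roughly like this (a hypothetical sketch, not the
actual backend function; a fixed-size buffer stands in for the backend's
string handling, and argv is assumed large enough):

    #include <ctype.h>
    #include <stdlib.h>
    #include <string.h>

    /*
     * Sketch: whitespace separates arguments, and a backslash makes the
     * following character literal, so "repeatable\ read" stays one word.
     */
    static void
    split_opts(char **argv, int *argcp, const char *optstr)
    {
        while (*optstr)
        {
            char    buf[256];
            int     len = 0;

            while (isspace((unsigned char) *optstr))
                optstr++;               /* skip separating whitespace */
            if (*optstr == '\0')
                break;

            while (*optstr)
            {
                if (*optstr == '\\' && optstr[1] != '\0')
                    optstr++;           /* drop backslash, keep next char */
                else if (isspace((unsigned char) *optstr))
                    break;              /* unescaped space ends the argument */
                if (len < (int) sizeof(buf) - 1)
                    buf[len++] = *optstr;
                optstr++;
            }
            buf[len] = '\0';
            argv[(*argcp)++] = strdup(buf);     /* caller frees */
        }
    }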

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 11:30 AM, Andres Freund wrote:

We have details on how to build with Mingw/Msys on Windows on an Amazon

VM http://wiki.postgresql.org/wiki/Building_With_MinGW which is
either
free or very cheap. Do I need to give instructions on how to do this
for
MSVC builds too? It's really not terribly hard.

Err. It might not be very hard, but it certainly is time consuming. And that 
for people who don't care about Windows.

If there were usable, regularly refreshed instances out there, it'd be 
slightly less bad. But this is still by far the most annoying and intrusive 
platform to care about.

If someone volunteered to pay for the storage, I'd be prepared to make 
some time to create an AMI to reduce the startup time dramatically. 
Basically it would be "boot the AMI and start testing your patches." I'd 
even make it as friendly as possible for people who don't like to get 
too far from unix-ish environments.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 If someone volunteered to pay for the storage, I'd be prepared to make 
 some time to create an AMI to reduce the startup time dramatically. 
 Basically it would be boot the AMI and start testing your patches. I'd 
 even make it as friendly as possible for people who don't like to get 
 too far from unix-ish environments.

My own opinion is that I've already wasted untold man-hours thanks to
the random porting problems induced by Windows, a platform that I never
have and never will care about personally.  I will *not* spend my own
time doing tests that someone else could do.  If we can't get some
effort contributed by someone who does use that platform, I'm personally
prepared to declare the entire damn thing no longer supported.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Joshua D. Drake


On 02/04/2014 09:34 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

If someone volunteered to pay for the storage, I'd be prepared to make
some time to create an AMI to reduce the startup time dramatically.
Basically it would be boot the AMI and start testing your patches. I'd
even make it as friendly as possible for people who don't like to get
too far from unix-ish environments.


My own opinion is that I've already wasted untold man-hours thanks to
the random porting problems induced by Windows, a platform that I never
have and never will care about personally.  I will *not* spend my own
time doing tests that someone else could do.  If we can't get some
effort contributed by someone who does use that platform, I'm personally
prepared to declare the entire damn thing no longer supported.


Although that is obviously your prerogative, it is important to remember 
that Windows is easily the second most-used PostgreSQL platform out 
there (behind Linux).


I know many people that run it on Windows, especially in the medical 
field. I also know many people that embed it (Apple notwithstanding).


Yes, it is an obnoxious platform, but it is THE platform, no matter how 
much we like to tout our Linux(Hiker) creds.


Sincerely,

JD


--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Jeff Janes
On Tue, Feb 4, 2014 at 8:06 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 02/04/2014 10:43 AM, Tom Lane wrote:


 Ugh.  This problem was bad enough when I thought that it would only lead
 to link-time errors detectable in the buildfarm.  If it can lead to errors
 only observable at runtime --- and maybe not obvious even then --- then
 I think we *have to* do something about it.  By that I mean that we must
 get rid of the need to manually plaster PGDLLIMPORT on global variables.

 Anybody with a Windows build environment want to test the #define extern
 trick?






 We have details on how to build with Mingw/Msys on Windows on an Amazon VM
 http://wiki.postgresql.org/wiki/Building_With_MinGW which is either
 free or very cheap. Do I need to give instructions on how to do this for
 MSVC builds too? It's really not terribly hard.



If you gave step-by-step instructions like the ones for MinGW, I would at
least give it a try.  Last time I looked into it, I gave up after I
couldn't figure out which of the umpteen similarly-named products was the
one I needed to buy/download-for-free/register-and-download and then
install, and I tried a few of them at random without much success.

Cheers,

Jeff


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Robert Haas
On Tue, Feb 4, 2014 at 12:39 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 Now there is approximately a 1.4~5% CPU gain for the
 "hundred tiny fields, half nulled" case

I don't want to advocate too strongly for this patch because, number
one, Amit is a colleague and more importantly, number two, I can't
claim to be an expert on compression.  But that having been said, I
think these numbers are starting to look awfully good.  The only
remaining regressions are in the cases where a large fraction of the
tuple turns over, and they're not that big even then.  The two *worst*
tests now seem to be "hundred tiny fields, all changed" and "hundred
tiny fields, half changed".  For the "all changed" case, the median
unpatched time is 16.3172590732574 and the median patched time is
16.9294109344482, a 4% loss; for the "half changed" case, the median
unpatched time is 16.5795118808746 and the median patched time is
17.0454230308533, a 3% loss.  Both cases show minimal change in WAL
volume.

Meanwhile, in friendlier cases, like "one short and one long field, no
change", we're seeing big improvements.  That particular case shows a
speedup of 21% and a WAL reduction of 36%.  That's a pretty big deal,
and I think not unrepresentative of many real-world workloads.  Some
might well do better, having either more or longer unchanged fields.
Assuming that the logic isn't buggy, a point in need of further study,
I'm starting to feel like we want to have this.  And I might even be
tempted to remove the table-level off switch.
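For readers following along, the heart of the delta encoding is cheap to
sketch: find how many leading and trailing bytes the old and new tuple
versions share, and WAL-log only the middle. (A simplified illustration of
that idea, not the patch's actual code:)

    /* Compute the shared prefix/suffix of two tuple images so that only
     * the differing middle bytes need to go into the WAL record. */
    static void
    delta_bounds(const char *oldp, int oldlen,
                 const char *newp, int newlen,
                 int *prefixlen, int *suffixlen)
    {
        int     minlen = (oldlen < newlen) ? oldlen : newlen;
        int     p = 0, s = 0;

        while (p < minlen && oldp[p] == newp[p])
            p++;                        /* shared leading bytes */
        while (s < minlen - p &&
               oldp[oldlen - 1 - s] == newp[newlen - 1 - s])
            s++;                        /* shared trailing bytes */

        *prefixlen = p;
        *suffixlen = s;
    }

A fully-changed tuple degenerates to roughly the old WAL volume plus a little
bookkeeping and the cost of the comparison scan, which is consistent with the
small regressions quoted above.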

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Christian Kruse
Hi,

On 04/02/14 17:41, Erik Rijkers wrote:

 2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was 
 terminated by signal 9: Killed

Did you check if this was the OOM killer? Should be logged in dmesg.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgpfuIWxnh8kv.pgp
Description: PGP signature


Re: [HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Erik Rijkers
On Tue, February 4, 2014 18:56, Christian Kruse wrote:
 Hi,

 On 04/02/14 17:41, Erik Rijkers wrote:

 2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was 
 terminated by signal 9: Killed

 Did you check if this was the OOM killer? Should be logged in dmesg.


I would be surprised if it wasn't. (no access to that machine at the moment)

How do we regard such crashes?  It seems to me this was rather easily 
'provoked' (for want of a better word).

I am inclined to blame the patch...

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Joshua D. Drake j...@commandprompt.com writes:
 On 02/04/2014 09:34 AM, Tom Lane wrote:
 My own opinion is that I've already wasted untold man-hours thanks to
 the random porting problems induced by Windows, a platform that I never
 have and never will care about personally.  I will *not* spend my own
 time doing tests that someone else could do.  If we can't get some
 effort contributed by someone who does use that platform, I'm personally
 prepared to declare the entire damn thing no longer supported.

 Although that is obviously your prerogative it is important to remember 
 that Windows is easily the second most used version of PostgreSQL out 
 there (behind Linux).

[ shrug... ]  If it's so widely used, why is it so hard to find somebody
who's willing to put in some development effort for it?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Bruce Momjian
On Tue, Feb  4, 2014 at 01:28:38PM -0500, Robert Haas wrote:
 Meanwhile, in friendlier cases, like one short and one long field, no
 change, we're seeing big improvements.  That particular case shows a
 speedup of 21% and a WAL reduction of 36%.  That's a pretty big deal,
 and I think not unrepresentative of many real-world workloads.  Some
 might well do better, having either more or longer unchanged fields.
 Assuming that the logic isn't buggy, a point in need of further study,
 I'm starting to feel like we want to have this.  And I might even be
 tempted to remove the table-level off switch.

Does this feature relate to compression of WAL page images at all?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Andres Freund
On 2014-02-04 14:09:57 -0500, Bruce Momjian wrote:
 On Tue, Feb  4, 2014 at 01:28:38PM -0500, Robert Haas wrote:
  Meanwhile, in friendlier cases, like one short and one long field, no
  change, we're seeing big improvements.  That particular case shows a
  speedup of 21% and a WAL reduction of 36%.  That's a pretty big deal,
  and I think not unrepresentative of many real-world workloads.  Some
  might well do better, having either more or longer unchanged fields.
  Assuming that the logic isn't buggy, a point in need of further study,
  I'm starting to feel like we want to have this.  And I might even be
  tempted to remove the table-level off switch.
 
 Does this feature relate to compression of WAL page images at all?

No.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PQputCopyData dont signal error

2014-02-04 Thread Jeff Davis
On Thu, 2011-04-14 at 10:50 +0300, Heikki Linnakangas wrote:
 On 14.04.2011 10:15, Pavel Stehule wrote:
  Hello
 
  I have a problem with PQputCopyData function. It doesn't signal some error.
 
  while ((row = mysql_fetch_row(res)) != NULL)
  {
  snprintf(buffer, sizeof(buffer), "%s%s\n", row[0], row[1]);
  copy_result = PQputCopyData(pconn, buffer, strlen(buffer));
  printf("%s\n", PQerrorMessage(pconn));
  printf("%d\n", copy_result);
  if (copy_result != 1)
  {
          fprintf(stderr, "Copy to target table failed: %s",
                  PQerrorMessage(pconn));
          EXIT;
  }
  }
 
  it returns 1 for broken values too :(
 
  Is some special check necessary?
 
 The way COPY works is that PQputCopyData just sends the data to the 
 server, and the server will buffer it in its internal buffer and 
 processes it when it feels like it. The PQputCopyData() calls don't even 
 need to match line boundaries.
 
 I think you'll need to send all the data and finish the COPY until you 
 get an error. If you have a lot of data to send, you might want to slice 
 it into multiple COPY statements of say 50MB each, so that you can catch 
 errors in between.

[ replying to old thread ]

According to the protocol docs[1]:

In the event of a backend-detected error during copy-in mode (including
receipt of a CopyFail message), the backend will issue an ErrorResponse
message. If the COPY command was issued via an extended-query message,
the backend will now discard frontend messages until a Sync message is
received, then it will issue ReadyForQuery and return to normal
processing. If the COPY command was issued in a simple Query message,
the rest of that message is discarded and ReadyForQuery is issued. In
either case, any subsequent CopyData, CopyDone, or CopyFail messages
issued by the frontend will simply be dropped.

If the remaining CopyData messages are dropped, I don't see why
PQputCopyData can't return some kind of error indicating that further
CopyData messages are useless so that it can stop sending them.

Asking the client to break the copy into multiple COPY commands is bad,
because then the client needs to figure out the line breaks, which is a
burden in many cases.

Certainly we don't want to *guarantee* that the backend will issue an
error at any particular point, because of the buffering on the server
side. But from a practical standpoint, the server will let the client
know fairly quickly and it will avoid a lot of client-side work and
network traffic.
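
For concreteness, a minimal sketch (using only existing libpq calls) of
how a client learns of the failure today -- only after terminating the
COPY and collecting the result, however early the server actually
rejected the data:

    #include <stdio.h>
    #include <libpq-fe.h>

    /* After the PQputCopyData loop: end the COPY and fetch the result.
     * Only at this point does a backend-detected error become visible. */
    static int
    finish_copy(PGconn *conn)
    {
        PGresult   *res;
        int         ok = 1;

        if (PQputCopyEnd(conn, NULL) != 1)
            return 0;               /* connection-level trouble */

        while ((res = PQgetResult(conn)) != NULL)
        {
            if (PQresultStatus(res) != PGRES_COMMAND_OK)
            {
                fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
                ok = 0;
            }
            PQclear(res);
        }
        return ok;
    }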

Would a change to PQputCopyData be welcome?

Regards,
Jeff Davis

[1] http://www.postgresql.org/docs/9.3/static/protocol-flow.html





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Tom Lane
Erik Rijkers e...@xs4all.nl writes:
 On Tue, February 4, 2014 18:56, Christian Kruse wrote:
 Did you check if this was the OOM killer? Should be logged in dmesg.

 I would be surprised if it wasn't. (no access to that machine at the moment)

 How do we regard such crashes?  It seems to me this was rather easily
 'provoked' (for want of a better word).

Well, it suggests that there may be a memory leak, which would be a bug
even though we'd assign it lower priority than a true crash.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Adrian Klaver

On 02/04/2014 10:53 AM, Tom Lane wrote:

Joshua D. Drake j...@commandprompt.com writes:

On 02/04/2014 09:34 AM, Tom Lane wrote:

My own opinion is that I've already wasted untold man-hours thanks to
the random porting problems induced by Windows, a platform that I never
have and never will care about personally.  I will *not* spend my own
time doing tests that someone else could do.  If we can't get some
effort contributed by someone who does use that platform, I'm personally
prepared to declare the entire damn thing no longer supported.



Although that is obviously your prerogative it is important to remember
that Windows is easily the second most used version of PostgreSQL out
there (behind Linux).


[ shrug... ]  If it's so widely used, why is it so hard to find somebody
who's willing to put in some development effort for it?


Well, from what I have seen of the Windows world there is a fairly sharp 
demarcation between users and developers. So use does not necessarily 
mean developer knowledge.





regards, tom lane





--
Adrian Klaver
adrian.kla...@gmail.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 01:53 PM, Tom Lane wrote:

Joshua D. Drake j...@commandprompt.com writes:

On 02/04/2014 09:34 AM, Tom Lane wrote:

My own opinion is that I've already wasted untold man-hours thanks to
the random porting problems induced by Windows, a platform that I never
have and never will care about personally.  I will *not* spend my own
time doing tests that someone else could do.  If we can't get some
effort contributed by someone who does use that platform, I'm personally
prepared to declare the entire damn thing no longer supported.

Although that is obviously your prerogative it is important to remember
that Windows is easily the second most used version of PostgreSQL out
there (behind Linux).

[ shrug... ]  If it's so widely used, why is it so hard to find somebody
who's willing to put in some development effort for it?




I suspect that the open hostility towards it doesn't help much. I do put 
in quite a lot of effort supporting it one way and another, committing
patches and running buildfarm members among other things. And we have 
several others who put in significant amounts of effort. And no, it's 
not my platform of choice either.


Anyway, I will explore my suggestion upthread of a pre-prepared windows 
development AMI, when I have time. Right now between work for actual 
money and things like jsonb I am slammed.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] nested hstore - large insert crashes server

2014-02-04 Thread Stephen Frost
* Erik Rijkers (e...@xs4all.nl) wrote:
 On Tue, February 4, 2014 18:56, Christian Kruse wrote:
  On 04/02/14 17:41, Erik Rijkers wrote:
  2014-02-04 10:34:25.376 CET 29133 LOG:  server process (PID 29459) was 
  terminated by signal 9: Killed
 
  Did you check if this was the OOM killer? Should be logged in dmesg.
 
 I would be surprised if it wasn't. (no access to that machine at the moment)
 
 How do we regard such crashes?  It seems to me this was rather easily
 'provoked' (for want of a better word).
 
 I am inclined to blame the patch...

It sounds like there is at least some investigation which should happen
here to see why we're using so much memory (well beyond work_mem and
even maint_work_mem it sounds like), but it would also be good to have
the machine reconfigured to not allow OOM killing.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] PQputCopyData dont signal error

2014-02-04 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 On Thu, 2011-04-14 at 10:50 +0300, Heikki Linnakangas wrote:
 I think you'll need to send all the data and finish the COPY until you 
 get an error. If you have a lot of data to send, you might want to slice 
 it into multiple COPY statements of say 50MB each, so that you can catch 
 errors in between.

 If the remaining CopyData messages are dropped, I don't see why
 PQputCopyData can't return some kind of error indicating that further
 CopyData messages are useless so that it can stop sending them.

An error from PQputCopyData indicates a connection-level problem,
typically; that is it means that it couldn't send data not that the server
had detected a problem with some earlier data.  I'd not be surprised
if applications would respond to such an error indication by just
pre-emptively dropping the connection.  (A quick trawl through our
own code shows a precedent in pg_restore, in fact.)  So the idea
of having PQputCopyData start failing as soon as an error has arrived
from the server doesn't sound that attractive.

What'd be safer is to provide a way of detecting whether an error
has arrived (without actually consuming it, of course) so that the
case could be handled by adding something like

if (PQerrorIsPending(conn))
break;

to the send-data loop.  This would allow the application code to know
what is happening.  It would also not impose the penalty of checking
for errors on apps that prefer to optimize for the no-error case.
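
To make that concrete, here is a sketch of such a send loop.  Caution:
PQerrorIsPending() is the hypothetical probe proposed above -- it does
not exist in libpq today -- and next_row() merely stands in for whatever
produces the COPY text:

    #include <string.h>
    #include <libpq-fe.h>

    extern int PQerrorIsPending(PGconn *conn);      /* hypothetical */
    extern const char *next_row(void);              /* illustrative data source */

    static void
    send_rows(PGconn *conn)
    {
        const char *line;

        while ((line = next_row()) != NULL)
        {
            /* Peek for a buffered ErrorResponse without consuming it */
            if (PQerrorIsPending(conn))
                break;          /* server already failed; stop generating data */
            if (PQputCopyData(conn, line, (int) strlen(line)) != 1)
                break;          /* connection-level failure */
        }
        /* PQputCopyEnd() and PQgetResult() still report the actual error */
    }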

A different approach would be for PQputCopyData to just internally
suppress sending of further CopyData messages, *without* returning
an error; this would be transparent and it would save the network
traffic at least.  However, my reading of the original example is
that the OP was at least as interested in suppressing the creation
of further data as in suppressing the sending, so this wouldn't
really solve his problem completely.  And it's also optimizing
for the error case not the no-error case, in a way that doesn't give
the app any say in the matter.  So I'm not for this particularly,
just mentioning it for completeness.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Christian Kruse
Hi,

I'm doing some benchmarks regarding this problem: one set with
baseline and one set with your patch. Machine was a 32 core machine (4
CPUs with 8 cores), 252 GiB RAM. Both versions have the type align
patch applied. pgbench-tools config:

SCALES=100
SETCLIENTS="1 4 8 16 32 48 64 96 128"
SETTIMES=2

I added -M prepared to the pgbench call in the benchwarmer script.

The read-only tests are finished; I come to similar results as yours:

http://wwwtech.de/pg/benchmarks-lwlock-read-only/

I think the small differences are caused by the fact that I use TCP
connections and not Unix domain sockets.

The results are pretty impressive… I will post the read-write results
as soon as they are finished.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgpKL7vRo_Vp0.pgp
Description: PGP signature


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
christ...@2ndquadrant.com wrote:
 I'm doing some benchmarks regarding this problem: one set with
 baseline and one set with your patch. Machine was a 32 core machine (4
 CPUs with 8 cores), 252 GiB RAM. Both versions have the type align
 patch applied.

It certainly seems as if the interesting cases are where clients > cores.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
christ...@2ndquadrant.com wrote:
 I added -M prepared to the pgbench call in the benchwarmer script.

 The read-only tests are finished; I come to similar results as yours:

 http://wwwtech.de/pg/benchmarks-lwlock-read-only/

Note that Christian ran this test with max_connections=201, presumably
to exercise the alignment problem.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Andres Freund
On 2014-02-04 11:48:14 -0800, Peter Geoghegan wrote:
 On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
 christ...@2ndquadrant.com wrote:
  I added -M prepared to the pgbench call in the benchwarmer script.
 
  The read-only tests are finished; I come to similar results as yours:
 
  http://wwwtech.de/pg/benchmarks-lwlock-read-only/
 
 Note that Christian ran this test with max_connections=201, presumably
 to exercise the alignment problem.

I think he has applied the patch to hack around the alignment issue I
pushed to git for both branches. It's not nice enough to be applied yet,
but it should fix the issue.
I think the 201 is just a leftover from debugging the issue.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:50 AM, Andres Freund and...@2ndquadrant.com wrote:
 I think he has applied the patch to hack around the alignment issue I
 pushed to git for both branches. It's not nice enough to be applied yet,
 but it should fix the issue.
 I think the 201 is just a leftover from debugging the issue.

I guess that given that *both* cases tested had the patch applied,
that makes sense. However, I would have liked to see a real master
baseline.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Joshua D. Drake


On 02/04/2014 11:17 AM, Andrew Dunstan wrote:


prepared to declare the entire damn thing no longer supported.

Although that is obviously your prerogative it is important to remember
that Windows is easily the second most used version of PostgreSQL out
there (behind Linux).

[ shrug... ]  If it's so widely used, why is it so hard to find somebody
who's willing to put in some development effort for it?




Because Windows developers get paid and have better things to do than 
interact with a bunch of Open Source people and their opinions. Let's be 
honest, all of us here are not the easiest to work with and we are here 
largely due to an ideology. Why would a Windows developer bother?


I know lots of Windows Developers and most of them don't follow anywhere 
near the ideology we do. I know very few (read 0) Windows developers 
that do it for fun (I am not saying they don't have fun. I am saying 
they don't do it for fun). They do it for their career.


Heck, look back through the years at our archives. Nobody really thought 
that we would somehow bring windows developers into the fold if we 
ported to that platform. It was a market share play and it worked. 
However, now we are paying the price for it.


To be perfectly frank, the first thing I do when a customer calls and 
says we are on Windows is tell them I will spend our relationship 
pushing them to move PostgreSQL to Linux. It works over time for the
most part, but not with all of them.


Sincerely,

Joshua D. Drake




--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
christ...@2ndquadrant.com wrote:
 I'm doing some benchmarks regarding this problem: one set with
 baseline and one set with your patch. Machine was a 32 core machine (4
 CPUs with 8 cores), 252 GiB RAM.

What CPU model? Can you post /proc/cpuinfo? The distinction between
logical and physical cores matters here.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Andres Freund
On February 4, 2014 8:53:36 PM CET, Peter Geoghegan p...@heroku.com wrote:
On Tue, Feb 4, 2014 at 11:50 AM, Andres Freund and...@2ndquadrant.com
wrote:
 I think he has applied the patch to hack around the alignment issue I
 pushed to git for both branches. It's not nice enough to be applied
yet,
 but it should fix the issue.
 I think the 201 is just a leftover from debugging the issue.

I guess that given that *both* cases tested had the patch applied,
that makes sense. However, I would have liked to see a real master
baseline.

Christian, could you rerun with master (the commit the branch is based on),
the alignment patch, and then the lwlock patch? Best with max_connections >
200.
That's probably more important than the write tests as a first step..

Thanks,
Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Jeff Janes
On Tue, Feb 4, 2014 at 9:26 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 02/04/2014 11:30 AM, Andres Freund wrote:




 We have details on how to build with Mingw/Msys on Windows on an Amazon
 VM http://wiki.postgresql.org/wiki/Building_With_MinGW which is either
 free or very cheap. Do I need to give instructions on how to do this for
 MSVC builds too? It's really not terribly hard.

 Err. It might not be very hard but it certainly is time consuming. And
 that for people not caring about windows.

 If there were usable, regularly refreshed instances out there, it'd be
 slightly less bad. But this is still by far the most annoying and intrusive
 platform to care about.



 If someone volunteered to pay for the storage, I'd be prepared to make
 some time to create an AMI to reduce the startup time dramatically.
 Basically it would be boot the AMI and start testing your patches. I'd
 even make it as friendly as possible for people who don't like to get too
 far from unix-ish environments.


Do you know roughly what it would cost?  Could official community funds be
used for it?  (It seems like something that is cheap, but which you wouldn't
want to have forgotten about some month.)

Having an AMI would help, but even with an AMI in place, MinGW is still
insanely slow.  Running make on already made PostgreSQL (so there was
nothing to actually do) takes 1.5 minutes.  And a make after a make clean
takes half an hour.  This is on an actual desktop, not an AWS micro
instance.  So doing a git bisect is just painful.  Is the MSVC build faster?

Cheers,

Jeff


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Christian Kruse
Hi,

On 04/02/14 12:02, Peter Geoghegan wrote:
 On Tue, Feb 4, 2014 at 11:39 AM, Christian Kruse
 christ...@2ndquadrant.com wrote:
  I'm doing some benchmarks regarding this problem: one set with
  baseline and one set with your patch. Machine was a 32 core machine (4
  CPUs with 8 cores), 252 gib RAM.
 
 What CPU model? Can you post /proc/cpuinfo? The distinction between
 logical and physical cores matters here.

model name  : Intel(R) Xeon(R) CPU E5-4620 0 @ 2.20GHz

32 physical cores, 64 logical cores. /proc/cpuinfo is attached.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgpZpWnKfQtb4.pgp
Description: PGP signature


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Robert Haas
On Tue, Feb 4, 2014 at 12:34 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 If someone volunteered to pay for the storage, I'd be prepared to make
 some time to create an AMI to reduce the startup time dramatically.
 Basically it would be boot the AMI and start testing your patches. I'd
 even make it as friendly as possible for people who don't like to get
 too far from unix-ish environments.

 My own opinion is that I've already wasted untold man-hours thanks to
 the random porting problems induced by Windows, a platform that I never
 have and never will care about personally.  I will *not* spend my own
 time doing tests that someone else could do.  If we can't get some
 effort contributed by someone who does use that platform, I'm personally
 prepared to declare the entire damn thing no longer supported.

You know, I would really prefer to just stick a PGDLLIMPORT on this
place and any others that need it, and any others that come up, than
turn this into a political football.  Having to sprinkle PGDLLIMPORT
on the handful of variables that are accessed by contrib modules is,
indeed, annoying.  But it's not any more annoying than twelve other
things that I have to do as a committer, like remembering to bump
whichever of catversion, the xlog page magic, and the control data
version are relevant to a particular commit, knowing which particular
SGML files have to build standalone, and enforcing the project's code
formatting, comment, and documentation conventions on everyone who
submits a patch.  So acting as if this particular annoyance is somehow
unique or a good reason to desupport what is despite all of our
misgivings one of our most popular platforms doesn't impress me one
bit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Peter Eisentraut
Perhaps this type should be called pglsn, since it's an
implementation-specific detail and not a universal concept like int,
point, or uuid.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Viability of text HISTORY/INSTALL/regression README files (was Re: [COMMITTERS] pgsql: Document a few more regression test hazards.)

2014-02-04 Thread Robert Haas
On Tue, Feb 4, 2014 at 1:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Noah Misch n...@leadboat.com writes:
 Robert Haas robertmh...@gmail.com writes:
 I wonder if these standalone things are really worthwhile.

 I wonder how difficult it would be to make sufficient link data available
 when building the standalone files.  There would be no linking per se; we would
 just need the referent's text fragment emitted where the xref tag appears.

 IIRC, that's basically what the workaround is, except it's not very
 automated.  Even if it were automated, though, there's still a problem:
 such links aren't really *useful* in flat text format.  I think that
 forcing the author to actually think about what to put there in the
 flat text version is a good thing, if we're going to retain the flat
 text version at all.

Right.  I mean, a lot of the links say things like Section 26.2
which obviously makes no sense in a standalone text file.

I agree with your comments upthread: INSTALL *might* still be useful
to somebody, but I would be pretty surprised if anyone uses HISTORY or
regress_README for anything any more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Christian Kruse
Hi,

On 04/02/14 21:03, Andres Freund wrote:
 Christian, could you rerun with master (the commit on which the
 branch is based on), the alignment patch, and then the lwlock patch?
 Best with max_connections 200.  That's probably more important than
 the write tests as a first step..

Ok, benchmark for baseline+alignment patch is running. This will take
a couple of hours and since I have to get up at about 05:00 I won't be
able to post it before tomorrow.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



pgpsN5kcUiOgQ.pgp
Description: PGP signature


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 03:08 PM, Jeff Janes wrote:



Do you know about what it would cost?  Could official community funds 
be used for it (it seems like something that is cheap, but which you 
wouldn't want to be forgotten about some month.)


Having an AMI would help, but even with an AMI in place, MinGW is 
still insanely slow.  Running make on already made PostgreSQL (so 
there was nothing to actually do) takes 1.5 minutes.  And a make after 
a make clean takes half an hour.  This is on an actual desktop, not 
an AWS micro instance.  So doing a git bisect is just painful.  Is the 
MSVC build faster?






Would have to check with the same build options (cassert and debug have 
major timing effects.) I agree it's not lightning fast like make -j 4 
on a decent linux box.


cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part2: fast scan

2014-02-04 Thread Alexander Korotkov
On Mon, Feb 3, 2014 at 6:31 PM, Alexander Korotkov aekorot...@gmail.com wrote:

 On Mon, Jan 27, 2014 at 7:30 PM, Alexander Korotkov 
 aekorot...@gmail.com wrote:

 On Mon, Jan 27, 2014 at 2:32 PM, Alexander Korotkov aekorot...@gmail.com
  wrote:

 On Sun, Jan 26, 2014 at 8:14 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:

 On 01/26/2014 08:24 AM, Tomas Vondra wrote:

 Hi!

 On 25.1.2014 22:21, Heikki Linnakangas wrote:

 Attached is a new version of the patch set, with those bugs fixed.


 I've done a bunch of tests with all the 4 patches applied, and it seems
 to work now. I've done tests with various conditions (AND/OR, number of
 words, number of conditions) and I so far I did not get any crashes,
 infinite loops or anything like that.

 I've also compared the results to 9.3 - by dumping the database and
 running the same set of queries on both machines, and indeed I got 100%
 match.

 I also did some performance tests, and that's when I started to worry.

 For example, I generated and ran 1000 queries that look like this:

   SELECT id FROM messages
    WHERE body_tsvector @@
          to_tsquery('english', '(header & 53 & 32 & useful & dropped)')
    ORDER BY ts_rank(body_tsvector,
          to_tsquery('english', '(header & 53 & 32 & useful & dropped)')) DESC;

 i.e. in this case the query always was 5 words connected by AND. This
 query is a pretty common pattern for fulltext search - sort by a list of
 words and give me the best ranked results.

 On 9.3, the script was running for ~23 seconds, on patched HEAD it was
 ~40. It's perfectly reproducible, I've repeated the test several times
 with exactly the same results. The test is CPU bound, there's no I/O
 activity at all. I got the same results with more queries (~100k).

 Attached is a simple chart with x-axis used for durations measured on
 9.3.2, y-axis used for durations measured on patched HEAD. It's obvious
 a vast majority of queries is up to 2x slower - that's pretty obvious
 from the chart.

 Only about 50 queries are faster on HEAD, and 700 queries are more than
 50% slower on HEAD (i.e. if the query took 100ms on 9.3, it takes 150ms
 on HEAD).

 Typically, the EXPLAIN ANALYZE looks something like this (on 9.3):

   http://explain.depesz.com/s/5tv

 and on HEAD (same query):

   http://explain.depesz.com/s/1lI

 Clearly the main difference is in the Bitmap Index Scan which takes
 60ms on 9.3 and 120ms on HEAD.

 On 9.3 the perf top looks like this:

  34.79%  postgres [.] gingetbitmap
  28.96%  postgres [.] ginCompareItemPointers
   9.36%  postgres [.] TS_execute
   5.36%  postgres [.] check_stack_depth
   3.57%  postgres [.] FunctionCall8Coll

 while on 9.4 it looks like this:

  28.20%  postgres [.] gingetbitmap
  21.17%  postgres [.] TS_execute
   8.08%  postgres [.] check_stack_depth
   7.11%  postgres [.] FunctionCall8Coll
   4.34%  postgres [.] shimTriConsistentFn

 Not sure how to interpret that, though. For example where did the
 ginCompareItemPointers go? I suspect it's thanks to inlining, and that
 it might be related to the performance decrease. Or maybe not.


 Yeah, inlining makes it disappear from the profile, and spreads that
 time to the functions calling it.

 The profile tells us that the consistent function is called a lot more
 than before. That is expected - with the fast scan feature, we're calling
 consistent not only for potential matches, but also to refute TIDs based on
 just a few entries matching. If that's effective, it allows us to skip many
 TIDs and avoid consistent calls, which compensates, but if it's not
 effective, it's just overhead.

 I would actually expect it to be fairly effective for that query, so
 that's a bit surprising. I added counters to see where the calls are coming
 from, and it seems that about 80% of the calls are actually coming from
 this little the feature I explained earlier:


  In addition to that, I'm using the ternary consistent function to check
 if minItem is a match, even if we haven't loaded all the entries yet.
 That's less important, but I think for something like rare1 | (rare2 &
 frequent) it might be useful. It would allow us to skip fetching
 'frequent', when we already know that 'rare1' matches for the current
 item. I'm not sure if that's worth the cycles, but it seemed like an
 obvious thing to do, now that we have the ternary consistent function.


 So, that clearly isn't worth the cycles :-). At least not with an
 expensive consistent function; it might be worthwhile if we pre-build the
 truth-table, or cache the results of the consistent function.

 Attached is a quick patch to remove that, on top of all the other
 patches, if you want to test the effect.


 Every single change you did in fast scan seems to be reasonable, but
 testing shows that something went wrong. 

Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 12:30 PM, Christian Kruse
christ...@2ndquadrant.com wrote:
 Ok, benchmark for baseline+alignment patch is running.

I see that you have enabled latency information. For this kind of
thing I prefer to hack pgbench-tools to not collect this (i.e. to not
pass the -l flag, Per-Transaction Logging). Just remove it and
pgbench-tools rolls with it. It may well be that the overhead added is
completely insignificant, but for something like this, where the
latency information is unlikely to add any value, I prefer to not take
the chance. This is a fairly minor point, however, especially since
these are only 60 second runs where you're unlikely to accumulate
enough transaction latency information to notice any effect.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Bruce Momjian
On Tue, Feb  4, 2014 at 08:11:18PM +0100, Andres Freund wrote:
 On 2014-02-04 14:09:57 -0500, Bruce Momjian wrote:
  On Tue, Feb  4, 2014 at 01:28:38PM -0500, Robert Haas wrote:
   Meanwhile, in friendlier cases, like one short and one long field, no
   change, we're seeing big improvements.  That particular case shows a
   speedup of 21% and a WAL reduction of 36%.  That's a pretty big deal,
   and I think not unrepresentative of many real-world workloads.  Some
   might well do better, having either more or longer unchanged fields.
   Assuming that the logic isn't buggy, a point in need of further study,
   I'm starting to feel like we want to have this.  And I might even be
   tempted to remove the table-level off switch.
  
  Does this feature relate to compression of WAL page images at all?
 
 No.

I guess it bothers me we are working on compressing row change sets
while the majority(?) of WAL is page images.  I know we had a page image
compression patch that got stalled.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 11:11 AM, Andres Freund and...@2ndquadrant.com wrote:
 Does this feature relate to compression of WAL page images at all?

 No.

So the obvious question is: where, if anywhere, do the two efforts
(this patch, and Fujii's patch) overlap? Does Fujii have any concerns
about this patch as it relates to his effort to compress FPIs?

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Andres Freund
On February 4, 2014 10:50:10 PM CET, Peter Geoghegan p...@heroku.com wrote:
On Tue, Feb 4, 2014 at 11:11 AM, Andres Freund and...@2ndquadrant.com
wrote:
 Does this feature relate to compression of WAL page images at all?

 No.

So the obvious question is: where, if anywhere, do the two efforts
(this patch, and Fujii's patch) overlap? Does Fujii have any concerns
about this patch as it relates to his effort to compress FPIs?

I think there's zero overlap. They're completely complementary features. It's
not like normal WAL records have an irrelevant volume.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 1:58 PM, Andres Freund and...@2ndquadrant.com wrote:
 I think there's zero overlap. They're completely complementary features. It's
 not like normal WAL records have an irrelevant volume.

I'd have thought so too, but I would not like to assume. Like many
people commenting on this thread, I don't know very much about
compression.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-02-04 Thread Josh Berkus
On 02/03/2014 07:27 AM, Andres Freund wrote:
 On 2014-02-03 09:22:52 -0600, Merlin Moncure wrote:
 I lost my stomach (or maybe it was the glass of red) somewhere in the
 middle, but I think this needs a lot of work. Especially the io code
 doesn't seem ready to me. I'd consider ripping out the send/recv code
 for 9.4, that seems the biggest can of worms. It will still be usable
 without.

 Not having type send/recv functions is somewhat dangerous; it can
 cause problems for libraries that run everything through the binary
 wire format.  I'd give jsonb a pass on that, being a new type, but
 would be concerned if hstore had that ability revoked.
 
 Yea, removing it for hstore would be a compat problem...
 
 offhand note: hstore_send seems pretty simply written and clean; it's
 a simple nonrecursive iterator...
 
 But a send function is pretty pointless without the corresponding recv
 function... And imo recv simply is too dangerous as it's currently
 written.
 I am not saying that it cannot be made to work, just that it's still nearly
 as ugly as when I pointed out several of the dangers some weeks back.

Oleg, Teodor, any comments on the above?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] integrate pg_upgrade analyze_new_cluster.sh into vacuumdb

2014-02-04 Thread Jeff Janes
On Tue, Jan 21, 2014 at 9:06 AM, Oskari Saarenmaa o...@ohmu.fi wrote:

 On 09.01.2014 05:15, Peter Eisentraut wrote:

  pg_upgrade creates a script analyze_new_cluster.{sh|bat} that runs
 vacuumdb --analyze-only in three stages with different statistics target
 settings to get a fresh cluster analyzed faster.  I think this behavior
 is also useful for clusters or databases freshly created by pg_restore
 or any other loading mechanism, so it's suboptimal to have this
 constrained to pg_upgrade.


 I think the three stage analyze is a wrong solution to the slow analyze
 problem.


It certainly is not the best possible solution.  But it might be the best
one that can be arrived at within a reasonable amount of time.

If we really want to micromanage the process we could put a lot of work
into it.  Take the case of a small table of about 300 pages.  We read the
table 3 times (separated by enough time that it is probably no longer
cached), first keeping one tuple per page, then 10, then 100.   Instead it
should probably just jump directly to sampling at statistics target of 100
and then forget those small tables, at least if all we are concerned about
is IO costs.  (Since the selected tuples are sorted once for each column,
there might be a CPU reason to take a small sample at the first pass, if we
are more concerned with CPU than IO.)
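
(For reference, ANALYZE's sample size scales linearly with the
statistics target -- it samples 300 rows per unit of target -- so the
three stages read roughly 300, 3,000, and 30,000 rows per table, which
is where the three passes over that small table come from.)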

But I do wonder what experience people have with the 3-stage process: how
useful is it empirically?  If you can't open the database for general use
until the 3rd phase is done, then you would just jump to doing that stage,
rather than working through all 3 of them.  If you can open the database
and muddle through without statistics for a while, why not muddle through
for the little bit longer that it would take to collect the full set right
off the bat, rather than making intermediate passes?

So I agree that the current system in not optimal.  But this patch is just
moving existing behavior from a less general location to a more general
one, so I don't think it should be held hostage to improvements that could
theoretically be made but which no one has offered to do.  I wouldn't want
to put in a change that forces users to learn something new for 9.4 only to
have it completely redone in 9.5 and then make them learn that.  But the
documentation and training burden of this change seems small enough that I
wouldn't worry about that.  (On the other hand, the benefit of the change
also seems pretty small.)

In my experience most of the analyze time goes to reading random blocks
 from the disk but we usually use only a small portion of that data (1 row
 per block.)

 If we were able to better utilize the data we read we could get good
 statistics with a lot less IO than we currently need.  This was discussed
 at length at
 http://www.postgresql.org/message-id/CAM-w4HOjRbNPMW=shjhw_qfapcuu5ege1tmdr0zqu+kqx8q...@mail.gmail.com
 but it hasn't turned into patches so far.


I don't think it is an accident that it hasn't turned into patches.  You
can't change the laws of statistics just by wanting it badly enough.  Our
current sampling method is already insufficiently random.  We aren't going
to fix things by making it even less random.   But I guess that that is an
empirical question again, have most statistics problems been due to the
sample being insufficiently large, or insufficiently random?  (Or
insufficiently recent?)

There could be a different stage-by-stage approach where
default_statistics_target is fixed but in the first pass you just take the
first 30,000 rows in each table, and then in second pass you take a random
30,000 rows.  But the knobs to do that currently do not exist, and I doubt
they would be welcomed if their only use is to support pg_upgrade.  So that
idea is not a blocker to this patch, either.

Cheers,

Jeff


[HACKERS] Minor performance improvement in transition to external sort

2014-02-04 Thread Jeremy Harris

The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify, over the work_mem
range 1024 to 1048576.

Both algorithms appear to be O(n) (contradicting Wikipedia's claim
that a siftup heapify is O(n log n)).
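
(For what it's worth, the textbook argument does give O(n) for the
siftdown build: a node at height h costs O(h), there are at most
n/2^(h+1) of them, and the sum of h/2^h converges.  The O(n log n)
figure is the *worst case* for the insertion-based siftup build; on
random input its expected cost is also linear, which would explain
seeing O(n) for both.)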

--
Cheers,
   Jeremy
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 8b520c1..2ea7b2d 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -1211,7 +1211,8 @@ puttuple_common(Tuplesortstate *state, SortTuple *tuple)
 				(void) grow_memtuples(state);
 				Assert(state->memtupcount < state->memtupsize);
 			}
-			state->memtuples[state->memtupcount++] = *tuple;
+			state->memtuples[state->memtupcount] = *tuple;
+			state->memtuples[state->memtupcount++].tupindex = 0;
 
 			/*
 			 * Check if it's time to switch over to a bounded heapsort. We do
@@ -1884,23 +1885,47 @@ inittapes(Tuplesortstate *state)
 	state->tp_tapenum = (int *) palloc0(maxTapes * sizeof(int));
 
 	/*
-	 * Convert the unsorted contents of memtuples[] into a heap. Each tuple is
-	 * marked as belonging to run number zero.
-	 *
-	 * NOTE: we pass false for checkIndex since there's no point in comparing
-	 * indexes in this step, even though we do intend the indexes to be part
-	 * of the sort key...
+	 * Convert the unsorted contents of memtuples[] into a heap. Each tuple was
+	 * marked as belonging to run number zero on input; we don't compare that.
 	 */
 	ntuples = state->memtupcount;
-	state->memtupcount = 0;		/* make the heap empty */
-	for (j = 0; j < ntuples; j++)
 	{
-		/* Must copy source tuple to avoid possible overwrite */
-		SortTuple	stup = state->memtuples[j];
+		SortTuple  *m = state->memtuples;
 
-		tuplesort_heap_insert(state, &stup, 0, false);
+		for (j = (ntuples-2)/2;	/* comb the array from the last heap parent */
+			 j >= 0;			/* to the start */
+			 j--)
+		{
+			int			root = j;
+			int			child, swap = 0;
+			SortTuple	ptup = m[root];
+
+			while ((child = root*2 + 1) <= ntuples-1)
+			{	/* root has at least one child. Check left-child */
+				if (COMPARETUP(state, &ptup, &m[child]) < 0)
+				{						/* [root] < [lchild] */
+					if (++child <= ntuples-1	/* check right-child */
+						&& COMPARETUP(state, &ptup, &m[child]) > 0)
+						swap = child;
+					else
+						break;			/* [root] smallest */
+				}
+				else
+				{
+					swap = child;
+					if (++child <= ntuples-1	/* check right-child */
+						&& COMPARETUP(state, &m[swap], &m[child]) > 0)
+						swap = child;
+				}
+				/* [swap] is smallest of three; move into the parent slot */
+				m[root] = m[swap];
+				root = swap;	/* and repeat with the child subtree */
+			}
+			if (swap)
+				m[swap] = ptup;
+			/* This and all heap nodes after are now well-positioned */
+		}
 	}
-	Assert(state->memtupcount == ntuples);
 
 	state->currentRun = 0;
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 You know, I would really prefer to just stick a PGDLLIMPORT on this
 place and any others that need it, and any others that come up, than
 turn this into a political football.  Having to sprinkle PGDLLIMPORT
 on the handful of variables that are accessed by contrib modules is,
 indeed, annoying.

I'm not actually trying to turn this into a political football.  What
I want is a solution that we can trust, ie, that will allow us to
ship Windows code that's not broken.  We have failed to do so for at
least the past year, and not even known it.

I had been okay with the manual PGDLLIMPORT-sprinkling approach
(not happy with it, of course, but prepared to tolerate it) as long
as I believed the buildfarm would reliably tell us of the need for
it.  That assumption has now been conclusively disproven, though.
The question therefore becomes, what are we going to do instead?
Keep on doing what we were doing doesn't strike me as an acceptable
answer.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition

2014-02-04 Thread Andres Freund
On 2014-02-04 13:42:51 -0800, Peter Geoghegan wrote:
 On Tue, Feb 4, 2014 at 12:30 PM, Christian Kruse
 christ...@2ndquadrant.com wrote:
  Ok, benchmark for baseline+alignment patch is running.
 
 I see that you have enabled latency information. For this kind of
 thing I prefer to hack pgbench-tools to not collect this (i.e. to not
 pass the -l flag, Per-Transaction Logging). Just remove it and
 pgbench-tools rolls with it. It may well be that the overhead added is
 completely insignificant, but for something like this, where the
 latency information is unlikely to add any value, I prefer to not take
 the chance. This is a fairly minor point, however, especially since
 these are only 60 second runs where you're unlikely to accumulate
 enough transaction latency information to notice any effect.

Hm, I don't find that convincing. If you look at the results from the
last run the latency information is actually quite interesting.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 02/04/2014 03:08 PM, Jeff Janes wrote:
 Having an AMI would help, but even with an AMI in place, MinGW is 
 still insanely slow.  Running make on already made PostgreSQL (so 
 there was nothing to actually do) takes 1.5 minutes.  And a make after 
 a make clean takes half an hour.  This is on an actual desktop, not 
 an AWS micro instance.  So doing a git bisect is just painful.  Is the 
 MSVC build faster?

 Would have to check with the same build options (cassert and debug have 
 major timing effects.) I agree it's not lightning fast like make -j 4 
 on a decent linux box.

I wonder if ccache exists for Mingw.  That thing makes a huge difference
in the perceived build speed ...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: Misaligned BufferDescriptors causing major performance problems on AMD

2014-02-04 Thread Peter Geoghegan
On Tue, Feb 4, 2014 at 4:21 AM, Andres Freund and...@2ndquadrant.com wrote:
 Which imo means fixing this got more important...

I strongly agree.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 05:47 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 02/04/2014 03:08 PM, Jeff Janes wrote:

Having an AMI would help, but even with an AMI in place, MinGW is
still insanely slow.  Running make on already made PostgreSQL (so
there was nothing to actually do) takes 1.5 minutes.  And a make after
a make clean takes half an hour.  This is on an actual desktop, not
an AWS micro instance.  So doing a git bisect is just painful.  Is the
MSVC build faster?

Would have to check with the same build options (cassert and debug have
major timing effects.) I agree it's not lightning fast like make -j 4
on a decent linux box.

I wonder if ccache exists for Mingw.  That thing makes a huge difference
in the perceived build speed ...





Indeed. But it isn't really available, AFAIK. Certainly it's not in the list of
packages known to mingw-get on the machine I checked (jacana).


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 5:26 AM, Peter Eisentraut pete...@gmx.net wrote:
 Perhaps this type should be called pglsn, since it's an
 implementation-specific detail and not a universal concept like int,
 point, or uuid.
It makes sense. I'll update the patches according to that.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PERFORM] encouraging index-only scans

2014-02-04 Thread Jeff Janes
On Mon, Feb 3, 2014 at 8:55 AM, Robert Haas robertmh...@gmail.com wrote:


 I've also had some further thoughts about the right way to drive
  vacuum scheduling.  I think what we need to do is tightly couple the
 rate at which we're willing to do vacuuming to the rate at which we're
 incurring vacuum debt.  That is, if we're creating 100kB/s of pages
 needing vacuum, we vacuum at 2-3MB/s (with default settings).


If we can tolerate 2-3MB/s without adverse impact on other work, then we
can tolerate it.  Do we gain anything substantial by sand-bagging it?



 If
 we're creating 10MB/s of pages needing vacuum, we *still* vacuum at
 2-3MB/s.  Not shockingly, vacuum gets behind, the database bloats, and
 everything goes to heck.


(Your reference to bloat made me think your comments here are about
vacuuming in general, not specific to IOS.  If that isn't the case, then
please ignore.)

If we can only vacuum at 2-3MB/s without adversely impacting other
activity, but we are creating 10MB/s of future vacuum need, then there are
basically two possibilities I can think of.  Either the 10MB/s represents a
spike, and vacuum should tolerate it and hope to catch up on the debt
later.  Or it represents a new permanent condition, in which case I bought
too few hard drives for the work load, and no scheduling decision that
autovacuum can make will save me from my folly. Perhaps there is some
middle ground between those possibilities, but I don't see room for much
middle ground.

I guess there might be entirely different possibilities not between those
two; for example, I don't realize I'm doing something that is generating
10MB/s of vacuum debt, and would like to have this thing I'm doing be
automatically throttled to the point it doesn't interfere with other
processes (either directly, or indirectly by bloat)



 The rate of vacuuming needs to be tied
 somehow to the rate at which we're creating stuff that needs to be
 vacuumed.  Right now we don't even have a way to measure that, let
 alone auto-regulate the aggressiveness of autovacuum on that basis.


There is the formula used to decide when a table gets vacuumed.  Isn't the
time delta in this formula a measure of how fast we are creating stuff that
needs to be vacuumed for bloat reasons?  Is your objection that it doesn't
include other reasons we might want to vacuum, or that it just doesn't work
very well, or that is not explicitly exposed?
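
(For reference, the trigger condition from the autovacuum documentation
is, in essence,

    dead tuples > autovacuum_vacuum_threshold
                    + autovacuum_vacuum_scale_factor * reltuples

i.e. a dead-tuple count, with the rate only implicit in how often the
threshold is crossed.)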




 Similarly, for marking of pages as all-visible, we currently make the
 same decision whether the relation is getting index-scanned (in which
 case the failure to mark those pages all-visible may be suppressing
 the use of index scans or making them less effective) or whether it's
 not being accessed at all (in which case vacuuming it won't help
 anything, and might hurt by pushing other pages out of cache).


If it is not getting accessed at all because the database is not very
active right now, that would be the perfect time to vacuum it.  Between "I
can accurately project current patterns of (in)activity into the future"
and "People don't build large tables just to ignore them forever", I think
the latter is more likely to be true.  If the system is busy but this
particular table is not, then that would be a better reason to
de-prioritise vacuuming that table.  But can this degree of reasoning
really be implemented in a practical way?  In core?


  Again,
 if we had better statistics, we could measure this - counting heap
 fetches for actual index-only scans plus heap fetches for index scans
 that might have been planned index-only scans but for the relation
 having too few all-visible pages doesn't sound like an impossible
 metric to gather.


My experience has been that if too few pages are all visible, it generally
switches to a seq scan, not an index scan of a different index.  But many
things that are semantically possible to be index-only-scans would never be
planned that way even if allvisible were 100%, so I think it would have to
do two planning passes, one with the real allvisible, and a hypothetical
one with allvisible set to 100%.  And then there is the possibility that,
while a high allvisible would be useful, the table is so active that no
amount of vacuuming could ever keep it high.

Cheers,

Jeff


Re: [HACKERS] Re: Misaligned BufferDescriptors causing major performance problems on AMD

2014-02-04 Thread Peter Geoghegan
On Mon, Feb 3, 2014 at 3:38 PM, Andres Freund and...@2ndquadrant.com wrote:
  A quick hack (attached) making BufferDescriptor 64byte aligned indeed
  restored performance across all max_connections settings. It's not
  surprising that a misaligned buffer descriptor causes problems -
  there'll be plenty of false sharing of the spinlocks otherwise. Curious
  that the Intel machine isn't hurt much by this.

 What fiddling are you thinking of?

 Basically always doing a TYPEALIGN(CACHELINE_SIZE, addr) before
 returning from ShmemAlloc() (and thereby ShmemInitStruct).
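
To illustrate, a minimal sketch of that idiom -- over-allocate, then
round the pointer up to the next cache-line boundary.  CACHELINE_SIZE
is assumed to be 64 here, and the rounding is essentially the TYPEALIGN
arithmetic from c.h; this is not the actual ShmemAlloc() change, just
the shape of it:

    #include <stdint.h>
    #include <stdlib.h>

    #define CACHELINE_SIZE 64   /* assumed typical x86-64 line size */
    #define TYPEALIGN(ALIGNVAL, LEN) \
        (((uintptr_t) (LEN) + ((ALIGNVAL) - 1)) & ~((uintptr_t) ((ALIGNVAL) - 1)))

    static void *
    alloc_cacheline_aligned(size_t size)
    {
        /* Over-allocate so the aligned pointer still has 'size' usable
         * bytes; in shared memory the block is never freed, so losing
         * the raw pointer is harmless. */
        char   *raw = malloc(size + CACHELINE_SIZE);

        if (raw == NULL)
            return NULL;
        return (void *) TYPEALIGN(CACHELINE_SIZE, raw);
    }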

There is something you have not drawn explicit attention to that is
very interesting. If we take REL9_3_STABLE tip to be representative
(built with full -O2 optimization, no assertions just debugging
symbols), setting max_connections to 91 from 90 does not have the
effect of making the BufferDescriptors array aligned; it has the
effect of making it *misaligned*. You reported that 91 was much better
than 90. I think that the problem actually occurs when the array *is*
aligned!

I suspect that the scenario described in this article accounts for the
quite noticeable effect reported: http://danluu.com/3c-conflict

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minor performance improvement in transition to external sort

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 7:22 AM, Jeremy Harris j...@wizmail.org wrote:
 The attached patch replaces the existing siftup method for heapify with
 a siftdown method. Tested with random integers it does 18% fewer
 compares and takes 10% less time for the heapify, over the work_mem
 range 1024 to 1048576.

 Both algorithms appear to be O(n) (contradicting Wikipedia's claim
 that a siftup heapify is O(n log n)).
It looks interesting, but it is too late to have that in 9.4... You
should definitely add this patch to the next commitfest so that it is
not lost in the void:
https://commitfest.postgresql.org/action/commitfest_view?id=22
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 9:38 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Feb 5, 2014 at 8:59 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  I'll update the patches according to that.
 Here are the updated patches with the following changes (according to
 previous comments):
 - Datatype is renamed to pglsn, documentation, file names, regressions
 and APIs are updated as well.
 - The DatumGet* and *GetDatum APIs are renamed with PGLSN (Should be
 PgLsn? But that's a detail)
 - pg_create_physical_replication_slot uses PGLSNGetDatum for its 6th argument
 For pageinspect, only page_header is impacted and I think that this
 should be a separated patch as it makes necessary to dump it to 1.2. I
 can write it later once the core parts are decided.
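
[Editorial aside: since XLogRecPtr is a plain uint64 these days, the renamed
conversion macros would presumably just reuse the existing 64-bit Datum
plumbing. A sketch of the likely shape, using the names from the patch
description rather than the final committed spelling:

/* Sketch: pglsn <-> Datum conversions, reusing the int64 machinery. */
#define DatumGetPGLSN(X)   ((XLogRecPtr) DatumGetInt64(X))
#define PGLSNGetDatum(X)   Int64GetDatum((int64) (X))

]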
I just forgot to mention that the 2nd patch does not use context diffs
but git diffs because of filterdiff not able to catch all the new
content of slotfuncs.c.
Regards,
-- 
Michael




Re: [HACKERS] PoC: Duplicate Tuple Elidation during External Sort for DISTINCT

2014-02-04 Thread Jon Nelson
What - if anything - do I need to do to get this on the commitfest
list for the next commitfest?




Re: [HACKERS] inherit support for foreign tables

2014-02-04 Thread Etsuro Fujita

(2014/02/04 20:56), Robert Haas wrote:

On Sun, Feb 2, 2014 at 10:15 PM, Etsuro Fujita
fujita.ets...@lab.ntt.co.jp wrote:

Allowing ALTER COLUMN SET STORAGE on foreign tables would make sense if for
example, SELECT * INTO local_table FROM foreign_table did create a new
local table of columns having the storage types associated with those of a
foreign table?


Seems like a pretty weak argument.  It's not that we can't find
strange corner cases where applying SET STORAGE to a foreign table
doesn't do something; it's that they *are* strange corner cases.  The
options as we normally understand them just aren't sensible in
this context, and a good deal of work has been put into an alternative
options framework, which is what authors of FDWs ought to be using.


I just wanted to discuss the possibility of allowing SET STORAGE on a 
foreign table, but I've got the point.  I'll resume the patch review.


Thanks,

Best regards,
Etsuro Fujita




Re: [HACKERS] could not create IPv6 socket (AI_ADDRCONFIG)

2014-02-04 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 04 Feb 2014 02:07:08 -0500, Tom Lane t...@sss.pgh.pa.us wrote in 
3176.1391497...@sss.pgh.pa.us
 One good reason not to trust this too much is that getaddrinfo() is
 fundamentally a userspace DNS access function, and as such it has
 no very good way to know if there's currently an IPv4 or IPv6
 interface configured on the local system.  At minimum there are
 obvious race conditions in that.

A more common case would be ::1 in /etc/hosts. I got the
following error with this patch in such a case.

| LOG:  could not bind IPv4 socket: Address already in use
| HINT:  Is another postmaster already running on port 5432? If not, wait a few
| seconds and retry.

getaddrinfo returned two identical entries with the same address,
AF_INET 127.0.0.1:14357. One of them was for the ::1 line in
hosts. This is worse than the current behavior X-(
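
[Editorial aside: the duplicate entries are easy to reproduce outside the
server. A self-contained sketch that just dumps what getaddrinfo() hands
back for a host/port — the port and flags here are arbitrary choices:

#include <stdio.h>
#include <string.h>
#include <netdb.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

int
main(void)
{
    struct addrinfo hints;
    struct addrinfo *res, *ai;
    char    buf[INET6_ADDRSTRLEN];

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;        /* ask for both IPv4 and IPv6 */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_ADDRCONFIG;     /* the flag under discussion */

    if (getaddrinfo("localhost", "5432", &hints, &res) != 0)
        return 1;

    for (ai = res; ai != NULL; ai = ai->ai_next)
    {
        void *addr = (ai->ai_family == AF_INET)
            ? (void *) &((struct sockaddr_in *) ai->ai_addr)->sin_addr
            : (void *) &((struct sockaddr_in6 *) ai->ai_addr)->sin6_addr;

        inet_ntop(ai->ai_family, addr, buf, sizeof(buf));
        printf("family=%d addr=%s\n", ai->ai_family, buf);
    }
    freeaddrinfo(res);
    return 0;
}

]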

At Tue, 04 Feb 2014 10:31:03 -0500, Tom Lane t...@sss.pgh.pa.us wrote in 
12552.1391527...@sss.pgh.pa.us
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Tom Lane wrote:
  Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
  LOG:  could not create IPv6 socket: Address family not supported by 
  protocol
 
  That's merely a harmless log message.
 
  How about just adding a HINT?
 
 Hmm ... maybe, but how would you phrase the hint exactly?

Putting the 'exactly' aside, could it be something that means 'You will
get this message when support for this address family is disabled',
emitted only for EAFNOSUPPORT?

 Though I don't know whether such a hint would be helpful for those who
tend to mind that kind of message.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] could not create IPv6 socket (AI_ADDRCONFIG)

2014-02-04 Thread Tom Lane
Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 getaddrinfo returned two identical entries with the same address,
 AF_INET 127.0.0.1:14357. One of them was for the ::1 line in
 hosts. This is worse than the current behavior X-(

Yeah, the fundamental issue is that getaddrinfo tends to return bogus
info.

 How about just adding a HINT?

 Hmm ... maybe, but how would you phrase the hint exactly?

 Putting the 'exactly' aside, could it be something that means 'You will
 get this message when support for this address family is disabled',
 emitted only for EAFNOSUPPORT?

  Though I don't know whether such a hint would be helpful for those who
 tend to mind that kind of message.

I still think the best thing might be to reduce the individual messages
to DEBUG-something, and only produce a LOG entry if we are unable to
bind to *any* of the addresses returned by getaddrinfo.
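
[Editorial aside: in outline, the proposal would look roughly like this — a
sketch with invented helper names, not the actual StreamServerPort() code;
ereport/errmsg are the backend's logging primitives:

extern bool try_bind_one(struct addrinfo *addr);    /* hypothetical helper */

static int
bind_all_addresses(struct addrinfo *addrs)
{
    struct addrinfo *addr;
    int     nbound = 0;

    for (addr = addrs; addr != NULL; addr = addr->ai_next)
    {
        if (try_bind_one(addr))
            nbound++;
        else
            ereport(DEBUG1,             /* demoted from LOG */
                    (errmsg("could not bind socket for address family %d: %m",
                            addr->ai_family)));
    }

    /* Complain loudly only if nothing at all could be bound. */
    if (nbound == 0)
        ereport(LOG,
                (errmsg("could not bind to any address returned by getaddrinfo()")));

    return nbound;
}

]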

regards, tom lane




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Craig Ringer
On 02/05/2014 04:08 AM, Jeff Janes wrote:
  So doing a git bisect is just painful.  Is the MSVC
 build faster?

Yes, but not on EC2.

I've found Windows EC2 instances so impossibly slow that I just gave up
working with them. It took 1.5 hours to do a build and regression check
with msvc on a Medium EC2 instance; the same build takes 10 mins on my
tiny Intel i3 based Windows test machine.

It's possible that some of the larger instance types may perform better
as they use different approaches to virtualization than simple Xen HVM.
I haven't tested, as the cost of those instances rapidly becomes
problematic.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Craig Ringer
On 02/05/2014 02:53 AM, Tom Lane wrote:
 Joshua D. Drake j...@commandprompt.com writes:
 On 02/04/2014 09:34 AM, Tom Lane wrote:
 My own opinion is that I've already wasted untold man-hours thanks to
 the random porting problems induced by Windows, a platform that I never
 have and never will care about personally.  I will *not* spend my own
 time doing tests that someone else could do.  If we can't get some
 effort contributed by someone who does use that platform, I'm personally
 prepared to declare the entire damn thing no longer supported.
 
 Although that is obviously your prerogative it is important to remember 
 that Windows is easily the second most used version of PostgreSQL out 
 there (behind Linux).
 
 [ shrug... ]  If it's so widely used, why is it so hard to find somebody
 who's willing to put in some development effort for it?

Sadly, I'm now convinced that Windows users are just much less likely to
contribute anything constructive to a project - code, documentation,
anything. It's a real gimme world, with a really strong ethic that
the vendor does things with their software; you don't just go and get
involved.

That said, I think the fuss being made about the intrusiveness of
Windows support and its impact is overblown here. These are a few macros
that are no-ops on other platforms anyway, and some build code hidden away
in src/tools.

It's ugly. It's annoying. It's crap that users don't contribute back.
It's also just not that big a deal; there are many other things that are
similarly painful or more so.

Expecting folks to fire up an AMI and hand-control the build with a GUI
over a high latency connection is a waste of time better spent
elsewhere, though, and will result in everyone continuing to avoid any
sort of testing on Windows.

Personally what I think we need is a *public* Jenkins instance, or
similar, to which you can push a branch and have it automatically build
and make check on Windows. I've got that running for internal use, but
it's on a host I can't share access to (and an unreliable one, at that).

I'd be happy to share the setup for the Jenkins instance and the Windows
integration parts, along with the instructions I wrote on how to set up
the Windows build test node(s) and the tooling I'm using to automate the
Windows build.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Andrew Dunstan


On 02/04/2014 10:48 PM, Craig Ringer wrote:

On 02/05/2014 04:08 AM, Jeff Janes wrote:

  So doing a git bisect is just painful.  Is the MSVC
build faster?

Yes, but not on EC2.

I've found Windows EC2 instances so impossibly slow that I just gave up
working with them. It took 1.5 hours to do a build and regression check
with msvc on a Medium EC2 instance; the same build takes 10 mins on my
tiny Intel i3 based Windows test machine.

It's possible that some of the larger instance types may perform better
as they use different approaches to virtualization than simple Xen HVM.
I haven't tested, as the cost of those instances rapidly becomes
problematic.




I typically use m1.medium. A spot instance for that is currently $0.033 
/ hour. When I was working on one such the other day it took nothing 
like 1.5 hours to build and test. I didn't time it so I can't tell you 
how long it took, but much less than that. Of course YMMV.


cheers

andrew




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Craig Ringer
On 02/05/2014 06:29 AM, Tom Lane wrote:
 I had been okay with the manual PGDLLIMPORT-sprinkling approach
 (not happy with it, of course, but prepared to tolerate it) as long
 as I believed the buildfarm would reliably tell us of the need for
 it.  That assumption has now been conclusively disproven, though.
 The question therefore becomes, what are we going to do instead?
 Keep on doing what we were doing doesn't strike me as an acceptable
 answer.

I'm in complete agreement here. Silent failures we can't test for that
might sneak data corruption in are not cool.

I'll have a look into ways of making sure that globals with incorrect
linkage fail at dynamic link time, as is already the case for functions. I
won't be able to spend much time on it immediately; I'll take a quick look
and, if I don't find anything, will follow up post-CF4.

I'm kind of horrified that the dynamic linker doesn't throw its toys
when it sees this.
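
[Editorial aside: for anyone who hasn't stared at this before, the marker at
issue is a one-line, per-variable annotation in the core headers:

/*
 * The annotation under discussion (DateStyle is one of the GUCs that
 * postgres_fdw reads). On Windows, PGDLLIMPORT expands to
 * __declspec(dllimport) when an extension module is being compiled, so
 * the toolchain knows the variable lives in the backend executable
 * rather than in the module's DLL. On other platforms it expands to
 * nothing.
 */
extern PGDLLIMPORT int DateStyle;

Without the marker, some toolchains still appear to link successfully, and
the extension then reads a junk value at run time — the silent failure mode
reported upthread.]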

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Craig Ringer
On 02/05/2014 12:06 AM, Andrew Dunstan wrote:
 
 On 02/04/2014 10:43 AM, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 On 2014-02-04 02:10:47 -0500, Tom Lane wrote:
 Meh.  It might be that the DateStyle usage in postgres_fdw would
 accidentally fail to malfunction if it saw a bogus value of the
 variable.
 But it's hard to believe that this would be true of MainLWLockArray.
 There's not that much lwlock usage in contrib. It's just
 pg_stat_statements and pg_buffercache. Neither has tests... So it very
 well could be that breakage simply hasn't been observed.
 Hm, you're right --- I'd have thought there were more of those.

 Ugh.  This problem was bad enough when I thought that it would only lead
 to link-time errors detectable in the buildfarm.  If it can lead to
 errors
 only observable at runtime --- and maybe not obvious even then --- then
 I think we *have to* do something about it.  By that I mean that we must
 get rid of the need to manually plaster PGDLLIMPORT on global variables.

 Anybody with a Windows build environment want to test the #define
 extern
 trick?


 
 
 
 We have details on how to build with Mingw/Msys on Windows on an Amazon
 VM http://wiki.postgresql.org/wiki/Building_With_MinGW which is either
 free or very cheap. Do I need to give instructions on how to do this for
 MSVC builds too? It's really not terribly hard.

I've got some guidance on that here:

https://github.com/2ndQuadrant/pg_build_win

from setting up a clean Windows instance for builds, on to the build
process.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] PoC: Duplicate Tuple Elidation during External Sort for DISTINCT

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 10:33 AM, Jon Nelson jnelson+pg...@jamponi.net wrote:
 What - if anything - do I need to do to get this on the commitfest
 list for the next commitfest?
The list of instructions is here:
http://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submission
Then the next commit fest (#1 for 9.5), will be in June and is here:
https://commitfest.postgresql.org/action/commitfest_view?id=22
Note that you will need an account on postgresql.org to register your patch.
Regards,
-- 
Michael




[HACKERS] Re: Viability of text HISTORY/INSTALL/regression README files (was Re: [COMMITTERS] pgsql: Document a few more regression test hazards.)

2014-02-04 Thread Noah Misch
On Tue, Feb 04, 2014 at 03:28:45PM -0500, Robert Haas wrote:
 On Tue, Feb 4, 2014 at 1:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Noah Misch n...@leadboat.com writes:
  Robert Haas robertmh...@gmail.com writes:
  I wonder if these standalone things are really worthwhile.
 
  I wonder how difficult it would be to make sufficient link data available 
  when
  building the standalone files.  There would be no linking per se; we would
  just need the referent's text fragment emitted where the xref tag 
  appears.
 
  IIRC, that's basically what the workaround is, except it's not very
  automated.  Even if it were automated, though, there's still a problem:
  such links aren't really *useful* in flat text format.  I think that
  forcing the author to actually think about what to put there in the
  flat text version is a good thing, if we're going to retain the flat
  text version at all.
 
 Right.  I mean, a lot of the links say things like Section 26.2
 which obviously makes no sense in a standalone text file.

For xrefs normally displayed that way, text output could emit a URL, either
inline or in the form of a footnote.  For link targets (e.g. SQL commands)
having a friendly text fragment for xref sites, use the normal fragment.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




[HACKERS] PostgreSQL Failback without rebuild

2014-02-04 Thread James Sewell
Hello All,

I have been reading through some of the recent discussions about failback
when in a streaming replication setup. I define failback as:


   1. Node A is master, Node B is slave
   2. Node A crashes || Node A is stopped || nothing happens
   3. Promote Node B to Master
   4. Attach Node A as slave

My understanding is that currently, to achieve step three, you need to take a
base backup of Node B and deploy it to Node A before starting streaming
replication (or use rsync etc...).

This is very undesirable for many users, especially if they have a very
large database.

From the discussions I can see that the problem is to do with Node A writing
changes to disk that are not streamed to Node B before Node A crashes.

Has there been any consensus on this issue? Are there any solutions which
might make it into 9.4 or 9.5? I've seen some proposals and a tool
(pg_rewind), but all seem to have drawbacks.

I've been looking mainly at these threads:

http://www.postgresql.org/message-id/CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=ggyu1kmt+s2xcq...@mail.gmail.com

http://www.postgresql.org/message-id/caf8q-gxg3pqtf71nvece-6ozraew5pwhk7yqtbjgwrfu513...@mail.gmail.com

http://www.postgresql.org/message-id/519df910.4020...@vmware.com

Cheers,

James Sewell,
PostgreSQL Team Lead / Solutions Architect


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-04 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 On 02/05/2014 06:29 AM, Tom Lane wrote:
 I had been okay with the manual PGDLLIMPORT-sprinkling approach
 (not happy with it, of course, but prepared to tolerate it) as long
 as I believed the buildfarm would reliably tell us of the need for
 it.  That assumption has now been conclusively disproven, though.

 I'm kind of horrified that the dynamic linker doesn't throw its toys
 when it sees this.

Indeed :-(.

The truly strange part of this is that it seems that the one Windows
buildfarm member that's telling the truth (or most nearly so, anyway)
is narwhal, which appears to have the oldest and cruftiest toolchain
of the lot.  I'd really like to come out the other end of this
investigation with a clear understanding of why the newer toolchains
are failing to report a link problem, and yet not building working
executables.

regards, tom lane




Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-02-04 Thread Amit Kapila
On Tue, Feb 4, 2014 at 11:58 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Feb 4, 2014 at 12:39 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 Now there is approximately 1.4~5% CPU gain for
 hundred tiny fields, half nulled case

 Assuming that the logic isn't buggy, a point in need of further study,
 I'm starting to feel like we want to have this.  And I might even be
 tempted to remove the table-level off switch.

I have tried to stress the worst case more, as you are thinking of
removing the table-level switch, and found that even if we increase the
data by approx. 8 times (ten long fields, all changed, each field containing
80 bytes of data), the CPU overhead is still < 5%, which clearly shows that
the overhead doesn't increase much even if the length of unmatched data
is increased by a much larger factor.
So the data for the worst case adds more weight to your statement
(remove the table-level switch); however, there is no harm in keeping the
table-level option with default 'true', and if some users are really sure
the updates in their system will have nothing in common, then they can
set this new option to 'false'.
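
[Editorial aside: the general shape of the technique is to emit only what
differs between the old and new tuple images. A toy prefix/suffix version —
the patch actually uses a chunk-based (Rabin fingerprint) scheme, and
emit_copy_ref/emit_literal are hypothetical — which also shows why "nothing
in common" is the worst case: the comparison work is pure overhead and the
full image gets written anyway.

extern void emit_copy_ref(int prefix_len, int suffix_len);  /* hypothetical */
extern void emit_literal(const char *data, int len);        /* hypothetical */

static void
encode_delta(const char *oldp, int oldlen, const char *newp, int newlen)
{
    int     prefix = 0;
    int     suffix = 0;

    /* longest common prefix of the two images */
    while (prefix < oldlen && prefix < newlen &&
           oldp[prefix] == newp[prefix])
        prefix++;
    /* longest common suffix, not overlapping the prefix */
    while (suffix < oldlen - prefix && suffix < newlen - prefix &&
           oldp[oldlen - 1 - suffix] == newp[newlen - 1 - suffix])
        suffix++;

    /* reference the matching head and tail of the old image ... */
    emit_copy_ref(prefix, suffix);
    /* ... and write only the middle of the new image literally */
    emit_literal(newp + prefix, newlen - prefix - suffix);
}

]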

Below is the data for the new case "ten long fields, all changed", added
in the attached script file:

Unpatched
           testname           | wal_generated |     duration
------------------------------+---------------+------------------
 ten long fields, all changed |    3473999520 | 45.0375978946686
 ten long fields, all changed |    3473999864 | 45.2536928653717
 ten long fields, all changed |    3474006880 | 45.1887288093567


After pgrb_delta_encoding_v8.patch
----------------------------------
           testname           | wal_generated |     duration
------------------------------+---------------+------------------
 ten long fields, all changed |    3474006456 | 47.5744359493256
 ten long fields, all changed |    3474000136 | 47.3830440044403
 ten long fields, all changed |    3474002688 | 46.9923310279846



With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


wal-update-testsuite.sh
Description: Bourne shell script



Re: [HACKERS] PostgreSQL Failback without rebuild

2014-02-04 Thread Amit Kapila
On Wed, Feb 5, 2014 at 10:30 AM, James Sewell james.sew...@lisasoft.com
wrote:

 Hello All,

 I have been reading through some of the recent discussions about failback
when in a streaming replication setup. I define failback as:

 Node A is master, Node B is slave
 Node A crashes || Node A is stopped || nothing happens
 Promote Node B to Master
 Attach Node A as slave

 My understanding is currently to achieve step three you need to take a
base backup of Node B and deploy it to Node A before starting streaming
replication (or use rsync etc...).

I think in the above sentence you mean to say to achieve step *four*.


 This is very undesirable for many users, especially if they have a very
large database.

 From the discussions I can see that the problem is to do with Node A
writing changes to disk that Node B are not streamed before Node A crashes.

Yes, this is right.


 Has there been any consensus on this issue? Are there any solutions which
might make it into 9.4 or 9.5?

As far as I know, there is still no solution provided in 9.4, and I can't
say anything for 9.5 with any certainty. However, in 9.4 there is a new
parameter, wal_log_hints, which can be useful to overcome a drawback of
pg_rewind.

 I've seen some proposals and a tool (pg_rewind), but all seem to have
draw backs.

As far as I remember, one of the main drawbacks of pg_rewind was related to
hint bits, which can be avoided by wal_log_hints. pg_rewind is not part of
the core PostgreSQL code; however, if you wish, you can try that tool to see
if it can solve your purpose.

Note - James, in my previous reply I missed cc'ing hackers, so I am sending
it again.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] jsonb and nested hstore

2014-02-04 Thread Heikki Linnakangas

On 02/03/2014 05:22 PM, Merlin Moncure wrote:

I lost my stomach (or maybe it was the glass of red) somewhere in the
middle, but I think this needs a lot of work. Especially the io code
doesn't seem ready to me. I'd consider ripping out the send/recv code
for 9.4, that seems the biggest can of worms. It will still be usable
without.

Not having type send/recv functions is somewhat dangerous; it can
cause problems for libraries that run everything through the binary
wire format.  I'd give jsonb a pass on that, being a new type, but
would be concerned if hstore had that ability revoked.


send/recv functions are also needed for binary-format COPY. IMHO jsonb 
must have send/recv functions. All other built-in types have them, 
except for types like 'smgr', 'aclitem' and 'any*' that no-one should be 
using as column types.
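
[Editorial aside: for scale, a send/recv pair for a simple type is very
little code; the hard part for jsonb is validating the nested structure in
recv. A minimal sketch for a hypothetical opaque varlena type, using the
usual pqformat primitives — jsonb's real functions would have to walk and
verify the tree:

#include "postgres.h"
#include "fmgr.h"
#include "libpq/pqformat.h"

PG_FUNCTION_INFO_V1(mytype_send);
PG_FUNCTION_INFO_V1(mytype_recv);

/* Binary output: copy the payload into a binary-format message. */
Datum
mytype_send(PG_FUNCTION_ARGS)
{
    bytea      *val = PG_GETARG_BYTEA_PP(0);
    StringInfoData buf;

    pq_begintypsend(&buf);
    pq_sendbytes(&buf, VARDATA_ANY(val), VARSIZE_ANY_EXHDR(val));
    PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}

/* Binary input: the dangerous half -- it consumes untrusted bytes. */
Datum
mytype_recv(PG_FUNCTION_ARGS)
{
    StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
    int         nbytes = buf->len - buf->cursor;
    const char *data = pq_getmsgbytes(buf, nbytes);
    bytea      *result;

    /* A real type must validate 'data' here before trusting it. */
    result = (bytea *) palloc(nbytes + VARHDRSZ);
    SET_VARSIZE(result, nbytes + VARHDRSZ);
    memcpy(VARDATA(result), data, nbytes);
    PG_RETURN_BYTEA_P(result);
}

]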


- Heikki




Re: [HACKERS] PostgreSQL Failback without rebuild

2014-02-04 Thread Michael Paquier
On Wed, Feb 5, 2014 at 3:14 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Wed, Feb 5, 2014 at 10:30 AM, James Sewell james.sew...@lisasoft.com
 I've seen some proposals and a tool (pg_rewind), but all seem to have draw
 backs.

 As far as I remember, one of the main drawbacks for pg_rewind was related to
 hint bits which can be avoided by wal_log_hints. pg_rewind is not part of
 core
 PostgreSQL code, however if you wish, you can try that tool to see if can it
 solve your purpose.
For 9.3, pg_rewind is only safe with page checksums enabled. For 9.4,
yes, wal_log_hints or checksums is mandatory. The code also contains
some safety checks to ensure that a node not using those parameters
cannot be rewound.
Regards,
-- 
Michael




Re: [HACKERS] jsonb and nested hstore

2014-02-04 Thread Oleg Bartunov
Andrew provided us with more information and we'll work on recv. What do
people think about testing this stuff? BTW, we don't have any regression
tests for this.

Oleg

On Wed, Feb 5, 2014 at 2:03 AM, Josh Berkus j...@agliodbs.com wrote:
 On 02/03/2014 07:27 AM, Andres Freund wrote:
 On 2014-02-03 09:22:52 -0600, Merlin Moncure wrote:
 I lost my stomach (or maybe it was the glass of red) somewhere in the
 middle, but I think this needs a lot of work. Especially the io code
 doesn't seem ready to me. I'd consider ripping out the send/recv code
 for 9.4, that seems the biggest can of worms. It will still be usable
 without.

 Not having type send/recv functions is somewhat dangerous; it can
 cause problems for libraries that run everything through the binary
 wire format.  I'd give jsonb a pass on that, being a new type, but
 would be concerned if hstore had that ability revoked.

 Yea, removing it for hstore would be a compat problem...

 offhand note: hstore_send seems pretty simply written and clean; it's
 a simple nonrecursive iterator...

 But a send function is pretty pointless without the corresponding recv
 function... And imo recv simply is to dangerous as it's currently
 written.
 I am not saying that it cannot be made work, just that it's still nearly
 as ugly as when I pointed out several of the dangers some weeks back.

 Oleg, Teodor, any comments on the above?

 --
 Josh Berkus
 PostgreSQL Experts Inc.
 http://pgexperts.com






Re: [HACKERS] Failure while inserting parent tuple to B-tree is not fun

2014-02-04 Thread Heikki Linnakangas

On 02/04/2014 02:40 AM, Peter Geoghegan wrote:

On Fri, Jan 31, 2014 at 9:09 AM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

I refactored the loop in _bt_moveright to, well, not have that bug anymore.
The 'page' and 'opaque' pointers are now fetched at the beginning of the
loop. Did I miss something?


I think so, yes. You still aren't assigning the value returned by
_bt_getbuf() to 'buf'.


D'oh, you're right.


Since, as I mentioned, _bt_finish_split() ultimately unlocks *and
unpins*, it may not be the same buffer as before, so even with the
refactoring there are race conditions.


Care to elaborate? Or are you just referring to the missing buf =  ?
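
[Editorial aside: the missing piece is the assignment on the reacquire.
Roughly, per the thread's description rather than the committed code:

/* Inside the _bt_moveright() loop, on seeing an incomplete split: */
if (P_INCOMPLETE_SPLIT(opaque))
{
    _bt_finish_split(rel, buf, stack);      /* releases lock *and* pin */
    buf = _bt_getbuf(rel, blkno, access);   /* must capture the new buffer */
    page = BufferGetPage(buf);
    opaque = (BTPageOpaque) PageGetSpecialPointer(page);
}

Dropping the "buf =" means continuing to use a buffer whose pin was already
given up, which is the race being discussed here.]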


A closely related issue is that you haven't mentioned anything about
buffer pins/refcount side effects in comments above
_bt_finish_split(), even though I believe you should.


Ok.


A minor stylistic concern is that I think it would be better to only
have one pair of _bt_finish_split()/_bt_getbuf() calls regardless of
the initial value of 'access'.


Ok.

I also changed _bt_moveright to never return a write-locked buffer, when 
the caller asked for a read-lock (an issue you pointed out earlier in 
this thread).


Attached is a new version of the patch, with those issues fixed. 
btree-incomplete-split-4.patch is a complete patch against the latest 
fix-btree-page-deletion patch, and moveright-assign-fix.patch is just 
the changes to _bt_moveright, if you want to review just the changes 
since the previous patch I posted.


- Heikki
diff --git a/src/backend/access/nbtree/README b/src/backend/access/nbtree/README
index 03efc29..43ee75f 100644
--- a/src/backend/access/nbtree/README
+++ b/src/backend/access/nbtree/README
@@ -404,12 +404,34 @@ an additional insertion above that, etc).
 For a root split, the followon WAL entry is a new root entry rather than
 an insertion entry, but details are otherwise much the same.
 
-Because insertion involves multiple atomic actions, the WAL replay logic
-has to detect the case where a page split isn't followed by a matching
-insertion on the parent level, and then do that insertion on its own (and
-recursively for any subsequent parent insertion, of course).  This is
-feasible because the WAL entry for the split contains enough info to know
-what must be inserted in the parent level.
+Because splitting involves multiple atomic actions, it's possible that the
+system crashes between splitting a page and inserting the downlink for the
+new half to the parent. After recovery, the downlink for the new page will
+be missing. The search algorithm works correctly, as the page will be found
+by following the right-link from its left sibling, although if a lot of
+downlinks in the tree are missing, performance will suffer. A more serious
+consequence is that if the page without a downlink gets split again, the
+insertion algorithm will fail to find the location in the parent level to
+insert the downlink.
+
+Our approach is to create any missing downlinks on-the-fly, when
+searching the tree for a new insertion. It could be done during searches,
+too, but it seems best not to put any extra updates in what would otherwise
+be a read-only operation (updating is not possible in hot standby mode
+anyway). To identify missing downlinks, when a page is split, the left page
+is flagged to indicate that the split is not yet complete (INCOMPLETE_SPLIT).
+When the downlink is inserted to the parent, the flag is cleared atomically
+with the insertion. The child page is kept locked until the insertion in the
+parent is finished and the flag in the child cleared, but can be released
+immediately after that, before recursing up the tree, if the parent also
+needs to be split. This ensures that incompletely split pages should not be
+seen under normal circumstances; only when insertion to the parent fails
+for some reason.
+
+We flag the left page, even though it's the right page that's missing the
+downlink, because it's more convenient to know already when following the
+right-link from the left page to the right page that it will need to have
+its downlink inserted to the parent.
 
 When splitting a non-root page that is alone on its level, the required
 metapage update (of the fast root link) is performed and logged as part
@@ -422,6 +444,14 @@ page is a second record.  If vacuum is interrupted for some reason, or the
 system crashes, the tree is consistent for searches and insertions.  The next
 VACUUM will find the half-dead leaf page and continue the deletion.
 
+Before 9.4, we used to keep track of incomplete splits and page deletions
+during recovery and finish them immediately at end of recovery, instead of
+doing it lazily at the next insertion or vacuum. However, that made the
+recovery much more complicated, and only fixed the problem when crash
+recovery was performed. An incomplete split can also occur if an otherwise
+recoverable error, like out-of-memory or out-of-disk-space, happens while
+inserting the downlink to