Re: [HACKERS] too much pgbench init output

2013-01-05 Thread Tomas Vondra
On 6.1.2013 05:07, Tatsuo Ishii wrote:
>> On 6.1.2013 03:03, Tatsuo Ishii wrote:
>>> As a committer, I have looked into the patch. I noticed two things:
>>>
>>> 1) In the help you put '-q' option into "Common options" section. I
>>> think this should be moved to "Initialization options" section because
>>> the option is only applied while initializing.
>>
>> Good point, moved.
> 
> In addition to this, I'd suggest to add checking -q is only possible
> with -i option since without -i, -q is meaningless.

Done.

>> There's one more thing I've just noticed - the original version of the
>> patch simply removed the old logging, but this one keeps both old and
>> quiet logging. But the old logging still uses this:
>>
>> fprintf(stderr, "%d of %d tuples (%d%%) done.\n", 
>>
>> while the new logging does this
>>
>> fprintf(stderr, "%d of %d tuples (%d%%) done (elapsed %.2f s,
>> remaining %.2f s).\n",
>>
>> i.e. it prints additional info about elapsed/estimated time. Do we want
>> to keep it this way (i.e. not to mess with the old logging) or do we
>> want to add these new fields to the old logging too?
>>
>> I suggest to add it to the old logging, to keep the log messages the
>> same, the only difference being the logging frequency.
> 
> If we do so, probably '-q' is not an appropriate option name any more,
> since the only difference between the old logging and the new one is that
> the former prints a message every 100000 rows while the latter prints one
> every 5 seconds, which is not really "quiet". What do you think?

AFAIK the "5 second" logging is much quieter in most cases (and a bit
more verbose when the initialization gets very slow), so I think
"quiet" is not a bad match for it, although maybe there's a better name.

This change (adding the elapsed/remaining fields to the original logging)
would be consistent with that name, because considering a single line,
the "-q" output is actually the more verbose one right now.

So I'd stick with the "-q" option and add the fields to the original
logging. But I'm not opposed to a different name, I just can't think of a
better one.

Tomas
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index e376452..39bd8a5 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -39,6 +39,7 @@
 #include "portability/instr_time.h"
 
 #include <ctype.h>
+#include <math.h>
 
 #ifndef WIN32
 #include <sys/time.h>
@@ -102,6 +103,7 @@ extern int  optind;
 #define MAXCLIENTS 1024
 #endif
 
+#define LOG_STEP_SECONDS   5   /* seconds between log messages */
 #define DEFAULT_NXACTS 10  /* default nxacts */
 
 int			nxacts = 0;			/* number of transactions per client */
@@ -150,6 +152,7 @@ char	   *index_tablespace = NULL;
 #define naccounts	100000
 
 bool		use_log;		/* log transaction latencies to a file */
+bool		use_quiet;		/* quiet logging onto stderr */
 bool		is_connect;		/* establish connection for each transaction */
 bool		is_latencies;	/* report per-command latencies */
 int			main_pid;		/* main process id used in log filename */
@@ -359,6 +362,7 @@ usage(void)
   "  -n   do not run VACUUM after initialization\n"
   "  -F NUM   fill factor\n"
   "  -s NUM   scaling factor\n"
+  "  -q   quiet logging (one message each 5 seconds)\n"
   "  --foreign-keys\n"
   "   create foreign key constraints between 
tables\n"
   "  --index-tablespace=TABLESPACE\n"
@@ -1362,6 +1366,11 @@ init(bool is_no_vacuum)
charsql[256];
int i;
 
+   /* used to track elapsed time and estimate of the remaining time */
+   instr_time  start, diff;
+   double  elapsed_sec, remaining_sec;
+   int log_interval = 1;
+
if ((con = doConnect()) == NULL)
exit(1);
 
@@ -1430,6 +1439,8 @@ init(bool is_no_vacuum)
}
PQclear(res);
 
+   INSTR_TIME_SET_CURRENT(start);
+
for (i = 0; i < naccounts * scale; i++)
{
int j = i + 1;
@@ -1441,10 +1452,33 @@ init(bool is_no_vacuum)
exit(1);
}
 
-		if (j % 100000 == 0)
+		/* If we want to stick with the original logging, print a message
+		 * each 100k inserted rows. */
+		if ((! use_quiet) && (j % 100000 == 0))
 			fprintf(stderr, "%d of %d tuples (%d%%) done.\n",
-					j, naccounts * scale,
-					(int) (((int64) j * 100) / (naccounts * scale)));
+					j, naccounts * scale,
+					(int) (

Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-01-05 Thread Amit kapila

On Saturday, January 05, 2013 12:35 PM Boszormenyi Zoltan wrote:
On 2013-01-05 05:58, Amit Kapila wrote:
> On Friday, January 04, 2013 10:57 PM Boszormenyi Zoltan wrote:
>> Hi,
>> I am reviewing your patch.
> Thank you very much.
>
>> Yes, you are right that adding a new LWLock will avoid the use of sleep.
>> However, I am just not sure if we should add a new LWLock
>> only for this purpose?
>
>> Similar logic is used in CreateLockFile() for the postmaster file, but it
>> doesn't use sleep.
>> How about reducing the sleep time or removing the sleep entirely, in which
>> case the user might get an error and need to retry his command?

> I think removing the loop entirely is better.

> However, the behaviour should be discussed by a wider audience:
> - the backends should wait for each other just like other statements
>that fight for the same object (in which case locking is needed), or
> - throw an error immediately on conflicting parallel work

Okay, I shall update the patch with the first option, as suggested by Noah as well.

> I also tried to trigger one of the errors by creating the lock file manually.
> You need an extra space between the "... retry" and "or ...":

> +   ereport(ERROR,
> +           (errcode_for_file_access(),
> +            errmsg("could not create lock file \"%s\": %m ", LockFileName),
> +            errhint("May be too many concurrent edit into file happening, please wait!! and retry"
> +                    "or .lock is file accidently left there please clean the file from config_dir")));

Will fix.


Thank you for review.

With Regards,
Amit Kapila.





Re: [HACKERS] too much pgbench init output

2013-01-05 Thread Tatsuo Ishii
> On 6.1.2013 03:03, Tatsuo Ishii wrote:
>> As a committer, I have looked into the patch. I noticed two things:
>> 
>> 1) In the help you put '-q' option into "Common options" section. I
>> think this should be moved to "Initialization options" section because
>> the option is only applied while initializing.
> 
> Good point, moved.

In addition to this, I'd suggest to add checking -q is only possible
with -i option since without -i, -q is meaningless.

>> 2) Shouldn't a long option for '-q' be provided? Something like
>> '--quiet-progress-logging'?
> 
> I don't think so. Currently pgbench has either short or long option, not
> both (for the same thing). I believe we should stick to this and either
> choose "-q" or "--quiet-logging" but not both.

Ok.

>> 3) No patches for docs found (doc/src/sgml/pgbench.sgml)
> 
> I've added a brief description of the "-q" option into the docs. IMHO
> it's enough but feel free to add some more details.

Good.

> There's one more thing I've just noticed - the original version of the
> patch simply removed the old logging, but this one keeps both old and
> quiet logging. But the old logging still uses this:
> 
> fprintf(stderr, "%d of %d tuples (%d%%) done.\n", 
> 
> while the new logging does this
> 
> fprintf(stderr, "%d of %d tuples (%d%%) done (elapsed %.2f s,
> remaining %.2f s).\n",
> 
> i.e. it prints additional info about elapsed/estimated time. Do we want
> to keep it this way (i.e. not to mess with the old logging) or do we
> want to add these new fields to the old logging too?
> 
> I suggest to add it to the old logging, to keep the log messages the
> same, the only difference being the logging frequency.

If we do so, probably '-q' is not an appropriate option name any more,
since the only difference between the old logging and the new one is that
the former prints a message every 100000 rows while the latter prints one
every 5 seconds, which is not really "quiet". What do you think?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: PATCH: Split stats file per database WAS: [HACKERS] autovacuum stress-testing our system

2013-01-05 Thread Tomas Vondra
On 3.1.2013 20:33, Magnus Hagander wrote:
> On Thu, Jan 3, 2013 at 8:31 PM, Tomas Vondra  wrote:
>> On 3.1.2013 18:47, Heikki Linnakangas wrote:
>>> How about creating the new directory as a direct subdir of $PGDATA,
>>> rather than buried in global? "global" is supposed to contain data
>>> related to shared catalog relations (plus pg_control), so it doesn't
>>> seem like the right location for per-database stat files. Also, if we're
>>> going to have admins manually zapping the directory (hopefully when the
>>> system is offline), that's less scary if the directory is not buried as
>>> deep.
>>
>> That's clearly possible and it's a trivial change. I was thinking about
>> that actually, but then I placed the directory into "global" because
>> that's where the "pgstat.stat" originally was.
> 
> Yeah, +1 for a separate directory not in global.

OK, I moved the files from "global/stat" to "stat".

Tomas
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index be3adf1..4ec485e 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -64,10 +64,14 @@
 
 /* --
  * Paths for the statistics files (relative to installation's $PGDATA).
+ * Permanent and temporary, global and per-database files.
  * --
  */
-#define PGSTAT_STAT_PERMANENT_FILENAME "global/pgstat.stat"
-#define PGSTAT_STAT_PERMANENT_TMPFILE  "global/pgstat.tmp"
+#define PGSTAT_STAT_PERMANENT_DIRECTORY"stat"
+#define PGSTAT_STAT_PERMANENT_FILENAME "stat/global.stat"
+#define PGSTAT_STAT_PERMANENT_TMPFILE  "stat/global.tmp"
+#define PGSTAT_STAT_PERMANENT_DB_FILENAME  "stat/%d.stat"
+#define PGSTAT_STAT_PERMANENT_DB_TMPFILE   "stat/%d.tmp"
 
 /* --
  * Timer definitions.
@@ -115,8 +119,11 @@ int			pgstat_track_activity_query_size = 1024;
  * Built from GUC parameter
  * --
  */
+char  *pgstat_stat_directory = NULL;
 char  *pgstat_stat_filename = NULL;
 char  *pgstat_stat_tmpname = NULL;
+char  *pgstat_stat_db_filename = NULL;
+char  *pgstat_stat_db_tmpname = NULL;
 
 /*
  * BgWriter global statistics counters (unused in other processes).
@@ -219,11 +226,16 @@ static intlocalNumBackends = 0;
  */
 static PgStat_GlobalStats globalStats;
 
-/* Last time the collector successfully wrote the stats file */
-static TimestampTz last_statwrite;
+/* Write request info for each database */
+typedef struct DBWriteRequest
+{
+	Oid			databaseid;		/* OID of the database to write */
+	TimestampTz request_time;	/* timestamp of the last write request */
+} DBWriteRequest;
 
-/* Latest statistics request time from backends */
-static TimestampTz last_statrequest;
+/* Latest statistics request time from backends for each DB */
+static DBWriteRequest * last_statrequests = NULL;
+static int num_statrequests = 0;
 
 static volatile bool need_exit = false;
 static volatile bool got_SIGHUP = false;
@@ -252,11 +264,17 @@ static void pgstat_sighup_handler(SIGNAL_ARGS);
 static PgStat_StatDBEntry *pgstat_get_db_entry(Oid databaseid, bool create);
 static PgStat_StatTabEntry *pgstat_get_tab_entry(PgStat_StatDBEntry *dbentry,
 Oid tableoid, bool create);
-static void pgstat_write_statsfile(bool permanent);
-static HTAB *pgstat_read_statsfile(Oid onlydb, bool permanent);
+static void pgstat_write_statsfile(bool permanent, bool force);
+static void pgstat_write_db_statsfile(PgStat_StatDBEntry * dbentry, bool permanent);
+static void pgstat_write_db_dummyfile(Oid databaseid);
+static HTAB *pgstat_read_statsfile(Oid onlydb, bool permanent, bool onlydbs);
+static void pgstat_read_db_statsfile(Oid databaseid, HTAB *tabhash, HTAB *funchash, bool permanent);
 static void backend_read_statsfile(void);
 static void pgstat_read_current_status(void);
 
+static bool pgstat_write_statsfile_needed();
+static bool pgstat_db_requested(Oid databaseid);
+
 static void pgstat_send_tabstat(PgStat_MsgTabstat *tsmsg);
 static void pgstat_send_funcstats(void);
 static HTAB *pgstat_collect_oids(Oid catalogid);
@@ -285,7 +303,6 @@ static void pgstat_recv_recoveryconflict(PgStat_MsgRecoveryConflict *msg, int le
 static void pgstat_recv_deadlock(PgStat_MsgDeadlock *msg, int len);
 static void pgstat_recv_tempfile(PgStat_MsgTempFile *msg, int len);
 
-
 /* 
  * Public functions called from postmaster follow
  * 
@@ -549,8 +566,34 @@ startup_failed:
 void
 pgstat_reset_all(void)
 {
-   unlink(pgstat_stat_filename);
-   unlink(PGSTAT_STAT_PERMANENT_FILENAME);
+	DIR * dir;
+	struct dirent * entry;
+
+	dir = AllocateDir(pgstat_stat_directory);
+	while ((entry = ReadDir(dir, pgstat_stat_directory)) != NULL)
+	{
+		char fname[strlen(pgstat_stat_directory) + strlen(ent
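
For illustration, a minimal standalone sketch of the same idea - walking the
stats directory and unlinking every file in it - using plain POSIX
opendir/readdir in place of the backend's AllocateDir/ReadDir wrappers; the
function name is hypothetical, not the patch's code:

#include <stdio.h>
#include <string.h>
#include <dirent.h>
#include <unistd.h>

/* Remove every file in the stats directory; a simplified analogue of
 * the patch's pgstat_reset_all(), skipping the "." and ".." entries. */
static void
stats_reset_all(const char *statdir)
{
	DIR		   *dir;
	struct dirent *entry;
	char		fname[1024];

	dir = opendir(statdir);
	if (dir == NULL)
		return;

	while ((entry = readdir(dir)) != NULL)
	{
		if (strcmp(entry->d_name, ".") == 0 ||
			strcmp(entry->d_name, "..") == 0)
			continue;

		snprintf(fname, sizeof(fname), "%s/%s", statdir, entry->d_name);
		unlink(fname);
	}
	closedir(dir);
}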

Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction

2013-01-05 Thread Tomas Vondra
On 4.1.2013 17:42, Robert Haas wrote:
> On Mon, Dec 31, 2012 at 11:51 AM, Tomas Vondra  wrote:
>> I thought I followed the coding style - which guidelines have I broken?
> 
> + /* If there are no non-local relations, then we're done. Release the 
> memory
> +  * and return. */
> 
> Multi-line comments should start with a line containing only /* and
> end with a line containing only */.
> 
> +DropRelFileNodeAllBuffers(RelFileNodeBackend * rnodes, int nnodes)
> and
> +rnode_comparator(const void * p1, const void * p2)
> 
> The extra spaces after the asterisks should be removed.
> 
> +void smgrdounlinkall(SMgrRelation * rels, int nrels, bool isRedo)
> +{
> 
> void should be on a line by itself.
> 
> Sorry to nitpick.

No, thanks for the nitpicking! Code style is important.

> As for BSEARCH_LIMIT, I don't have a great idea - maybe just
> DROP_RELATIONS_BSEARCH_LIMIT?

Sounds good. I've changed the name and fixed the codestyle issues in the
attached version of the patch.

Tomas
diff --git a/src/backend/catalog/storage.c b/src/backend/catalog/storage.c
index 8dc622a..8c12a43 100644
--- a/src/backend/catalog/storage.c
+++ b/src/backend/catalog/storage.c
@@ -312,6 +312,11 @@ smgrDoPendingDeletes(bool isCommit)
PendingRelDelete *pending;
PendingRelDelete *prev;
PendingRelDelete *next;
+
+   SMgrRelation   *srels = palloc(sizeof(SMgrRelation));
+   int nrels = 0,
+   i = 0,
+   maxrels = 1;
 
prev = NULL;
for (pending = pendingDeletes; pending != NULL; pending = next)
@@ -335,14 +340,31 @@ smgrDoPendingDeletes(bool isCommit)
SMgrRelation srel;
 
srel = smgropen(pending->relnode, 
pending->backend);
-   smgrdounlink(srel, false);
-   smgrclose(srel);
+   
+				/* extend the array if needed (double the size) */
+				if (maxrels <= nrels) {
+					maxrels *= 2;
+					srels = repalloc(srels, sizeof(SMgrRelation) * maxrels);
+				}
+
+				srels[nrels++] = srel;
}
/* must explicitly free the list entry */
pfree(pending);
/* prev does not change */
}
}
+
+   if (nrels > 0)
+   {
+   smgrdounlinkall(srels, nrels, false);
+   
+   for (i = 0; i < nrels; i++)
+   smgrclose(srels[i]);
+   }
+   
+   pfree(srels);
+
 }
 
 /*
diff --git a/src/backend/storage/buffer/bufmgr.c b/src/backend/storage/buffer/bufmgr.c
index dddb6c0..52ca2b0 100644
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -62,6 +62,7 @@
 #define BUF_WRITTEN0x01
 #define BUF_REUSABLE   0x02
 
+#define DROP_RELATIONS_BSEARCH_LIMIT   10
 
 /* GUC variables */
 bool   zero_damaged_pages = false;
@@ -108,6 +109,7 @@ static volatile BufferDesc *BufferAlloc(SMgrRelation smgr,
 static void FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln);
 static void AtProcExit_Buffers(int code, Datum arg);
 
+static int rnode_comparator(const void * p1, const void * p2);
 
 /*
  * PrefetchBuffer -- initiate asynchronous read of a block of a relation
@@ -2094,35 +2096,87 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
  * 
  */
 void
-DropRelFileNodeAllBuffers(RelFileNodeBackend rnode)
+DropRelFileNodeAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
 {
-   int i;
+   int i, j, n = 0;
+   RelFileNode * nodes;
+
+   nodes = palloc(sizeof(RelFileNode) * nnodes); /* non-local relations */
 
/* If it's a local relation, it's localbuf.c's problem. */
-   if (RelFileNodeBackendIsTemp(rnode))
+   for (i = 0; i < nnodes; i++)
{
-   if (rnode.backend == MyBackendId)
-   DropRelFileNodeAllLocalBuffers(rnode.node);
+   if (RelFileNodeBackendIsTemp(rnodes[i]))
+   {
+   if (rnodes[i].backend == MyBackendId)
+   DropRelFileNodeAllLocalBuffers(rnodes[i].node);
+   }
+   else
+   {
+   nodes[n++] = rnodes[i].node;
+   }
+   }
+
+	/*
+	 * If there are no non-local relations, then we're done. Release the
+	 * memory and return.
+	 */
+   if (n == 0)
+   {
+   pfree(nodes);
return;
}
 
+   /* sort the list of r
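
For reference, the lookup strategy the patch implements - sort the
dropped-relation array once, then probe it with bsearch() when it is long
enough to pay for the sort, falling back to a linear scan below
DROP_RELATIONS_BSEARCH_LIMIT - can be sketched standalone like this. NodeKey
is a simplified stand-in for RelFileNode; this is a sketch, not the patch's
code:

#include <stdlib.h>
#include <string.h>
#include <stdbool.h>

#define DROP_RELATIONS_BSEARCH_LIMIT	10

/* simplified stand-in for RelFileNode */
typedef struct NodeKey
{
	unsigned int spc;
	unsigned int db;
	unsigned int rel;
} NodeKey;

/* memcmp-based comparator, in the style of the patch's rnode_comparator() */
static int
node_comparator(const void *p1, const void *p2)
{
	return memcmp(p1, p2, sizeof(NodeKey));
}

/*
 * Is "key" one of the "n" nodes being dropped?  The array must already be
 * sorted with node_comparator (e.g. by qsort) when the bsearch path is
 * taken; below the limit a plain linear scan is cheaper than sorting.
 */
static bool
node_matches(const NodeKey *key, const NodeKey *nodes, int n)
{
	int			i;

	if (n > DROP_RELATIONS_BSEARCH_LIMIT)
		return bsearch(key, nodes, n, sizeof(NodeKey),
					   node_comparator) != NULL;

	for (i = 0; i < n; i++)
		if (node_comparator(key, &nodes[i]) == 0)
			return true;
	return false;
}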

Re: [HACKERS] too much pgbench init output

2013-01-05 Thread Tomas Vondra
On 6.1.2013 03:03, Tatsuo Ishii wrote:
> As a committer, I have looked into the patch. I noticed two things:
> 
> 1) In the help you put '-q' option into "Common options" section. I
> think this should be moved to "Initialization options" section because
> the option is only applied while initializing.

Good point, moved.

> 2) Shouldn't a long option for '-q' be provided? Something like
> '--quiet-progress-logging'?

I don't think so. Currently each pgbench option has either a short or a
long form, not both (for the same thing). I believe we should stick to this
and either choose "-q" or "--quiet-logging", but not both.

> 3) No patches for docs found (doc/src/sgml/pgbench.sgml)

I've added a brief description of the "-q" option into the docs. IMHO
it's enough but feel free to add some more details.


There's one more thing I've just noticed - the original version of the
patch simply removed the old logging, but this one keeps both old and
quiet logging. But the old logging still uses this:

fprintf(stderr, "%d of %d tuples (%d%%) done.\n", 

while the new logging does this

fprintf(stderr, "%d of %d tuples (%d%%) done (elapsed %.2f s,
remaining %.2f s).\n",

i.e. it prints additional info about elapsed/estimated time. Do we want
to keep it this way (i.e. not to mess with the old logging) or do we
want to add these new fields to the old logging too?

I suggest to add it to the old logging, to keep the log messages the
same, the only difference being the logging frequency.


Tomas
diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index e376452..31764fe 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -39,6 +39,7 @@
 #include "portability/instr_time.h"
 
 #include <ctype.h>
+#include <math.h>
 
 #ifndef WIN32
 #include <sys/time.h>
@@ -102,6 +103,7 @@ extern int  optind;
 #define MAXCLIENTS 1024
 #endif
 
+#define LOG_STEP_SECONDS   5   /* seconds between log messages */
 #define DEFAULT_NXACTS 10  /* default nxacts */
 
 int			nxacts = 0;			/* number of transactions per client */
@@ -150,6 +152,7 @@ char	   *index_tablespace = NULL;
 #define naccounts	100000
 
 bool		use_log;		/* log transaction latencies to a file */
+bool		use_quiet;		/* quiet logging onto stderr */
 bool		is_connect;		/* establish connection for each transaction */
 bool		is_latencies;	/* report per-command latencies */
 int			main_pid;		/* main process id used in log filename */
@@ -359,6 +362,7 @@ usage(void)
   "  -n   do not run VACUUM after initialization\n"
   "  -F NUM   fill factor\n"
   "  -s NUM   scaling factor\n"
+  "  -q   quiet logging (one message each 5 seconds)\n"
   "  --foreign-keys\n"
   "   create foreign key constraints between 
tables\n"
   "  --index-tablespace=TABLESPACE\n"
@@ -1362,6 +1366,11 @@ init(bool is_no_vacuum)
charsql[256];
int i;
 
+   /* used to track elapsed time and estimate of the remaining time */
+   instr_time  start, diff;
+   double  elapsed_sec, remaining_sec;
+   int log_interval = 1;
+
if ((con = doConnect()) == NULL)
exit(1);
 
@@ -1430,6 +1439,8 @@ init(bool is_no_vacuum)
}
PQclear(res);
 
+   INSTR_TIME_SET_CURRENT(start);
+
for (i = 0; i < naccounts * scale; i++)
{
int j = i + 1;
@@ -1441,10 +1452,33 @@ init(bool is_no_vacuum)
exit(1);
}
 
-		if (j % 100000 == 0)
+		/* If we want to stick with the original logging, print a message
+		 * each 100k inserted rows. */
+		if ((! use_quiet) && (j % 100000 == 0))
 			fprintf(stderr, "%d of %d tuples (%d%%) done.\n",
-					j, naccounts * scale,
-					(int) (((int64) j * 100) / (naccounts * scale)));
+					j, naccounts * scale,
+					(int) (((int64) j * 100) / (naccounts * scale)));
+		/* let's not call the timing for each row, but only each 100 rows */
+		else if (use_quiet && (j % 100 == 0))
+		{
+			INSTR_TIME_SET_CURRENT(diff);
+			INSTR_TIME_SUBTRACT(diff, start);
+
+			elapsed_sec = INSTR_TIME_GET_DOUBLE(diff);
+			remaining_sec = (scale * naccounts - j) * elapsed_sec / j;
+
+			/* have we reached the next interval (or end)? */
+			if ((j == sc
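
For reference, the elapsed/remaining arithmetic above is a plain linear
extrapolation: remaining = (total - done) * elapsed / done. A standalone
sketch of it, with clock_gettime() standing in for the backend's instr_time
macros; the helper name is hypothetical:

#include <stdio.h>
#include <time.h>

/*
 * Print a pgbench-style progress line for "done" of "total" rows
 * (done > 0), extrapolating the remaining time linearly from the
 * elapsed time.
 */
static void
print_progress(const struct timespec *start, long done, long total)
{
	struct timespec now;
	double		elapsed_sec;
	double		remaining_sec;

	clock_gettime(CLOCK_MONOTONIC, &now);
	elapsed_sec = (now.tv_sec - start->tv_sec) +
		(now.tv_nsec - start->tv_nsec) / 1e9;
	remaining_sec = (total - done) * elapsed_sec / done;

	fprintf(stderr, "%ld of %ld tuples (%d%%) done (elapsed %.2f s, remaining %.2f s).\n",
			done, total, (int) ((done * 100) / total),
			elapsed_sec, remaining_sec);
}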

[HACKERS] question: foreign key constraints and AccessExclusive locks

2013-01-05 Thread Jon Nelson
When adding a foreign key constraint on tableA which references
tableB, why is an AccessExclusive lock on tableB necessary? Wouldn't a
lock that prevents writes be sufficient, or does PostgreSQL have to
modify *both* tables in some fashion? I'm using PostgreSQL 8.4 on
Linux.

-- 
Jon




Re: [HACKERS] too much pgbench init output

2013-01-05 Thread Tatsuo Ishii
> On 19.12.2012 06:30, Jeevan Chalke wrote:
>> On Mon, Dec 17, 2012 at 5:37 AM, Tomas Vondra wrote:
>> 
>> Hi,
>> 
>> attached is a new version of the patch that
>> 
>> (a) converts the 'log_step_seconds' variable to a constant (and does
>> not allow changing it using a command-line option etc.)
>> 
>> (b) keeps the current logging as a default
>> 
>> (c) adds a "-q" switch that enables the new logging with a 5-second
>> interval
>> 
>> I'm still not convinced there should be yet another knob for tuning the
>> log interval - opinions?
>> 
>> 
>> It seems that you have generated a patch over your earlier version and
>> due to that it is not cleanly applying on fresh sources.
>> Please generate patch on fresh sources.
> 
> Seems you're right - I've attached the proper patch against current master.
> 
>> However, I have absolutely no issues with the design. Also, the code
>> review is already done and it looks good to me.
>> I think to move forward on this we need someone from the core team. So I
>> am planning to change its status to "ready-for-committer".
>> 
>> Before that please provide updated patch for final code review.

As a committer, I have looked into the patch. I noticed two things:

1) In the help you put '-q' option into "Common options" section. I
think this should be moved to "Initialization options" section because
the option is only applied while initializing.

2) Shouldn't a long option for '-q' be provided? Something like
'--quiet-progress-logging'?

3) No patches for docs found (doc/src/sgml/pgbench.sgml)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




[HACKERS] dynamic SQL - possible performance regression in 9.2

2013-01-05 Thread Jeff Janes
On Wednesday, January 2, 2013, Tom Lane wrote:

> Jeff Janes  writes:
> > Using a RULE-based partitioning instead with row by row insertion, the
> > plancache changes  slowed it down by 300%, and this patch doesn't change
> > that.  But that seems to be down to the insertion getting planned
> > repeatedly, because it decides the custom plan is cheaper than the
> generic
> > plan.  Whatever savings the custom plan may have are clearly less than
> the
> > cost of doing the planning repeatedly.
>
> That scenario doesn't sound like it has anything to do with the one being
> discussed in this thread.  But what do you mean by "rule-based
> partitioning" exactly?  A rule per se wouldn't result in a cached plan
> at all, let alone one with parameters, which would be necessary to
> trigger any use of the custom-cached-plan code path.
>

Right, it is not related to the dynamic SQL, but is to the plan-cache.


> Test cases are way more interesting than hand-wavy complaints.
>

Sorry, when exiled to the hinterlands I have more time to test various
things but not a good enough connectivity to describe them well.  I'm
attaching the test case to load 1e5 rows into a very skinny table with 100
partitions using rules.

"origin" is from a few days ago, "origin_reduce_copies" is Heikki's patch,
and "origin_one_shot" is your now-committed patch.  (unshown are
e6faf910d75027 and e6faf910d75027_prev, but that is where the regression
was introduced)

JJ /usr/local/pgsql_REL9_1_7/
Time: 64252.6907920837 ms
JJ origin/
Time: 186657.824039459 ms
JJ origin_reduce_copies/
Time: 185370.236873627 ms
JJ origin_one_shot/
Time: 189104.484081268 ms


The root problem is that it thinks the generic plan costs about 50% more
than the custom one.  I don't know why it thinks that, or how much it is
worth chasing it down.
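
For context, the decision being described can be sketched in simplified
form: the cached-plan machinery compares the generic plan's estimated cost
against the average cost of the custom plans built so far, including a
charge for the planning work itself. This is an illustrative sketch only,
not the actual plancache.c logic; all names, fields and the exact rule are
assumptions:

/*
 * Illustrative sketch of a generic-vs-custom cached-plan decision.
 */
typedef struct PlanCostHistory
{
	double		generic_cost;		/* estimated cost of the generic plan */
	double		total_custom_cost;	/* sum of custom-plan costs so far */
	int			num_custom_plans;	/* number of custom plans built */
} PlanCostHistory;

static int
prefer_custom_plan(const PlanCostHistory *h, double planning_cost)
{
	double		avg_custom_cost;

	/* no history yet: build a custom plan to learn its cost */
	if (h->num_custom_plans == 0)
		return 1;

	/* each custom plan must also pay for the planning work itself */
	avg_custom_cost = h->total_custom_cost / h->num_custom_plans
		+ planning_cost;

	/*
	 * Keep replanning while custom plans still look cheaper.  In the
	 * case above, the generic plan was estimated ~50% more expensive,
	 * so a test like this keeps choosing the custom path and pays the
	 * planning cost on every insert.
	 */
	return avg_custom_cost < h->generic_cost;
}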

On the other hand, your patch does fix almost all of the 9.2.[012]
regression of using the following dynamic SQL trigger (instead of RULES) to
load into the same test case.

CREATE OR REPLACE FUNCTION foo_insert_trigger()
RETURNS trigger AS $$
DECLARE tablename varchar(24);
BEGIN
tablename = 'foo_' || new.partition;
EXECUTE 'INSERT INTO '|| tablename ||' VALUES (($1).*)' USING NEW ;
RETURN NULL;
END;
$$
LANGUAGE plpgsql;

CREATE TRIGGER foo_insert_trigger
BEFORE INSERT ON foo
FOR EACH ROW EXECUTE PROCEDURE foo_insert_trigger();

Cheers,

Jeff
## The purpose of this program is to load the same files that \copy would 
## load into a pgsql table, but do so one row at a time rather than in bulk.
## This could be useful to demonstrate the difference in loading efficiency between bulk 
## and row by row, without having to use different formats for each.

## This makes no attempt to deal with escape characters like \t and \n the same way \copy does, 
## so the loaded results will not be identical but that shouldn't matter if used only for 
## benchmarking and not for actual production loading (and why would you use this for that purpose when 
## \copy is available?)

## When loading into a 2 column unindexed untriggered table, Perl takes about half the CPU time and 
## postgres about half.  When loading to a table with triggers or indexes, Perl's slice becomes 
## mostly immaterial.

use strict;
use warnings;
use DBI;
use Time::HiRes qw(time);
my $start=time();
my $dbi=DBI->connect('DBI:Pg:');
my ($columns) = $dbi->selectrow_array("select count(*) from information_schema.columns where table_name=?", undef, $ARGV[0]);
$columns > 0 or die "no such table '$ARGV[0]'";
## prepare an insert with as many placeholders as the table has columns
my $insert=$dbi->prepare("Insert into $ARGV[0] values (" . (join ',', map '?', 1..$columns) .')');

open my $fh, "<", $ARGV[1] or die "Couldn't open '$ARGV[1]': $!";
$dbi->begin_work();
while (<$fh>) { chomp;
	my @x=split /\t/;
	$insert->execute(@x);
};
$dbi->commit();
my $stop=time();
## make a timing output that look like the one psql \timing would generate
print "Time: ", 1000* ($stop-$start), " ms\n";

-- -- make rand.csv like below.
-- perl -le 'print join "\t", int(rand()*1e9),$_%100 foreach 1..1e5' > rand.csv

-- -- execute on different versions:
--  while (true) ; do for f in /usr/local/pgsql_REL9_1_7/  /usr/local/pgsql_REL9_2_2/ origin_94afbd5831fbc1926f/; do pg_ctl stop ; rm -r /tmp/data; $f/bin/initdb; $f/bin/pg_ctl start -w -o "--shared_buffers=512MB --checkpoint_segments=60 --shared_buffers=16MB"; createdb; echo "JJ $f";  psql -f rules_test.sql -X ; done ; done >& rules_test.sql.out

-- -- and pull results from the log file
-- tail -n 100 -f rules_test.sql.out |grep -P 'JJ|Time:' -a


drop table foo cascade;
create table foo ( x integer, partition integer);
--create index on foo (x);
create table foo_0 (like foo including indexes) inherits (foo);
create table foo_1 (like foo including indexes) inherits (foo);
create table foo_2 (like foo including indexes) inherits (foo);
create table foo_3 (like foo including indexes) inherits (foo);
create table foo_4 (like foo including indexes) inheri

Re: [HACKERS] [PERFORM] Slow query: bitmap scan troubles

2013-01-05 Thread Tom Lane
Jeff Janes  writes:
> [moved to hackers]
> On Wednesday, December 5, 2012, Tom Lane wrote:
>> Hm.  To tell you the truth, in October I'd completely forgotten about
>> the January patch, and was thinking that the 1/10000 cost had a lot
>> of history behind it.  But if we never shipped it before 9.2 then of
>> course that idea is false.  Perhaps we should backpatch the log curve
>> into 9.2 --- that would reduce the amount of differential between what
>> 9.2 does and what previous branches do for large indexes.

> I think we should backpatch it for 9.2.3.  I've seen another email which is
> probably due to the same issue (nested loop vs hash join).  And some
> monitoring of a database I am responsible for suggests it might be heading
> in that direction as well as the size grows.

I received an off-list report of a case where not only did the 1/10000
factor cause a nestloop-vs-hashjoin decision to be made wrongly, but
even adding the ln() computation as in commit bf01e34b556 didn't fix it.
I believe the index in question was on the order of 20000 pages, so
it's not too hard to see why this might be the case:

* historical fudge factor   4 * 20000/100000 = 0.8
* 9.2 fudge factor          4 * 20000/10000 = 8.0
* with ln() correction      4 * ln(1 + 20000/10000) = 4.39 or so

At this point I'm about ready to not only revert the 100000-to-10000
change, but keep the ln() adjustment, ie make the calculation be
random_page_cost * ln(1 + index_pages/100000).  This would give
essentially the pre-9.2 behavior for indexes up to some tens of
thousands of pages, and keep the fudge factor from getting out of
control even for very very large indexes.
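
The arithmetic above is easy to reproduce; a throwaway C program (compile
with -lm) for a 20000-page index and random_page_cost = 4:

#include <stdio.h>
#include <math.h>

int
main(void)
{
	double		random_page_cost = 4.0;
	double		index_pages = 20000.0;

	/* historical (pre-9.2) fudge factor */
	printf("historical: %.2f\n",
		   random_page_cost * index_pages / 100000.0);			/* 0.80 */

	/* 9.2 fudge factor */
	printf("9.2:        %.2f\n",
		   random_page_cost * index_pages / 10000.0);			/* 8.00 */

	/* with the ln() correction of commit bf01e34b556 */
	printf("with ln():  %.2f\n",
		   random_page_cost * log(1 + index_pages / 10000.0));	/* ~4.39 */

	/* the form proposed above */
	printf("proposed:   %.2f\n",
		   random_page_cost * log(1 + index_pages / 100000.0)); /* ~0.73 */

	return 0;
}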

> But I am wondering if it should be present at all in 9.3.  When it was
> introduced, the argument seemed to be that smaller indexes might be easier
> to keep in cache.

No.  The argument is that if we don't have some such correction, the
planner is liable to believe that different-sized indexes have *exactly
the same cost*, if a given query would fetch the same number of index
entries.  This is quite easy to demonstrate when experimenting with
partial indexes, in particular - without the fudge factor the planner
sees no advantage of a partial index over a full index from which the
query would fetch the same number of entries.  We do want the planner
to pick the partial index if it's usable, and a fudge factor is about
the least unprincipled way to make it do so.

> The argument for increasing the penalty by a factor of 10 was that the
> smaller one could be "swamped by noise such as page-boundary-roundoff
> behavior".

Yeah, I wrote that, but in hindsight it seems like a mistaken idea.
The noise problem is that because we round off page count and row count
estimates to integers at various places, it's fairly easy for small
changes in statistics to move a plan's estimated cost by significantly
more than this fudge factor will.  However, the case that the fudge
factor is meant to fix is indexes that are otherwise identical for
the query's purposes --- and any roundoff effects will be the same.
(The fudge factor itself is *not* rounded off anywhere, it flows
directly to the bottom-line cost for the indexscan.)

> One thing which depends on the index size which, as far as I can tell, is
> not currently being counted is the cost of comparing the tuples all the way
> down the index.  This would be proportional to log2(indextuples) *
> cpu_index_tuple_cost, or maybe log2(indextuples) *
> (cpu_index_tuple_cost+cpu_operator_cost), or something like that.

Yeah, I know.  I've experimented repeatedly over the years with trying
to account explicitly for index descent costs.  But every time, anything
that looks even remotely principled turns out to produce an overly large
correction that results in bad plan choices.  I don't know exactly why
this is, but it's true.

One other point is that I think it is better for any such correction
to depend on the index's total page count, not total tuple count,
because otherwise two indexes that are identical except for bloat
effects will appear to have identical costs.  So from that standpoint,
the ln() form of the fudge factor seems quite reasonable as a crude form
of index descent cost estimate.  The fact that we're needing to dial
it down so much reinforces my feeling that descent costs are close to
negligible in practice.

regards, tom lane




Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2013-01-05 Thread Peter Geoghegan
On 21 December 2012 14:08, Robert Haas  wrote:
> I'm sure it's possible; I don't *think* it's terribly easy.

I'm inclined to agree that this isn't a terribly pressing issue.
Certainly, the need to introduce a bunch of new infrastructure to
detect this case seems hard to justify.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Cascading replication: should we detect/prevent cycles?

2013-01-05 Thread Joshua Berkus
Robert,

> I'm sure it's possible; I don't *think* it's terribly easy.  The
> usual
> algorithm for cycle detection is to have each node send to the next
> node the path that the data has taken.  But, there's no unique
> identifier for each slave that I know of - you could use IP address,
> but that's not really unique.  And, if the WAL passes through an
> archive, how do you deal with that?  

Not that I know how to do this, but it seems like a more direct approach would
be to check whether there's a master anywhere up the line.  Hmmm.  Still sounds 
fairly difficult.

> I'm sure somebody could figure
> all of this stuff out, but it seems fairly complicated for the
> benefit
> we'd get.  I just don't think this is going to be a terribly common
> problem; if it turns out I'm wrong, I may revise my opinion.  :-)

I don't think it'll be that common either.  The problem is that when it does 
happen, it'll be very hard for the hapless sysadmin involved to troubleshoot.

> To me, it seems that lag monitoring between master and standby is
> something that anyone running a complex replication configuration
> should be doing - and yeah, I think anything involving four standbys
> (or cascading) qualifies as complex.  If you're doing that, you
> should
> notice pretty quickly that your replication lag is increasing
> steadily.  

There are many reasons why replication lag would increase steadily.

> You might also check pg_stat_replication the master and
> notice that there are no connections there any more. 

Well, if you've created a true cycle, every server has one or more replicas.  
The original case I presented was the most probable cause of accidental cycles: 
the original master dies, and the on-call sysadmin accidentally connects the 
first replica to the last replica while trying to recover the cluster.

AFAICT, the only way to troubleshoot a cycle is to test every server in the 
network to see if it's a master and has replicas, and if no server is a master 
with replicas, it's a cycle.  Again, not fast or intuitive.

> Could someone
> miss those tell-tale signs?  Sure.  But they could also set
> autovacuum_naptime to an hour and then file a support ticket
> complaining that about table bloat - and they do.  Personally, as
> user
> screw-ups go, I'd consider that scenario (and its fourteen cousins,
> twenty-seven second cousins, and three hundred and ninety two other
> extended family members) as higher-priority and lower effort to fix
> than this particular thing.

I agree that this isn't a particularly high-priority issue.  I do think it 
should go on the TODO list, though, just in case we get a GSOC student or other 
new contributor who wants to tackle it.

--Josh
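
For reference, the path-passing idea described upthread would amount to
something like this: give each node a unique identifier, forward the list of
identifiers the WAL has passed through, and refuse the connection when your
own identifier already appears in that list. A minimal sketch - the streaming
protocol has no such path field today, and a reliable node identifier is
exactly the unsolved part:

#include <string.h>

#define MAX_NODE_ID_LEN 64

/*
 * Would accepting WAL whose upstream path is path[0..n-1] create a cycle
 * for the node identified by "self"?  Each standby would append its own
 * identifier to the path before forwarding WAL downstream.
 */
static int
would_create_cycle(const char *self, char path[][MAX_NODE_ID_LEN], int n)
{
	int			i;

	for (i = 0; i < n; i++)
	{
		if (strcmp(path[i], self) == 0)
			return 1;			/* our own ID is already upstream */
	}
	return 0;
}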






Re: [HACKERS] Re: Proposal: Store "timestamptz" of database creation on "pg_database"

2013-01-05 Thread Stephen Frost
* Fabrízio de Royes Mello (fabriziome...@gmail.com) wrote:
> But those tables are filled only when we execute COMMENT ON statement...
> then your idea is create a 'null' comment every time we create a single
> object... is it?

Yes, and have the actual 'description' field (as it's variable) at the
end of the catalog.

Regarding the semantics of it- I was thinking about how directories and
unix files work.  Basically, adding or removing a sub-object would
update the alter time on the object itself, changing an already existing
object or sub-object would update only the object/sub-object's alter
time.  Creating an object or sub/object would set its create time and
alter time to the same value.  I would distinguish 'create' from
'ctime', however, and have our 'create' time be only the actual
*creation* time of the object.  ALTER table OWNER TO user; would update
"table"s alter time.

Open to other thoughts on this and perhaps we should create a wiki page
to start documenting the semantics.  Once we get agreement there, it's
just a bit of code. :)

Thanks,

Stephen




Re: [HACKERS] Re: [COMMITTERS] pgsql: Invent a "one-shot" variant of CachedPlans for better performanc

2013-01-05 Thread Tom Lane
Simon Riggs  writes:
> On 4 January 2013 22:42, Tom Lane  wrote:
>> Invent a "one-shot" variant of CachedPlans for better performance.

> Just a moment of reflection here but I did already invent a "one-shot"
> plan concept in a patch submission to 9.2, called exactly that, which
> enables a variety of optimizations, this being one.

If you're speaking of
http://archives.postgresql.org/pgsql-hackers/2011-06/msg01168.php

that patch was vastly more invasive than this one, with vastly less
clear performance characteristics (none of which were documented by test
cases).  And it had the potential for unexpected user-visible semantics
changes; for instance, as submitted it made EXPLAIN produce results
different from what might actually happen at execution.  I don't think
there's any comparison to this patch at all except in the nomenclature.

> We need a full "one-shot" concept linking the planner and executor for
> all sorts of reasons, not just this one. We can discuss the
> practicality of individual optimizations but the linkage should be
> clear throughout the whole infrastructure.

I thought then, and I think now, that such a concept was too squishy
to be useful as an actual guide to what to change.  The particular
arguments you advanced then have been overtaken by events anyway;
notably that Marti Raudsepp's work on caching stable subexpressions at
execution seems like a much more general answer to the problem of
handling stable functions efficiently.

regards, tom lane




Re: [HACKERS] pg_retainxlog for inclusion in 9.3?

2013-01-05 Thread Phil Sorber
On Tue, Jan 1, 2013 at 10:10 AM, Magnus Hagander  wrote:
> So, it turns out the reason I got no feedback on this tool, was that I
> forgot both to email about and to actually push the code to github :O
> So this is actually code that's almost half a year old and that I was
> supposed to submit for the first or second commitfest. Oops.
>
> So, the tool and a README for it right now can be found at
> https://github.com/mhagander/pg_retainxlog for the time being.
>
> The idea behind the tool is to plug a hole in the case when
> pg_receivexlog is used for log archiving, which is that you still need
> a "blocking" archive_command in order to make sure that files aren't
> recycled on the master. So for 9.2 you can do this with an
> archive_command that checks if the file has appeared properly on the
> slave - but that usually means you're back to requiring ssh
> connectivity between the machines, for example. Even though this
> information is actually available on the master...
>
> This can also be of use in pure replication scenarios, where you don't
> know how to tune wal_keep_segments, by using actual live feedback
> instead of guessing.
>
> When pg_retainxlog is used as an archive_command, it will check the
> pg_stat_replication view instead of checking the slave. It will then
> only return ok once the requested logfile has been replicated to the
> slave. By default it will look for a replication client named
> pg_receivexlog, but it supports overriding the query with anything -
> so you can say things like "needs to have arrived on at least two
> replication slaves before we consider it archived". Or if used instead
> of wal_keep_segments, needs to have arrived at *all* replication
> slaves.
>
> Is this a tool that people would like to see included in the general
> toolchain? If so, I'll reformat it to work in the general build
> environment and submit it for the last commitfest.
>
> (comments on the code itself are of course also welcome)
>
> --
>  Magnus Hagander
>  Me: http://www.hagander.net/
>  Work: http://www.redpill-linpro.com/
>
>

+1 to this concept, however it may be implemented.
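
For context, a minimal sketch of the idea in libpq C - this is NOT
pg_retainxlog's actual code; the query, the application_name filter, and the
LSN handling are all illustrative assumptions (e.g. that 9.2's
pg_stat_replication exposes flush_location as text in the "X/X" hex format
parsed below):

#include <stdio.h>
#include <libpq-fe.h>

/*
 * Exit 0 only once the replication client named "pg_receivexlog" has
 * flushed past the given LSN, so an archive_command built around this
 * keeps WAL on the master until it is safe to recycle.
 */
int
main(int argc, char **argv)
{
	PGconn	   *conn;
	PGresult   *res;
	unsigned int want_hi, want_lo, got_hi, got_lo;
	int			ok = 0;

	if (argc != 2 || sscanf(argv[1], "%X/%X", &want_hi, &want_lo) != 2)
	{
		fprintf(stderr, "usage: %s <lsn, e.g. 0/1A2B3C4D>\n", argv[0]);
		return 2;
	}

	conn = PQconnectdb("");		/* connection info from PG* environment */
	if (PQstatus(conn) != CONNECTION_OK)
	{
		fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
		PQfinish(conn);
		return 2;
	}

	res = PQexec(conn,
				 "SELECT flush_location FROM pg_stat_replication "
				 "WHERE application_name = 'pg_receivexlog'");

	if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) == 1 &&
		sscanf(PQgetvalue(res, 0, 0), "%X/%X", &got_hi, &got_lo) == 2)
	{
		/* the flushed position must be at or past the wanted LSN */
		ok = (got_hi > want_hi ||
			  (got_hi == want_hi && got_lo >= want_lo));
	}

	PQclear(res);
	PQfinish(conn);
	return ok ? 0 : 1;			/* non-zero makes archive_command retry */
}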




Re: [HACKERS] bad examples in pg_dump README

2013-01-05 Thread Josh Kupershmidt
On Sat, Jan 5, 2013 at 7:34 AM, Magnus Hagander  wrote:
> On Fri, Jan 4, 2013 at 3:36 AM, Josh Kupershmidt  wrote:
> Do we need to keep it at all, really? Certainly the introductory part
> is covered in the main documentation already...

Pretty much. (I did find note #2 mildly interesting.)

> I believe the tar notes are also out of date. For example, they claim
> that you can't expect pg_restore to work with an uncompressed tar
> format - yet the header in pg_backup_tar.c says that an uncompressed
> tar format is compatible with a directory format dump, and thus you
> *can* use pg_restore.
>
> (And fwiw,the note about the user should probably go in src/port/ now
> that we moved the tar header creation there a few days ago)

Hrm yeah, so the second paragraph under the tar section can certainly be axed.

> I would suggest we just drop the README file completely. I don't think
> it adds any value at all.
>
> Any objections to that path? :)

I think that's OK, since there's not much left in that README after
removing the bogus examples, introductory text that's covered
elsewhere, and obsolete second paragraph about the tar format. Perhaps
we could keep the other paragraphs about the tar format, either in the
header comments for pg_backup_tar.c or in src/port/, though?

Oh, and for this comment in pg_backup_tar.c:

|  *  See the headers to pg_backup_files & pg_restore for more details.

there is no longer a pg_backup_files.c.

Josh




Re: [HACKERS] enhanced error fields

2013-01-05 Thread Pavel Stehule
2013/1/5 Peter Geoghegan :
> On 5 January 2013 16:56, Pavel Stehule  wrote:
>>> It seems that we're in agreement, then. I'll prepare a version of the
>>> patch very similar to the one I previously posted, but with some
>>> caveats about how reliably the values can be used. I think that that
>>> should be fine.
>>
>> is there agreement of routine_name and trigger_name fields?
>
> Well, Tom and I are both opposed to including those fields. Peter E
> seemed to support it in some way, but didn't respond to Tom's
> criticisms (which were just a restatement of my own). So, it seems to
> me that we're not going to do that, assuming nothing changes.

If I understand correctly, Robert Haas is for including these fields - so
the score is still 2:2 - but this is not a match :)

I have no more new arguments for these fields - yes, there is no change

Pavel

>
> --
> Peter Geoghegan   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] Re: Proposal: Store "timestamptz" of database creation on "pg_database"

2013-01-05 Thread Fabrízio de Royes Mello
* Stephen Frost  wrote:
>
> * Fabrízio de Royes Mello (fabriziome...@gmail.com) wrote:
> > * also we discuss about create two new catalogs, one local and another
> > shared (like pg_description and pg_shdescription) to track creation
times
> > of all database objects.
>
> Creating a separate catalog (or two) every time we want to track XYZ for
> all objects is rather overkill...  Thinking about this a bit more, and
> noting that pg_description/shdescription more-or-less already exist as a
> framework for tracking 'something' for 'all catalog entries'- why don't
> we just add these columns to those tables..?

But those tables are filled only when we execute a COMMENT ON statement...
then your idea is to create a 'null' comment every time we create a single
object... is it?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] enhanced error fields

2013-01-05 Thread Peter Geoghegan
On 5 January 2013 16:56, Pavel Stehule  wrote:
>> It seems that we're in agreement, then. I'll prepare a version of the
>> patch very similar to the one I previously posted, but with some
>> caveats about how reliably the values can be used. I think that that
>> should be fine.
>
> is there agreement of routine_name and trigger_name fields?

Well, Tom and I are both opposed to including those fields. Peter E
seemed to support it in some way, but didn't respond to Tom's
criticisms (which were just a restatement of my own). So, it seems to
me that we're not going to do that, assuming nothing changes.

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] git author vs committer

2013-01-05 Thread Magnus Hagander
On Thu, Sep 13, 2012 at 9:00 AM, Magnus Hagander  wrote:
> On Thu, Sep 13, 2012 at 5:22 AM, Peter Eisentraut  wrote:
>> On Wed, 2012-09-12 at 19:13 +0200, Magnus Hagander wrote:
>>> Just to be clear, what you're saying is we want to change the policy
>>> that says "committer must be on list of approved committers &&
>>> commiter==author" to "committer must be on list of approved committers
>>> && author must be on list of approved committers"?
>>
>> Yes, that would be my request.
>
> Definitely sounds like a reasonable thing to do. So unless there are
> objections, I'll try to get around to getting that done not too long
> from now.

Sorry, this took much longer than I had initially planned for, but at
least it's been done now.  Hopefully I didn't break anything else in
the process :)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Reporting hba lines

2013-01-05 Thread Magnus Hagander
On Fri, Jun 29, 2012 at 4:39 PM, Tom Lane  wrote:
> Magnus Hagander  writes:
>> Turned out to be a bit more work than I thought, since the current
>> parser reads pg_hba byte by byte, and not line by line. So I had to
>> change that. See attached, seems reasonable?
>
> A couple of comments:
>
> * In some places you have "if ((c = *(*lineptr)++) != '\0')" and in other
> places just "if ((c = *(*lineptr)++))".  This should be consistent (and
> personally I prefer the first way).
>
> * I'm not sure that this conversion is right:
>
> !   if (c != EOF)
> !   ungetc(c, fp);
> ---
> !   if (c != '\0')
> !   (*lineptr)--;
>
> In the file case, it's impossible to push back EOF, and unnecessary
> since another getc will produce EOF again anyway.  In the string case,
> though, I think you might need to decrement the lineptr unconditionally,
> else next call will run off the end of the string no?
>
> * This bit seems a bit ugly, and not Windows-aware either:
>
> !           /* We don't store the trailing newline */
> !           if (rawline[strlen(rawline)-1] == '\n')
> !                   rawline[strlen(rawline)-1] = '\0';
> !
>
> It might be better to strip trailing \n and \r from the line immediately
> upon read, rather than here.

I re-found this branch, which I had completely forgotten about, when
cleaning up my repository. Oops.

Attached is an updated version of the patch, per the comments from Tom
and rebased on top of the current master. Since it's been a long time
ago, and some code churn in the area, another round of review is
probably a good thing...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


hba_line3.patch
Description: Binary data
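
The string-case pitfall Tom describes can be shown in isolation; a tiny
sketch (hypothetical helper, not from the patch):

#include <ctype.h>

/*
 * Skip leading whitespace in an in-memory line.  The getc()-style
 * pattern reads one character too many and pushes the extra one back.
 * With a char ** cursor, even fetching the terminating '\0' advances
 * the pointer, so the step back must be unconditional - otherwise the
 * next caller starts scanning beyond the end of the string.  (With a
 * FILE, getc() returns EOF without consuming anything, so there is
 * nothing to push back.)
 */
static void
skip_whitespace(char **lineptr)
{
	char		c;

	while ((c = *(*lineptr)++) != '\0')
	{
		if (!isspace((unsigned char) c))
			break;
	}
	(*lineptr)--;				/* unconditional: also backs off the '\0' */
}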



Re: [HACKERS] enhanced error fields

2013-01-05 Thread Pavel Stehule
2013/1/4 Peter Geoghegan :
> On 4 January 2013 18:07, Tom Lane  wrote:
>> Exactly.  To my mind, the *entire* point of this patch is to remove the
>> need for people to try to dig information out of potentially-localized
>> message strings.  It's not clear to me that we have to strain to provide
>> information that isn't in the currently-reported messages --- we are
>> only trying to make it easier for client-side code to extract the
>> information it's likely to need.
>
> It seems that we're in agreement, then. I'll prepare a version of the
> patch very similar to the one I previously posted, but with some
> caveats about how reliably the values can be used. I think that that
> should be fine.

is there agreement of routine_name and trigger_name fields?

Regards

Pavel

>
> --
> Peter Geoghegan   http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training and Services




Re: [HACKERS] enhanced error fields

2013-01-05 Thread Pavel Stehule
Hello

2013/1/4 Robert Haas :
> On Fri, Dec 28, 2012 at 1:21 PM, Peter Geoghegan  
> wrote:
>> Now, as to the question of whether we need to make sure that
>> everything is always fully qualified - I respectfully disagree with
>> Stephen, and maintain my position set out in the last round of
>> feedback [1], which is that we don't need to fully qualify everything,
>> because *for the purposes of error handling*, which is what I think we
>> should care about, these fields are sufficient:
>>
>> +   column_name text,
>> +   table_name text,
>> +   constraint_name text,
>> +   schema_name text,
>>
>> If you use the same constraint name everywhere, you might get the same
>> error message. The situations in which this scheme might fall down are
>> too implausible for me to want to bloat up all those ereport sites any
>> further (something that Stephen also complained about).
>>
>> I think that the major outstanding issues are concerning whether or
>> not I have the API here right. I make explicit guarantees as to the
>> availability of certain fields for certain errcodes (a small number of
>> "Class 23 - Integrity Constraint Violation" codes). No one has really
>> said anything about that, though I think it's important.
>>
>> I also continue to think that we shouldn't have "routine_name",
>> "routine_schema" and "trigger_name" fields - I think it's wrong-headed
>> to have an exception handler magically act on knowledge about where
>> the exception came from that has been baked into the exception - is
>> there any sort of precedent for this? Pavel disagrees here. Again, I
>> defer to others.
>
> I don't really agree with this.  To be honest, my biggest concern
> about this patch is that it will make it take longer to report an
> error.  At least in the cases where these additional fields are
> included, it will take CPU time to populate those fields, and maybe
> there will be some overhead even in the cases where they aren't even
> used (although I'd expect that to be too little to measure).  Now,
> maybe that doesn't matter in the case where the error is being
> reported back to the client, because the overhead of shipping packets
> across even a local socket likely dwarfs the overhead, but I think it
> matters a lot where you are running a big exception-catching loop.
> That is already painfully slow, and -1 from me on anything that makes
> it significantly slower.  I have a feeling this isn't the first time
> I'm ranting about this topic in relation to this patch, so apologies
> if this is old news.

We did these tests independently - Peter and me, with the same result. In
use cases developed to highlight the impact of this patch you can see some
slowdown - if I remember well, less than 3% - and that was with code
raising exceptions really intensively. And it is visible only from the
stored procedure environment - if somebody uses it from the client side,
the impact is zero.

>
> But if we decide that there is no performance issue or that we don't
> care about the hit there, then I think Stephen and Pavel are right to
> want a large amount of very precise detail.  What's the use case for
> this feature?  Presumably, it's for people for whom parsing the error
> message is not a practical option, so either they textual error
> message doesn't identify the target object sufficiently precisely, and
> they want to make sure they know what it applies to; or else it's for
> people who want any error that applies to a table to be handled the
> same way (without worrying about exactly which error they have got).
> Imagine, for example, someone writing a framework that will be used as
> a basis for many different applications.  It might want to do
> something, like, I don't know, update the comment on a table every
> time an error involving that table occurs.  Clearly, precise
> identification of the table is a must.  In a particular application
> development environment, it's reasonable to rely on users to name
> things sensibly, but if you're shipping a tool that might be dropped
> into any arbitrary environment, that's significantly more dangerous.
>
> Similarly, for a function-related error, you'd need something like
> that looks like the output of pg_proc.oid::regprocedure, or individual
> fields with the various components of that output.  That sounds like
> routine_name et. al.

Probably we don't need all the fields mentioned in ANSI SQL, because some
of these fields make no sense in pg, but routine name and trigger
name are really basic for slightly more sophisticated work with
exceptions at the PL level.

Regards

Pavel


>
> I'm not happy about the idea of shipping OIDs back to the client.
> OIDs are too user-visible as it is; we should try to make them less
> not moreso.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]

2013-01-05 Thread Boszormenyi Zoltan

On 2013-01-05 16:58, Magnus Hagander wrote:

On Sat, Jan 5, 2013 at 3:41 PM, Magnus Hagander  wrote:

On Thu, Jan 3, 2013 at 1:33 PM, Boszormenyi Zoltan  wrote:

On 2013-01-02 11:54, Boszormenyi Zoltan wrote:

On 2013-01-02 10:37, Boszormenyi Zoltan wrote:

On 2013-01-02 10:12, Magnus Hagander wrote:
Attached is the quotes-v2 patch, the function is renamed and
the comment is modified plus the pg_basebackup v21 patch
that uses this function.


Rebased after pgtar.h was added. The quotes.c comment was modified
even more so it doesn't say this function is used for SQL string literals.

Quotes patch applied with a few small changes:
1) Support for the msvc build (needs an entry added for new files that
go in src/port if they should be used on Windows)
2) Makefile entries should be alphabetical (yes, that's really trivial
nitpicking :D)

After some further modifications, I've applied the pg_basebackup patch as well.


Thank you very much!


Thanks! And again, apologies for dragging the process out longer than
should've been necessary.


I blamed it on me not having done reviews earlier in the commitfest,
which I finally did last week. Now, if only Tom could find some time
to review the lock_timeout patch... ;-)

Thanks again and best regards,
Zoltán Böszörményi

--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/





Re: [HACKERS] Re: Proposal: Store "timestamptz" of database creation on "pg_database"

2013-01-05 Thread Stephen Frost
* Fabrízio de Royes Mello (fabriziome...@gmail.com) wrote:
> * also we discuss about create two new catalogs, one local and another
> shared (like pg_description and pg_shdescription) to track creation times
> of all database objects.

Creating a separate catalog (or two) every time we want to track XYZ for
all objects is rather overkill. Thinking about this a bit more, and
noting that pg_description/shdescription more-or-less already exist as a
framework for tracking 'something' for 'all catalog entries' - why don't
we just add these columns to those tables? This would also address
Peter's concern about making sure we do this 'wholesale' and in one
release rather than spread across multiple releases - just make sure it
covers the same set of things which 'comment' does.
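
For concreteness, a minimal sketch of what that could look like in
src/include/catalog/pg_description.h - the column name "ctime" and its
placement are assumptions, and pg_shdescription would get the same
treatment; timestamptz already appears in a catalog header today
(pg_authid.rolvaliduntil), so bootstrap can handle the type:

    CATALOG(pg_description,2609) BKI_WITHOUT_OIDS
    {
        Oid         objoid;         /* OID of object itself */
        Oid         classoid;       /* OID of table containing object */
        int4        objsubid;       /* column number, or 0 if not used */
        timestamptz ctime;          /* proposed: creation time of object */

    #ifdef CATALOG_VARLEN           /* variable-length fields start here */
        text        description;   /* description of object */
    #endif
    } FormData_pg_description;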

Also, I don't think we really need a GUC for this.

Thanks,

Stephen




Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]

2013-01-05 Thread Magnus Hagander
On Sat, Jan 5, 2013 at 3:41 PM, Magnus Hagander  wrote:
> On Thu, Jan 3, 2013 at 1:33 PM, Boszormenyi Zoltan  wrote:
>> On 2013-01-02 11:54, Boszormenyi Zoltan wrote:
>>
>> On 2013-01-02 10:37, Boszormenyi Zoltan wrote:
>>
>> On 2013-01-02 10:12, Magnus Hagander wrote:
>> Attached is the quotes-v2 patch, the function is renamed and
>> the comment is modified plus the pg_basebackup v21 patch
>> that uses this function.
>>
>>
>> Rebased after pgtar.h was added. The quotes.c comment was modified
>> even more so it doesn't say this function is used for SQL string literals.
>
> Quotes patch applied with a few small changes:
> 1) Support for the msvc build (needs an entry added for new files that
> go in src/port if they should be used on Windows)
> 2) Makefile entries should be alphabetical (yes, that's really trivial
> nitpicking :D)

After some further modifications, I've applied the pg_basebackup patch as well.

Thanks! And again, apologies for dragging the process out longer than
should've been necessary.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] Re: Proposal: Store "timestamptz" of database creation on "pg_database"

2013-01-05 Thread Fabrízio de Royes Mello
On Fri, Jan 4, 2013 at 4:07 PM, Peter Eisentraut  wrote:

> On 1/3/13 3:26 PM, Robert Haas wrote:
> > It's true, as we've often
> > said here, that leveraging the OS facilities means that we get the
> > benefit of improving OS facilities "for free" - but it also means that
> > we never exceed what the OS facilities are able to provide.
>
> And that should be the deciding factor, shouldn't it?  Clearly, the OS
> timestamps do not satisfy the requirements of tracking database object
> creation times.
>
>
+1

And IMHO we must decide what to do, or whether we'll do anything at all.

Many ways to implement (and not implement) this were discussed in this
thread, so let me compile some important points raised before (sorry if I
forgot something):

* the original proposal was just to add a column to the shared catalog
'pg_database' to track creation time (I already sent a patch [1]), but the
discussion moved toward implementing a way to track the creation time of
all database objects

* some people said that if we implement that, then we must have some way
to alter creation times via SQL (ALTER commands) and also a way to dump
and restore this info with pg_dump/pg_restore. Some agreed and others
disagreed.

* we talked about implementing it with event triggers, but they do not
cover shared objects (like databases, roles, tablespaces, ...), and
someone suggested extending event triggers to cover this kind of object,
or maybe providing a way to create *shared tables* (this is what I
understood). This approach forces everybody to implement their own
time-tracking system, unless someone shares an extension that does it.

* we also discussed creating two new catalogs, one local and one shared
(like pg_description and pg_shdescription), to track creation times of
all database objects.

Please correct me if I forgot something. Anyway, we must decide what to do.

I don't know enough, but I have another idea: what do you guys think about
having tables like the stats tables to track creation times, with a GUC to
enable or disable this behavior?


Regards,


[1] http://archives.postgresql.org/pgsql-hackers/2013-01/msg00111.php

-- 
Fabrízio de Royes Mello
PostgreSQL Consulting/Coaching
>> IT blog: http://fabriziomello.blogspot.com
>> LinkedIn profile: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] [PATCH] Make pg_basebackup configure and start standby [Review]

2013-01-05 Thread Magnus Hagander
On Thu, Jan 3, 2013 at 1:33 PM, Boszormenyi Zoltan  wrote:
> On 2013-01-02 11:54, Boszormenyi Zoltan wrote:
>
> On 2013-01-02 10:37, Boszormenyi Zoltan wrote:
>
> On 2013-01-02 10:12, Magnus Hagander wrote:
> Attached is the quotes-v2 patch, the function is renamed and
> the comment is modified plus the pg_basebackup v21 patch
> that uses this function.
>
>
> Rebased after pgtar.h was added. The quotes.c comment was modified
> even more so it doesn't say this function is used for SQL string literals.

Quotes patch applied with a few small changes:
1) Support for the msvc build (needs an entry added for new files that
go in src/port if they should be used on Windows)
2) Makefile entries should be alphabetical (yes, that's really trivial
nitpicking :D)


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] bad examples in pg_dump README

2013-01-05 Thread Magnus Hagander
On Fri, Jan 4, 2013 at 3:36 AM, Josh Kupershmidt  wrote:
> The ./src/bin/pg_dump README contains several inoperable examples.
> First, this suggestion:
>
> |   or to list tables:
> |
> |   pg_restore  --table | less
>
> seems bogus, i.e. the --table option requires an argument specifying
> which table to restore, and does not "list tables". Next, this
> suggested command:
>
> |  pg_restore  -l --oid --rearrange | less
>
> hasn't worked since 7.4 or thereabouts, since we don't have
> --rearrange or --oid anymore. (I'm not sure we ever had --oid, maybe
> it was supposed to be --oid-order?). Then, three examples claim we
> support a "--use" flag, e.g.
>
> |  pg_restore backup.bck --use=toc.lis > script.sql
>
> where presumably "--use-list" was meant. Further little gripes with
> this README include mixed use of tabs and spaces for the examples, and
> lines running over the 80-column limit which at least some of our
> other READMEs seem to honor.
>
> I started out attempting to fix up the README, keeping the original
> example commands intact, but upon further reflection I think the
> examples of dumping, restoring, and selective-restoring in that README
> don't belong there anyway. There are already better examples of such
> usage in the pg_dump/pg_restore documentation pages, and IMO that's
> where such generally-useful usage information belongs.
>
> I propose slimming down the pg_dump README, keeping intact the
> introductory notes and details of the tar format.

Do we need to keep it at all, really? Certainly the introductory part
is covered in the main documentation already...

I believe the tar notes are also out of date. For example, they claim
that you can't expect pg_restore to work with an uncompressed tar
format - yet the header in pg_backup_tar.c says that an uncompressed
tar format is compatible with a directory format dump, and thus you
*can* use pg_restore.
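
For instance, this round trip works today (a sketch, assuming databases
mydb and otherdb both exist):

   pg_dump -F t -f backup.tar mydb
   pg_restore -d otherdb backup.tar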

(And FWIW, the note about the user should probably go in src/port/ now
that we moved the tar header creation there a few days ago.)

I would suggest we just drop the README file completely. I don't think
it adds any value at all.

Any objections to that path? :)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [HACKERS] pg_retainxlog for inclusion in 9.3?

2013-01-05 Thread Magnus Hagander
On Fri, Jan 4, 2013 at 7:13 PM, Peter Eisentraut  wrote:
> On 1/3/13 12:30 PM, Robert Haas wrote:
>> On Thu, Jan 3, 2013 at 11:32 AM, Magnus Hagander  wrote:
>>> Any particular reason? It goes pretty tightly together with
>>> pg_receivexlog, which is why I'd prefer putting it alongside that one.
>>> But if you have a good argument against it, I can change my mind :)
>>
>> Mostly that it seems like a hack, and I suspect we may come up with a
>> better way to do this in the future.
>
> It does seem like a hack.  Couldn't this be implemented with a backend
> switch instead?

It definitely is a bit of a hack.

I assume by backend switch you mean a GUC, right? If so, no, not easily:
it's the archiver process that does the deleting, and that process does
not have access to a "full backend interface", e.g. the ability to run a
query. We could make it look at the same data that's currently shown in
pg_stat_replication through shared memory, but that would *only* work in
the very simplest cases (e.g. a single pg_receivexlog and no other
replication). The ability to run a custom SQL query is going to be
necessary for anything a bit more advanced.
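
To illustrate, here is a hedged sketch of the kind of check such a tool
has to make - the query and all names are assumptions, not
pg_retainxlog's actual code, and it has exactly the single-receiver
limitation described above:

    #include <string.h>
    #include <stdbool.h>
    #include "libpq-fe.h"

    /* Return true only once pg_receivexlog has confirmed receipt of the
     * WAL segment the archiver is asking about. */
    static bool
    segment_is_received(PGconn *conn, const char *segname)
    {
        bool        ok;
        PGresult   *res;

        res = PQexec(conn,
                     "SELECT pg_xlogfile_name(flush_location) "
                     "FROM pg_stat_replication "
                     "WHERE application_name = 'pg_receivexlog'");

        /* Expect exactly one receiver; anything else means retry later. */
        if (PQresultStatus(res) != PGRES_TUPLES_OK || PQntuples(res) != 1)
        {
            PQclear(res);
            return false;
        }

        /* WAL segment file names sort lexicographically in WAL order. */
        ok = strcmp(PQgetvalue(res, 0, 0), segname) >= 0;
        PQclear(res);
        return ok;
    }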


> Also, as a small practical matter, since this is a server-side program
> (since it's being used as archive_command), we shouldn't put it into the
> pg_basebackup directory, because that would blur the lines about what to
> install where, in particular for the translations.

Good argument. That, along with it being a hack and the comment from
Robert, means that maybe contrib/ is a better place for it, yes.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




[HACKERS] Re: Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-01-05 Thread Noah Misch
On Sat, Jan 05, 2013 at 08:05:11AM +0100, Boszormenyi Zoltan wrote:
> On 2013-01-05 05:58, Amit Kapila wrote:
>> On Friday, January 04, 2013 10:57 PM Boszormenyi Zoltan wrote:
>>> In create_conf_lock_file():
>>>
>>> Can't we add a new LWLock and use it as a critical section instead
>>> of waiting for 10 seconds? It would be quite surprising to wait
>>> 10 seconds when accidentally executing two SET PERSISTENT
>>> statements in parallel.
>>
>> Yes, you are right that adding a new LWLock will avoid the use of sleep.
>> However, I am just not sure whether we should add a new LWLock for only
>> this purpose.
>>
>> Similar logic is used in CreateLockFile() for the postmaster file, but
>> it doesn't use sleep.
>> How about reducing the sleep time or removing the sleep entirely, in
>> which case the user might get an error and need to retry the command?
>
> I think removing the loop entirely is better.

Agreed.  A new LWLock with a narrow purpose is fine.  CreateLockFile() happens
early, before shared memory is available.  Its approach is not a relevant
precedent for code that runs later.

> However, the behaviour should be discussed by a wider audience:
> - the backends should wait for each other just like other statements
>that fight for the same object (in which case locking is needed), or
> - throw an error immediately on conflicting parallel work

All other things being equal, the first choice sounds better.  (I don't think
the decision is essential to the utility of this feature.)
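
To make the two options concrete, a minimal sketch - SetPersistentLock
and write_auto_conf_file() are placeholder names, not identifiers from
the patch:

    /* Option 1: wait for the concurrent SET PERSISTENT to finish.  An
     * error inside the section is safe, because LWLocks are released
     * automatically during transaction abort. */
    LWLockAcquire(SetPersistentLock, LW_EXCLUSIVE);
    write_auto_conf_file(name, value);
    LWLockRelease(SetPersistentLock);

    /* Option 2: throw an error immediately on conflicting parallel
     * work, using the conditional variant. */
    if (!LWLockConditionalAcquire(SetPersistentLock, LW_EXCLUSIVE))
        ereport(ERROR,
                (errcode(ERRCODE_LOCK_NOT_AVAILABLE),
                 errmsg("another SET PERSISTENT command is in progress")));
    write_auto_conf_file(name, value);
    LWLockRelease(SetPersistentLock);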




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-01-05 Thread Boszormenyi Zoltan

On 2013-01-05 05:58, Amit Kapila wrote:
> On Friday, January 04, 2013 10:57 PM Boszormenyi Zoltan wrote:
>> Hi,
>> I am reviewing your patch.
>
> Thank you very much.
>
>> Since you are using a constant string, it would be a little faster
>> to use "sizeof(string)-1" as it can be computed at compile-time
>> and not run the strlen() all the time this code is reached.
>
> I referred to the code just above, for PG_TEMP_FILE_PREFIX in the same
> function sendDir().
> It is not only that place; there are other places where strlen is used
> for PG_TEMP_FILE_PREFIX.
> I think in this path, such an optimization might not help much.
> However, if you think we should take care of this, then I can find
> other places where a similar change can be done, to make it consistent?


Yes, but it needs a separate cleanup patch.

Also, since the SET PERSISTENT patch seems to create a single lock file,
sizeof(string) (which includes the 0 byte at the end of the string, so it
matches the whole filename) can be used instead of strlen(string) or
sizeof(string)-1, which match the filename as a prefix only.
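
A minimal illustration of the difference (CONF_LOCK_FILE is a
hypothetical macro name, not the one from the patch):

    #include <string.h>
    #include <stdbool.h>

    #define PG_TEMP_FILE_PREFIX "pgsql_tmp"
    #define CONF_LOCK_FILE      "postgresql.auto.conf.lock" /* hypothetical */

    /* Prefix match: temp file names continue after the prefix, so the
     * terminating '\0' is excluded from the compared length. */
    static bool
    is_temp_file_name(const char *name)
    {
        return strncmp(name, PG_TEMP_FILE_PREFIX,
                       sizeof(PG_TEMP_FILE_PREFIX) - 1) == 0;
    }

    /* Whole-name match: counting the '\0' makes strncmp() behave like
     * strcmp(), so only the exact file name matches.  Both lengths are
     * compile-time constants, unlike strlen(). */
    static bool
    is_conf_lock_file(const char *name)
    {
        return strncmp(name, CONF_LOCK_FILE, sizeof(CONF_LOCK_FILE)) == 0;
    }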




>> In create_conf_lock_file():
>>
>> Can't we add a new LWLock and use it as a critical section instead
>> of waiting for 10 seconds? It would be quite surprising to wait
>> 10 seconds when accidentally executing two SET PERSISTENT
>> statements in parallel.
>
> Yes, you are right that adding a new LWLock will avoid the use of sleep.
> However, I am just not sure whether we should add a new LWLock for only
> this purpose.
>
> Similar logic is used in CreateLockFile() for the postmaster file, but
> it doesn't use sleep.
> How about reducing the sleep time or removing the sleep entirely, in
> which case the user might get an error and need to retry the command?
>
> With Regards,
> Amit Kapila.

I think removing the loop entirely is better.

However, the behaviour should be discussed by a wider audience:
- the backends should wait for each other just like other statements
  that fight for the same object (in which case locking is needed), or
- throw an error immediately on conflicting parallel work




--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/


