date:20160127




OK, so I had an extra look at this patch and I am marking it as ready
for committer.


Ok.


Attached is a rebase after recent changes in pgbench code & doc.

--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..d42208a 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -796,17 +796,21 @@ pgbench  options  dbname
   Sets variable varname to an integer value calculated
   from expression.
   The expression may contain integer constants such as 5432,
-  references to variables :variablename,
+  double constants such as 3.14159,
+  references to integer variables :variablename,
   and expressions composed of unary (-) or binary operators
-  (+, -, *, /, %)
-  with their usual associativity, and parentheses.
+  (+, -, *, /,
+  %) with their usual associativity, function calls and
+  parentheses.
+   shows the available
+  functions.
  
 
  
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -826,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for further information
+   about the precise shape of these distributions, depending on the value
+   of the parameter.
  
 
  
@@ -965,18 +938,184 @@ f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))

   
 
+   
+   
+PgBench Functions
+
+ 
+  
+   Function
+   Return Type
+   Description
+   Example
+   Result
+  
+ 
+ 
+

Re: [HACKERS] Patch: ResourceOwner optimization for tables with many partitions

2016-01-27 Thread Aleksander Alekseev

Hello, Tom.

I'm a bit concerned regarding assumption that sizeof int never exceeds 4
bytes. While this could be true today for most C compilers, standard
[1][2] doesn't guarantee that. Perhaps we should add something like:

StaticAssertStmt(sizeof(int) <= sizeof(int32),
"int size exceeds int32 size");

It costs nothing but could save a lot of time (not mentioning data
loss) some unlucky user.

[1] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf
[2] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff



Hello Alvaro,


I'm not really sure about the fact that we operate on those Stats
structs without locking.  I see upthread you convinced Michael that it
was okay, but is it really?  How severe is the damage if two threads
happen to collide?


For stats shared among threads, when it occurs one data about one 
transaction is not counted.


On the risk side: the collision probability is pretty low because the time 
to update a value is a "few" cycles, and the time to execute a transaction 
is typically in ms: I think under 1/10,000,000 data could be lost.


On the advantageous side: locking costs significant time thus would impact 
performance, I think that the measured performance loss because the 
occasional transaction data is not counted is lower that the performance 
loss due to systematically locking.


So for me this is really a low risk trade-off.


[...]
It seems a bit funny to have the start_time not be reset when 0.0 is
passed, which is almost all the callers.  Using a float as a boolean
looks pretty odd; is that kosher?  Maybe it'd be a good idea to have a
separate boolean flag instead?  Something like this

/*
* Initialize a StatsData struct to all zeroes.  Use the given
* start_time only if reset_start_time, otherwise keep the original
* value.
*/
static void
initStats(StatsData *sd, double start_time, bool reset_start_time)
{
sd->cnt = 0;
sd->skipped = 0;
initSimpleStats(&sd->latency);
initSimpleStats(&sd->lag);

/* not necessarily overriden? */
if (reset_start_time)
sd->start_time = start_time;
}


Obviously this would work. I did not think the special case was worth the 
extra argument. This one has some oddity too, because the second argument 
is ignored depending on the third. Do as you feel.



I renamed a couple of your functionettes, for instance doSimpleStats to
addToSimpleStats and appendSimpleStats to mergeSimpleStats.


Fine with me.

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff



Hello again,


If you want to implement real non-ambiguous-prefix code (i.e. have "se"
for "select-only", but reject "s" as ambiguous) be my guest.


I'm fine with filtering out ambiguous cases (i.e. just the "s" case). 
Attached a small patch for that.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..124e70d 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -269,6 +269,9 @@ pgbench  options  dbname
 Add the specified builtin script to the list of executed scripts.
 Available builtin scripts are: tpcb-like,
 simple-update and select-only.
+The provided scriptname needs only be an unambiguous
+prefix of the builtin name, hence si would be enough to
+select simple-update.
 With special name list, show the list of builtin scripts
 and exit immediately.

diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d5f242c..6350948 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -2649,22 +2649,32 @@ listAvailableScripts(void)
 	fprintf(stderr, "\n");
 }
 
+/* return commands for selected builtin script, if unambiguous */
 static char *
 findBuiltin(const char *name, char **desc)
 {
-	int			i;
+	int			i, found = 0, len = strlen(name);
+	char	   *commands = NULL;
 
 	for (i = 0; i < N_BUILTIN; i++)
 	{
-		if (strncmp(builtin_script[i].name, name,
-	strlen(builtin_script[i].name)) == 0)
+		if (strncmp(builtin_script[i].name, name, len) == 0)
 		{
 			*desc = builtin_script[i].desc;
-			return builtin_script[i].commands;
+			commands = builtin_script[i].commands;
+			found++;
 		}
 	}
 
-	fprintf(stderr, "no builtin script found for name \"%s\"\n", name);
+	if (found == 1)
+		return commands;
+
+	/* error cases */
+	if (found == 0)
+		fprintf(stderr, "no builtin script found for name \"%s\"\n", name);
+	else /* found > 1 */
+		fprintf(stderr,
+"%d builtin scripts found for prefix \"%s\"\n", found, name);
 	listAvailableScripts();
 	exit(1);
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-01-27 Thread Etsuro Fujita


On 2016/01/27 12:20, Etsuro Fujita wrote:

On 2016/01/26 22:57, Rushabh Lathia wrote:

On Tue, Jan 26, 2016 at 4:15 PM, Etsuro Fujita
mailto:fujita.ets...@lab.ntt.co.jp>> wrote:

On 2016/01/25 17:03, Rushabh Lathia wrote:



int
IsForeignRelUpdatable (Relation rel);



Documentation for IsForeignUpdatable() need to change as it says:

If the IsForeignRelUpdatable pointer is set to NULL, foreign
tables are
assumed
to be insertable, updatable, or deletable if the FDW provides
ExecForeignInsert,
ExecForeignUpdate or ExecForeignDelete respectively.

With introduce of DMLPushdown API now this is no more correct,
as even if
FDW don't provide ExecForeignInsert, ExecForeignUpdate or
ExecForeignDelete API
still foreign tables are assumed to be updatable or deletable
with
DMLPushdown
API's, right ?



That's what I'd like to discuss.

I intentionally leave that as-is, because I think we should
determine the updatability of a foreign table in the current
manner.  As you pointed out, even if the FDW doesn't provide eg,
ExecForeignUpdate, an UPDATE on a foreign table could be done using
the DML pushdown APIs if the UPDATE is *pushdown-safe*.  However,
since all UPDATEs on the foreign table are not necessarily
pushdown-safe, I'm not sure it's a good idea to assume the
table-level updatability if the FDW provides the DML pushdown
callback routines.  To keep the existing updatability decision, I
think the FDW should provide the DML pushdown callback routines
together with ExecForeignInsert, ExecForeignUpdate, or
ExecForeignDelete.  What do you think about that?



Sorry but I am not in favour of adding compulsion that FDW should provide
the DML pushdown callback routines together with existing
ExecForeignInsert,
ExecForeignUpdate or ExecForeignDelete APIs.

May be we should change the documentation in such way, that explains

a) If FDW PlanDMLPushdown is NULL, then check for ExecForeignInsert,
ExecForeignUpdate or ExecForeignDelete APIs
b) If FDW PlanDMLPushdown is non-NULL and plan is not pushable
check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs
c) If FDW PlanDMLPushdown is non-NULL and plan is pushable
check for DMLPushdown APIs.

Does this sounds wired ?



Yeah, but I think that that would be what is done during executor
startup (see CheckValidResultRel()), while what the documentation is
saying is about relation_is_updatable(); that is, how to decide the
updatability of a given foreign table, not how the executor processes an
individual INSERT/UPDATE/DELETE on a updatable foreign table.  So, I'm
not sure it's a good idea to modify the documentation in such a way.



However, I agree that we should add a documentation note about the
compulsion somewhere.  Maybe something like this:

The FDW should provide DML pushdown callback routines together with
table-updating callback routines described above.  Even if the callback
routines are provided, the updatability of a foreign table is determined
based on the presence of ExecForeignInsert, ExecForeignUpdate or
ExecForeignDelete if the IsForeignRelUpdatable pointer is set to NULL.


On second thought, I think it might be okay to assume the presence of 
PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown, and 
EndDMLPushdown is also sufficient for the insertablity, updatability, 
and deletability of a foreign table, if the IsForeignRelUpdatable 
pointer is set to NULL.  How about modifying the documentation like this:


If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are 
assumed to be insertable, updatable, or deletable if the FDW provides 
ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, 
or if the FDW provides PlanDMLPushdown, BeginDMLPushdown, 
IterateDMLPushdown, and EndDMLPushdown described below.


Of course, we also need to modify relation_is_updatable() accordingly.

What's your opinion?

Best regards,
Etsuro Fujita




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] GIN pending list clean up exposure to SQL

2016-01-27 Thread Fujii Masao

On Mon, Jan 25, 2016 at 3:54 PM, Jeff Janes  wrote:
> On Wed, Jan 20, 2016 at 6:17 AM, Fujii Masao  wrote:
>> On Sat, Jan 16, 2016 at 7:42 AM, Julien Rouhaud
>>  wrote:
>>> On 15/01/2016 22:59, Jeff Janes wrote:
 On Sun, Jan 10, 2016 at 4:24 AM, Julien Rouhaud
  wrote:
>>>
>>> All looks fine to me, I flag it as ready for committer.
>>
>> When I compiled the PostgreSQL with the patch, I got the following error.
>> ISTM that the inclusion of pg_am.h header file is missing in ginfast.c.
>
> Thanks.  Fixed.
>
>> gin_clean_pending_list() must check whether the server is in recovery or not.
>> If it's in recovery, the function must exit with an error. That is, IMO,
>> something like the following check must be added.
>>
>>  if (RecoveryInProgress())
>>  ereport(ERROR,
>>
>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>>   errmsg("recovery is in progress"),
>>   errhint("GIN pending list cannot be
>> cleaned up during recovery.")));
>>
>> It's better to make gin_clean_pending_list() check whether the target index
>> is temporary index of other sessions or not, like pgstatginindex() does.
>
> I've added both of these checks.  Sorry I overlooked your early email
> in this thread about those.
>
>>
>> +RelationindexRel = index_open(indexoid, AccessShareLock);
>>
>> ISTM that AccessShareLock is not safe when updating the pending list and
>> GIN index main structure. Probaby we should use RowExclusiveLock?
>
> Other callers of the ginInsertCleanup function also do so while
> holding only the AccessShareLock on the index.  It turns out that
> there is a bug around this, as discussed in "Potential GIN vacuum bug"
> (http://www.postgresql.org/message-id/flat/CAMkU=1xalflhuuohfp5v33rzedlvb5aknnujceum9knbkrb...@mail.gmail.com)
>
> But, that bug has to be solved at a deeper level than this patch.
>
> I've also cleaned up some other conflicts, and chose a more suitable
> OID for the new catalog function.
>
> The number of new header includes needed to implement this makes me
> wonder if I put this code in the correct file, but I don't see a
> better location for it.
>
> New version attached.

Thanks for updating the patch! It looks good to me.

Based on your patch, I just improved the doc. For example,
I added the following note into the doc.

+These functions cannot be executed during recovery.
+Use of these functions is restricted to superusers and the owner
+of the given index.

If there is no problem, I'm thinking to commit this version.

Regards,

-- 
Fujii Masao
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 18036,18044  postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18036,18051 
  brin_summarize_new_values
 
  
+
+ gin_clean_pending_list
+
+ 
 
   shows the functions
  available for index maintenance tasks.
+ These functions cannot be executed during recovery.
+ Use of these functions is restricted to superusers and the owner
+ of the given index.
 
  
 
***
*** 18056,18061  postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18063,18075 
 integer
 summarize page ranges not already summarized

+   
+
+ gin_clean_pending_list(index regclass)
+
+bigint
+move GIN pending list entries into main index structure
+   
   
  
 
***
*** 18069,18074  postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
--- 18083,18100 
  into the index.
 
  
+
+ gin_clean_pending_list accepts the OID or name of
+ a GIN index and cleans up the pending list of the specified GIN index
+ by moving entries in it to the main GIN data structure in bulk.
+ It returns the number of pages cleaned up from the pending list.
+ Note that the cleanup does not happen and the return value is 0
+ if the argument is the GIN index built with fastupdate
+ option disabled because it doesn't have a pending list.
+ Please see  and 
+ for details of the pending list and fastupdate option.
+
+ 

  

*** a/doc/src/sgml/gin.sgml
--- b/doc/src/sgml/gin.sgml
***
*** 734,740 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the table is vacuumed, or if the pending list becomes larger than
 , the entries are moved to the
 main GIN data structure using the same bulk insert
 techniques used during initial index creation.  This greatly improves
--- 734,742 
 from the indexed item). As of PostgreSQL 8.4,
 GIN is capable of postponing much of this work by inserting
 new tuples into a temporary, unsorted list of pending entries.
!When the tabl

Re: [HACKERS] WIP: Failover Slots

2016-01-27 Thread Craig Ringer

Hi all

Here's v3 of failover slots.

It doesn't add the UI yet, but it's now functionally complete except for
timeline following for logical slots, and I have a plan for that.
From 533a9327b54ba744b0a1fb0048e8cfe7d3d45ea1 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Wed, 20 Jan 2016 17:16:29 +0800
Subject: [PATCH 1/2] Implement failover slots

Originally replication slots were unique to a single node and weren't
recorded in WAL or replicated. A logical decoding client couldn't follow
a physical standby failover and promotion because the promoted replica
didn't have the original master's slots. The replica may not have
retained all required WAL and there was no way to create a new logical
slot and rewind it back to the point the logical client had replayed to
anyway.

Failover slots lift this limitation by replicating slots consistently to
physical standbys, keeping them up to date and using them in WAL
retention calculations. This allows a logical decoding client to follow
a physical failover and promotion without losing its place in the change
stream.

Simon Riggs and Craig Ringer

WIP. Open items:

* Testing
* Implement !failover slots and UI for marking slots as failover slots
* Fix WAL retention for slots created before a basebackup
---
 src/backend/access/rmgrdesc/Makefile   |   2 +-
 src/backend/access/rmgrdesc/replslotdesc.c |  63 
 src/backend/access/transam/rmgr.c  |   1 +
 src/backend/access/transam/xlogutils.c |   6 +-
 src/backend/commands/dbcommands.c  |   3 +
 src/backend/replication/basebackup.c   |  12 -
 src/backend/replication/logical/decode.c   |   1 +
 src/backend/replication/logical/logical.c  |  19 +-
 src/backend/replication/logical/logicalfuncs.c |   3 +
 src/backend/replication/slot.c | 439 -
 src/backend/replication/slotfuncs.c|   1 +
 src/bin/pg_xlogdump/replslotdesc.c |   1 +
 src/bin/pg_xlogdump/rmgrdesc.c |   1 +
 src/include/access/rmgrlist.h  |   1 +
 src/include/replication/slot.h |  61 +---
 src/include/replication/slot_xlog.h| 103 ++
 16 files changed, 624 insertions(+), 93 deletions(-)
 create mode 100644 src/backend/access/rmgrdesc/replslotdesc.c
 create mode 12 src/bin/pg_xlogdump/replslotdesc.c
 create mode 100644 src/include/replication/slot_xlog.h

diff --git a/src/backend/access/rmgrdesc/Makefile b/src/backend/access/rmgrdesc/Makefile
index c72a1f2..600b544 100644
--- a/src/backend/access/rmgrdesc/Makefile
+++ b/src/backend/access/rmgrdesc/Makefile
@@ -10,7 +10,7 @@ include $(top_builddir)/src/Makefile.global
 
 OBJS = brindesc.o clogdesc.o committsdesc.o dbasedesc.o gindesc.o gistdesc.o \
 	   hashdesc.o heapdesc.o mxactdesc.o nbtdesc.o relmapdesc.o \
-	   replorigindesc.o seqdesc.o smgrdesc.o spgdesc.o \
+	   replorigindesc.o replslotdesc.o seqdesc.o smgrdesc.o spgdesc.o \
 	   standbydesc.o tblspcdesc.o xactdesc.o xlogdesc.o
 
 include $(top_srcdir)/src/backend/common.mk
diff --git a/src/backend/access/rmgrdesc/replslotdesc.c b/src/backend/access/rmgrdesc/replslotdesc.c
new file mode 100644
index 000..b882846
--- /dev/null
+++ b/src/backend/access/rmgrdesc/replslotdesc.c
@@ -0,0 +1,63 @@
+/*-
+ *
+ * replslotdesc.c
+ *	  rmgr descriptor routines for replication/slot.c
+ *
+ * Portions Copyright (c) 2015, PostgreSQL Global Development Group
+ *
+ *
+ * IDENTIFICATION
+ *	  src/backend/access/rmgrdesc/replslotdesc.c
+ *
+ *-
+ */
+#include "postgres.h"
+
+#include "replication/slot_xlog.h"
+
+void
+replslot_desc(StringInfo buf, XLogReaderState *record)
+{
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = XLogRecGetInfo(record) & ~XLR_INFO_MASK;
+
+	switch (info)
+	{
+		case XLOG_REPLSLOT_UPDATE:
+			{
+ReplicationSlotInWAL xlrec;
+
+xlrec = (ReplicationSlotInWAL) rec;
+
+appendStringInfo(buf, "slot %s to xmin=%u, catmin=%u, restart_lsn="UINT64_FORMAT"@%u",
+		NameStr(xlrec->name), xlrec->xmin, xlrec->catalog_xmin,
+		xlrec->restart_lsn, xlrec->restart_tli);
+
+break;
+			}
+		case XLOG_REPLSLOT_DROP:
+			{
+xl_replslot_drop *xlrec;
+
+xlrec = (xl_replslot_drop *) rec;
+
+appendStringInfo(buf, "slot %s", NameStr(xlrec->name));
+
+break;
+			}
+	}
+}
+
+const char *
+replslot_identify(uint8 info)
+{
+	switch (info)
+	{
+		case XLOG_REPLSLOT_UPDATE:
+			return "CREATE_OR_UPDATE";
+		case XLOG_REPLSLOT_DROP:
+			return "DROP";
+		default:
+			return NULL;
+	}
+}
diff --git a/src/backend/access/transam/rmgr.c b/src/backend/access/transam/rmgr.c
index 7c4d773..0bd5796 100644
--- a/src/backend/access/transam/rmgr.c
+++ b/src/backend/access/transam/rmgr.c
@@ -24,6 +24,7 @@
 #include "commands/sequence.h"
 #include "commands/tablespace.h"
 #include "replication/origin.h"
+#include "

[HACKERS] Mac OS: invalid byte sequence for encoding "UTF8"

2016-01-27 Thread Artur Zakirov


Hello.

When a user try to create a text search dictionary for the russian 
language on Mac OS then called the following error message:


  CREATE EXTENSION hunspell_ru_ru;
+ ERROR:  invalid byte sequence for encoding "UTF8": 0xd1
+ CONTEXT:  line 341 of configuration file 
"/Users/stas/code/postgrespro2/tmp_install/Users/stas/code/postgrespro2/install/share/tsearch_data/ru_ru.affix": 
"SFX Y   хаться шутсяхаться


Russian dictionary was downloaded from 
http://extensions.openoffice.org/en/project/slovari-dlya-russkogo-yazyka-dictionaries-russian
Affix and dictionary files was extracted from the archive and converted 
to UTF-8. Also a converted dictionary can be downloaded from 
https://github.com/select-artur/hunspell_dicts/tree/master/ru_ru


This behavior occurs on:
- Mac OS X 10.10 Yosemite and Mac OS X 10.11 El Capitan.
- latest PostgreSQL version from git and PostgreSQL 9.5 (probably also 
on 9.4.5).


There is also the test to reproduce this bug in the attachment.

Did you meet this bug? Do you have a solution or a workaround?

Thanks in advance.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
#include 
#include 

char *src = "SFX Y   ÑÐ°ÑÑÑÑ ÑÑÑÑÑÑÐ°ÑÑÑÑ";

int
main(int argc, char *argv[])
{
	char c1[1024], c2[1024], c3[1024], c4[1024], c5[1024];

	setlocale(LC_CTYPE, "ru_RU.UTF-8");

	sscanf(src, "%6s %204s %204s %204s %204s", c1, c2, c3, c4, c5);

	printf("%s/%s/%s/%s/%s\n", c1, c2, c3, c4, c5);

	return 0;
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgbench stats per script & other stuff



Hello again,


Here's part b rebased, pgindented and with some minor additional tweaks
(mostly function commands and the function renames I mentioned).


Patch looks ok to me, various tests where ok as well.


Still concerned about the unlocked stat accums.


See my arguments in other mail. I can add a lock if this is a blocker, but 
I think that it is actually better without, because of quantum: the 
measuring process should avoid affecting the measured data, and locking is 
not cheap.



I haven't tried to rebase the other ones yet, they need manual conflict
fixes.


Find attached 14-c/d/e rebased patches.

About e, for some obscure reason I failed in my initial attempt at 
inserting the misplaced options in their rightfull position in the option 
list. Sorry for the noise.


--
Fabien.diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index 42d0667..ade1b53 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -1138,6 +1138,9 @@ number of transactions actually processed: 1/1
 tps = 618.764555 (including connections establishing)
 tps = 622.977698 (excluding connections establishing)
 SQL script 1: 
+ - 1 transactions (100.0% of total, tps = 618.764555)
+ - latency average = 15.844 ms
+ - latency stddev = 2.715 ms
  - statement latencies in milliseconds:
 0.004386\set nbranches 1 * :scale
 0.001343\set ntellers 10 * :scale
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index 305c319..5594d1c 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -164,6 +164,7 @@ bool		use_log;			/* log transaction latencies to a file */
 bool		use_quiet;			/* quiet logging onto stderr */
 int			agg_interval;		/* log aggregates instead of individual
  * transactions */
+boolper_script_stats = false; /* whether to collect stats per script */
 int			progress = 0;		/* thread progress report every this seconds */
 bool		progress_timestamp = false; /* progress report with Unix time */
 int			nclients = 1;		/* number of clients */
@@ -299,6 +300,7 @@ static struct
 {
 	const char *name;
 	Command   **commands;
+	StatsData stats;
 }	sql_script[MAX_SCRIPTS];	/* SQL script files */
 static int	num_scripts;		/* number of scripts in sql_script[] */
 static int	num_commands = 0;	/* total number of Command structs */
@@ -1308,7 +1310,7 @@ top:
 		/* transaction finished: calculate latency and log the transaction */
 		if (commands[st->state + 1] == NULL)
 		{
-			if (progress || throttle_delay || latency_limit || logfile)
+			if (progress || throttle_delay || latency_limit || per_script_stats || logfile)
 doTxStats(thread, st, &now, false, logfile, agg);
 			else
 thread->stats.cnt++;
@@ -1401,7 +1403,7 @@ top:
 	}
 
 	/* Record transaction start time under logging, progress or throttling */
-	if ((logfile || progress || throttle_delay || latency_limit) && st->state == 0)
+	if ((logfile || progress || throttle_delay || latency_limit || per_script_stats) && st->state == 0)
 	{
 		INSTR_TIME_SET_CURRENT(st->txn_begin);
 
@@ -1889,6 +1891,9 @@ doTxStats(TState *thread, CState *st, instr_time *now,
 
 	if (use_log)
 		doLog(thread, st, logfile, now, agg, skipped, latency, lag);
+
+	if (per_script_stats) /* mutex? hmmm... these are only statistics */
+		doStats(& sql_script[st->use_file].stats, skipped, latency, lag);
 }
 
 
@@ -2678,6 +2683,7 @@ addScript(const char *name, Command **commands)
 
 	sql_script[num_scripts].name = name;
 	sql_script[num_scripts].commands = commands;
+	initStats(& sql_script[num_scripts].stats, 0.0);
 	num_scripts++;
 }
 
@@ -2761,22 +2767,40 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
 	printf("tps = %f (including connections establishing)\n", tps_include);
 	printf("tps = %f (excluding connections establishing)\n", tps_exclude);
 
-	/* Report per-command latencies */
-	if (is_latencies)
+	/* Report per-script stats */
+	if (per_script_stats)
 	{
 		int			i;
 
 		for (i = 0; i < num_scripts; i++)
 		{
-			Command   **commands;
+			printf("SQL script %d: %s\n"
+   " - "INT64_FORMAT" transactions (%.1f%% of total, tps = %f)\n",
+   i+1, sql_script[i].name,
+   sql_script[i].stats.cnt,
+   100.0 * sql_script[i].stats.cnt / total->cnt,
+   sql_script[i].stats.cnt / time_include);
 
-			printf("SQL script %d: %s\n", i + 1, sql_script[i].name);
-			printf(" - statement latencies in milliseconds:\n");
+			if (latency_limit)
+printf(" - number of transactions skipped: "INT64_FORMAT" (%.3f%%)\n",
+	   sql_script[i].stats.skipped,
+	   100.0 * sql_script[i].stats.skipped /
+	   (sql_script[i].stats.skipped + sql_script[i].stats.cnt));
 
-			for (commands = sql_script[i].commands; *commands != NULL; commands++)
-printf("   %11.3f  %s\n",
-   1000.0 * (*commands)->stats.sum / (*commands)->stats.count,
-	   (*commands)->line);
+			printSimpleStats(" - latency",

[HACKERS] Trivial doc fix in logicaldecoding.sgml

2016-01-27 Thread Shulgin, Oleksandr

Hi,

Please find attached a simple copy-paste fix for CREATE_REPLICATION_SLOT
syntax.

--
Alex
From 05119485a473febe8ffd95103fd7774bc31ee079 Mon Sep 17 00:00:00 2001
From: Oleksandr Shulgin 
Date: Wed, 27 Jan 2016 11:27:35 +0100
Subject: [PATCH] Fix CREATE_REPLICATION_SLOT syntax in logicaldecoding.sgml

---
 doc/src/sgml/logicaldecoding.sgml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/src/sgml/logicaldecoding.sgml b/doc/src/sgml/logicaldecoding.sgml
index 1ae5eb6..926637b 100644
--- a/doc/src/sgml/logicaldecoding.sgml
+++ b/doc/src/sgml/logicaldecoding.sgml
@@ -280,7 +280,7 @@ $ pg_recvlogical -d postgres --slot test --drop-slot
 The commands
 
  
-  CREATE_REPLICATION_SLOT slot_name LOGICAL options
+  CREATE_REPLICATION_SLOT slot_name LOGICAL output_plugin
  
 
  
-- 
2.5.0


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Mac OS: invalid byte sequence for encoding "UTF8"

2016-01-27 Thread Shulgin, Oleksandr

On Wed, Jan 27, 2016 at 10:59 AM, Artur Zakirov 
wrote:

> Hello.
>
> When a user try to create a text search dictionary for the russian
> language on Mac OS then called the following error message:
>
>   CREATE EXTENSION hunspell_ru_ru;
> + ERROR:  invalid byte sequence for encoding "UTF8": 0xd1
> + CONTEXT:  line 341 of configuration file
> "/Users/stas/code/postgrespro2/tmp_install/Users/stas/code/postgrespro2/install/share/tsearch_data/ru_ru.affix":
> "SFX Y   хаться шутсяхаться
>
> Russian dictionary was downloaded from
> http://extensions.openoffice.org/en/project/slovari-dlya-russkogo-yazyka-dictionaries-russian
> Affix and dictionary files was extracted from the archive and converted to
> UTF-8. Also a converted dictionary can be downloaded from
> https://github.com/select-artur/hunspell_dicts/tree/master/ru_ru


Not sure why the file uses "SET KOI8-R" directive then?

This behavior occurs on:
> - Mac OS X 10.10 Yosemite and Mac OS X 10.11 El Capitan.
> - latest PostgreSQL version from git and PostgreSQL 9.5 (probably also on
> 9.4.5).
>
> There is also the test to reproduce this bug in the attachment.
>

What error message do you get with this test program?  (I don't get any,
but I'm not on Mac OS.)

--
Alex

Re: [HACKERS] pgbench stats per script & other stuff

2016-01-27 Thread Alvaro Herrera

Fabien COELHO wrote:

> >It seems a bit funny to have the start_time not be reset when 0.0 is
> >passed, which is almost all the callers.  Using a float as a boolean
> >looks pretty odd; is that kosher?  Maybe it'd be a good idea to have a
> >separate boolean flag instead?

> Obviously this would work. I did not think the special case was worth the
> extra argument. This one has some oddity too, because the second argument is
> ignored depending on the third. Do as you feel.

Actually my question was whether keeping the original start_time was the
intended design.  I think some places are okay with keeping the original
value, but the ones in addScript, the per-thread loop in main(), and the
global one also in main() should all be getting a 0.0 instead of leaving
the value uninitialized.

(I did turn the arguments around so that the bool is second and the
float is third.  Thanks for the suggestion.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Minor improvement to fdwhandler.sgml

2016-01-27 Thread Etsuro Fujita

Here is a small patch to do s/for/For/ to two section titles in
fdwhandlers.sgml, for consistency.

Best regards,
Etsuro Fujita
diff --git a/doc/src/sgml/fdwhandler.sgml b/doc/src/sgml/fdwhandler.sgml
index dc2d890..9c8406c 100644
--- a/doc/src/sgml/fdwhandler.sgml
+++ b/doc/src/sgml/fdwhandler.sgml
@@ -798,7 +798,7 @@ RecheckForeignScan (ForeignScanState *node, TupleTableSlot *slot);

 

-FDW Routines for EXPLAIN
+FDW Routines For EXPLAIN
 
 
 
@@ -851,7 +851,7 @@ ExplainForeignModify (ModifyTableState *mtstate,

 

-FDW Routines for ANALYZE
+FDW Routines For ANALYZE
 
 
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Mac OS: invalid byte sequence for encoding "UTF8"

2016-01-27 Thread Artur Zakirov

On 27.01.2016 13:46, Shulgin, Oleksandr wrote:

Not sure why the file uses "SET KOI8-R" directive then?

This directive is used only by Hunspell program. PostgreSQL ignores this 
directive and assumes that input affix and dictionary files in the UTF-8 
encoding.

What error message do you get with this test program?  (I don't get any,
but I'm not on Mac OS.)
--
Alex

With this program you will get wrong output. A error message is not 
called. You can execute the following commands:

> cc test.c -o test
> ./test

You will get the output:

SFX/Y/?/аться/шутся

Although the output should be:

SFX/Y/хаться/шутся/хаться

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Fwd: [DOCS] pgbench doc typos

2016-01-27 Thread Erik Rijkers


Two trivial changes to  doc/src/sgml/ref/pgbench.sgml

Erik Rijkers


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Mac OS: invalid byte sequence for encoding "UTF8"

2016-01-27 Thread Stas Kelvich

Hi.

I tried that and confirm strange behaviour. It seems that problem with small 
cyrillic letter ‘х’. (simplest obscene language filter? =)

That can be reproduced with simpler test

Stas



test.c
Description: Binary data

 
> On 27 Jan 2016, at 13:59, Artur Zakirov  wrote:
> 
> On 27.01.2016 13:46, Shulgin, Oleksandr wrote:
>> 
>> Not sure why the file uses "SET KOI8-R" directive then?
>> 
> 
> This directive is used only by Hunspell program. PostgreSQL ignores this 
> directive and assumes that input affix and dictionary files in the UTF-8 
> encoding.
> 
>> 
>> 
>> What error message do you get with this test program?  (I don't get any,
>> but I'm not on Mac OS.)
>> --
>> Alex
>> 
>> 
> 
> With this program you will get wrong output. A error message is not called. 
> You can execute the following commands:
> 
> > cc test.c -o test
> > ./test
> 
> You will get the output:
> 
> SFX/Y/?/аться/шутся
> 
> Although the output should be:
> 
> SFX/Y/хаться/шутся/хаться
> 
> -- 
> Artur Zakirov
> Postgres Professional: http://www.postgrespro.com
> Russian Postgres Company
> 
> 
> -- 
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Fwd: [DOCS] pgbench doc typos

2016-01-27 Thread Erik Rijkers


On 2016-01-27 11:06, Erik Rijkers wrote:

Two trivial changes to  doc/src/sgml/ref/pgbench.sgml


Sorry - now attached.

Erik Rijkers

--- ./doc/src/sgml/ref/pgbench.sgml.orig	2016-01-27 12:29:16.857488633 +0100
+++ ./doc/src/sgml/ref/pgbench.sgml	2016-01-27 12:30:09.643862616 +0100
@@ -1056,7 +1056,7 @@
  0 84 4142 0 1412881037 918023 2333
  0 85 2465 0 1412881037 919759 740
 
-   In this example, transaction 82 was late, because it's latency (6.173 ms) was
+   In this example, transaction 82 was late, because its latency (6.173 ms) was
over the 5 ms limit. The next two transactions were skipped, because they
were already late before they were even started.
   
@@ -1097,7 +1097,7 @@
   
 
   
-   Here is example outputs:
+   Here is example output:
 
 1345828501 5601 1542744 483552416 61 2573
 1345828503 7884 1979812 565806736 60 1479

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Using user mapping OID as hash key for connection hash

2016-01-27 Thread Ashutosh Bapat

Hi All,
As discussed in postgres_fdw join pushdown thread [1], for two different
effective local users which use public user mapping we will be creating two
different connections to the foreign server with the same credentials.

Robert suggested [2] that we should use user mapping OID as the connection
cache key instead of current userid and serverid.

There are two patches attached here:
1. pg_fdw_concache.patch.short - shorter version of the fix. Right now
ForeignTable, ForeignServer have corresponding OIDs saved in these
structures. But UserMapping doesn't. Patch adds user mapping OID as a
member to this structure. This member is then used as key in
GetConnection().
2. pg_fdw_concache.patch.large - most of the callers of GetConnection() get
ForeignServer object just to pass it to GetConnection(). GetConnection can
obtain ForeignServer by using serverid from UserMapping and doesn't need
ForeignServer to be an argument. Larger version of patch has this change.

GetConnection has named the UserMapping argument as just "user", ideally it
should have been named user_mapping. But that seems to be too obvious to be
unintentional. So, I have left that change.

The patch has added userid and user mapping oid to a debug3 message in
GetConnection(). the message is displayed when a new connection to foreign
server is created. With only that change, if we run script multi_conn.sql
(attached) we see that it's creating two connections when same user mapping
is used. Also attached is the output of the same script run on my setup.
Since min_messages is set to DEBUG3 there are too many unrelated messages.
So, search for "new postgres_fdw connection .." for new connection messages.

I have included the changes to the DEBUG3 message in GetConnection(), since
it may be worth committing those changes. In case, reviewers/committers
disagree, those chagnes can be removed.

[1]
http://www.postgresql.org/message-id/CAFjFpRf-LiD5bai4D6cSUseJh=xxjqipo_vn8mtnzg16tmw...@mail.gmail.com
[2]
http://www.postgresql.org/message-id/ca+tgmoymmv_du-vppq1d7ufsjaopbq+lgpxtchnuqfobjg2...@mail.gmail.com
--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

pg_fdw_concache.patch.large
Description: Binary data

pg_fdw_concache.patch.short
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-01-27 Thread Aleksander Alekseev

> > This patch affects header files. By any chance didn't you forget to
> > run `make clean` after applying it? As we discussed above, when you
> > change .h files autotools doesn't rebuild dependent .c files:
> >
> 
> Yes, actually i always compile using "make clean;make -j20; make
> install" If you want i will run it again may be today or tomorrow and
> post the result.
> 
> 

Most likely HASHHDR.mutex is not only bottleneck in your case so my
patch doesn't help much. Unfortunately I don't have access to any
POWER8 server so I can't investigate this issue. I suggest to use a
gettimeofday trick I described in a first message of this thread. Its
time consuming but it gives a clear understanding which code is keeping
a lock.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] better systemd integration

Hi

2015-11-17 15:08 GMT+01:00 Peter Eisentraut :

> I have written a couple of patches to improve the integration of the
> postgres daemon with systemd.
>
> The setup that is shipped with Red Hat- and Debian-family packages at
> the moment is just an imitation of the old shell scripts, relying on
> polling by pg_ctl for readiness, with various custom pieces of
> complexity for handling custom port numbers and such.
>
> In the first patch, my proposal is to use sd_notify() calls from
> libsystemd to notify the systemd daemon directly when startup is
> completed.  This is the recommended low-overhead solution that is now
> being adopted by many other server packages.  It allows us to cut out
> pg_ctl completely from the startup configuration and makes the startup
> configuration manageable by non-wizards.  An example is included in the
> patch.
>
> The second patch improves integration with the system journal managed by
> systemd.  This is a facility that captures a daemon's standard output
> and error and records it in configurable places, including syslog.  The
> patch adds a new log_destination that is like stderr but marks up the
> output so that systemd knows the severity.  With that in place, users
> can choose to do away with the postgres log file management and let
> systemd do it.
>
> The third patch is technically unrelated but arose while I was working
> on this.  It improves error reporting when the data directory is missing.
>


2. all tests passed

The issues:

1. configure missing systemd integration test, compilation fails:

postmaster.o postmaster.c
postmaster.c:91:31: fatal error: systemd/sd-daemon.h: No such file or
directory

3. PostgreSQL is able to write to systemd log, but multiline entry was
stored with different priorities

 do $$ begin raise warning 'NAZDAREK'; end $$;

first line

{
"__CURSOR" :
"s=cac797bc03f242febea9f32357bba773;i=b4a5;b=e8d5b3df2ebf46dd86c39046b326bd32;m=1cb792a63b;t=52a4f3ad40860;x=57014959bf6e3481",
"__REALTIME_TIMESTAMP" : "1453894661310560",
"__MONOTONIC_TIMESTAMP" : "123338925627",
"_BOOT_ID" : "e8d5b3df2ebf46dd86c39046b326bd32",
"SYSLOG_FACILITY" : "3",
"_UID" : "1001",
"_GID" : "1001",
"_CAP_EFFECTIVE" : "0",
"_SELINUX_CONTEXT" : "system_u:system_r:init_t:s0",
"_MACHINE_ID" : "b8299a722638414a8776d3e130e228e4",
"_HOSTNAME" : "localhost.localdomain",
"_SYSTEMD_SLICE" : "system.slice",
"_TRANSPORT" : "stdout",
"SYSLOG_IDENTIFIER" : "postgres",
"_PID" : "3150",
"_COMM" : "postgres",
"_EXE" : "/usr/local/pgsql/bin/postgres",
"_CMDLINE" : "/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data -c
log_destination=systemd",
"_SYSTEMD_CGROUP" : "/system.slice/postgresql.service",
"_SYSTEMD_UNIT" : "postgresql.service",
"PRIORITY" : "5",
"MESSAGE" : "WARNING:  NAZDAREK"
}

second line

{
"__CURSOR" :
"s=cac797bc03f242febea9f32357bba773;i=b4a6;b=e8d5b3df2ebf46dd86c39046b326bd32;m=1cb792a882;t=52a4f3ad40aa6;x=ae9801b2ecbd4da3",
"__REALTIME_TIMESTAMP" : "1453894661311142",
"__MONOTONIC_TIMESTAMP" : "123338926210",
"_BOOT_ID" : "e8d5b3df2ebf46dd86c39046b326bd32",
"PRIORITY" : "6",
"SYSLOG_FACILITY" : "3",
"_UID" : "1001",
"_GID" : "1001",
"_CAP_EFFECTIVE" : "0",
"_SELINUX_CONTEXT" : "system_u:system_r:init_t:s0",
"_MACHINE_ID" : "b8299a722638414a8776d3e130e228e4",
"_HOSTNAME" : "localhost.localdomain",
"_SYSTEMD_SLICE" : "system.slice",
"_TRANSPORT" : "stdout",
"SYSLOG_IDENTIFIER" : "postgres",
"_PID" : "3150",
"_COMM" : "postgres",
"_EXE" : "/usr/local/pgsql/bin/postgres",
"_CMDLINE" : "/usr/local/pgsql/bin/postgres -D /usr/local/pgsql/data -c
log_destination=systemd",
"_SYSTEMD_CGROUP" : "/system.slice/postgresql.service",
"_SYSTEMD_UNIT" : "postgresql.service",
"MESSAGE" : "CONTEXT:  PL/pgSQL function inline_code_block line 1 at
RAISE"
}

Is it expected?

Second issue:

Mapping of levels between pg and journal levels is moved by1

+case DEBUG1:
+systemd_log_prefix = "<7>" /* SD_DEBUG */;
+break;
+case LOG:
+case COMMERROR:
+case INFO:
+systemd_log_prefix = "<6>" /* SD_INFO */;
+break;
+case NOTICE:
+case WARNING:
+systemd_log_prefix = "<5>" /* SD_NOTICE */;
+break;
+case ERROR:
+systemd_log_prefix = "<4>" /* SD_WARNING */;
+break;
+case FATAL:
+systemd_log_prefix = "<3>" /* SD_ERR */;
+break;
+case PANIC:

is it expected?

This is little bit unexpected - (can be correct).

When I use filtering "warnings", then I got errors, etc. I can understand
so these systems are not compatible, but these differences should be well
documented.

I didn't find any other issues. It is working without any problems.

Regards

Pavel

Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-01-27 Thread Rushabh Lathia

On Wed, Jan 27, 2016 at 2:50 PM, Etsuro Fujita 
wrote:

> On 2016/01/27 12:20, Etsuro Fujita wrote:
>
>> On 2016/01/26 22:57, Rushabh Lathia wrote:
>>
>>> On Tue, Jan 26, 2016 at 4:15 PM, Etsuro Fujita
>>> mailto:fujita.ets...@lab.ntt.co.jp>>
>>> wrote:
>>>
>>> On 2016/01/25 17:03, Rushabh Lathia wrote:
>>>
>>
> int
>>> IsForeignRelUpdatable (Relation rel);
>>>
>>
> Documentation for IsForeignUpdatable() need to change as it says:
>>>
>>> If the IsForeignRelUpdatable pointer is set to NULL, foreign
>>> tables are
>>> assumed
>>> to be insertable, updatable, or deletable if the FDW provides
>>> ExecForeignInsert,
>>> ExecForeignUpdate or ExecForeignDelete respectively.
>>>
>>> With introduce of DMLPushdown API now this is no more correct,
>>> as even if
>>> FDW don't provide ExecForeignInsert, ExecForeignUpdate or
>>> ExecForeignDelete API
>>> still foreign tables are assumed to be updatable or deletable
>>> with
>>> DMLPushdown
>>> API's, right ?
>>>
>>
> That's what I'd like to discuss.
>>>
>>> I intentionally leave that as-is, because I think we should
>>> determine the updatability of a foreign table in the current
>>> manner.  As you pointed out, even if the FDW doesn't provide eg,
>>> ExecForeignUpdate, an UPDATE on a foreign table could be done using
>>> the DML pushdown APIs if the UPDATE is *pushdown-safe*.  However,
>>> since all UPDATEs on the foreign table are not necessarily
>>> pushdown-safe, I'm not sure it's a good idea to assume the
>>> table-level updatability if the FDW provides the DML pushdown
>>> callback routines.  To keep the existing updatability decision, I
>>> think the FDW should provide the DML pushdown callback routines
>>> together with ExecForeignInsert, ExecForeignUpdate, or
>>> ExecForeignDelete.  What do you think about that?
>>>
>>
> Sorry but I am not in favour of adding compulsion that FDW should provide
>>> the DML pushdown callback routines together with existing
>>> ExecForeignInsert,
>>> ExecForeignUpdate or ExecForeignDelete APIs.
>>>
>>> May be we should change the documentation in such way, that explains
>>>
>>> a) If FDW PlanDMLPushdown is NULL, then check for ExecForeignInsert,
>>> ExecForeignUpdate or ExecForeignDelete APIs
>>> b) If FDW PlanDMLPushdown is non-NULL and plan is not pushable
>>> check for ExecForeignInsert, ExecForeignUpdate or ExecForeignDelete APIs
>>> c) If FDW PlanDMLPushdown is non-NULL and plan is pushable
>>> check for DMLPushdown APIs.
>>>
>>> Does this sounds wired ?
>>>
>>
> Yeah, but I think that that would be what is done during executor
>> startup (see CheckValidResultRel()), while what the documentation is
>> saying is about relation_is_updatable(); that is, how to decide the
>> updatability of a given foreign table, not how the executor processes an
>> individual INSERT/UPDATE/DELETE on a updatable foreign table.  So, I'm
>> not sure it's a good idea to modify the documentation in such a way.
>>
>
> However, I agree that we should add a documentation note about the
>> compulsion somewhere.  Maybe something like this:
>>
>> The FDW should provide DML pushdown callback routines together with
>> table-updating callback routines described above.  Even if the callback
>> routines are provided, the updatability of a foreign table is determined
>> based on the presence of ExecForeignInsert, ExecForeignUpdate or
>> ExecForeignDelete if the IsForeignRelUpdatable pointer is set to NULL.
>>
>
> On second thought, I think it might be okay to assume the presence of
> PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown, and EndDMLPushdown
> is also sufficient for the insertablity, updatability, and deletability of
> a foreign table, if the IsForeignRelUpdatable pointer is set to NULL.  How
> about modifying the documentation like this:
>
> If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are
> assumed to be insertable, updatable, or deletable if the FDW provides
> ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete respectively, or
> if the FDW provides PlanDMLPushdown, BeginDMLPushdown, IterateDMLPushdown,
> and EndDMLPushdown described below.
>
> Of course, we also need to modify relation_is_updatable() accordingly.
>
>
> What's your opinion?
>
>
If I understood correctly, above documentation means, that if FDW have
DMLPushdown APIs that is enough. But in reality thats not the case, we need
 ExecForeignInsert, ExecForeignUpdate, or ExecForeignDelete in case DML is
not pushable.

And here fact is DMLPushdown APIs are optional for FDW, so that if FDW
don't have DMLPushdown APIs they can still very well perform the DML
operations using ExecForeignInsert, ExecForeignUpdate, or
ExecForeignDelete. So documentation should be like:

If the IsForeignRelUpdatable pointer is set to NULL, foreign tables are
assumed to be insertable, updatable, or

Re: [HACKERS] Mac OS: invalid byte sequence for encoding "UTF8"

2016-01-27 Thread Artur Zakirov


On 27.01.2016 14:14, Stas Kelvich wrote:

Hi.

I tried that and confirm strange behaviour. It seems that problem with small 
cyrillic letter ‘х’. (simplest obscene language filter? =)

That can be reproduced with simpler test

Stas




The test program was corrected. Now it uses wchar_t type. And it works 
correctly and gives right output.


I think the NIImportOOAffixes() in spell.c should be corrected to avoid 
this bug.


--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
#include 
#include 
#include 

char *src = "SFX Y   ÑÐ°ÑÑÑÑ ÑÑÑÑÑÑÐ°ÑÑÑÑ";

int
main(int argc, char *argv[])
{
	wchar_t c1[1024], c2[1024], c3[1024], c4[1024], c5[1024];

	setlocale(LC_CTYPE, "ru_RU.UTF-8");

	sscanf(src, "%6ls %204ls %204ls %204ls %204ls", c1, c2, c3, c4, c5);

	printf("%ls/%ls/%ls/%ls/%ls\n", c1, c2, c3, c4, c5);

	return 0;
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal:Use PGDLLEXPORT for libpq

2016-01-27 Thread Yury Zhuravlev


Craig Ringer wrote:
On 27 January 2016 at 00:16, Yury Zhuravlev 
 wrote:
 
It says more about the modules, and not about libpq. Using 
gendef.pl for this library in the light of the development of my 
CMake build seems silly.



For what it's worth I personally agree. I'd rather have 
PGDLLEXPORT used directly, not least because it'd let us built 
with -fvisibility=hidden under *nix. But I'm in the minority and 
not inclined to push the point.



If so many problems with MSVC can discard his support of Postgres?
MSVC:
Not supported C99-C1x.
Problems build dynamic library.
Realy problems build out-tree module.
etc

Under windows we can use MinGW64/Msys or LLVM/Clang for MSVC.
The current situation is similar to masochism. We're not trying to change 
the code to make it more portable.
But at the same time try using black magic to make Postgres work on 
non-POSIX systems.

What's the point now support the MSVC?

Thanks.
--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add generate_series(date,date) and generate_series(date,date,integer)

2016-01-27 Thread Torsten Zühlsdorff


On 26.01.2016 13:53, Michael Paquier wrote:


Imagine for example a script that in some rare cases passes happens to
pass infinity into generate_series() - in that case I'd much rather error
out than wait till the end of the universe.

So +1 from me to checking for infinity.



+1

ERROR infinite result sets are not supported, yet



Maybe we should skip the "yet". Or do we really plan to support them in
(infinite) future? ;)

+1 from me to check infinity also.


Something like the patch attached would be fine? This wins a backpatch
because the query continuously running eats memory, no?


Looks fine to me.

(Minor mention: is there no newline needed between the tests?)

Greetings,
Torsten


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal:Use PGDLLEXPORT for libpq

2016-01-27 Thread Michael Paquier

On Wed, Jan 27, 2016 at 9:30 PM, Yury Zhuravlev wrote:
> What's the point now support the MSVC?

Many companies use it, including mine, and likely EDB.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal:Use PGDLLEXPORT for libpq

2016-01-27 Thread Alvaro Herrera

Yury Zhuravlev wrote:

> If so many problems with MSVC can discard his support of Postgres?

That doesn't sound likely.  Keep in mind that users might want to
compile extension modules, and not everyone is going to use mingw for
that.  As far as I know, stuff compiled with MSVC is not compatible with
mingw compiled objects.  So even if the main packages switched to
compiling with mingw, we'd probably still want to support MSVC.

> Under windows we can use MinGW64/Msys or LLVM/Clang for MSVC.

I'm guessing that LLVM/Clang port would be useful for something, but I'm
not clear what.

Are we moving forward with the CMake stuff?  It would be *awesome* to
get rid of the MSVC build scripts, and perhaps we can move forward on
some smaller items such as PGDLLEXPORT markings as well.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal:Use PGDLLEXPORT for libpq

2016-01-27 Thread Craig Ringer

On 27 January 2016 at 20:54, Alvaro Herrera 
wrote:

> > If so many problems with MSVC can discard his support of Postgres?
>

I strongly disagree. MSVC is a high quality compiler and the primary tool
for the platform. Yes, it's behind on standards support and that's annoying
-  OTOH MinGW relies on reverse-engineered headers, an old gcc fork, and
has had some pretty nasty bugs.

It's a bit like saying we should drop gcc support on Linux and use icc or
clang because it's more convenient for us.

Every platform has warts. We're just more used to ignoring warts on
!windows.

> That doesn't sound likely.  Keep in mind that users might want to
> compile extension modules, and not everyone is going to use mingw for
> that.  As far as I know, stuff compiled with MSVC is not compatible with
> mingw compiled objects.  So even if the main packages switched to
> compiling with mingw, we'd probably still want to support MSVC.
>
>
They are compatible. You can use mingw modules in a MSVC-built postgres.
The other way around should work too.

The main reason why is that on Windows you are expected to be very careful
about your C library, always free()ing memory in the same module (DLL/EXE)
you malloc() it in. Same rules with file handles, etc. This is required to
work correctly with modules compiled with a mix of MSVC versions or mix of
debug and release MSVC runtimes. The same principle applies to MinGW.

> > Under windows we can use MinGW64/Msys or LLVM/Clang for MSVC.
>
> I'm guessing that LLVM/Clang port would be useful for something, but I'm
> not clear what.
>
> Are we moving forward with the CMake stuff?  It would be *awesome* to
> get rid of the MSVC build scripts, and perhaps we can move forward on
> some smaller items such as PGDLLEXPORT markings as well.
>
>
Yeah, strongly agree there.

CMake has excellent MSVC support btw.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] Proposal:Use PGDLLEXPORT for libpq

2016-01-27 Thread Andres Freund


On 2016-01-27 21:05:06 +0800, Craig Ringer wrote:
> On 27 January 2016 at 20:54, Alvaro Herrera 
> wrote:

> > > If so many problems with MSVC can discard his support of Postgres?

> I strongly disagree. MSVC is a high quality compiler and the primary tool
> for the platform.

I think it's pretty obvious that we're not going to do that, so let's
drop that specific topic.


> Yes, it's behind on standards support and that's annoying

They're fixing that largely btw.


> -  OTOH MinGW relies on reverse-engineered headers, an old gcc fork, and
> has had some pretty nasty bugs.

I think that's also mostly fixed with mingw64 et al.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Using quicksort for every external sort run

2016-01-27 Thread Peter Geoghegan

On Wed, Dec 23, 2015 at 9:37 AM, Robert Haas  wrote:
>> But yes, let me concede more clearly: the cost model is based on
>> frobbing. But at least it's relatively honest about that, and is
>> relatively simple. I think it might be possible to make it simpler,
>> but I have a feeling that anything we can come up with will basically
>> have the same quality that you so dislike. I don't know how to do
>> better. Frankly, I'd rather be roughly correct than exactly wrong.
>
> Sure, but the fact that the model has huge discontinuities - perhaps
> most notably a case where adding a single tuple to the estimated
> cardinality changes the crossover point by a factor of two - suggests
> that you are probably wrong.  The actual behavior does not change
> sharply when the size of the SortTuple array crosses 1GB, but the
> estimates do.

Here is some fairly interesting analysis of Quicksort vs. Heapsort,
from Bentley, coauthor of our own Quicksort implementation:

https://youtu.be/QvgYAQzg1z8?t=16m15s

(This link picks up at the right point to see the comparison, complete
with an interesting graph).

It probably doesn't tell you much that you didn't already know, at
least at this exact point, but it's nice to see Bentley's graph. This
perhaps gives you some idea of why my "quicksort with spillover" cost
model had a cap of MaxAllocSize of SortTuples, past which we always
needed a very compelling case. That was my rough guess of where the
Heapsort graph takes a sharp upward turn. Before then, Bentley shows
that it's close enough to a straight line.

Correct me if I'm wrong, but I think that the only outstanding issue
with all patches posted here so far is the "quicksort with spillover"
cost model. Hopefully this can be cleared up soon. As I've said, I am
very receptive to other people's suggestions about how that should
work.

-- 
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-01-27 Thread Anastasia Lubennikova




22.01.2016 13:48, Aleksander Alekseev:

Then, this thread became too tangled. I think it's worth to write a
new message with the patch, the test script, some results and brief
overview of how does it really works. It will make following review
much easier.

Sure.

HASHHDR represents a hash table. It could be usual or partitioned.
Partitioned table is stored in a shared memory and accessed by multiple
processes simultaneously. To prevent data corruption hash table is
partitioned and each process has to acquire a lock for a corresponding
partition before accessing data in it. Number of partition is determine
by lower bits of key's hash value. Most tricky part is --- dynahash
knows nothing about these locks, all locking is done on calling side.

Since shared memory is pre-allocated and can't grow to allocate memory
in a hash table we use freeList. Also we use nentries field to count
current number of elements in a hash table. Since hash table is used by
multiple processes these fields are protected by a spinlock. Which as
it turned out could cause lock contention and create a bottleneck.

After trying a few approaches I discovered that using one spinlock per
partition works quite well. Here are last benchmark results:

http://www.postgresql.org/message-id/20151229184851.1bb7d1bd@fujitsu

Note that "no locks" solution cant be used because it doesn't guarantee
that all available memory will be used in some corner cases.

You can find a few more details and a test script in the first message
of this thread. If you have any other questions regarding this patch
please don't hesitate to ask.

InitLocks
/*
 * Compute init/max size to request for lock hashtables.  Note these
 * calculations must agree with LockShmemSize!
 */

This comment certainly requires some changes.
BTW, could you explain why init_table_size was two times less than 
max_table_size?

Although, I don't see any problems with your changes.


-hctl->nentries = 0;
-hctl->freeList = NULL;

Why did you delete these two lines? I wonder if you should rewrite them 
instead?


+ * This particular number of partitions significantly reduces lock 
contention

+ * in partitioned hash tables, almost if partitioned tables didn't use any
+ * locking at all

As far as I understood, this number was obtained experimentally? Maybe 
you should note that in the comment.



And the last, but not least.

+nelem_alloc = ((int) nelem) / partitions_number;
+if (nelem_alloc == 0)
+nelem_alloc = 1;
+
+for (i = 0; i < partitions_number; i++)
+if (!element_alloc(hashp, nelem_alloc, i))
+ereport(ERROR,
+(errcode(ERRCODE_OUT_OF_MEMORY),
+ errmsg("out of memory")));


It looks like you forgot to round the result of the division.
For example, if you have nelem=25 and partitions_number=6.
25 / 6 = 4. And then you allocate 24 nelems, while 1 is lost.

What about productivity, I did a test on my notebook. Patched version 
shows small performance improvement.


pgbench -j 1 -c 1 -f pgbench.sql -T 300 postgres
base patched
127tps  145tps
pgbench -j 8 -c 8 -f pgbench.sql -T 300 postgres
base patched
247tps  270tps

But I haven't any big machine to perform tests, where the patch is 
supposed to make significant changes.

So I would rely on your and the other reviewers results.

Except mentioned notes, I suppose the patch is good enough to pass it to 
a reviewer.


--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

[HACKERS] WAL Re-Writes

2016-01-27 Thread Amit Kapila

As discussed previously about WAL Re-Writes [1], I have
done some investigation in that area which I would like to
share.

Currently we always write WAL in 8KB blocks, which could
lead to a lot of re-write of data for small-transactions. Consider
the case where the amount to be written is usually < 4KB,
we always write it in 8KB chunks which is the major source
of re-writes. I have tried various options to reduce this re-write
of data.

First, I have tried to write the WAL in exact size which the
transaction or otherwise has been requested to XLogWrite(),
patch for this experiment is attached. I have written a small test
patch (calculate_wal_written_by_backend_v1.patch) to calculate the
amount of WAL written by XLogWrite() API and found that the actual
WAL-writes by PostgreSQL have reduced by half with patch on
(pgbench tpc-b workload with 4-clients), but unfortunately this lead
to significant decrease (more than 50%) in TPS. Jan Wieck has then
found out by observing OS stats /stat> that
this patch has reduced the writes, but introduced reads and probable
theory behind the same is that as the patch is not writing in block
boundaries of OS, OS has to read the block to complete this write
operation. Now why OS couldn't find the corresponding block in
memory is that, while closing the WAL file, we use
POSIX_FADV_DONTNEED if wal_level is less than 'archive' which
lead to this problem. So with this experiment, the conclusion is that
though we can avoid re-write of WAL data by doing exact writes, but
it could lead to significant reduction in TPS.

Then, I have tried by writing the WAL in chunks and introduced
a guc wal_write_chunk_size to experiment with different chunk
sizes (patch - write_wal_chunks_v1.patch). I have noticed that
at 4096 bytes chunk size (OS block size), there is approximately 35%
reduction in WAL writes (at different client counts for pgbench
read-write workload) both by using my test patch
calculate_wal_written_by_backend_v1.patch and by observing
OS stats /stat>. Now where I see a good
amount of reduction in WAL writes, but the TPS increase
is between 1~5% for read-write workloads. In some cases at lower
client-count (4), I have seen increase upto 10~15% across multiple
runs, but didn't find a clear trend which can suggest that at lower-
client counts it will always be such a good improvement, OTOH
I have not observed any regression with 4096 bytes WAL chunk size
in my tests till now. One likely theory that we might not see much
improvement at high client count is due to the logic in XLogFlush()
where we combine the WAL writes from multiple clients and the
combined size is greater than 4096 bytes in which case it will write
8K blocks. For all other chunk sizes 512 bytes, 1024 bytes,
2048 bytes, I observed that the smaller the chunk size, better is
reduction in WAL writes, but trend for TPS is just opposite (lesser
the chunk size, worse is TPS) and the probable reason is same as
explained in previous paragraph.

Thoughts?

Note -
1. OS level reduction for WAL writes is done by having WAL
and data on separate disks.
2. I can share the detailed performance data if required, but
I thought it is better to first share the Approach of patch.
3. Patches are more of a Proof-of-concept stage, rather than real
implementation, but I think it won't need too much effort to
improve it, if we find any particular approach as an acceptable
approach.

[1] -
http://www.postgresql.org/message-id/CA+TgmobWdBcbuipWPsbHSbf+-KDmatnYQYZ=akaju6alb5m...@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

avoid_extra_walwrites_v1.patch
Description: Binary data

calculate_wal_written_by_backend_v1.patch
Description: Binary data

write_wal_chunks_v1.patch
Description: Binary data

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-01-27 Thread Aleksander Alekseev

> This comment certainly requires some changes.

Fixed.

> BTW, could you explain why init_table_size was two times less than 
> max_table_size?

I have no clue. My best guess is that it was a reasonable thing to do in
the past. Then somebody changed a code and now there is little reason
to use init_table_size for partitioned tables.

> Why did you delete these two lines? I wonder if you should rewrite
> them instead?

```
MemSet(hctl, 0, sizeof(HASHHDR));
 
-   hctl->nentries = 0;
-   hctl->freeList = NULL;
```

These fields were initialized with zero values twice. It makes little
sense to me. 

> As far as I understood, this number was obtained experimentally?
> Maybe you should note that in the comment.

These numbers are very platform specific and will be outdated very
soon. I recall that my code was criticized for including exact numbers
not a long time ago. So I suggest to keep this part as is.

> For example, if you have nelem=25 and partitions_number=6.
> 25 / 6 = 4. And then you allocate 24 nelems, while 1 is lost.

Agree. Fixed.

> Except mentioned notes, I suppose the patch is good enough

I guess I will mark this patch as "Ready for Committer" then.
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 9673fe0..0c8e4fb 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -495,7 +495,7 @@ pgss_shmem_startup(void)
 	info.hash = pgss_hash_fn;
 	info.match = pgss_match_fn;
 	pgss_hash = ShmemInitHash("pg_stat_statements hash",
-			  pgss_max, pgss_max,
+			  pgss_max,
 			  &info,
 			  HASH_ELEM | HASH_FUNCTION | HASH_COMPARE);
 
diff --git a/src/backend/storage/buffer/buf_table.c b/src/backend/storage/buffer/buf_table.c
index 39e8baf..dd5acb7 100644
--- a/src/backend/storage/buffer/buf_table.c
+++ b/src/backend/storage/buffer/buf_table.c
@@ -62,7 +62,7 @@ InitBufTable(int size)
 	info.num_partitions = NUM_BUFFER_PARTITIONS;
 
 	SharedBufHash = ShmemInitHash("Shared Buffer Lookup Table",
-  size, size,
+  size,
   &info,
   HASH_ELEM | HASH_BLOBS | HASH_PARTITION);
 }
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 81506ea..4c18701 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -237,7 +237,7 @@ InitShmemIndex(void)
 	hash_flags = HASH_ELEM;
 
 	ShmemIndex = ShmemInitHash("ShmemIndex",
-			   SHMEM_INDEX_SIZE, SHMEM_INDEX_SIZE,
+			   SHMEM_INDEX_SIZE,
 			   &info, hash_flags);
 }
 
@@ -255,17 +255,12 @@ InitShmemIndex(void)
  * exceeded substantially (since it's used to compute directory size and
  * the hash table buckets will get overfull).
  *
- * init_size is the number of hashtable entries to preallocate.  For a table
- * whose maximum size is certain, this should be equal to max_size; that
- * ensures that no run-time out-of-shared-memory failures can occur.
- *
  * Note: before Postgres 9.0, this function returned NULL for some failure
  * cases.  Now, it always throws error instead, so callers need not check
  * for NULL.
  */
 HTAB *
 ShmemInitHash(const char *name, /* table string name for shmem index */
-			  long init_size,	/* initial table size */
 			  long max_size,	/* max size of the table */
 			  HASHCTL *infoP,	/* info about key and bucket size */
 			  int hash_flags)	/* info about infoP */
@@ -299,7 +294,7 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
 	/* Pass location of hashtable header to hash_create */
 	infoP->hctl = (HASHHDR *) location;
 
-	return hash_create(name, init_size, infoP, hash_flags);
+	return hash_create(name, max_size, infoP, hash_flags);
 }
 
 /*
diff --git a/src/backend/storage/lmgr/lock.c b/src/backend/storage/lmgr/lock.c
index 9c2e49c..8585a76 100644
--- a/src/backend/storage/lmgr/lock.c
+++ b/src/backend/storage/lmgr/lock.c
@@ -373,18 +373,10 @@ void
 InitLocks(void)
 {
 	HASHCTL		info;
-	long		init_table_size,
-max_table_size;
+	long		max_table_size;
 	bool		found;
 
 	/*
-	 * Compute init/max size to request for lock hashtables.  Note these
-	 * calculations must agree with LockShmemSize!
-	 */
-	max_table_size = NLOCKENTS();
-	init_table_size = max_table_size / 2;
-
-	/*
 	 * Allocate hash table for LOCK structs.  This stores per-locked-object
 	 * information.
 	 */
@@ -392,16 +384,15 @@ InitLocks(void)
 	info.keysize = sizeof(LOCKTAG);
 	info.entrysize = sizeof(LOCK);
 	info.num_partitions = NUM_LOCK_PARTITIONS;
+	max_table_size = NLOCKENTS();
 
 	LockMethodLockHash = ShmemInitHash("LOCK hash",
-	   init_table_size,
 	   max_table_size,
 	   &info,
 	HASH_ELEM | HASH_BLOBS | HASH_PARTITION);
 
 	/* Assume an average of 2 holders per lock */
 	max_table_size *= 2;
-	init_table_size *= 2;
 
 	/*
 	 * Allocate hash table for PROCLOCK structs.  This stores
@@ -413,7 +404,6 @@ InitLocks(void)
 	info.num_partitions = NUM_LOC

Re: [HACKERS] pgbench stats per script & other stuff



Hello again,


Obviously this would work. I did not think the special case was worth the
extra argument. This one has some oddity too, because the second argument is
ignored depending on the third. Do as you feel.


Actually my question was whether keeping the original start_time was the
intended design.


Sorry I misunderstood the question.

The answer is essentially yes, the field is needed for the "aggregated" 
mode where this specific behavior is used.


However, after some look at the code I think that it is possible to do 
without.


I also spotted an small issue under low tps where the last aggregation was 
not shown.


With the attached version these problems have been removed, no conditional 
initialization. There is also a small diff with the version you sent.


--
Fabien.diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index d5f242c..b3fe994 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -166,10 +166,8 @@ int			agg_interval;		/* log aggregates instead of individual
  * transactions */
 int			progress = 0;		/* thread progress report every this seconds */
 bool		progress_timestamp = false; /* progress report with Unix time */
-int			progress_nclients = 0;		/* number of clients for progress
-		 * report */
-int			progress_nthreads = 0;		/* number of threads for progress
-		 * report */
+int			nclients = 1;		/* number of clients */
+int			nthreads = 1;		/* number of threads */
 bool		is_connect;			/* establish connection for each transaction */
 bool		is_latencies;		/* report per-command latencies */
 int			main_pid;			/* main process id used in log filename */
@@ -193,6 +191,35 @@ typedef struct
 #define SHELL_COMMAND_SIZE	256 /* maximum size allowed for shell command */
 
 /*
+ * Simple data structure to keep stats about something.
+ *
+ * XXX probably the first value should be kept and used as an offset for
+ * better numerical stability...
+ */
+typedef struct
+{
+	int64		count;			/* how many values were encountered */
+	double		min;			/* the minimum seen */
+	double		max;			/* the maximum seen */
+	double		sum;			/* sum of values */
+	double		sum2;			/* sum of squared values */
+} SimpleStats;
+
+/*
+ * Data structure to hold various statistics, used for interval statistics as
+ * well as file statistics.
+ */
+typedef struct
+{
+	long		start_time;		/* interval start time, for aggregates */
+	int64		cnt;			/* number of transactions */
+	int64		skipped;		/* number of transactions skipped under --rate
+ * and --latency-limit */
+	SimpleStats latency;
+	SimpleStats lag;
+} StatsData;
+
+/*
  * Connection state
  */
 typedef struct
@@ -213,10 +240,8 @@ typedef struct
 	bool		prepared[MAX_SCRIPTS];	/* whether client prepared the script */
 
 	/* per client collected stats */
-	int			cnt;			/* xacts count */
+	int64		cnt;			/* transaction count */
 	int			ecnt;			/* error count */
-	int64		txn_latencies;	/* cumulated latencies */
-	int64		txn_sqlats;		/* cumulated square latencies */
 } CState;
 
 /*
@@ -228,19 +253,14 @@ typedef struct
 	pthread_t	thread;			/* thread handle */
 	CState	   *state;			/* array of CState */
 	int			nstate;			/* length of state[] */
-	instr_time	start_time;		/* thread start time */
-	instr_time *exec_elapsed;	/* time spent executing cmds (per Command) */
-	int		   *exec_count;		/* number of cmd executions (per Command) */
 	unsigned short random_state[3];		/* separate randomness for each thread */
 	int64		throttle_trigger;		/* previous/next throttling (us) */
 
 	/* per thread collected stats */
+	instr_time	start_time;		/* thread start time */
 	instr_time	conn_time;
-	int64		throttle_lag;	/* total transaction lag behind throttling */
-	int64		throttle_lag_max;		/* max transaction lag */
-	int64		throttle_latency_skipped;		/* lagging transactions
- * skipped */
-	int64		latency_late;	/* late transactions */
+	StatsData	stats;
+	int64		latency_late;	/* executed but late transactions */
 } TState;
 
 #define INVALID_THREAD		((pthread_t) 0)
@@ -272,33 +292,14 @@ typedef struct
 	char	   *argv[MAX_ARGS]; /* command word list */
 	int			cols[MAX_ARGS]; /* corresponding column starting from 1 */
 	PgBenchExpr *expr;			/* parsed expression */
+	SimpleStats stats;			/* time spent in this command */
 } Command;
 
-typedef struct
-{
-
-	long		start_time;		/* when does the interval start */
-	int			cnt;			/* number of transactions */
-	int			skipped;		/* number of transactions skipped under --rate
- * and --latency-limit */
-
-	double		min_latency;	/* min/max latencies */
-	double		max_latency;
-	double		sum_latency;	/* sum(latency), sum(latency^2) - for
- * estimates */
-	double		sum2_latency;
-
-	double		min_lag;
-	double		max_lag;
-	double		sum_lag;		/* sum(lag) */
-	double		sum2_lag;		/* sum(lag*lag) */
-} AggVals;
-
 static struct
 {
 	const char *name;
-	Command	 **commands;
-} sql_script[MAX_SCRIPTS];		/* SQL script files */
+	Command   **commands;
+}	sql_script[MAX_S

Re: [HACKERS] insert/update performance

2016-01-27 Thread Jinhua Luo

>
> But what kind of rows would satisfy heap_page_prune() and what would not?
>
> In my case all updates are doing the same thing (there is no HOT
> updates, obviously), but why some updated rows are reported by
> heap_page_prune() but the others are not? And it's also a random
> issue. That means sometimes heap_page_prune() would report all
> removable rows, and sometimes it reports no rows.
>

I check the codes again.

The heap_page_prune() would skip items if ItemIdIsDead() returns true.

That means some obsoleted items are flagged dead before vacuum, and I
found 3 places:

1) heap_page_prune_opt() --> heap_page_prune() --> ItemIdSetDead()
2) _bt_check_unique() --> ItemIdMarkDead()
3) _bt_killitems() --> ItemIdMarkDead()

In my case, the first one happens most frequently.
And it's interesting that it's invoked from select statement!

 0x80ca000 : heap_page_prune_opt+0x0/0x1a0
 0x80d030d : index_fetch_heap+0x11d/0x140
 0x80d035e : index_getnext+0x2e/0x40
 0x81eec9b : IndexNext+0x3b/0x100
 0x81e4ddf : ExecScan+0x15f/0x290
 0x81eed8d : ExecIndexScan+0x2d/0x50
 0x81ddb20 : ExecProcNode+0x1f0/0x2a0
 0x81dac6c : standard_ExecutorRun+0xfc/0x160
 0x82d0503 : PortalRunSelect+0x183/0x200
 0x82d17da : PortalRun+0x26a/0x3c0
 0x82cf452 : PostgresMain+0x2282/0x2fc0
 0x8097f52 : ServerLoop+0xb1b/0xec2
 0x82793d7 : PostmasterMain+0x1237/0x13c0
 0x8098b6c : main+0x48c/0x4d4
 0xb754fa83 : __libc_start_main+0xf3/0x210
 0x8098bd5 : _start+0x21/0x2c


Regards,
Jinhua Luo


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2016-01-27 Thread Dilip Kumar

On Wed, Jan 27, 2016 at 5:15 PM, Aleksander Alekseev <
a.aleks...@postgrespro.ru> wrote:

> Most likely HASHHDR.mutex is not only bottleneck in your case so my
> patch doesn't help much. Unfortunately I don't have access to any
> POWER8 server so I can't investigate this issue. I suggest to use a
> gettimeofday trick I described in a first message of this thread. Its
> time consuming but it gives a clear understanding which code is keeping
> a lock.
>

I have also tested the pgbench Readonly test when data don't fit into
shared buffer,
Because in this case HASHHDR.mutex access will be quite frequent.
And in this case i do see very good improvement in POWER8 server.

Test Result:

Scale Factor:300
Shared Buffer:512MB
pgbench -c$ -j$ -S -M Prepared postgres

Clientbasepatch
64   222173318318
128195805262290

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Re: [HACKERS] Why format() adds double quote?

2016-01-27 Thread Daniel Verite

Tatsuo Ishii wrote:

> 2) What does the SQL standard say? Do they say that non ASCII white
>   spaces should be treated as ASCII white spaces?

I've used white space in the example, but I'm concerned about
punctuation too.

unicode.org has this helpful paper:
http://www.unicode.org/L2/L2000/00260-sql.pdf
which studies Unicode in SQL-99 identifiers.

The relevant BNF they extracted from the standard looks like this:

identifier body> ::=

   [ {  |  }... ]

 ::=

   | 

 ::=

| 
| 
| 
| 
| 
| 
| 
| 

 ::=

 ::= ...

 ::=

   | 

The current version of quote_ident() plays it safe by implementing
the rule that, as soon it encounters a character outside
of US-ASCII, it surrounds the identifier with double quotes, no matter
to which category or block this character belongs.
So its output is guaranteed to be compatible with the above grammar.

The change in the patch is that multibyte characters just don't imply
quoting.

But according to the points 1 and 2 of the paper, the first character
must have the Unicode alphabetic property, and it must not
have the Unicode combining property.

I'm mostly ignorant in Unicode so I'm not sure of the precise
implications of having such Unicode properties, but still my
understanding is that the new quote_ident() ignores these rules,
so in this sense it could produce outputs that wouldn't be
compatible with SQL-99.

Also, here's what we say in the manual about non quoted identifiers:
http://www.postgresql.org/docs/current/static/sql-syntax-lexical.html

"SQL identifiers and key words must begin with a letter (a-z, but also
letters with diacritical marks and non-Latin letters) or an underscore
(_). Subsequent characters in an identifier or key word can be
letters, underscores, digits (0-9), or dollar signs ($)"

So it explicitly allows letters in general  (and also seems less
strict than SQL-99 about underscore), but it makes no promise about
Unicode punctuation or spaces, for instance, even though in practice
the parser seems to accept them just fine.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] proposal: PL/Pythonu - function ereport

Hi

>
> > I though about it lot of, and I see a  the core of problems in orthogonal
> > constructed exceptions in Python and Postgres. We working with statements
> > elog, ereport, RAISE EXCEPTION - and these statements are much more like
> > templates - can generate any possible exception. Python is based on
> working
> > with instances of predefined exceptions. And it is core of problem.
> > Template like class cannot be ancestor of instance oriented classes. This
> > is task for somebody who knows well OOP and meta OOP in Python - total
>
> I've been following this discussion with great interest, because PL/Java
> also is rather overdue for tackling the same issues: it has some partial
> ability to catch things from PostgreSQL and even examine them in proper
> detail, but very limited ability to throw things as information-rich as
> is possible from C with ereport. And that work is not as far along as
> you are with PL/Python, there is just a preliminary design/discussion
> wiki document at
>   https://github.com/tada/pljava/wiki/Thoughts-on-logging
>
> I was unaware of this project in PL/Pythonu when I began it, then added
> the reference when I saw this discussion.
>
>
I read your article and it is exactly same situation.

It is conflict between procedural (PostgreSQL) and OOP (Python, Java) API.
I see possible solution in design independent class hierarchies - static
(buildin exceptions) and dynamic (custom exceptions). It cannot be mixed,
but there can be some abstract ancestor. Second solution is defensive -
using procedural API for custom exceptions - what i doing in PLPythonu.

Regards

Pavel

Re: [HACKERS] Existence check for suitable index in advance when concurrently refreshing.

2016-01-27 Thread Masahiko Sawada

On Wed, Jan 27, 2016 at 4:42 PM, Fujii Masao  wrote:
> On Tue, Jan 26, 2016 at 9:33 PM, Masahiko Sawada  
> wrote:
>> Hi all,
>>
>> In concurrently refreshing materialized view, we check whether that
>> materialized view has suitable index(unique and not having WHERE
>> condition), after filling data to new snapshot
>> (refresh_matview_datafill()).
>> This logic leads to taking a lot of time until postgres returns ERROR
>> log if that table doesn't has suitable index and table is large. it
>> wastes time.
>> I think we should check whether that materialized view can use
>> concurrently refreshing or not in advance.
>
> +1
>
>> The patch is attached.
>>
>> Please give me feedbacks.

Thank you for having look at this patch.

> +indexRel = index_open(indexoid, RowExclusiveLock);
>
> Can we use AccessShareLock here, instead?

Yeah, I think we can use it. Fixed.

> +if (indexStruct->indisunique &&
> +IndexIsValid(indexStruct) &&
> +RelationGetIndexExpressions(indexRel) == NIL &&
> +RelationGetIndexPredicate(indexRel) == NIL)
> +hasUniqueIndex = true;
> +
> +index_close(indexRel, RowExclusiveLock);
>
> In the case where hasUniqueIndex = true, ISTM that we can get out of
> the loop immediately just after calling index_close(). No?

Fixed.

> +/* Must have at least one unique index */
> +Assert(foundUniqueIndex);
>
> Can we guarantee that there is at least one valid unique index here?
> If yes, it's better to write the comment about that.
>

Added.

Attached latest patch. Please review it.

Regards,

--
Masahiko Sawada


matview_concurrently_refresh_check_index_v2.patch
Description: binary/octet-stream

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Why format() adds double quote?

2016-01-27 Thread Daniel Verite

Tatsuo Ishii wrote:

> What is the "visual hint"? If you are talking about psql's output, it 
> never adds "visual hint" (double quotations). 
> 
> If you are talking about the string handling in a program, what kind 
> of program cares about "visiual"? 

Sure. The scenario I'm thinking about would be someone inspecting
generated queries, say for controlling or troubleshooting,
the queries containing identifiers injected with the help of
quote_ident/format. That could be from the logs, or
monitoring or audit tools.

If identifiers contain weird Unicode characters, currently 
that's relatively easy to spot because they get surrounded by
double quotes.

If I see something like this: UPDATE "test․table" SET...
I immediately think that there's something fishy. It looks like test
should be a schema, but the surrounding quotes say otherwise.
In any case, it's clear that it updates a table in the current schema.

But if I see that: UPDATE test․table SET...
is seems legit and seems to update "table" inside the "test" schema
even though that's not what it does, it actually updates the
"test․table" table in the current schema, because the dot between
test and table is not the US-ASCII U+002E,  it's U+2024,
'ONE DOT LEADER'
On my display, they are almost indiscernible.

This boils down to the fact that the current quote_ident gives:

=# select quote_ident('test․table');
 quote_ident  
--
 "test․table"

whereas the quote_ident patched as proposed gives:

=# select quote_ident('test․table');
 quote_ident 
-
 test․table

So this is what I don't feel good about.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Implementing a new Scripting Language


Hi all,

We have an open source scripting engine named Lucee that is used 
primarily for web application -- https://github.com/lucee/Lucee -- it is 
written in Java and is usually run as a servlet, but can be accessed in 
other ways (like JSR-223).


You can think of the language as a combination of PHP and Javascript, 
albeit much simpler and more intuitive IMHO.


I was wondering how difficult it would be to implement a Postgres 
extension that will act as a wrapper around it and will allow to write 
functions in that language?


I am a Java developer myself and have a very limited knowledge of C/C++, 
but if someone can point me in the right direction of whatever 
documentation there is on the subject -- that would be great.


Thanks,

--

Igal Sapir
Lucee Core Developer
Lucee.org

Re: [HACKERS] Tsvector editing functions

2016-01-27 Thread Stas Kelvich

Hi

> On 22 Jan 2016, at 19:03, Tomas Vondra  wrote:
> OK, although I do recommend using more sensible variable names, i.e. why how 
> to use 'lexemes' instead of 'lexarr' for example? Similarly for the other 
> functions.


Changed. With old names I tried to follow conventions in surrounding code, but 
probably that is a good idea to switch to more meaningful names in new code.

>> 
>> 
>> delete(tsin tsvector, tsv_filter tsvector) — Delete lexemes and/or positions 
>> of tsv_filter from tsin. When lexeme in tsv_filter has no positions function 
>> will delete any occurrence of same lexeme in tsin. When tsv_filter lexeme 
>> have positions function will delete them from positions of matching lexeme 
>> in tsin. If after such removal resulting positions set is empty then 
>> function will delete that lexeme from resulting tsvector.
>> 
> 
> I can't really imagine situation in which I'd need this, but if you do have a 
> use case for it ... although in the initial paragraph you say "... but if 
> somebody wants to delete for example ..." which suggests you may not have 
> such use case.
> 
> Based on bad experience with extending API based on vague ideas, I recommend 
> only really adding functions with existing need. It's easy to add a function 
> later, much more difficult to remove it or change the signature.

I tried to create more or less self-contained api, e.g. have ability to negate 
effect of concatenation. But i’ve also asked people around what they think 
about extending API and everybody convinced that it is better to stick to 
smaller API. So let’s drop it. At least that functions exists in mail list in 
case if somebody will google for such kind of behaviour.

>> 
>> Also if we want some level of completeness of API and taking into account 
>> that concat() function shift positions on second argument I thought that it 
>> can be useful to also add function that can shift all positions of specific 
>> value. This helps to undo concatenation: delete one of concatenating 
>> tsvectors and then shift positions in resulting tsvector. So I also wrote 
>> one another small function:
>> 
>> shift(tsin tsvector,offset int16) — Shift all positions in tsin by given 
>> offset
> 
> That seems rather too low-level. Shouldn't it be really built into delete() 
> directly somehow?


I think it is ambiguous task on delete. But if we are dropping support of 
delete(tsvector, tsvector) I don’t see points in keeping that functions.

>>> 
>>> 7) Some of the functions use intexterm that does not match the function
>>>   name. I see two such cases - to_tsvector and setweight. Is there a
>>>   reason for that?
>>> 
>> 
>> Because sgml compiler wants unique indexterm. Both functions that
>> youmentioned use overloading of arguments and have non-unique name.
> 
> As Michael pointed out, that should probably be handled by using  
> and  tags.


Done.


> On 19 Jan 2016, at 00:21, Alvaro Herrera  wrote:
> 
> 
> It's a bit funny that you reintroduce the "unrecognized weight: %d"
> (instead of %c) in tsvector_setweight_by_filter.
> 


Ah, I was thinking about moving it to separate diff and messed. Fixed and 
attaching diff with same fix for old tsvector_setweight.




tsvector_ops-v2.1.diff
Description: Binary data


tsvector_ops-v2.2.diff
Description: Binary data



---
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implementing a new Scripting Language

2016-01-27 Thread Vladimir Sitnikov

Why do you want that at the database level?
Do you have end-to-end scenario that benefits from using Lucee?

>I was wondering how difficult it would be to implement a Postgres extension 
>that will act as a wrapper around it and will allow to write functions in that 
>language?

Have you checked PL/Java?

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implementing a new Scripting Language


On 1/27/2016 8:40 AM, Vladimir Sitnikov wrote:

Why do you want that at the database level?
Do you have end-to-end scenario that benefits from using Lucee?
Lucee is very intuitive and powerful, so it's more for ease of use than 
anything, and to attract more Lucee users to use PostgreSQL (most of 
them use SQL Server or MySQL).


If the pl/v8 was easily ported to Windows then I probably wouldn't even 
try to add Lucee, but it seems to be quite difficult to compile the 
latest versions for Windows and it looks like the project is not 
maintained (for example, it uses an older version of V8 with no known 
intention to upgrade).


If I had more experience with C++ then I'd probably try to update V8, 
but so far my attempts have not been very fruitful.



I was wondering how difficult it would be to implement a Postgres extension 
that will act as a wrapper around it and will allow to write functions in that 
language?

Have you checked PL/Java?
That seems like a good place to start, thanks.  Are there also any docs 
about the subject?



Igal


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Why format() adds double quote?

2016-01-27 Thread Tom Lane

"Daniel Verite"  writes:
> This boils down to the fact that the current quote_ident gives:

> =# select quote_ident('testâ¤table');
>  quote_ident  
> --
>  "testâ¤table"

> whereas the quote_ident patched as proposed gives:

> =# select quote_ident('testâ¤table');
>  quote_ident 
> -
>  testâ¤table

> So this is what I don't feel good about.

This patch was originally proposed as a simple, cost-free change,
but it's becoming obvious that it is no such thing.  I think
we should probably reject it and move on.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Log operating system user connecting via unix socket

2016-01-27 Thread José Arthur Benetasso Villanova

Hi again.

About the privileges, our support can create roles / databases, drop
existing databases, dump /restore, change other users passwords. It's not
feasible right now create a 1:1 map of system users and postgres users.
Maybe in the future.

I wrote 2 possible patches, both issuing a detail message only if
log_connections is enabled.

The first one using the Stephen Frost suggestion, inside the Port struct (I
guess that this is the one, I coudn't find the Peer struct)

The second one following the same approach of cf commit 5e0b5dcab, as
pointed by Tom Lane.

Again, feel free to comment and criticize.

On Sun, Jan 17, 2016 at 3:07 PM, Stephen Frost  wrote:

> Tom,
>
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Stephen Frost  writes:
> > > What I think we really want here is logging of the general 'system
> > > user' for all auth methods instead of only for the 'peer' method.
> >
> > Well, we don't really know that except in a small subset of auth
> > methods.  I agree that when we do know it, it's useful info to log.
>
> Right.
>
> > My big beef with the proposed patch is that the log message is emitted
> > unconditionally.  There are lots and lots of users who feel that during
> > normal operation, *zero* log messages should get emitted.  Those
> villagers
> > would be on our doorsteps with pitchforks if we shipped this patch as-is.
>
> Agreed.
>
> > I would propose that this information should be emitted only when
> > log_connections is enabled, and indeed that it should be part of the
> > log_connections message not a separate message.  So this leads to
> > thinking that somehow, the code for individual auth methods should
> > be able to return an "additional info" field for inclusion in
> > log_connections.  We already have such a concept for auth failures,
> > cf commit 5e0b5dcab.
>
> Apologies if it wasn't clear, but that's exactly what I was suggesting
> by saying to add it to PerformAuthentication, which is where we emit
> the connection info when log_connections is enabled.
>
> > > ... and also make it available in pg_stat_activity.
> >
> > That's moving the goalposts quite a bit, and I'm not sure it's necessary
> > or even desirable.  Let's just get this added to log_connections output,
> > and then see if there's field demand for more.
>
> This was in context of peer_cn, which is just a specific "system user"
> value and which we're already showing in pg_stat_* info tables.  I'd
> love to have the Kerberos principal available, but I don't think it'd
> make sense to have a 'pg_stat_kerberos' just for that.
>
> I agree that it's moving the goalposts for this patch and could be an
> independent patch, but I don't see it as any different, from a
> desirability and requirements perspective, than what we're doing for SSL
> connections.
>
> Thanks!
>
> Stephen
>



-- 
José Arthur Benetasso Villanova
commit 76594784c50bca1b09f687e58f17ff27230076be
Author: Jose Arthur Benetasso Villanova 
Date:   Tue Jan 19 11:50:22 2016 -0200

Log message

diff --git a/src/backend/libpq/auth.c b/src/backend/libpq/auth.c
index 57c2f48..ac1c785 100644
--- a/src/backend/libpq/auth.c
+++ b/src/backend/libpq/auth.c
@@ -991,6 +991,7 @@ pg_GSS_recvauth(Port *port)
 		return STATUS_ERROR;
 	}
 
+	port->system_user = pstrdup(gbuf.value);
 	ret = check_usermap(port->hba->usermap, port->user_name, gbuf.value,
 		pg_krb_caseins_users);
 
@@ -1291,6 +1292,7 @@ pg_SSPI_recvauth(Port *port)
 		int			retval;
 
 		namebuf = psprintf("%s@%s", accountname, domainname);
+		port->system_user = pstrdup(namebuf);
 		retval = check_usermap(port->hba->usermap, port->user_name, namebuf, true);
 		pfree(namebuf);
 		return retval;
@@ -1561,8 +1563,11 @@ ident_inet_done:
 		pg_freeaddrinfo_all(local_addr.addr.ss_family, la);
 
 	if (ident_return)
+	{
 		/* Success! Check the usermap */
+		port->system_user = pstrdup(ident_user);
 		return check_usermap(port->hba->usermap, port->user_name, ident_user, false);
+	}
 	return STATUS_ERROR;
 }
 
@@ -1609,6 +1614,8 @@ auth_peer(hbaPort *port)
 	}
 
 	strlcpy(ident_user, pw->pw_name, IDENT_USERNAME_MAX + 1);
+	port->system_user = pstrdup(ident_user);
+
 
 	return check_usermap(port->hba->usermap, port->user_name, ident_user, false);
 }
@@ -2124,6 +2131,7 @@ CheckLDAPAuth(Port *port)
 		return STATUS_ERROR;
 	}
 
+	port->system_user = pstrdup(fulluser);
 	pfree(fulluser);
 
 	return STATUS_OK;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index e22d4db..f425808 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -255,7 +255,8 @@ PerformAuthentication(Port *port)
 #endif
 ereport(LOG,
 		(errmsg("replication connection authorized: user=%s",
-port->user_name)));
+port->user_name),
+		port->system_user ? errdetail_log("system_user=%s", port->system_user) : 0));
 		}
 		else
 		{
@@ -269,7 +270,8 @@ PerformAuthentication(Port *port)
 #endif
 ereport(LOG,
 		(errmsg("connection authorize

Re: [HACKERS] Implementing a new Scripting Language

2016-01-27 Thread Vladimir Sitnikov

> If the pl/v8 was easily ported to Windows then I probably wouldn't even try 
> to add Lucee,

That is a good question. ChakraCore has been open sourced recently. It
might be easier to build under Windows.

>That seems like a good place to start, thanks
I am not sure you would be able to bind high performance java runtime
with the backend. There are no that many JREs, and not much of them
are good at "in-backend" operation.

Thus your question boils down to 2 possibilities:
1) You execute Lucee in some JRE that runs in the backend (frankly
speaking, I doubt it is a good way to go)
2) You implement Lucee parser/executor/compiler in C and use it as
typical PostgreSQL extension

Vladimir


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] plpgsql - DECLARE - cannot to use %TYPE or %ROWTYPE for composite types

Hi

2016-01-18 21:37 GMT+01:00 Alvaro Herrera :

> > diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
> > new file mode 100644
> > index 1ae4bb7..c819517
> > *** a/src/pl/plpgsql/src/pl_comp.c
> > --- b/src/pl/plpgsql/src/pl_comp.c
> > *** plpgsql_parse_tripword(char *word1, char
> > *** 1617,1622 
> > --- 1617,1677 
> >   return false;
> >   }
> >
> > + /*
> > +  * Derive type from ny base type controlled by reftype_mode
> > +  */
> > + static PLpgSQL_type *
> > + derive_type(PLpgSQL_type *base_type, int reftype_mode)
> > + {
> > + Oid typoid;
>
> I think you should add a typedef to the REFTYPE enum, and have this
> function take that type rather than int.
>

done


>
> > + case PLPGSQL_REFTYPE_ARRAY:
> > + {
> > + /*
> > +  * Question: can we allow anyelement (array or
> nonarray) -> array direction.
> > +  * if yes, then probably we have to modify
> enforce_generic_type_consistency,
> > +  * parse_coerce.c where still is check on scalar
> type -> raise error
> > +  * ERROR:  42704: could not find array type for
> data type integer[]
> > +  *
> > + if
> (OidIsValid(get_element_type(base_type->typoid)))
> > + return base_type;
> > + */
>
> I think it would be better to resolve this question outside a code
> comment.
>

done


>
> > + typoid = get_array_type(base_type->typoid);
> > + if (!OidIsValid(typoid))
> > + ereport(ERROR,
> > +
>  (errcode(ERRCODE_DATATYPE_MISMATCH),
> > +  errmsg("there are not
> array type for type %s",
> > +
>  format_type_be(base_type->typoid;
>
> nodeFuncs.c uses this wording:
> errmsg("could not find array type for data type %s",
> which I think you should adopt.
>

sure, fixed


>
> > --- 1681,1687 
> >* --
> >*/
> >   PLpgSQL_type *
> > ! plpgsql_parse_wordtype(char *ident, int reftype_mode)
> >   {
> >   PLpgSQL_type *dtype;
> >   PLpgSQL_nsitem *nse;
>
> Use the typedef'ed enum, as above.
>
> > --- 1699,1721 
> >   switch (nse->itemtype)
> >   {
> >   case PLPGSQL_NSTYPE_VAR:
> > ! {
> > ! dtype = ((PLpgSQL_var *)
> (plpgsql_Datums[nse->itemno]))->datatype;
> > ! return derive_type(dtype, reftype_mode);
> > ! }
> >
> > ! case PLPGSQL_NSTYPE_ROW:
> > ! {
> > ! dtype = ((PLpgSQL_row *)
> (plpgsql_Datums[nse->itemno]))->datatype;
> > ! return derive_type(dtype, reftype_mode);
> > ! }
> >
> > + /*
> > +  * XXX perhaps allow REC here? Probably it has not
> any sense, because
> > +  * in this moment, because PLpgSQL doesn't support
> rec parameters, so
> > +  * there should not be any rec polymorphic
> parameter, and any work can
> > +  * be done inside function.
> > +  */
>
> I think you should remove from the "?" onwards in that comment, i.e.
> just keep what was already in the original comment (minus the ROW)
>

I tried to fix it, not sure if understood well.


>
> > --- 1757,1763 
> >* --
> >*/
> >   PLpgSQL_type *
> > ! plpgsql_parse_cwordtype(List *idents, int reftype_mode)
> >   {
> >   PLpgSQL_type *dtype = NULL;
> >   PLpgSQL_nsitem *nse;
>
> Typedef.
>
> > --- 2720,2737 
> >   tok = yylex();
> >   if (tok_is_keyword(tok, &yylval,
> >  K_TYPE, "type"))
> > ! result = plpgsql_parse_wordtype(dtname,
> PLPGSQL_REFTYPE_TYPE);
> > ! else if (tok_is_keyword(tok, &yylval,
> > !
>  K_ELEMENTTYPE, "elementtype"))
> > ! result = plpgsql_parse_wordtype(dtname,
> PLPGSQL_REFTYPE_ELEMENT);
> > ! else if (tok_is_keyword(tok, &yylval,
> > !
>  K_ARRAYTYPE, "arraytype"))
> > ! result = plpgsql_parse_wordtype(dtname,
> PLPGSQL_REFTYPE_ARRAY);
> >   else if (tok_is_keyword(tok, &yylval,
> >
>  K_ROWTYPE, "rowtype"))
> >   result = plpgsql_parse_wordrowtype(dtname);
> > ! if (result)
> > ! return result;
> >   }
>
> This plpgsql parser stuff is pretty tiresome.  (Not this patch's fault
> -- just saying.)
>
>
> > *** extern bool plpgsql_parse_dblword(char *
> > *** 961,968 
> > PLwdatum *wdatum, PLcwo

Re: [HACKERS] Implementing a new Scripting Language

2016-01-27 Thread Chapman Flack

On 01/27/2016 11:46 AM, Igal @ Lucee.org wrote:

>> Have you checked PL/Java?
> That seems like a good place to start, thanks.  Are there also any docs
> about the subject?

I just did a quick search on Lucee and what I found suggests that
it compiles to JVM bytecode and runs on a JVM. If that is the
case, and it can compile methods that will have the sort of
method signatures PL/Java expects, and you can put the .class
files in a jar and load it, your job should be just about done. :)

Or, you might end up writing thin wrappers in Java, probably
nothing more.

Another possibility: Java has pluggable script engine support
(java specification request 233, see the javax.script package).
Does Lucee have an existing JSR 233 engine implementation?

PL/Java does not _currently_ have JSR233 support, but it is
definitely being thought about ... the idea being, put a Lucee
JSR233 engine jar on the classpath, define it as a new PostgreSQL
language (PL/Java will handle the interfacing), and then actually
write stuff like

DO $$echo("Hello World");$$ LANGUAGE lucee;

As I said, in current PL/Java version, JSR 233 is still
science fiction ... but it's "hard science fiction", not the
fantasy stuff.

http://tada.github.io/pljava/

-Chap

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implementing a new Scripting Language


On 1/27/2016 9:58 AM, jflack wrote:

I just did a quick search on Lucee and what I found suggests that
it compiles to JVM bytecode and runs on a JVM. If that is the
case, and it can compile methods that will have the sort of
method signatures PL/Java expects, and you can put the .class
files in a jar and load it, your job should be just about done. :)

yes, Lucee uses ASM4 to construct class files which are mostly POJOs

Or, you might end up writing thin wrappers in Java, probably
nothing more.

Another possibility: Java has pluggable script engine support
(java specification request 233, see the javax.script package).
Does Lucee have an existing JSR 233 engine implementation?
the next version of Lucee (currently in Beta) does support JSR-223, 
which I actually mentioned as a viable solution in my first email in 
this thread.  That would be awesome if PL/Java would support JSR-223.


thanks,


Igal


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implementing a new Scripting Language


On 1/27/2016 9:57 AM, Vladimir Sitnikov wrote:

That is a good question. ChakraCore has been open sourced recently. It
might be easier to build under Windows.
interesting.  but now we will need to write an extension for that, e.g. 
PL/Chakra, which brings back my original question:

are there any docs as to how to implement a new scripting language? ;)

I am not sure you would be able to bind high performance java runtime
with the backend. There are no that many JREs, and not much of them
are good at "in-backend" operation.

Thus your question boils down to 2 possibilities:
1) You execute Lucee in some JRE that runs in the backend (frankly
speaking, I doubt it is a good way to go)
yes, that's what I had in mind.  I wasn't thinking of an embedded JRE.  
TBH I didn't think there were any of those until your email.


thanks,


Igal


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Log operating system user connecting via unix socket

2016-01-27 Thread Stephen Frost

José,

* José Arthur Benetasso Villanova (jose.art...@gmail.com) wrote:
> I wrote 2 possible patches, both issuing a detail message only if
> log_connections is enabled.
> 
> The first one using the Stephen Frost suggestion, inside the Port struct (I
> guess that this is the one, I coudn't find the Peer struct)

Ah, yes, apologies for the typo.

> The second one following the same approach of cf commit 5e0b5dcab, as
> pointed by Tom Lane.

This really isn't quite following along in the approach used by
5e0b5dcab, from my viewing of it.  I believe Tom was suggesting that an
essentially opaque value be returned to be included rather than what
you've done which codifies it as 'system_user'.

I'm not a fan of that approach though, as the mapping system we have in
pg_ident is generalized and this should be implemented generally by all
authentication methods which support mappings.

That's also why I was suggesting to get rid of peer_cn in the Port
structure in favor of having the 'system_user' or similar variable and
using it in all of these cases where we provide mapping support- then
all of the auth methods which support mappings would set that value,
including the existing SSL code.  You might need to check and see if
there's anything which depends on peer_cn being NULL for non-SSL
connections and adjust that logic, but hopefully that's not what we're
relying on.  I don't see anything like that on a quick glance through
the peer_cn usage.

Also, it looks like you missed one of the exit cases from
pg_SSPI_recvauth(), no?  You may need to refactor that code a bit to
provide easy access to what the system username used is, or simply make
sure to set the port->system_user value in both paths.

Lastly, are you sure that you have the right memory context for the
pstrdup() calls that you're making?  be-secure-openssl.c goes to some
effort to ensure that the memory allocated for peer_cn is in the
TopMemoryContext, but I don't see anything like that in the code
proposed, which makes me worried you didn't consider which memory
context you were allocating in.

Thanks!

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Implementing a new Scripting Language

2016-01-27 19:21 GMT+01:00 Igal @ Lucee.org :

> On 1/27/2016 9:57 AM, Vladimir Sitnikov wrote:
>
>> That is a good question. ChakraCore has been open sourced recently. It
>> might be easier to build under Windows.
>>
> interesting.  but now we will need to write an extension for that, e.g.
> PL/Chakra, which brings back my original question:
> are there any docs as to how to implement a new scripting language? ;)
>

David Fetter wrote some presentation - some years ago was popular to write
own PL

me too

 https://wiki.postgresql.org/images/a/a2/Plpgsql_internals.pdf

source code of plpgsql is good example - it pretty simple

you have to write handler

https://github.com/petere/plsh



I am not sure you would be able to bind high performance java runtime
>> with the backend. There are no that many JREs, and not much of them
>> are good at "in-backend" operation.
>>
>> Thus your question boils down to 2 possibilities:
>> 1) You execute Lucee in some JRE that runs in the backend (frankly
>> speaking, I doubt it is a good way to go)
>>
> yes, that's what I had in mind.  I wasn't thinking of an embedded JRE.
> TBH I didn't think there were any of those until your email.
>
> thanks,
>
>
> Igal
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

Re: [HACKERS] Implementing a new Scripting Language

2016-01-27 19:37 GMT+01:00 Pavel Stehule :

>
>
> 2016-01-27 19:21 GMT+01:00 Igal @ Lucee.org :
>
>> On 1/27/2016 9:57 AM, Vladimir Sitnikov wrote:
>>
>>> That is a good question. ChakraCore has been open sourced recently. It
>>> might be easier to build under Windows.
>>>
>> interesting.  but now we will need to write an extension for that, e.g.
>> PL/Chakra, which brings back my original question:
>> are there any docs as to how to implement a new scripting language? ;)
>>
>
> David Fetter wrote some presentation - some years ago was popular to write
> own PL
>
> me too
>
>  https://wiki.postgresql.org/images/a/a2/Plpgsql_internals.pdf
>
> source code of plpgsql is good example - it pretty simple
>
> you have to write handler
>
> https://github.com/petere/plsh
>

same author

 https://github.com/petere/plhaskell
https://github.com/petere/plxslt

>
>
>
>
> I am not sure you would be able to bind high performance java runtime
>>> with the backend. There are no that many JREs, and not much of them
>>> are good at "in-backend" operation.
>>>
>>> Thus your question boils down to 2 possibilities:
>>> 1) You execute Lucee in some JRE that runs in the backend (frankly
>>> speaking, I doubt it is a good way to go)
>>>
>> yes, that's what I had in mind.  I wasn't thinking of an embedded JRE.
>> TBH I didn't think there were any of those until your email.
>>
>> thanks,
>>
>>
>> Igal
>>
>>
>>
>> --
>> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
>> To make changes to your subscription:
>> http://www.postgresql.org/mailpref/pgsql-hackers
>>
>
>

Re: [HACKERS] Implementing a new Scripting Language


On 1/27/2016 10:41 AM, Pavel Stehule wrote:


2016-01-27 19:37 GMT+01:00 Pavel Stehule >:



David Fetter wrote some presentation - some years ago was popular
to write own PL

me too

https://wiki.postgresql.org/images/a/a2/Plpgsql_internals.pdf

source code of plpgsql is good example - it pretty simple

you have to write handler

https://github.com/petere/plsh


same author

https://github.com/petere/plhaskell
https://github.com/petere/plxslt


Thanks!  I'll take a look.


Igal

Re: [HACKERS] Implementing a new Scripting Language

2016-01-27 Thread Chapman Flack

On 01/27/2016 01:17 PM, Igal @ Lucee.org wrote:

> the next version of Lucee (currently in Beta) does support JSR-223,
> which I actually mentioned as a viable solution in my first email in

Sorry, I jumped in late.

> this thread.  That would be awesome if PL/Java would support JSR-223.

Ok, if your 233 support is already in beta, you'll get there
before we do, but the paths should intersect eventually. :)

-Chap


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Implementing a new Scripting Language


On 1/27/2016 10:48 AM, Chapman Flack wrote:


Ok, if your 233 support is already in beta, you'll get there
before we do, but the paths should intersect eventually. :)


Actually, once your support for JSR-223 is implemented (it's 
two-twenty-three, not thirty ;)), we will be able to use Javascript 
through that via the Nashorn project, which seems like a better (at 
least as a cross-platform) solution than V8 or Chakra.

http://www.oracle.com/technetwork/articles/java/jf14-nashorn-2126515.html

Now that would be very exciting IMO!


Igal




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] pgbench small bug fix