Re: [HACKERS] plpgsql.print_strict_params

2013-09-16 Thread Ian Lawrence Barwick
2013/9/15 Marko Tiikkaja ma...@joh.to:

 Attached an updated patch to fix the tabs and to change this to a
 compile-time option.  Now it's not possible to flexibly disable the feature
 during runtime, but I think that's fine.

I'm taking a look at this patch as part of the current commitfest [*],
(It's the first time I've formally reviewed a patch for a commitfest
so please let me know if I'm missing something.)

[*] https://commitfest.postgresql.org/action/patch_view?id=1221

Submission review
-----------------
* Is the patch in a patch format which has context?
Yes.

* Does it apply cleanly to the current git master?
Yes. No new files are introduced by this patch.

* Does it include reasonable tests, necessary doc patches, etc?
Yes.

However the sample function provided in the documentation throws a
runtime error due to a missing FROM-clause entry.
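For reference, here is a minimal sketch of the kind of function the option
is meant for (names are illustrative and assume a table users(userid, name);
the actual documentation example differs):

CREATE FUNCTION get_userid(username text) RETURNS int AS $$
#print_strict_params on
DECLARE
    v_userid int;
BEGIN
    -- STRICT errors out unless exactly one row is returned; with the
    -- option enabled, the parameter values are shown in the DETAIL line.
    SELECT userid INTO STRICT v_userid
      FROM users
     WHERE name = username;
    RETURN v_userid;
END
$$ LANGUAGE plpgsql;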


Usability review
----------------
* Does the patch actually implement that?
Yes.

* Do we want that?
It seems like it would be useful; no opposition has been raised on
-hackers so far.

* Do we already have it?
No.

* Does it follow SQL spec, or the community-agreed behavior?
SQL spec: n/a. I do note that it makes use of the # syntax
before the body of a PL/pgSQL function, which is currently only
used for #variable_conflict [*]. I can imagine this syntax might
be used for other purposes down the road, so it might be worth
keeping an eye on it before it becomes a hodge-podge of ad-hoc
options.

[*] http://www.postgresql.org/docs/current/static/plpgsql-implementation.html

* Does it include pg_dump support (if applicable)?
n/a

* Are there dangers?
I don't see any.

* Have all the bases been covered?

This line is in exec_get_query_params() and exec_get_dynquery_params():

return "(no parameters)";

Presumably the message will escape translation, and this line would be better
written as:
   return _("(no parameters)");

Also, if the offending query parameter contains a single quote, the output
is not escaped:

postgres=# select get_userid(E'foo''');
ERROR:  query returned more than one row
DETAIL:  p1 = 'foo''
CONTEXT:  PL/pgSQL function get_userid(text) line 9 at SQL statement

Not sure if that's a real problem though.


Feature test
------------

* Does the feature work as advertised?
Yes.

* Are there corner cases the author has failed to consider?
I can't see any.

* Are there any assertion failures or crashes?
No.


Performance review
------------------

* Does the patch slow down simple tests?
No.

* If it claims to improve performance, does it?
n/a

* Does it slow down other things?
No.


Coding review
-------------

* Does it follow the project coding guidelines?
Yes.

The functions added in pl_exec.c - exec_get_query_params() and
exec_get_dynquery_params() do strike me as potentially misnamed,
as they don't actually execute anything but are more utility
functions for generating formatted output.

Maybe format_query_params() etc. would be better?


* Are there portability issues?
I don't think so.

* Will it work on Windows/BSD etc?
Tested on OS X and Linux. I don't see anything, e.g. system calls,
which might stop it working on Windows.

* Are the comments sufficient and accurate?
exec_get_query_params() and exec_get_dynquery_params()
could do with some brief comments explaining their purpose.


* Does it do what it says, correctly?
As far as I can tell, yes.

* Does it produce compiler warnings?
No.

* Can you make it crash?
So far, no.


Architecture review
-------------------

* Is everything done in a way that fits together coherently with other
features/modules?
Patch affects PL/pgSQL only.

* Are there interdependencies that can cause problems?
No.


Regards

Ian Barwick


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \h open

2013-09-16 Thread Heikki Linnakangas

On 16.09.2013 07:43, Oleg Bartunov wrote:

I noticed there is nothing available in psql's built-in help about the OPEN
command. Is that intentional?


That's because there is no OPEN command. PL/pgSQL does have an OPEN 
statement, but that's just PL/pgSQL. And ecpg too, but again that's 
handled by ecpg, not the backend.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command

2013-09-16 Thread Satoshi Nagayasu

(2013/08/08 20:52), Vik Fearing wrote:

As part of routine maintenance monitoring, it is interesting for us to
have statistics on the CLUSTER command (timestamp of last run, and
number of runs since stat reset) like we have for (auto)ANALYZE and
(auto)VACUUM.  Patch against today's HEAD attached.

I would add this to the next commitfest but I seem to be unable to log
in with my community account (I can log in to the wiki).  Help appreciated.


I have reviewed the patch.

It builds successfully against the latest HEAD and passes the regression
tests.

Looks good enough, but I'd like to add a test case here, checking not
only the view definition but also that the counters work correctly.

Please take a look at the attached one.

Regards,
--
Satoshi Nagayasu sn...@uptime.jp
Uptime Technologies, LLC. http://www.uptime.jp
diff --git a/src/test/regress/expected/stats.out b/src/test/regress/expected/stats.out
index 56bace1..ba614b2 100644
--- a/src/test/regress/expected/stats.out
+++ b/src/test/regress/expected/stats.out
@@ -28,7 +28,13 @@ SELECT pg_sleep(2.0);
 CREATE TEMP TABLE prevstats AS
 SELECT t.seq_scan, t.seq_tup_read, t.idx_scan, t.idx_tup_fetch,
(b.heap_blks_read + b.heap_blks_hit) AS heap_blks,
-   (b.idx_blks_read + b.idx_blks_hit) AS idx_blks
+   (b.idx_blks_read + b.idx_blks_hit) AS idx_blks,
+   coalesce(t.last_vacuum, now()) AS last_vacuum,
+   coalesce(t.last_analyze, now()) AS last_analyze,
+   coalesce(t.last_cluster, now()) AS last_cluster,
+   t.vacuum_count,
+   t.analyze_count,
+   t.cluster_count
   FROM pg_catalog.pg_stat_user_tables AS t,
pg_catalog.pg_statio_user_tables AS b
  WHERE t.relname='tenk2' AND b.relname='tenk2';
@@ -111,4 +117,27 @@ SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,
  t| t
 (1 row)
 
+-- table maintenance stats
+ANALYZE tenk2;
+VACUUM tenk2;
+CLUSTER tenk2 USING tenk2_unique1;
+SELECT pg_sleep(1.0);
+ pg_sleep 
+----------
+ 
+(1 row)
+
+SELECT st.last_vacuum > pr.last_vacuum,
+   st.last_analyze > pr.last_analyze,
+   st.last_cluster > pr.last_cluster,
+   st.vacuum_count > pr.vacuum_count,
+   st.analyze_count > pr.analyze_count,
+   st.cluster_count > pr.cluster_count
+  FROM pg_stat_user_tables AS st, prevstats pr
+ WHERE st.relname='tenk2';
+ ?column? | ?column? | ?column? | ?column? | ?column? | ?column? 
+----------+----------+----------+----------+----------+----------
+ t        | t        | t        | t        | t        | t
+(1 row)
+
 -- End of Stats Test
diff --git a/src/test/regress/sql/stats.sql b/src/test/regress/sql/stats.sql
index bb349b2..71e5e27 100644
--- a/src/test/regress/sql/stats.sql
+++ b/src/test/regress/sql/stats.sql
@@ -22,7 +22,13 @@ SELECT pg_sleep(2.0);
 CREATE TEMP TABLE prevstats AS
 SELECT t.seq_scan, t.seq_tup_read, t.idx_scan, t.idx_tup_fetch,
(b.heap_blks_read + b.heap_blks_hit) AS heap_blks,
-   (b.idx_blks_read + b.idx_blks_hit) AS idx_blks
+   (b.idx_blks_read + b.idx_blks_hit) AS idx_blks,
+   coalesce(t.last_vacuum, now()) AS last_vacuum,
+   coalesce(t.last_analyze, now()) AS last_analyze,
+   coalesce(t.last_cluster, now()) AS last_cluster,
+   t.vacuum_count,
+   t.analyze_count,
+   t.cluster_count
   FROM pg_catalog.pg_stat_user_tables AS t,
pg_catalog.pg_statio_user_tables AS b
  WHERE t.relname='tenk2' AND b.relname='tenk2';
@@ -81,4 +87,20 @@ SELECT st.heap_blks_read + st.heap_blks_hit >= pr.heap_blks + cl.relpages,
   FROM pg_statio_user_tables AS st, pg_class AS cl, prevstats AS pr
  WHERE st.relname='tenk2' AND cl.relname='tenk2';
 
+-- table maintenance stats
+ANALYZE tenk2;
+VACUUM tenk2;
+CLUSTER tenk2 USING tenk2_unique1;
+
+SELECT pg_sleep(1.0);
+
+SELECT st.last_vacuum > pr.last_vacuum,
+   st.last_analyze > pr.last_analyze,
+   st.last_cluster > pr.last_cluster,
+   st.vacuum_count > pr.vacuum_count,
+   st.analyze_count > pr.analyze_count,
+   st.cluster_count > pr.cluster_count
+  FROM pg_stat_user_tables AS st, prevstats pr
+ WHERE st.relname='tenk2';
+
 -- End of Stats Test
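
For anyone trying the patch out, the new counters can also be eyeballed
directly after running the maintenance commands above (a quick sketch using
the columns this patch adds):

SELECT relname, last_vacuum, last_analyze, last_cluster,
       vacuum_count, analyze_count, cluster_count
  FROM pg_stat_user_tables
 WHERE relname = 'tenk2';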

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cube extension point support // GSoC'13

2013-09-16 Thread Heikki Linnakangas

On 12.07.2013 14:57, Stas Kelvich wrote:

Hello.

here is a patch adding support for a compressed representation of point cubes 
to the cube extension. If a cube is a point, i.e. has coincident lower left and 
upper right corners, then only one corner is stored. The first bit of the cube 
header indicates whether the cube is a point or not. A few notes:

* The patch preserves binary compatibility with old indexes
* All functions that create cubes from user input check whether it is a point 
or not
* All internal functions that can return cubes take care of all cases where a 
cube might become a point


Great!

cube_is_point() needs to still handle old-style points. An NDBOX without 
the point-flag set, where the ll and ur coordinates for each dimension 
are the same, still needs to be considered a point. Even if you are 
careful to never construct such structs in the code, they can be present 
on-disk if you have upgraded from an earlier version with pg_upgrade. 
Same in cube_out().



* Added tests for checking correct point behavior


You'll need to adjust all the expected output files, not only cube_1.out.


This patch also includes an adaptation of Alexander Korotkov's kNN-based 
ordering operator patch, which he wrote for postgresql-9.0beta1 with the 
knngist patch. More info here: 
http://www.postgresql.org/message-id/aanlktimhfaq6hcibrnk0tlcqmiyhywhwaq2zd87wb...@mail.gmail.com


To make review easier, it would be better to keep that as a separate 
patch, actually. Could you split it up again, please?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Triggers on foreign tables

2013-09-16 Thread Ronan Dunklau
On Thursday 12 September 2013 12:10:01 Peter Eisentraut wrote:
 The documentation build fails:
 
 openjade:trigger.sgml:72:9:E: end tag for ACRONYM omitted, but OMITTAG
 NO was specified
 openjade:trigger.sgml:70:56: start tag was here

Thank you, I took the time to install a working doc-building environment. 
Please find attached a patch that corrects this.

diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index e5ec738..08d133c 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -93,7 +93,7 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
 
   <para>
    The following table summarizes which types of triggers may be used on
-   tables and views:
+   tables, views and foreign tables:
   </para>
 
   <informaltable id="supported-trigger-types">
@@ -111,7 +111,7 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
   <entry align="center" morerows="1"><literal>BEFORE</></entry>
   <entry align="center"><command>INSERT</>/<command>UPDATE</>/<command>DELETE</></entry>
   <entry align="center">Tables</entry>
-  <entry align="center">Tables and views</entry>
+  <entry align="center">Tables, views and foreign tables</entry>
 </row>
 <row>
  <entry align="center"><command>TRUNCATE</></entry>
@@ -122,7 +122,7 @@ CREATE [ CONSTRAINT ] TRIGGER <replaceable class="PARAMETER">name</replaceable>
  <entry align="center" morerows="1"><literal>AFTER</></entry>
  <entry align="center"><command>INSERT</>/<command>UPDATE</>/<command>DELETE</></entry>
  <entry align="center">Tables</entry>
-  <entry align="center">Tables and views</entry>
+  <entry align="center">Tables, views and foreign tables</entry>
 </row>
 <row>
  <entry align="center"><command>TRUNCATE</></entry>
@@ -244,7 +244,7 @@ UPDATE OF <replaceable>column_name1</replaceable> [, <replaceable>column_name2</replaceable>
 <term><replaceable class="parameter">table_name</replaceable></term>
 <listitem>
  <para>
-  The name (optionally schema-qualified) of the table or view the trigger
+  The name (optionally schema-qualified) of the table, view or foreign table the trigger
   is for.
  </para>
 </listitem>
diff --git a/doc/src/sgml/trigger.sgml b/doc/src/sgml/trigger.sgml
index f579340..37c0a35 100644
--- a/doc/src/sgml/trigger.sgml
+++ b/doc/src/sgml/trigger.sgml
@@ -33,7 +33,7 @@
 <para>
  A trigger is a specification that the database should automatically
  execute a particular function whenever a certain type of operation is
-performed.  Triggers can be attached to both tables and views.
+performed.  Triggers can be attached to tables, views and foreign tables.
   </para>
 
   <para>
@@ -64,6 +64,15 @@
   </para>
 
   <para>
+   On foreign tables, triggers can be defined to execute either before or after any
+   <command>INSERT</command>,
+   <command>UPDATE</command> or
+   <command>DELETE</command> operation, only once per <acronym>SQL</acronym> statement.
+   This means that row-level triggers are not supported for foreign tables.
+   </para>
+
+
+   <para>
 The trigger function must be defined before the trigger itself can be
 created.  The trigger function must be declared as a
 function taking no arguments and returning type <literal>trigger</>.
@@ -96,6 +105,7 @@
 after may only be defined at statement level, while triggers that fire
 instead of an <command>INSERT</command>, <command>UPDATE</command>,
 or <command>DELETE</command> may only be defined at row level.
+On foreign tables, triggers cannot be defined at row level.
   </para>
 
   <para>
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index adc74dd..68bfa9c 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -3141,13 +3141,16 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
 			pass = AT_PASS_MISC;
 			break;
 		case AT_EnableTrig:		/* ENABLE TRIGGER variants */
-		case AT_EnableAlwaysTrig:
-		case AT_EnableReplicaTrig:
 		case AT_EnableTrigAll:
 		case AT_EnableTrigUser:
 		case AT_DisableTrig:	/* DISABLE TRIGGER variants */
 		case AT_DisableTrigAll:
 		case AT_DisableTrigUser:
+			ATSimplePermissions(rel, ATT_TABLE | ATT_FOREIGN_TABLE);
+			pass = AT_PASS_MISC;
+			break;
+		case AT_EnableReplicaTrig:
+		case AT_EnableAlwaysTrig:
 		case AT_EnableRule:		/* ENABLE/DISABLE RULE variants */
 		case AT_EnableAlwaysRule:
 		case AT_EnableReplicaRule:
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index d86e9ad..d84b61b 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -184,12 +184,23 @@ CreateTrigger(CreateTrigStmt *stmt, const char *queryString,
 			RelationGetRelationName(rel)),
 	 errdetail("Views cannot have TRUNCATE triggers.")));
 	}
-	else
+	else if (rel->rd_rel->relkind == RELKIND_FOREIGN_TABLE)
+	{
+		if (stmt->row)
+		{
+			ereport(ERROR,
+					(errcode(ERRCODE_WRONG_OBJECT_TYPE),
+					 errmsg("\"%s\" is a foreign table",
+			

Re: [HACKERS] Re: [BUGS] BUG #7873: pg_restore --clean tries to drop tables that don't exist

2013-09-16 Thread Satoshi Nagayasu

(2013/07/06 1:16), Pavel Stehule wrote:

I am sending a patch that removes strict requirements for DROP IF
EXISTS statements. This behavior is similar to our current ALTER IF
EXISTS behavior.


postgres=# DROP CAST IF EXISTS (sss AS public.casttesttype);
NOTICE:  types sss and public.casttesttype does not exist, skipping
DROP CAST
postgres=#   DROP FUNCTION IF EXISTS public.pt_in_widget(point, widget);
NOTICE:  function public.pt_in_widget(point,widget) does not exist, skipping
DROP FUNCTION
postgres=#  DROP OPERATOR IF EXISTS public.% (point, widget);
NOTICE:  operator public.% does not exist, skipping
DROP OPERATOR
postgres=# DROP TRIGGER test_trigger_exists ON no_such_table;
ERROR:  relation "no_such_table" does not exist
postgres=# DROP TRIGGER IF EXISTS test_trigger_exists ON no_such_table;
NOTICE:  trigger "test_trigger_exists" for table "no_such_table" does
not exist, skipping
DROP TRIGGER

This functionality is necessary for a correct, quiet reload from a dump
without spurious warnings


I'm looking at this patch, and I have a question here.

Should DROP TRIGGER IF EXISTS ignore the error for both a non-existing trigger
and a non-existing table? Or only for a non-existing trigger?

I've not understood the pg_restore issue precisely so far,
but IMHO DROP TRIGGER IF EXISTS means "if the _trigger_ exists",
not "if the _table_ exists".

Is this correct and/or expected behavior?

Sorry if I missed some consensus which we already made.

Any comments?

Regards,
--
Satoshi Nagayasu sn...@uptime.jp
Uptime Technologies, LLC. http://www.uptime.jp


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part 1: additional information

2013-09-16 Thread Heikki Linnakangas

On 15.09.2013 12:14, Alexander Korotkov wrote:

On Sat, Jun 29, 2013 at 12:56 PM, Heikki Linnakangas
hlinnakan...@vmware.com  wrote:


There's a few open questions:

1. How are we going to handle pg_upgrade? It would be nice to be able to
read the old page format, or convert on-the-fly. OTOH, if it gets too
complicated, might not be worth it. The indexes are much smaller with the
patch, so anyone using GIN probably wants to rebuild them anyway, sooner or
later. Still, I'd like to give it a shot.

2. The patch introduces a small fixed 32-entry index into the packed
items. Is that an optimal number?

3. I'd like to see some performance testing of insertions, deletions, and
vacuum. I suspect that maintaining the 32-entry index might be fairly
expensive, as it's rewritten on every update to a leaf page.


It appears that maintaining the 32-entry index is really expensive, because it
requires re-decoding the whole page. This issue is fixed in the attached version
of the patch by introducing incremental updating of that index. Benchmarks will
be posted later today.


Great! Please also benchmark WAL replay; you're still doing 
non-incremental update of the 32-entry index in ginRedoInsert.


A couple of quick comments after a quick glance (in addition to the above):

The varbyte encoding stuff is a quite isolated piece of functionality. 
And potentially useful elsewhere, too. It would be good to separate that 
into a separate .c/.h files. I'm thinking of 
src/backend/access/gin/packeditemptr.c, which would contain 
ginDataPageLeafReadItemPointer, ginDataPageLeafWriteItemPointer and 
ginDataPageLeafGetItemPointerSize functions. A new typedef for 
varbyte-encoded things would probably be good too, something like 
typedef char *PackedItemPointer. In the new .c file, please also add a 
file-header comment explaining the encoding.
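
For that header comment, it's worth spelling out the scheme explicitly.
Assuming it's the usual varbyte encoding (7 payload bits per byte, high bit
set on all but the last byte), the write side would look roughly like this
(a sketch, not the patch's actual code):

static char *
encode_varbyte(uint64 val, char *ptr)
{
	/* emit 7 bits at a time; the high bit marks "more bytes follow" */
	while (val > 0x7F)
	{
		*ptr++ = (char) ((val & 0x7F) | 0x80);
		val >>= 7;
	}
	*ptr++ = (char) val;
	return ptr;
}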


The README really needs to be updated. The posting tree page structure 
is a lot more complicated now, there needs to be a whole new section to 
explain it. A picture would be nice, similar to the one in bufpage.h.


It's a bit funny that we've clumped together all different kinds of GIN 
insertions into one WAL record type. ginRedoInsert does completely 
different things depending on what kind of a page it is. And the 
ginXlogInsert struct also contains different kinds of things depending 
on what kind of an insertion it is. It would be cleaner to split it into 
three. (this isn't your patch's fault - it was like that before, too.)


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GIN improvements part 1: additional information

2013-09-16 Thread Alexander Korotkov
On Mon, Sep 16, 2013 at 11:43 AM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:

 On 15.09.2013 12:14, Alexander Korotkov wrote:

 On Sat, Jun 29, 2013 at 12:56 PM, Heikki Linnakangas
 hlinnakan...@vmware.com  wrote:

  There's a few open questions:

 1. How are we going to handle pg_upgrade? It would be nice to be able to
 read the old page format, or convert on-the-fly. OTOH, if it gets too
 complicated, might not be worth it. The indexes are much smaller with the
 patch, so anyone using GIN probably wants to rebuild them anyway, sooner
 or
 later. Still, I'd like to give it a shot.

 2. The patch introduces a small fixed 32-entry index into the packed
 items. Is that an optimal number?

 3. I'd like to see some performance testing of insertions, deletions, and
 vacuum. I suspect that maintaining the 32-entry index might be fairly
 expensive, as it's rewritten on every update to a leaf page.


 It appears that maintaining 32-entry index is really expensive because it
 required re-decoding whole page. This issue is fixed in attached version
 of
 patch by introducing incremental updating of that index. Benchmarks will
 be
 posted later today..


 Great! Please also benchmark WAL replay; you're still doing
 non-incremental update of the 32-entry index in ginRedoInsert.

 A couple of quick comments after a quick glance (in addition to the above):

 The varbyte encoding stuff is a quite isolated piece of functionality. And
 potentially useful elsewhere, too. It would be good to separate that into a
 separate .c/.h files. I'm thinking of 
 src/backend/access/gin/packeditemptr.c,
 which would contain ginDataPageLeafReadItemPointer,
 ginDataPageLeafWriteItemPointer and ginDataPageLeafGetItemPointerSize
 functions. A new typedef for varbyte-encoded things would probably be good
 too, something like typedef char *PackedItemPointer. In the new .c file,
 please also add a file-header comment explaining the encoding.

 The README really needs to be updated. The posting tree page structure is
 a lot more complicated now, there needs to be a whole new section to
 explain it. A picture would be nice, similar to the one in bufpage.h.

 It's a bit funny that we've clumped together all different kinds of GIN
 insertions into one WAL record type. ginRedoInsert does completely
 different things depending on what kind of a page it is. And the
 ginXlogInsert struct also contains different kinds of things depending on
 what kind of an insertion it is. It would be cleaner to split it into
 three. (this isn't your patch's fault - it was like that before, too.)


Apparently some bug breaks the index on huge updates, independent of the
incremental update of the 32-entry index. I'm debugging it now. That's why
the benchmarks are delayed.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)

2013-09-16 Thread Heikki Linnakangas

On 14.09.2013 02:41, Richard Poole wrote:

The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
on systems that support it. It's based on Christian Kruse's patch from
last year, incorporating suggestions from Andres Freund.


I don't understand the logic in figuring out the pagesize, and the 
smallest supported hugepage size. First of all, even without the patch, 
why do we round up the size passed to mmap() to the _SC_PAGE_SIZE? 
Surely the kernel will round up the request all by itself. The mmap() 
man page doesn't say anything about length having to be a multiple of 
pages size.


And with the patch, why do you bother detecting the minimum supported 
hugepage size? Surely the kernel will choose the appropriate hugepage 
size just fine on its own, no?



It is still WIP as there are a couple of points that Andres has pointed
out to me that haven't been addressed yet;


Which points are those?

I wonder if it would be better to allow setting huge_tlb_pages=try even 
on platforms that don't have hugepages. It would simply mean the same as 
'off' on such platforms.
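
In other words, 'try' would always boil down to something like the
following, with the first branch compiled out on platforms without
MAP_HUGETLB (a sketch, not the patch's actual code):

ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
if (ptr == MAP_FAILED)		/* fall back to regular pages */
	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS, -1, 0);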


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UNNEST with multiple args, and TABLE with multiple funcs

2013-09-16 Thread Boszormenyi Zoltan

On 2013-09-13 21:03, Andrew Gierth wrote:

Latest version of patch, incorporating regression tests and docs, and
fixing the operator issue previously raised.


It looks good. I think it's ready for a committer.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes

2013-09-16 Thread Thom Brown
On 15 September 2013 01:14, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Hi,

 Here's a reviewable version of what I've dubbed Minmax indexes.  Some
 people said they would like to use some other name for this feature, but
 I have yet to hear usable ideas, so for now I will keep calling them
 this way.  I'm open to proposals, but if you pick something that cannot
 be abbreviated "mm" I might have you prepare a rebased version which
 renames the files and structs.

 The implementation here has been simplified from what I originally
 proposed at 20130614222805.gz5...@eldon.alvh.no-ip.org -- in particular,
 I noticed that there's no need to involve aggregate functions at all; we
 can just use inequality operators.  So the pg_amproc entries are gone;
 only the pg_amop entries are necessary.

 I've somewhat punted on the question of doing resummarization separately
 from vacuuming.  Right now, resummarization (as well as other necessary
 index cleanup) takes place in amvacuumcleanup.  This is not optimal; I
 have stated elsewhere that I'd like to create separate maintenance
 actions that can be carried out by autovacuum.  That would be useful
 both for Minmax indexes and GIN indexes (pending insertion list); maybe
 others.  That's not part of this patch, however.

 The design of this stuff is in the file minmax-proposal at the top of
 the tree.  That file is up to date, though it still contains some open
 questions that were present in the original proposal.  (I have not fixed
 some bogosities pointed out by Noah, for instance.  I will do that
 shortly.)  In a final version, that file would be applied as
 src/backend/access/minmax/README, most likely.

 One area on which I needed to modify core code is IndexBuildHeapScan.  I
 needed a version that was able to scan only a certain range of pages,
 not the entire table, so I introduced a new IndexBuildHeapRangeScan, and
 added a quick heap_scansetlimits function.  I haven't tested that this
 works outside of the HeapRangeScan thingy, so it's probably completely
 bogus; I'm open to suggestions if people think this should be
 implemented differently.  In any case, keeping that implementation
 together with vanilla IndexBuildHeapScan makes a lot of sense.

 One thing still to tackle is when to mark ranges as unsummarized.  Right
 now, any new tuple on a page range would cause a new index entry to be
 created and a new revmap update.  This would cause huge index bloat if,
 say, a page is emptied and vacuumed and filled with new tuples with
 increasing values outside the original range; each new tuple would
 create a new index tuple.  I have two ideas about this (1. mark range as
 unsummarized if 3rd time we touch the same page range; 2. vacuum the
 affected index page if it's full, so we can maintain the index always up
 to date without causing unduly bloat), but I haven't implemented
 anything yet.

 The amcostestimate routine is completely bogus; right now it returns
 constant 0, meaning the index is always chosen if it exists.

 There are opclasses for int4, numeric and text.  The latter doesn't work
 at all, because collation info is not passed down at all.  I will have
 to figure that out (even if I find unlikely that minmax indexes have any
 usefulness on top of text columns).  I admit that numeric hasn't been
 tested, and it's quite likely that they won't work; mainly because of
 lack of some datumCopy() calls, about which the code contains some
 /* XXX */ lines.  I think this should be relatively straightforward.
 Ideally, the final version of this patch would contain opclasses for all
 supported datatypes (i.e. the same that have got btree opclasses).

 I have messed up the opclass information, as evidenced by failures in
 opr_sanity regression test.  I will research that later.


 There's working contrib/pageinspect support; pg_xlogdump (and wal_debug)
 seems to work sanely too.
 This patch compiles cleanly under -Werror.

   The research leading to these results has received funding from the
   European Union's Seventh Framework Programme (FP7/2007-2013) under
   grant agreement n° 318633


Thanks for the patch, but I seem to have immediately hit a snag:

pgbench=# CREATE INDEX minmaxtest ON pgbench_accounts USING minmax (aid);
PANIC:  invalid xlog record length 0

-- 
Thom


Re: [HACKERS] plpgsql.print_strict_params

2013-09-16 Thread Marko Tiikkaja

On 9/16/13 8:04 AM, Ian Lawrence Barwick wrote:

I'm taking a look at this patch as part of the current commitfest [*],


Thanks!


However the sample function provided in the documentation throws a
runtime error due to a missing FROM-clause entry.


Ugh.  I'll look into fixing that.


* Does it follow SQL spec, or the community-agreed behavior?
SQL spec: n/a. I do note that it makes use of the # syntax
before the body of a PL/pgSQL function, which is currently only
used for #variable_conflict [*]. I can imagine this syntax might
be used for other purposes down the road, so it might be worth
keeping an eye on it before it becomes a hodge-podge of ad-hoc
options.


Agreed.  I have two patches like this on the commitfest and one more 
cooking, so if more than one of these make it into PG, we should 
probably discuss how the general mechanism should work and look like in 
the future.



* Have all the bases been covered?

This line is in exec_get_query_params() and exec_get_dynquery_params():

 return "(no parameters)";

Presumably the message will escape translation, and this line would be better
written as:
return _("(no parameters)");


Nice catch.  Will look into this.  Another option would be to simply 
omit the DETAIL line if there were no parameters.  At this very moment 
I'm thinking that might be a bit nicer.



Also, if the offending query parameter contains a single quote, the output
is not escaped:

postgres=# select get_userid(E'foo''');
ERROR:  query returned more than one row
DETAIL:  p1 = 'foo''
CONTEXT:  PL/pgSQL function get_userid(text) line 9 at SQL statement

Not sure if that's a real problem though.


Hmm..  I should probably look at how the parameters to a prepared 
statement are currently logged and imitate that.



* Does it follow the project coding guidelines?
Yes.

The functions added in pl_exec.c - exec_get_query_params() and
exec_get_dynquery_params() do strike me as potentially misnamed,
as they don't actually execute anything but are more utility
functions for generating formatted output.

Maybe format_query_params() etc. would be better?


Agreed, they could use some renaming.


* Are the comments sufficient and accurate?
exec_get_query_params() and exec_get_dynquery_params()
could do with some brief comments explaining their purpose.


Agreed.


(It's the first time I've formally reviewed a patch for a commitfest
so please let me know if I'm missing something.)


I personally think you did an excellent job.  Huge thanks so far!


Regards,
Marko Tiikkaja


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [HACKERS] Is it necessary to rewrite table while increasing the scale of datatype numeric?

2013-09-16 Thread wangshuo

wangs...@highgo.com.cn wrote:


I modified the code for this situation. I consider it very simple.



It does not modify the table file when only the scale has been
increased.




Kevin Grittner kgri...@ymail.com wrote:

This patch would allow data in a column which was not consistent
with the column definition:

test=# create table n (val numeric(5,2));
CREATE TABLE
test=# insert into n values ('123.45');
INSERT 0 1
test=# select * from n;
  val   
--------
 123.45
(1 row)

test=# alter table n alter column val type numeric(5,4);
ALTER TABLE
test=# select * from n;
  val   
--------
 123.45
(1 row)

Without your patch the ALTER TABLE command gets this error (as it
should):

test=# alter table n alter column val type numeric(5,4);
ERROR:  numeric field overflow
DETAIL:  A field with precision 5, scale 4 must round to an absolute
value less than 10^1.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Thanks for your reply and test.
I have added a new function named ATNumericColumnChangeRequiresCheck to
check the data when the scale of a numeric column is increased.
Now the ALTER TABLE command reports this error:

postgres=# alter table tt alter COLUMN t1 type numeric (5,4);
ERROR:  numeric field overflow
DETAIL:  A field with precision 5, scale 4 must round to an absolute
value less than 10^1.

STATEMENT:  alter table tt alter COLUMN t1 type numeric (5,4);
ERROR:  numeric field overflow
DETAIL:  A field with precision 5, scale 4 must round to an absolute
value less than 10^1.


I have attached a new patch with this modification.

I think this 'altering field type model' could modify all the types
in the database, making different modifications to a column's datatype
for different situations.
For example, when you modify the scale of a numeric column: if you think
that 5.0 and 5.00 are different, the table file must be rewritten;
otherwise, it needn't be rewritten.


 Wang Shuo
 HighGo Software Co.,Ltd.
 September 16, 2013
diff -uNr b/src/backend/commands/tablecmds.c a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c	2013-08-31 17:11:00.529744869 +0800
+++ a/src/backend/commands/tablecmds.c	2013-09-16 16:33:49.527455560 +0800
@@ -367,7 +367,10 @@
 	  AlteredTableInfo *tab, Relation rel,
 	  bool recurse, bool recursing,
 	  AlterTableCmd *cmd, LOCKMODE lockmode);
-static bool ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno);
+static void ATNumericColumnChangeRequiresCheck(AlteredTableInfo *tab);
+static bool ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno,
+	int32 oldtypemod, int32 newtypemod,
+			AlteredTableInfo *tab);
 static void ATExecAlterColumnType(AlteredTableInfo *tab, Relation rel,
 	  AlterTableCmd *cmd, LOCKMODE lockmode);
 static void ATExecAlterColumnGenericOptions(Relation rel, const char *colName,
@@ -7480,7 +7483,7 @@
 		newval->expr = (Expr *) transform;
 
 		tab->newvals = lappend(tab->newvals, newval);
-		if (ATColumnChangeRequiresRewrite(transform, attnum))
+		if (ATColumnChangeRequiresRewrite(transform, attnum, attTup->atttypmod, targettypmod, tab))
 			tab->rewrite = true;
 	}
 	else if (transform)
@@ -7520,6 +7523,102 @@
 }
 
 /*
+ * Check the data when the scale of a numeric column is increased.
+ */
+
+static void
+ATNumericColumnChangeRequiresCheck(AlteredTableInfo *tab)
+{
+	Relation	oldrel;
+	TupleDesc	oldTupDesc;
+	int			i;
+	ListCell   *l;
+	EState	   *estate;
+	ExprContext *econtext;
+	bool	   *isnull;
+	TupleTableSlot *oldslot;
+	HeapScanDesc scan;
+	HeapTuple	tuple;
+	Snapshot	snapshot;
+	List	   *dropped_attrs = NIL;
+	ListCell   *lc;
+	Datum	   *values;
+
+	/*
+	 * Open the relation(s).  We have surely already locked the existing
+	 * table.
+	 */
+
+	oldrel = heap_open(tab->relid, NoLock);
+	oldTupDesc = tab->oldDesc;
+
+	for (i = 0; i < oldTupDesc->natts; i++)
+	{
+		if (oldTupDesc->attrs[i]->attisdropped)
+			dropped_attrs = lappend_int(dropped_attrs, i);
+	}
+	/*
+	 * Generate the constraint and default execution states
+	 */
+
+	estate = CreateExecutorState();
+
+	foreach(l, tab->newvals)
+	{
+		NewColumnValue *ex = lfirst(l);
+
+		/* expr already planned */
+		ex->exprstate = ExecInitExpr((Expr *) ex->expr, NULL);
+	}
+
+	econtext = GetPerTupleExprContext(estate);
+	oldslot = MakeSingleTupleTableSlot(oldTupDesc);
+
+	/* Preallocate values/isnull arrays */
+	i = oldTupDesc->natts;
+	values = (Datum *) palloc(i * sizeof(Datum));
+	isnull = (bool *) palloc(i * sizeof(bool));
+	memset(values, 0, i * sizeof(Datum));
+	memset(isnull, true, i * sizeof(bool));
+
+	/*
+	 * Scan through the rows, generating a new row if needed and then
+	 * checking all the constraints.
+	 */
+	snapshot = RegisterSnapshot(GetLatestSnapshot());
+	scan = heap_beginscan(oldrel, snapshot, 0, NULL);
+	while ((tuple = heap_getnext(scan, ForwardScanDirection)) != NULL)
+	{
+		foreach(lc, dropped_attrs)
+			isnull[lfirst_int(lc)] = true;
+
+		heap_deform_tuple(tuple, oldTupDesc, values, 

Re: [HACKERS] Minmax indexes

2013-09-16 Thread Heikki Linnakangas

On 15.09.2013 03:14, Alvaro Herrera wrote:

+ Partial indexes are not supported; since an index is concerned with minimum and
+ maximum values of the involved columns across all the pages in the table, it
+ doesn't make sense to exclude values.  Another way to see partial indexes
+ here would be those that only considered some pages in the table instead of all
+ of them; but this would be difficult to implement and manage and, most likely,
+ pointless.


Something like this seems completely sensible to me:

create index i_accounts on accounts using minmax (ts) where valid = true;

The situation where that would be useful is if 'valid' accounts are 
fairly well clustered, but invalid ones are scattered all over the 
table. The minimum and maximum stored in the index would only concern 
valid accounts.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)

2013-09-16 Thread Andres Freund
On 2013-09-16 11:15:28 +0300, Heikki Linnakangas wrote:
 On 14.09.2013 02:41, Richard Poole wrote:
 The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
 on systems that support it. It's based on Christian Kruse's patch from
 last year, incorporating suggestions from Andres Freund.
 
 I don't understand the logic in figuring out the pagesize, and the smallest
 supported hugepage size. First of all, even without the patch, why do we
 round up the size passed to mmap() to the _SC_PAGE_SIZE? Surely the kernel
 will round up the request all by itself. The mmap() man page doesn't say
 anything about length having to be a multiple of pages size.

I think it does:
   EINVAL We don't like addr, length, or offset (e.g., they are  too
  large,  or  not aligned on a page boundary).
and
   A file is mapped in multiples of the page size.  For a file that is not a multiple
   of the page size, the remaining memory is zeroed when mapped, and writes to that
   region are not written out to the file.  The effect of changing the size of the
   underlying file of a mapping on the pages that correspond to added or removed
   regions of the file is unspecified.

And no, according to my past experience, the kernel does *not* do any
such rounding up. It will just fail.

 And with the patch, why do you bother detecting the minimum supported
 hugepage size? Surely the kernel will choose the appropriate hugepage size
 just fine on its own, no?

It will fail if it's not a multiple.

 It is still WIP as there are a couple of points that Andres has pointed
 out to me that haven't been addressed yet;
 
 Which points are those?

I don't know which point Richard already has fixed, so I'll let him
comment on that.

 I wonder if it would be better to allow setting huge_tlb_pages=try even on
 platforms that don't have hugepages. It would simply mean the same as 'off'
 on such platforms.

I wouldn't argue against that.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes

2013-09-16 Thread Chris Travers


 On 16 September 2013 at 11:03 Heikki Linnakangas hlinnakan...@vmware.com
 wrote:


 Something like this seems completely sensible to me:

 create index i_accounts on accounts using minmax (ts) where valid = true;

 The situation where that would be useful is if 'valid' accounts are
 fairly well clustered, but invalid ones are scattered all over the
 table. The minimum and maximum stored in the index would only concern
 valid accounts.

Here's one that occurs to me:

CREATE INDEX i_billing_id_mm ON billing(id) WHERE paid_in_full IS NOT TRUE;

Note that this would be a frequently moving target and over years of billing,
the subset would be quite small compared to the full system (imagine, say, 50k
rows out of 20M).

Best Wishes,
Chris Travers

 - Heikki


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support

Re: [HACKERS] git apply vs patch -p1

2013-09-16 Thread Andres Freund
On 2013-09-16 10:16:37 +0530, Ashutosh Bapat wrote:
 On Sun, Sep 15, 2013 at 12:38 AM, Andres Freund and...@2ndquadrant.comwrote:
 
  On 2013-09-14 15:03:52 -0400, Andrew Dunstan wrote:
  
   On 09/14/2013 02:37 PM, Josh Berkus wrote:
   Folks,
   
   Lately I've been running into a lot of reports of false conflicts
   reported by git apply.  The most recent one was the points patch,
   which git apply rejected for completely ficticious reasons (it claimed
   that the patch was trying to create a new file where a file already
   existed, which it wasn't).
   
   I think we should modify the patch review and developer instructions to
   recommend always using patch -p1 (or -p0, depending), even if the patch
   was produced with git diff.
   
   Thoughts?
   
  
  
   FWIW that's what I invariably use.
  
   You do have to be careful to git-add/git-rm any added/deleted files,
  which
   git-apply does for you (as well as renames) - I've been caught by that a
   couple of times.
 
  git reset?
 
 
 git reset wouldn't remove the files that were newly added by the patch,
 would it?

Depends on how you do it. I simply commit patches I look at - then they
can easily be removed using git reset --hard HEAD^. And it allows to
make further changes/annotations during review that are clearly
separated from the patch.
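
For the record, that workflow is roughly (a sketch):

$ git checkout -b review
$ patch -p1 < ../some-feature.patch
$ git add -A && git commit -m 'patch under review'
  ... review; add annotations as further commits ...
$ git reset --hard HEAD^      # drop the topmost commit again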

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Minmax indexes

2013-09-16 Thread Andres Freund
On 2013-09-16 11:19:19 +0100, Chris Travers wrote:
 
 
  On 16 September 2013 at 11:03 Heikki Linnakangas hlinnakan...@vmware.com
  wrote:
 
 
  Something like this seems completely sensible to me:
 
  create index i_accounts on accounts using minmax (ts) where valid = true;
 
  The situation where that would be useful is if 'valid' accounts are
  fairly well clustered, but invalid ones are scattered all over the
  table. The minimum and maximum stored in the index would only concern
  valid accounts.

Yes, I wondered the same myself.

 Here's one that occurs to me:
 
 CREATE INDEX i_billing_id_mm ON billing(id) WHERE paid_in_full IS NOT TRUE;
 
 Note that this would be a frequently moving target and over years of billing,
 the subset would be quite small compared to the full system (imagine, say, 50k
 rows out of 20M).

In that case you'd just use a normal btree index, no?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Not In Foreign Key Constraint

2013-09-16 Thread Misa Simic
Hi hackers,

I just wonder how hard it would be to implement something like a "Not In FK"
constraint, i.e. the opposite of an FK...

i.e:

 An FK ensures that the value of the FK column of an inserted row exists in
the referenced table.

 A NotInFK should ensure that the value of the NotInFK column of an inserted
row does not exist in the referenced table...


The only difference/problem I see is that adding that constraint on one
table forces the same constraint on the other table (but in the opposite
direction).


i.e.

TableA(tableA_pk, other_columns)
TableB(tableb_fk_tableA_pk, other_columns)
TableC(tablec_notInfk_tableA_pk, other_column)


each _pk column is the Primary Key of its table;
TableB's PK is at the same time an FK to TableA...

INSERT INTO TableA VALUES ('tableAPK1', 'somedata')

INSERT INTO TableB VALUES ('tableAPK1', 'somedata')

everything ok,


now we would like to add a NotInFK on TableC referencing TableA

INSERT INTO TableC VALUES ('tableAPK1', 'somedata')

Should fail - because 'tableAPK1' exists in TableA

INSERT INTO TableC VALUES ('tableAPK2', 'somedata')

Should pass - because 'tableAPK2' does not exist in TableA...

How ever, now

INSERT INTO TableA VALUES ('tableAPK2', 'somedata')

should fail as well - because that value exists in TableC


I guess that rule can be achieved with triggers on TableA and TableC - but
the same is true for an FK (and an FK constraint is more efficient than a
trigger - that is why I wonder whether it would be useful/achievable to
create that kind of constraint)
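
For comparison, the trigger-based version of the TableC side would look
something like this (a sketch; the mirror-image trigger on TableA is
analogous):

CREATE FUNCTION tablec_check_not_in_tablea() RETURNS trigger AS $$
BEGIN
    IF EXISTS (SELECT 1 FROM TableA
                WHERE tableA_pk = NEW.tablec_notInfk_tableA_pk) THEN
        RAISE EXCEPTION 'value % already exists in TableA',
              NEW.tablec_notInfk_tableA_pk;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER tablec_not_in_fk
    BEFORE INSERT OR UPDATE ON TableC
    FOR EACH ROW EXECUTE PROCEDURE tablec_check_not_in_tablea();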

Thoughts, ideas?

Many thanks,

Misa

Re: [HACKERS] GUC for data checksums

2013-09-16 Thread Heikki Linnakangas

On 15.09.2013 17:05, Andres Freund wrote:

On 2013-09-15 03:34:53 +0200, Bernd Helmle wrote:



--On 15. September 2013 00:25:34 +0200 Andres Freund
and...@2ndquadrant.com  wrote:


Looks like a good idea to me. The implementation looks sane as well,
except that I am not sure if we really need to introduce that faux
variable. If the variable cannot be set and we have a SHOW hook, do we
need it?


It's along the line with the other informational variables like block_size
et al. Do you want to have a function instead or what's your intention?


Well, you've added a data_checksums variable that won't ever get used,
right? You can't set the variable and the show hook doesn't actually use
it.
The reason you presumably did so is that there is no plain variable that
contains information about data checksums, we first need to read the
control file to know whether it's enabled and GUCs are initialized way
earlier than that.

A quick look unfortunately shows that there's no support for GUCs
without an actual underlying variable, so unless somebody adds that,
there doesn't seem to be much choice.

I think a comment documenting that the data_checksums variable is not
actually used would be appropriate.


Surprisingly we don't have any other gucs that would be set at initdb 
time, and not changed after that. But we used to have two, lc_collate 
and lc_ctype, until we changed them to be per-database settings. We used 
to do this in ReadControlFile:



/* Make the fixed locale settings visible as GUC variables, too */
SetConfigOption("lc_collate", ControlFile->lc_collate,
PGC_INTERNAL, PGC_S_OVERRIDE);
SetConfigOption("lc_ctype", ControlFile->lc_ctype,
PGC_INTERNAL, PGC_S_OVERRIDE);


I did the same for data_checksums, and committed. Thanks for the patch.
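
The data_checksums equivalent is presumably along the same lines (a sketch;
the committed code may differ slightly):

	/* Make the data checksum setting visible as a GUC variable, too */
	SetConfigOption("data_checksums",
					DataChecksumsEnabled() ? "on" : "off",
					PGC_INTERNAL, PGC_S_OVERRIDE);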

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GUC for data checksums

2013-09-16 Thread Andres Freund
On 2013-09-16 14:43:27 +0300, Heikki Linnakangas wrote:
 On 15.09.2013 17:05, Andres Freund wrote:
 On 2013-09-15 03:34:53 +0200, Bernd Helmle wrote:
 
 
 --On 15. September 2013 00:25:34 +0200 Andres Freund
 and...@2ndquadrant.com  wrote:
 
 Looks like a good idea to me. The implementation looks sane as well,
 except that I am not sure if we really need to introduce that faux
 variable. If the variable cannot be set and we have a SHOW hook, do we
 need it?
 
 It's along the line with the other informational variables like block_size
 et al. Do you want to have a function instead or what's your intention?
 
 Well, you've added a data_checksums variable that won't ever get used,
 right? You can't set the variable and the show hook doesn't actually use
 it.
 The reason you presumably did so is that there is no plain variable that
 contains information about data checksums, we first need to read the
 control file to know whether it's enabled and GUCs are initialized way
 earlier than that.
 
 A quick look unfortunately shows that there's no support for GUCs
 without an actual underlying variable, so unless somebody adds that,
 there doesn't seem to be much choice.
 
 I think a comment documenting that the data_checksums variable is not
 actually used would be appropriate.
 
 Surprisingly we don't have any other gucs that would be set at initdb time,
 and not changed after that. But we used to have two, lc_collate and
 lc_ctype, until we changed them to be per-database settings. We used to do
 this in ReadControlFile:
 
  /* Make the fixed locale settings visible as GUC variables, too */
  SetConfigOption("lc_collate", ControlFile->lc_collate,
  PGC_INTERNAL, PGC_S_OVERRIDE);
  SetConfigOption("lc_ctype", ControlFile->lc_ctype,
  PGC_INTERNAL, PGC_S_OVERRIDE);
 
 I did the same for data_checksums, and committed. Thanks for the patch.

I don't think it's fatal - as you say we've done so before - but note
that both the committed and Bernd's version don't report the correct
value on postgres -C data_checksums because we haven't read the control
file yet...
Maybe we should do that earlier?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fix picksplit with nan values

2013-09-16 Thread Andrew Gierth
 Alexander == Alexander Korotkov aekorot...@gmail.com writes:

 Alexander 2) NaN coordinates should be processed in GiST index scan
 Alexander like in sequential scan.

postgres=# select * from pts order by a <-> '(0,0)' limit 10;
    a     
----------
 (1,1)
 (7,nan)
 (9,nan)
 (11,nan)
 (4,nan)
 (nan,6)
 (2,1)
 (1,2)
 (2,2)
 (3,1)
(10 rows)

postgres=# set enable_indexscan=false;
SET

postgres=# select * from pts order by a <-> '(0,0)' limit 10;
   a   
-------
 (1,1)
 (2,1)
 (1,2)
 (2,2)
 (3,1)
 (1,3)
 (3,2)
 (2,3)
 (4,1)
 (1,4)
(10 rows)

this data set was created by:
insert into pts
  select point(i,j)
from (select generate_series(1,100)::float8 union all select 'nan') s1(i),
 (select generate_series(1,100)::float8 union all select 'nan') s2(j)
   order by random();

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] UTF8 national character data type support WIP patch and list of open issues.

2013-09-16 Thread MauMau

Hello,

I think it would be nice for PostgreSQL to support national character types 
largely because it should ease migration from other DBMSs.


[Reasons why we need NCHAR]
--------------------------
1. Invite users of other DBMSs to PostgreSQL.  Oracle, SQL Server, MySQL, 
etc. all have NCHAR support.  PostgreSQL is probably the only database out 
of major ones that does not support NCHAR.
Sadly, I've read a report from some Japanese government agency that the 
number of MySQL users exceeded that of PostgreSQL here in Japan in 2010 or 
2011.  I wouldn't say that is due to NCHAR support, but it might be one 
reason.  I want PostgreSQL to be more popular and regain those users.


2. Enhance the open image of PostgreSQL by implementing more features of 
SQL standard.  NCHAR may be a wrong and unnecessary feature of SQL standard 
now that we have Unicode support, but it is defined in the standard and 
widely implemented.


3. I have heard that some potential customers didn't adopt PostgreSQL due to 
lack of NCHAR support.  However, I don't know the exact reason why they need 
NCHAR.


4. I guess some users really want to continue to use ShiftJIS or EUC_JP for 
database encoding, and use NCHAR for a limited set of columns to store 
international text in Unicode:

- to avoid code conversion between the server and the client for performance
- because ShiftJIS and EUC_JP require less amount of storage (2 bytes for 
most Kanji) than UTF-8 (3 bytes)
This use case is described in chapter 6 of Oracle Database Globalization 
Support Guide.

--------------------------


I think we need to do the following:

[Minimum requirements]
--------------------------
1. Accept NCHAR/NVARCHAR as data type name and N'...' syntactically.
This is already implemented.  PostgreSQL treats NCHAR/NVARCHAR as synonyms 
for CHAR/VARCHAR, and ignores N prefix.  But this is not documented.
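
For example, this already works today, even though it is undocumented:

CREATE TABLE t (c NCHAR(10));   -- accepted; c becomes character(10)
INSERT INTO t VALUES (N'abc');  -- N'...' literal accepted, prefix ignored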


2. Declare support for national character support in the manual.
1 is not sufficient because users don't want to depend on undocumented 
behavior.  This is exactly what the TODO item national character support 
in PostgreSQL TODO wiki is about.


3. Implement NCHAR/NVARCHAR as distinct data types, not as synonyms so that:
- psql \d can display the user-specified data types.
- pg_dump/pg_dumpall can output NCHAR/NVARCHAR columns as-is, not as 
CHAR/VARCHAR.
- To implement additional features for NCHAR/NVARCHAR in the future, as 
described below.

--------------------------




[Optional requirements]
--------------------------
1. Implement client driver support, such as:
- NCHAR host variable type (e.g. NCHAR var_name[12];) in ECPG, as 
specified in the SQL standard.
- national character methods (e.g. setNString, getNString, 
setNCharacterStream) as specified in JDBC 4.0.
I think at first we can treat these national-character-specific features as 
the same as CHAR/VARCHAR.


2. NCHAR/NVARCHAR columns can be used in non-UTF-8 databases and always 
contain Unicode data.
I think it is sufficient at first that NCHAR/NVARCHAR columns can only be 
used in UTF-8 databases and they store UTF-8 strings.  This allows us to 
reuse the input/output/send/recv functions and other infrastructure of 
CHAR/VARCHAR.  This is a reasonable compromise to avoid duplication and 
minimize the first implementation of NCHAR support.


3. Store strings in UTF-16 encoding in NCHAR/NVARCHAR columns.
Fixed-width encoding may allow faster string manipulation as described in 
Oracle's manual.  But I'm not sure about this, because UTF-16 is not a real 
fixed-width encoding due to supplementary characters.

--------------------------


I don't think it is good to implement NCHAR/NVARCHAR types as extensions 
like contrib/citext, because NCHAR/NVARCHAR are basic types and need 
client-side support.  That is, client drivers need to be aware of the fixed 
NCHAR/NVARCHAR OID values.


How do you think we should implement NCHAR support?

Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)

2013-09-16 Thread Heikki Linnakangas

On 16.09.2013 13:15, Andres Freund wrote:

On 2013-09-16 11:15:28 +0300, Heikki Linnakangas wrote:

On 14.09.2013 02:41, Richard Poole wrote:

The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
on systems that support it. It's based on Christian Kruse's patch from
last year, incorporating suggestions from Andres Freund.


I don't understand the logic in figuring out the pagesize, and the smallest
supported hugepage size. First of all, even without the patch, why do we
round up the size passed to mmap() to the _SC_PAGE_SIZE? Surely the kernel
will round up the request all by itself. The mmap() man page doesn't say
anything about length having to be a multiple of pages size.


I think it does:
EINVAL We don't like addr, length, or offset (e.g., they are  too
   large,  or  not aligned on a page boundary).


That doesn't mean that they *all* have to be aligned on a page boundary. 
It's understandable that 'addr' and 'offset' have to be, but it doesn't 
make much sense for 'length'.



and
A file is mapped in multiples of the page size.  For a file that is not a multiple
of the page size, the remaining memory is zeroed when mapped, and writes to that
region are not written out to the file.  The effect of changing the size of the
underlying file of a mapping on the pages that correspond to added or removed
regions of the file is unspecified.

And no, according to my past experience, the kernel does *not* do any
such rounding up. It will just fail.


I wrote a little test program to play with different values (attached). 
I tried this on my laptop with a 3.2 kernel (uname -r: 3.10-2-amd64), and 
on a VM with a fresh Centos 6.4 install with 2.6.32 kernel 
(2.6.32-358.18.1.el6.x86_64), and they both work the same:


$ ./mmaptest 100 # mmap 100 bytes

in a different terminal:
$ cat /proc/meminfo  | grep HugePages_Rsvd
HugePages_Rsvd:1

So even a tiny allocation, much smaller than any page size, succeeds, 
and it reserves a huge page. I tried the same with larger values; the 
kernel always uses huge pages, and rounds up the allocation to a 
multiple of the huge page size.


So, let's just get rid of the /sys scanning code.

Robert, do you remember why you put the pagesize = 
sysconf(_SC_PAGE_SIZE); call in the new mmap() shared memory allocator?


- Heikki
#include <sys/mman.h>
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	char *ptr;
	int size;

	/* allocation size from the command line, default 100 * 4096 bytes */
	size = (argc > 1) ? atoi(argv[1]) : (100 * 4096);

	ptr = mmap(NULL, size, PROT_READ | PROT_WRITE,
			   MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);

	if (ptr != (void *) -1)
		printf("success: %p\n", ptr);
	else
		printf("failure: %s\n", strerror(errno));

	/* linger so the reservation can be seen in /proc/meminfo */
	sleep(10);

	return 0;
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)

2013-09-16 Thread Andres Freund
On 2013-09-16 16:13:57 +0300, Heikki Linnakangas wrote:
 On 16.09.2013 13:15, Andres Freund wrote:
 On 2013-09-16 11:15:28 +0300, Heikki Linnakangas wrote:
 On 14.09.2013 02:41, Richard Poole wrote:
 The attached patch adds the MAP_HUGETLB flag to mmap() for shared memory
 on systems that support it. It's based on Christian Kruse's patch from
 last year, incorporating suggestions from Andres Freund.
 
 I don't understand the logic in figuring out the pagesize, and the smallest
 supported hugepage size. First of all, even without the patch, why do we
 round up the size passed to mmap() to the _SC_PAGE_SIZE? Surely the kernel
 will round up the request all by itself. The mmap() man page doesn't say
 anything about length having to be a multiple of the page size.
 
 I think it does:
 EINVAL We don't like addr, length, or offset (e.g., they are  too
large,  or  not aligned on a page boundary).
 
 That doesn't mean that they *all* have to be aligned on a page boundary.
 It's understandable that 'addr' and 'offset' have to be, but it doesn't make
 much sense for 'length'.
 
 and
 A file is mapped in multiples of the page size.  For a file that is 
  not a multiple
 of  the  page size, the remaining memory is zeroed when mapped, and 
  writes to that
 region are not written out to the file.  The effect of changing the  
  size  of  the
 underlying  file  of  a  mapping  on the pages that correspond to 
  added or removed
 regions of the file is unspecified.
 
 And no, according to my past experience, the kernel does *not* do any
 such rounding up. It will just fail.
 
 I wrote a little test program to play with different values (attached). I
 tried this on my laptop with a 3.2 kernel (uname -r: 3.10-2-amd64), and on a
 VM with a fresh Centos 6.4 install with 2.6.32 kernel
 (2.6.32-358.18.1.el6.x86_64), and they both work the same:
 
 $ ./mmaptest 100 # mmap 100 bytes
 
 in a different terminal:
 $ cat /proc/meminfo  | grep HugePages_Rsvd
 HugePages_Rsvd:1
 
 So even a tiny allocation, much smaller than any page size, succeeds, and it
 reserves a huge page. I tried the same with larger values; the kernel always
 uses huge pages, and rounds up the allocation to a multiple of the huge page
 size.

When developing the prototype I am pretty sure I had to add the rounding
up - but I am not sure why now, because after chatting with Heikki about
it, I've looked around and the initial MAP_HUGETLB support in the kernel
(commit 4e52780d41a741fb4861ae1df2413dd816ec11b1) has support for
rounding up.

 So, let's just get rid of the /sys scanning code.

Alternatively we could round up NBuffers to actually use the
additionally allocated space. Not sure if that's worth the amount of
code, but wasting several megabytes - or even gigabytes - of memory
isn't nice either.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: add MAP_HUGETLB to mmap() where supported (WIP)

2013-09-16 Thread Andres Freund
On 2013-09-16 15:18:50 +0200, Andres Freund wrote:
  So even a tiny allocation, much smaller than any page size, succeeds, and it
  reserves a huge page. I tried the same with larger values; the kernel always
  uses huge pages, and rounds up the allocation to a multiple of the huge page
  size.
 
 When developing the prototype I am pretty sure I had to add the rounding
 up - but I am not sure why now, because after chatting with Heikki about
 it, I've looked around and the initial MAP_HUGETLB support in the kernel
 (commit 4e52780d41a741fb4861ae1df2413dd816ec11b1) has support for
 rounding up.

Ok, the reason for that seems to have been the following bug
https://bugzilla.kernel.org/show_bug.cgi?id=56881

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: json_populate_record and nested json objects

2013-09-16 Thread Merlin Moncure
On Sat, Sep 14, 2013 at 9:27 PM, chris travers ch...@2ndquadrant.com wrote:
 Hi all;

 Currently json_populate_record and json_populate_recordset cannot work with
 nested json objects.  This creates two fundamental problems when trying to
 use JSON as an interface format.

 The first problem is you can't easily embed a json data type in a json
 object and have it populate a record.  This means that storing extended
 attributes in the database is somewhat problematic if you accept the whole
 row in as a json object.

 The second problem is that nested data structures and json don't go together
 well.  You can't have  a composite type which has as an attribute an array
 of another composite type and populate this from a json object.  This makes
 json largely an alternative to hstore for interfaces in its current shape.

 I would propose handling the json_populate_record and friends as such:

 1. Don't throw errors initially as a pre-check if the json object is nested.
 2. If one comes to a nested fragment, check the attribute type it is going
 into first.
 2.1 If it is a json type, put the nested fragment there.
 2.2 If it is a composite type (i.e. anything in pg_class), push it
 through another json_populate_record run
 2.3 If it is neither, then see if a json::[type] cast exists, if so call
 it.
 2.4 Otherwise raise an exception
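
 For illustration, a sketch of the desired end state (the types and data are
 hypothetical; today the nested array of composites raises an error):

 CREATE TYPE inventory_item AS (name text, price numeric);
 CREATE TYPE inv_order AS (id int, items inventory_item[]);

 SELECT * FROM json_populate_record(null::inv_order,
     '{"id": 1, "items": [{"name": "widget", "price": 9.99}]}');
 -- desired result under this proposal:
 --  id |       items
 -- ----+--------------------
 --   1 | {"(widget,9.99)"}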

 I have a few questions before I go on to look at creating a patch.

 1.  Are there any problems anyone spots with this approach?

 2.  Is anyone working on something like this?

 3.  Would it be preferable to build something like this first as an
 extension (perhaps with different function names) or first as a patch?

Huge +1 on this.  Couple random thoughts:

*) Hard to see how you would structure this as an extension as you're
adjusting the behaviors of existing functions, unless you wanted to
introduce new function names for testing purposes?

*) Would like to at least consider being able to use casting syntax as
a replacement for populate_record and (the missing) populate_array
for most usages.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assertions in PL/PgSQL

2013-09-16 Thread Peter Eisentraut
On 9/15/13 10:49 AM, Marko Tiikkaja wrote:
 On 2013-09-15 16:34, Peter Eisentraut wrote:
 On Sat, 2013-09-14 at 20:47 +0200, Marko Tiikkaja wrote:
 Attached is a patch for supporting assertions in PL/PgSQL.  These are
 similar to the Assert() backend macro: they can be disabled during
 compile time, but when enabled, abort execution if the passed expression
 is not true.

 Doesn't build:
 
 Ugh.  Accidentally edited an auto-generated file.  Fixed in the
 attached, thanks!

Please fix the tabs in the SGML files.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Proposal: UPDATE/DELETE ... WHERE OFFSET n OF cursor_name, was: Re: New ECPG idea, was: Re: [HACKERS] ECPG FETCH readahead

2013-09-16 Thread Boszormenyi Zoltan

Hi,

On 2013-08-17 13:02, Boszormenyi Zoltan wrote:
[snip, discussion of WHERE CURRENT OF in the ECPG client lib]

I had a second thought about it, and the client-side caching and
parsing behind the application's back seem to be overkill.

Instead, I propose a different solution, which is a logical extension of
FETCH { FORWARD | BACKWARD } N, which is a PostgreSQL extension.

The proposed solution would be:

UPDATE / DELETE ... WHERE OFFSET SignedIconst OF cursor_name

I imagine that FETCH would keep the array of TIDs/ItemPointerDatas
of the last FETCH statement.

The argument to OFFSET would mostly be negative,
with 0 being equivalent to WHERE CURRENT OF.

E.g.:

FETCH 2 FROM mycur; -- fetches two rows
UPDATE mytab SET ... WHERE OFFSET -1 OF mycur; -- updates the first row
UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- updates current row

or

FETCH 3 FROM mycur; -- fetches two rows, reaches end of the cursor
UPDATE mytab SET ... WHERE OFFSET -2 OF mycur; -- updates the first row
UPDATE mytab SET ... WHERE OFFSET -1 OF mycur; -- updates the second row
UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- throws an error like WHERE 
CURRENT OF

or

FETCH 3 FROM mycur; -- fetches two rows, reaches end of the cursor
MOVE BACKWARD 2 IN mycur;
UPDATE mytab SET ... WHERE OFFSET 0 OF mycur; -- updates the first row (now 
current)
UPDATE mytab SET ... WHERE OFFSET 1 OF mycur; -- updates the second row

The cached array can be kept valid until the next FETCH statement,
even if the cursor position moves out of the interval covered by the array,
except in case the application changes the sign of the cursor position,
e.g. it previously used MOVE ABSOLUTE with positive numbers and suddenly
switches to backward scanning with MOVE ABSOLUTE and a negative number, or
vice-versa.

This would solve the only source of slowdown in the client-side cursor caching
in ECPG present in my current ECPG cursor readahead patch: there would be
no more MOVE + UPDATE/DELETE WHERE CURRENT OF round trips.  On the other hand,
exploiting this proposed feature in ECPG would make it incompatible with older
servers, unless it detects the server version it connects to and falls back to
the current method.

Comments?

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] insert throw error when year field len > 4 for timestamptz datatype

2013-09-16 Thread Haribabu kommi
On 14 August 2013 Rushabh Lathia wrote:

postgres=# create table test ( a timestamptz);
CREATE TABLE

-- Date with year 1000
postgres=#  insert into test values ( 'Sat Mar 11 23:58:48 1000 IST');
INSERT 0 1

-- Now try with year 10000 it will return error
postgres=#  insert into test values ( 'Sat Mar 11 23:58:48 10000 IST');
ERROR:  invalid input syntax for type timestamp with time zone: Sat Mar 11 
23:58:48 10000 IST
LINE 1: insert into test values ( 'Sat Mar 11 23:58:48 10000 IST');

Here the error comes from the timestamptz_in() -> datefields_to_timestamp() ->
DecodeDateTime() stack.

Looking more at the DecodeDateTime() function, the error comes while trying
to decode the year field, which is 10000 in our test.  For the year field, ftype is
DTK_NUMBER, and under DTK_NUMBER this case drops into the following 
condition:

else if (flen > 4)
{
    dterr = DecodeNumberField(flen, field[i], fmask,
                              &tmask, tm,
                              &fsec, &is2digits);
    if (dterr < 0)
        return dterr;
}

because flen in our case is 5 (10000).

As per the comment above DecodeNumberField(), it interprets a numeric string as a
concatenated date or time field.  So ideally we should get into DecodeNumberField()
only with (fmask & DTK_DATE_M) == 0 or (fmask & DTK_TIME_M) == 0,
right??

So I tried the same, and after that the test works fine.

PFA patch and share your input/suggestions.

The patch applies cleanly to HEAD.  It improves the handling of inserts where 
the year value is more than 4 digits long.
But it doesn't cover all the ways of inserting a year field of more than 4 
digits.  Please check the following test.


postgres=# insert into test values ('10001010 10:10:10 IST');
INSERT 0 1
postgres=# insert into test values ('100011010 10:10:10 IST');
ERROR:  invalid input syntax for type timestamp with time zone: 100011010 
10:10:10 IST at character 26
STATEMENT:  insert into test values ('100011010 10:10:10 IST');
ERROR:  invalid input syntax for type timestamp with time zone: 100011010 
10:10:10 IST
LINE 1: insert into test values ('100011010 10:10:10 IST');
^

I feel it is better to support year fields of more than 4 digits in all 
input paths.

Regards,
Hari babu.


Re: [HACKERS] Proposal: json_populate_record and nested json objects

2013-09-16 Thread Chris Travers


 On 16 September 2013 at 14:43 Merlin Moncure mmonc...@gmail.com wrote:


 Huge +1 on this. Couple random thoughts:

 *) Hard to see how you would structure this as an extension as you're
 adjusting the behaviors of existing functions, unless you wanted to
 introduce new function names for testing purposes?

Yeah, and reading the source, it looks like some parts of the JSON parsing code
will have to be rewritten because the nested object errors are thrown quite
deeply in the parsing stage.  It looks to me as if this will require some
significant copying as a POC into a new file with different publicly exposed
function names.

 *) Would like to at least consider being able to use casting syntax as
 a replacement for populate_record and (the missing) populate_array
 for most usages.

Yes.  I am trying to figure out how best to do this at present.  Initially I
think I would be happy to settle for casts wrapping functions which themselves
just wrap the call to populate_record.

What I will probably do for my POC is expose the following methods:

1.  json_populate_type()
2.  json_populate_array()
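
For illustration, hypothetical calls to those (the names come from the list
above; the exact signatures are my assumption, mirroring the existing
json_populate_record(base, json) convention):

SELECT json_populate_array(null::int[], '[1, 2, 3]'::json);
SELECT json_populate_type(null::point, '"(1,2)"'::json);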

Then we can talk about whether to merge the changes into the core.

This will be an interesting way to get into PostgreSQL hacking.

Best Wishes,
Chris Travers


 merlin


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers
Best Wishes,
Chris Travers
http://www.2ndquadrant.com
PostgreSQL Services, Training, and Support

Re: [HACKERS] Freezing without write I/O

2013-09-16 Thread Heikki Linnakangas

On 27.08.2013 19:37, Heikki Linnakangas wrote:

On 27.08.2013 18:56, Heikki Linnakangas wrote:

Here's an updated patch.


Ah, forgot one thing:

Here's a little extension I've been using to test this. It contains two
functions; one to simply consume N xids, making it faster to hit the
creation of new XID ranges and wraparound. The other,
print_xidlsnranges(), prints out the contents of the current XID-LSN
range array.

Also, just ran into two silly bugs in the patch version I posted, while
checking that that xidtest extension hasn't bitrotted. A fixed version
has been pushed to the git repository, so make sure you use that version
if you want to actually run it.


Here's a rebased version of the patch, including the above-mentioned 
fixes. Nothing else new.


- Heikki

--
- Heikki


xid-lsn-ranges-3.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Kevin Grittner
Hannu Krosing ha...@2ndquadrant.com wrote:

 Lots of people were bitten when (undocumented) hash
 functions were changed thus breaking hash-based partitioning.

Nobody can be affected by the new operators in this patch unless
they choose to use them to compare two records.  They don't work
for any other type and they don't come into play unless you
specifically request them.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: json_populate_record and nested json objects

2013-09-16 Thread Merlin Moncure
On Mon, Sep 16, 2013 at 8:57 AM, Chris Travers ch...@2ndquadrant.com wrote:
 On 16 September 2013 at 14:43 Merlin Moncure mmonc...@gmail.com wrote:


 Huge +1 on this. Couple random thoughts:

 *) Hard to see how you would structure this as an extension as you're
 adjusting the behaviors of existing functions, unless you wanted to
 introduce new function names for testing purposes?

 Yeah, and reading the source, it looks like some parts of the JSON parsing
 code will have to be rewritten because the nested object errors are thrown
 quite deeply in the parsing stage.  It looks to me as if this will require
 some significant copying as a POC into a new file with different publicly
 exposed function names.

ISTM that's not worth it then.

 *) Would like to at least consider being able to use casting syntax as
 a replacement for populate_record and (the missing) populate_array
 for most usages.

 Yes.  I am trying to figure out how best to do this at present.  Initially I
 think I would be happy to settle for casts wrapping functions which
 themselves just wrap the call to populate_record.

right.

 What I will probably do for my POC is expose the following methods:

 1.  json_populate_type()

hm if you're going to name it that way, prefer json_populate_value().
(or maybe _scalar() or _datum()).  but we have to_json(), so how about
from_json()?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Andres Freund
On 2013-09-15 19:49:26 -0400, Noah Misch wrote:
 On Sat, Sep 14, 2013 at 08:58:32PM +0200, Andres Freund wrote:
  On 2013-09-14 11:25:52 -0700, Kevin Grittner wrote:
   Andres Freund and...@2ndquadrant.com wrote:
But both arrays don't have the same binary representation since
the former has a null bitmap, the latter not. So, if you had a
composite type like (int4[]) and would compare that without
invoking operators you'd return something false in some cases
because of the null bitmaps.
   
   Not for the = operator.  The new identical operator would find
   them to not be identical, though.
  
  Yep. And I think that's a problem if exposed to SQL. People won't
  understand the hazards and end up using it because its faster or
  somesuch.
 
 The important question is whether to document the new operator and/or provide
 it under a guessable name.  If we give the operator a weird name, don't
 document it, and put an internal use only comment in the catalogs, that is
 essentially as good as hiding this feature at the SQL level.

Doesn't match my experience.

 Type-specific identity operators seem like overkill, anyway.  If we find that
 meaningless variations in a particular data type are causing too many false
 non-matches for the generic identity operator, the answer is to make the
 functions generating datums of that type settle on a canonical form.  That
 would be the solution for your example involving array null bitmaps.

I think that's pretty much unrealistic. I am pretty sure that if either
of us starts looking we will find at about a dozen of such cases and
miss the other dozen. Not to speak about external code which is damn
likely to contain such cases.
And I think that efficiency will often make such normalization expensive
(consider postgis where Datums afaik can exist with an internal bounding
box or without).

I think it's far more realistic to implement an identity operator that
will fall back to a type specific operator iff equals has strange
properties.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-09-16 Thread Andres Freund
On 2013-08-29 10:39:09 -0400, Robert Haas wrote:
 I have been of the opinion for some time now that the
 shared-invalidation code is not a particularly good design for much of
 what we need.  Waiting for an old snapshot is often a proxy for
 waiting long enough that we can be sure every other backend will
 process the shared-invalidation message before it next uses any of the
 cached data that will be invalidated by that message.  However, it
 would be better to be able to send invalidation messages in some way
 that causes them to processed more eagerly by other backends, and that
 provides some more specific feedback on whether or not they have
 actually been processed.  Then we could send the invalidation
 messages, wait just until everyone confirms that they have been seen,
 which should hopefully happen quickly, and then proceed.

Actually, the shared inval code already has that knowledge, doesn't it?
ISTM all we'd need is a sequence number for SI entries which can be
queried.  Then one can simply wait till all backends have consumed entries
up to that id, keeping track in shmem of the furthest-behind backend.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions about checksum feature in 9.3

2013-09-16 Thread Ants Aasma
On Sun, Sep 15, 2013 at 8:13 AM, Kevin k...@gatorgraphics.com wrote:
 My attempts to compile it vectorized on OS X seemed to have failed since I 
 don't find a vector instruction in the .o file even though the options 
 -msse4.1 -funroll-loops -ftree-vectorize should be supported according to the 
 man page for Apple's llvm-gcc.

I'm not sure what version of LLVM Apple is using for llvm-gcc. I know
that clang+llvm 3.3 can successfully vectorize the checksum algorithm
when -O3 is used.

 So, has anyone compiled checksum vectorized on OS X? Are there any 
 performance data that would indicate whether or not I should worry with this 
 in the first place?

Even without vectorization the worst case performance hit is about
20%. This is for a workload that is fully bottlenecked on swapping
pages in between shared buffers and OS cache. In real world cases it's
hard to imagine it having any measurable effect. A single core can
checksum several gigabytes per second of I/O without vectorization,
and about 30GB/s with vectorization.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Questions about checksum feature in 9.3

2013-09-16 Thread David Johnston
Ants Aasma wrote:
 So, has anyone compiled checksum vectorized on OS X? Are there any
 performance data that would indicate whether or not I should worry with
 this in the first place?
 
 Even without vectorization the worst case performance hit is about
 20%. This is for a workload that is fully bottlenecked on swapping
 pages in between shared buffers and OS cache. In real world cases it's
 hard to imagine it having any measurable effect. A single core can
 checksum several gigabytes per second of I/O without vectorization,
 and about 30GB/s with vectorization.

Thoughts on how/where to provide guidance as to this kind of concern?  The
single paragraph in the initdb documentation seems to be lacking.  Would a
destination page on the wiki, linked to from the documentation, where
current knowledge regarding benchmarks and caveats can be stored, be
appropriate?

To that end, Ants, do you actually have some resources and/or benchmarks
which support your claim and that you can provide links to?

The single core aspect is interesting.  Does the implementation have a
dedicated core to perform these calculations or must the same thread that
handles the relevant query perform this work as well?  How much additional
impact/overhead does having to multitask have on the maximum throughput of a
single core in processing checksums?

This whole vectorization angle also doesn't seem to be in the
documentation...though I didn't look super hard.

David J.



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Questions-about-checksum-feature-in-9-3-tp5770936p5771100.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Merlin Moncure
On Sun, Sep 15, 2013 at 6:49 PM, Noah Misch n...@leadboat.com wrote:
 On Sat, Sep 14, 2013 at 08:58:32PM +0200, Andres Freund wrote:
 On 2013-09-14 11:25:52 -0700, Kevin Grittner wrote:
  Andres Freund and...@2ndquadrant.com wrote:
   But both arrays don't have the same binary representation since
   the former has a null bitmap, the latter not. So, if you had a
   composite type like (int4[]) and would compare that without
   invoking operators you'd return something false in some cases
   because of the null bitmaps.
 
  Not for the = operator.  The new identical operator would find
  them to not be identical, though.

 Yep. And I think that's a problem if exposed to SQL. People won't
 understand the hazards and end up using it because its faster or
 somesuch.

 The important question is whether to document the new operator and/or provide
 it under a guessable name.  If we give the operator a weird name, don't
 document it, and put an internal use only comment in the catalogs, that is
 essentially as good as hiding this feature at the SQL level.

 I'm of two minds on that question.  On the one hand, MV maintenance is hardly
 the first use case for an identity operator.  Any replication system or user
 space materialized view implementation might want this.  On the other hand,
 offering it for the record type exclusively is surprising.  It's also
 surprising how records with different numbers of dropped columns can be found
 identical, even though a record column within the top-level record is not
 permitted to vary that way.

 Supposing a decision to document the operator, a second question is whether
 === is the right name:

I vote to reserve '===' as shorthand for 'IS NOT DISTINCT FROM' and
give the binary equality operator a funky name.  I would document the
operator though.
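
To spell that out (the first two statements are runnable today; '===' is
only the proposed spelling, not an existing operator):

SELECT NULL = NULL;                     -- NULL: equality involving NULL is unknown
SELECT NULL IS NOT DISTINCT FROM NULL;  -- true: DISTINCT treats NULLs as equal
-- Under this suggestion, a === b would simply mean a IS NOT DISTINCT FROM b,
-- rather than a byte-image comparison.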

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-09-16 Thread Andres Freund
Hi,

Looking at this version of the patch now:
1) comment for Phase 4 of REINDEX CONCURRENTLY ends with an incomplete
sentence.

2) I don't think the drop algorithm used now is correct. Your
index_concurrent_set_dead() sets both indisvalid = false and indislive =
false at the same time. It does so after doing a WaitForVirtualLocks() -
but that's not sufficient. Between waiting and setting indisvalid =
false another transaction could start which then would start using that
index. Which will not get updated anymore by other concurrent backends
because of indislive = false.
You really need to follow index_drop's lead here and first unset
indisvalid then wait till nobody can use the index for querying anymore
and only then unset indislive.

3) I am not sure if the swap algorithm used now actually is correct
either. We have mvcc snapshots now, right, but we're still potentially
taking separate snapshot for individual relcache lookups. What's
stopping us from temporarily ending up with two relcache entries with
the same relfilenode?
Previously you swapped names - I think that might end up being easier,
because having names temporarily confused isn't as bad as two indexes
manipulating the same file.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:

 I think it's far more realistic to implement an identity operator
 that will fall back to a type specific operator iff equals has
 strange properties.

My biggest problem with that approach is that it defaults to
incorrect behavior for types for which user-visible differences
among equal values are possible, and requires per-type fixes to
get correct behavior.  The approach in this patch provides correct
behavior as the default, with possible per-type changes for
optimization, as people find that desirable.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve setup for documentation building with FOP

2013-09-16 Thread Alvaro Herrera
Peter Eisentraut wrote:
 In order to modernize our documentation tool chain, I have made a few
 tweaks to allow using FOP to build the PDF documentation.  I'd like to
 get some testing on different operating systems and versions thereof in
 order to learn whether this solution is robust and easily accessible by
 users and developers, and also whether the resulting documents are
 correct and free of major rendering problems.  I'll add it to the
 commitfest for that.  Reviewers should ideally just have to install the
 fop package that is hopefully supplied by their operating systems and
 then run the make targets mentioned in the patch.  I'm aware of a few
 potential pitfalls, but I'd rather not mention them yet in order to get
 unbiased reports.

The FOP-based build works fine for me.  I gave the output a look.  I
like that text formatted with fixed-width font wraps at the right
margin, instead of continuing beyond it; there are some strange
artifacts about it (such as addition of hyphens in some places, say in
the middle of a configure option); but I think it's nicer than the
output of the other toolchain nevertheless.  This willingness to break
in the middle of pre formatted elements looks weird in some places;
for example table 17.2 SSL Server File Usage; old output puts the option
name on first line, file path on second line; new output puts option
name on top, file name is split in half.

In the US size PDF, look at page 386; below the gmake install-world
there's the bottom-of-page horizontal line, but the paragraph continues
below that.  Bug?

Section 17.8 Encryption Options is formatted differently; previous
method used indentation for the paragraph after each item, new one uses
a table.  I like the old method better.  This is notoriously bad in the
superuser_reserved_connections entry in 18.3.1. Connection Settings,
and others.  Also, the synopsis in PQconnectStartParams entry in 31.1
Database Connection Control Functions looks a lot worse.  This
manifests in many places, and while it's an improvement in a minority of
them, I think it's a net loss overall.

I like that the new output has horizontal lines at top and bottom of the
text area of the page.  In the old method, the page heading (above that
line) contains the chapter number and title, while in FOP it only has
title (no number).  I find that number useful.  Also, limiting the space
for the title and wrapping if the title is too long seems pointless; see
example of this in chapter High Availability, Load Balancing and
Replication, where even the word Bal-ancing has been hyphenated.

Formatting of note, tip and such seems a lot better in FOP.  For
warning, the old method used a surrounding box; the new one just has a
Warning title and indented paragraph, just like for note et al.
Doesn't look like a problem to me.  It'd be nice to have pretty icons
for these areas.

Formatting of ref links is nicer: for instance, you get something like
  which can make those operations much faster (see
  Section 14.4.7, “Disable WAL Archival and Streaming Replica-
  tion”).
instead of 
  which can make those operations much faster (see Section 14.4.7).
Not sure if there are cases in which this can be a problem.

[aside: I really wish we didn't paste verbatim psql output in SGML docs ]

sidebar renders as grey-background box in FOP, white-background in old
tool.  Not sure about this; it looks like the only place with grey
background in the whole manual.  We only have one sidebar in the
entire SGML source.

Example 19.1 Example pg_hba.conf Entries is completely broken (no page
break); renders normally in old method.  There are other cases of this,
including libpq sample programs and ECPG.

url renders differently: it used to produce a footnote.  FOP instead
puts the link text inline.  Not really sure about this.

The table 25.1 High Availability, Load Balancing, and Replication
Feature Matrix contains a lot of [bull], which look odd.  Presumably
an image of a bull head would be better?

Tables 27-13 and 27-14 are misformatted in both implementations.  Surely
we can do better than overlaying the contents of cells ...

The START_REPLICATION stuff in the Frontend/Backend Protocol chapter is
completely broken.  Maybe wrong markup?  Also in StartupMessage.

Seems the author element resulted in nothing in the old toolchain, but now it
does print the name of the author.  There are only two authors mentioned, in
NLS and GEQO, though.  In the GEQO case it's a bit funny because the
author is now mentioned twice.

Speaking of GEQO: the links in its 53.4 Further Reading section don't
look well in the FOP.  And the bibliography looks completely
different.

Oh, the index at the end is not output.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Hannu Krosing
On 09/16/2013 04:01 PM, Kevin Grittner wrote:
 Hannu Krosing ha...@2ndquadrant.com wrote:

 Lots of people were bitten when (undocumented) hash
 functions were changed thus breaking hash-based partitioning.
 Nobody can be affected by the new operators in this patch unless
 they choose to use them to compare two records.  They don't work
 for any other type and they don't come into play unless you
 specifically request them.
What I meant is that rather than leave it really undocumented,
document it as a system function for specific usage, which has caveats
and may change in future versions: use at your own risk and
make sure you know what you are doing.

PostgreSQL has good enough introspection features that people
tend to find functions and operators using psql's \d ...

 --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Where to load modules from?

2013-09-16 Thread Greg Stark
On 15 Sep 2013 18:55, Andrew Dunstan and...@dunslane.net wrote:


 On 09/15/2013 05:52 PM, Jeff Janes wrote:

 On Sun, Sep 15, 2013 at 6:51 AM, Peter Eisentraut pete...@gmx.net wrote:

 On Sat, 2013-09-14 at 22:15 +0200, Dimitri Fontaine wrote:
 
  This proposal comes with no patch because I think we are able to
  understand it without that, so that it would only be a waste of
  everybody's time to attach code for a random solution on the
 list here
  to that email.

 It shouldn't be in the commit fest if it has no patch.


 I thought the general recommendation was the opposite, that planning and
road maps should be submitted for review before non-trivial coding is
started; and that despite the name the commitfest is the best way that this
is done. Of course now I can't find the hackers thread where this
recommendation was made...



 It is unquestionably correct that roadmaps and planning should be made
available for review and discussion. But the assertion that this should be
done via the commitfest is not. The commitfest app has never been for
anything other than code, that I am aware of, and I am quite sure you will
find fierce resistance to any notion that design discussions should take
place anywhere but on this mailing list.

Well the code reviews should also go via the list so that's neither here
nor there.

One of the original problems the commitfest was aiming to solve was that
people would have a project, make some tentative progress, ask if they're
on the right track or how to tackle some problem, hear nothing until
feature freeze, at which point the original author had moved on and dropped
the project.

In other words, forcing the issue is one of the original design goals of
commitfests.


Re: [HACKERS] record identical operator

2013-09-16 Thread Kevin Grittner
Hannu Krosing ha...@2ndquadrant.com wrote:

 What I meant is that rather than leave it really undocumented,
 document it as a system function for specific usage, which has caveats
 and may change in future versions: use at your own risk and
 make sure you know what you are doing.

Well, that was my original assumption and intention; but when I
went to look for where the operators for record *equals* were
defined, I found that we had apparently chosen to leave them
undocumented.  Oddly, under a section titled Row-wise Comparison
we only document the behavior of comparisons involving what the SQL
spec calls row value constructor.   I asked whether that was
intentional, and heard only the chirping of crickets:

http://www.postgresql.org/message-id/1378848776.70700.yahoomail...@web162902.mail.bf1.yahoo.com

If we choose not to document the equals operator for records, it
hardly makes sense to document the identical operator for records.

 PostgreSQL has good enough introspection features that people
 tend to find functions and operators using psql-s \d ...

One would think so, yet I don't recall seeing anyone posting
regarding the existing undocumented record comparison operators. 
Nor do I recall seeing anyone posting about the undocumented
pattern comparison operators.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Andres Freund
On 2013-09-16 10:58:01 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  On 2013-09-16 10:46:53 -0700, Kevin Grittner wrote:
  I don't recall seeing anyone posting
  regarding the existing undocumented record comparison operators.
  Nor do I recall seeing anyone posting about the undocumented
  pattern comparison operators.
 
  I've used and have seen them being used in client code...
 
 Which, the record operators or the text pattern operators (which
 ignore collations and multi-byte character considerations and just
 use memcmp())?

I was referring to the text pattern ops. But I've seen the record ones
as well.

 http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/varlena.c;h=5e2c2ddc532c604a05f365f0cf6761033a35be76;hb=master#l1719
 
 Is the fact that you have seen them used in client code even though
 they are not documented an argument for or against adding
 documentation for them?

Neither. It's an argument that providing operators with confusing
behaviours will not go unnoticed/unused.

I don't really think the record identical and pattern ops situations are
comparable. The latter has a well defined behaviour (using the C locale
basically) and is only useable for types where it's well defined. The
proposed identical operator returns false for comparisons of actually
equal data, that's quite different imo.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] \h open

2013-09-16 Thread Kevin Grittner
Oleg Bartunov obartu...@gmail.com wrote:

 I noticed there is nothing available in built-in psql help about
 OPEN command. Does it intentional ?

 postgres=# \h open
 No help available for open.
 Try \h with no arguments to see available help.

PostgreSQL does not include OPEN as a SQL command:

http://www.postgresql.org/docs/current/interactive/sql-commands.html

DECLARE CURSOR opens the cursor immediately.

Some PLs include an OPEN statement, but psql does not have help for all the PLs.

As an example, plpgsql supports OPEN:

http://www.postgresql.org/docs/current/interactive/plpgsql-cursors.html#PLPGSQL-CURSOR-OPENING
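
A minimal sketch of the difference, runnable as-is:

-- SQL level: DECLARE opens the cursor immediately (inside a transaction):
BEGIN;
DECLARE c CURSOR FOR SELECT relname FROM pg_class;
FETCH 1 FROM c;
COMMIT;

-- PL/pgSQL: the cursor variable is declared first; OPEN happens later:
DO $$
DECLARE
    c refcursor;
    r record;
BEGIN
    OPEN c FOR SELECT relname FROM pg_class LIMIT 1;
    FETCH c INTO r;
    CLOSE c;
END
$$;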

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Andres Freund
On 2013-09-16 10:46:53 -0700, Kevin Grittner wrote:
 One would think so, yet I don't recall seeing anyone posting
 regarding the existing undocumented record comparison operators. 
 Nor do I recall seeing anyone posting about the undocumented
 pattern comparison operators.

I've used and have seen them being used in client code...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-16 10:46:53 -0700, Kevin Grittner wrote:
 I don't recall seeing anyone posting
 regarding the existing undocumented record comparison operators.
 Nor do I recall seeing anyone posting about the undocumented
 pattern comparison operators.

 I've used and have seen them being used in client code...

Which, the record operators or the text pattern operators (which
ignore collations and multi-byte character considerations and just
use memcmp())?

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/utils/adt/varlena.c;h=5e2c2ddc532c604a05f365f0cf6761033a35be76;hb=master#l1719

Is the fact that you have seen them used in client code even though
they are not documented an argument for or against adding
documentation for them?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Merlin Moncure
On Mon, Sep 16, 2013 at 12:46 PM, Kevin Grittner kgri...@ymail.com wrote:
 Hannu Krosing ha...@2ndquadrant.com wrote:

 What I meant is that rather than leave it really undocumented,
 document it as system function for specific usage, has caveats
 and may change in future versions. use at your own risk and
 make sure you know what you are doing

 Well, that was my original assumption and intention; but when I
 went to look for where the operators for record *equals* were
 defined, I found that we had apparently chosen to leave them
 undocumented.  Oddly, under a section titled Row-wise Comparison
 we only document the behavior of comparisons involving what the SQL
 spec calls row value constructor.   I asked whether that was
 intentional, and heard only the chirping of crickets:

 http://www.postgresql.org/message-id/1378848776.70700.yahoomail...@web162902.mail.bf1.yahoo.com

 If we choose not to document the equals operator for records, it
 hardly makes sense to document the identical operator for records.

 PostgreSQL has good enough introspection features that people
 tend to find functions and operators using psql-s \d ...

 One would think so, yet I don't recall seeing anyone posting
 regarding the existing undocumented record comparison operators.
 Nor do I recall seeing anyone posting about the undocumented
 pattern comparison operators.

This behavior came about via a gripe of mine (but mostly courtesy of Tom
Lane): 
http://www.postgresql.org/message-id/6EE64EF3AB31D5448D0007DD34EEB34101AEF5@Herge.rcsinc.local

It brought row-wise comparison closer to (if not exactly to) the SQL spec.
The underlying use-case is to do ISAM-like record iteration over the
index.  index being the operative word.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Kevin Grittner
Merlin Moncure mmonc...@gmail.com wrote:
 Kevin Grittner kgri...@ymail.com wrote:

 I don't recall seeing anyone posting regarding the existing
 undocumented record comparison operators.

 This behavior came about via a gripe of mine (but mostly courtesy
 Tom Lane_:
 http://www.postgresql.org/message-id/6EE64EF3AB31D5448D0007DD34EEB34101AEF5@Herge.rcsinc.local

 It brought row-wise comparon closer- (if not exactly to-) SQL
 spec.  The underlying use-case is to do ISAM-like record
 iteration over the index.  index being the operative word.

Indeed.  The documented behavior complies with the spec (and is
extremely useful!), but only fully matches the behavior when one of
the arguments is a row value constructor.  The documentation sort
of acknowledges this by describing the arguments not as row or
record, but as row_constructor.  Comparison operators which
followed those exact rules would not be usable for indexing, so we
have slightly different semantics to support that.  I have no gripe
whatsoever with any of that.  I'm even OK with not documenting this
behavior, since it might do more to confuse than elucidate.  I did
ask the question, because if we choose not to document the behavior
of record comparisons based around the concept of record equals,
I don't think it makes sense to document record comparisons based
around the concept of record image equals.  That is the
terminology used in the function names and opfamily, but I have
been using identical in discussion, which perhaps has confused
the issue.

If we take documentation of record comparisons farther, we need to
be prepared to explain why this is correct (because it is required
by spec):

test=# select row(1, null::text) = row(1, null::text);
 ?column? 
----------
 
(1 row)

... but this is also correct (because otherwise indexes don't
work):

test=# select row(1, null::text) = row(1, null::text)::record;
 ?column? 
----------
 t
(1 row)
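
To make the contrast concrete with a value type (the = comparison is
runnable today; the === line assumes the operator name proposed in this
patch):

select '1.0'::numeric = '1.00'::numeric;              -- true: equal values
-- The stored datums differ (display scale is part of a numeric datum), so
-- the proposed identical operator would report:
-- select row('1.0'::numeric) === row('1.00'::numeric);  -- false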

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-16 10:58:01 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-16 10:46:53 -0700, Kevin Grittner wrote:
 I don't recall seeing anyone posting regarding the existing
 undocumented record comparison operators.  Nor do I recall
 seeing anyone posting about the undocumented pattern
 comparison operators.

 I've used and have seen them being used in client code...

 Which, the record operators or the text pattern operators (which
 ignore collations and multi-byte character considerations and
 just use memcmp())?

 I was referring to the text pattern ops. But I've seen the record
 ones as well.

 Is the fact that you have seen them used in client code even
 though they are not documented an argument for or against adding
 documentation for them?

 Neither. It's an argument that providing operators with confusing
 behaviours will not go unnoticed/unused.

It is true that there are likely to be some who find the new
operators useful and will read the code to determine how to use
them, just as they apparently did for the undocumented operators
which already exist.

 I don't really think the record identical and pattern ops
 situations are comparable. The latter has a well defined
 behaviour (using the C locale basically) and is only useable for
 types where it's well defined.

But it is *not* restricted to text in the C locale.  It uses
memcmp() on text values even if they contain multi-byte characters
and use collations which require different sequencing.
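
A small runnable illustration of that point, assuming a UTF-8 database and
an installed de_DE collation:

SELECT 'ä' < 'b' COLLATE "de_DE";  -- true: linguistic ordering sorts ä near a
SELECT 'ä' ~<~ 'b';                -- false: memcmp on the UTF-8 bytes compares
                                   -- 0xC3 0xA4 against 0x62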

 The proposed identical operator returns false for comparisons of
 actually equal data,

That is the point of it.  It tells you whether the record images
are byte-for-byte identical for all values within the record.  In
some cases a detected difference clearly causes a user-visible
difference; in others it is merely a performance or storage-space
difference.  For example in your array example the array with the
unnecessary bitmap attached to it takes an extra eight bytes on
disk, for no apparent benefit.  The point is that code which wants
to know whether those are identical images can detect that they are
not.  This works by default for all types, without having to track
down which types are capable of doing this for equal values or in
which cases the differences are user-visible.

 that's quite different imo.

Sorting a string based on its byte image, with no regard to its
collation or character boundaries, is what these text pattern
operators do, and it is exactly what my patch does.  My patch just
provides such a test for all data types within a record by default,
rather than requiring a new opfamily for every type.

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git apply vs patch -p1

2013-09-16 Thread Andrew Gierth
 Josh == Josh Berkus j...@agliodbs.com writes:

 Josh The issue isn't that, it's that git apply is just buggy and
 Josh can't tell the difference between a new file and a modified
 Josh one.

It's not the fault of git apply; the patch contained explicit
annotations on all the files claiming that they were new. Both the
patches I've looked at so far (picksplit NaNs and enable_material)
had the same defect.

The question is, how are these submitters preparing their patches?

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git apply vs patch -p1

2013-09-16 Thread Josh Berkus
On 09/15/2013 11:46 PM, Ashutosh Bapat wrote:
 On Sun, Sep 15, 2013 at 12:38 AM, Andres Freund and...@2ndquadrant.comwrote:
 git reset?


 git reset wouldn't remove the files that were newly added by the patch,
 would it?

The issue isn't that, it's that git apply is just buggy and can't tell
the difference between a new file and a modified one.

The points patch contained no new files, just modifications.  But for
some reason, git apply read it as being all new files, which failed.

patch -p1 worked fine.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assertions in PL/PgSQL

2013-09-16 Thread Pavel Stehule
Hello

a few other comments:

1. You disable assertions at compile time depending on the enable_assertions
variable.  I don't think that is a good idea.  When somebody enables
assertions, they will not take effect in functions already cached in the
session.  You should check whether assertions are enabled at execution time
(there is no significant advantage to doing it at compile time), or you
should invalidate the cache.

2. A failed assert should raise an exception that cannot be handled by any
exception handler -- similar to ERRCODE_QUERY_CANCELED -- see
exception_matches_conditions.
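
A sketch of the hazard in point 2 (hypothetical: ASSERT is the syntax from
this patch, and the catch-all handler below would silently swallow the
failure unless the error is made uncatchable, like ERRCODE_QUERY_CANCELED):

CREATE FUNCTION f() RETURNS void AS $$
BEGIN
    ASSERT false;
EXCEPTION
    WHEN OTHERS THEN
        NULL;  -- hides the assertion failure
END
$$ LANGUAGE plpgsql;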

Regards

Pavel


2013/9/14 Marko Tiikkaja ma...@joh.to

 Hi,

 Attached is a patch for supporting assertions in PL/PgSQL.  These are
 similar to the Assert() backend macro: they can be disabled during compile
 time, but when enabled, abort execution if the passed expression is not
 true.

 A simple example:

 CREATE FUNCTION delete_user(username text) RETURNS VOID AS $$
 BEGIN
 DELETE FROM users WHERE users.username = delete_user.username;
 ASSERT FOUND;
 END
 $$ LANGUAGE plpgsql;

 SELECT delete_user('mia');
 ERROR:  Assertion on line 4 failed
 CONTEXT:  PL/pgSQL function delete_user(text) line 4 at ASSERT


 Again, I'll add this to the open commitfest, but feedback is greatly
 appreciated.


 Regards,
 Marko Tiikkaja


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] git apply vs patch -p1

2013-09-16 Thread Jeff Janes
On Mon, Sep 16, 2013 at 12:11 PM, Andrew Gierth and...@tao11.riddles.org.uk
 wrote:

  Josh == Josh Berkus j...@agliodbs.com writes:

  Josh The issue isn't that, it's that git apply is just buggy and
  Josh can't tell the difference between a new file and a modified
  Josh one.

 It's not the fault of git apply; the patch contained explicit
 annotations on all the files claiming that they were new. Both the
 patches I've looked at so far (picksplit NaNs and enable_material)
 had the same defect.

 The question is, how are these submitters preparing their patches?


I used git diff configured to use src/tools/git-external-diff, as
described here:

http://wiki.postgresql.org/wiki/Working_with_Git

The resulting patch applies fine with patch, but not with git apply.

If I instead generate a patch with git diff --no-ext-diff, then it applies
with git apply.

Cheers,

Jeff


Re: [HACKERS] Updatable view columns

2013-09-16 Thread Marko Tiikkaja

Hi Dean,

First of all, thanks for working on this!

The patch compiles and applies fine (though with offsets).  The feature 
is in the SQL standard, and further improves an existing feature.


Some stuff I've spotted and fixed in the attached patch (hope you don't 
mind, thought it'd be easier to just fix these rather than describing 
them in email):

   - Some typos I've spotted in some of the comments
   - Fixed escaping of _ in regression tests

Other than these, the patch looks good to me.  Will wait for your 
thoughts and an updated patch before marking ready for committer.



Regards,
Marko Tiikkaja
diff --git a/doc/src/sgml/ref/create_view.sgml 
b/doc/src/sgml/ref/create_view.sgml
index 4505290..e0fbe1e 100644
--- a/doc/src/sgml/ref/create_view.sgml
+++ b/doc/src/sgml/ref/create_view.sgml
@@ -440,7 +440,7 @@ CREATE VIEW pg_comedies AS
 <programlisting>
 CREATE VIEW comedies AS
 SELECT f.*,
-   country_code_to_name(f.country_code) AS country
+   country_code_to_name(f.country_code) AS country,
(SELECT avg(r.rating)
 FROM user_ratings r
 WHERE r.film_id = f.id) AS avg_rating
diff --git a/src/backend/rewrite/rewriteHandler.c b/src/backend/rewrite/rewriteHandler.c
index 4c8c4f1..a35e63e 100644
--- a/src/backend/rewrite/rewriteHandler.c
+++ b/src/backend/rewrite/rewriteHandler.c
@@ -625,7 +625,7 @@ adjustJoinTreeList(Query *parsetree, bool removert, int rt_index)
  * table, but it's more convenient to do it here while we still have easy
  * access to the view's original RT index.)  This is only necessary for
  * trigger-updatable views, for which the view remains the result relation of
- * the query.  For auto-updatable we must not do this, since it might add
+ * the query.  For auto-updatable views we must not do this, since it might add
  * assignments to non-updatable view columns.  For rule-updatable views it is
  * unnecessary extra work, since the query will be rewritten with a different
  * result relation which will be processed when we recurse via RewriteQuery.
@@ -2102,7 +2102,7 @@ view_query_is_auto_updatable(Query *viewquery, bool security_barrier,
 
 
 /*
- * view_cols_are_auto_updatable - test whether the all of the required columns
+ * view_cols_are_auto_updatable - test whether all of the required columns
  * of an auto-updatable view are actually updatable. Returns NULL (if all the
  * required columns can be updated) or a message string giving the reason that
  * they cannot be.
@@ -2111,7 +2111,7 @@ view_query_is_auto_updatable(Query *viewquery, bool security_barrier,
  * assign to any non-updatable columns.
  *
  * Additionally it may be used to retrieve the set of updatable columns in the
- * view or, if one or more of the required columns is not updatable, the name
+ * view, or if one or more of the required columns is not updatable, the name
  * of the first offending non-updatable column.
  *
  * The caller must have already verified that this is an auto-updatable view
diff --git a/src/test/regress/expected/updatable_views.out b/src/test/regress/expected/updatable_views.out
index e463b43..c725bba 100644
--- a/src/test/regress/expected/updatable_views.out
+++ b/src/test/regress/expected/updatable_views.out
@@ -30,7 +30,7 @@ CREATE VIEW ro_view20 AS SELECT * FROM seq; -- View based on a sequence
 CREATE VIEW ro_view21 AS SELECT a, b, generate_series(1, a) g FROM base_tbl; -- SRF in targetlist not supported
 SELECT table_name, is_insertable_into
   FROM information_schema.tables
- WHERE table_name LIKE E'r_\_view%'
+ WHERE table_name LIKE E'r_\\_view%'
  ORDER BY table_name;
  table_name | is_insertable_into 
 ------------+--------------------
@@ -59,7 +59,7 @@ SELECT table_name, is_insertable_into
 
 SELECT table_name, is_updatable, is_insertable_into
   FROM information_schema.views
- WHERE table_name LIKE E'r_\_view%'
+ WHERE table_name LIKE E'r_\\_view%'
  ORDER BY table_name;
  table_name | is_updatable | is_insertable_into 
 ------------+--------------+--------------------
@@ -88,7 +88,7 @@ SELECT table_name, is_updatable, is_insertable_into
 
 SELECT table_name, column_name, is_updatable
   FROM information_schema.columns
- WHERE table_name LIKE E'r_\_view%'
+ WHERE table_name LIKE E'r_\\_view%'
  ORDER BY table_name, ordinal_position;
  table_name |  column_name  | is_updatable 
 ------------+---------------+--------------
@@ -1222,7 +1222,7 @@ SELECT * FROM base_tbl ORDER BY a;
 
 SELECT table_name, is_insertable_into
   FROM information_schema.tables
- WHERE table_name LIKE E'r_\_view%'
+ WHERE table_name LIKE E'r_\\_view%'
  ORDER BY table_name;
  table_name | is_insertable_into 
 ------------+--------------------
@@ -1233,7 +1233,7 @@ SELECT table_name, is_insertable_into
 
 SELECT table_name, is_updatable, is_insertable_into
   FROM information_schema.views
- WHERE table_name LIKE E'r_\_view%'
+ WHERE table_name LIKE E'r_\\_view%'
  ORDER BY table_name;
  table_name | is_updatable | is_insertable_into 
 

Re: [HACKERS] Possible memory leak with SQL function?

2013-09-16 Thread Greg Stark
Noah, this is the kind of memory leak I was referring to, the kind it would
be nice if valgrind could help with. I'm not sure exactly what that would
look like, though; I've never tried writing support code for valgrind to
deal with custom allocators.

-- 
greg
On 16 Sep 2013 15:38, Yeb Havinga yebhavi...@gmail.com wrote:

 On 2013-09-13 18:32, Robert Haas wrote:

 On Thu, Sep 12, 2013 at 5:29 AM, Yeb Havinga yebhavi...@gmail.com
 wrote:

 Is the following known behaviour, or should I put some time into writing
 a self-contained test case?

 We have a function that takes a value and returns a ROW type. With the
 function implemented in language SQL, when executing this function in a
 large transaction, memory usage of the backend process increases.
 MemoryContextStats showed a lot of SQL function data. Debugging
 init_sql_fcache() showed that it was for the same function oid each time,
 and the oid was the function from value to ROW type.

 When the function is implemented in PL/pgSQL, the memory usage was much
 less.

 I'm sorry I cannot be more specific at the moment, such as how much less
 memory the PL/pgSQL function uses, or whether there are as many 'SQL
 function data' contexts as there are calls to the SQL function, because I
 would have to write a test case for this. I was just wondering if this is
 known behavior of SQL functions vs. PL/pgSQL functions, or if it could be
 a bug?

 It sounds like a bug to me, although I can't claim to know everything
 there is to know about this topic.

 I spent some time writing a test case, but failed to make one that showed
 the memory difference I described upthread; in contrast, in the test below
 the SQL function actually shows a smaller memory footprint than the
 plpgsql counterpart. This test case only demonstrates that in a long
 running transaction, calling sql or plpgsql functions causes increasing
 memory usage that is not released until after commit.

 callit.sql:
 --
 DO
 $$
 DECLARE  b text;
  i int;
 BEGIN
 --   SELECT 'a' into b; -- memory constant
i := fp('a'); -- memory increases
 --   i := fs('a'); -- memory increases but slow
 END;
 $$ LANGUAGE plpgsql;
 -

 sqlvsplpgsql.sql:
 -
 CREATE OR REPLACE FUNCTION fp (a text)
  RETURNS int
  AS $$
 DECLARE result int;
 BEGIN
 SELECT 10 INTO result;
 RETURN result;
 END;
 $$
 LANGUAGE plpgsql;
 CREATE OR REPLACE FUNCTION fs (a text)
  RETURNS int
  AS $$
  SELECT 10;
 $$
 LANGUAGE sql;
 \i callit.sql
 -


 rm /tmp/ff /tmp/ff2 ; cp callit.sql /tmp/ff ; cat /tmp/ff /tmp/ff > /tmp/ff2;
 cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
 cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
 cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
 cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
 cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
 cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
 cat /tmp/ff2 /tmp/ff2 > /tmp/ff

 psql -1 postgres -f /tmp/ff

 Then watch htop in another terminal.


 --
 Yeb Havinga
 http://www.mgrid.net/
 Mastering Medical Data



 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Proposal: json_populate_record and nested json objects

2013-09-16 Thread Andrew Dunstan


On 09/16/2013 09:57 AM, Chris Travers wrote:


 On 16 September 2013 at 14:43 Merlin Moncure mmonc...@gmail.com wrote:



 Huge +1 on this. A couple of random thoughts:

 *) Hard to see how you would structure this as an extension as you're
 adjusting the behaviors of existing functions, unless you wanted to
 introduce new function names for testing purposes?
Yeah, and reading the source, it looks like some parts of the JSON 
parsing code will have to be rewritten because the nested object 
errors are thrown quite deeply in the parsing stage.  It looks to me 
as if this will require some significant copying as a POC into a new 
file with different publicly exposed function names.
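
For example (my sketch, with hypothetical type names; as I understand it,
9.3 raises an error for the nested object rather than recursing into it):

CREATE TYPE inner_t AS (a int, b text);
CREATE TYPE outer_t AS (x int, y inner_t);

SELECT * FROM json_populate_record(NULL::outer_t,
                                   '{"x": 1, "y": {"a": 2, "b": "c"}}');
-- today: an error; under the proposal: y would be populated as (2,c)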



I don't believe any of the parsing code should require changing at all. 
The event handlers that the parser calls would need to be changed. If 
you're at PostgresOpen you should be attending my talk which includes an 
example of how to use this API.


cheers

andrew




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Andres Freund
On 2013-09-16 16:58:21 -0400, Noah Misch wrote:
 memcmp() has served well for HOT and for _equalConst(); why would it suddenly
 fall short for MV maintenance?

I don't have a problem using it internally, I have a problem exposing
the capability to sql. Don't tell me that's the same.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible memory leak with SQL function?

2013-09-16 Thread Yeb Havinga

On 2013-09-13 18:32, Robert Haas wrote:

On Thu, Sep 12, 2013 at 5:29 AM, Yeb Havinga yebhavi...@gmail.com wrote:

Is the following known behaviour, or should I put some time into writing a
self-contained test case?

We have a function that takes a value and returns a ROW type. With the
function implemented in language SQL, when executing this function in a
large transaction, memory usage of the backend process increases.
MemoryContextStats showed a lot of SQL function data. Debugging
init_sql_fcache() showed that it was for the same function oid each time,
and the oid was the function from value to ROW type.

When the function is implemented in PL/pgSQL, the memory usage was much
less.

I'm sorry I cannot be more specific at the moment, such as how much less
memory the PL/pgSQL function uses, or whether there are as many 'SQL
function data' contexts as there are calls to the SQL function, because I
would have to write a test case for this. I was just wondering if this is
known behavior of SQL functions vs. PL/pgSQL functions, or if it could be a bug?

It sounds like a bug to me, although I can't claim to know everything
there is to know about this topic.

I spent some time writing a test case, but failed to make one that showed 
the memory difference I described upthread; in contrast, in the test below 
the SQL function actually shows a smaller memory footprint than the 
plpgsql counterpart. This test case only demonstrates that in a long 
running transaction, calling sql or plpgsql functions causes increasing 
memory usage that is not released until after commit.


callit.sql:
--
DO
$$
DECLARE  b text;
 i int;
BEGIN
--   SELECT 'a' into b; -- memory constant
   i := fp('a'); -- memory increases
--   i := fs('a'); -- memory increases but slow
END;
$$ LANGUAGE plpgsql;
-

sqlvsplpgsql.sql:
-
CREATE OR REPLACE FUNCTION fp (a text)
 RETURNS int
 AS $$
DECLARE result int;
BEGIN
SELECT 10 INTO result;
RETURN result;
END;
$$
LANGUAGE plpgsql;
CREATE OR REPLACE FUNCTION fs (a text)
 RETURNS int
 AS $$
 SELECT 10;
$$
LANGUAGE sql;
\i callit.sql
-


rm /tmp/ff /tmp/ff2 ; cp callit.sql /tmp/ff ; cat /tmp/ff /tmp/ff > /tmp/ff2;
cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
cat /tmp/ff2 /tmp/ff2 > /tmp/ff; cat /tmp/ff /tmp/ff > /tmp/ff2;
cat /tmp/ff2 /tmp/ff2 > /tmp/ff


psql -1 postgres -f /tmp/ff

Then watch htop in another terminal.


--
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] git apply vs patch -p1

2013-09-16 Thread Andrew Gierth
>>>>> "Jeff" == Jeff Janes jeff.ja...@gmail.com writes:

 Jeff> I used git diff configured to use
 Jeff> src/tools/git-external-diff, as described here:

hmm...

so that git-external-diff script is trying to fake git diff output,
including using 'diff --git' and index lines, but the context-diff
output wouldn't work with git apply even if the header lines were
correct (as far as I can see git apply accepts only git's unified-diff
format - the git people clearly have no truck with context diffs).

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add use of asprintf()

2013-09-16 Thread Alvaro Herrera
Peter Eisentraut wrote:

 The attached patch should speak for itself.

Yeah, it's a very nice cleanup.

 I have supplied a few variants:
 
 - asprintf() is the standard function, supplied by libpgport if
 necessary.
 
 - pg_asprintf() is asprintf() with automatic error handling (like
 pg_malloc(), etc.)
 
 - psprintf() is the same idea but with palloc.

Looks good to me, except that pg_asprintf seems to be checking ret
instead of rc.

Is there a reason for the API discrepancy of pg_asprintf vs. psprintf?
I don't see that we use the integer return value anywhere.  Callers
interested in the return value can use asprintf directly (and you have
already inserted callers that do nonstandard things using direct
asprintf).

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Dead function argument?

2013-09-16 Thread Antonin Houska
While reading storage/lmgr/lock.c I noticed that the last (proc)
argument of LockCheckConflicts() is not referenced inside the function.
Is it just a remnant from an older version?

// Antonin Houska (Tony)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Noah Misch
On Mon, Sep 16, 2013 at 04:28:23PM +0200, Andres Freund wrote:
 On 2013-09-15 19:49:26 -0400, Noah Misch wrote:
  Type-specific identity operators seem like overkill, anyway.  If we find 
  that
  meaningless variations in a particular data type are causing too many false
  non-matches for the generic identity operator, the answer is to make the
  functions generating datums of that type settle on a canonical form.  That
  would be the solution for your example involving array null bitmaps.
 
 I think that's pretty much unrealistic. I am pretty sure that if either
 of us starts looking we will find about a dozen such cases and
 miss the other dozen. Not to speak of external code, which is damn
 likely to contain such cases.

It wouldn't be a problem if we missed cases or declined to update known cases.
The array example probably isn't worth changing.  Who's going to repeatedly
flip-flop an array column value between the two representations and then
bemoan the resulting MV write traffic?

 And I think that efficiency will often make such normalization expensive
 (consider postgis where Datums afaik can exist with an internal bounding
 box or without).

If it's so difficult to canonicalize between two supposedly-identical
variations, it's likewise a stretch to trust that all credible operations will
fail to distinguish between those variations.

 I think it's far more realistic to implement an identity operator that
 will fall back to a type specific operator iff equals has strange
 properties.

Complicating such efforts, the author of a custom identity operator doesn't
have the last word on functions that process the data type.  Take your postgis
example; if a second party adds a has_internal_bounding_box() function, an
identity operator ignoring that facet is no longer valid.

memcmp() has served well for HOT and for _equalConst(); why would it suddenly
fall short for MV maintenance?

nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Revive line type

2013-09-16 Thread Alvaro Herrera
Peter Eisentraut wrote:
 Here is a new patch for the line type, with a new input/output format
 {A,B,C}, as discussed in this thread.

I gave this a quick look.  The new input/output format appears to work
well.  The input format using two points still works, which is nice.
Regression tests pass.  I tried binary input and output using COPY and
it seems to round-trip without problem.
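
For example (my sketch of the two input forms, where {A,B,C} encodes the
line Ax + By + C = 0):

SELECT '{1,-1,0}'::line;       -- the line y = x, in the new format
SELECT '[(0,0),(1,1)]'::line;  -- the same line, given as two points;
                               -- output comes back as {1,-1,0}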

With the two-point input, horizontal and vertical lines can accept NaNs
without producing a NaN in the output if not necessary.

I tried interpt and intersect functions and they seem to work.
Haven't gotten around to trying other functions or operators.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-16 16:58:21 -0400, Noah Misch wrote:
 memcmp() has served well for HOT and for _equalConst(); why
 would it suddenly fall short for MV maintenance?

 I don't have a problem using it internally, I have a problem
 exposing the capability to sql.

... like we do in *pattern ops and the
suppress_redundant_updates_trigger() function?
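
For reference, the documented way to hook that function up (a minimal
sketch; tbl stands in for a real table). It skips UPDATEs whose new row
is byte-for-byte identical to the old one:

CREATE TRIGGER z_min_update
BEFORE UPDATE ON tbl
FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();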

 Don't tell me that's the same.

No, this gives users a way to make the same test that HOT uses for
whether values match, albeit undocumented.  Well, not exactly the
same test, because this patch detoasts before comparing -- but
pretty close.  The question is, if it's unsafe for a user to make
this test, why would it be safe for HOT to use it?

I'm really having trouble understanding what problem you have with
this.  Can you describe a scenario where someone shoots themselves
in the foot with it, because I'm not seeing any?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Andres Freund
On 2013-09-16 14:39:30 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
  On 2013-09-16 16:58:21 -0400, Noah Misch wrote:
  memcmp() has served well for HOT and for _equalConst(); why
  would it suddenly fall short for MV maintenance?
 
  I don't have a problem using it internally, I have a problem
  exposing the capability to sql.
 
 ... like we do in *pattern ops and the

That's still an absurd comparison. pattern ops have a defined behaviour,
even for multibyte characters. After all, you can simply have a UTF-8
database with C collation. Remember, in a C collation pattern ops can
use normal indexes.
The only thing that happens is that multibyte chars are sorted
differently. They aren't sorted basically randomly or anything.

 suppress_redundant_updates_trigger() function?

You get superfluous trigger calls. So what. It's not usable for anything
but a trigger.

 I'm really having trouble understanding what problem you have with
 this.  Can you describe a scenario where someone shoots themselves
 in the foot with it, because I'm not seeing any?

It certainly will be surprising to just about anyone that something like:

SELECT * FROM foo WHERE whatever_composite_or_row === '(...)';

may not return rows where there's no SQL-level discernible difference
(even by looking at the text conversion) between
whatever_composite_or_row and '(...)', e.g. because of the aforementioned
array null bitmaps.

I can understand claiming that the risk is acceptable but arguing it's
not there seems extremely strange to me.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Andres Freund
On 2013-09-16 23:58:46 +0200, Andres Freund wrote:
  suppress_redundant_updates_trigger() function?
 
 You get superflous trigger calls. So what. It's not usable for anything
 but a trigger.

Primarily unnecessary IO, not unnecessary trigger calls (which can also
happen).

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Row-wise Comparison

2013-09-16 Thread Noah Misch
On Tue, Sep 10, 2013 at 02:32:56PM -0700, Kevin Grittner wrote:
 The operators and sequencing involving actual records seems to be
 different from that for row value constructors, and it appears to
 be for good reason -- so that indexing will work correctly.
 
 My questions:
 
 Did I miss somewhere that the docs do cover this?

I, too, don't see it.

 If not, do we want to describe it?  Why not?

+1 for documenting it.  We document <, >, <=, >=, =, and <> generically[1].
Several types that make non-obvious decisions for those operators (float8,
range types, arrays) document those decisions.  record hasn't done so, but
it should.
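
For instance (my examples, not from the thread), the SQL-spec NULL
handling of row comparison is exactly the kind of non-obvious decision
worth spelling out:

SELECT ROW(1, 2) < ROW(1, 3);                           -- true
SELECT ROW(1, NULL) = ROW(1, NULL);                     -- null, not true
SELECT ROW(1, NULL) IS NOT DISTINCT FROM ROW(1, NULL);  -- true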

 If we don't want to document the above, would the same arguments
 apply to the operators I'm adding?  (i.e., Do we want to avoid docs
 on these, possibly on the basis of them being an internal
 implementation detail?)

Separate decision, IMO.  See progress on the more-recent thread.


[1] http://www.postgresql.org/docs/9.3/static/functions-comparison.html

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-16 14:39:30 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 On 2013-09-16 16:58:21 -0400, Noah Misch wrote:
 memcmp() has served well for HOT and for _equalConst(); why
 would it suddenly fall short for MV maintenance?

 I don't have a problem using it internally, I have a problem
 exposing the capability to sql.

 ... like we do in *pattern ops and the

 That's still an absurd comparison. pattern ops have a defined
 behaviour, even for multibyte characters. After all, you can
 simply have a UTF-8 database with C collation. Remember, in a C
 collation pattern ops can use normal indexes.
 The only thing that happens is that multibyte chars are sorted
 differently. They aren't sorted basically randomly or anything.

Neither are records using the new operator -- there is a
deterministic sort order based on the bytes representing the
record's values.

 suppress_redundant_updates_trigger() function?

 You get superfluous trigger calls. So what.

My feeling exactly.

 I'm really having trouble understanding what problem you have
 with this.  Can you describe a scenario where someone shoots
 themselves in the foot with it, because I'm not seeing any?

 It certainly will be surprising to just about anyone that
 something like:

 SELECT * FROM foo WHERE whatever_composite_or_row === '(...)';

 may not return rows where there's no SQL-level discernible
 difference (even by looking at the text conversion) between
 whatever_composite_or_row and '(...)', e.g. because of the
 aforementioned array null bitmaps.

Since the operator is bound to a function named record_image_eq(),
it seems to me to be a feature that if the image isn't equal due to
a bitmap being present in one and not the other, it says they
differ.  It's not a bug or a problem.  It gives you a way to *tell*
whether two rows actually have the same representation, which could
be valuable for debugging.  The closest you can easily come without
this is to see that pg_column_size() yields a different result for
them, where that is true (as in the array bitmap case).
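
A minimal sketch of the distinction, using numeric's display scale rather
than the array bitmap case (my example; unlike the bitmap case, the
difference here is visible in the text output):

SELECT '1.0'::numeric = '1.00'::numeric;              -- true: equal values
SELECT '1.0'::numeric::text = '1.00'::numeric::text;  -- false: different images

The proposed operator would likewise report ROW('1.0'::numeric) and
ROW('1.00'::numeric) as not identical, even though they compare equal.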

 I can understand claiming that the risk is acceptable but arguing
 it's not there seems extremly strange to me.

It's not a risk.  It's why the operator exists.  Perhaps the name
of the operator (===) or the fact that I've been using the
shorthand description of "identical" instead of writing out "record
image equals" in these emails has confused you.  I can stop using
the shorter description and have absolutely no attachment to the
operator name, if that helps.

The point of the operator is to determine whether two records,
which may compare as equal (but which don't necessarily appear
the same to a user), have the same byte image for each
corresponding value.  The point of the opfamily is to be able to do
a MergeJoin on this basis.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Revive line type

2013-09-16 Thread David Fetter
On Mon, Sep 16, 2013 at 07:04:32PM -0300, Alvaro Herrera wrote:
 Peter Eisentraut wrote:
  Here is a new patch for the line type, with a new input/output format
  {A,B,C}, as discussed in this thread.
 
 I gave this a quick look.  The new input/output format appears to work
 well.  The input format using two points still works, which is nice.
 Regression tests pass.  I tried binary input and output using COPY and
 it seems to round-trip without problem.
 
 With the two-point input, horizontal and vertical lines can accept NaNs
 without producing a NaN in the output if not necessary.
 
 I tried interpt and intersect functions and they seem to work.
 Haven't gotten around to trying other functions or operators.

Should the things you tried and others be in the regression tests?  If
so, should we start with whatever had been in the regression tests
when the line type was dropped?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] record identical operator

2013-09-16 Thread Andres Freund
On 2013-09-16 15:26:08 -0700, Kevin Grittner wrote:
  I can understand claiming that the risk is acceptable but arguing
  it's not there seems extremly strange to me.
 
 It's not a risk.  It's why the operator exists. 

Pft. It's fine if the materialized view code uses it. I don't see the
danger there.
It's about users discovering it. If they notice it, they will use it
because it's a crapload faster than normal row comparisons. And it deals
with NULLs in a simpler way.

 Perhaps the name
 of the operator (===) or the fact that I've been using the
 shorthand description of identical instead of writing out record
 image equals in these emails has confused you.

If you really think that confusing you is the correct description for
concerns about users not understanding limitations (which you didn't
seem to know about), go ahead. Way to go to solicit feedback.

 I can stop using
 the shorter description and have absolutely no attachment to the
 operator name, if that helps.

You're unfortunately going to need operators if you want mergejoins to
work, right? If not, I'd have said to name it matview_needs_update() or
something like that... But yes, === certainly seems like a bad idea.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Dead function argument?

2013-09-16 Thread Alvaro Herrera
Antonin Houska wrote:
 While reading storage/lmgr/lock.c I noticed that the last (proc)
 argument of LockCheckConflicts() is not referenced inside the function.
 Is it just a remnant from older version?

My guess is this is just an oversight.  The use of that argument was
removed in commit 8563ccae2caf.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] relscan_details.h

2013-09-16 Thread Alvaro Herrera
relscan.h is a bit of an unfortunate header because it requires a lot of
other headers to compile, and is also required by execnodes.h.  This
quick patch removes the struct definitions from that file, moving them
into a new relscan_details.h file; the reworked relscan.h does not need
any other header, freeing execnodes.h from needlessly including lots of
unnecessary stuff all over the place.  Only the files that really need
the struct definition include the new file, and as seen in the patch,
they are quite few.

execnodes.h is included by 147 backend source files; relscan_details.h
only 27.  So the exposure area is reduced considerably.  This patch also
modifies a few other backend source files to add one or both of genam.h
and heapam.h, which were previously included through execnodes.h.

This is probably missing tweaks for contrib/.  The following modules
would need to include relscan_details.h: sepgsql dblink pgstattuple
pgrowlocks.  I expect the effect on third-party modules to be much less
than by the htup_details.h change.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index cb17d38..1dfa888 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -15,7 +15,7 @@
 #include "postgres.h"
 
 #include "access/gin_private.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "miscadmin.h"
 #include "utils/datum.h"
 #include "utils/memutils.h"
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index afee2db..d22724f 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -15,7 +15,7 @@
 #include "postgres.h"
 
 #include "access/gin_private.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "pgstat.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
diff --git a/src/backend/access/gist/gistget.c b/src/backend/access/gist/gistget.c
index e97ab8f..c58152d 100644
--- a/src/backend/access/gist/gistget.c
+++ b/src/backend/access/gist/gistget.c
@@ -15,7 +15,7 @@
 #include "postgres.h"
 
 #include "access/gist_private.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "utils/builtins.h"
diff --git a/src/backend/access/gist/gistscan.c b/src/backend/access/gist/gistscan.c
index b5553ff..e7e6614 100644
--- a/src/backend/access/gist/gistscan.c
+++ b/src/backend/access/gist/gistscan.c
@@ -16,7 +16,7 @@
 
 #include "access/gist_private.h"
 #include "access/gistscan.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 
diff --git a/src/backend/access/hash/hash.c b/src/backend/access/hash/hash.c
index 8895f58..3f52a61 100644
--- a/src/backend/access/hash/hash.c
+++ b/src/backend/access/hash/hash.c
@@ -19,7 +19,7 @@
 #include "postgres.h"
 
 #include "access/hash.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "catalog/index.h"
 #include "commands/vacuum.h"
 #include "optimizer/cost.h"
diff --git a/src/backend/access/hash/hashscan.c b/src/backend/access/hash/hashscan.c
index 8c2918f..54e91f3 100644
--- a/src/backend/access/hash/hashscan.c
+++ b/src/backend/access/hash/hashscan.c
@@ -16,7 +16,7 @@
 #include "postgres.h"
 
 #include "access/hash.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
 #include "utils/resowner.h"
diff --git a/src/backend/access/hash/hashsearch.c b/src/backend/access/hash/hashsearch.c
index 91661ba..9ab44d0 100644
--- a/src/backend/access/hash/hashsearch.c
+++ b/src/backend/access/hash/hashsearch.c
@@ -15,7 +15,7 @@
 #include "postgres.h"
 
 #include "access/hash.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "miscadmin.h"
 #include "pgstat.h"
 #include "utils/rel.h"
diff --git a/src/backend/access/hash/hashutil.c b/src/backend/access/hash/hashutil.c
index aa071d9..a7d5cca 100644
--- a/src/backend/access/hash/hashutil.c
+++ b/src/backend/access/hash/hashutil.c
@@ -16,7 +16,7 @@
 
 #include "access/hash.h"
 #include "access/reloptions.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 
diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
index ead3d69..f599383 100644
--- a/src/backend/access/heap/heapam.c
+++ b/src/backend/access/heap/heapam.c
@@ -44,7 +44,7 @@
 #include "access/heapam_xlog.h"
 #include "access/hio.h"
 #include "access/multixact.h"
-#include "access/relscan.h"
+#include "access/relscan_details.h"
 #include "access/sysattr.h"
 #include "access/transam.h"
 #include "access/tuptoaster.h"
diff --git a/src/backend/access/index/genam.c b/src/backend/access/index/genam.c
index 15debed..cacd0b6 100644
--- a/src/backend/access/index/genam.c
+++ b/src/backend/access/index/genam.c
@@ -19,7 +19,7 @@
 
 #include "postgres.h"
 
-#include "access/relscan.h"

Re: [HACKERS] [PATCH] Revive line type

2013-09-16 Thread Alvaro Herrera
David Fetter wrote:

 Should the things you tried and others be in the regression tests?  If
 so, should we start with whatever had been in the regression tests
 when the line type was dropped?

Actually, the patch does include a regression test for the revived type
(and it passes).  I don't think more than that is needed.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add an ldapoption to disable chasing LDAP referrals

2013-09-16 Thread Peter Eisentraut
On Tue, 2013-07-09 at 11:33 +1000, James Sewell wrote:
 New patch attached. I've moved from using a boolean to an enum
 trivalue.

You have updated the documentation to say that the ldapreferrals option
only applies in search+bind mode.  But looking over the code I think it
applies in both modes.  Check please.




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers