Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-17 Thread Pavel Stehule
2014-06-17 7:30 GMT+02:00 xbzhang xbzh...@kingbase.com.cn:

 Using a subtransaction, the tuples that had already been inserted into the
 heap must be inserted again when some exception is raised;
 that is too expensive.
 My solution is:
 1. delete the tuple that caused the error;
 2. release all the resources taken while inserting that tuple;
 3. continue inserting the next tuple.
 Is it feasible? Can anybody give me some suggestions?


no, it will not work - after an exception, some memory structures may be
in an undefined state. Errors in PostgreSQL are destructive, and any error
must be followed by a ROLLBACK.

A subtransaction for every row is expensive, but a subtransaction for a
block of rows is cheap.
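That block-wise strategy can be sketched as a client-side simulation (illustrative only; copy_skip_errors, insert_batch and insert_row are hypothetical names, and a real loader would wrap each attempt in SAVEPOINT / ROLLBACK TO SAVEPOINT):

```python
def copy_skip_errors(rows, insert_batch, insert_row, batch_size=1000):
    """Try each batch inside one 'subtransaction' (insert_batch is
    all-or-nothing); if it raises, redo that batch one row per
    'subtransaction' so only the offending rows are skipped."""
    loaded, skipped, batch = [], [], []

    def flush():
        if not batch:
            return
        try:
            insert_batch(batch)            # one subtransaction per batch
            loaded.extend(batch)
        except Exception:
            for row in batch:              # fallback: per-row subtransactions
                try:
                    insert_row(row)
                    loaded.append(row)
                except Exception:
                    skipped.append(row)    # skip only the bad row
        batch.clear()

    for row in rows:
        batch.append(row)
        if len(batch) >= batch_size:
            flush()
    flush()
    return loaded, skipped

# Simulation of the COPY from the example: values 1, 1, 2 into a table
# with a primary key; the second 1 violates the constraint.
seen = set()

def insert_batch(batch):
    if len(set(batch)) != len(batch) or seen.intersection(batch):
        raise ValueError("duplicate key")  # whole batch rolls back
    seen.update(batch)

def insert_row(row):
    if row in seen:
        raise ValueError("duplicate key")
    seen.add(row)

loaded, skipped = copy_skip_errors([1, 1, 2], insert_batch, insert_row)
```

With batch_size = 1000, a clean load pays one subtransaction per 1000 rows; only batches containing an error fall back to one subtransaction per row.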

Regards

Pavel



 --

 Zhang Xiaobo, R&D Department 2

 Beijing Renda Jincang (Kingbase) Information Technology Co., Ltd.

 Address: Room 501, Building 4, Shangdi Science & Technology Building, No. 8 Shangdi West Road, Haidian District, Beijing

 Postcode: 100085

 Phone: (010) 5885 1118 - 8450

 Mobile: 15311394463

 Email: xbzh...@kingbase.com.cn


 *From:* Alvaro Herrera alvhe...@2ndquadrant.com
 *Date:* 2014-06-17 02:37
 *To:* Pavel Stehule pavel.steh...@gmail.com
 *CC:* xbzhang xbzh...@kingbase.com.cn; pgsql-hackers
 pgsql-hackers@postgresql.org
 *Subject:* Re: [HACKERS] How to implement the skip errors for copy from ?
 Pavel Stehule wrote:
  2014-06-16 11:46 GMT+02:00 xbzhang xbzh...@kingbase.com.cn:
 
  
   I want to implement skip-errors for COPY FROM, like:
   create table A (c int primary key);
   copy A from stdin;
   1
   1
   2
   \.
  
   copy will fail:
   ERROR: duplicate key violates primary key constraint CC_PKEY
   CONTEXT: COPY CC, line 2: 1
  
   I want to skip the error and continue to copy the rest of the
   tuples. The result will be that there are two rows in table A: 1 and 2.
  
   How to implement that? Can anybody give me some suggestions?
 
  you should reimplement the COPY procedure to use subtransactions. Using a
  subtransaction for every row is too expensive, but you can use one
  subtransaction per 1000 rows, and when some exception is raised, redo
  that block with one row per subtransaction.

 See http://pgloader.io/ for a ready-made solution.

 --
  Álvaro Herrera                http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services









Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-17 Thread xbzhang






One resource owner per tuple; when an error happens, only release the
resource owner belonging to the failing tuple. Why would some memory
structures be in an undefined state? Can you give some examples?





Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-17 Thread Pavel Stehule
2014-06-17 8:46 GMT+02:00 xbzhang xbzh...@kingbase.com.cn:

 One resource owner per tuple; when an error happens, only release the
 resource owner belonging to the failing tuple.
 Why would some memory structures be in an undefined state? Can you give
 some examples?


any exception can be raised -- any non-fatal exception. I remember that
when I wrote something similar without exception handling, it was very
unstable.

Pavel








Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-06-17 Thread Amit Kapila
On Fri, May 23, 2014 at 10:01 PM, Amit Kapila amit.kapil...@gmail.com
wrote:
 On Fri, Jan 31, 2014 at 3:24 PM, Andres Freund and...@2ndquadrant.com
wrote:
  I've pushed a rebased version of the patchset to
  http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git
  branch rwlock contention.
  220b34331f77effdb46798ddd7cca0cffc1b2858 actually was the small problem,
  ea9df812d8502fff74e7bc37d61bdc7d66d77a7f was the major PITA.

 As per discussion in developer meeting, I wanted to test shared
 buffer scaling patch with this branch.  I am getting merge
 conflicts as per HEAD.  Could you please get it resolved, so that
 I can get the data.

I have started looking into this patch and have a few questions/findings,
which are shared below:

1. I think the stats for lwstats->ex_acquire_count will be counted twice:
first it is incremented in LWLockAcquireCommon() and then again in
LWLockAttemptLock()

2.
Handling of the potentially_spurious case seems to be pending
in LWLock functions like LWLockAcquireCommon().

LWLockAcquireCommon()
{
..
/* add to the queue */
LWLockQueueSelf(l, mode);

/* we're now guaranteed to be woken up if necessary */
mustwait = LWLockAttemptLock(l, mode, false, &potentially_spurious);

}

I think it can lead to some problems, for example:

Session-1
1. Acquires the exclusive LWLock

Session-2
1. Acquires the shared LWLock

1a. Session-2 unconditionally increments the shared count

Session-1
2. Releases the exclusive lock

Session-3
1. Acquires the exclusive LWLock
It will put itself in the wait queue after seeing the lock count
incremented by Session-2

Session-2
1b. Decrements the shared count and adds itself to the wait queue.

Session-4
1. Acquires the exclusive lock
   This session will get the exclusive lock, because even
   though other lockers are waiting, the lock count is zero.

Session-2
2. Tries a second time to take the shared lock; it won't get it,
   as Session-4 already holds an exclusive lock, so it will
   start waiting

Session-4
2. Releases the exclusive lock
   It will not wake the waiters, because the waiters were added
   before it acquired the lock.

So in the above scenario, Session-3 and Session-2 are waiting in the
queue with nobody to wake them.

I have not reproduced this exact scenario, so I might be missing
something that prevents the above situation.
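The interleaving can be replayed deterministically with a toy model of the lock (illustrative only — this scripts the steps exactly as described, including the assumed buggy release behaviour; it is not the actual LWLock code):

```python
# Toy state: a shared counter, an exclusive holder, and a wait queue.
lockcount = 0          # shared holders counted here
exclusive = None       # which session holds the lock exclusively
queue = []             # sessions sleeping on their semaphore

# Session-1: acquires the exclusive lock.
exclusive = "S1"

# Session-2, step 1a: unconditionally increments the shared count,
# even though S1 still holds the lock exclusively (spurious).
lockcount += 1

# Session-1, step 2: releases the exclusive lock (queue is empty).
exclusive = None

# Session-3: tries exclusive, sees lockcount > 0, queues itself.
queue.append("S3")

# Session-2, step 1b: notices the acquisition was spurious, decrements
# the shared count and queues itself.
lockcount -= 1
queue.append("S2")

# Session-4: tries exclusive; lockcount == 0 and no exclusive holder,
# so it gets the lock even though S3 and S2 are already queued.
assert lockcount == 0 and exclusive is None
exclusive = "S4"

# Session-2, step 2: retries the shared lock, fails (S4 holds it
# exclusively), keeps waiting in the queue.

# Session-4, step 2: releases; under the stated assumption it does not
# wake waiters that queued themselves before it acquired the lock.
exclusive = None

# Net result: the lock is free but S3 and S2 sleep forever.
lost_wakeup = (exclusive is None and lockcount == 0
               and queue == ["S3", "S2"])
```

Andres later questions the release step (the waiter count is checked under the spinlock), so this replays the scenario as described rather than proving actual code behaviour.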

3.
LWLockAcquireCommon()
{
for (;;)
{
PGSemaphoreLock(&proc->sem, false);
if (!proc->lwWaiting)
..
}
proc->lwWaiting is checked and updated without the spinlock, whereas
previously it was done under the spinlock; won't that be unsafe?

4.
LWLockAcquireCommon()
{
..
for (;;)
{
/* false means cannot accept cancel/die interrupt here. */
PGSemaphoreLock(&proc->sem, false);
if (!proc->lwWaiting)
break;
extraWaits++;
}
lock->releaseOK = true;
}

lock->releaseOK is updated and checked without the spinlock, whereas
previously it was done under the spinlock; won't that be unsafe?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Martijn van Oosterhout
On Sat, Jun 14, 2014 at 03:35:33PM -0400, Tom Lane wrote:
 The best that I think is reasonable to do in such cases is to pull out
 a separate copy of the sub-select for each actual NEW reference in a
 rule query.  So the example above would give rise to an expanded
 rule query along the lines of
 
   INSERT INTO foolog VALUES ( (SELECT x as a, y as b, ...).a,
   (SELECT x as a, y as b, ...).b,
   ... );

Would it not be possible to use WITH here, like:

WITH bar AS ( ... subselect ... )
INSERT INTO foolog VALUES (bar.a, bar.b, ...)

Or am I missing something?
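For what it's worth, the VALUES list could not reference the CTE directly; the rewrite would need a SELECT (a hypothetical sketch with made-up column names):

```sql
WITH bar AS (SELECT x AS a, y AS b)
INSERT INTO foolog SELECT bar.a, bar.b FROM bar;
```

This would also evaluate the sub-select only once, which per-reference expansion cannot guarantee.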

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Hannu Krosing
On 06/14/2014 09:35 PM, Tom Lane wrote:
 As I mentioned awhile ago, I'm thinking about implementing the
 SQL-standard construct

   UPDATE foo SET ..., (a,b,...) = (SELECT x,y,...), ...

 I've run into a rather nasty problem, which is how does this interact
 with expansion of NEW references in ON UPDATE rules?  
Wasn't there a plan (consensus?) to deprecate rules altogether?

Cheers
Hannu
 For example,
 suppose foo has a rule

   ON UPDATE DO ALSO INSERT INTO foolog VALUES (new.a, new.b, ...);

 The existing implementation relies on being able to pull expressions
 for individual fields' new values out of the UPDATE targetlist; but
 there is no independent expression for the new value of a here.
 Worse yet, the NEW references might be in WHERE quals, or some other
 place outside the targetlist of the rule query, which pretty much
 breaks the implementation I'd sketched earlier.

 The best that I think is reasonable to do in such cases is to pull out
 a separate copy of the sub-select for each actual NEW reference in a
 rule query.  So the example above would give rise to an expanded
 rule query along the lines of

   INSERT INTO foolog VALUES ( (SELECT x as a, y as b, ...).a,
   (SELECT x as a, y as b, ...).b,
   ... );

 which would work, but it would re-evaluate the sub-select more times
 than the user might be hoping.  (Of course, if there are volatile
 functions in the sub-select, he's screwed, but that's not a new
 problem with rules.)

 Given that ON UPDATE rules are close to being a deprecated feature,
 it doesn't seem appropriate to work harder than this; and frankly
 I don't see how we could avoid multiple sub-select evaluations anyway,
 if the NEW references are in WHERE or other odd places.

  Another possible answer is to just throw a "not implemented" error;
 but that doesn't seem terribly helpful, and I think it wouldn't save
 a lot of code anyway.

 Thoughts?

   regards, tom lane




-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump reporing version of server pg_dump as comments in the output

2014-06-17 Thread Jeevan Chalke
On Tue, Mar 4, 2014 at 11:28 AM, Wang, Jing ji...@fast.au.fujitsu.com
wrote:

 I don't buy your argument. Why isn't verbose option sufficient? Did you
 read the old thread about this [1]?
 [1] http://www.postgresql.org/message-id/3677.1253912...@sss.pgh.pa.us

 AFAICS a lot of people compare pg_dump diffs. If we apply this patch,
 it would break those applications.


Well, as Peter already mentioned in the old thread above, a diff of the
same database made by different (major) versions of pg_dump will already
be different in most situations, so adding the pg_dump version number is
essentially free from this perspective.

So I don't think there will be any application breakage.

Also, it is *already* available if
 you add the verbose option (which is sufficient to satisfy those who want
 the client and/or server version) in plain mode (the other modes already
 include the desired info by default). In the past, timestamps were
 removed to avoid noise in diffs.


The verbose option does include timestamps, which will fail the
comparison. Verbose mode also has too many details that people may not be
interested in, but I guess more people will clearly want to see the server
and client versions.

Thus I see this patch as useful and have reviewed it.


  Sorry, I don't understand which application will break.  Can you give me
  more detailed information?
  Timestamps always vary, which does affect the diff.  But I can't imagine
  why adding the version will affect the pg_dump diffs.
  I don't think the version number varies frequently.


Review comments:

1. The patch applies cleanly. make / make install / make check are all fine.
2. No issues found as such in my unit testing.
3. As Tom suggested,
pg_dump > foo gives output identical to pg_dump -Fc | pg_restore > foo

Basically only the code lines are rearranged, so it has NO effect on the
system as such. No issues found, and as far as I am concerned it is good
to go in.

However, the following code chunk looks weird:

+   else
+   {
+       ahprintf(AH, "\n");
+   }

You need the extra newline when NOT in verbose mode, but you don't need it
when in verbose mode. This is just because the dumpTimestamp() call in
verbose mode adds an extra line.
Can't we remove this else altogether and print the extra line
unconditionally?
Also, can we remove the curly braces in the if and else near these code
changes?
(Attached is a patch along those lines.)

Does anyone have any other views?

Thanks

-- 
Jeevan B Chalke
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
index f6fbf44..4aed1cb 100644
--- a/src/bin/pg_dump/pg_backup_archiver.c
+++ b/src/bin/pg_dump/pg_backup_archiver.c
@@ -353,16 +353,17 @@ RestoreArchive(Archive *AHX)
 
 	ahprintf(AH, "--\n-- PostgreSQL database dump\n--\n\n");
 
+	if (AH->archiveRemoteVersion)
+		ahprintf(AH, "-- Dumped from database version %s\n",
+				 AH->archiveRemoteVersion);
+	if (AH->archiveDumpVersion)
+		ahprintf(AH, "-- Dumped by pg_dump version %s\n",
+				 AH->archiveDumpVersion);
+
 	if (AH->public.verbose)
-	{
-		if (AH->archiveRemoteVersion)
-			ahprintf(AH, "-- Dumped from database version %s\n",
-					 AH->archiveRemoteVersion);
-		if (AH->archiveDumpVersion)
-			ahprintf(AH, "-- Dumped by pg_dump version %s\n",
-					 AH->archiveDumpVersion);
 		dumpTimestamp(AH, "Started on", AH->createDate);
-	}
+
+	ahprintf(AH, "\n");
 
 	if (ropt->single_txn)
 	{



Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-06-17 Thread Michael Paquier
On Tue, Jun 17, 2014 at 4:36 AM, Ronan Dunklau ronan.dunk...@dalibo.com wrote:
 On Monday, June 16, 2014 at 16:07:51, Michael Paquier wrote:
 On Sat, May 24, 2014 at 5:08 AM, Ronan Dunklau ronan.dunk...@dalibo.com
 2) The query I am seeing in this spec offers the possibility of
 TABLE_NAME LIKE 'pattern'. Is this part of the spec as well? If yes,
 are you planning to add it too? I imagine that this could be quite
 handy for users importing from a foreign schema tables that have the
 same name prefix, for example. ImportForeignSchemaRestrictionType could
 be extended with an IMPORT_LIKE type. Extending the spec a bit...
 There could be a list of LIKE patterns; this matches your code more
 closely by adding all of them to table_names. I am not voting to add
 TABLE_NAME to the list of reserved keywords though, so something like
 TABLE LIKE 'pattern' would be nice.

 From the spec I have been reading, the import qualification is defined as :

 table name list ::= table name [ { comma table name }... ]

 where table name is defined as a simple table name, without any qualifier.
Indeed. Let's stick to that for simplicity, and use only LIMIT TO/EXCEPT.

 3) This has already been raised in this thread, but something missing
 from this patch is the possibility to pass options when executing an
 import. This could allow a FDW to do more fine-grained operations on
 the underlying objects of the table. There would be two levels of
 options: table level and schema level. For example, with the current
 feature it is not possible to rename imported tables on the fly. I
 imagine that this would be useful.

 The attached patch implements options.
Thanks.

 Now, we should think about what options may be desirable for postgres_fdw.
An option to include triggers in what is fetched? I am sure you know
they can be created on foreign tables now :)

 I don't understand your point about table-level and schema-level options,
 though. There is no concept of a foreign schema that allows setting
 options on it, AFAIK.
My point is to extend items of the table list to accept options:
[ { LIMIT TO | EXCEPT } ( table_item [ , table_item ] ) ]
where table_item:
table_name [ OPTIONS ( option_item [ , option_item ] ) ]

This would make it possible to run post-operations on the fetched tables
before the FDW creates the individual CreateForeignTableStmt entries in
ImportForeignSchema. In the case of postgres_fdw, a potential option
would be to allow renaming a table after fetching it from the remote
side.
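As an illustration, the extended grammar would allow something like this (hypothetical syntax and option name — none of this is in the current patch):

```sql
IMPORT FOREIGN SCHEMA public
    FROM SERVER postgres_server
    LIMIT TO (toto_tab OPTIONS (new_name 'toto_local'), aa)
    INTO public;
```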

 Renaming imported tables on the fly should, IMO, be done after the import
 statement. Since it is transactional, nothing prevents you from running the
 ALTER TABLE statements after that.
True as well.

 5) The error message returned to user when import fails because of a
 missing type is rather unfriendly. Let's imagine with two nodes and
 postgres_fdw... Node 1:
 create type toto as (a int, b int);
 create table toto_tab (a toto);
 Node 2:
 =# import foreign schema public from server postgres_server into public;
 ERROR:  42704: type public.toto does not exist
 LOCATION:  parseTypeString, parse_type.c:797
  I would rather imagine something like "IMPORT failed because of a
  missing type defined on remote but not locally".

 I simply prepended the suggested error message to the previous one:

 # IMPORT FOREIGN SCHEMA import_source FROM SERVER remote1 INTO
 import_destination;
 ERROR:  IMPORT of table t1 failed because of missing type defined on
 remote but not locally: type public.typ1 does not exist
 Time: 11,305 ms
Thanks for the addition.

 6) Import does not fail if a table specified in LIMIT TO is not
 defined on remote-side:
 =# import foreign schema public from server postgres_server limit to
 (tab_not_there) into public;
 IMPORT FOREIGN SCHEMA
 =# \d
 No relations found.

 Thanks, this is corrected in the attached patch. The error detection code is
 O(n²), but I assumed LIMIT TO clauses should be small enough for it not to be
 a problem.

  10) The code has many whitespace issues.

  I hope this is corrected in the attached patch; if not, could you please
 point me to specific examples?

 11) Sometimes the import goes strangely:
 11-1) After some testing I noticed that tables with incorrect names
 were imported when using LIMIT TO. For example on remote a table
 called ab is present, IMPORT SCHEMA LIMIT TO 'other_ab' created a
 local entry called other_ab while it should create nothing
 11-2) Tables with empty names can be created locally. On foreign server:
 =# \d
  List of relations
  Schema |   Name   | Type  | Owner
 +--+---+
  public | aa   | table | ioltas
  public | toto_tab | table | ioltas
 (2 rows)
 On local node:
 =# import foreign schema public from server postgres_server limit to
 (toto_tab, NULL, aa) into public;
 -- (Forget about NULL here, I could reproduce it without as well)
 IMPORT FOREIGN SCHEMA
 Time: 13.589 ms
 =# \d
  List of relations
  Schema |   Name   | Type  | Owner
 

Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-17 Thread Amit Kapila
On Tue, Jun 17, 2014 at 12:16 PM, xbzhang xbzh...@kingbase.com.cn wrote:

 one resource owner per tuple; when an error happens, only release the
resource owner belonging to the failing tuple.
 Why would some memory structures be in an undefined state? Can you give
some examples?

There might be some LWLocks that were taken before the error, and you
won't know which ones to free.  Another point is that postgres uses
memory contexts to allocate/free memory in most places, so there can be
allocated memory that needs to be released; transaction/subtransaction
abort takes care of all such things and many more similar ones.
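Amit's point — that abort-time cleanup must know about *everything* acquired so far — can be illustrated with a toy resource-owner abstraction (purely illustrative; this is not the PostgreSQL ResourceOwner API):

```python
class ToyResourceOwner:
    """Tracks acquired resources so that *all* of them can be released
    on error, mimicking what transaction abort does for real resources
    (LWLocks, memory-context allocations, ...)."""

    def __init__(self):
        self.held = []

    def remember(self, name, release):
        # Register a resource together with its release callback.
        self.held.append((name, release))

    def release_all(self):
        # Release everything, most recently acquired first.
        released = []
        while self.held:
            name, release = self.held.pop()
            release()
            released.append(name)
        return released

released_log = []
owner = ToyResourceOwner()
owner.remember("lwlock:buffer_mapping", lambda: released_log.append("lwlock"))
owner.remember("memctx:tuple", lambda: released_log.append("memctx"))

# On error, nothing is forgotten: everything registered is released.
order = owner.release_all()
```

The hard part in the COPY case is ensuring every resource really is registered with such an owner; anything acquired outside it, at an unknown point before the error, is leaked.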

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-17 Thread xbzhang






LWLocks can be recorded in a per-tuple resource owner, so they can be
released the right way, but the memory allocated in memory contexts is a
problem. Are there any other problems?







Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Vik Fearing
On 06/17/2014 09:43 AM, Hannu Krosing wrote:
 On 06/14/2014 09:35 PM, Tom Lane wrote:
  As I mentioned awhile ago, I'm thinking about implementing the
  SQL-standard construct
 
 UPDATE foo SET ..., (a,b,...) = (SELECT x,y,...), ...
 
  I've run into a rather nasty problem, which is how does this interact
  with expansion of NEW references in ON UPDATE rules?  

 Wasn't there a plan (consensus?) to deprecate rules altogether?

I believe that was just for user access to them, i.e. CREATE RULE.  I
don't think there was ever a question of purging them from the code base.
-- 
Vik




Re: [HACKERS] Window function optimisation, allow pushdowns of items matching PARTITION BY clauses

2014-06-17 Thread David Rowley
On Sun, May 25, 2014 at 2:10 PM, Thomas Mayer thomas.ma...@student.kit.edu
wrote:

 Hello David,

 sorry for the late response. I will try out your changes from the point
 of view of a user in mid-June. However, I can't do a trustworthy code
 review as I'm not an experienced Postgres hacker (yet).


Thanks, that sort of review will be a great start.

I've attached an updated patch, as it seems there's been a change that
removes redundant sorts, which was breaking one of the regression tests in
v0.2 of the patch.

+EXPLAIN (COSTS OFF)
+SELECT depname,depsalary FROM (SELECT depname,
+  sum(salary) OVER (PARTITION BY depname) depsalary,
+  rank() OVER (ORDER BY enroll_date) emprank
+  FROM empsalary) emp
+WHERE depname = 'sales';

This used to produce a plan that included a sort by enroll_date, but this
is no longer the case. I've updated the expected results to remove the
extra sort from the explain output.

Regards

David Rowley


wfunc_pushdown_partitionby_v0.3.patch
Description: Binary data



Re: [HACKERS] IMPORT FOREIGN SCHEMA statement

2014-06-17 Thread Michael Paquier
On Tue, Jun 17, 2014 at 5:06 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Tue, Jun 17, 2014 at 4:36 AM, Ronan Dunklau ronan.dunk...@dalibo.com 
 wrote:
 Now, we should think about what options may be desirable for postgres_fdw.
 An option to include triggers in what is fetched? I am sure you know
 they can be created on foreign tables now :)
Please forget that. This makes no sense as long as ImportForeignSchema
returns only a list of CreateForeignTableStmt, and even with that, there
are considerations like the order in which the created objects should be
applied through a potential call of standard_ProcessUtility or similar.
-- 
Michael




Re: [HACKERS] Allowing join removals for more join types

2014-06-17 Thread David Rowley
On Wed, Jun 4, 2014 at 12:50 AM, Noah Misch n...@leadboat.com wrote:

 As a point of procedure, I recommend separating the semijoin support into
 its
 own patch.  Your patch is already not small; delaying non-essential parts
 will
 make the essential parts more accessible to reviewers.


In the attached patch I've removed all the SEMI and ANTI join removal code
and left only support for LEFT JOIN removal of sub-queries that can be
proved to be unique on the join condition by looking at the GROUP BY and
DISTINCT clause.

Example:

SELECT t1.* FROM t1 LEFT OUTER JOIN (SELECT value,COUNT(*) FROM t2 GROUP BY
value) t2 ON t1.id = t2.value;

Regards

David Rowley


subquery_leftjoin_removal_v1.1.patch
Description: Binary data



Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-06-17 Thread Andres Freund
On 2014-06-17 12:41:26 +0530, Amit Kapila wrote:
 On Fri, May 23, 2014 at 10:01 PM, Amit Kapila amit.kapil...@gmail.com
 wrote:
  On Fri, Jan 31, 2014 at 3:24 PM, Andres Freund and...@2ndquadrant.com
 wrote:
   I've pushed a rebased version of the patchset to
   http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git
   branch rwlock contention.
   220b34331f77effdb46798ddd7cca0cffc1b2858 actually was the small problem,
   ea9df812d8502fff74e7bc37d61bdc7d66d77a7f was the major PITA.
 
  As per discussion in developer meeting, I wanted to test shared
  buffer scaling patch with this branch.  I am getting merge
  conflicts as per HEAD.  Could you please get it resolved, so that
  I can get the data.

 I have started looking into this patch and have few questions/
 findings which are shared below:

 1. I think the stats for lwstats->ex_acquire_count will be counted twice:
 first it is incremented in LWLockAcquireCommon() and then again in
 LWLockAttemptLock()

Hrmpf. Will fix.

 2.
 Handling of the potentially_spurious case seems to be pending
 in LWLock functions like LWLockAcquireCommon().

 LWLockAcquireCommon()
 {
 ..
 /* add to the queue */
 LWLockQueueSelf(l, mode);

 /* we're now guaranteed to be woken up if necessary */
 mustwait = LWLockAttemptLock(l, mode, false, &potentially_spurious);

 }

 I think it can lead to some problems, example:

 Session -1
 1. Acquire Exclusive LWlock

 Session -2
 1. Acquire Shared LWlock

 1a. Unconditionally incrementing shared count by session-2

 Session -1
 2. Release Exclusive lock

 Session -3
 1. Acquire Exclusive LWlock
 It will put itself to wait queue by seeing the lock count incremented
 by Session-2

 Session-2
 1b. Decrement the shared count and add it to wait queue.

 Session-4
 1. Acquire Exclusive lock
This session will get the exclusive lock, because even
though other lockers are waiting, lockcount is zero.

 Session-2
 2. Try second time to take shared lock, it won't get
as session-4 already has an exclusive lock, so it will
start waiting

 Session-4
 2. Release Exclusive lock
it will not wake the waiters because waiters have been added
before acquiring this lock.

I don't understand this step here? When releasing the lock it'll notice
that waiters > 0 and acquire the spinlock, which should protect against
badness here?

 3.
 LWLockAcquireCommon()
 {
 for (;;)
 {
 PGSemaphoreLock(&proc->sem, false);
 if (!proc->lwWaiting)
 ..
 }
 proc->lwWaiting is checked and updated without the spinlock, whereas
 previously it was done under the spinlock; won't that be unsafe?

It was previously checked/unset without a spinlock as well:
/*
 * Awaken any waiters I removed from the queue.
 */
while (head != NULL)
{
	LOG_LWDEBUG("LWLockRelease", T_NAME(l), T_ID(l), "release waiter");
	proc = head;
	head = proc->lwWaitLink;
	proc->lwWaitLink = NULL;
	proc->lwWaiting = false;
	PGSemaphoreUnlock(&proc->sem);
}
I don't think there are dangers here; lwWaiting will only ever be
manipulated by the PGPROC's owner. As discussed elsewhere, there needs
to be a write barrier before the proc->lwWaiting = false, even in
upstream code.

 4.
 LWLockAcquireCommon()
 {
 ..
 for (;;)
 {
 /* false means cannot accept cancel/die interrupt here. */
 PGSemaphoreLock(proc->sem, false);
 if (!proc->lwWaiting)
 break;
 extraWaits++;
 }
 lock->releaseOK = true;
 }

 lock->releaseOK is updated/checked without spinlock where
 as previously it was done under spinlock, won't it be unsafe?

Hm. That's probably buggy. Good catch. Especially if you have a compiler
that does byte manipulation by reading e.g. 4 bytes from a struct and
then writes the wider variable back... So the releaseOk bit needs to move
into LWLockDequeueSelf().

Thanks for looking!

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] wrapping in extended mode doesn't work well with default pager

2014-06-17 Thread Pavel Stehule
2014-06-16 23:28 GMT+02:00 Jeff Janes jeff.ja...@gmail.com:

 On Wed, Jun 11, 2014 at 12:59 PM, Greg Stark st...@mit.edu wrote:

  I think this whole exercise has mostly just convinced me we should
  implement an HTTP interface and reimplement psql as a browser app.

 I certainly hope not.  I've seen lots of browser apps that were nice
 enough to use for casual use by a casual user.  I've never seen one
 that was an effective power tool for power users, the way psql is.
 Now maybe they are out there and I just don't know about them, but I
 have my doubts.

 As an additional tool, to each his own.  But a browser-based
 replacement for psql, -1 from me.


We can integrate text console browsers like links, elinks or lynx instead

and we can call a BROWSER instead of PAGER when the \pset format is html

[pavel@localhost postgresql92]$ PAGER="elinks -force-html" psql postgres
psql (9.4beta1)
Type "help" for help.

postgres=# \pset format html
Output format (format) is html.
postgres=# SELECT * FROM pg_proc;

works perfect

[pavel@localhost postgresql92]$ PAGER="lynx -stdin" psql postgres
psql (9.4beta1)
Type "help" for help.

postgres=# \pset format html
Output format (format) is html.
postgres=# SELECT * FROM pg_proc;

Writing html browsing into psql is useless now and I don't think it is a
good idea. On the other hand, better integration of the mentioned browsers
can be very useful.

Regards

Pavel



 And as far browser-based things apply to this patch, I must say I've
 tried micromanaging the way large amounts of data wrap in a HTML table
 when I found the default to be inadequate, and I have not found that
 to be noticeably easy, either.

 The original version of this patch was only a few lines long and did
 one very simple and useful thing: avoiding the printing of whole
 screens full of hyphens when in 'expanded mode'.  If we end up
 reverting the other parts of this patch, hopefully we don't lose that
 part.

 Cheers,

 Jeff



[HACKERS] [REVIEW] psql tab completion for DROP TRIGGER/RULE and ALTER TABLE ... DISABLE/ENABLE

2014-06-17 Thread Ian Barwick

Andreas Karlsson (andr...@proxel.se) wrote:

Hi,

When benchmarking an application I got annoyed at how basic the tab
completion for ALTER TABLE ... DISABLE/ENABLE TRIGGER and DROP TRIGGER
is. So here is a patch improving the tab completion around triggers. For
consistency I have also added the same completions to rules since their
DDL is almost identical.


Thanks for this patch; I'm playing around with rules at the moment and it was
very useful. A quick review:

- applies cleanly to HEAD

- does what it claims, i.e. adds tab completion support for this syntax:

ALTER TABLE table { ENABLE | DISABLE } [ ALWAYS | REPLICA ] { RULE | TRIGGER } rule_or_trigger
DROP TRIGGER trigger ON relation { CASCADE | RESTRICT }
DROP RULE rule ON relation { CASCADE | RESTRICT }

- code style is consistent with the project style

One issue - the table's internal triggers will also be listed, which can result in
something like this:

database=# ALTER TABLE object_version DISABLE TRIGGER <TAB>
RI_ConstraintTrigger_a_1916401  RI_ConstraintTrigger_a_1916422  
RI_ConstraintTrigger_c_1916358
RI_ConstraintTrigger_a_1916402  RI_ConstraintTrigger_c_1916238  
RI_ConstraintTrigger_c_1916359
RI_ConstraintTrigger_a_1916406  RI_ConstraintTrigger_c_1916239  
RI_ConstraintTrigger_c_1916398
RI_ConstraintTrigger_a_1916407  RI_ConstraintTrigger_c_1916263  
RI_ConstraintTrigger_c_1916399
RI_ConstraintTrigger_a_1916411  RI_ConstraintTrigger_c_1916264  
RI_ConstraintTrigger_c_1916478
RI_ConstraintTrigger_a_1916412  RI_ConstraintTrigger_c_1916298  
RI_ConstraintTrigger_c_1916479
RI_ConstraintTrigger_a_1916416  RI_ConstraintTrigger_c_1916299  
RI_ConstraintTrigger_c_1916513
RI_ConstraintTrigger_a_1916417  RI_ConstraintTrigger_c_1916328  
RI_ConstraintTrigger_c_1916514
RI_ConstraintTrigger_a_1916421  RI_ConstraintTrigger_c_1916329  
ts_vector_update

This is a bit of an extreme case, but I don't think manually manipulating
internal triggers (which can only be done as a superuser) is a common enough
operation to justify their inclusion. I suggest adding
'AND tgisinternal is FALSE' to 'Query_for_trigger_of_table' to hide them.



Regards

Ian Barwick



--
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-06-17 Thread Abhijit Menon-Sen
At 2014-06-13 20:07:29 +0530, rahilasye...@gmail.com wrote:

 Patch named Support-for-lz4-and-snappy adds support for LZ4 and Snappy
 in PostgreSQL.

I haven't looked at this in any detail yet, but I note that the patch
creates src/common/lz4/.travis.yml, which it shouldn't.

I have a few preliminary comments about your patch.

 @@ -84,6 +87,7 @@ bool XLogArchiveMode = false;
  char*XLogArchiveCommand = NULL;
  bool EnableHotStandby = false;
  bool fullPageWrites = true;
 +int  compress_backup_block = false;

I think compress_backup_block should be initialised to
BACKUP_BLOCK_COMPRESSION_OFF. (But see below.)

 + for (j = 0; j < XLR_MAX_BKP_BLOCKS; j++)
 + compressed_pages[j] = (char *) malloc(buffer_size);

Shouldn't this use palloc?

 + * Create a compressed version of a backup block
 + *
 + * If successful, return a compressed result and set 'len' to its length.
 + * Otherwise (ie, compressed result is actually bigger than original),
 + * return NULL.
 + */
 +static char *
 +CompressBackupBlock(char *page, uint32 orig_len, char *dest, uint32 *len)
 +{

First, the calling convention is a bit strange. I understand that you're
pre-allocating compressed_pages[] so as to avoid repeated allocations;
and that you're doing it outside CompressBackupBlock so as to avoid
passing in the index i. But the result is a little weird.

At the very minimum, I would move the if (!compressed_pages_allocated)
block outside the for (i = 0; i < XLR_MAX_BKP_BLOCKS; i++) loop, and
add some comments. I think we could live with that.

But I'm not at all fond of the code in this function either. I'd write
it like this:

struct varlena *buf = (struct varlena *) dest;

if (compress_backup_block == BACKUP_BLOCK_COMPRESSION_SNAPPY)
{
    if (pg_snappy_compress(page, BLCKSZ, buf) == EIO)
        return NULL;
}
else if (compress_backup_block == BACKUP_BLOCK_COMPRESSION_LZ4)
{
    if (pg_LZ4_compress(page, BLCKSZ, buf) == 0)
        return NULL;
}
else if (compress_backup_block == BACKUP_BLOCK_COMPRESSION_PGLZ)
{
    if (pglz_compress(page, BLCKSZ, (PGLZ_Header *) buf,
                      PGLZ_strategy_default) != 0)
        return NULL;
}
else
    elog(ERROR, "Wrong value for compress_backup_block GUC");

/*
 * …comment about insisting on saving at least two bytes…
 */

if (VARSIZE(buf) >= orig_len - 2)
    return NULL;

*len = VARHDRSIZE + VARSIZE(buf);

return buf;

I guess it doesn't matter *too* much if the intention is to have all
these compression algorithms only during development/testing and pick
just one in the end. But the above is considerably easier to read in
the meanwhile.

If we were going to keep multiple compression algorithms around, I'd be
inclined to create a pg_compress(…, compression_algorithm) function to
hide these return-value differences from the callers.
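As a minimal sketch of what such a wrapper could look like — the name pg_compress, the enum values, and the stub compressors below are all hypothetical stand-ins for the patch's actual functions, used only to show the normalized return convention:

```c
#include <string.h>

/* Hypothetical algorithm identifiers; not the patch's actual enum. */
typedef enum
{
    COMPRESSION_SNAPPY,
    COMPRESSION_LZ4
} CompressionAlgorithm;

/* Stubs standing in for pg_snappy_compress()/pg_LZ4_compress().  Each
 * real library reports success differently, which is exactly what
 * pg_compress() is meant to hide.  These just copy the input. */
static int
stub_snappy(const char *src, int srclen, char *dst)
{
    memcpy(dst, src, srclen);
    return 0;                   /* snappy-style: 0 means success */
}

static int
stub_lz4(const char *src, int srclen, char *dst)
{
    memcpy(dst, src, srclen);
    return srclen;              /* LZ4-style: > 0 means success */
}

/*
 * pg_compress: normalize the differing conventions into one —
 * return the compressed length on success, -1 on failure.
 */
int
pg_compress(const char *src, int srclen, char *dst, CompressionAlgorithm algo)
{
    switch (algo)
    {
        case COMPRESSION_SNAPPY:
            return stub_snappy(src, srclen, dst) == 0 ? srclen : -1;
        case COMPRESSION_LZ4:
        {
            int n = stub_lz4(src, srclen, dst);

            return n > 0 ? n : -1;
        }
        default:
            return -1;          /* unknown algorithm */
    }
}
```

Callers then need only one failure check (result < 0), regardless of which library was configured.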

 + else if (VARATT_IS_COMPRESSED((struct varlena *) blk) &&
 + compress_backup_block != BACKUP_BLOCK_COMPRESSION_OFF)
 + {
 + if (compress_backup_block == BACKUP_BLOCK_COMPRESSION_SNAPPY)
 + {
 + int ret;
 + size_t compressed_length = VARSIZE((struct varlena *) blk) - VARHDRSZ;
 + char *compressed_data = (char *) VARDATA((struct varlena *) blk);
 + size_t s_uncompressed_length;
 +
 + ret = snappy_uncompressed_length(compressed_data,
 + compressed_length,
 + &s_uncompressed_length);
 + if (!ret)
 + elog(ERROR, "snappy: failed to determine compression length");
 + if (BLCKSZ != s_uncompressed_length)
 + elog(ERROR, "snappy: compression size mismatch %d != %zu",
 + BLCKSZ, s_uncompressed_length);
 +
 + ret = snappy_uncompress(compressed_data,
 + compressed_length,
 + page);
 + if (ret != 0)
 + elog(ERROR, "snappy: decompression failed: %d", ret);
 + }

…and a pg_decompress() function that does error checking.

 +static const struct config_enum_entry backup_block_compression_options[] = {
 + {"off", BACKUP_BLOCK_COMPRESSION_OFF, false},
 + {"false", BACKUP_BLOCK_COMPRESSION_OFF, true},
 + {"no", BACKUP_BLOCK_COMPRESSION_OFF, true},
 + {"0", BACKUP_BLOCK_COMPRESSION_OFF, true},
 + {"pglz", BACKUP_BLOCK_COMPRESSION_PGLZ, true},
 + {"snappy", BACKUP_BLOCK_COMPRESSION_SNAPPY, true},
 + {"lz4", BACKUP_BLOCK_COMPRESSION_LZ4, true},
 + {NULL, 0, false}
 +};

Finally, I don't like the name compress_backup_block.

1. It should have been plural (compress_backup_blockS).

2. 

[HACKERS] releaseOk and LWLockWaitForVar

2014-06-17 Thread Andres Freund
Hi Heikki, All,

Amit just pointed me to a case where the lwlock scalability patch
apparently causes problems and I went on to review it and came across
the following problem in 9.4/master:
LWLockWaitForVar() doesn't set releaseOk to true when waiting
again. Isn't that a bug? What if there's another locker coming in after
LWLockWaitForVar() returns from the PGSemaphoreLock() but before it has
acquired the spinlock? Now, it might be that it's unproblematic because
of the specific way these locks are used right now, but it doesn't seem
like a good idea to leave it that way.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-06-17 Thread Amit Kapila
On Tue, Jun 17, 2014 at 3:56 PM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2014-06-17 12:41:26 +0530, Amit Kapila wrote:
  On Fri, May 23, 2014 at 10:01 PM, Amit Kapila amit.kapil...@gmail.com
  wrote:
   On Fri, Jan 31, 2014 at 3:24 PM, Andres Freund and...@2ndquadrant.com

  wrote:
I've pushed a rebased version of the patchset to
http://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git
branch rwlock contention.
220b34331f77effdb46798ddd7cca0cffc1b2858 actually was the small
problem,
ea9df812d8502fff74e7bc37d61bdc7d66d77a7f was the major PITA.
  
   As per discussion in developer meeting, I wanted to test shared
   buffer scaling patch with this branch.  I am getting merge
   conflicts as per HEAD.  Could you please get it resolved, so that
   I can get the data.
 
  I have started looking into this patch and have few questions/
  findings which are shared below:
 
  1. I think stats for lwstats-ex_acquire_count will be counted twice,
  first it is incremented in LWLockAcquireCommon() and then in
  LWLockAttemptLock()

 Hrmpf. Will fix.

  2.
  Handling of the potentially_spurious case seems to be pending
  in LWLock functions like LWLockAcquireCommon().
 
  LWLockAcquireCommon()
  {
  ..
  /* add to the queue */
  LWLockQueueSelf(l, mode);
 
  /* we're now guaranteed to be woken up if necessary */
  mustwait = LWLockAttemptLock(l, mode, false, &potentially_spurious);
 
  }
 
  I think it can lead to some problems, example:
 
  Session -1
  1. Acquire Exclusive LWlock
 
  Session -2
  1. Acquire Shared LWlock
 
  1a. Unconditionally incrementing shared count by session-2
 
  Session -1
  2. Release Exclusive lock
 
  Session -3
  1. Acquire Exclusive LWlock
  It will put itself to wait queue by seeing the lock count incremented
  by Session-2
 
  Session-2
  1b. Decrement the shared count and add it to wait queue.
 
  Session-4
  1. Acquire Exclusive lock
 This session will get the exclusive lock, because even
 though other lockers are waiting, lockcount is zero.
 
  Session-2
  2. Try second time to take shared lock, it won't get
 as session-4 already has an exclusive lock, so it will
 start waiting
 
  Session-4
  2. Release Exclusive lock
 it will not wake the waiters because waiters have been added
 before acquiring this lock.

 I don't understand this step here? When releasing the lock it'll notice
 that the number of waiters is > 0 and acquire the spinlock which should
 protect against badness here?

While releasing the lock, I think it will not go to wake up the waiters
(LWLockWakeup), because releaseOK will be false.  releaseOK
can be set to false when Session-1 has released the Exclusive lock
and woken up some previous waiter.  Once it is set to false, it can
be reset to true only by the retry logic (after getting the semaphore).

  3.
 I don't think there's dangers here, lwWaiting will only ever be
 manipulated by the PGPROC's owner. As discussed elsewhere there
 needs to be a write barrier before the proc->lwWaiting = false, even in
 upstream code.

Agreed.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Hannu Krosing
On 06/17/2014 11:22 AM, Vik Fearing wrote:
 On 06/17/2014 09:43 AM, Hannu Krosing wrote:
 On 06/14/2014 09:35 PM, Tom Lane wrote:
 As I mentioned awhile ago, I'm thinking about implementing the
 SQL-standard construct

UPDATE foo SET ..., (a,b,...) = (SELECT x,y,...), ...

 I've run into a rather nasty problem, which is how does this interact
 with expansion of NEW references in ON UPDATE rules?  
  Wasn't there a plan (consensus?) about deprecating rules altogether?
 I believe that was just for user access to them, ie CREATE RULE.  I
 don't think there was ever question of purging them from the code base.
But are there any cases where UPDATE rules are created behind the scenes?

-- 
Hannu Krosing
PostgreSQL Consultant
Performance, Scalability and High Availability
2ndQuadrant Nordic OÜ





Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Andres Freund
On 2014-06-17 11:22:17 +0200, Vik Fearing wrote:
 On 06/17/2014 09:43 AM, Hannu Krosing wrote:
  On 06/14/2014 09:35 PM, Tom Lane wrote:
   As I mentioned awhile ago, I'm thinking about implementing the
   SQL-standard construct
  
UPDATE foo SET ..., (a,b,...) = (SELECT x,y,...), ...
  
   I've run into a rather nasty problem, which is how does this interact
   with expansion of NEW references in ON UPDATE rules?  
 
   Wasn't there a plan (consensus?) about deprecating rules altogether?
 
 I believe that was just for user access to them, ie CREATE RULE.  I
 don't think there was ever question of purging them from the code base.

I don't think any such consensus has been reached? I wish it were, but the
last discussions about it imo ended quite differently.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-06-17 Thread Andres Freund
On 2014-06-17 18:01:58 +0530, Amit Kapila wrote:
 On Tue, Jun 17, 2014 at 3:56 PM, Andres Freund and...@2ndquadrant.com
  On 2014-06-17 12:41:26 +0530, Amit Kapila wrote:
   2.
   Handling of the potentially_spurious case seems to be pending
   in LWLock functions like LWLockAcquireCommon().
  
   LWLockAcquireCommon()
   {
   ..
   /* add to the queue */
   LWLockQueueSelf(l, mode);
  
   /* we're now guaranteed to be woken up if necessary */
   mustwait = LWLockAttemptLock(l, mode, false, &potentially_spurious);
  
   }
  
   I think it can lead to some problems, example:
  
   Session -1
   1. Acquire Exclusive LWlock
  
   Session -2
   1. Acquire Shared LWlock
  
   1a. Unconditionally incrementing shared count by session-2
  
   Session -1
   2. Release Exclusive lock
  
   Session -3
   1. Acquire Exclusive LWlock
   It will put itself to wait queue by seeing the lock count incremented
   by Session-2
  
   Session-2
   1b. Decrement the shared count and add it to wait queue.
  
   Session-4
   1. Acquire Exclusive lock
  This session will get the exclusive lock, because even
  though other lockers are waiting, lockcount is zero.
  
   Session-2
   2. Try second time to take shared lock, it won't get
  as session-4 already has an exclusive lock, so it will
  start waiting
  
   Session-4
   2. Release Exclusive lock
  it will not wake the waiters because waiters have been added
  before acquiring this lock.
 
  I don't understand this step here? When releasing the lock it'll notice
  that the number of waiters is > 0 and acquire the spinlock which should
  protect against badness here?
 
 While releasing the lock, I think it will not go to wake up the waiters
 (LWLockWakeup), because releaseOK will be false.  releaseOK
 can be set to false when Session-1 has released the Exclusive lock
 and woken up some previous waiter.  Once it is set to false, it can
 be reset to true only by the retry logic (after getting the semaphore).

I unfortunately still can't follow. If Session-1 woke up some previous
waiter the woken up process will set releaseOK to true again when it
loops to acquire the lock?


Somewhat unrelated:

I have a fair amount of doubt about the effectiveness of the releaseOK
logic (which imo also is pretty poorly documented).
Essentially its intent is to avoid unnecessary scheduling when other
processes have already been woken up (i.e. releaseOK has been set to
false). I believe the theory is that if any process has already been
woken up it's pointless to wake up additional processes
(i.e. PGSemaphoreUnlock()) because the originally woken up process will
wake up at some point. But if the to-be-woken up process is scheduled
out because it used all its timeslices fully, that means we'll not
wake up other waiters for a relatively long time.

It's been introduced in the course of
5b9a058384e714b89e050fc0b6381f97037c665a whose logic generally is rather
sound - I just doubt that the releaseOK part is necessary.

It'd certainly be interesting to rip releaseOK out and benchmark the
result... My theory is that the average latency will go down on busy
systems that aren't IO bound.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] How to implement the skip errors for copy from ?

2014-06-17 Thread Tom Lane
xbzhang xbzh...@kingbase.com.cn writes:
 LWLocks can be recorded in the resource owner per tuple, so they can be
 released the right way, but the memory allocated in a memory context is
 one problem.  Are there any other problems?

See AbortSubTransaction(), CleanupSubTransaction(), and the rather large
number of subroutines they call.  Almost everything that code does is
connected to cleaning up something that might have been left unfinished
after an elog(ERROR) took control away in the middle of some code
sequence.

In addition, you can't just wave your hands and presto the bad tuple is
not there anymore.  For example, the failure might have been a unique key
violation in some index or other.  Not only is the bad tuple already on
disk, but possibly so are index entries for it in other indexes.  In
general the only way to get rid of those index entries is a VACUUM.
So you really have to have a subtransaction whose XID is what you mark
the new tuple with, and then rolling back the subtransaction is what
causes the new tuple to not be seen as good.  (Actually getting rid of
it will be left for the next VACUUM.)
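The batch-then-retry strategy Pavel described upthread (one subtransaction per block of rows, falling back to one per row only when a block fails) can be sketched in plain C. This simulates only the control flow — insert_row() stands in for a real INSERT hitting a unique index, and the "rollback" just undoes the toy index; none of this is PostgreSQL code:

```c
#include <stdbool.h>

#define BATCH_SIZE 1000
#define KEYSPACE   100

static bool seen[KEYSPACE];     /* toy unique index over small int keys */

/* Stand-in for an INSERT: fails (returns false) on a duplicate key. */
static bool
insert_row(int key)
{
    if (seen[key])
        return false;
    seen[key] = true;
    return true;
}

/*
 * Load nrows keys.  Each batch is first tried optimistically under one
 * "subtransaction"; if any row fails, the batch is rolled back and
 * replayed with one "subtransaction" per row, skipping only the bad
 * rows.  Returns the number of rows that went in.
 */
static int
copy_skip_errors(const int *keys, int nrows)
{
    int inserted = 0;

    for (int start = 0; start < nrows; )
    {
        int end = start + BATCH_SIZE;
        int done = start;       /* rows inserted so far in this batch */
        bool failed = false;

        if (end > nrows)
            end = nrows;

        /* optimistic pass: the whole batch under one subtransaction */
        for (int i = start; i < end; i++)
        {
            if (!insert_row(keys[i]))
            {
                failed = true;
                break;
            }
            done = i + 1;
        }

        if (failed)
        {
            /* ROLLBACK TO SAVEPOINT: undo only this batch's inserts... */
            for (int i = start; i < done; i++)
                seen[keys[i]] = false;
            /* ...then retry with one subtransaction per row */
            for (int i = start; i < end; i++)
                if (insert_row(keys[i]))
                    inserted++;
        }
        else
            inserted += end - start;

        start = end;
    }
    return inserted;
}
```

On error-free input this pays one subtransaction per BATCH_SIZE rows; only batches containing a bad row pay the per-row cost.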

regards, tom lane




Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-17 11:22:17 +0200, Vik Fearing wrote:
 On 06/17/2014 09:43 AM, Hannu Krosing wrote:
  Wasn't there a plan (consensus?) about deprecating rules altogether?

 I believe that was just for user access to them, ie CREATE RULE.  I
 don't think there was ever question of purging them from the code base.

  I don't think any such consensus has been reached? I wish it were, but the
  last discussions about it imo ended quite differently.

Yeah, I don't think there's any prospect of removing them in the near
future.  We'd need a (better-designed) replacement feature first.

regards, tom lane




Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Andres Freund
On 2014-06-17 09:46:13 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-17 11:22:17 +0200, Vik Fearing wrote:
  On 06/17/2014 09:43 AM, Hannu Krosing wrote:
   Wasn't there a plan (consensus?) about deprecating rules altogether?
 
  I believe that was just for user access to them, ie CREATE RULE.  I
  don't think there was ever question of purging them from the code base.
 
   I don't think any such consensus has been reached? I wish it were, but the
   last discussions about it imo ended quite differently.
 
 Yeah, I don't think there's any prospect of removing them in the near
 future.  We'd need a (better-designed) replacement feature first.

IMO INSTEAD triggers pretty much are that. We only need to make them
work for normal relations as well (partitioning!) and we're pretty much
there.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] avoiding tuple copying in btree index builds

2014-06-17 Thread Robert Haas
On Mon, Jun 16, 2014 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On further review, this is definitely the way to go: it's a
 straight-up win.  The isnull array is never more than one element in
 length, so testing the single element is quite trivial.   The
 attached, revised patch provides a modest but useful speedup for both
 hash and btree index builds.

 Anyone see any reason NOT to do this?  So far it looks like a
 slam-dunk from here.

 If index_form_tuple leaks any memory in the sort context, this would be
 not so much a slam-dunk win as a complete disaster.  I'm not sure that
 no-leak is a safe assumption, especially in cases where we do toast
 compression of the index datums.  (In fact, I'm pretty sure that the
 reason it was coded like this originally was exactly to avoid that
 assumption.)

Yes, that would be bad.  The reason I think it's probably OK is that
index_form_tuple seems to have had, from ancient times (cf.
f67e79045), code that takes non-trivial care to free any memory that's allocated as
a result of de-TOASTing, which we presumably would not bother to do if
we were counting on memory context resets to clean things up.  It
calls toast_compress_datum(), which is similarly careful.  The most
complicated code path appears to be the VARATT_IS_EXTERNAL() case,
where we call heap_tuple_fetch_attr(), which calls
toast_fetch_datum().  I'm kind of wondering whether that can really
happen, though, because I didn't think indexes could use TOAST
storage.  Even if it does happen I don't see any obvious reason why it
should leak, but it'd be harder to be certain that it doesn't.

 Assuming that you inspect the code and determine it's safe, the patch
 had better decorate index_form_tuple with dire warnings about not leaking
 memory; because even if it's safe today, somebody might break it tomorrow.

 On a micro-optimization level, it might be worth passing the TID as
 ItemPointer not ItemPointerData (ie, pass a pointer until we get to
 the point of actually inserting the TID into the index tuple).
 I'm not sure that copying odd-size structs should be assumed to be
 efficient.

Yeah, true.  Checking existing precedent, it looks like we usually
pass ItemPointer rather than ItemPointerData, so it's probably a good
idea to do this that way too for reasons of style if nothing else.  I
kind of wonder whether it's really more efficient to pass an 8-byte
pointer to a 6-byte structure than to just pass the structure itself,
but it might be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 Would it not be possible to use WITH here, like:

 WITH bar AS ( ... subselect ... )
 INSERT INTO foolog VALUES (bar.a, bar.b, ...)

Don't think it works if the sub-select is correlated.

Consider something like

UPDATE summary_table s
  SET (sumx, sumy) = (SELECT sum(x), sum(y) FROM detail_table d
  WHERE d.group = s.group)

and suppose we have a logging rule like the above on summary_table.
You can't push the sub-select into a WITH because it depends on
s.group.  With sufficient intelligence you could rewrite the query
entirely, I guess, but no simple transformation is going to cope.

But come to think of it, WITH is already an interesting precedent: if you
look into rewriteHandler.c you'll notice a boatload of corner cases where
the rewriter just throws up its hands for various combinations of rules
and statements containing WITH.  So maybe that lends a bit more weight
to Andres' position that it's okay to consider this an unimplemented
feature.

regards, tom lane




Re: [HACKERS] avoiding tuple copying in btree index builds

2014-06-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 16, 2014 at 8:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 On a micro-optimization level, it might be worth passing the TID as
 ItemPointer not ItemPointerData (ie, pass a pointer until we get to
 the point of actually inserting the TID into the index tuple).
 I'm not sure that copying odd-size structs should be assumed to be
 efficient.

 Yeah, true.  Checking existing precedent, it looks like we usually
 pass ItemPointer rather than ItemPointerData, so it's probably a good
 idea to do this that way too for reasons of style if nothing else.  I
 kind of wonder whether it's really more efficient to pass an 8-byte
 pointer to a 6-byte structure than to just pass the structure itself,
 but it might be.

The pointer will certainly be passed in a register, or whatever passes for
registers on the particular machine architecture.  Weird-size structs,
though, tend to have arcane and not-so-efficient rules for being passed
by value.  It's not unlikely that what the compiler will do under the hood
is pass a pointer anyhow, and then do a memcpy to make a local copy in
the called function.
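To make the point concrete, here is a stand-alone illustration — TidLike is a made-up 6-byte struct shaped like ItemPointerData, not the real type. Both calling styles compute the same block number; the pointer version merely passes one register-sized argument instead of copying an odd-size struct at each call:

```c
#include <stdint.h>

/* A 6-byte struct laid out like ItemPointerData: a block number split
 * into two uint16s plus an offset number.  (Illustrative type only.) */
typedef struct
{
    uint16_t bi_hi;
    uint16_t bi_lo;
    uint16_t offset;
} TidLike;

/* Pass the 6-byte struct by value: depending on the ABI this may be
 * passed in pieces or via a hidden memcpy, as described above. */
static uint32_t
block_byval(TidLike tid)
{
    return ((uint32_t) tid.bi_hi << 16) | tid.bi_lo;
}

/* Pass a pointer instead: always a single register-sized argument. */
static uint32_t
block_byref(const TidLike *tid)
{
    return ((uint32_t) tid->bi_hi << 16) | tid->bi_lo;
}
```

Either way the caller-visible result is identical, so switching an internal API from by-value to by-pointer is purely a calling-convention change.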

regards, tom lane




Re: [HACKERS] Built-in binning functions

2014-06-17 Thread Robert Haas
On Fri, Jun 13, 2014 at 8:22 PM, Petr Jelinek p...@2ndquadrant.com wrote:
 here is a patch implementing varwidth_bucket (naming is up for discussion)
 function which does binning with variable bucket width. The use-cases are
 same as for width_bucket (=data analytics, mainly histograms), the
 difference is that width_bucket uses buckets of same width but the
 varwidth_bucket accepts a sorted array of right-bound thresholds to define
 the individual buckets.

 Currently applications implement this with long CASE statements which are
 quite hard to read/maintain and are much slower than this implementation
 which uses binary search.
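The binary-search lookup that replaces those CASE chains is easy to sketch; the bucket-numbering convention below (1-based, right-bound inclusive, nthresh + 1 for overflow) is illustrative and not necessarily the exact convention the patch implements:

```c
/*
 * Return the 1-based bucket for value, given a sorted array of
 * right-bound (inclusive) thresholds; values above the last
 * threshold land in bucket nthresh + 1.  O(log n) per lookup,
 * versus O(n) for an equivalent CASE chain.
 */
int
varwidth_bucket_int(int value, const int *thresholds, int nthresh)
{
    int lo = 0;
    int hi = nthresh;

    /* binary search for the first threshold >= value */
    while (lo < hi)
    {
        int mid = lo + (hi - lo) / 2;

        if (thresholds[mid] < value)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo + 1;
}
```

With thresholds {10, 20, 30} this bins values into (-inf,10], (10,20], (20,30], and (30,inf) as buckets 1 through 4.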

 There are 3 actual functions, one generic and two faster versions for the
 int8 and float8 input that take advantage of the static width of those
 types.

I wonder if stuff like this shouldn't live in contrib rather than
core, but I guess it's probably not worth it for 3 functions.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-06-17 Thread Robert Haas
On Fri, Jun 13, 2014 at 8:31 PM, Petr Jelinek p...@2ndquadrant.com wrote:
 attached is a simple patch which makes it possible to change the system
 identifier of the cluster in pg_control. This is useful for
 individualization of the instance that is started on top of data directory
 produced by pg_basebackup - something that's helpful for logical replication
 setup where you need to easily identify each node (it's used by
 Bidirectional Replication for example).

I can clearly understand the utility of being able to reset the system
ID to a new, randomly-generated system ID - but giving the user the
ability to set a particular value of their own choosing seems like a
pretty sharp tool.  What is the use case for that?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Built-in binning functions

2014-06-17 Thread Petr Jelinek

On 17/06/14 16:15, Robert Haas wrote:

On Fri, Jun 13, 2014 at 8:22 PM, Petr Jelinek p...@2ndquadrant.com wrote:

here is a patch implementing varwidth_bucket (naming is up for discussion)
function which does binning with variable bucket width. The use-cases are
same as for width_bucket (=data analytics, mainly histograms), the
difference is that width_bucket uses buckets of same width but the
varwidth_bucket accepts a sorted array of right-bound thresholds to define
the individual buckets.


I wonder if stuff like this shouldn't live in contrib rather than
core, but I guess it's probably not worth it for 3 functions.



I was wondering the same and I also think that 3 simple functions is not
that much, plus it seems a natural extension of the existing width_bucket.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-06-17 Thread MauMau

Hello,

As I proposed before in the thread below, I've implemented a simple command 
for reliable WAL archiving.  I would appreciate it if you could review and 
test the patch.


http://www.postgresql.org/message-id/9C1EB95CA1F34DAB93DF549A51E3E874@maumau

Regards
MauMau


pg_copy.patch
Description: Binary data



Re: [HACKERS] Quantify small changes to predicate evaluation

2014-06-17 Thread Dennis Butterstein
Hi Marti,

thank you for your quick reply. I tried the proposed tweaks and see
some differences in the measurements. It seems as if the overall
query performance dropped a little, which I think the disabled turbo boost
mode is responsible for (all measurements are single queries only). I think
that kind of behaviour is somewhat expected.

Run 1: 26.559s
Run 2: 28.280s

Unfortunately the variance between the runs seems to remain high. In
fact I have exclusive access to the machine, and I made sure not to run into
any I/O-related problems regarding buffer caches. Are there any other
stumbling blocks I'm missing at the moment? Maybe I have to rethink my
(end-to-end) measurement strategy. In your experience, how small is it
possible to get measurement variances on Postgres?

Thank you very much.

Kind regards,
Dennis




Re: [HACKERS] Minmax indexes

2014-06-17 Thread Robert Haas
On Sat, Jun 14, 2014 at 10:34 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Robert Haas wrote:
 On Wed, Sep 25, 2013 at 4:34 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Here's an updated version of this patch, with fixes to all the bugs
  reported so far.  Thanks to Thom Brown, Jaime Casanova, Erik Rijkers and
  Amit Kapila for the reports.

 I'm not very happy with the use of a separate relation fork for
 storing this data.

 Here's a new version of this patch.  Now the revmap is not stored in a
 separate fork, but together with all the regular data, as explained
 elsewhere in the thread.

Cool.

Have you thought more about this comment from Heikki?

http://www.postgresql.org/message-id/52495dd3.9010...@vmware.com

I'm concerned that we could end up with one index type of this general
nature for min/max type operations, and then another very similar
index type for geometric operators or text-search operators or what
have you.  Considering the overhead in adding and maintaining an index
AM, I think we should try to be sure that we've done a reasonably
solid job making each one as general as we reasonably can.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] UPDATE SET (a,b,c) = (SELECT ...) versus rules

2014-06-17 Thread Merlin Moncure
On Tue, Jun 17, 2014 at 9:02 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 But come to think of it, WITH is already an interesting precedent: if you
 look into rewriteHandler.c you'll notice a boatload of corner cases where
 the rewriter just throws up its hands for various combinations of rules
 and statements containing WITH.  So maybe that lends a bit more weight
 to Andres' position that it's okay to consider this an unimplemented
 feature.

This reflects previous consensus AIUI.  RULES came up in a similar way
with the 'data-modifying WITH' feature; it was decided that as long as
old stuff didn't break, new features don't necessarily have to go
through the motions.  This essentially deprecates rules IMO, which is
fine.  Maybe a small adjustment of the note in the rule documentation
wouldn't hurt; it currently warns only about performance. A heads-up
that current and future SQL features might not be fully supported
would be nice.

merlin




Re: [HACKERS] Minmax indexes

2014-06-17 Thread Andres Freund
On 2014-06-17 10:26:11 -0400, Robert Haas wrote:
 On Sat, Jun 14, 2014 at 10:34 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Robert Haas wrote:
  On Wed, Sep 25, 2013 at 4:34 PM, Alvaro Herrera
  alvhe...@2ndquadrant.com wrote:
   Here's an updated version of this patch, with fixes to all the bugs
   reported so far.  Thanks to Thom Brown, Jaime Casanova, Erik Rijkers and
   Amit Kapila for the reports.
 
  I'm not very happy with the use of a separate relation fork for
  storing this data.
 
  Here's a new version of this patch.  Now the revmap is not stored in a
  separate fork, but together with all the regular data, as explained
  elsewhere in the thread.
 
 Cool.
 
 Have you thought more about this comment from Heikki?
 
 http://www.postgresql.org/message-id/52495dd3.9010...@vmware.com

Is there actually a significant usecase behind that wish or just a
general demand for being generic? To me it seems fairly unlikely you'd
end up with something useful by doing a minmax index over bounding
boxes.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-06-17 Thread Petr Jelinek

On 17/06/14 16:18, Robert Haas wrote:

On Fri, Jun 13, 2014 at 8:31 PM, Petr Jelinek p...@2ndquadrant.com wrote:

attached is a simple patch which makes it possible to change the system
identifier of the cluster in pg_control. This is useful for
individualization of the instance that is started on top of data directory
produced by pg_basebackup - something that's helpful for logical replication
setup where you need to easily identify each node (it's used by
Bidirectional Replication for example).


I can clearly understand the utility of being able to reset the system
ID to a new, randomly-generated system ID - but giving the user the
ability to set a particular value of their own choosing seems like a
pretty sharp tool.  What is the use case for that?



Let's say you want to initialize a new logical replication node via 
pg_basebackup, and you want your replication slots to be easily 
identifiable, so you use your local system id as part of the slot name.


In that case you need to know the future system id of the node, because 
you need to register the slot before the consistent point to which you 
replay via streaming replication (and you can't replay anymore once you 
have changed the system id). That means you need to generate your system id 
in advance and be able to change it in pg_control later.
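For reference, initdb derives the system identifier from the current time (with the pid mixed into the low bits in recent versions). Pre-generating an identifier in the same spirit can be sketched as follows; the exact bit layout used by BootStrapXLOG varies across versions, so the layout below is an assumption, not the real formula:

```python
import os
import time

def generate_system_identifier():
    # Roughly in the spirit of initdb: epoch seconds in the high 32
    # bits, microseconds and pid mixed into the low bits.  The exact
    # layout here is an assumption for illustration only.
    now = time.time()
    secs = int(now)
    usecs = int((now - secs) * 1_000_000)
    ident = (secs << 32) | (usecs << 12) | (os.getpid() & 0xFFF)
    return ident & 0xFFFFFFFFFFFFFFFF  # keep it a 64-bit value

sysid = generate_system_identifier()
assert 0 < sysid < 2**64
```

The point is only that such an id is trivially pre-computable, which is why being able to write a chosen value into pg_control is useful for this workflow.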


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-06-17 Thread Abhijit Menon-Sen
At 2014-06-17 23:26:37 +0900, maumau...@gmail.com wrote:

 Hello,
 
 As I proposed before in the thread below, I've implemented a simple
 command for reliable WAL archiving.  I would appreciate it if you
 could review and test the patch.

Please add your patch to the next CF (i.e. 2014-08), so that it gets
review attention at the appropriate time.

-- Abhijit




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-06-17 Thread Joe Conway

On 06/17/2014 07:26 AM, MauMau wrote:
 Hello,
 
 As I proposed before in the thread below, I've implemented a simple
 command for reliable WAL archiving.  I would appreciate it if you could
 review and test the patch.
 
 http://www.postgresql.org/message-id/9C1EB95CA1F34DAB93DF549A51E3E874@maumau

That first hunk refers to dblink -- I'm guessing it does not belong with
this patch.

Joe


-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support




Re: [HACKERS] [patch] pg_copy - a command for reliable WAL archiving

2014-06-17 Thread Abhijit Menon-Sen
At 2014-06-17 08:02:59 -0700, m...@joeconway.com wrote:

 That first hunk refers to dblink -- I'm guessing it does not belong
 with this patch.

Yes, it's a leftover of the dblink memory leak patch that's in this CF.

-- Abhijit




Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-06-17 Thread Amit Kapila
On Tue, Jun 17, 2014 at 6:35 PM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2014-06-17 18:01:58 +0530, Amit Kapila wrote:
  On Tue, Jun 17, 2014 at 3:56 PM, Andres Freund and...@2ndquadrant.com
   On 2014-06-17 12:41:26 +0530, Amit Kapila wrote:
2.
Handling of the potentially_spurious case seems to be pending
in LWLock functions like LWLockAcquireCommon().
   
LWLockAcquireCommon()
{
..
/* add to the queue */
LWLockQueueSelf(l, mode);
   
/* we're now guaranteed to be woken up if necessary */
mustwait = LWLockAttemptLock(l, mode, false, potentially_spurious);
   
}
   
I think it can lead to some problems, example:
   
Session -1
1. Acquire Exclusive LWlock
   
Session -2
1. Acquire Shared LWlock
   
1a. Unconditionally incrementing shared count by session-2
   
Session -1
2. Release Exclusive lock
   
Session -3
1. Acquire Exclusive LWlock
It will put itself to wait queue by seeing the lock count
incremented
by Session-2
   
Session-2
1b. Decrement the shared count and add it to wait queue.
   
Session-4
1. Acquire Exclusive lock
   This session will get the exclusive lock, because even
   though other lockers are waiting, lockcount is zero.
   
Session-2
2. Try second time to take shared lock, it won't get
   as session-4 already has an exclusive lock, so it will
   start waiting
   
Session-4
2. Release Exclusive lock
   it will not wake the waiters because waiters have been added
   before acquiring this lock.
  
   I don't understand this step here? When releasing the lock it'll
notice
   that the waiters is  0 and acquire the spinlock which should protect
   against badness here?
 
  While Releasing lock, I think it will not go to Wakeup waiters
  (LWLockWakeup), because releaseOK will be false.  releaseOK
  can be set as false when Session-1 has Released Exclusive lock
  and wakedup some previous waiter.  Once it is set to false, it can
  be reset to true only for retry logic(after getting semaphore).

 I unfortunately still can't follow.

You have followed it pretty well as far as I can understand from your
replies; as there is no reproducible test (which I think is a bit tricky to
prepare), it becomes difficult to explain by theory alone.

 If Session-1 woke up some previous
 waiter the woken up process will set releaseOK to true again when it
 loops to acquire the lock?

You are right, it will wake up the existing waiters, but I think the
new logic has one difference, which is that it can allow a backend to
take the Exclusive lock when there are already waiters in the queue.  As per
the above example, even though Session-2 and Session-3 are in the wait
queue, Session-4 will be able to acquire the Exclusive lock, which I think
was previously not possible.
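The interleaving above can be written down as a toy state machine. This is a plain-Python model of the scenario only (all names are invented); it is not the lwlock.c implementation:

```python
# Toy model of the "unconditionally increment shared count, then
# recheck" scheme discussed above.  It only replays the interleaving
# from the example; it is not the real lwlock code.
class ToyLWLock:
    def __init__(self):
        self.lockcount = 0       # shared-holder count
        self.exclusive = None    # current exclusive holder, if any
        self.waiters = []        # wait queue

    def acquire_exclusive(self, who):
        if self.lockcount == 0 and self.exclusive is None:
            self.exclusive = who          # granted, queue is ignored
            return True
        self.waiters.append(who)
        return False

    def release_exclusive(self):
        self.exclusive = None

lock = ToyLWLock()
assert lock.acquire_exclusive("s1")      # step 1: s1 holds exclusive
lock.lockcount += 1                      # step 1a: s2 increments unconditionally
lock.release_exclusive()                 # step 2: s1 releases
assert not lock.acquire_exclusive("s3")  # s3 sees lockcount != 0 and queues
lock.lockcount -= 1                      # step 1b: s2 undoes its increment...
lock.waiters.append("s2")                # ...and queues itself
assert lock.acquire_exclusive("s4")      # s4 gets the lock despite the waiters
assert lock.waiters == ["s3", "s2"]
```

The final two assertions are exactly the claimed anomaly: Session-4 acquires the exclusive lock while Session-2 and Session-3 sit in the wait queue.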


 Somewhat unrelated:

 I have a fair amount of doubt about the effectiveness of the releaseOK
 logic (which imo also is pretty poorly documented).
 Essentially its intent is to avoid unneccessary scheduling when other
 processes have already been woken up (i.e. releaseOK has been set to
 false). I believe the theory is that if any process has already been
 woken up it's pointless to wake up additional processes
 (i.e. PGSemaphoreUnlock()) because the originally woken up process will
 wake up at some point. But if the to-be-woken up process is scheduled
 out because it used all his last timeslices fully that means we'll not
 wakeup other waiters for a relatively long time.

I think it also ensures that the woken-up process won't stall for a
very long time: if we wake new waiters, then the previously woken
process can end up in the wait queue again, and that can repeat
for a long time.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-06-17 Thread Andres Freund
On 2014-06-17 20:47:51 +0530, Amit Kapila wrote:
 On Tue, Jun 17, 2014 at 6:35 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  On 2014-06-17 18:01:58 +0530, Amit Kapila wrote:
   On Tue, Jun 17, 2014 at 3:56 PM, Andres Freund and...@2ndquadrant.com
On 2014-06-17 12:41:26 +0530, Amit Kapila wrote:
  I unfortunately still can't follow.
 
 You have followed it pretty well as far as I can understand from your
 replies, as there is no reproducible test (which I think is bit tricky to
 prepare), so it becomes difficult to explain by theory.

I'm working an updated patch that moves the releaseOK into the
spinlocks. Maybe that's the problem already - it's certainly not correct
as is.

  If Session-1 woke up some previous
  waiter the woken up process will set releaseOK to true again when it
  loops to acquire the lock?
 
 You are right, it will wakeup the existing waiters, but I think the
 new logic has one difference which is that it can allow the backend to
 take Exclusive lock when there are already waiters in queue.  As per
 above example even though Session-2 and Session-3 are in wait
 queue, Session-4 will be able to acquire Exclusive lock which I think
 was previously not possible.

I think that was previously possible as well - in a slightly different
set of circumstances though. After a process releases a lock and wakes
up some of several waiters, another process can come in and acquire the
lock before the woken-up process gets scheduled again. lwlocks aren't
fair locks...

  Somewhat unrelated:
 
  I have a fair amount of doubt about the effectiveness of the releaseOK
  logic (which imo also is pretty poorly documented).
  Essentially its intent is to avoid unneccessary scheduling when other
  processes have already been woken up (i.e. releaseOK has been set to
  false). I believe the theory is that if any process has already been
  woken up it's pointless to wake up additional processes
  (i.e. PGSemaphoreUnlock()) because the originally woken up process will
  wake up at some point. But if the to-be-woken up process is scheduled
  out because it used all his last timeslices fully that means we'll not
  wakeup other waiters for a relatively long time.
 
 I think it will also maintain that the wokedup process won't stall for
 very long time, because if we wake new waiters, then previously woked
 process can again enter into wait queue and similar thing can repeat
 for long time.

I don't think it effectively does that - newly incoming lockers ignore
the queue and just acquire the lock, even if there's some other backend
scheduled to wake up. And shared locks can be acquired when there are
exclusive locks waiting.

I think both are actually critical for performance... Otherwise even an
only lightly contended lock would require scheduler activity when a
process tries to lock something twice. Given the frequency with which we
acquire some locks, that'd be disastrous...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Minmax indexes

2014-06-17 Thread Greg Stark
On Tue, Jun 17, 2014 at 3:31 PM, Andres Freund and...@2ndquadrant.com wrote:
 Is there actually a significant usecase behind that wish or just a
 general demand for being generic? To me it seems fairly unlikely you'd
 end up with something useful by doing a minmax index over bounding
 boxes.

Isn't min/max just a 2d bounding box? If you do a bulk data load of
something like the census data then sure, every page will have data
points for some geometrically clustered set of data.

I had in mind to do a small bloom filter per block. In general any
kind of predicate, like a bounding box, should work.
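A per-block Bloom filter of the kind described here can be sketched as follows (pure Python; the filter size and hash choices are arbitrary, and all names are invented for illustration):

```python
import hashlib

class BlockBloom:
    """Tiny per-block Bloom filter: summarizes one block of values so
    a scan can skip blocks that certainly don't contain the queried
    value.  False positives are possible, false negatives are not."""
    def __init__(self, values, nbits=256, nhashes=3):
        self.nbits, self.nhashes = nbits, nhashes
        self.bits = 0
        for v in values:
            for bit in self._positions(v):
                self.bits |= 1 << bit

    def _positions(self, v):
        # Derive nhashes bit positions from a salted SHA-256 digest.
        for i in range(self.nhashes):
            h = hashlib.sha256(f"{i}:{v}".encode()).digest()
            yield int.from_bytes(h[:4], "big") % self.nbits

    def might_contain(self, v):
        return all((self.bits >> bit) & 1 for bit in self._positions(v))

# Ten blocks of 100 consecutive integers each.
blocks = [list(range(i, i + 100)) for i in range(0, 1000, 100)]
summaries = [BlockBloom(b) for b in blocks]
# Block 2 holds 200..299, so a lookup for 250 must flag it.
assert summaries[2].might_contain(250)
```

Like min/max summaries, this stores one small summary per block rather than an item pointer per row; the difference is that it supports equality-style predicates on uncorrelated data.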


-- 
greg




Re: [HACKERS] 9.4 release notes

2014-06-17 Thread Bruce Momjian
On Tue, Jun  3, 2014 at 01:21:51AM -0700, Peter Geoghegan wrote:
 On Sun, May 4, 2014 at 5:46 AM, Bruce Momjian br...@momjian.us wrote:
  Feedback expected and welcomed.
 
 One item currently reads "Improve valgrind error reporting".  I
 suggest this be changed to "Add support for the Valgrind memcheck memory
 error detector". It was possible to use the tool before, but the lack
 of cooperation from Postgres made this far less useful.

I have applied the attached patch to improve this.  I didn't use "memory
error" because it is not checking for errors in memory, but rather for
invalid memory usage.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/release-9.4.sgml b/doc/src/sgml/release-9.4.sgml
new file mode 100644
index d9f9828..5868529
*** a/doc/src/sgml/release-9.4.sgml
--- b/doc/src/sgml/release-9.4.sgml
***
*** 2229,2235 
  
listitem
 para
! Improve <application>Valgrind</application> error reporting (Noah Misch)
 /para
/listitem
  
--- 2229,2236 
  
listitem
 para
! Improve <application>Valgrind</application> detection of invalid memory usage
! (Noah Misch)
 /para
/listitem
  



Re: [HACKERS] Memory deallocation after calling cast function

2014-06-17 Thread Bruce Momjian
On Tue, Jun  3, 2014 at 03:59:45PM -0400, Tom Lane wrote:
 Soroosh Sardari soroosh.sard...@gmail.com writes:
  I have problem with memory deallocation. look at the following queries
 
  1- create table test01(a) as select generate_series(1,1)::int8 ;
 
 Do it as, eg,
 
 create table test01(a) as select g::int8 from generate_series(1,1) g;
 
 SRFs in the SELECT targetlist tend to leak memory; this is not easily
 fixable, and nobody is likely to try hard considering the feature's on
 the edge of deprecation anyhow.

Uh, what is replacing SRFs?  CTEs?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




[HACKERS] How about a proper TEMPORARY TABLESPACE?

2014-06-17 Thread Matheus de Oliveira
Hi Hackers,

I was facing a situation where we wanted to set temp_tablespaces to a
tablespace on an ephemeral disk (yes, an AWS ephemeral disk), and I know
many users have faced the same situation. Although it seems safe to create
a tablespace on ephemeral disks if you use it to store only temporary files
(either created by queries or temp tables), PostgreSQL does not have such
information, and can't safely prevent a careless user from creating a
non-temporary relation on this tablespace.

Also, in case of a loss of this tablespace, its PG_TEMP_FILES_DIR has to be
created by hand after recovering. That is fine actually, but many users may
not even notice that.

So, what you guys think about letting PG know somehow that a tablespace is
temporary?

I have taken some time to make a PoC just to see if that is doable.
I added a new syntax like:

CREATE TABLESPACE spcname [TEMP | TEMPORARY] LOCATION ...

So, if TEMP or TEMPORARY is present, I mark a new column in pg_tablespace
as true. On every table creation or move to a new tablespace, I just
check this, and fail if the tablespace is temporary but the
relpersistence says the table is not.

The other part is, every time some query or relation is asked to be stored
on this tablespace, I create (on-demand) the PG_TEMP_FILES_DIR inside of it
(also if it is temporary).

The attached patch (also on my GitHub, [1]) shows the PoC. For now,
I'm not worried about the code quality; there is still a lot of work to be
done there (ALTER TABLESPACE, better testing, use of relcache, etc.),
and this is my first time hacking on PG (so I'm a newbie). But I'd like to
hear from you whether such a feature is desirable and whether I could start
working on it for real, and also some thoughts about better ways of
implementing it.

[1] https://github.com/matheusoliveira/postgres/compare/my_temptablespace

Waiting for thoughts on that.

Best regards,
-- 
Matheus de Oliveira
Analista de Banco de Dados
Dextra Sistemas - MPS.Br nível F!
www.dextra.com.br/postgres
*** a/src/backend/commands/dbcommands.c
--- b/src/backend/commands/dbcommands.c
***
*** 388,399  createdb(const CreatedbStmt *stmt)
--- 388,406 
  			aclcheck_error(aclresult, ACL_KIND_TABLESPACE,
  		   tablespacename);
  
+ 
  		/* pg_global must never be the default tablespace */
  		if (dst_deftablespace == GLOBALTABLESPACE_OID)
  			ereport(ERROR,
  	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
    errmsg("pg_global cannot be used as default tablespace")));
  
+ 		/* can't create a database on temporary tablespace */
+ 		if (is_tablespace_storage_temporary(dst_deftablespace))
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+   errmsg("cannot create a database on a tablespace in temporary storage")));
+ 
  		/*
  		 * If we are trying to change the default tablespace of the template,
  		 * we require that the template not have any files in the new default
***
*** 1083,1088  movedb(const char *dbname, const char *tblspcname)
--- 1090,1101 
  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
   errmsg("pg_global cannot be used as default tablespace")));
  
+ 	/* can't create a database on temporary tablespace */
+ 	if (is_tablespace_storage_temporary(dst_tblspcoid))
+ 		ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 			  errmsg("cannot create a database on a tablespace in temporary storage")));
+ 
  	/*
  	 * No-op if same tablespace
  	 */
*** a/src/backend/commands/indexcmds.c
--- b/src/backend/commands/indexcmds.c
***
*** 432,437  DefineIndex(Oid relationId,
--- 432,446 
  		   get_tablespace_name(tablespaceId));
  	}
  
+ 	/* Can't save relations on temporary tablespace */
+ 	if (rel->rd_rel->relpersistence != RELPERSISTENCE_TEMP &&
+ 		is_tablespace_storage_temporary(OidIsValid(tablespaceId) ? tablespaceId : MyDatabaseTableSpace))
+ 	{
+ 		ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 			  errmsg("cannot save relation on a tablespace in temporary storage")));
+ 	}
+ 
  	/*
  	 * Force shared indexes into the pg_global tablespace.  This is a bit of a
  	 * hack but seems simpler than marking them in the BKI commands.  On the
*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***
*** 523,528  DefineRelation(CreateStmt *stmt, char relkind, Oid ownerId)
--- 523,537 
  (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
   errmsg("only shared relations can be placed in pg_global tablespace")));
  
+ 	/* Can't save relations on temporary tablespace */
+ 	if (stmt->relation->relpersistence != RELPERSISTENCE_TEMP &&
+ 		is_tablespace_storage_temporary(OidIsValid(tablespaceId) ? tablespaceId : MyDatabaseTableSpace))
+ 	{
+ 		ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 			  errmsg("cannot save relation on a tablespace in temporary storage")));
+ 	}
+ 
  	/* Identify user ID that will own the table */
  	if (!OidIsValid(ownerId))
  		

Re: [HACKERS] Memory deallocation after calling cast function

2014-06-17 Thread Abhijit Menon-Sen
At 2014-06-17 11:32:37 -0400, br...@momjian.us wrote:

  SRFs in the SELECT targetlist tend to leak memory; this is not easily
  fixable, and nobody is likely to try hard considering the feature's on
  the edge of deprecation anyhow.
 
 Uh, what is replacing SRFs?  CTEs?

I don't think Tom was referring to SRFs in general, only putting them
directly into the targetlist of a SELECT.

-- Abhijit




Re: [HACKERS] Memory deallocation after calling cast function

2014-06-17 Thread Andres Freund
On 2014-06-17 21:09:25 +0530, Abhijit Menon-Sen wrote:
 At 2014-06-17 11:32:37 -0400, br...@momjian.us wrote:
 
   SRFs in the SELECT targetlist tend to leak memory; this is not easily
   fixable, and nobody is likely to try hard considering the feature's on
   the edge of deprecation anyhow.
  
  Uh, what is replacing SRFs?  CTEs?
 
 I don't think Tom was referring to SRFs in general, only putting them
 directly into the targetlist of a SELECT.

And the primary reason for using SRFs in the targetlist has gone away
with LATERAL. It's now possible to pass data from tables to an SRF,
which previously wasn't possible unless you used an SRF in the targetlist.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Minmax indexes

2014-06-17 Thread Robert Haas
On Tue, Jun 17, 2014 at 10:31 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-17 10:26:11 -0400, Robert Haas wrote:
 On Sat, Jun 14, 2014 at 10:34 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Robert Haas wrote:
  On Wed, Sep 25, 2013 at 4:34 PM, Alvaro Herrera
  alvhe...@2ndquadrant.com wrote:
   Here's an updated version of this patch, with fixes to all the bugs
   reported so far.  Thanks to Thom Brown, Jaime Casanova, Erik Rijkers and
   Amit Kapila for the reports.
 
  I'm not very happy with the use of a separate relation fork for
  storing this data.
 
  Here's a new version of this patch.  Now the revmap is not stored in a
  separate fork, but together with all the regular data, as explained
  elsewhere in the thread.

 Cool.

 Have you thought more about this comment from Heikki?

 http://www.postgresql.org/message-id/52495dd3.9010...@vmware.com

 Is there actually a significant usecase behind that wish or just a
 general demand for being generic? To me it seems fairly unlikely you'd
 end up with something useful by doing a minmax index over bounding
 boxes.

Well, I'm not the guy who does things with geometric data, but I don't
want to ignore the significant percentage of our users who are.  As
you must surely know, the GIST implementations for geometric data
types store bounding boxes on internal pages, and that seems to be
useful to people.  What is your reason for thinking that it would be
any less useful in this context?

I do also think that a general demand for being generic ought to carry
some weight.  We have gone to great lengths to make sure that our
indexing can handle more than just < and >, where a lot of other
products have not bothered.  I think we have gotten a lot of mileage
out of that decision and feel that we shouldn't casually back away
from it.  Obviously, we do already have some special-case
optimizations and will likely have more in the future, and there
can certainly be valid reasons for taking that approach. But it needs
to be justified in some way; we shouldn't accept a less-generic
approach blindly, without questioning whether it's possible to do
better.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Minmax indexes

2014-06-17 Thread Andres Freund
On 2014-06-17 11:48:10 -0400, Robert Haas wrote:
 On Tue, Jun 17, 2014 at 10:31 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2014-06-17 10:26:11 -0400, Robert Haas wrote:
  On Sat, Jun 14, 2014 at 10:34 PM, Alvaro Herrera
  alvhe...@2ndquadrant.com wrote:
   Robert Haas wrote:
   On Wed, Sep 25, 2013 at 4:34 PM, Alvaro Herrera
   alvhe...@2ndquadrant.com wrote:
Here's an updated version of this patch, with fixes to all the bugs
reported so far.  Thanks to Thom Brown, Jaime Casanova, Erik Rijkers 
and
Amit Kapila for the reports.
  
   I'm not very happy with the use of a separate relation fork for
   storing this data.
  
   Here's a new version of this patch.  Now the revmap is not stored in a
   separate fork, but together with all the regular data, as explained
   elsewhere in the thread.
 
  Cool.
 
  Have you thought more about this comment from Heikki?
 
  http://www.postgresql.org/message-id/52495dd3.9010...@vmware.com
 
  Is there actually a significant usecase behind that wish or just a
  general demand for being generic? To me it seems fairly unlikely you'd
  end up with something useful by doing a minmax index over bounding
  boxes.
 
 Well, I'm not the guy who does things with geometric data, but I don't
 want to ignore the significant percentage of our users who are.  As
 you must surely know, the GIST implementations for geometric data
 types store bounding boxes on internal pages, and that seems to be
 useful to people.  What is your reason for thinking that it would be
 any less useful in this context?

For me minmax indexes are helpful because they allow generating *small*
'coarse' indexes over large volumes of data. From my pov that's possible
because they don't contain item pointers for every contained row.
That'll imo work well if there are consecutive rows in the table that
can be summarized into one min/max range. That's quite likely to happen
for common applications of a number of scalar datatypes. But the
likelihood of placing sufficiently many rows with very similar bounding
boxes close together seems much lower in practice. And I think
that's generally the case for operations which can't be well represented
as btree opclasses - the substructure that implies inside a Datum will
make correlation between consecutive rows less likely.

Maybe I've a major intuition failure here though...

 I do also think that a general demand for being generic ought to carry
 some weight.

Agreed. It's always a balance act. But it's not like this doesn't use a
datatype abstraction concept...

 We have gone to great lengths to make sure that our
 indexing can handle more than just < and >, where a lot of other
 products have not bothered.  I think we have gotten a lot of mileage
 out of that decision and feel that we shouldn't casually back away
 from it.

I don't see this as a case of backing away from that though?

 we shouldn't accept a less-generic
 approach blindly, without questioning whether it's possible to do
 better.

But the aim shouldn't be to add genericity that's not going to be used,
but to add it where it's somewhat likely to help...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-06-17 Thread Robert Haas
On Tue, Jun 17, 2014 at 10:33 AM, Petr Jelinek p...@2ndquadrant.com wrote:
 On 17/06/14 16:18, Robert Haas wrote:
 On Fri, Jun 13, 2014 at 8:31 PM, Petr Jelinek p...@2ndquadrant.com
 wrote:
 attached is a simple patch which makes it possible to change the system
 identifier of the cluster in pg_control. This is useful for
 individualization of the instance that is started on top of data
 directory
 produced by pg_basebackup - something that's helpful for logical
 replication
 setup where you need to easily identify each node (it's used by
 Bidirectional Replication for example).


 I can clearly understand the utility of being able to reset the system
 ID to a new, randomly-generated system ID - but giving the user the
 ability to set a particular value of their own choosing seems like a
 pretty sharp tool.  What is the use case for that?

 Let's say you want to initialize new logical replication node via
 pg_basebackup and you want your replication slots to be easily identifiable
 so you use your local system id as part of the slot name.

 In that case you need to know the future system id of the node because you
 need to register the slot before consistent point to which you replay via
 streaming replication (and you can't replay anymore once you changed the
 system id). Which means you need to generate your system id in advance and
 be able to change it in pg_control later.

Hmm.  I guess that makes sense.

But it seems to me that we might need to have a process discussion
here, because, while I'm all in favor of incremental feature proposals
that build towards a larger goal, it currently appears that the larger
goal toward which you are building is not something that's been
publicly discussed and debated on this list.  And I really think we
need to have that conversation.  Obviously, individual patches will
still need to be debated, but I feel like 2ndQuadrant is trying to
construct a castle without showing the community the floor plan.  I
believe that there is relatively broad agreement that we would all
like a castle, but different people may have legitimately different
ideas about how it should be constructed.  If the work arrives as a
series of disconnected pieces (user-specified system ID, event
triggers for CREATE, etc.), then everyone outside of 2ndQuadrant has
to take it on faith that those pieces are going to eventually fit
together in a way that we'll all be happy with.  In some cases, that's
fine, because the feature is useful on its own merits whether it ends
up being part of the castle or not.

But in other cases, like this one, if the premise that the slot name
should match the system identifier isn't something the community wants
to accept, then taking a patch that lets people do that is probably a
bad idea, because at least one person will use it to set the system
identifier of a system to a value that enables physical replication to
take place when that is actually totally unsafe, and we don't want to
enable that for no reason.  Maybe the slot name should match the
replication identifier rather than the standby system ID, for example.
There are conflicting proposals for how replication identifiers should
work, but one of those proposals limits it to 16 bits.  If we're going
to have multiple identifiers floating around anyway, I'd rather have a
slot called "7" than one called "6024402925054484590".  On the other hand,
maybe there's going to be a new proposal to use the database system
identifier as a replication identifier, which might be a fine idea and
which would demolish that argument.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Minmax indexes

2014-06-17 Thread Robert Haas
On Tue, Jun 17, 2014 at 12:04 PM, Andres Freund and...@2ndquadrant.com wrote:
 Well, I'm not the guy who does things with geometric data, but I don't
 want to ignore the significant percentage of our users who are.  As
 you must surely know, the GIST implementations for geometric data
 types store bounding boxes on internal pages, and that seems to be
 useful to people.  What is your reason for thinking that it would be
 any less useful in this context?

 For me minmax indexes are helpful because they allow to generate *small*
 'coarse' indexes over large volumes of data. From my pov that's possible
 because they don't contain item pointers for every contained row.
 That'll imo work well if there are consecutive rows in the table that
 can be summarized into one min/max range. That's quite likely to happen
 for common applications of a number of scalar datatypes. But the
 likelihood of placing sufficiently many rows with very similar bounding
 boxes close together seems much less relevant in practice. And I think
 that's generally likely for operations which can't be well represented
 as btree opclasses - the substructure that implies inside a Datum will
 make correlation between consecutive rows less likely.

Well, I don't know: suppose you're loading geospatial data showing the
location of every building in some country.  It might easily be the
case that the data is or can be loaded in an order that provides
pretty good spatial locality, leading to tight bounding boxes over
physically consecutive data ranges.

But I'm not trying to say that we absolutely have to support that kind
of thing; what I am trying to say is that there should be a README or
a mailing list post or some such that says: "We thought about how
generic to make this.  We considered A, B, and C.  We rejected C as
too narrow, and A because if we made it that general it would have
greatly enlarged the disk footprint for the following reasons.
Therefore we selected B."  Basically, I think Heikki asked a good
question - which was "could we abstract this more?" - and I can't
recall seeing a clear answer explaining why we could or couldn't and
what the trade-offs would be.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-17 Thread Bruce Momjian
On Wed, Jun  4, 2014 at 06:57:31PM -0400, Tom Lane wrote:
 Stephen Frost sfr...@snowman.net writes:
  * Tom Lane (t...@sss.pgh.pa.us) wrote:
  There are at least two places in inv_api.c where we have
  Assert(pagelen = LOBLKSIZE) that is protecting a subsequent memcpy
  into a local variable of size LOBLKSIZE, so that the only thing standing
  between us and a stack-smash security issue that's trivially exploitable
  in production builds is that on-disk data conforms to our expectation
  about LOBLKSIZE.  I think it's definitely worth promoting these checks
  to regular runtime-if-test-and-elog.
 
  Agreed.  Promoting that to a run-time check seems well worth it to me.
 
 Here's a draft patch for this.  Barring objections I'll commit the whole
 thing to HEAD, and the inv_api.c changes to the back branches as well.

Uh, I think pg_upgrade needs to check that they match too.

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Memory deallocation after calling cast function

2014-06-17 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-17 21:09:25 +0530, Abhijit Menon-Sen wrote:
 At 2014-06-17 11:32:37 -0400, br...@momjian.us wrote:
 Uh, what is replacing SRFs?  CTEs?

 I don't think Tom was referring to SRFs in general, only putting them
 directly into the targetlist of a SELECT.

 And the primary reason for using SRFs in the targetlist has gone away
 due to LATERAL.

Exactly.  LATERAL does what you usually want from a SRF in the
targetlist, and it doesn't have bizarre semantics when you use more than
one.

regards, tom lane




Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-17 Thread Robert Haas
On Fri, Jun 13, 2014 at 11:57 AM, David Johnston
david.g.johns...@gmail.com wrote:
 That's not the reading I want, and it's not the reading you want either,
 but there is nothing in the existing text that justifies single
 evaluation.  So I think we'd be well advised to sit on our hands until
 the committee clarifies that.  It's not like there is some urgent reason
 to have this feature.

 Agreed.

 I don't suppose there is any support or prohibition on the :

 one,two,three integer := generate_series(1,3);

 interpretation...not that I can actually come up with a good use case that
 wouldn't be better implemented via a loop in the main body.

Based on these comments and the remarks by Alvaro and Andres, I think
it's clear that we should reject this patch.  The number of patches
that get through with -1 votes from 3 committers is very small, if not
zero.  While I like the feature in the abstract, I agree with Tom that
it would be better to wait until we have more clarity about what the
semantics are supposed to be.

I will update the CommitFest app accordingly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-17 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 Uh, I think pg_upgrade needs to check that they match too.

Possibly.  What do you think it should do when examining a pg_control
version that lacks the field?

regards, tom lane




Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD

2014-06-17 Thread Noah Misch
On Tue, May 27, 2014 at 07:46:41PM -0400, Tom Lane wrote:
 Pushed; thanks for working on this!

Here's a fix to make the MSVC build process account for the addition of
HAVE_UUID_OSSP.  (None of the MSVC buildfarm members enable uuid-ossp.)

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index c0d7f38..f7a5abb 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -214,6 +214,7 @@ s{PG_VERSION_STR [^]+}{__STRINGIFY(x) #x\n#define 
__STRINGIFY2(z) __STRINGIFY
 
if ($self-{options}-{uuid})
{
+   print O "#define HAVE_UUID_OSSP\n";
print O "#define HAVE_UUID_H\n";
}
if ($self-{options}-{xml})



Re: [HACKERS] WAL replay bugs

2014-06-17 Thread Robert Haas
On Mon, Jun 2, 2014 at 8:55 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Apr 23, 2014 at 9:43 PM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 And here is the tool itself. It consists of two parts:

 1. Modifications to the backend to write the page images
 2. A post-processing tool to compare the logged images between master and
 standby.
 Having that into Postgres at the disposition of developers would be
 great, and I believe that it would greatly reduce the occurrence of
 bugs caused by WAL replay during recovery. So, with the permission of
 the author, I have been looking at this facility for a cleaner
 integration into Postgres.

I'm not sure if this is reasonably possible, but one thing that would
make this tool a whole lot easier to use would be if you could make
all the magic happen in a single server.  For example, suppose you had
a background process that somehow got access to the pre and post
images for every buffer change, and the associated WAL record, and
tried applying the WAL record to the pre-image to see whether it got
the corresponding post-image.  Then you could run 'make check' or so
and afterwards do something like psql -c 'SELECT * FROM
wal_replay_problems()' and hopefully get no rows back.

Don't get me wrong, having this tool at all sounds great.  But I think
to really get the full benefit out of it we need to be able to run it
in the buildfarm, so that if people break stuff it gets noticed
quickly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal for CSN based snapshots

2014-06-17 Thread Robert Haas
On Mon, Jun 16, 2014 at 12:58 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 On 05/30/2014 11:14 PM, Heikki Linnakangas wrote:
 Yeah. To recap, the failure mode is that if the master crashes and
 restarts, the transaction becomes visible in the master even though it
 was never replicated.

 Wouldn't another pg_clog bit for the transaction be able to sort that out?

How?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 9.5 CF1

2014-06-17 Thread Robert Haas
On Mon, Jun 16, 2014 at 2:36 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
 P.S. If you tag your reviews with [REVIEW] in the Subject, it'll be
 easier to keep track of them.

I and, I believe, various other people hate that style, because at
least in Gmail, it breaks the threading.  It is much easier to find
things if they are all posted on one thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 9.5 CF1

2014-06-17 Thread Alvaro Herrera
Robert Haas wrote:
 On Mon, Jun 16, 2014 at 2:36 AM, Abhijit Menon-Sen a...@2ndquadrant.com 
 wrote:
  P.S. If you tag your reviews with [REVIEW] in the Subject, it'll be
  easier to keep track of them.
 
 I and, I believe, various other people hate that style, because at
 least in Gmail, it breaks the threading.  It is much easier to find
 things if they are all posted on one thread.

Yes, please don't do that.  A simple, normal reply to the message that
submits the patch is much better from my point of view as a subsequent
reviewer and committer.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Set new system identifier using pg_resetxlog

2014-06-17 Thread Andres Freund
On 2014-06-17 12:07:04 -0400, Robert Haas wrote:
 On Tue, Jun 17, 2014 at 10:33 AM, Petr Jelinek p...@2ndquadrant.com wrote:
  On 17/06/14 16:18, Robert Haas wrote:
  On Fri, Jun 13, 2014 at 8:31 PM, Petr Jelinek p...@2ndquadrant.com
  wrote:
  attached is a simple patch which makes it possible to change the system
  identifier of the cluster in pg_control. This is useful for
  individualization of the instance that is started on top of data
  directory
  produced by pg_basebackup - something that's helpful for logical
  replication
  setup where you need to easily identify each node (it's used by
  Bidirectional Replication for example).
 
 
  I can clearly understand the utility of being able to reset the system
  ID to a new, randomly-generated system ID - but giving the user the
  ability to set a particular value of their own choosing seems like a
  pretty sharp tool.  What is the use case for that?

I've previously hacked this up adhoc during data recovery when I needed
to make another cluster similar enough that I could replay WAL.

Another usecase is to mark a database as independent from its
origin. Imagine a database that gets sharded across several
servers. It's not uncommon to do that by initially basebackup'ing the
database to several nodes and then use them separately from
thereon. It's quite useful to actually mark them as being
distinct. Especially as several of them right now would end up with the
same timeline id...

 But it seems to me that we might need to have a process discussion
 here, because, while I'm all in favor of incremental feature proposals
 that build towards a larger goal, it currently appears that the larger
 goal toward which you are building is not something that's been
 publicly discussed and debated on this list.  And I really think we
 need to have that conversation.  Obviously, individual patches will
 still need to be debated, but I feel like 2ndQuadrant is trying to
 construct a castle without showing the community the floor plan.  I
 believe that there is relatively broad agreement that we would all
 like a castle, but different people may have legitimately different
 ideas about how it should be constructed.  If the work arrives as a
 series of disconnected pieces (user-specified system ID, event
 triggers for CREATE, etc.), then everyone outside of 2ndQuadrant has
 to take it on faith that those pieces are going to eventually fit
 together in a way that we'll all be happy with.  In some cases, that's
 fine, because the feature is useful on its own merits whether it ends
 up being part of the castle or not.


Uh. Right now this patch has been written because it's needed for an out
of core replication solution. That's what BDR is at this point. The
patch is unobtrusive, has other usecases than just our internal one and
doesn't make pg_resetxlog even more dangerous than it already is. I
don't see much problem with considering it on its own cost/benefit?

So this seems to be a concern that's relatively independent of this
patch. Am I seeing that right?

I think one very important point here is that BDR is *not* the proposed
in-core solution. I think a reasonable community perspective - besides
it also being useful on its own - is to view it as a *prototype* for an
in-core solution. And e.g. logical decoding would have looked much worse -
and likely not have been integrated - without externally already being
used for BDR.

I'm not sure how we can ease or even resolve your concerns when talking
about pretty independent and general pieces of functionality like the
DDL event trigger stuff. We needed to actually *write* those to see how
BDR would look. And the community's feedback heavily influenced how
BDR looks by accepting some pieces, demanding others, and outright
rejecting the remainder.

I think there's some pieces that need to be considered on their own
merit. Logical decoding is useful on its own. The ability for out of
core systems to do DDL replication is another piece (that you referred
to above).
I think the likelihood of success if we were to try to design an in-core
system from the ground up first and then follow through pretty exactly
along those lines is minimal.

So, what I think we can do is to continue trying to build independent,
generally useful bits. Which imo all the stuff that's been integrated
is. Then, somewhat soon I think, we'll have to come up with a proposal
how the parts that are *not* necessarily useful outside of in-core
logical rep. might look like. Which will likely trigger some long long
discussions that turn that design around a couple of times. Which is
fine. I *don't* think that's going to be a trimmed down version of
todays BDR.

 But in other cases, like this one, if the premise that the slot name
 should match the system identifier isn't something the community wants
 to accept, then taking a patch that lets people do that is probably a
 bad idea, because at least one person will use it to set the system
 identifier of a system 

Re: [HACKERS] 9.5 CF1

2014-06-17 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Robert Haas wrote:
 On Mon, Jun 16, 2014 at 2:36 AM, Abhijit Menon-Sen a...@2ndquadrant.com 
 wrote:
 P.S. If you tag your reviews with [REVIEW] in the Subject, it'll be
 easier to keep track of them.

 I and, I believe, various other people hate that style, because at
 least in Gmail, it breaks the threading.  It is much easier to find
 things if they are all posted on one thread.

 Yes, please don't do that.  A simple, normal reply to the message that
 submits the patch is much better from my point of view as a subsequent
 reviewer and committer.

Worth noting also is that Magnus is working on a new version of the
commitfest app that will be able to automatically keep track of threads
about patches --- so long as they *are* threads according to our mailing
list archives.  I'm not sure if the archives recognize replies with a
changed Subject: as being the same thread or not.

regards, tom lane




Re: [HACKERS] 9.5 CF1

2014-06-17 Thread Andres Freund
On 2014-06-17 12:47:19 -0400, Alvaro Herrera wrote:
 Robert Haas wrote:
  On Mon, Jun 16, 2014 at 2:36 AM, Abhijit Menon-Sen a...@2ndquadrant.com 
  wrote:
   P.S. If you tag your reviews with [REVIEW] in the Subject, it'll be
   easier to keep track of them.
  
  I and, I believe, various other people hate that style, because at
  least in Gmail, it breaks the threading.  It is much easier to find
  things if they are all posted on one thread.
 
 Yes, please don't do that.  A simple, normal reply to the message that
 submits the patch is much better from my point of view as a subsequent
 reviewer and committer.

Very much agreed. Especially as many patches have several reviews over
the course of their integration. I think the separate thread suggestion
was made by some former CF manager?

I think it's sometimes, for larger/hotly debated patches, useful to
start anew at a significant new version though... 300-message-deep threads
get unwieldy.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] 9.5 CF1

2014-06-17 Thread Abhijit Menon-Sen
At 2014-06-17 12:47:19 -0400, alvhe...@2ndquadrant.com wrote:

   P.S. If you tag your reviews with [REVIEW] in the Subject, it'll
   be easier to keep track of them.
  
  I and, I believe, various other people hate that style, because at
  least in Gmail, it breaks the threading.  It is much easier to find
  things if they are all posted on one thread.
 
 Yes, please don't do that.  A simple, normal reply to the message that
 submits the patch is much better from my point of view as a subsequent
 reviewer and committer.

I find it hard to believe that gmail is incapable of threading messages
using In-Reply-To/References header fields, especially given that mail
subjects are changed all the time in the normal course of events. But
I'll take your word for it and reply to the original message (as I've
always done) without changing the Subject for any reviews I post.

Only one other person took my suggestion so far, but if anyone else is
tempted to do the same: consider the recommendation withdrawn.

-- Abhijit

P.S. I also realise now that some people may not {,be able to} change
the Subject of a message without creating an entirely new thread.




Re: [HACKERS] [PATCH] Replacement for OSSP-UUID for Linux and BSD

2014-06-17 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 Here's a fix to make the MSVC build process account for the addition of
 HAVE_UUID_OSSP.  (None of the MSVC buildfarm members enable uuid-ossp.)

Looks reasonable.  I'm unable to test this scenario, but if you have,
please commit.

regards, tom lane




Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-17 Thread Bruce Momjian
On Tue, Jun 17, 2014 at 12:28:46PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  Uh, I think pg_upgrade needs to check that they match too.
 
 Possibly.  What do you think it should do when examining a pg_control
 version that lacks the field?

Good question.  I have existing cases where fields were removed, but not
ones that were added.  As we have no way to query the old cluster's
value for LOBLKSIZE, I think I will just add code to compare them if
they _both_ exist.

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] Atomics hardware support table supported architectures

2014-06-17 Thread Robert Haas
On Sat, Jun 14, 2014 at 9:12 PM, Andres Freund and...@2ndquadrant.com wrote:
 At this year's developer meeting we'd discussed the atomics abstraction
 which is necessary for some future improvements. We'd concluded that an
 overview of the hardware capabilities of the supported platforms would
 be helpful. I've started with that at:
 https://wiki.postgresql.org/wiki/Atomics

That looks like a great start.  It could use a key explaining what the
columns mean.  Here are some guesses, and some comments.

single-copy r/w atomicity = For what word sizes does this platform
have the ability to do atomic reads and writes of values?  I don't
understand how 1 can ever NOT be supported - does any architecture
write data to memory a nibble at a time?  I also don't really
understand the *** footnote; how can you do kernel emulation of an
atomic load or store.
TAS = Does this platform support test and set?  Items in parentheses
are the instructions that implement this.
cmpxchg = Does this platform support compare and swap?  If so, for
what word sizes?  I assume "n" means "not supported at all" and "y"
means "supported but we don't know for which word sizes".  Maybe call
this column "CAS" or "Compare and Swap" rather than using an
Intel-ism.
gcc support from version = Does this mean the version from which GCC
supported the architecture, or from which it supported atomics (which
ones?) on that architecture?

 Does somebody want other columns in there?

I think the main question at the developer meeting was how far we want
to go with supporting primitives like atomic add, atomic and, atomic
or, etc.  So I think we should add columns for those.

 From that and previous discussions
 (e.g. 
 http://archives.postgresql.org/message-id/20131013004658.GG4056218%40alap2.anarazel.de
 ) I think we should definitely remove some platforms:
 1) VAX. Production ceased 1997. We've never supported OpenVMS, just
 netbsd (and probably openbsd)

Based on the letter you linked from the Wiki page, I am inclined to
agree.  The letter lists the final ship date for the last surviving
type of VAX as December 31, 2000, with support ending in 2010.  It
seems unlikely that anyone will wish to run a version of PostgreSQL
released in 2015 on such hardware (and if they do, we can make that
their problem rather than ours).

 2) m32r. Our implementation depends on an *unmerged* glibc header. The
 website was dead for several *years*, even after the restore patches
 can't be downloaded anymore (404).

That is also an excellent argument for deprecation.

 3) sparcv8: Last released model 1997.

I seem to recall hearing about this in a customer situation relatively
recently, so there may be a few of these still kicking around out
there.

 4) i386: Support dropped from windows 98 (yes, really), linux, openbsd
(yes, really), netbsd (yes, really). No code changes needed.

Wow, OK.  In that case, yeah, let's dump it.  But let's make sure we
adequately document that someplace in the code comments, along with
the reasons, because not everyone may realize how dead it is.

 5) pa-risc.

This is a bit less dead than the other ones; support for existing
system only stopped at the end of 2013.  But I don't personally have a
reason to keep it alive.

 6) armv-v5

I think this is also a bit less dead than the other ones; Red Hat's
Bugzilla shows people filing bugs for platform-specific problems
as recently as January of 2013:

https://bugzilla.redhat.com/show_bug.cgi?id=892378

 Note that this is *not* a requirement for the atomics abstraction - it
 now has a fallback to spinlocks if atomics aren't available.

That seems great.  Hopefully with a configure option to disable
atomics so that it's easy to test the fallback.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 9.5 CF1

2014-06-17 Thread Alvaro Herrera
Abhijit Menon-Sen wrote:

 I find it hard to believe that gmail is incapable of threading messages
 using In-Reply-To/References header fields, especially given that mail
 subjects are changed all the time in the normal course of events. But
 I'll take your word for it and reply to the original message (as I've
 always done) without changing the Subject for any reviews I post.

They do it -- I experimented with it awhile ago.  Gmail threads emails
perfectly well, but it seems they break threads on purpose when the
subject changes.  Apparently, more people are used to starting a new
thread by replying to an existing email and changing to a new subject
than to tweaking subjects by adding tags and such.


-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services




Re: [HACKERS] Atomics hardware support table supported architectures

2014-06-17 Thread Andres Freund
On 2014-06-17 13:14:26 -0400, Robert Haas wrote:
 On Sat, Jun 14, 2014 at 9:12 PM, Andres Freund and...@2ndquadrant.com wrote:
  At this year's developer meeting we'd discussed the atomics abstraction
  which is necessary for some future improvements. We'd concluded that an
  overview of the hardware capabilities of the supported platforms would
  be helpful. I've started with that at:
  https://wiki.postgresql.org/wiki/Atomics
 
 That looks like a great start.  It could use a key explaining what the
 columns mean.  Here are some guesses, and some comments.
 
 single-copy r/w atomicity = For what word sizes does this platform
 have the ability to do atomic reads and writes of values?

Yes. It's a term found in the literature; I'm happy to replace it with
something else. It's essentially defined to mean that a read after one
or more writes will only be influenced by one of those writes, not a
mixture of them; i.e. no intermediate states are ever visible.

 I don't
 understand how 1 can ever NOT be supported - does any architecture
 write data to memory a nibble at a time?

Actually, there seem to be some where that's possible for some operations
(VAX). But the concern is more whether 1 byte can actually be written
without also writing neighbouring values. I.e. there's hardware out
there that implements a 1-byte store by reading 4 bytes, changing one
of the bytes in a register, and then writing the 4 bytes out again.
Which means that a concurrent neighbouring write can cause a wrong
value to be written...
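The read-modify-write hazard can be made concrete with a small sketch (illustrative C, not tied to any real platform): emulating a 1-byte store on hardware that only writes whole 32-bit words rewrites the neighbouring bytes too, so a concurrent update to those bytes landing between the read and the write-back would be lost.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of how hardware lacking byte stores emulates one: read the
 * containing 32-bit word, patch a single byte in a register, write the
 * whole word back.  Any concurrent update to the OTHER three bytes that
 * lands between the read and the write-back is silently overwritten. */
void emulated_byte_store(uint32_t *word, int byte_idx, uint8_t value)
{
    uint32_t tmp;

    memcpy(&tmp, word, sizeof(tmp));      /* read all 4 bytes */
    ((uint8_t *) &tmp)[byte_idx] = value; /* patch just one of them */
    memcpy(word, &tmp, sizeof(tmp));      /* write all 4 bytes back */
}
```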

 I also don't really
 understand the *** footnote; how can you do kernel emulation of an
 atomic load or store?

What happens is that gcc will do a syscall triggering the kernel to turn
off scheduling; perform the math and store the result; then turn
scheduling on again. That way no other operation can influence the
calculation/store. Imagine you have hardware that, internally, only
does stores in 4-byte units. Even on a single-CPU machine, which most
of those are, the kernel could schedule a separate process after the
first 4 bytes have been written. Oops. The kernel has ways to prevent
that; userspace doesn't...

 TAS = Does this platform support test and set?  Items in parentheses
 are the instructions that implement this.

 cmpxchg = Does this platform support compare and swap?  If so, for
 what word sizes?  I assume n means not supported at all and y
 means supported but we don't know for which word sizes.  Maybe call
 this column CAS or Compare and Swap rather than using an
 Intel-ism.

Ok.

 gcc support from version = Does this mean the version from which GCC
 supported the architecture, or from which it supported atomics (which
 ones?) on that architecture?

It means the version from which gcc has supported the __sync intrinsics
on that platform. I've only added versions where support for all atomics
of the supported sizes was available.
https://gcc.gnu.org/onlinedocs/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins

Once these answers satisfy you I'll adapt the wiki.

  Does somebody want other columns in there?
 
 I think the main question at the developer meeting was how far we want
 to go with supporting primitives like atomic add, atomic and, atomic
 or, etc.  So I think we should add columns for those.

Well, once CAS is available, atomic add etc. are all trivially
implementable - without further hardware support. It might be more
efficient to use the native instruction (e.g. xadd can be much better
than a cmpxchg loop because there are no retries), but that's just an
optimization that won't matter unless you have a fair bit of
concurrency.

There are currently fallbacks like:

#ifndef PG_HAS_ATOMIC_FETCH_ADD_U32
#define PG_HAS_ATOMIC_FETCH_ADD_U32
STATIC_IF_INLINE uint32
pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, uint32 add_)
{
	uint32 old;

	while (true)
	{
		old = pg_atomic_read_u32_impl(ptr);
		if (pg_atomic_compare_exchange_u32_impl(ptr, &old, old + add_))
			break;
	}
	return old;
}
#endif
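The same pattern can be sketched self-contained with standard C11 atomics (illustrative names, not the actual PostgreSQL implementation): fetch-add built purely on top of compare-and-swap.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Sketch: fetch-add implemented only with CAS, mirroring the fallback
 * above.  Returns the pre-add value, like x86's xadd would. */
uint32_t fetch_add_u32(_Atomic uint32_t *ptr, uint32_t add_)
{
    uint32_t old = atomic_load(ptr);

    /* On failure the CAS refreshes 'old' with the current value,
     * so we simply retry until no concurrent update interferes. */
    while (!atomic_compare_exchange_weak(ptr, &old, old + add_))
        ;
    return old;
}
```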

  3) sparcv8: Last released model 1997.
 
 I seem to recall hearing about this in a customer situation relatively
 recently, so there may be a few of these still kicking around out
 there.

Really? As I'd written in a reply, Solaris 10 (released 2005) dropped
support for it. Dropping support for a platform that was desupported
10 years ago by its manufacturer doesn't sound bad imo...

  4) i386: Support dropped from windows 98 (yes, really), linux, openbsd
 (yes, really), netbsd (yes, really). No code changes needed.
 
 Wow, OK.  In that case, yeah, let's dump it.  But let's make sure we
 adequately document that someplace in the code comments, along with
 the reasons, because not everyone may realize how dead it is.

I'm generally wondering how to better document the supported os/platform
combinations. E.g. it's not apparent that we only support certain
platforms on a 

Re: [HACKERS] Minmax indexes

2014-06-17 Thread Andres Freund
On 2014-06-17 12:14:00 -0400, Robert Haas wrote:
 On Tue, Jun 17, 2014 at 12:04 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Well, I'm not the guy who does things with geometric data, but I don't
  want to ignore the significant percentage of our users who are.  As
  you must surely know, the GIST implementations for geometric data
  types store bounding boxes on internal pages, and that seems to be
  useful to people.  What is your reason for thinking that it would be
  any less useful in this context?
 
  For me minmax indexes are helpful because they allow generating *small*
  'coarse' indexes over large volumes of data. From my pov that's only
  possible because they don't contain item pointers for every contained
  row.
  That'll imo work well if there are consecutive rows in the table that
  can be summarized into one min/max range. That's quite likely to happen
  for common applications of a number of scalar datatypes. But the
  likelihood of placing sufficiently many rows with very similar bounding
  boxes close together seems much less relevant in practice. And I think
  that's generally likely for operations which can't be well represented
  as btree opclasses - the substructure implied inside a Datum will
  make correlation between consecutive rows less likely.
 
 Well, I don't know: suppose you're loading geospatial data showing the
 location of every building in some country.  It might easily be the
 case that the data is or can be loaded in an order that provides
 pretty good spatial locality, leading to tight bounding boxes over
 physically consecutive data ranges.

Well, it might be doable to correlate them along one axis, but along
both?  That's more complicated... And even along one axis you
already get into problems if your geometries are irregularly sized.
A single large polygon will completely destroy indexability for anything
stored physically close by because suddenly the minmax range will be
huge... So you'll need to sort cleverly for that as well.

I think hierarchical datastructures are so much better suited for this
that there's little point trying to fit them into minmax. I can very
well imagine that there's benefit in GiST support for only storing one
pointer per block instead of one pointer per item or such. But that
seems like a separate feature.

 But I'm not trying to say that we absolutely have to support that kind
 of thing; what I am trying to say is that there should be a README or
 a mailing list post or some such that says: We thought about how
 generic to make this.  We considered A, B, and C.  We rejected C as
 too narrow, and A because if we made it that general it would have
 greatly enlarged the disk footprint for the following reasons.
 Therefore we selected B.

Isn't 'simpler implementation' a valid reason that's already been
discussed onlist? Obviously simpler implementation doesn't trump
everything, but it's one valid reason...
Note that I have zap to do with the design of this feature. I work for
the same company as Alvaro, but that's pretty much it...

 Basically, I think Heikki asked a good
 question - which was could we abstract this more? - and I can't
 recall seeing a clear answer explaining why we could or couldn't and
 what the trade-offs would be.

'Could we abstract more' imo is a pretty bad design guideline. It's 'is
there benefit in abstracting more'. Otherwise you end up with way too
complicated systems.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Minmax indexes

2014-06-17 Thread Claudio Freire
On Tue, Jun 17, 2014 at 1:04 PM, Andres Freund and...@2ndquadrant.com wrote:
 For me minmax indexes are helpful because they allow generating *small*
 'coarse' indexes over large volumes of data. From my pov that's only
 possible because they don't contain item pointers for every contained
 row.


But minmax is just a specific form of bloom filter.

This could certainly be generalized to a bloom filter index with some
set of bloomhashing operators (minmax being just one).
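As a rough sketch of the summary-per-block idea being discussed (illustrative only, not the patch's actual on-disk format): each run of consecutive rows is condensed to a {min,max} pair, and a scan consults only those tiny summaries to decide which blocks it can skip.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sketch of a minmax-style block summary: one {min,max} pair stands in
 * for many consecutive rows, which is why the index stays small. */
typedef struct BlockSummary
{
    int32_t min;
    int32_t max;
} BlockSummary;

BlockSummary summarize_block(const int32_t *rows, size_t nrows)
{
    BlockSummary s = { rows[0], rows[0] };

    for (size_t i = 1; i < nrows; i++)
    {
        if (rows[i] < s.min)
            s.min = rows[i];
        if (rows[i] > s.max)
            s.max = rows[i];
    }
    return s;
}

/* A scan must visit a block only if the key could fall in its range;
 * anything outside [min, max] lets the whole block be skipped. */
bool block_might_contain(const BlockSummary *s, int32_t key)
{
    return key >= s->min && key <= s->max;
}
```

A bloom-filter generalization would replace the {min,max} pair with a bit array and a set of hash functions, but the pruning logic at scan time stays the same shape.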




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-06-17 Thread Claudio Freire
On Tue, Jun 17, 2014 at 8:47 AM, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
 if (compress_backup_block = BACKUP_BLOCK_COMPRESSION_SNAPPY)


You mean == right?




Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-17 Thread Robert Haas
On Thu, Jun 12, 2014 at 6:33 PM, Gregory Smith gregsmithpg...@gmail.com wrote:
 I'm kind of surprised to see this turn into a hot button all of a sudden
 though, because my thought on all that so far has been a giant so what?
 This is what PostgreSQL does.
[...]
 But let's not act like RLS is a scary bogeyman because it introduces a new
 way to hack the server or get surprising side-effects.  That's expected and
 possibly unavoidable behavior in a feature like this, and there are much
 worse instances of arbitrary function risk throughout the core code already.

I have some technical comments on later emails in this thread, but
first let me address this point.  In the past, people have sometimes
complained that reviewers waited until very late in the cycle to
complain about issues which they found problematic.  By the time the
issues were pointed out, insufficient time remained before feature
freeze to get those issues addressed, causing the patch to slip out of
the release and provoking developer frustration.  It has therefore
been requested numerous times by numerous people that potential issues
be raised as early as possible.

The concerns that I have raised in this thread are not new; I have
raised them before.  However, we are now at the beginning of a new
development cycle, and it seems fair to assume that the people who are
working on this patch are hoping very much that something will get
committed to 9.5.  Since it seems to me that there are unaddressed
issues with the design of this patch, I felt that it was a good idea
to make sure that those concerns were on the table right from the
beginning of the process, rather than waiting until the patch was on
the verge of commit or, indeed, already committed.  That is why, when
this thread was revived on June 10th, I decide that it was a good time
to again comment on the design points about which I was concerned.

After sending that one (1) email, I was promptly told that I'm very
disappointed to hear that the mechanical pieces around making RLS easy
for users to use ... is receiving such push-back.  The push-back, at
that point in time, consisted of one (1) email.  Several more emails
have been sent since then, including the above-quoted text, seeming to
me to imply that the people who are concerned about this feature are
being unreasonable.  I don't believe I am the only such person,
although I may be the main one right at the moment, and you may not be
entirely surprised to hear that I don't think I'm being unreasonable.

I will admit that my initial email may have contained just a touch of
hyperbole.  But I won't admit to more than a touch, and frankly, I
think it was warranted.  I perfectly well understand that people
really, really, really want this feature, and if I hadn't understood
that before, I certainly understand it now.  However, I believe that
there has been a lack of focus in the development of the patch thus
far in a couple of key areas - first in terms of articulating how it
is different from and better than a writeable security barrier view,
and second on how to manage the security and operational aspects of
having a feature like this.  I think that the discussion subsequent to
my June 10th email has led to some good discussion on both points,
which was my intent, but I still think much more time and thought
needs to be spent on those issues if we are to have a feature which is
up to our usual standards.  I do apologize to anyone who interpreted
that initial email as a pure rant, because it really wasn't intended
that way.  Contrariwise, I hope that the people defending this patch will
admit that the issues I am raising are real and focus on whether and
how those concerns can be addressed.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-17 Thread Robert Haas
On Fri, Jun 13, 2014 at 3:11 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Yeah, I was thinking something like this could work, but I would go
 further. Suppose you had separate GRANTable privileges for direct
 access to individual tables, bypassing RLS, e.g.

   GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name

So, is this one new privilege (DIRECT) or four separate new privileges
that are variants of the existing privileges (DIRECT SELECT, DIRECT
INSERT, DIRECT UPDATE, DIRECT DELETE)?

 Actually, given the fact that the majority of users won't be using
 RLS, I would be tempted to invert the above logic and have the new
 privilege be for LIMITED access (via RLS quals). So a user granted the
 normal SELECT privilege would be able to bypass RLS, but a user only
 granted LIMITED SELECT wouldn't.

Well, for the people who are not using RLS, there's no difference
anyway.  I think it matters more what users of RLS will expect from a
command like GRANT SELECT ... and I'm guessing they'll prefer that RLS
always apply unless they very specifically grant the right for RLS to
not apply.  I might be wrong, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-17 Thread Robert Haas
On Thu, Jun 12, 2014 at 8:13 PM, Stephen Frost sfr...@snowman.net wrote:
 I'm in full agreement we should clearly communicate the issues around
 pg_dump in particular, because they can't necessarily be eliminated
 altogether without some major work that's going to take a while to finish.
 And if the work-around is some sort of GUC for killing RLS altogether,
 that's ugly but not unacceptable to me as a short-term fix.

 A GUC which is enable / disable / error-instead may work quite well, with
 error-instead for pg_dump default if people really want it (there would have
 to be a way to disable that though, imv).

 Note that enable is default in general, disable would be for superuser only
 (or on start-up) to disable everything, and error-instead anyone could use
 but it would error instead of implementing RLS when querying an RLS-enabled
 table.

 This approach was suggested by an existing user testing out this RLS
 approach, to be fair, but it looks pretty sane to me as a way to address
 some of these concerns. Certainly open to other ideas and thoughts though.

In general, I agree that this is a good approach.  I think it will be
difficult to have a GUC with three values, one of which is
superuser-only and the other two of which are not.  I don't think
there's any precedent for something like that in the existing
framework, and I think it's likely we'll run into unpleasant corner
cases if we try to graft it in.  Also, I think we need to separate
things: whether the system is willing to allow the user to access the
table without RLS, and whether the user is willing to accept RLS if
the system deems it necessary.

For the first one, two solutions have been proposed.  The initial
proposal was to insist on RLS except for the superuser (and maybe the
table owner?).  Having a separate grantable privilege, as Dean
suggests, may be better.  I'll reply separately to that email also, as
I have a question about what he's proposing.

For the second one, I think the two most useful behaviors are normal
mode - i.e. allow access to the table, applying RLS predicates if
required and not applying them if I am exempt - and error-instead
mode - i.e. if my access to this table would be mediated by an RLS
predicate, then throw an error instead.  There's a third mode which
might be useful as well, which is even though I have the *right* to
bypass the RLS predicate on this table, please apply the predicate
anyway.  This could be used by the table owner in testing, for
example.  Here again, the level of granularity we want to provide is
an interesting question.  Having a GUC (e.g. enable_row_level_security
= on, off, force) would be adequate for pg_dump, but allowing the
table name to be qualified in the query, as proposed downthread, would
be more granular, admittedly at some parser cost.  I'm personally of
the view that we *at least* need the GUC, because that seems like the
best way to secure pg_dump, and perhaps other applications.  We can
and should give pg_dump an --allow-row-level-security flag, I think,
but pg_dump's default behavior should be to configure the system in
such a way that the dump will fail rather than complete with a subset
of the data.  I'm less sure whether we should have something that can
be used to qualify table names in particular queries.  I think it
would be really useful, but I'm concerned that it will require
creating additional fully-reserved keywords, which are somewhat
painful for users.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-17 Thread Stephen Frost
Robert,

On Tuesday, June 17, 2014, Robert Haas robertmh...@gmail.com wrote:

 After sending that one (1) email, I was promptly told that I'm very
 disappointed to hear that the mechanical pieces around making RLS easy
 for users to use ... is receiving such push-back.  The push-back, at
 that point in time, consisted of one (1) email.  Several more emails
 have been sent that time, including the above-quoted text, seeming to
 me to imply that the people who are concerned about this feature are
 being unreasonable.  I don't believe I am the only such person,
 although I may be the main one right at the moment, and you may not be
 entirely surprised to hear that I don't think I'm being unreasonable.


I'm on my phone at the moment but that looks like a quote from me. My email
and concern there was regarding the specific suggestion that we could check
off the RLS capability (which users have been asking us to provide nearly
since I started with PG) by saying that they could use Updatable SB views. I
did not intend it as a comment regarding the specific technical concerns
raised, and I have been responding to and trying to address those
independently and openly.

I've expressed elsewhere on this thread my gratitude that the technical
concerns are being brought up now, near the beginning of the cycle, so we
can address them. I've been working with others who are interested in RLS
on a wiki page to outline and understand the options and identify
dependencies and priorities. Hopefully the link will be posted shortly
(again, not at a computer right now) and we can get comments back. There
are some very specific questions which really need to be addressed and
which I've mentioned before (in particular the question of what user the
functions in a view definition should run as, both for normal views and
for SB views, and for when an RLS qual is included and run through that
framework, and whether doing so would address some of the concerns which
have been raised regarding selects running code).

Thanks,

Stephen


Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-17 Thread Robert Haas
On Mon, Jun 16, 2014 at 1:15 AM, Stephen Frost sfr...@snowman.net wrote:
 I'm not referring to the proposed implementation particularly; or at
 least not that aspect of it.  I don't think trying to run the view
 quals as the defining user is likely to be very appealing, because I
 think it's going to hurt performance, for example by preventing
 function inlining and requiring lots of user-ID switches.

 I understand that there are performance implications.  As mentioned to
 Tom, realistically, there's zero way to optimize at least some of these
 use-cases because they require a completely external module (eg:
 SELinux) to be involved in the decision about who can view what
 records.  If we can optimize that, it'd be by a completely different
 approach whereby we pull up the qual higher because we know the whole
 query only involves leakproof functions or similar, allowing us to only
 apply the filter to the final set of records prior to them being
 returned to the user.  The point being that such optimizations would
 happen independently and regardless of the quals or user-defined
 functions involved.  At the end of the day, I can't think of a better
 optimization for such a case (where we have to ask an external security
 module if a row is acceptable to return to the user) than that.  Is
 there something specific you're thinking about that we'd be missing out
 on?

Yeah, if we have to ask an external security module a question for
each row, there's little hope of any real optimization.  However, I
think there will be a significant number of cases where people will
want filtering clauses that can be realized by doing an index scan
instead of a sequential scan, and if we end up forcing a sequential
scan anyway, the feature will be useless to those people.

 But I'm not
 gonna complain if someone wants to mull it over and make a proposal
 for how to make it work.  Rather, my concern is that all we've got is
 what might be called the core of the feature; the actual guts of it.
 There are a lot of ancillary details that seem to me to be not worked
 out at all yet, or only half-baked.

 Perhaps it's just my experience, but I've been focused on the main core
 feature for quite some time and it feels like we're really close to
 having it there.  I agree that a few additional bits would be nice to
 have but these strike me as relatively straight-forward to implement
 overtop of this general construct.  I do see value in documenting these
 concerns and will see about making that happen, along with what the
 general viewpoints and thoughts are about how to address the concern.

I feel like there's quite a bit of work left to do around these
issues.  The technical bits may not be too hard, but deciding what we
want will take some thought and discussion.

  The current approach allows a nearly unlimited level of flexibility,
  should the user wish it, by being able to run user-defined code.
  Perhaps that would be considered 'one policy', but it could certainly
  take under consideration the calling user, the object being queried
  (if a function is defined per table, or if we provide a way to get
  that information in the function), etc.

 In theory, that's true.  But in practice, performance will suck unless
 the security qual is easily optimizable.  If your security qual is
 WHERE somecomplexfunction() you're going to have to implement that by
 sequential-scanning the table and evaluating the function for each
 row.

 That's not actually true today, is it?  Given our leak-proof attribute,
 if the qual is WHERE somecomplexfunction() AND leakprooffunctionx()
 then we would be able to push down the leak-proof function and not
 necessarily run a straight sequential scan, no?  Even so, though, we've
 had users who have tested exactly what this patch implements and they've
 been happy with their real-world use-cases.  I'm certainly all for
 optimization and would love to see us make this better for everyone, but
 I don't view that as a reason to delay this particular feature which is
 really just bringing us up to parity with other RDBMSs.

I'm a bit confused here, because your example seems to be totally
different from my example.  In my example, somecomplexfunction() will
get pushed down because it's the security qual; that needs to be
inside the security_barrier view, or a malicious user can subvert the
system by getting some other qual evaluated first.  In your example,
you seem to be imagining WHERE somecomplexfunction() AND
leakprooffunctionx() as queries sent by the untrusted user, in which
case, yes, the leak-proof one will get pushed down and the other one
will not.

 But that's probably not going to perform very well, because to match
 an index on sales_rep_id, or an index on partner_id, that's going to
 have to get simplified a whole lot, and that's probably not going to
 happen.  If we've only got one branch of the OR, I think we'll realize
 we can evaluate the subquery as an InitPlan and then use an 

Re: [HACKERS] pg_control is missing a field for LOBLKSIZE

2014-06-17 Thread Alvaro Herrera
Bruce Momjian wrote:
 On Tue, Jun 17, 2014 at 12:28:46PM -0400, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   Uh, I think pg_upgrade needs to check that they match too.
  
  Possibly.  What do you think it should do when examining a pg_control
  version that lacks the field?
 
 Good question.  I have existing cases where fields were removed, but not
 ones that were added.  As we have no way to query the old cluster's
 value for LOBLKSIZE, I think I will just add code to compare them if
 they _both_ exist.

Can't you compare it to the historic default value?  I mean, add the
assumption that people thus far have never tweaked it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] delta relations in AFTER triggers

2014-06-17 Thread Robert Haas
On Sat, Jun 14, 2014 at 7:56 PM, Kevin Grittner kgri...@ymail.com wrote:
 I looked at the standard, and initially tried to implement the
 standard syntax for this; however, it appeared that the reasons
 given for not using standard syntax for the row variables also
 apply to the transition relations (the term used by the standard).
 There isn't an obvious way to tie that in to all the PLs we
 support.  It could be done, but it seems like it would be intolerably
 ugly, and more fragile than what we have done so far.

I'm not too familiar with this area.  Can you describe what the
standard syntax for the row variables is, as opposed to our syntax?
Also, what's the standard syntax for the transition relations?

 Some things which I *did* follow from the standard: these new
 relations are only allowed within AFTER triggers, but are available
 in both AFTER STATEMENT and AFTER ROW triggers.  That is, an AFTER
 UPDATE ... FOR EACH ROW trigger could use both the OLD and NEW row
 variables as well as the delta relations (under whatever names we
 pick).  That probably won't be used very often, but I can imagine
 some cases where it might be useful.  I expect that these will
 normally be used in FOR EACH STATEMENT triggers.

I'm concerned about the performance implications of capturing the
delta relations unconditionally.  If a particular trigger actually
needs the delta relations, then the time spent capturing that
information is well spent; but if it doesn't, then it's a waste.
There are lots of people using FOR EACH STATEMENT triggers right now
who won't be happy if those start requiring O(n) additional temporary
storage, where n is the number of rows modified by the statement.
This is likely an even bigger issue for per-row triggers, of course,
where as you say, it probably won't be used often.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] btreecheck extension

2014-06-17 Thread Robert Haas
On Mon, Jun 16, 2014 at 9:47 PM, Peter Geoghegan p...@heroku.com wrote:
 As discussed at the developer meeting at pgCon, I think that there is
 a lot to be said for a tool that checks nbtree index invariants on
 live systems.

Me too.

 Attached prototype patch adds contrib extension, btreecheck.

I don't feel qualified to comment on any of the substantive issues you
raise, so instead I'd like to bikeshed the name.  I suggest that we
create one extension to be a repository for index-checking machinery
(and perhaps also heap-checking machinery) and view this as the first
of possibly several checkers to live there.  Otherwise, we may
eventually end up with separate extensions for btreecheck, hashcheck,
gistcheck, gincheck, spgistcheck, minmaxcheck, vodkacheck, heapcheck,
toastcheck, etc. which seems like excessive namespace pollution.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-17 Thread Robert Haas
On Mon, Jun 16, 2014 at 10:16 PM, Noah Misch n...@leadboat.com wrote:
 On Sat, Jun 14, 2014 at 10:37:36AM -0400, Tom Lane wrote:
 After giving somebody advice, for the Nth time, to install a
 memory-consumption ulimit instead of leaving his database to the tender
 mercies of the Linux OOM killer, it occurred to me to wonder why we don't
 provide a built-in feature for that, comparable to the ulimit -c max
 option that already exists in pg_ctl.

 In principle, I would like to have such a feature.  The only limit I've found
 reliable on the occasions I wanted this was RLIMIT_AS; RLIMIT_DATA has been
 ineffective on Linux+GNU libc.  RLIMIT_AS has its own problems, of course:
 address space usage is only tenuously connected to the definition of memory
 consumption folks actually want.  Worse, one can often construct a query to
 crash an RLIMIT_AS-affected backend.  Make the query use heap space until just
 shy of the limit, then burn stack space until RLIMIT_AS halts stack growth.

Ouch.  Having a feature that causes excessive memory utilization to
produce an ERROR that halts query execution and returns us to the top
level, as Tom and Greg were proposing, sounds nice, though even there
I wonder what happens if the memory exhaustion is due to things like
relcache bloat which will not be ameliorated by error recovery.  But
having a feature that crashes the backend in similar circumstances
doesn't sound nearly as nice.
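
For context, the ulimit being discussed is just a setrlimit() call; a minimal standalone sketch (a hypothetical helper, not what pg_ctl or the backend actually does) looks like this:

```c
#include <stdlib.h>
#include <sys/resource.h>

/* Lower this process's address-space soft limit to `bytes`.
 * Hypothetical demo helper, not PostgreSQL code.  Once the cap is in
 * place, allocations that would push the address space past it fail
 * with ENOMEM instead of inviting the OOM killer. */
int set_address_space_cap(size_t bytes)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;
    rl.rlim_cur = bytes;        /* soft limit; leave the hard limit alone */
    return setrlimit(RLIMIT_AS, &rl);
}
```

After calling set_address_space_cap(256UL * 1024 * 1024), a 1 GB malloc() returns NULL rather than triggering the OOM killer, which is exactly the failure mode Noah notes can bite at an awkward moment, e.g. during stack growth.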

We could do better by accounting for memory usage ourselves, inside
the memory-context system, but that'd probably impose some overhead we
don't have today.
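
A minimal sketch of what that accounting might look like at the malloc-chunk level; all names here are invented for illustration, and a real version would live inside the memory-context code and raise an ERROR rather than return NULL:

```c
#include <stdlib.h>

/* Invented names for illustration: a backend-wide byte counter
 * checked on every chunk request. */
static size_t accounted_bytes = 0;
static size_t memory_cap = 0;           /* 0 means no cap */

void set_memory_cap(size_t bytes) { memory_cap = bytes; }
size_t get_accounted_bytes(void) { return accounted_bytes; }

/* Count whole malloc chunks against the cap.  A real backend would
 * ereport(ERROR) on an over-limit request instead of returning NULL. */
void *accounted_malloc(size_t size)
{
    void *p;

    if (memory_cap != 0 && accounted_bytes + size > memory_cap)
        return NULL;
    p = malloc(size);
    if (p != NULL)
        accounted_bytes += size;
    return p;
}
```

The overhead is one comparison and one addition per chunk, which is the sense in which accounting only whole malloc chunks (rather than every palloc) keeps the cost small.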

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Audit of logout

2014-06-17 Thread Robert Haas
On Mon, Jun 16, 2014 at 4:14 PM, Stephen Frost sfr...@snowman.net wrote:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 Fujii Masao masao.fu...@gmail.com writes:
  That's harmful for audit purpose. I think that we should make
  log_disconnections PGC_SUSET rather than PGC_BACKEND in order
  to forbid non-superusers from changing its setting. Attached
  patch does this.

 I tend to agree with this- those are things which regular users really
 shouldn't be able to modify.  Policy enforcement can be done when there
 isn't system enforcement, but you really don't want to fall back to
 policy enforcement when you don't need to.

+1.

 TBH, this complaint seems entirely artificial.  If somebody needs to audit
 disconnections, and they see a connection record with no disconnection
 record but the session's no longer there, they certainly know that a
 disconnection occurred.  And if there wasn't a server crash to explain it,
 they go slap the wrist of whoever violated corporate policy by turning off
 log_disconnections.  Why do we need system-level enforcement of this?

 Going through every log file, and writing something to address log file
 changes, to hunt for those cases is no small amount of effort which
 you're asking every administrator with an audit requirement to write
 code to do to provide something which we make it appear as if we're
 doing for them.  It's certainly a violation of POLA that users can
 decide on their own that their disconnections don't get logged.

+1.

 I wonder whether we should just get rid of log_disconnections as a
 separate variable, instead logging disconnections when log_connections
 is set.

 Another answer is to make both variables PGC_SIGHUP, on the grounds
 that it doesn't make much sense for them not to be applied system-wide;
 except that I think there was some idea that logging might be enabled
 per-user or per-database using ALTER ROLE/DATABASE.

 Both of these options look pretty reasonable to me as a way to improve
 the current situation.  I'm not really sure that I see the use-case for
 only logging connections/disconnections on a per-user or per-database
 basis; in my experience it's not a lot of traffic to log it all and I
 don't recall ever seeing anyone set those anything other than
 system-wide.

 I like the idea of flexibility in what's logged, I just don't see this
 specific case as really needing it.

I don't really like either of these ideas much, and I don't really see
the problem with Fujii Masao's original proposal.  It's true that some
installations may find it valuable to preserve the property that a
disconnect is logged whenever a connect is logged, but if that's
really what's at issue, then presumably the superuser is not going to
be flipping these settings on and off on the fly *anyway*.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Atomics hardware support table supported architectures

2014-06-17 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-17 13:14:26 -0400, Robert Haas wrote:
 On Sat, Jun 14, 2014 at 9:12 PM, Andres Freund and...@2ndquadrant.com 
 wrote:

 3) sparcv8: Last released model 1997.

 I seem to recall hearing about this in a customer situation
 relatively recently, so there may be a few of these still
 kicking around out there.

 Really? As I'd written in a reply, Solaris 10 (released 2005)
 dropped support for it. Dropping support for a platform that was
 desupported 10 years ago by its manufacturer doesn't sound
 bad imo...

The release of version n doesn't mean that version n-1 is no longer
supported.  As of today, this page:

http://www.oracle.com/technetwork/server-storage/solaris10/overview/index-138972.html

says:

 The Oracle Solaris 9 operating system is now in the Extended
 Support period of its life cycle. Customers with support
 contracts can still access patches and log new service requests,
 although support contracts carry a surcharge for Oracle Solaris 9
 support.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 We could do better by accounting for memory usage ourselves, inside
 the memory-context system, but that'd probably impose some overhead we
 don't have today.

Hm.  We could minimize the overhead if we just accounted for entire
malloc chunks and not individual palloc allocations.  That would make
the overhead not zero, but yet probably small enough to ignore.

On the other hand, this approach would entirely fail to account for
non-palloc'd allocations, which could be a significant issue in some
contexts.

I wonder how practical it would be to forestall Noah's scenario by
preallocating all the stack space we want before enabling the rlimit.

Another idea would be to do the enforcement ourselves by means of
measuring the change in sbrk(0)'s reported value since startup, which we
could check whenever, say, we're about to request a large malloc chunk in
aset.c.  I'm not sure about the cost of that function though, nor about
whether this measurement method is meaningful in modern systems.
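
The sbrk-based measurement can be sketched in a few lines (a toy; note that on glibc, large requests are served via mmap() and never move the program break, so this undercounts):

```c
#define _DEFAULT_SOURCE         /* for sbrk() on glibc */
#include <unistd.h>
#include <stdlib.h>

/* Record the program break at startup, then report growth since then.
 * Illustrative only; sbrk() is deprecated and its meaning varies
 * across platforms and malloc implementations. */
static void *startup_brk;

void record_startup_brk(void)
{
    startup_brk = sbrk(0);
}

/* Bytes by which the program break has grown since startup. */
long brk_growth(void)
{
    return (char *) sbrk(0) - (char *) startup_brk;
}
```

A check like this could run whenever aset.c is about to request a large chunk, as suggested above, though whether the number means much on modern systems is the open question.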

regards, tom lane




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Robert Haas
On Tue, Jun 17, 2014 at 12:51 AM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Jun 16, 2014 at 8:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Not having looked at the patch, but: I think the probability of
 useless-noise HINTs could be substantially reduced if the code prints a
 HINT only when there is a single available alternative that is clearly
 better than the others in Levenshtein distance.  I'm not sure how much
 better is clearly better, but I exclude zero from that.  I see that
 the original description of the patch says that it will arbitrarily
 choose one alternative when there are several with equal Levenshtein
 distance, and I'd say that's a bad idea.

 I disagree. I happen to think that making some guess is better than no
 guess at all here, given the fact that there aren't too many
 possibilities to choose from. I think that it might be particularly
 annoying to not show some suggestion in the event of a would-be
 ambiguous column reference where the column name is itself wrong,
 since both mistakes are common. For example, "order_id" was specified
 instead of one of either "o.orderid" or "ol.orderid", as in my
 original examples. If some correct alias was specified, that would
 make the new code prefer the appropriate Var, but it might not be, and
 that should be okay in my view.

 I'm not trying to remove the need for human judgement here. We've all
 heard stories about people who did things like input "Portland" into
 their GPS only to end up in Maine rather than Oregon, but I think in
 general you can only go so far in worrying about those cases.

Emitting a suggestion with a large distance seems like it could be
rather irritating.  If the user types in "SELECT prodct_id FROM orders",
and that column does not exist, suggesting "product_id", if such a
column exists, will likely be well-received.  Suggesting a column
named, say, "price", however, will likely make at least some users say
"no, I didn't mean that, you stupid @%!#" - because probably the issue
there is that the user selected from the completely wrong table,
rather than getting 6 of the 9 characters they typed incorrect.

One existing tool that does something along these lines is 'git',
which seems to have some kind of a heuristic to know when to give up:

[rhaas pgsql]$ git gorp
git: 'gorp' is not a git command. See 'git --help'.

Did you mean this?
grep
[rhaas pgsql]$ git goop
git: 'goop' is not a git command. See 'git --help'.

Did you mean this?
grep
[rhaas pgsql]$ git good
git: 'good' is not a git command. See 'git --help'.
[rhaas pgsql]$ git puma
git: 'puma' is not a git command. See 'git --help'.

Did you mean one of these?
pull
push

I suspect that the maximum useful distance is a function of the string
length.  Certainly, if the distance is greater than or equal to the
length of one of the strings involved, it's just a totally unrelated
string and thus not worth suggesting.  A useful heuristic might be
something like distance at most 3, or at most half the string length,
whichever is less.
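
That heuristic is easy to state in code. A toy sketch (not the patch's actual implementation) of both the distance and the "at most 3, or half the string length, whichever is less" cutoff:

```c
#include <string.h>

#define MAXLEN 64               /* plenty for identifier-sized strings */

static int min3(int a, int b, int c)
{
    int m = a < b ? a : b;
    return m < c ? m : c;
}

/* Classic dynamic-programming Levenshtein distance. */
int levenshtein(const char *s, const char *t)
{
    int m = (int) strlen(s);
    int n = (int) strlen(t);
    int d[MAXLEN + 1][MAXLEN + 1];

    if (m > MAXLEN || n > MAXLEN)
        return m > n ? m : n;   /* too long; treat as far apart */
    for (int i = 0; i <= m; i++) d[i][0] = i;
    for (int j = 0; j <= n; j++) d[0][j] = j;
    for (int i = 1; i <= m; i++)
        for (int j = 1; j <= n; j++)
            d[i][j] = min3(d[i - 1][j] + 1,
                           d[i][j - 1] + 1,
                           d[i - 1][j - 1] + (s[i - 1] != t[j - 1]));
    return d[m][n];
}

/* Suggest `candidate` only if its distance from `typed` is at most 3,
 * or at most half of strlen(typed), whichever is less. */
int worth_suggesting(const char *typed, const char *candidate)
{
    int dist = levenshtein(typed, candidate);
    int cap = (int) strlen(typed) / 2;

    if (cap > 3)
        cap = 3;
    return dist > 0 && dist <= cap;
}
```

Under this rule, "prodct_id" suggests "product_id" (distance 1) but stays silent on "price", whose distance is well past the cutoff.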

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Built-in support for a memory consumption ulimit?

2014-06-17 Thread Robert Haas
On Tue, Jun 17, 2014 at 4:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 We could do better by accounting for memory usage ourselves, inside
 the memory-context system, but that'd probably impose some overhead we
 don't have today.

 Hm.  We could minimize the overhead if we just accounted for entire
 malloc chunks and not individual palloc allocations.  That would make
 the overhead not zero, but yet probably small enough to ignore.

Yeah, although it might expose more details of aset.c's allocation
behavior than we want users to have to know about.

 Another idea would be to do the enforcement ourselves by means of
 measuring the change in sbrk(0)'s reported value since startup, which we
 could check whenever, say, we're about to request a large malloc chunk in
 aset.c.  I'm not sure about the cost of that function though, nor about
 whether this measurement method is meaningful in modern systems.

I wouldn't like to count on that method.  I believe some malloc()
implementations implement large allocations via mmap().

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 17, 2014 at 12:51 AM, Peter Geoghegan p...@heroku.com wrote:
 I disagree. I happen to think that making some guess is better than no
 guess at all here, given the fact that there aren't too many
 possibilities to choose from.

 Emitting a suggestion with a large distance seems like it could be
 rather irritating.  If the user types in "SELECT prodct_id FROM orders",
 and that column does not exist, suggesting "product_id", if such a
 column exists, will likely be well-received.  Suggesting a column
 named, say, "price", however, will likely make at least some users say
 "no, I didn't mean that, you stupid @%!#" - because probably the issue
 there is that the user selected from the completely wrong table,
 rather than getting 6 of the 9 characters they typed incorrect.

Yeah, that's my point exactly.  There's no very good reason to assume that
the intended answer is in fact among the set of column names we can see;
and if it *is* there, the Levenshtein distance to it isn't going to be
all that large.  I think that suggesting "foobar" when the user typed
"glorp" is not only not helpful, but makes us look like idiots.

 One existing tool that does something along these lines is 'git',
 which seems to have some kind of a heuristic to know when to give up:

I wouldn't necessarily hold up git as a model of user interface
engineering ;-) ... but still, it might be interesting to take a look
at exactly what heuristics they used here.  I'm sure there are other
precedents we could look at, too.

regards, tom lane




Re: [HACKERS] btreecheck extension

2014-06-17 Thread Peter Geoghegan
On Tue, Jun 17, 2014 at 1:16 PM, Robert Haas robertmh...@gmail.com wrote:
 I don't feel qualified to comment on any of the substantive issues you
 raise, so instead I'd like to bikeshed the name.  I suggest that we
 create one extension to be a repository for index-checking machinery
 (and perhaps also heap-checking machinery) and view this as the first
 of possibly several checkers to live there.  Otherwise, we may
 eventually end up with separate extensions for btreecheck, hashcheck,
 gistcheck, gincheck, spgistcheck, minmaxcheck, vodkacheck, heapcheck,
 toastcheck, etc. which seems like excessive namespace pollution.

I agree.

I hope that we'll eventually be able to move code like this into each
AM, with something like an amverifyintegrity pg_am entry optionally
provided. There'd also be a utility statement that would perform this
kind of verification. It seems natural to do this, as the patch I've
posted arguably adds a big modularity violation. Besides, it seems
worthwhile to pepper the regular regression tests with calls like
these, at least in some places, and putting something in core is the
most convenient way to do that.

-- 
Peter Geoghegan




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:

 I wouldn't necessarily hold up git as a model of user interface
 engineering ;-) ... but still, it might be interesting to take a
 look at exactly what heuristics they used here.  I'm sure there
 are other precedents we could look at, too.

On my Ubuntu machine, bash does something similar.  A few examples
chosen completely arbitrarily:

kgrittn@Kevin-Desktop:~$ got
No command 'got' found, did you mean:
 Command 'go' from package 'golang-go' (universe)
 Command 'gout' from package 'scotch' (universe)
 Command 'jot' from package 'athena-jot' (universe)
 Command 'go2' from package 'go2' (universe)
 Command 'git' from package 'git' (main)
 Command 'gpt' from package 'gpt' (universe)
 Command 'gom' from package 'gom' (universe)
 Command 'goo' from package 'goo' (universe)
 Command 'gst' from package 'gnu-smalltalk' (universe)
 Command 'dot' from package 'graphviz' (main)
 Command 'god' from package 'god' (universe)
 Command 'god' from package 'ruby-god' (universe)
got: command not found
kgrittn@Kevin-Desktop:~$ groupad
No command 'groupad' found, did you mean:
 Command 'groupadd' from package 'passwd' (main)
 Command 'groupd' from package 'cman' (main)
groupad: command not found
kgrittn@Kevin-Desktop:~$ asdf
No command 'asdf' found, did you mean:
 Command 'asdfg' from package 'aoeui' (universe)
 Command 'sadf' from package 'sysstat' (main)
 Command 'sdf' from package 'sdf' (universe)
asdf: command not found
kgrittn@Kevin-Desktop:~$ zxcv
zxcv: command not found

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Josh Berkus
On 06/17/2014 01:59 PM, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 Emitting a suggestion with a large distance seems like it could be
 rather irritating.  If the user types in "SELECT prodct_id FROM orders",
 and that column does not exist, suggesting "product_id", if such a
 column exists, will likely be well-received.  Suggesting a column
 named, say, "price", however, will likely make at least some users say
 "no, I didn't mean that, you stupid @%!#" - because probably the issue
 there is that the user selected from the completely wrong table,
 rather than getting 6 of the 9 characters they typed incorrect.
 
 Yeah, that's my point exactly.  There's no very good reason to assume that
 the intended answer is in fact among the set of column names we can see;
 and if it *is* there, the Levenshtein distance to it isn't going to be
 all that large.  I think that suggesting "foobar" when the user typed
 "glorp" is not only not helpful, but makes us look like idiots.

Well, there's two different issues:

(1) offering a suggestion which is too different from what the user
typed.  This is easily limited by having a max distance (most likely a
distance/length ratio, with a max of, say, 0.5).  The only drawbacks of
this would be the extra CPU cycles to calculate it, and some arguments
about what the max distance should be.  But for the sake of the
children, let's not have a GUC for it.

(2) If there are multiple columns with the same Levenshtein distance,
which one do you suggest?  The current code picks a random one, which
I'm OK with.  The other option would be to list all of the columns.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Minmax indexes

2014-06-17 Thread Josh Berkus
On 06/17/2014 09:14 AM, Robert Haas wrote:
 Well, I don't know: suppose you're loading geospatial data showing the
 location of every building in some country.  It might easily be the
 case that the data is or can be loaded in an order that provides
 pretty good spatial locality, leading to tight bounding boxes over
 physically consecutive data ranges.

I admin a production application which has exactly this.  However, that
application doesn't have big enough data to benefit from minmax indexes;
it uses the basic spatial indexes.

So, my $0.02: bounding-box minmax falls under the heading of "would be
nice to have," but not if it delays the feature.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Peter Geoghegan
On Tue, Jun 17, 2014 at 1:59 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Yeah, that's my point exactly.  There's no very good reason to assume that
 the intended answer is in fact among the set of column names we can see;
 and if it *is* there, the Levenshtein distance to it isn't going to be
 all that large.  I think that suggesting "foobar" when the user typed
 "glorp" is not only not helpful, but makes us look like idiots.

Maybe that's just a matter of phrasing the message appropriately. A
more guarded message, one that suggests that "foobar" is the *best*
match, is correct at least on its own terms (terms that are
self-evident). This does pretty effectively communicate to the user
that they should totally rethink not just the column name, but
perhaps the entire query. On the other hand, showing nothing
communicates nothing.


-- 
Peter Geoghegan




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 (2) If there are multiple columns with the same Levenshtein distance,
 which one do you suggest?  The current code picks a random one, which
 I'm OK with.  The other option would be to list all of the columns.

I objected to that upthread.  I don't think that picking a random one is
sane at all.  Listing them all might be OK (I notice that that seems to be
what both bash and git do).

Another issue is whether to print only those having exactly the minimum
observed Levenshtein distance, or to print everything less than some
cutoff.  The former approach seems to me to be placing a great deal of
faith in something that's only a heuristic.

regards, tom lane




Re: [HACKERS] WAL replay bugs

2014-06-17 Thread Michael Paquier
On Wed, Jun 18, 2014 at 1:40 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Jun 2, 2014 at 8:55 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 I'm not sure if this is reasonably possible, but one thing that would
 make this tool a whole lot easier to use would be if you could make
 all the magic happen in a single server.  For example, suppose you had
 a background process that somehow got access to the pre and post
 images for every buffer change, and the associated WAL record, and
 tried applying the WAL record to the pre-image to see whether it got
 the corresponding post-image.  Then you could run 'make check' or so
 and afterwards do something like psql -c 'SELECT * FROM
 wal_replay_problems()' and hopefully get no rows back.
So your point is to have a third, independent server in the process
that would compare images taken from a master and its standby? That
seems to complicate the machinery.

 Don't get me wrong, having this tool at all sounds great.  But I think
 to really get the full benefit out of it we need to be able to run it
 in the buildfarm, so that if people break stuff it gets noticed
 quickly.
The patch I sent includes a regression test suite that makes testing
straightforward: it's just a matter of running "make check" in the
contrib directory containing the binary that compares buffer captures
between a master and a standby.

Thanks,
-- 
Michael




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Josh Berkus
On 06/17/2014 02:36 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 (2) If there are multiple columns with the same Levenshtein distance,
 which one do you suggest?  The current code picks a random one, which
 I'm OK with.  The other option would be to list all of the columns.
 
 I objected to that upthread.  I don't think that picking a random one is
 sane at all.  Listing them all might be OK (I notice that that seems to be
 what both bash and git do).
 
 Another issue is whether to print only those having exactly the minimum
 observed Levenshtein distance, or to print everything less than some
 cutoff.  The former approach seems to me to be placing a great deal of
 faith in something that's only a heuristic.

Well, that depends on what the cutoff is.  If it's high, like 0.5, that
could be a LOT of columns.  Like, I plan to test this feature with a
3-table join that has a combined 300 columns.  I can completely imagine
coming up with a string which is within 0.5 or even 0.3 of 40 column names.

So if we want to list everything below a cutoff, we'd need to make that
cutoff fairly narrow, like 0.2.  But that means we'd miss a lot of
potential matches on short column names.

I really think we're overthinking this: it is just a HINT, and we can
improve it in future PostgreSQL versions, and most of our users will
ignore it anyway because they'll be using a client which doesn't display
HINTs.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Tom Lane
Peter Geoghegan p...@heroku.com writes:
 Maybe that's just a matter of phrasing the message appropriately. A
 more guarded message, that suggests that foobar is the *best* match
 is correct at least on its own terms (terms that are self evident).
 This does pretty effectively communicate to the user that they should
 totally rethink not just the column name, but perhaps the entire
 query. On the other hand, showing nothing communicates nothing.

I don't especially buy that argument.  As soon as the user's gotten used
to hints of this sort, the absence of a hint communicates plenty.

In any case, people have now cited two different systems with suggestion
capability, and neither of them behaves as you're arguing for.  The lack
of precedent should give you pause, unless you can point to widely-used
systems that do what you have in mind.

regards, tom lane




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 On 06/17/2014 02:36 PM, Tom Lane wrote:
 Another issue is whether to print only those having exactly the minimum
 observed Levenshtein distance, or to print everything less than some
 cutoff.  The former approach seems to me to be placing a great deal of
 faith in something that's only a heuristic.

 Well, that depends on what the cutoff is.  If it's high, like 0.5, that
 could be a LOT of columns.  Like, I plan to test this feature with a
 3-table join that has a combined 300 columns.  I can completely imagine
 coming up with a string which is within 0.5 or even 0.3 of 40 column names.

I think Levenshtein distances are integers, though that's just a minor
point.

 So if we want to list everything below a cutoff, we'd need to make that
 cutoff fairly narrow, like 0.2.  But that means we'd miss a lot of
 potential matches on short column names.

I'm not proposing an immutable cutoff.  Something that scales with the
string length might be a good idea, or we could make it a multiple of
the minimum observed distance, or probably there are a dozen other things
we could do.  I'm just saying that if we have an alternative at distance
3, and another one at distance 4, it's not clear to me that we should
assume that the first one is certainly what the user had in mind.
Especially not if all the other alternatives are distance 10 or more.

 I really think we're overthinking this: it is just a HINT, and we can
 improve it in future PostgreSQL versions, and most of our users will
 ignore it anyway because they'll be using a client which doesn't display
 HINTs.

Agreed that we can make it better later.  But whether it prints exactly
one suggestion, and whether it does that no matter how silly the
suggestion is, are rather fundamental decisions.

regards, tom lane




Re: [HACKERS] Doing better at HINTing an appropriate column within errorMissingColumn()

2014-06-17 Thread Josh Berkus
On 06/17/2014 02:53 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 On 06/17/2014 02:36 PM, Tom Lane wrote:
 Another issue is whether to print only those having exactly the minimum
 observed Levenshtein distance, or to print everything less than some
 cutoff.  The former approach seems to me to be placing a great deal of
 faith in something that's only a heuristic.
 
 Well, that depends on what the cutoff is.  If it's high, like 0.5, that
 could be a LOT of columns.  Like, I plan to test this feature with a
 3-table join that has a combined 300 columns.  I can completely imagine
 coming up with a string which is within 0.5 or even 0.3 of 40 column names.
 
 I think Levenshtein distances are integers, though that's just a minor
 point.

I was giving distance/length ratios.  That is, 0.5 would mean that up to
50% of the characters could be replaced/changed.  0.2 would mean that
only one character could be changed at lengths of five characters.  Etc.

The problem with these ratios is that they behave differently with long
strings than short ones.  I think realistically we'd need a double
threshold, i.e. (distance <= 2 OR ratio <= 0.4).  Otherwise the
obvious case, getting two characters wrong in a 4-character column name
(or one in a two-character name), doesn't get a HINT.
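
That double threshold reduces to a one-line predicate; a sketch, with the 2 and 0.4 constants taken from the discussion above rather than settled on:

```c
/* HINT when the distance is small in absolute terms, OR small relative
 * to the column name's length, so short names still get a chance.
 * Constants are the ones floated in this thread, purely illustrative. */
int hint_worthy(int distance, int name_len)
{
    if (distance <= 2)
        return 1;
    return name_len > 0 &&
           (double) distance / (double) name_len <= 0.4;
}
```

This way the two-wrong-characters-in-a-4-character-name case passes on the absolute branch even though its ratio (0.5) fails the relative one.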

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



