Re: [HACKERS] pg_receivexlog and replication slots

2014-09-22 Thread Michael Paquier
On Mon, Sep 22, 2014 at 2:25 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Sat, Sep 20, 2014 at 10:06 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Sat, Sep 20, 2014 at 7:09 AM, Amit Kapila amit.kapil...@gmail.com
 wrote:
  3.
  I find existing comments okay, is there a need to change
  in this case?  Part of the new comment mentions
  obtaining start LSN position, actually the start position is
  identified as part of next function call FindStreamingStart(),
  only in case it returns InvalidXLogRecPtr; the value returned
  by IDENTIFY_SYSTEM will be used.
 Hopefully I am following you correctly here: the comment has been updated
 to mention that the start LSN and TLI from IDENTIFY_SYSTEM are
 always fetched but used only if no valid position is found in the output
 folder of pg_receivexlog.

 The updated comment is consistent with the code; however, my main point
 was that in this case, I don't see much need to change the existing
 comment.  I think this point is more related to each individual's
 perspective, so if you think there is a need to update the existing
 comment, then retain as it is in your patch and let Committer
 take a call about it.
Well, let's use your suggestion and switch back to the former comment then.

  4.
  + /* Obtain a connection before doing anything */
  + conn = GetConnection();
  + if (!conn)
  + /* Error message already written in GetConnection() */
  Is there a reason for moving this function out of StreamLog()?
  There is no harm in moving it, but there doesn't seem to be any
  problem even if it is called inside StreamLog().
 For pg_recvlogical, this move is done to reduce code redundancy, and
 actually it makes sense to just grab one connection in the main() code
 path before performing any replication commands. For pg_receivexlog,
 the move is done because it makes the code more consistent with
 pg_recvlogical, and also it is a preparatory work for the second patch
 that introduces the create/drop slot. Even if the 2nd patch is not
 accepted, I figured it would not hurt to simply grab one
 connection in the main code path before doing any action.

 In pg_receivexlog, StreamLog() calls PQfinish() to close a connection
 to the backend, and StreamLog() is called in a while(true) loop,
 so if you just grab the connection once in the main loop, the current
 logic won't work.  For the same reasons, the current coding related to
 GetConnection() in pg_recvlogical seems to be right, so it is better
 not to change that as well.  For the second patch (that introduces the
 create/drop slot), I think it is better to do it in a way similar to what
 the current pg_recvlogical does.
OK, let's do so then. I changed the GetConnection stuff back to
what is done on master. In the second patch, I added an extra call to
GetConnection before the create/drop commands.
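
For illustration, the control flow Amit describes boils down to something
like this self-contained toy (StreamLog() and time_to_abort mirror the
pg_receivexlog names; the body here is a stub, not the real implementation):

#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

static volatile bool time_to_abort = false;	/* set by a signal handler in the real tool */

/*
 * Stub standing in for pg_receivexlog's StreamLog(): it must obtain its
 * own connection (GetConnection()) and PQfinish() it before returning,
 * because the caller loops and may need a fresh connection each time.
 */
static void
StreamLog(void)
{
	/* connect, stream WAL until error or shutdown, disconnect */
}

int
main(void)
{
	while (true)
	{
		StreamLog();
		if (time_to_abort)
			break;
		fprintf(stderr, "disconnected; waiting 5 seconds to try again\n");
		sleep(5);
	}
	return 0;
}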

  6.
  + /* Identify system, obtaining start LSN position at the same time */
  + if (!RunIdentifySystem(conn,
 NULL, NULL, &startpos, &plugin_name))
  + disconnect_and_exit(1);
  a.
  Generally IdentifySystem is called as the first API, but I think you
  have changed its location to after CreateReplicationSlot as you want
  to double-check the value of plugin; not sure if that is necessary or
  important enough that we start calling it later.
 The funny part here is that even the current code on master and
 REL9_4_STABLE of pg_recvlogical.c fetches a start position when
 creating a replication slot, where it is not used, as two different actions
 cannot be used at the same time. That's a matter of removing this line
 in the code though:
 startpos = ((uint64) hi) << 32 | lo;

 The user is not allowed to give startpos with drop or create of a replication
 slot.  It is prevented by this check:
 if (startpos != InvalidXLogRecPtr && (do_create_slot || do_drop_slot))
 So it seems you cannot remove the startpos assignment in the code.
Ah yes, true: it is actually possible to start the replication process
after creating the slot, so better not to remove it... I have updated
CreateReplicationSlot to add startpos to the fields that can be
requested from the results.

 As that's only cosmetic for 9.4, and this patch makes things more
 correct, I guess it's fine to do nothing and just try to get this
 patch in.

 In what sense do you think that this part of the patch is better than
 the current code?
I was trying to make the code as consistent as possible between the two
utilities, but your approach makes more sense.

 I think calling Identify_System as the first command makes sense
 (as it is in the current code) because if that fails we should not
 proceed with execution of the other commands.
Yes, looking at that again, you are right.

 Another point about the refactoring patch is that the fourth column in
 Identify_System is dbname, and in the patch you are referring to it as
 plugin, which seems to be inconsistent.
Oops. Thanks for checking, I changed the check to have something for
the database name.

At the same time, I noticed an unnecessary limitation in the second
patch: we should be able to 

[HACKERS] libpq connection status and closed fd

2014-09-22 Thread Daniele Varrazzo
Hello,

a psycopg user is reporting [1] that the library is not marking the
connection as closed and/or bad after certain errors, such as a
connection timeout. He is emulating the error by closing the
connection fd (I don't know if the two conditions result in the same
effect, but I'll stick to this hypothesis for now).

[1] https://github.com/psycopg/psycopg2/issues/263

Testing with a short C program [2] it seems that indeed, after closing
the fd and causing an error in `PQconsumeInput`, the connection is
deemed in good state by `PQstatus`. A similar test gives the same
result after `PQexec` is attempted on a connection whose fd is closed
(tests performed with PostgreSQL 9.3.5).

[2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f

Is this intentional? Is there a better way to check for a broken connection?

If we mark the connection as closed every time `PQconsumeInput`
returns 0 (or `PQexec` returns null, which are the two code paths
affecting psycopg) would it be too aggressive and cause false
positives?

Thank you very much.

-- Daniele




[HACKERS] Documentation fix for pg_recvlogical's --create mode

2014-09-22 Thread Michael Paquier
Hi all,

In the documentation of pg_recvlogical here
(http://www.postgresql.org/docs/devel/static/app-pgrecvlogical.html),
there is the following sentence:
Create a new logical replication slot with the name specified in
--slot, using the output plugin --plugin, then exit.
Actually, that's not completely true: we can create a replication slot
and then start streaming by combining --create and --start. The process
will only exit if --start is not used. Patch is attached.
Regards,
-- 
Michael
diff --git a/doc/src/sgml/ref/pg_recvlogical.sgml b/doc/src/sgml/ref/pg_recvlogical.sgml
index ce5ad5e..76240fe 100644
--- a/doc/src/sgml/ref/pg_recvlogical.sgml
+++ b/doc/src/sgml/ref/pg_recvlogical.sgml
@@ -58,8 +58,8 @@ PostgreSQL documentation
     <para>
      Create a new logical replication slot with the name specified in
      <option>--slot</option>, using the output plugin
-     <option>--plugin</option>, then exit. The slot is created for the
-     database given in <option>--dbname</option>.
+     <option>--plugin</option>. The slot is created for the database
+     given in <option>--dbname</option>.
     </para>
    </listitem>
   </varlistentry>



Re: [HACKERS] Scaling shared buffer eviction

2014-09-22 Thread Amit Kapila
On Mon, Sep 22, 2014 at 10:43 AM, Gregory Smith gregsmithpg...@gmail.com
wrote:

 On 9/16/14, 8:18 AM, Amit Kapila wrote:

 I think the main reason for the slight difference is that
 when the size of shared buffers is almost the same as the data size, the number
 of buffers it needs from the clock sweep is very small; as an example, in the first
 case (when the size of shared buffers is 12286MB), it actually needs at most
 256 additional buffers (2MB) via the clock sweep, whereas bgreclaimer
 will put 2000 (the high water mark) additional buffers (0.5% of shared buffers
 is greater than 2000) in the free list, so bgreclaimer does some extra work
 when it is not required.

 This is exactly what I was warning about, as the sort of lesson learned
 from the last round of such tuning.  There are going to be spots where
 trying to tune the code to be aggressive on the hard cases will work
 great.  But you need to make that dynamic to some degree, such that the
 code doesn't waste a lot of time sweeping buffers when the demand for them
 is actually weak.  That will make all sorts of cases that look like this
 slower.


To verify whether the above can lead to any kind of regression, I have
checked the cases (workload is 0.05 or 0.1 percent larger than shared
buffers) where we need a few extra buffers and bgreclaimer might put
some additional buffers, and it turns out that in those cases also, there
is a win especially at high concurrency and results of the same are posted
upthread
(
http://www.postgresql.org/message-id/CAA4eK1LFGcvzMdcD5NZx7B2gCbP1G7vWK7w32EZk=voolud...@mail.gmail.com).



 We should be able to tell these apart if there's enough instrumentation
 and solid logic inside of the program itself though.  The 8.3 era BGW coped
 with a lot of these issues using a particular style of moving average with
 fast reaction time, plus instrumenting the buffer allocation rate as
 accurately as it could. So before getting into high/low watermark
 questions, are you comfortable that there's a clear, accurate number that
 measures the activity level that's important here?



Very Good Question.  This was exactly the thing which was
missing in my initial versions (about 2 years back when I tried to
solve this problem) but based on Robert's and Andres's feedback
I realized that we need an accurate number to measure the activity
level (in this case it is consumption of buffers from freelist), so
I have introduced the logic to calculate the same (it is stored in new
variable numFreeListBuffers in BufferStrategyControl structure).



 And have you considered ways it might be averaging over time or have a
 history that's analyzed?



The current logic of bgreclaimer is such that even if it does
some extra activity (extra is very much controlled) in one cycle,
it will not start another cycle unless backends consume all the
buffers that were made available by bgreclaimer in one cycle.
I think the algorithm designed for bgreclaimer automatically
averages out based on activity.  Do you see any cases where it
will not do so?


 The exact fast approach / slow decay weighted moving average approach of
 the 8.3 BGW, the thing that tried to smooth the erratic data set possible
 here, was a pretty critical part of getting itself auto-tuning to workload
 size.  It ended up being much more important than the work of setting the
 arbitrary watermark levels.


Agreed, but the logic with which bgwriter works is pretty different
and that's why it needs a different kind of logic to handle auto-tuning.
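
For reference, the fast-reaction smoothing Greg describes for the 8.3-era
bgwriter can be sketched roughly as below; this is a toy, not the actual
bgwriter code, and the decay constant is purely illustrative:

/*
 * Weighted moving average with fast attack / slow decay: jump to the
 * latest allocation rate immediately when demand rises, but bleed off
 * slowly when it falls, so short lulls don't collapse the estimate.
 */
static double
smoothed_alloc_rate(double smoothed, double latest)
{
	if (latest > smoothed)
		return latest;			/* react instantly to spikes */
	return smoothed + (latest - smoothed) / 16.0;	/* decay gradually */
}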

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Index scan optimization

2014-09-22 Thread Heikki Linnakangas

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:

I have observed scope for a considerable performance improvement in the case
of index scans by a very minor code change.
Consider the below schema:

create table tbl2(id1 int, id2 varchar(10), id3 int);
create index idx2 on tbl2(id2, id3);

Query as:
 select count(*) from tbl2 where id2 > 'a' and id3 > 99;

As per current design, it takes following steps to retrieve index tuples:

1.   Find the scan start position by searching for the first position in the
B-tree as per the first key condition, i.e. as per id2 > 'a'

2.   Then it fetches each tuple from the position found in step 1.

3.   For each tuple, it checks all scan key conditions; in our example it
checks both scan key conditions.

4.   If the conditions match, it returns the tuple; otherwise the scan stops.

Now the problem is that the first scan key condition has already been matched
to find the scan start position (step 1), so it is obvious that any further
tuple will also match the first scan key condition (as records are sorted).
So comparing the first scan key condition again in step 3 seems redundant.

So my proposal is to skip the condition check on the first scan key condition 
for every tuple.


The same happens in a single-column case. If you have a query like 
SELECT * FROM tbl2 where id2 > 'a', once you've found the start 
position of the scan, you know that all the rows that follow match too.



I would like to submit the patch for this improvement.
Please provide your feedback. Also let me know if I am missing something.


Yeah, sounds like a good idea. This scenario might not arise very often, 
but it should be cheap to check, so I doubt it will add any measurable 
overhead to the cases where the optimization doesn't help.


- Heikki





Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Dmitriy Igrishin
2014-09-22 10:42 GMT+04:00 Daniele Varrazzo daniele.varra...@gmail.com:

 Hello,

 a psycopg user is reporting [1] that the library is not marking the
 connection as closed and/or bad after certain errors, such as a
 connection timeout. He is emulating the error by closing the
 connection fd (I don't know if the two conditions result in the same
 effect, but I'll stick to this hypothesis for now).

 [1] https://github.com/psycopg/psycopg2/issues/263

 Testing with a short C program [2] it seems that indeed, after closing
 the fd and causing an error in `PQconsumeInput`, the connection is
 deemed in good state by `PQstatus`. A similar test gives the same
 result after `PQexec` is attempted on a connection whose fd is closed
 (tests performed with PostgreSQL 9.3.5).

 [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f

Why are you using close() instead of PQfinish()?



 Is this intentional? Is there a better way to check for a broken
 connection?

BTW, PQsocket() returns -1 after PQfinish().


 If we mark the connection as closed every time `PQconsumeInput`
 returns 0 (or `PQexec` returns null, which are the two code paths
 affecting psycopg) would it be too aggressive and cause false
 positives?

 Thank you very much.

 -- Daniele






-- 
// Dmitriy.


Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Daniele Varrazzo
On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin dmit...@gmail.com wrote:

 2014-09-22 10:42 GMT+04:00 Daniele Varrazzo daniele.varra...@gmail.com:

 [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f

 Why are you using close() instead of PQfinish()?

Because I'm testing for an error, please read my message and the
original bug report.

-- Daniele




Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Dmitriy Igrishin
2014-09-22 11:35 GMT+04:00 Daniele Varrazzo daniele.varra...@gmail.com:

 On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin dmit...@gmail.com
 wrote:
 
  2014-09-22 10:42 GMT+04:00 Daniele Varrazzo daniele.varra...@gmail.com
 :

  [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f
 
  Why are you using close() instead of PQfinish()?

 Because I'm testing for an error, please read my message and the
 original bug report.

I read it. You are testing for an error, but you use libpq in the wrong way in
your test case. You must use PQfinish() to close the connection,
and PQstatus() will work for you.


 -- Daniele




-- 
// Dmitriy.


Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Marko Tiikkaja

On 9/22/14 9:45 AM, Dmitriy Igrishin wrote:

2014-09-22 11:35 GMT+04:00 Daniele Varrazzo daniele.varra...@gmail.com:


On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin dmit...@gmail.com
wrote:

Why are you using close() instead of PQfinish()?


Because I'm testing for an error, please read my message and the
original bug report.


I read it. You are testing for an error, but use libpq in a wrong way in
your test case. You must use PQfinish() to close the connection
and PQstatus() will works for you.


Perhaps you should go back and re-read it then.  The point of the test 
case is not to test connection closure; it's to test behaviour in the 
presence of network errors.



.marko




Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Dmitriy Igrishin
2014-09-22 12:36 GMT+04:00 Marko Tiikkaja ma...@joh.to:

 On 9/22/14 9:45 AM, Dmitriy Igrishin wrote:

 2014-09-22 11:35 GMT+04:00 Daniele Varrazzo daniele.varra...@gmail.com:

  On Mon, Sep 22, 2014 at 8:17 AM, Dmitriy Igrishin dmit...@gmail.com
 wrote:

 Why are you using close() instead of PQfinish()?


 Because I'm testing for an error, please read my message and the
 original bug report.

  I read it. You are testing for an error, but use libpq in a wrong way in
 your test case. You must use PQfinish() to close the connection
 and PQstatus() will works for you.


 Perhaps you should go back and re-read it then.  The point of the test
 case is not to test connection closure; it's to test behaviour in the
 presence of network errors.

And where is the network error emulation in the test case? By closing the fd?
I'm sorry if I don't understand something, but really, I don't see any
problem or incorrect behavior in *libpq*. Its behavior is adequate to the
test case.




 .marko




-- 
// Dmitriy.


Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Andres Freund
On 2014-09-22 07:42:01 +0100, Daniele Varrazzo wrote:
 Hello,
 
 a psycopg user is reporting [1] that the library is not marking the
 connection as closed and/or bad after certain errors, such as a
 connection timeout. He is emulating the error by closing the
 connection fd (I don't know if the two conditions result in the same
 effect, but I'll stick to this hypothesis for now).
 
 [1] https://github.com/psycopg/psycopg2/issues/263
 
 Testing with a short C program [2] it seems that indeed, after closing
 the fd and causing an error in `PQconsumeInput`, the connection is
 deemed in good state by `PQstatus`. A similar test gives the same
 result after `PQexec` is attempted on a connection whose fd is closed
 (tests performed with PostgreSQL 9.3.5).
 
 [2] https://gist.github.com/dvarrazzo/065f343c95f8ea67cf8f
 
 Is this intentional? Is there a better way to check for a broken connection?

Note that the libpq code treats connection resets differently from
other, arbitrary, errors:

int
pqReadData(PGconn *conn)
{
...
	nread = pqsecure_read(conn, conn->inBuffer + conn->inEnd,
						  conn->inBufSize - conn->inEnd);
	if (nread < 0)
	{
		if (SOCK_ERRNO == EINTR)
			goto retry3;
		/* Some systems return EAGAIN/EWOULDBLOCK for no data */
#ifdef EAGAIN
		if (SOCK_ERRNO == EAGAIN)
			return someread;
#endif
#if defined(EWOULDBLOCK) && (!defined(EAGAIN) || (EWOULDBLOCK != EAGAIN))
		if (SOCK_ERRNO == EWOULDBLOCK)
			return someread;
#endif
		/* We might get ECONNRESET here if using TCP and backend died */
#ifdef ECONNRESET
		if (SOCK_ERRNO == ECONNRESET)
			goto definitelyFailed;
#endif
		/* pqsecure_read set the error message for us */
		return -1;
	}


I.e. if the kernel returns that the connection has been closed it'll do
different stuff.

So I'm not convinced that the test case is valid. This isn't the error
you'd get after a timeout or similar. We could add a special case path
for EBADFD, but why?

 If we mark the connection as closed every time `PQconsumeInput`
 returns 0 (or `PQexec` returns null, which are the two code paths
 affecting psycopg) would it be too aggressive and cause false
 positives?

Likely, yes.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Marko Tiikkaja

On 9/22/14 10:57 AM, Andres Freund wrote:

On 2014-09-22 07:42:01 +0100, Daniele Varrazzo wrote:

Is this intentional? Is there a better way to check for a broken connection?


Note that the libpq code treats connection resets differently from
other, arbitrary, errors:

I.e. if the kernel returns that the connection has been closed it'll do
different stuff.

So I'm not convinced that the test case is valid. This isn't the error
you'd get after a timeout or similar. We could add a special case path
for EBADFD, but why?


Probably not for EBADFD, but how about e.g. ETIMEDOUT?  The SSL code 
path seems to be putting EPIPE into the same category as ECONNRESET, too.



.marko




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-09-22 Thread Rahila Syed
Hello All,


Well, from Rahila's point of view the patch is actually attached, but
she's posting from the Nabble interface, which mangles it and turns it into
a link instead.

Yes.



but the end result is the
same: to properly submit a patch, you need to send an email to the
pgsql-hackers@postgresql.org mailing list, not join a group/forum from
some intermediary newsgroup site that mirrors the list.


Thank you. I will take care of it henceforth.

Please find attached the patch to compress FPW.  Patch submitted by
Fujii-san earlier in the thread is used to merge compression GUC with
full_page_writes.



I am reposting the measurement numbers.

Server Specification:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
RAM: 32GB
Disk : HDWD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm

Checkpoint segments: 1024
Checkpoint timeout: 5 mins

pgbench -c 64 -j 64 -r -T 900 -M prepared
Scale factor: 1000

                  WAL generated (MB)   Throughput (tps)   Latency (ms)
On                9235.43              979.03             65.36
Compress (pglz)   6518.68              1072.34            59.66
Off               501.04               1135.17            56.34

The results show around a 30 percent decrease in WAL volume due to
compression of FPW.

Thank you ,

Rahila Syed

Tom Lane wrote:
 Rahila Syed rahilasyed...@gmail.com writes:
  Please find attached patch to compress FPW using pglz compression.

 Patch not actually attached AFAICS (no, a link is not good enough).

Well, from Rahila's point of view the patch is actually attached, but
she's posting from the Nabble interface, which mangles it and turns it into
a link instead.  Not her fault, really -- but the end result is the
same: to properly submit a patch, you need to send an email to the
pgsql-hackers@postgresql.org mailing list, not join a group/forum from
some intermediary newsgroup site that mirrors the list.

--
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2014-09-22 Thread Syed, Rahila
Hello,

Please find attached the patch to compress FPW.

Sorry I had forgotten to attach. Please find the patch attached.

Thank you,
Rahila Syed





From: pgsql-hackers-ow...@postgresql.org 
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Rahila Syed
Sent: Monday, September 22, 2014 3:16 PM
To: Alvaro Herrera
Cc: Rahila Syed; PostgreSQL-development; Tom Lane
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes


Hello All,


Well, from Rahila's point of view the patch is actually attached, but
she's posting from the Nabble interface, which mangles it and turns it into
a link instead.

Yes.



but the end result is the
same: to properly submit a patch, you need to send an email to the
pgsql-hackers@postgresql.org mailing list, not join a group/forum from
some intermediary newsgroup site that mirrors the list.


Thank you. I will take care of it henceforth.

Please find attached the patch to compress FPW.  Patch submitted by Fujii-san 
earlier in the thread is used to merge compression GUC with full_page_writes.



I am reposting the measurement numbers.

Server Specification:
Processors: Intel® Xeon® Processor E5-2650 (2 GHz, 8C/16T, 20 MB) * 2 nos
RAM: 32GB
Disk : HDWD 450GB 10K Hot Plug 2.5-inch SAS HDD * 8 nos
1 x 450 GB SAS HDD, 2.5-inch, 6Gb/s, 10,000 rpm

Checkpoint segments: 1024
Checkpoint timeout: 5 mins

pgbench -c 64 -j 64 -r -T 900 -M prepared
Scale factor: 1000

                  WAL generated (MB)   Throughput (tps)   Latency (ms)
On                9235.43              979.03             65.36
Compress (pglz)   6518.68              1072.34            59.66
Off               501.04               1135.17            56.34

The results show around a 30 percent decrease in WAL volume due to compression
of FPW.

Thank you ,

Rahila Syed

Tom Lane wrote:
 Rahila Syed rahilasyed...@gmail.com writes:
  Please find attached patch to compress FPW using pglz compression.

 Patch not actually attached AFAICS (no, a link is not good enough).

Well, from Rahila's point of view the patch is actually attached, but
she's posting from the Nabble interface, which mangles it and turns it into
a link instead.  Not her fault, really -- but the end result is the
same: to properly submit a patch, you need to send an email to the
pgsql-hackers@postgresql.org mailing list, not join a group/forum from
some intermediary newsgroup site that mirrors the list.

--
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



compress_fpw_v1.patch
Description: compress_fpw_v1.patch



Re: [HACKERS] Index scan optimization

2014-09-22 Thread Rajeev rastogi
On 22 September 2014 12:35, Heikki Linnakangas:

  I have observed scope for a considerable performance improvement in the
 case of index scans by a very minor code change.
  Consider the below schema:
 
  create table tbl2(id1 int, id2 varchar(10), id3 int); create index
  idx2 on tbl2(id2, id3);
 
  Query as:
   select count(*) from tbl2 where id2 > 'a' and id3 > 99;
 
  As per current design, it takes following steps to retrieve index
 tuples:
 
  1.   Find the scan start position by searching first position in
 B-tree as per the first key condition, i.e. as per id2 > 'a'
 
  2.   Then it fetches each tuple from the position found in step 1.
 
  3.   For each tuple, it checks all scan key conditions; in our
 example it checks both scan key conditions.
 
  4.   If the conditions match, it returns the tuple; otherwise the scan
 stops.
 
  Now the problem is that the first scan key condition has already been
 matched to find the scan start position (step 1), so it is obvious that any
 further tuple will also match the first scan key condition (as records
 are sorted).
  So comparing the first scan key condition again in step 3 seems
 redundant.
 
  So my proposal is to skip the condition check on the first scan key
 condition for every tuple.
 
 The same happens in a single-column case. If you have a query like
  SELECT * FROM tbl2 where id2 > 'a', once you've found the start
 position of the scan, you know that all the rows that follow match too.

Very much true.

  I would like to submit the patch for this improvement.
  Please provide your feedback. Also let me know if I am missing
 something.
 
 Yeah, sounds like a good idea. This scenario might not arise very often,
 but it should be cheap to check, so I doubt it will add any measurable
 overhead to the cases where the optimization doesn't help.

Thanks, I shall start preparing a patch for this optimization and share it in 1
or 2 days.

Thanks and Regards,
Kumar Rajeev Rastogi




Re: [HACKERS] Assertion failure in syncrep.c

2014-09-22 Thread Pavan Deolasee
On Thu, Sep 18, 2014 at 12:02 PM, Pavan Deolasee pavan.deola...@gmail.com
wrote:

 Hi All,

 While running some tests on REL9_2_STABLE branch, I saw an assertion
 failure in syncrep.c. The stack trace looks like this:


Any comments on this? I see it very regularly during my pgbench tests.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: [HACKERS] Assertion failure in syncrep.c

2014-09-22 Thread Stephen Frost
Pavan,

* Pavan Deolasee (pavan.deola...@gmail.com) wrote:
 On Thu, Sep 18, 2014 at 12:02 PM, Pavan Deolasee pavan.deola...@gmail.com
 wrote:
  While running some tests on REL9_2_STABLE branch, I saw an assertion
  failure in syncrep.c. The stack trace looks like this:
 
 Any comments on this? I see it very regularly during my pgbench tests.

I agree that it looks like there may be a race condition there- but
could you provide the test cases you're working with?  What kind of
behavior is it that's making it show up easily for you?

Thanks!

Stephen




Re: [HACKERS] Index scan optimization

2014-09-22 Thread Stephen Frost
* Rajeev rastogi (rajeev.rast...@huawei.com) wrote:
 Thanks, I shall start to prepare a patch for this optimization and share in 1 
 or 2 days.

This sounded interesting to me also- please be sure to put it on the
open commitfest once you have posted the patch.

Thanks!

Stephen




[HACKERS] documentation references invalid -A assertion checks option

2014-09-22 Thread Merlin Moncure
Both the documentation
(http://www.postgresql.org/docs/9.4/static/app-postgres.html) and the
--help switch to postgres reference a -A switch to handle assertion
checking.   Looking at the code, I don't see any entry for -A in the
getopt string and passing -A always fails with 'invalid option'
regardless of the compile time setting.  If I'm right, either the docs
or the source need to be patched.  The question is, which one?  (I
vote to remove references to the option).

merlin




Re: [HACKERS] documentation references invalid -A assertion checks option

2014-09-22 Thread Andres Freund
On 2014-09-22 08:43:02 -0500, Merlin Moncure wrote:
 Both the documentation
 (http://www.postgresql.org/docs/9.4/static/app-postgres.html) and the
 --help switch to postgres reference a -A switch to handle assertion
 checking.   Looking at the code, I don't see any entry for -A in the
 getopt string and passing -A always fails with 'invalid option'
 regardless of the compile time setting.  If I'm right, either the docs
 or the source need to be patched.  The question is, which one?  (I
 vote to remove references to the option).

You're probably building master. That option has been removed (by me)
since. The 'devel' docs don't show the option anymore unless I missed a
place referencing it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Index scan optimization

2014-09-22 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 09/22/2014 07:47 AM, Rajeev rastogi wrote:
 So my proposal is to skip the condition check on the first scan key 
 condition for every tuple.

 The same happens in a single-column case. If you have a query like 
 SELECT * FROM tbl2 where id2  'a', once you've found the start 
 position of the scan, you know that all the rows that follow match too.

... unless you're doing a backwards scan.

regards, tom lane




Re: [HACKERS] Index scan optimization

2014-09-22 Thread Heikki Linnakangas

On 09/22/2014 04:45 PM, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

On 09/22/2014 07:47 AM, Rajeev rastogi wrote:

So my proposal is to skip the condition check on the first scan key condition 
for every tuple.



The same happens in a single-column case. If you have a query like
SELECT * FROM tbl2 where id2  'a', once you've found the start
position of the scan, you know that all the rows that follow match too.


... unless you're doing a backwards scan.


Sure. And you still have to check for NULLs. Have to get the details right...
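
For illustration, a self-contained toy of the proposed optimization under
those assumptions (forward scan, no NULLs; this is not the nbtree code):

#include <stdio.h>

typedef struct
{
	int			key1;
	int			key2;
} Row;

/* stand-in for an index on (key1, key2): rows kept sorted on key1 */
static const Row rows[] = {{1, 5}, {2, 9}, {3, 7}, {4, 8}, {5, 2}};
#define NROWS (sizeof(rows) / sizeof(rows[0]))

int
main(void)
{
	int			bound1 = 2;		/* first scan key: key1 > bound1 */
	int			bound2 = 6;		/* second scan key: key2 > bound2 */
	size_t		start = 0;
	int			matches = 0;

	/* step 1: find the scan start (a binary search in the real code) */
	while (start < NROWS && rows[start].key1 <= bound1)
		start++;

	/*
	 * steps 2-4: because rows are sorted on key1, every row from here on
	 * is known to satisfy key1 > bound1, so only key2 is rechecked.
	 */
	for (size_t i = start; i < NROWS; i++)
		if (rows[i].key2 > bound2)
			matches++;

	printf("matches: %d\n", matches);	/* prints 2 */
	return 0;
}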

- Heikki





Re: [HACKERS] libpq connection status and closed fd

2014-09-22 Thread Tom Lane
Daniele Varrazzo daniele.varra...@gmail.com writes:
 a psycopg user is reporting [1] that the library is not marking the
 connection as closed and/or bad after certain errors, such as a
 connection timeout. He is emulating the error by closing the
 connection fd

That seems like a completely illegitimate test procedure.  A network
failure does not result in automatic closure of the FD so there is
no reason for the libpq code to be expecting that to happen.  Instead,
it expects to see an appropriate error code next time it tries to
use the socket.

If you want to test for connection loss, consider doing a kill -9 on
the connected backend (best not to do that with a production server
of course).

A larger point is that your user may well have false expectations
about how quickly libpq will detect connection loss.  Generally
that won't happen until it next tries to transmit or receive some
data; so a simple PQstatus() call doesn't prove a lot about whether
the connection has been lost since the last query.  This is not a
bug either IMO.
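
For illustration, a minimal sketch of the probing pattern this implies
(the helper name is invented, not a libpq API; real code would still need
to drain input with PQgetResult() after actual traffic):

#include <stdio.h>
#include <libpq-fe.h>

/*
 * Probe for connection loss: PQstatus() alone only reflects errors libpq
 * has already observed, so force a read attempt first and check its
 * result directly.
 */
static int
connection_looks_alive(PGconn *conn)
{
	if (PQstatus(conn) != CONNECTION_OK)
		return 0;
	if (!PQconsumeInput(conn))	/* tries to read; returns 0 on trouble */
	{
		fprintf(stderr, "read failed: %s", PQerrorMessage(conn));
		return 0;
	}
	return PQstatus(conn) == CONNECTION_OK;
}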

regards, tom lane




Re: [HACKERS] documentation references invalid -A assertion checks option

2014-09-22 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-09-22 08:43:02 -0500, Merlin Moncure wrote:
 Both the documentation
 (http://www.postgresql.org/docs/9.4/static/app-postgres.html) and the
 --help switch to postgres reference a -A switch to handle assertion
 checking.

 You're probably building master. That option has been removed (by me)
 since. The 'devel' docs don't show the option anymore unless I missed a
 place referencing it.

As Merlin says, HEAD is still doing this:

$ postgres --help
postgres is the PostgreSQL server.

Usage:
  postgres [OPTION]...

Options:
  -A 1|0 enable/disable run-time assert checking
  ...

regards, tom lane




Re: [HACKERS] documentation references invalid -A assertion checks option

2014-09-22 Thread Andres Freund
On 2014-09-22 09:58:49 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-09-22 08:43:02 -0500, Merlin Moncure wrote:
  Both the documentation
  (http://www.postgresql.org/docs/9.4/static/app-postgres.html) and the
  --help switch to postgres reference a -A switch to handle assertion
  checking.
 
  You're probably building master. That option has been removed (by me)
  since. The 'devel' docs don't show the option anymore unless I missed a
  place referencing it.
 
 As Merlin says, HEAD is still doing this:
 
 $ postgres --help
 postgres is the PostgreSQL server.
 
 Usage:
   postgres [OPTION]...
 
 Options:
   -A 1|0 enable/disable run-time assert checking
   ...

Gah, 3bdcf6a5a7555035810e2ba2b8a0fb04dc5c66b8 removed several doc
references about it, but not --help. Will fix.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] from_collapse_limit considerations

2014-09-22 Thread Antonin Houska
While doing experiments with rather long FROM-lists, I looked closely at the
logic related to from_collapse_limit.

I noticed that - unlike join_collapse_limit - the from_collapse_limit does not
enforce a maximum length for the top-level list. Shouldn't it? An overly long
FROM-list can obviously lead to excessive planning time.

Also, the order of FROM-list items seems to affect the way RTEs are grouped
into (sub)lists. In this example, the join of tab_0, tab_1, tab_2, tab_3 gets
expanded into 4 separate RTE refs:

SET from_collapse_limit TO 5;

SELECT  *
FROM
(
(
tab_0
JOIN
tab_1
ON tab_0.id = tab_1.id
)
JOIN
(
tab_2
JOIN
tab_3
ON tab_2.id = tab_3.id
)
ON  tab_1.id = tab_2.id
),
tab_4
JOIN
tab_5
ON tab_4.id = tab_5.id
WHERE   tab_3.id = tab_4.id;

However, in the next example (the JOIN of tab_4 and tab_5 moved to the
beginning of the FROM list), the bigger join (tab_0 through tab_3) comes
too late, so it's inserted as a sub-list.

SET from_collapse_limit TO 5;

SELECT  *
FROM
tab_4
JOIN
tab_5
ON tab_4.id = tab_5.id,
(
(
tab_0
JOIN
tab_1
ON tab_0.id = tab_1.id
)
JOIN
(
tab_2
JOIN
tab_3
ON tab_2.id = tab_3.id
)
ON  tab_1.id = tab_2.id
)
WHERE   tab_3.id = tab_4.id;


Is there anything wrong with the idea of not estimating the total length of the
FROM list in deconstruct_recurse, and instead doing additional collapsing later?
The patch attached here tries to do so.

I wonder if change of the logic behind from_collapse_limit should be
considered acceptable for users or not: although it improves control over
planning of queries having long FROM-list, it can make some plans of existing
applications worse, unless from_collapse_limit is increased accordingly.

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index f88e493..fffc3aa 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -51,6 +51,7 @@ static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
 	bool below_outer_join,
 	Relids *qualscope, Relids *inner_join_rels,
 	List **postponed_qual_list);
+static List *collapse_fromlist(List *fromlist);
 static SpecialJoinInfo *make_outerjoininfo(PlannerInfo *root,
    Relids left_rels, Relids right_rels,
    Relids inner_join_rels,
@@ -659,6 +660,9 @@ deconstruct_jointree(PlannerInfo *root)
 	/* Shouldn't be any leftover quals */
 	Assert(postponed_qual_list == NIL);
 
+	/* Create sub-list(s) if the FROM-list appears to be too long.  */
+	result = collapse_fromlist(result);
+
 	return result;
 }
 
@@ -710,7 +714,6 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 	{
 		FromExpr   *f = (FromExpr *) jtnode;
 		List	   *child_postponed_quals = NIL;
-		int			remaining;
 		ListCell   *l;
 
 		/*
@@ -722,12 +725,10 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 		*qualscope = NULL;
 		*inner_join_rels = NULL;
 		joinlist = NIL;
-		remaining = list_length(f-fromlist);
 		foreach(l, f-fromlist)
 		{
 			Relids		sub_qualscope;
 			List	   *sub_joinlist;
-			int			sub_members;
 
 			sub_joinlist = deconstruct_recurse(root, lfirst(l),
 			   below_outer_join,
@@ -735,13 +736,14 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 			   inner_join_rels,
 			   child_postponed_quals);
 			*qualscope = bms_add_members(*qualscope, sub_qualscope);
-			sub_members = list_length(sub_joinlist);
-			remaining--;
-			if (sub_members <= 1 ||
-				list_length(joinlist) + sub_members + remaining <= from_collapse_limit)
-				joinlist = list_concat(joinlist, sub_joinlist);
-			else
-				joinlist = lappend(joinlist, sub_joinlist);
+
+			/*
+			 * Sub-lists are not created at this stage, as we can't predict how
+			 * many RTEs the possibly following join nodes may contain. Instead,
+			 * length of the top-level list is checked later when
+			 * deconstruct_recurse has finished.
+			 */
+			joinlist = list_concat(joinlist, sub_joinlist);
 		}
 
 		/*
@@ -1013,6 +1015,65 @@ deconstruct_recurse(PlannerInfo *root, Node *jtnode, bool below_outer_join,
 	return joinlist;
 }
 
+
+/*
+ * Ensure that neither the top-level FROM-list nor any its sublist exceeds
+ * from_collapse_limit.
+ */
+static List *
+collapse_fromlist(List *fromlist)
+{
+	List *result = fromlist;

Re: [HACKERS] from_collapse_limit considerations

2014-09-22 Thread Tom Lane
Antonin Houska a...@cybertec.at writes:
 I noticed that - unlike join_collapse_limit - the from_collapse_limit does not
 enforce maximum length of the top-level list. Shouldn't it do?

How would it do that?  You want it to fail outright if there are more than
N tables?  That seems unhelpful.

 Also, the order of FROM-list items seems to affect the way RTEs are grouped
 into (sub)lists.

Yeah.

regards, tom lane




Re: [HACKERS] END_OF_RECOVERY shutdowns and ResetUnloggedRelations()

2014-09-22 Thread Robert Haas
On Thu, Sep 18, 2014 at 4:31 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-09-12 14:44:48 -0400, Robert Haas wrote:
 On Fri, Sep 12, 2014 at 7:22 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  What I like even less is that pg_control is actually marked as
  DB_SHUTDOWNED due to END_OF_RECOVERY. That's just plain wrong. Obviously
  the database was *NOT* shut down peacefully. I don't see active bugs due to
  it besides this, but I think it's likely to either have or create further
  ones.

 I agree.  The database clearly isn't shut down at end of recovery;
 it's still active and we're still doing things to it.  If we crash at
 that point, we need to go back into recovery on restart.  This seems
 open and shut, but maybe I'm missing something.  Why shouldn't we fix
 *that*?

 Well, I think we might want to do both. There doesn't seem to be a
 reason to *not* do the ResetUnloggedRelation(UNLOGGED_RELATION_INIT)
  around the ShutdownWalRcv(). That seems much closer to where it, for me,
 logically belongs. And it'd fix the concrete problem.

That might be all right.  I've booted enough things in this area of
the code to not have a lot of confidence in my ability to not boot
things in this area of the code ... but I don't see a problem with it.
It does seem a little odd to do that before checking whether we
reached the minimum recovery point, or deciding whether to assign a
new TLI, but I don't see a concrete problem with putting it there.

 With regard to your second email, I agree that
 ResetUnloggedRelation(UNLOGGED_RELATION_INIT) needs to issue fsyncs.

 Good.

 The topic reminds me of the fact that I actually think at the very least
 pg_xlog/ and pg_control needs the same treatment. Consider the following
 sequence:
 1) postgres writes loads of stuff. Lots of it to *not* fsynced WAL
segments
 2) postgres crashes and crash recovery happens. Replays *not* fsynced
WAL
 3) the OS crashes
  4) Bad. We now might have a pg_control with a minRecoveryPoint that's *later*
than some potentially unsynced WAL segments

 I think the easiest would be to just fsync() the entire data directory
  at the start when ControlFile->state !=
 DB_SHUTDOWNED/DB_SHUTDOWNED_IN_RECOVERY

 Note that that's independent of the fsync for unlogged relations.

Seems reasonable.
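
For illustration, a rough self-contained sketch of "fsync the entire data
directory" (error reporting, symlinks, and tablespace links are
deliberately ignored here):

#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Recurse into subdirectories, fsync regular files, then fsync each
 * directory itself so the directory entries are durable too.
 */
static void
fsync_recursively(const char *path)
{
	DIR		   *dir = opendir(path);
	struct dirent *de;
	int			fd;

	if (dir == NULL)
		return;
	while ((de = readdir(dir)) != NULL)
	{
		char		child[4096];
		struct stat st;

		if (strcmp(de->d_name, ".") == 0 || strcmp(de->d_name, "..") == 0)
			continue;
		snprintf(child, sizeof(child), "%s/%s", path, de->d_name);
		if (lstat(child, &st) != 0)
			continue;
		if (S_ISDIR(st.st_mode))
			fsync_recursively(child);
		else if (S_ISREG(st.st_mode) && (fd = open(child, O_RDONLY)) >= 0)
		{
			fsync(fd);
			close(fd);
		}
	}
	closedir(dir);
	if ((fd = open(path, O_RDONLY)) >= 0)
	{
		fsync(fd);
		close(fd);
	}
}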

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Anonymous code block with parameters

2014-09-22 Thread Peter Eisentraut
On 9/18/14 7:40 AM, Andres Freund wrote:
 I fail to see why that is so much preferable for you to passing
 parameters to DO?
 
 1) You need to think about unique names for functions
 2) Doesn't work on HOT STANDBYs
 3) Causes noticeable amount of catalog bloat
 4) Is about a magnitude or two more expensive

Doesn't this apply to all temporary objects?  It would also be great to
have temporary tables, temporary indexes, temporary triggers, temporary
extensions, etc. that don't have the above problems.  I think inventing
a separate mechanism for working around each instance of this problem
would end up being very confusing.





Re: [HACKERS] Anonymous code block with parameters

2014-09-22 Thread Andres Freund
On 2014-09-22 15:46:48 -0400, Peter Eisentraut wrote:
 On 9/18/14 7:40 AM, Andres Freund wrote:
   I fail to see why that is so much preferable for you to passing
   parameters to DO?
  
  1) You need to think about unique names for functions
  2) Doesn't work on HOT STANDBYs
  3) Causes noticeable amount of catalog bloat
  4) Is about a magnitude or two more expensive
 
 Doesn't this apply to all temporary objects?  It would also be great to
 have temporary tables, temporary indexes, temporary triggers, temporary
 extensions, etc. that don't have the above problems.  I think inventing
 a separate mechanism for working around each instance of this problem
 would end up being very confusing.

Meh. Those aren't comparable. TEMPORARY TABLES/INDEXES/... all live
beyond a single statement. What's being discussed here doesn't.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Options OUTPUT_PLUGIN_* controlling format are confusing (Was: Misleading error message in logical decoding)

2014-09-22 Thread Robert Haas
On Thu, Sep 18, 2014 at 11:21 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Sep 18, 2014 at 9:56 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-09-18 09:50:38 -0500, Michael Paquier wrote:
  Do you see the difference between what your doc patch states and the
  explanation I've given nearby in this thread?
 Perhaps that's the lack of documentation...

 Man. I've explained it to you about three times. The previous attempts
 at doing so didn't seem to help. If my explanations don't explain it so
 you can understand it adding them to the docs won't change a thing.
 That's why I ask whether you see the difference?
 Urg sorry for the misunderstanding. The patch stated that this
 parameter only influences the output of the SQL functions while it
 defines if the output plugin requires the output method to support
 binary data?

I'm not sure exactly what that sentence means.

But here's the point: every series of bytes is a valid bytea, except
maybe if it's over 1GB and runs afoul of MaxAllocSize.  But a series
of bytes is only a valid text datum if it's a valid sequence of
characters according to the database encoding.  We like to think of
text as being an arbitrary series of bytes, but it isn't.  It can't
contain any \0 bytes, and it can't contain anything that's invalid in
the database encoding.  bytea isn't subject to either of those
restrictions.

So if we were going to have one universal output format for output
plugins, it would have to be bytea because that, really, can be
anything.  We could make users convert from that to text or whatever
they like.  But that's unappealing for several reasons: bytea output
is printed as unreadable hexadecimal garbage, and encoding conversions
are expensive.  So what we do instead is provide a text output method
and a binary output method.  That way, plugins that want to return
binary data are free to do so, and output methods that are happy to
return text can *declare* that what they return is legal text - and
then we just assume that to be true, and need not do an encoding
conversion.
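
For illustration, the asymmetry can be captured in a tiny standalone check
(not PostgreSQL code; a real text check must also validate the database
encoding, which this toy skips):

#include <stdbool.h>
#include <stddef.h>

/*
 * Any byte sequence is a valid bytea, but a text datum must contain no
 * embedded NUL bytes (and must be valid in the database encoding).
 */
static bool
could_be_text(const unsigned char *buf, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if (buf[i] == '\0')
			return false;
	return true;
}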

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] CreateEventTrigStmt copy fix

2014-09-22 Thread Robert Haas
On Fri, Sep 19, 2014 at 12:09 PM, Petr Jelinek p...@2ndquadrant.com wrote:
 I was trying to create event trigger inside DO statement inside an extension
 SQL script and noticed that the new event trigger has empty evtevent field.
 After some tinkering with gdb I found out that the memory context switches
 sometimes clear the eventname in CreateEventTrigStmt struct. The reason for
 this is that _copyCreateEventTrigStmt uses COPY_SCALAR_FIELD on eventname
 instead of COPY_STRING_FIELD.

 Attached patch fixes this and also the same issue in
 _equalCreateEventTrigStmt.

 This should be back-patched to 9.3 where event triggers were introduced.

Done, thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] jsonb format is pessimal for toast compression

2014-09-22 Thread Josh Berkus
On 09/19/2014 07:07 AM, Tom Lane wrote:
 Heikki Linnakangas hlinnakan...@vmware.com writes:
 Tom: You mentioned earlier that your patch fixes some existing bugs. 
 What were they?
 
 What I remember at the moment (sans caffeine) is that the routines for
 assembling jsonb values out of field data were lacking some necessary
 tests for overflow of the size/offset fields.  If you like I can apply
 those fixes separately, but I think they were sufficiently integrated with
 other changes in the logic that it wouldn't really help much for patch
 reviewability.

Where are we on this?  Do we have a patch ready for testing?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Help to startup

2014-09-22 Thread Alvaro Herrera
Michael Paquier wrote:
 On Sun, Sep 21, 2014 at 4:52 PM, Craig Ringer cr...@2ndquadrant.com wrote:
  On 09/17/2014 01:51 AM, Tapan Halani wrote:
   Hello everyone.. I am new to the PostgreSQL project. I had prior experience
   with SQL*Plus and the Oracle 11g database server. Kindly help me grasp more
  about the project.
 
  Since you're asking on pgsql-hackers, you're presumably interested in
  getting involved in developing extensions or feature work?
 
  See:
 
  http://www.postgresql.org/developer/
 There is a group picture of 2006 :)

Maybe we could turn that into a rotating display of each year's
developers meeting pictures or something like that.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Anonymous code block with parameters

2014-09-22 Thread Merlin Moncure
On Mon, Sep 22, 2014 at 2:49 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-09-22 15:46:48 -0400, Peter Eisentraut wrote:
 On 9/18/14 7:40 AM, Andres Freund wrote:
   I fail to see why that is so much preferable for you to passing
   parameters to DO?
 
  1) You need to think about unique names for functions
  2) Doesn't work on HOT STANDBYs
  3) Causes noticeable amount of catalog bloat
  4) Is about a magnitude or two more expensive

 Doesn't this apply to all temporary objects?  It would also be great to
 have temporary tables, temporary indexes, temporary triggers, temporary
 extensions, etc. that don't have the above problems.  I think inventing
 a separate mechanism for working around each instance of this problem
 would end up being very confusing.

 Meh. Those aren't comparable. TEMPORARY TABLES/INDEXES/... all live
 beyond a single statement. What's being discussed here doesn't.

Even if that wasn't true, 'DO' doesn't involve changes to system
catalogs whereas temporary functions would.  With a little imagination
I could come up with a scenario involving a script of a whole bunch
of repeated trivial DO statements which would involve a lot less
beating on the system catalogs.

When the data-modifying WITH feature was considered, an implementation
that relied on temp tables was rejected at least in part because of
system catalog thrash and poorer performance for very trivial queries.

So, to me, DO vs CREATE FUNCTION has nothing to do with passing
arguments and/or returning data.  It has to do with lifespan; single
call of the function body only, use DO, otherwise, create a function.

merlin




Re: [HACKERS] A mechanism securing web applications in DBMS

2014-09-22 Thread Stephen Frost
Zhaomo,

* Zhaomo Yang (zhy...@cs.ucsd.edu) wrote:
 You are right. Using unlogged table is a good idea. I'll try it out.
 Thanks for your advice!

Happy to help.  Another option would be to have a custom GUC for this
information.  The issue we have with that currently is that it can be
set by anyone.  Your extension could create one and register functions
which are called when it's set though, and only allow it to be set when
the auth/deauth functions are used.  This would get rid of the need for
any kind of table.
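
For illustration, such an extension-registered GUC might be sketched as
below ("webapp.current_user", the flag, and the policy are invented for
this example, not an existing extension):

#include "postgres.h"
#include "fmgr.h"
#include "utils/guc.h"

PG_MODULE_MAGIC;

/*
 * "webapp.current_user" holds the identity established by the (not shown)
 * auth SQL function; inside_auth_function is flipped by the auth/deauth
 * functions so that nobody else can SET the variable directly.
 */
static char *webapp_current_user = NULL;
static bool inside_auth_function = false;

static bool
check_webapp_user(char **newval, void **extra, GucSource source)
{
	/* reject changes that don't come from the extension's own functions */
	if (!inside_auth_function && source >= PGC_S_SESSION)
		return false;
	return true;
}

void
_PG_init(void)
{
	DefineCustomStringVariable("webapp.current_user",
							   "User authenticated by the web application.",
							   NULL,
							   &webapp_current_user,
							   "",
							   PGC_USERSET,
							   0,
							   check_webapp_user,
							   NULL,
							   NULL);
}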

 Currently auth functions are security definer functions. I'm gonna try
 to create a patch using unlogged table + RLS and put it online (e.g.
 this mailing list) so that people can try it.

I'd strongly suggest that you look into creating PostgreSQL extensions
and using that mechanism as a way to distribute your security definer
functions and other components of this solution as a single, complete,
package which users can install with just CREATE EXTENSION.  That
might help with both getting others to test and play with your solution.

Thanks!

Stephen




[HACKERS] RLS feature has been committed

2014-09-22 Thread Andres Freund
Hi,

I'm posting my reply to Stephen's mail at
http://archives.postgresql.org/message-id/20140919163839.GH16422%40tamriel.snowman.net
in a new thread because I think it's a important discussion and many
people probably stopped following the RLS thread at some point.

On 2014-09-19 12:38:39 -0400, Stephen Frost wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
  I specifically asked you to hold off on committing this until there
  was adequate opportunity for review, and explained my reasoning.  You
  committed it anyway.

 Hum- my apologies, I honestly don't recall you specifically asking for
 it to be held off indefinitely.  :(  There was discussion back and
 forth, quite a bit of it with you, and I thank you for your help with
 that and certainly welcome any additional comments.

  This patch, on the other hand, was massively revised after the start
  of the CommitFest after many months of inactivity and committed with
  no thorough review by anyone who was truly independent of the
  development effort.  It was then committed with no warning over a
  specific request, from another committer, that more time be allowed
  for review.

 I would not (nor do I feel that I did..) have committed it over a
 specific request to not do so from another committer.  I had been hoping
 that there would be another review coming from somewhere, but there is
 always a trade-off between waiting longer to get a review ahead of a
 commit and having it committed and then available more easily for others
 to work with, review, and generally moving forward.

c.f. 
http://archives.postgresql.org/message-id/CA%2BTgmoaX%2BptioOxx42rxJxsgrvxPfUVyndkpeR0JsRiTeZ36Ng%40mail.gmail.com

  I'm really disappointed by that.  I feel I'm essentially getting
  punished for trying to follow what I understand to be the process, which
  has involved me doing huge amounts of review of other people's patches
  and waiting a very long time to get my own stuff committed, while you
  bull ahead with your own patches.
 
 While I wasn't public about it, I actually specifically discussed this
 question with others, a few times even, to try and make sure that I
 wasn't stepping out of line by moving forward.

 That said, I do see that Andres feels similarly.  It certainly wasn't
 my intent to surprise anyone by it but simply to continue to move
 forward- in part, to allow me to properly break from it and work on
 other things, including reviewing other patches in the commitfest.
 I fear I've simply been overly focused on it these past few weeks, for a
 variety of reasons that would likely best be discussed at the pub.

I've now slept a couple days on this. And my position hasn't changed.

This patch has been pushed in a clear violation of established policy.

Fundamental pieces of the patch have changed *after* the commitfest
started. And there wasn't a recent patch in the commitfest either - the
entry was moved over from the last round without a new patch.  It didn't
receive independent review (Robert explicitly said his wasn't a full
review).  It wasn't marked ready for committer.  The intention to commit
wasn't announced publicly.  There were *clear* and unaddressed
objections to committing the patch as is, by a committer (Robert)
no less.

It'd be a somewhat different thing if this weren't about a
security-sensitive feature, if the patch had only needed minor cleanup
after the start of the CF, or if there had at least been constant
public progress over the last months.  None of these is the case.


The circumstances around this being committed make me deeply
uncomfortable. It's setting a very bad precedent. I don't think we want
to go down that route.

 All-in-all, I feel appropriately chastised and certainly don't wish to
 be surprising fellow committers.  Perhaps we can discuss at the dev
 meeting.

I really don't know what you understand by appropriately chastised -
but I think there's a pretty obvious way to handle this appropriately.

And waiting till the dev meeting obviously makes the whole discussion
moot.

Greetings,

Andres Freund




Re: [HACKERS] Anonymous code block with parameters

2014-09-22 Thread Petr Jelinek

On 22/09/14 22:58, Merlin Moncure wrote:


Meh. Those aren't comparable. TEMPORARY TABLES/INDEXES/... all live
beyond a single statement. What's being discussed here doesn't.


Even if that wasn't true, 'DO' doesn't involve changes to system
catalogs whereas temporary functions would.  With a little imagination
I could come up with a scenario involving a script of a whole bunch
of repeated trivial DO statements which would involve a lot less
beating on the system catalogs.

When the data-modifying-with feature was considered, an implementation
that relied on temp tables was rejected at least in part because of
system catalog thrash and poorer performance for very trivial queries.

So, to me, DO vs CREATE FUNCTION has nothing to do with passing
arguments and/or returning data.  It has to do with lifespan: for a
single call of the function body, use DO; otherwise, create a function.



Actually, the same thing happened with the DO implementation itself -
creating anonymous/hidden temporary functions in the background was also
considered, but it was decided that's not acceptable (for a similar
reason temp tables were rejected for WITH).


So we have decided at least twice already that this kind of solution is
bad.  I don't know of any change that would invalidate the reasons for
deciding that way, so I don't see why it would suddenly become
acceptable...


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] RLS feature has been committed

2014-09-22 Thread Peter Geoghegan
On Mon, Sep 22, 2014 at 4:02 PM, Andres Freund and...@anarazel.de wrote:
 This patch has been pushed in a clear violation of established policy.

 Fundamental pieces of the patch have changed *after* the commitfest
 started. And there wasn't a recent patch in the commitfest either - the
 entry was moved over from the last round without a new patch.  It didn't
 receive independent review (Robert explicitly said his wasn't a full
 review).  It wasn't marked ready for committer.  The intention to commit
 wasn't announced publicly.  There were *clear* and unaddressed
 objections to committing the patch as is, by a committer (Robert)
 no less.

I have no reason to doubt your version of events here (although
Stephen may wish to address what you've said - I'm basing that on his
tone elsewhere). I must ask, though: what do you propose to do about
it in this instance? He has been chastised. Would you like to make a
point of formalizing what are (if I'm not mistaken) currently de facto
rules? Should RLS be reverted, and revisited in a future CF?


-- 
Peter Geoghegan




Re: [HACKERS] Should we excise the remnants of borland cc support?

2014-09-22 Thread Josh Berkus
On 09/20/2014 06:24 AM, Andres Freund wrote:
 At the moment there are some remnants of support for Borland CC. I don't
 believe it's likely that any of it still works. I can't remember ever
 seeing a buildfarm animal running it either - not surprising, given
 it's ~15 years since the last release.
 Since there's both msvc and mingw support for windows builds - Borland's
 only platform - I see little point in continuing to support it.

+1

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Should we excise the remnants of borland cc support?

2014-09-22 Thread Andres Freund
On 2014-09-20 10:03:43 -0400, Andrew Dunstan wrote:
 
 On 09/20/2014 09:24 AM, Andres Freund wrote:
 Hi,
 
 At the moment there are some remnants of support for Borland CC. I don't
 believe it's likely that any of it still works. I can't remember ever
 seeing a buildfarm animal running it either - not surprising, given
 it's ~15 years since the last release.
 Since there's both msvc and mingw support for windows builds - Borland's
 only platform - I see little point in continuing to support it.
 
 The reason I'm wondering is that the atomics patch cargo-cults forward
 some stuff specific to Borland and I'd rather not do that. And I'd
 rather be explicit about stopping than do so slyly.
 
 
 I thought the Borland stuff was there only so we could build client
 libraries for use with things like Delphi.
 
 It might be worth casting the net a little wider to find out if it still has
 any users.

FWIW, I got off-list reports from two people who aren't subscribed that
they simply use the normal libpq DLL from Delphi, copying it from
pgAdmin or the PG installer.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WITH CHECK OPTION bug [was RLS Design]

2014-09-22 Thread Stephen Frost
* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
 Yeah OK, fair point. Here are some tests that cover that code path.
 I've also thrown in a test with prepared statements, although that
 case was already working, it seemed worth checking.

Applied and backpatched, thanks!

Stephen




Re: [HACKERS] Should we excise the remnants of borland cc support?

2014-09-22 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-09-20 10:03:43 -0400, Andrew Dunstan wrote:
 I thought the Borland stuff was there only so we could build client
 libraries for use with things like Delphi.

 FWIW, I got off-list reports from two people who aren't subscribed that
 they simply use the normal libpq DLL from Delphi, copying it from
 pgAdmin or the PG installer.

Whether or not it's really needed to preserve the ability to build libpq
with Borland, I'm just about certain that it's never worked to build the
backend with Borland (thus explaining the lack of buildfarm members).
So it should be safe enough to strip support appearing in backend-only
header files.

regards, tom lane




Re: [HACKERS] RLS feature has been committed

2014-09-22 Thread Josh Berkus
On 09/22/2014 04:22 PM, Peter Geoghegan wrote:
 I have no reason to doubt your version of events here (although
 Stephen may wish to address what you've said - I'm basing that on his
 tone elsewhere). I must ask, though: what do you propose to do about
 it in this instance? He has been chastised. Would you like to make a
 point of formalizing what are (if I'm not mistaken) currently de facto
 rules? Should RLS be reverted, and revisited in a future CF?

The CommitFests were never meant to restrict when a committer could
commit a patch.  The point of the CFs was to give committers time *off*
from committing patches.  If a committer wants to commit something
completely outside of the CF process, they are welcome to, as long as it
receives adequate review.

So if there's an argument here, it's whether or not the committed RLS
patch was adequately reviewed (and if not, if it should be reverted),
not whether it should have been in the CF or not.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] proposal (9.5) : psql unicode border line styles

2014-09-22 Thread Stephen Frost
Pavel,

* Pavel Stehule (pavel.steh...@gmail.com) wrote:
 true, sorry, I had a different wording in the first design
 
 fixed

Pushed, thanks!

Stephen




Re: [HACKERS] Anonymous code block with parameters

2014-09-22 Thread Craig Ringer
On 09/23/2014 07:20 AM, Petr Jelinek wrote:


 So, to me, DO vs CREATE FUNCTION has nothing to do with passing
 arguments and/or returning data.  It has to do with lifespan: for a
 single call of the function body, use DO; otherwise, create a function.

 
 Actually same thing happened with the DO implementation itself -
 creating anonymous/hidden temporary functions in the background was also
 considered but was decided it's not acceptable (for similar reason temp
 tables were rejected for WITH).
 
 So we decided at least twice already that this kind of solution is bad,
 I don't know of any change that would invalidate the reasons for
 deciding that way so I don't see why they would suddenly become
 acceptable...

All good points. I was wrong to suggest just going for TEMPORARY
FUNCTION before; there's clearly a useful place for DO with parameters.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Stephen Frost
Tomonari,

* Tomonari Katsumata (t.katsumata1...@gmail.com) wrote:
 I'm thinking about a method by which users become aware of it quickly.
 Now, it's okay not to change the current behavior, except where a
 non-zero value yields a zero: a zero rounded down from non-zero gets
 an error.
 
 I attached new patch.
 This includes a document about above behavior as Heikki suggested.

Thanks for the updated patch!  I was going through it and there are a few
things-

- Documentation change no longer applies

- Why aren't we worried about a specification of '7kB' being rounded down
  to zero for GUCs which expect at least BLCKSZ?  If the reason is that
  everything that works that way will error on zero anyway today, then
  I don't buy into that argument- there's no sense leaving it
  inconsistent and it would be a potential land-mine to hit later (see
  the sketch after this list).

- The hint messages look like they need some rewording, at least.
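
(To make the first point above concrete - a hypothetical postgresql.conf
fragment; the setting name is invented, assuming a GUC whose internal
unit is 8kB:)

    some_8kb_unit_setting = 7kB    # today this truncates toward zero => 0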

In general, I'm a fan of this change and will move forward with it,
with changes for the above, unless folks object.  Based on the thread so
far, this sounds like the right solution and it'd be great to get this
simple change done to help move the CF along.

Thanks!

Stephen




Re: [HACKERS] RLS feature has been committed

2014-09-22 Thread Amit Kapila
On Tue, Sep 23, 2014 at 6:55 AM, Josh Berkus j...@agliodbs.com wrote:
 On 09/22/2014 04:22 PM, Peter Geoghegan wrote:
  I have no reason to doubt your version of events here (although
  Stephen may wish to address what you've said - I'm basing that on his
  tone elsewhere). I must ask, though: what do you propose to do about
  it in this instance? He has been chastised. Would you like to make a
  point of formalizing what are (if I'm not mistaken) currently de facto
  rules? Should RLS be reverted, and revisited in a future CF?

 The CommitFests were never meant to restrict when a committer could
 commit a patch.  The point of the CFs was to give committers time *off*
 from committing patches.  If a committer wants to commit something
 completely outside of the CF process, they are welcome to, as long as it
 receives adequate review.

 So if there's an argument here, it's whether or not the committed RLS
 patch was adequately reviewed (and if not, if it should be reverted),

Who decides if the patch is adequately reviewed?

Author, Committer, or Reviewer?  In a CF, that is comparatively clear:
once the Reviewer is satisfied, he marks the patch as Ready For
Committer; a Committer then picks it up and, if he is satisfied with
the review and the quality of the patch, commits it.  Or, if the
Committer is himself the reviewer, then in many cases he commits it
once he is satisfied.

Now in this case, if I understand correctly, the story is not
so straightforward.  It seems to me Robert as the Reviewer was not
completely satisfied, whereas Stephen as the Committer was pretty
much okay with the patch, so he went ahead and committed it.

In short, if it is solely at the discretion of the Committer to decide
whether the patch has had adequate review and he is satisfied with its
quality, then I think what Stephen did in this case is not wrong (though
I am not the one who can decide whether it is right or wrong, just
sharing my thoughts).  However, if you think the Reviewer also has a
stake (especially when the other Reviewer is also a Committer) in
deciding the quality of the patch, then Stephen should have waited
longer (till the Reviewer was also satisfied).


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] RLS feature has been committed

2014-09-22 Thread Robert Haas
On Mon, Sep 22, 2014 at 7:22 PM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Sep 22, 2014 at 4:02 PM, Andres Freund and...@anarazel.de wrote:
 This patch has been pushed in a clear violation of established policy.

 Fundamental pieces of the patch have changed *after* the commitfest
 started. And there wasn't a recent patch in the commitfest either - the
 entry was moved over from the last round without a new patch.  It didn't
 receive independent review (Robert explicitly said his wasn't a full
 review).  It wasn't marked ready for committer.  The intention to commit
 wasn't announced publicly.  There were *clear* and unaddressed
 objections to committing the patch as is, by a committer (Robert)
 no less.

 I have no reason to doubt your version of events here

Fortunately, you don't have to take anything on faith.  This is a
public mailing list, and the exact sequence of events is open to
inspection by anyone who cares to take a few minutes to do so.  You
can easily verify whether my statement that I asked Stephen twice to
hold off committing it is correct or not; and you can also verify the
rest of the history that Andres and I recounted.  This is all there in
black and white.

 Should RLS be reverted, and revisited in a future CF?

IMHO, that would be entirely appropriate.  I don't have any idea
whether the patch has remaining bugs, design issues, or security flaws
- and neither does anyone else since the normal review process was
bypassed - but I do feel that Stephen's feelings of being chastised
aren't worth the bits they are printed on.  We're here to develop
software together, not to talk about our feelings; and the quality of
the software we produce depends on our willingness to follow a set of
procedures that are or should be well-understood by long-time
contributors, not on our emotional states.

It's difficult to imagine a more flagrant violation of process than
committing a patch without any warning and without even *commenting*
on the fact that clear objections to commit were made on a public
mailing list.  If that is allowed to stand, what can we assume other
than that Stephen, at least, has a blank check to change anything he
wants, any time he wants, with no veto possible from anyone else?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] RLS feature has been committed

2014-09-22 Thread Robert Haas
On Mon, Sep 22, 2014 at 9:25 PM, Josh Berkus j...@agliodbs.com wrote:
 The CommitFests were never meant to restrict when a committer could
 commit a patch.  The point of the CFs was to give committers time *off*
 from committing patches.  If a committer wants to commit something
 completely outside of the CF process, they are welcome to, as long as it
 receives adequate review.

Agreed.

 So if there's an argument here, it's whether or not the committed RLS
 patch was adequately reviewed (and if not, if it should be reverted),
 not whether it should have been in the CF or not.

The point here is precisely that nobody other than the authors
reviewed it, and that I specifically asked Stephen to hold off committing
until the next CommitFest because I did not want to drop everything to
review a patch that was posted mid-CommitFest over other patches that
were timely submitted.  Stephen took the technical content that
appeared in that same email, incorporated into the patch, and
committed it shortly thereafter.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Options OUTPUT_PLUGIN_* controlling format are confusing (Was: Misleading error message in logical decoding)

2014-09-22 Thread Michael Paquier
On Tue, Sep 23, 2014 at 4:59 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, Sep 18, 2014 at 11:21 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Thu, Sep 18, 2014 at 9:56 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
 On 2014-09-18 09:50:38 -0500, Michael Paquier wrote:
  Do you see the difference between what your doc patch states and the
  explanation I've given nearby in this thread?
 Perhaps that's the lack of documentation...

 Man. I've explained it to you about three times. The previous attempts
 at doing so didn't seem to help. If my explanations don't explain it so
 you can understand it, adding them to the docs won't change a thing.
 That's why I ask whether you see the difference?
 Urg, sorry for the misunderstanding. The patch stated that this
 parameter only influences the output of the SQL functions, while it
 actually defines whether the output plugin requires the output method
 to support binary data?

 I'm not sure exactly what that sentence means.

 But here's the point: every series of bytes is a valid bytea, except
 maybe if it's over 1GB and runs afoul of MaxAllocSize.  But a series
 of bytes is only a valid text datum if it's a valid sequence of
 characters according to the database encoding.  We like to think of
 text as being an arbitrary series of bytes, but it isn't.  It can't
 contain any \0 bytes, and it can't contain anything that's invalid in
 the database encoding.  bytea isn't subject to either of those
 restrictions.

 So if we were going to have one universal output format for output
 plugins, it would have to be bytea because that, really, can be
 anything.  We could make users convert from that to text or whatever
 they like.  But that's unappealing for several reasons: bytea output
 is printed as unreadable hexadecimal garbage, and encoding conversions
 are expensive.  So what we do instead is provide a text output method
 and a binary output method.  That way, plugins that want to return
 binary data are free to do so, and output methods that are happy to
 return text can *declare* that what they return is legal text - and
 then we just assume that to be true, and need not do an encoding
 conversion.
Aha, thanks. That's all clear then!
-- 
Michael
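
(An easy way to see the asymmetry described above; a UTF8 database is
assumed:)

    -- any byte sequence is a valid bytea
    SELECT '\x00ff'::bytea;

    -- but text must be NUL-free and valid in the database encoding
    SELECT convert_from('\x00'::bytea, 'UTF8');
    -- ERROR:  invalid byte sequence for encoding "UTF8": 0x00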




Re: [HACKERS] add modulo (%) operator to pgbench

2014-09-22 Thread Stephen Frost
Fabien,

* Fabien COELHO (coe...@cri.ensmp.fr) wrote:
 That's not really true.  You can't really add abs(x) or hash(x) right
 now because the current code only supports this syntax:
 
 \set varname operand1 [ operator operand2 ]
 
 There's no way to add support for a unary operator with that syntax.
 
  Hmmm. If you accept a postfix syntax, there is :-)
 
 But this is not convincing. Adding a unary function with a clean
 syntax indeed requires doing something with the parser.

Based on the discussion so far, it sounds like you're coming around to
agreeing with Robert (as am I) that we'd be better off building a real
syntax here instead.  If so, do you anticipate having a patch for that
in the next few days, or...?

Thanks!

Stephen
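
(To make the grammar limitation concrete, an illustrative pgbench script
fragment; the variable names are invented.  The first line fits the
existing \set varname operand1 [ operator operand2 ] form, the second
shows the modulo operator under discussion, and the unary call in the
third has no representation without a real expression syntax:)

    \set naccounts 100000 * :scale
    \set r :aid % 1000
    \set a abs(:delta)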




Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tomonari Katsumata (t.katsumata1...@gmail.com) wrote:
 I'm thinking about a method by which users become aware of it quickly.
 Now, it's okay not to change the current behavior, except where a
 non-zero value yields a zero: a zero rounded down from non-zero gets
 an error.

 In general, I'm a fan of this change and will move forward with it,
 with changes for the above, unless folks object.  Based on the thread so
 far, this sounds like the right solution and it'd be great to get this
 simple change done to help move the CF along.

Hm, I object to a patch that behaves as stated above.  I'm okay with
silently rounding *up* to the lowest possible nonzero value, eg rounding
up 7kB to 8kB if the variable's unit is 8kB.  But if we throw an error for
7kB and not 8kB, then we are exposing the unit size in a way that the user
can't ignore.  That seems to me to be antithetical to the entire design
rationale for GUC units.  Moreover, it doesn't fix the original complaint here,
which is that the user would be surprised if he specifies 7kB and gets
some special behavior instead because it's too close to zero but not
exactly zero.  7kB should act as though it's not zero.  If the difference
between 7kB and 8kB is actually user-meaningful, then we chose too coarse
a unit for the particular GUC, which is not something that a rounding rule
can paper over.  But the difference between zero and not-zero is
meaningful regardless of unit choices.

This argument doesn't say anything much about which way to round for
values that are fractional but larger than the unit size.  I'd probably
prefer a round away from zero behavior since that seems to be the most
consistent rule if we want to avoid silently creating zero values; but
I could be talked into something else.

regards, tom lane




Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Stephen Frost
Hey Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  * Tomonari Katsumata (t.katsumata1...@gmail.com) wrote:
  I'm thinking about a method by which users become aware of it quickly.
  Now, it's okay not to change the current behavior, except where a
  non-zero value yields a zero: a zero rounded down from non-zero gets
  an error.
 
  In general, I'm a fan of this change and will move forward with it,
  with changes for the above, unless folks object.  Based on the thread so
  far, this sounds like the right solution and it'd be great to get this
  simple change done to help move the CF along.
 
 Hm, I object to a patch that behaves as stated above.  I'm okay with
 silently rounding *up* to the lowest possible nonzero value, eg rounding
 up 7kB to 8kB if the variable's unit is 8kB.  But if we throw an error for
 7kB and not 8kB, then we are exposing the unit size in a way that the user
 can't ignore.  That seems to me to be antithetical to the entire design
 rationale for GUC units.  Moreover, it doesn't fix the original complaint here,
 which is that the user would be surprised if he specifies 7kB and gets
 some special behavior instead because it's too close to zero but not
 exactly zero.  7kB should act as though it's not zero.  If the difference
 between 7kB and 8kB is actually user-meaningful, then we chose too coarse
 a unit for the particular GUC, which is not something that a rounding rule
 can paper over.  But the difference between zero and not-zero is
 meaningful regardless of unit choices.

I agree that the difference between 7kB and 8kB shouldn't be meaningful,
but that it can be if it means we end up at zero.  Having different
rounding rules at near-zero than in other cases doesn't strike me as
great either though, which is why I thought the error-if-rounded-to-zero
made sense.  As for the user, I'd argue that they haven't understood the
GUC if they're setting the value down to something which rounds to zero,
and an error (yes, even one in the logs that we know few enough folks
read) would be better than where we are currently.

 This argument doesn't say anything much about which way to round for
 values that are fractional but larger than the unit size.  I'd probably
 prefer a round away from zero behavior since that seems to be the most
 consistent rule if we want to avoid silently creating zero values; but
 I could be talked into something else.

PeterE argued that rounding away from zero didn't make sense either
(53f6501b.3090...@gmx.net).  I feel like we're getting trapped by
examples.

Here's another proposal- how about we support a 'minimum-if-not-zero'
option for GUCs more generally, and then throw an error if the user sets
the value to a value below that minimum unless they explicitly use zero
(to indicate whatever the special meaning of zero for that GUC is)?
It'd be a larger change, certainly, but I feel like that combined with
the current behavior would address this and possibly other issues
(setting to a value which is low enough to be accepted, but too low to
allow the system to function properly..).

Thanks!

Stephen




Re: [HACKERS] RLS feature has been committed

2014-09-22 Thread David G Johnston
Robert Haas wrote
 It's difficult to imagine a more flagrant violation of process than
 committing a patch without any warning and without even *commenting*
 on the fact that clear objections to commit were made on a public
 mailing list.  If that is allowed to stand, what can we assume other
 than that Stephen, at least, has a blank check to change anything he
 wants, any time he wants, with no veto possible from anyone else?

I'm of a mind to agree that this shouldn't have been committed... but
I'm not seeing where Stephen has done sufficient wrong to justify
crucifixion, especially since everything is being done publicly and you
are one of the many people in a position to flex a veto by reverting
the patch.

I see this like a black swan event[1]: it needs to be addressed and is
thought-provoking, but ultimately shouldn't be treated as something to
overreact to until it happens more frequently.  There are enough checks
and balances when it comes to committed code - which in this case is
during pre-beta development - that one should simply allow for a
certain level of human fallibility to creep in and need periodic
addressing/correcting.

At this point my hindsight says a strictly declaratory statement of the
reasons this is not ready, combined with reverting the patch, would have
been sufficient; or even just an I am going to revert this for these
reasons post.  The difference between building support for a revert and
gathering a mob is a pretty thin line.

Subsequent, possibly private, discussion between you and Stephen could
then occur before making public any conclusions you form, so that others
can learn from the experience and ponder whether anything could be
changed to mitigate such situations in the future.

Though I guess if you indeed feel that his actions were truly heinous,
you should also then put forth the proposal that his ability to commit
be revoked.  If you're not willing to go to that extent then, unless you
know Stephen personally, I'd not assume that public flogging is the best
way to get him to not mess up in the future; honest and cordial dialog
about cause/effect is likely to be more productive and less
self-destructive.  Though, to each their own style.

As a committer you have a responsibility to work not only with code but with
those who write the code; and while I myself am not a particularly strong
(or experienced) manager I have enough life experience to give opinions on
leadership.  I won't fault you for being yourself but simply put forth my
own impressions and some ideas to provoke thought.  

I'm also not the one whose efforts were marginalized so don't have the same
emotional attachment to the situation as you do - an attachment that needs
to be recognized because, as I do know from experience, even when you are
right and justified an overreaction makes you come off unfavorably and
doesn't materially improve the situation beyond what it could have been
otherwise.

David J.

1. http://en.wikipedia.org/wiki/Black_swan_theory







Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 This argument doesn't say anything much about which way to round for
 values that are fractional but larger than the unit size.  I'd probably
 prefer a round away from zero behavior since that seems to be the most
 consistent rule if we want to avoid silently creating zero values; but
 I could be talked into something else.

 PeterE argued that rounding away from zero didn't make sense either
 (53f6501b.3090...@gmx.net).  I feel like we're getting trapped by
 examples.

I don't find anything compelling in Peter's argument.  I do agree that
if the GUC unit is hours, and the user tries to set it to 1 second or 1
microsecond, then he probably didn't understand the GUC ... but by that
argument, if the unit is hours and he tries to set it to 1.001 hours, we
should still throw an error.  The point of the GUC units system is to
not be too picky about what the variable's units are, and that that should
be possible as long as the unit is small enough that the user shouldn't
care about the difference between N and N+1 units.  Therefore, I am
entirely unimpressed by examples that hinge on the assumption that the
user *does* care about that; any such example can be rejected on the
grounds that it was our error to use too large a unit for that GUC.
However, as long as we have any cases with special behavior for zero,
we should expect that the user cares about the difference between 0 units
and 1 unit.

 Here's another proposal- how about we support a 'minimum-if-not-zero'
 option for GUCs more generally, and then throw an error if the user sets
 the value to a value below that minimum unless they explicitly use zero
 (to indicate whatever the special meaning of zero for that GUC is)?

I don't think that the extra complexity in that is worth the effort.

You could maybe talk me into round to nearest, and then complain if that
produced zero from nonzero (in effect, your proposal with the minimum
always exactly half a unit).  But that seems like mostly extra complexity
and an extra error case for not much.  Again, *the user shouldn't care*
what our rounding rule is exactly; if he does, it means the particular
GUC was misdesigned.

We could adopt a straightforward round to nearest rule if we changed
things around enough so that zero was never special, which I think is
what Peter was really arguing for in the post you cite.  But that strikes
me as a large amount of work, and a large loss of backwards compatibility,
in return for (again) not much.

regards, tom lane




Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread David G Johnston
Tom Lane-2 wrote
 The case where this argument falls down is for special values, such as
 where zero means something quite different from the smallest nonzero
 value.  Peter suggested upthread that we should redefine any GUC values
 for which that is true, but (a) I think that loses on backwards
 compatibility grounds, and (b) ISTM zero is probably always special to
 some extent.  A zero time delay for example is not likely to work.
 
 Maybe we should leave the rounding behavior alone (there's not much
 evidence that rounding in one direction is worse than another; although
 I'd also be okay with changing to round-to-nearest), and confine ourselves
 to throwing an error for the single case that an apparently nonzero input
 value is truncated/rounded to zero as a result of units conversion.

Tom,

Can you either change your mind back to this opinion you held last month or
commit something you find acceptable - it's not like anyone would revert
something you commit... :)

Everyone agrees non-zero must not round to zero; as long as that happens I'm
not seeing anyone willing to spend any more effort on the details.

David J.

 







Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Tom Lane
David G Johnston david.g.johns...@gmail.com writes:
 Can you either change your mind back to this opinion you held last month or
 commit something you find acceptable - it's not like anyone would revert
 something you commit... :)

Hey, am I not allowed to change my mind :-) ?

Seriously though, the main point I was making before stands: if the
details of the rounding rule matter, then we messed up on choosing the
units of the particular GUC.  The question is what are we going to do
about zero being a special case.

 Everyone agrees non-zero must not round to zero; as long as that happens I'm
 not seeing anyone willing to spend any more effort on the details.

I'm not entirely sure Peter agrees; he wanted to get rid of zero being
a special case, rather than worry about making the rounding rule safe
for that case.  But assuming that that's a minority position:
it seems to me that adding a new error condition is more work for us,
and more work for users too, and not an especially sane decision when
viewed from a green-field perspective.  My proposal last month was in
response to some folk who were arguing for a very narrow-minded
definition of backwards compatibility ... but I don't think that's
really where we should go.

regards, tom lane




Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
  Here's another proposal- how about we support a 'minimum-if-not-zero'
  option for GUCs more generally, and then throw an error if the user sets
  the value to a value below that minimum unless they explicitly use zero
  (to indicate whatever the special meaning of zero for that GUC is)?
 
 I don't think that the extra complexity in that is worth the effort.

Alright.

 You could maybe talk me into round to nearest, and then complain if that
 produced zero from nonzero (in effect, your proposal with the minimum
 always exactly half a unit).  But that seems like mostly extra complexity
 and an extra error case for not much.  Again, *the user shouldn't care*
 what our rounding rule is exactly; if he does, it means the particular
 GUC was misdesigned.

I agree that the user shouldn't have to care, and I agree with your
earlier comment on this thread that having the rounding rules be
different when near zero vs. not-near-zero could be quite confusing.

 We could adopt a straightforward round to nearest rule if we changed
 things around enough so that zero was never special, which I think is
 what Peter was really arguing for in the post you cite.  But that strikes
 me as a large amount of work, and a large loss of backwards compatibility,
 in return for (again) not much.

Agreed- I'm a bit concerned about backwards compatibility for all of the
proposals which change the existing rounding rules, but, if the
consensus is that N vs. N+1 shouldn't actually matter for values of
N < X < N+1 (as a one-unit step should be small enough to be not
terribly noticeable), then perhaps we can still move forward with the
ceil() approach, as that looks to be the only argument against it.

To clarify- we'll simply swap from (essentially) floor() to ceil() for
handling all GUC_with_unit to internal_unit conversions, document that,
and note it in the release notes as a possible behavior change and move
on.
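
(For concreteness, a sketch of what that floor()-to-ceil() change
amounts to - illustrative C only, not the actual guc.c code:)

    /*
     * Hypothetical helper: convert "value" in user-specified units to
     * internal units of size "unit", rounding away from zero instead
     * of truncating toward it (integer ceiling division).
     */
    static int64
    round_away_from_zero(int64 value, int64 unit)
    {
        if (value >= 0)
            return (value + unit - 1) / unit;
        return -((-value + unit - 1) / unit);
    }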

Thoughts?  Objections?

Thanks!

Stephen




Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 To clarify- we'll simply swap from (essentially) floor() to ceil() for
 handling all GUC_with_unit to internal_unit conversions, document that,
 and note it in the release notes as a possible behavior change and move
 on.

Worksforme.

regards, tom lane




Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread David Johnston
On Tuesday, September 23, 2014, Tom Lane t...@sss.pgh.pa.us wrote:

 David G Johnston david.g.johns...@gmail.com writes:
  Can you either change your mind back to this opinion you held last
  month or commit something you find acceptable - it's not like anyone
  would revert something you commit... :)

 Hey, am I not allowed to change my mind :-) ?

 Seriously though, the main point I was making before stands: if the
 details of the rounding rule matter, then we messed up on choosing the
 units of the particular GUC.  The question is what are we going to do
 about zero being a special case.

  Everyone agrees non-zero must not round to zero; as long as that
  happens I'm not seeing anyone willing to spend any more effort on the
  details.

 I'm not entirely sure Peter agrees; he wanted to get rid of zero being
 a special case, rather than worry about making the rounding rule safe
 for that case.  But assuming that that's a minority position:
 it seems to me that adding a new error condition is more work for us,
 and more work for users too, and not an especially sane decision when
 viewed from a green-field perspective.  My proposal last month was in
 response to some folk who were arguing for a very narrow-minded
 definition of backwards compatibility ... but I don't think that's
 really where we should go.

 regards, tom lane


This patch should fix the round-to-zero issue.  If someone wants to get rid
of zero as a special case, let them supply a separate patch for that
improvement.

My original concern was that things which round to zero now will not in
9.5 if the non-error solution is chosen.  The risk profile is extremely
small, but it is not theoretically zero.

David J.


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  To clarify- we'll simply swap from (essentially) floor() to ceil() for
  handling all GUC_with_unit to internal_unit conversions, document that,
  and note it in the release notes as a possible behavior change and move
  on.
 
 Worksforme.

Great, thanks.  I'll wait a couple days to see if anyone else wants to
voice a concern.

Tomonari, don't worry about updating the patch (unless you really want
to) as I suspect I'll be changing around the wording anyway.  No
offense, please, but I suspect English isn't your first language and
it'll be simpler for me if I just handle it.  Thanks a lot for the bug
report and discussion and I'll be sure to credit you for it in the
commit.

Thanks again!

Stephen




Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-22 Thread Tom Lane
David Johnston david.g.johns...@gmail.com writes:
 My original concern was that things which round to zero now will not in
 9.5 if the non-error solution is chosen.  The risk profile is extremely
 small, but it is not theoretically zero.

This is exactly the position I was characterizing as an excessively
narrow-minded attachment to backwards compatibility.  We are trying to
make the behavior better (as in less confusing), not guarantee that it's
exactly the same.  If we are only allowed to change the behavior by
throwing errors in cases where we previously didn't, then we are
voluntarily donning a straitjacket that will pretty much ensure Postgres
doesn't improve any further.

It's important here that this behavior change is being proposed for a
new major release, not for back-patching.  We have never supposed that
postgresql.conf files had to work without any change in new
major releases.

regards, tom lane




Re: [HACKERS] Assertion failure in syncrep.c

2014-09-22 Thread Pavan Deolasee
On Mon, Sep 22, 2014 at 6:50 PM, Stephen Frost sfr...@snowman.net wrote:

 Pavan,

 * Pavan Deolasee (pavan.deola...@gmail.com) wrote:
  On Thu, Sep 18, 2014 at 12:02 PM, Pavan Deolasee 
 pavan.deola...@gmail.com
  wrote:
   While running some tests on REL9_2_STABLE branch, I saw an assertion
   failure in syncrep.c. The stack trace looks like this:
  
  Any comments on this? I see it very regularly during my pgbench tests.

 I agree that it looks like there may be a race condition there- but
 could you provide the test cases you're working with?  What kind of
 behavior is it that's making it show up easily for you?


Nothing special really. Setting up 2-node sync replication on my Mac
laptop and running pgbench with 10 clients triggers it. As I said, after
looking at the code and realising that there is a race condition, I
tried with gdb to reproduce the race I suspected.

Anyway, the attached patch should trigger the race condition for a
simple query. I'm deliberately making the backend wait, to give the
walsender a chance to send outstanding WAL, and then making the
walsender wait so that the assertion is triggered in the backend.

Hope this helps.

Thanks,
Pavan
-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


syncrep_assertion_trigger.patch
Description: Binary data



[HACKERS] tick buildfarm failure

2014-09-22 Thread Stephen Frost
All,

  I've been keeping an eye on tick as it failed a day or so ago and it
  looks to be related to RLS.  Using a local
  CFLAGS=-DCLOBBER_CACHE_ALWAYS -DRANDOMIZE_ALLOCATED_MEMORY build, I
  was able to see the regression tests failing in
  check_role_for_policy() due to a pretty clear reset of the memory used
  for the policies.

  Looking through relcache.c, which I have to admit to not being as
  familiar with as some, the issue becomes rather apparent (I believe)-
  RelationClearRelation() hasn't been set up to handle the RLS policies
  specially, as it does for rules and tupledesc.  Fair enough, it's
  reasonably straight-forward to add an equalPolicies() and handle the
  policies the same way- but I'm left wondering if that's actually
  *safe* to do?

  To be a bit more clear- why is it safe to change the contents if the
  equal() function returns 'false'?  I'm guessing the answer is
  locking- whereby such a change would imply a lock was acquired on
  the relation by another backend, which shouldn't be possible if we're
  currently working with it, but wanted to double-check that.
  
  I also wonder if we might be better off with a way to identify that
  nothing has actually been changed due to the locks we hold and avoid
  rebuilding the relation cache entry entirely...

Thanks!

Stephen

