Re: [HACKERS] ExplainModifyTarget doesn't work as expected

2015-02-06 Thread Etsuro Fujita

Hi Ashutosh,

Thank you for the review!

On 2015/02/03 15:32, Ashutosh Bapat wrote:

I agree that it's a problem, and it looks more severe when there are
multiple children
postgres=# create table parent (a int check (a > 0) no inherit);
CREATE TABLE
postgres=# create table child1 (a int check (a = 0));
CREATE TABLE
postgres=# create table child2 (a int check (a = 0));
CREATE TABLE
postgres=# create table child3 (a int check (a = 0));
CREATE TABLE
postgres=# alter table child1 inherit parent;
ALTER TABLE
postgres=# alter table child2 inherit parent;
ALTER TABLE
postgres=# alter table child3 inherit parent;
ALTER TABLE
postgres=# explain update parent set a = a * 2 where a = 0;
QUERY PLAN
-----------------------------------------------------------------
 Update on child1  (cost=0.00..126.00 rows=2400 width=10)
   ->  Seq Scan on child1  (cost=0.00..42.00 rows=800 width=10)
         Filter: (a = 0)
   ->  Seq Scan on child2  (cost=0.00..42.00 rows=800 width=10)
         Filter: (a = 0)
   ->  Seq Scan on child3  (cost=0.00..42.00 rows=800 width=10)
         Filter: (a = 0)
(7 rows)

It's certainly confusing why an update on child1 would cause scans on child*.


Yeah, I think so too.


But I also think that showing the parent's name with Update would be
misleading, esp. when the user expects it to get filtered out because of
constraint exclusion.

Instead, can we show all the relations that are being modified, e.g.
Update on child1, child2, child3? That will disambiguate everything.


That's an idea, but my concern is the case where there are a large
number of child tables, as the EXPLAIN output would be difficult to
read then.


Best regards,
Etsuro Fujita


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [POC] FETCH limited by bytes.

2015-02-06 Thread Kyotaro HORIGUCHI
Hello,

 Redshift has a table, stv_inflight, which serves about the same purpose as
 pg_stat_activity. Redshift seems to perform better with very high fetch
 sizes (10,000 is a good start), so the default foreign data wrapper didn't
 perform so well.

I agree with you.

 I was able to confirm that the first query showed "FETCH 150 FROM c1" as
 the query, which is normally highly unhelpful, but in this case it confirms
 that tables created in redshift_server do by default use the fetch_size
 option given during server creation.
 
 I was also able to confirm that the second query showed FETCH 151 FROM c1
 as the query, which shows that per-table overrides also work.
 
 I'd be happy to stop here, but Kyotaro might feel differently.

This is enough in its own way, of course.

 With this
 limited patch, one could estimate the number of rows that would fit into
 the desired memory block based on the row width and set fetch_size
 accordingly.

Users, including me, will be happy with it once they know how to
determine the fetch size, especially when the remote tables are
few or the configuration is stable enough.

On widely distributed systems, it would be far more difficult to
tune the fetch size manually for every foreign table, so it would
likely be left at the default, safe size of 100 or so.

This is similar to the discussion about max_wal_size on another
thread. Calculating a fetch size is far tougher for users than
setting an expected memory usage, I think.
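A minimal sketch of the kind of derivation being discussed here — turning a memory budget into a fetch_size from an estimated row width. The helper name and the clamp bounds are illustrative assumptions, not anything from the patch:

```c
#include <assert.h>

/* Derive a fetch_size from a memory budget (bytes) and an estimated
 * row width (bytes).  Illustrative only: bounds are assumptions. */
static int
derive_fetch_size(long memory_budget, int est_row_width)
{
    long n;

    /* guard against a zero/negative width estimate */
    if (est_row_width <= 0)
        est_row_width = 1;
    n = memory_budget / est_row_width;
    if (n < 1)
        n = 1;                  /* always fetch at least one row */
    if (n > 100000)
        n = 100000;             /* arbitrary safety cap (assumption) */
    return (int) n;
}
```

With, say, a 1 MB budget and a 100-byte row estimate, this yields a fetch size of roughly ten thousand rows, in line with the numbers Corey reports working well on Redshift.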

 But we could go further, and have a fetch_max_memory option only at the
 table level, and the fdw could do that same memory / estimated_row_size
 calculation and derive fetch_size that way *at table creation time*, not
 query time.

We cannot know the real length of text-type data in advance.
Moreover, even for a column defined as char(n), n is only the
theoretical (by-design) maximum size of the field; in most cases
the mean length of the real data will be far smaller than that.
For that reason, calculating the ratio at table creation time
seems difficult.

However, I agree with Tom's suggestion that the changes to the
FETCH statement are definitely ugly; in particular, the overhead
argument is prohibitive even for me :(

 Thanks to Kyotaro and Bruce Momjian for their help.

Not at all.

regards,


At Wed, 4 Feb 2015 18:06:02 -0500, Corey Huinker corey.huin...@gmail.com 
wrote in CADkLM=eTpKYX5VOfjLr0VvfXhEZbC2UeakN=p6mxmg7s86c...@mail.gmail.com
 I applied this patch to REL9_4_STABLE, and I was able to connect to a
 foreign database (redshift, actually).
 
 the basic outline of the test is below, names changed to protect my
 employment.
 
 create extension if not exists postgres_fdw;
 
 create server redshift_server foreign data wrapper postgres_fdw
 options ( host 'some.hostname.ext', port '5439', dbname 'redacted',
 fetch_size '150' );
 
 create user mapping for public server redshift_server options ( user
 'redacted_user', password 'comeonyouarekiddingright' );
 
 create foreign table redshift_tab150 ( colspecs )
 server redshift_server options (table_name 'redacted_table', schema_name
 'redacted_schema' );
 
 create foreign table redshift_tab151 ( colspecs )
 server redshift_server options (table_name 'redacted_table', schema_name
 'redacted_schema', fetch_size '151' );
 
 -- i don't expect the fdw to push the aggregate, this is just a test to see
 what query shows up in stv_inflight.
 select count(*) from redshift_tab150;
 
 -- i don't expect the fdw to push the aggregate, this is just a test to see
 what query shows up in stv_inflight.
 select count(*) from redshift_tab151;
 
 
 For those of you that aren't familiar with Redshift, it's a columnar
 database that seems to be a fork of postgres 8.cough. You can connect to it
 with modern libpq programs (psql, psycopg2, etc).
 Redshift has a table, stv_inflight, which serves about the same purpose as
 pg_stat_activity. Redshift seems to perform better with very high fetch
 sizes (10,000 is a good start), so the default foreign data wrapper didn't
 perform so well.
 
 I was able to confirm that the first query showed "FETCH 150 FROM c1" as
 the query, which is normally highly unhelpful, but in this case it confirms
 that tables created in redshift_server do by default use the fetch_size
 option given during server creation.
 
 I was also able to confirm that the second query showed FETCH 151 FROM c1
 as the query, which shows that per-table overrides also work.
 
 I'd be happy to stop here, but Kyotaro might feel differently. With this
 limited patch, one could estimate the number of rows that would fit into
 the desired memory block based on the row width and set fetch_size
 accordingly.
 
 But we could go further, and have a fetch_max_memory option only at the
 table level, and the fdw could do that same memory / estimated_row_size
 calculation and derive fetch_size that way *at table creation time*, not
 query time.
 
 Thanks to Kyotaro and Bruce Momjian for their help.
 
 
 
 
 
 
 On Mon, 

Re: [HACKERS] EvalPlanQual behaves oddly for FDW queries involving system columns

2015-02-06 Thread Etsuro Fujita

Hi Ashutosh,

On 2015/02/03 16:44, Ashutosh Bapat wrote:

I am having some minor problems running this repro



[Terminal 2]
postgres=# create foreign table ft (a int) server loopback options
(table_name 'lbt');



There isn't any table lbt mentioned here. Do you mean t here?


Sorry, my explanation was not sufficient.  lbt is the table on the 
remote side that the foreign table references via the foreign server 
loopback.  It'd be defined, e.g., in the following manner:


$ createdb efujita

efujita=# create table lbt (a int);
CREATE TABLE

postgres=# create server loopback foreign data wrapper postgres_fdw 
options (dbname 'efujita');

CREATE SERVER
postgres=# create user mapping for current_user server loopback;
CREATE USER MAPPING
postgres=# create foreign table ft (a int) server loopback options 
(table_name 'lbt');

CREATE FOREIGN TABLE

Thanks for the review!

Best regards,
Etsuro Fujita




Re: [HACKERS] Early Setup of instrumentation information in pg_stat_statements

2015-02-06 Thread Robert Haas
On Thu, Feb 5, 2015 at 10:28 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 Currently in pg_stat_statements, the setup to track
 instrumentation/totaltime information is done after
 ExecutorStart().  Can we do this before ExecutorStart()?
 In particular, I am referring to below code:

 static void
 pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
 ..
 standard_ExecutorStart(queryDesc, eflags);
 ..
 if (pgss_enabled() && queryDesc->plannedstmt->queryId != 0)
 {
 ..
 if (queryDesc->totaltime == NULL)
 {
 ..
 queryDesc->totaltime = InstrAlloc(1, INSTRUMENT_ALL);
 ..
 }
 }
 }

 The reason why I am asking is that to track instrumentation
 information (like buffer usage parameters) in case of parallel
 sequence scan, we need to pass this information at start of
 backend workers which are started in ExecutorStart() phase
 and at that time, there is no information available which can
 guarantee (we have queryId stored in planned stmt, but I think
 that is not sufficient to decide) that such an information is
 needed by plugin.  This works well for Explain statement as
 that has the information for instrumentation available before
 ExecutorStart() phase.

 Please suggest me if there is a better way to make this
 information available in ExecutorStart() phase?

This seems like a clue that maybe starting the workers during the
ExecutorStart() phase is a bad idea.  It might be a bad idea for other
reasons, too: what happens if that node never gets executed?  I think
the general pattern of executor nodes is that they're not supposed to
do any real work until somebody tries to pull a tuple from them.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Seq Scan

2015-02-06 Thread Robert Haas
On Fri, Feb 6, 2015 at 12:34 PM, Robert Haas robertmh...@gmail.com wrote:
 4.

Obviously that went out a bit too soon.  Anyway, what I think we
should do here is back up a bit and talk about what the problems are
that we need to solve here and how each of them should be solved.  I
think there is some good code in this patch, but we really need to
think about what the interfaces should look like and achieve a clean
separation of concerns.

Looking at the code for the non-parallel SeqScan node, there are
basically two things going on here:

1. We call heap_getnext() to get the next tuple and store it into a
TupleTableSlot.
2. Via ExecScan(), we do projection and apply quals.

My first comment here is that I think we should actually teach
heapam.c about parallelism.  In other words, let's have an interface
like this:

extern Size heap_parallelscan_estimate(Snapshot snapshot);
extern void heap_parallelscan_initialize(ParallelHeapScanDesc target,
Relation relation, Snapshot snapshot);
extern HeapScanDesc heap_beginscan_parallel(ParallelHeapScanDesc);

So the idea is that if you want to do a parallel scan, you call
heap_parallelscan_estimate() to figure out how much space to reserve
in your dynamic shared memory segment.  Then you call
heap_parallelscan_initialize() to initialize the chunk of memory once
you have it.  Then each backend that wants to assist in the parallel
scan calls heap_beginscan_parallel() on that chunk of memory and gets
its own HeapScanDesc.  Then, they can all call heap_getnext() just as
in the non-parallel case.  The ParallelHeapScanDesc needs to contain
the relation OID, the snapshot, the ending block number, and a
current-block counter.  Instead of automatically advancing to the next
block, they use one of Andres's nifty new atomic ops to bump the
current-block counter and then scan the block just before the new
value.  All this seems pretty straightforward, and if we decide to
later change the way the relation gets scanned (e.g. in 1GB chunks
rather than block-by-block) it can be handled here pretty easily.
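The block-claiming step described above — bump the shared counter atomically and scan the block just before the counter's new value — can be sketched with C11 atomics. The struct layout and function names here are illustrative, not the actual patch's API:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared state for a parallel scan: workers claim blocks
 * by atomically bumping a shared counter.  Sketch only. */
typedef struct ParallelScanState
{
    uint32_t nblocks;              /* ending block number (exclusive) */
    _Atomic uint32_t next_block;   /* shared current-block counter */
} ParallelScanState;

/* Claim the next block to scan; returns UINT32_MAX when the scan is
 * exhausted.  atomic_fetch_add returns the pre-increment value, so each
 * worker scans the block "just before the new value" of the counter. */
static uint32_t
claim_next_block(ParallelScanState *state)
{
    uint32_t blk = atomic_fetch_add(&state->next_block, 1);

    return (blk < state->nblocks) ? blk : UINT32_MAX;
}
```

Because the claim is a single fetch-and-add, no two workers ever receive the same block, and switching to coarser units (e.g. 1 GB chunks) would only change the increment and the bounds check.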

Now, let's suppose that we have this interface and for some reason we
don't care about quals and projection - we just want to get the tuples
back to the master.  It's easy enough to create a parallel context
that fires up a worker and lets the worker call
heap_beginscan_parallel() and then heap_getnext() in a loop, but what
does it do with the resulting tuples?  We need a tuple queue that can
be used to send the tuples back to master.  That's also pretty easy:
just set up a shm_mq and use shm_mq_send() to send each tuple.  Use
shm_mq_receive() in the master to read them back out.  The only thing
we need to be careful about is that the tuple descriptors match.  It
must be that they do, because the way the current parallel context
patch works, the master is guaranteed to hold a lock on the relation
from before the worker starts up until after it dies.  But we could
stash the tuple descriptor in shared memory and cross-check that it
matches just to be sure.  Anyway, this doesn't seem terribly complex
although we might want to wrap some abstraction around it somehow so
that every kind of parallelism that uses tuple queues can benefit from
it.  Perhaps this could even be built into the parallel context
machinery somehow, or maybe it's something executor-specific.  At any
rate it looks simpler than what you've got now.

The complicated part here seems to me to figure out what we need to
pass from the parallel leader to the parallel worker to create enough
state for quals and projection.  If we want to be able to call
ExecScan() without modification, which seems like a good goal, we're
going to need a ScanState node, which is going to need to contain
valid pointers to (at least) a ProjectionInfo, an ExprContext, and a
List of quals.  That in turn is going to require an ExecutorState.
Serializing those things directly doesn't seem very practical; what we
instead want to do is figure out what we can pass that will allow easy
reconstruction of those data structures.  Right now, you're passing
the target list, the qual list, the range table, and the params, but
the range table doesn't seem to be getting used anywhere.  I wonder if
we need it.  If we could get away with just passing the target list
and qual list, and params, we'd be doing pretty well, I think.  But
I'm not sure exactly what that looks like.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] New CF app deployment

2015-02-06 Thread Magnus Hagander
On Mon, Jan 26, 2015 at 10:16 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Jan 26, 2015 at 4:07 PM, Magnus Hagander mag...@hagander.net
 wrote:
  I assume what was referred to was that the old cf app would show the
 last 3
  (I think it was) comments/patches/whatnot on a patch on the summary page
  (and then clickthrough for more details).

 Yep.


So doing this for whatever activity is under patch-history is trivial. Is
that the kind of information that people would like to see? (This assumes
that making annotations etc. per the other message in this thread will
create an entry in there, of course.) Or is it something else?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] New CF app deployment

2015-02-06 Thread Magnus Hagander
On Mon, Jan 26, 2015 at 9:46 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Jan 26, 2015 at 3:24 PM, Magnus Hagander mag...@hagander.net
 wrote:
  Yes, and the agreement after that feedback was to try it out and then
 figure
  out what changes were needed? As about half the feedback said it was
 better
  without and half said it was better with.

 Well, I can't speak to anyone else's opinion, but I'm quite sure I
 raised the issue that we need a way to call out which messages in the
 thread are important, and I think that's pretty much what Peter is
 saying, too.  I find the new tool essentially unusable - having one
 link to the whole thread instead of individual links to just the
 important messages in that thread is a huge regression for me, as is
 the lack of the most recent activity on the summary page.  I don't
 know how much more feedback than that you need to be convinced, but
 I'm going to shut up now before I say something I will later regret.


So in an attempt to actually move this forward in a constructive way I'm
going to ignore  a bunch of what happened after this email, and fork the
discussion at this point.

Since it's pretty clear that several people, many of whom are definitely
more productive pg developers than me (and some of the others who
specifically said they did not want this feature), have spoken up about it
*after* its release (there's no point in arguing who said what
beforehand), we should look into doing that.


First of all - assuming we'll fix this particular thing. Are there *other*
things that you would also say makes the tool essentially unusable?
Because if there are, can we get that out of the way early? I wouldn't want
to spend a bunch of time fixing this, just to get back into the same round
of argument again, and basically the only thing that's accepted is a
rollback. Because if that's what people want, then let's rollback - I'm not
going to waste time on it in that case. (I'm equally not going to spend
time on fixing anything on the old one anymore, so the next time there's a
security issue or such around it, it will be shut down until someone fixes
it - but that's still better than a tool people don't like to use)


Second, so let's look at how to actually do this.

What kind of annotation are we actually talking about here? Are you looking
for something that's basically "star a message" (like you'd do in a MUA in
a lot of cases)? Or "assign a tag" (from a set of - editable of course -
pre-existing tags)? Or a complete free-text field?

Then, what do you need to be able to annotate on?

Right now the app only shows messages that actually have attachments on
them, assuming that you'd want to read the whole thread if you want to look
at other things. For this reason, for the message thread itself it only
keeps track of the first and last message (so you can know when it has
updated). Are you saying you want to be able to attach an annotation to
*any* message in the thread? If so, we'd have to pull in and show every
single message on the thread in the UI - I'm thinking that might become
more than a little cluttered in a lot of cases. If that is what you're
talking about, any suggestions for how to actually make that into an UI
that's not that cluttered?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Simplify sleeping while reading/writing from client

2015-02-06 Thread Heikki Linnakangas

On 02/06/2015 10:38 AM, Michael Paquier wrote:

On Thu, Feb 5, 2015 at 6:45 PM, Heikki Linnakangas wrote:

Looking again at the code after Andres' interrupt-handling patch series, I
got confused by the fact that there are several wait-retry loops in
different layers, and reading and writing works slightly differently.

I propose the attached, which pulls all the wait-retry logic up to
secure_read() and secure_write(). This makes the code a lot more
understandable.


Are you sure that it is a good idea to move the check of port->noblock
out of be_tls_[read|write] to an upper level? ISTM that we should set
errno and n to respectively EWOULDBLOCK and -1 in be_tls_[write|read]
when port->noblock and do nothing otherwise. In your patch those
values are set even if the port is in block mode.


It simplifies the code to do all the sleeping and interrupt handling 
code in the upper level, in secure_[read|write]. Do you see a problem 
with it?


- Heikki





Re: [HACKERS] Simplify sleeping while reading/writing from client

2015-02-06 Thread Michael Paquier
On Fri, Feb 6, 2015 at 1:46 PM, Heikki Linnakangas wrote:
 It simplifies the code to do all the sleeping and interrupt handling code in
 the upper level, in secure_[read|write]. Do you see a problem with it?
Not directly. Reading the code, I got uneasy with the fact that we set
those parameters unconditionally even if the port is in block mode.
-- 
Michael




Re: [HACKERS] New CF app deployment

2015-02-06 Thread Magnus Hagander
On Mon, Jan 26, 2015 at 10:29 PM, Robert Haas robertmh...@gmail.com wrote:

 (While I'm complaining, the links only go to the flat version of the
 thread, while I happen to prefer the version that shows one message at
 a time with a message-ID selector to switch messages.)


Then you're clicking the wrong link :) As long as you click the link for
the message (first or last) you get the regular view. The thread link
always goes to the same message as the first one, except the flat view.
So you have a choice of both (and these links are both always visible, none
of them is in the collapsed area which holds the attachments)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] PATCH: Reducing lock strength of trigger and foreign key DDL

2015-02-06 Thread Andreas Karlsson

On 02/06/2015 08:16 AM, Michael Paquier wrote:

On Fri, Jan 30, 2015 at 10:26 PM, Andreas Karlsson wrote:

On 01/30/2015 07:48 AM, Michael Paquier wrote:

Looking at the latest patch, it seems that in
AlterTableGetLockLevel@tablecmds.c we ought to put AT_ReAddConstraint,
AT_AddConstraintRecurse and AT_ProcessedConstraint under the same
banner as AT_AddConstraint. Thoughts?


Good point. I think moving those would be a good thing even though it is
technically not necessary for AT_AddConstraintRecurse, since that one should
only be related to check constraints.


Andreas, are you planning to send an updated patch?


Yes, I will hopefully send it later today or tomorrow.

Andreas






Re: [HACKERS] Simplify sleeping while reading/writing from client

2015-02-06 Thread Michael Paquier
On Thu, Feb 5, 2015 at 6:45 PM, Heikki Linnakangas wrote:
 Looking again at the code after Andres' interrupt-handling patch series, I
 got confused by the fact that there are several wait-retry loops in
 different layers, and reading and writing works slightly differently.

 I propose the attached, which pulls all the wait-retry logic up to
 secure_read() and secure_write(). This makes the code a lot more
 understandable.

Are you sure that it is a good idea to move the check of port->noblock
out of be_tls_[read|write] to an upper level? ISTM that we should set
errno and n to respectively EWOULDBLOCK and -1 in be_tls_[write|read]
when port->noblock and do nothing otherwise. In your patch those
values are set even if the port is in block mode.
-- 
Michael




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-06 Thread Fujii Masao
On Fri, Feb 6, 2015 at 4:15 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Feb 5, 2015 at 11:06 PM, Syed, Rahila rahila.s...@nttdata.com wrote:
/*
+* We recheck the actual size even if pglz_compress() reports success,
+* because it might be satisfied with having saved as little as one byte
+* in the compressed data.
+*/
+   *len = (uint16) compressed_len;
+   if (*len >= orig_len - 1)
+   return false;
+   return true;
+}

 As per latest code ,when compression is 'on' we introduce additional 2 bytes 
 in the header of each block image for storing raw_length of the compressed 
 block.
 In order to achieve compression while accounting for these two additional 
 bytes, we must ensure that compressed length is less than original length - 
 2.
 So, IIUC, the above condition should rather be

 if (*len >= orig_len - 2)
 return false;

2 should be replaced with the macro variable indicating the size of
extra header for compressed backup block.

Do we always need extra two bytes for compressed backup block?
ISTM that extra bytes are not necessary when the hole length is zero.
In this case the length of the original backup block (i.e., uncompressed)
must be BLCKSZ, so we don't need to save the original size in
the extra bytes.

Furthermore, when fpw compression is disabled and the hole length
is zero, we seem to be able to save one byte from the header of
backup block. Currently we use 4 bytes for the header, 2 bytes for
the length of backup block, 15 bits for the hole offset and 1 bit for
the flag indicating whether block is compressed or not. But in that case,
the length of backup block doesn't need to be stored because it must
be BLCKSZ. Shouldn't we optimize the header in this way? Thoughts?
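For illustration, the 4-byte header layout described above — 16 bits of block-image length, 15 bits of hole offset, and 1 bit marking compression — could be packed as below. This is a sketch of the bit budget being discussed, not the actual WAL record format code:

```c
#include <assert.h>
#include <stdint.h>

/* Pack a hypothetical backup-block header: high 16 bits = image
 * length, next 15 bits = hole offset, low bit = "compressed" flag.
 * Field order is an assumption for illustration. */
static uint32_t
pack_bkpb_header(uint16_t length, uint16_t hole_offset, int compressed)
{
    assert(hole_offset < (1u << 15));   /* must fit in 15 bits */
    return ((uint32_t) length << 16)
         | ((uint32_t) hole_offset << 1)
         | (compressed ? 1u : 0u);
}
```

The layout makes the trade-off concrete: once the 15-bit hole offset and the flag bit are spent, only 16 bits remain for the length, which is why dropping the length field when it must be BLCKSZ saves real header space.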

+int page_len = BLCKSZ - hole_length;
+char *scratch_buf;
+if (hole_length != 0)
+{
+scratch_buf = compression_scratch;
+memcpy(scratch_buf, page, hole_offset);
+memcpy(scratch_buf + hole_offset,
+   page + (hole_offset + hole_length),
+   BLCKSZ - (hole_length + hole_offset));
+}
+else
+scratch_buf = page;
+
+/* Perform compression of block */
+if (XLogCompressBackupBlock(scratch_buf,
+page_len,
+regbuf->compressed_page,
+&compress_len))
+{
+/* compression is done, add record */
+is_compressed = true;
+}

You can refactor XLogCompressBackupBlock() and move all the
above code to it for more simplicity.

Regards,

-- 
Fujii Masao




Re: [HACKERS] pg_basebackup may fail to send feedbacks.

2015-02-06 Thread Fujii Masao
On Fri, Feb 6, 2015 at 3:22 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 Sorry, I misunderstood that.

  At Wed, 4 Feb 2015 19:22:39 +0900, Fujii Masao masao.fu...@gmail.com 
  wrote in 
  cahgqgwgudgcmnhzinkd37i+jijdkruecrea1ncrs1mmte3r...@mail.gmail.com
  On Wed, Feb 4, 2015 at 4:58 PM, Kyotaro HORIGUCHI
  horiguchi.kyot...@lab.ntt.co.jp wrote:
   I'm very sorry for confused report. The problem found in 9.4.0
   and the diagnosis was mistakenly done on master.
  
   9.4.0 has no problem of feedback delay caused by slow xlog
   receiving on pg_basebackup mentioned in the previous mail. But
   the current master still has this problem.
 
  Seems walreceiver has the same problem. No?
 
  pg_receivexlog.c would have the same problem since it uses the
  same function with pg_basebackup.c.
 
  The correspondent of HandleCopyStream in the walreceiver is
  WalReceiverMain, and it doesn't seem to have the same kind of
  loop shown below. It seems to surely send feedback per one
  record.
 
  |   r = stream_reader();
  |   while (r > 0)
  |   {
  |  ... wal record processing stuff without sending feedback..
  |  r = stream_reader();
  |   }

 WalReceiverMain() has the similar code as follows.

 len = walrcv_receive(NAPTIME_PER_CYCLE, &buf);
 if (len != 0)
 {
 for (;;)
 {
 if (len > 0)
 {
 
 len = walrcv_receive(0, &buf);
 }
 }

 The loop seems a bit different but eventually the same about this
 discussion.

 408 len = walrcv_receive(NAPTIME_PER_CYCLE, &buf);
 409 if (len != 0)
 410 {
 415 for (;;)
 416 {
 417 if (len > 0)
 418 {
 425 XLogWalRcvProcessMsg(buf[0], &buf[1], len - 1);
 426 }
 427-438 else { break; }
 439 len = walrcv_receive(0, &buf);
 440 }
 441 }

 XLogWalRcvProcessMsg doesn't send feedback when processing a
 'w' (WAL record) packet. So the same thing as with pg_basebackup and
 pg_receivexlog will occur in the walreceiver.

 Exiting the for(;;) loop via TimestampDifferenceExceeds just
 before line 439 is a measure equivalent to what I proposed for
 receivelog.c, but calling XLogWalRcvSendHSFeedback(false) there
 is seemingly simpler (though I feel a bit uncomfortable with the
 latter).

I'm concerned about the performance degradation caused by calling
gettimeofday() per record.

 Or other measures?

Introduce the maximum number of records that we can receive and
process between feedbacks? For example, we can change
XLogWalRcvSendHSFeedback() so that it's called per at least
8 records. I'm not sure what number is good, though...
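The "at least every N records" idea could be as simple as a counter checked in the receive loop, avoiding a gettimeofday() call per record. The function name and the threshold of 8 below are illustrative, not from any patch:

```c
#include <assert.h>

/* Send feedback after every N processed records (threshold assumed). */
#define FEEDBACK_EVERY_N_RECORDS 8

static int records_since_feedback = 0;

/* Called once per received record; returns 1 when it is time to send
 * a feedback message, resetting the counter. */
static int
should_send_feedback(void)
{
    if (++records_since_feedback >= FEEDBACK_EVERY_N_RECORDS)
    {
        records_since_feedback = 0;
        return 1;
    }
    return 0;
}
```

The receive loop would then call XLogWalRcvSendHSFeedback(false) only when this returns 1, bounding feedback latency to N records without any extra clock reads.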

Regards,

-- 
Fujii Masao




Re: [HACKERS] Parallel Seq Scan

2015-02-06 Thread Daniel Bausch
Hi David and others!

David Fetter da...@fetter.org writes:

 On Tue, Jan 27, 2015 at 08:02:37AM +0100, Daniel Bausch wrote:
 
 Tom Lane t...@sss.pgh.pa.us writes:
 
  Wait for first IO, issue second IO request
  Compute
  Already have second IO request, issue third
  ...
 
  We'd be a lot less sensitive to IO latency.
 
  It would take about five minutes of coding to prove or disprove this:
  stick a PrefetchBuffer call into heapgetpage() to launch a request for the
  next page as soon as we've read the current one, and then see if that
  makes any obvious performance difference.  I'm not convinced that it will,
  but if it did then we could think about how to make it work for real.
 
 Sorry for dropping in so late...
 
 I have done all this two years ago.  For TPC-H Q8, Q9, Q17, Q20, and Q21
 I see a speedup of ~100% when using IndexScan prefetching + Nested-Loops
 Look-Ahead (the outer loop!).
 (On SSD with 32 Pages Prefetch/Look-Ahead + Cold Page Cache / Small RAM)

 Would you be so kind as to pass along any patches (ideally applicable
 to git master), tests, and specific measurements you made?

Attached find my patches based on the old revision
36f4c7843cf3d201279855ed9a6ebc1deb3c9463
(Adjust cube.out expected output for new test queries.)

I did not test applicability against HEAD by now.

Disclaimer: This was just a proof-of-concept and so is poor
implementation quality.  Nevertheless, performance looked promising
while it still needs a lot of extra rules for special cases, like
detecting accidental sequential scans.  General assumption is: no
concurrency - a single query owning the machine.

Here is a comparison using dbt3.  Q8, Q9, Q17, Q20, and Q21 are
significantly improved.

| |   baseline |  indexscan | indexscan+nestloop |
| || patch 1+2  | patch 3|
|-+++|
| Q1  |  76.124261 |  73.165161 |  76.323119 |
| Q2  |   9.676956 |  11.211073 |  10.480668 |
| Q3  |  36.836417 |  36.268022 |  36.837226 |
| Q4  |  48.707501 |64.2255 |  30.872218 |
| Q5  |  59.371467 |  59.205048 |  58.646096 |
| Q6  |  70.514214 |  73.021006 |   72.64643 |
| Q7  |  63.667594 |  63.258499 |  62.758288 |
| Q8  |  70.640973 |  33.144454 |  32.530732 |
| Q9  | 446.630473 | 379.063773 | 219.926094 |
| Q10 |  49.616125 |  49.244744 |  48.411664 |
| Q11 |   6.122317 |   6.158616 |   6.160189 |
| Q12 |  74.294292 |  87.780442 |  87.533936 |
| Q13 |   32.37932 |  32.771938 |  33.483444 |
| Q14 |  47.836053 |  48.093996 |   47.72221 |
| Q15 | 139.350038 | 138.880208 | 138.681336 |
| Q16 |  12.092429 |  12.120661 |  11.668971 |
| Q17 |   9.346636 |   4.106042 |   4.018951 |
| Q18 |  66.106875 | 123.754111 | 122.623193 |
| Q19 |  22.750504 |  23.191532 |   22.34084 |
| Q20 |  80.481986 |  29.906274 |   28.58106 |
| Q21 | 396.897269 |  355.45988 |  214.44184 |
| Q22 |   6.834841 |   6.600922 |   6.524032 |

Regards,
Daniel
-- 
MSc. Daniel Bausch
Research Assistant (Computer Science)
Technische Universit├Ąt Darmstadt
http://www.dvs.tu-darmstadt.de/staff/dbausch
From 569398929d899100b769abfd919bc3383626ac9f Mon Sep 17 00:00:00 2001
From: Daniel Bausch bau...@dvs.tu-darmstadt.de
Date: Tue, 22 Oct 2013 15:22:25 +0200
Subject: [PATCH 1/4] Quick proof-of-concept for indexscan prefetching

This implements a prefetching queue of tuples whose tid is read ahead.
Their block number is quickly checked for random properties (not current
block and not the block prefetched last).  Random reads are prefetched.
Up to 32 tuples are considered by default.  The tids are queued in a
fixed ring buffer.

The prefetching is implemented in the generic part of the index scan, so
it applies to all access methods.
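The queue arithmetic described above — head/tail indexes with -1 as the empty sentinel, free space computed modulo the ring size — can be sketched in standalone C. This is a simplified model for illustration, not the patch's actual code; the name PREFETCH_COUNT stands in for INDEXSCAN_PREFETCH_COUNT.

```c
#include <assert.h>

#define PREFETCH_COUNT 32       /* illustrative queue capacity */

/* head/tail of -1 mark an empty queue, mirroring the patch's convention */
typedef struct PrefetchQueue
{
    int head;
    int tail;
} PrefetchQueue;

/*
 * Free slots in the ring: everything when empty, otherwise capacity
 * minus the occupied span head..tail (inclusive), taken modulo the
 * ring size to handle wrap-around.
 */
static int
queue_space(const PrefetchQueue *q)
{
    if (q->tail < 0)
        return PREFETCH_COUNT;

    assert(q->head >= 0);

    return (PREFETCH_COUNT - (q->tail - q->head + 1)) % PREFETCH_COUNT;
}
```

For example, with head = 30 and tail = 1 (a wrapped queue occupying slots 30, 31, 0, and 1), queue_space returns 28.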
---
 src/backend/access/index/indexam.c | 96 ++
 src/include/access/relscan.h   | 12 +
 2 files changed, 108 insertions(+)

diff --git a/src/backend/access/index/indexam.c b/src/backend/access/index/indexam.c
index b878155..1c54ef5 100644
--- a/src/backend/access/index/indexam.c
+++ b/src/backend/access/index/indexam.c
@@ -251,6 +251,12 @@ index_beginscan(Relation heapRelation,
 	scan->heapRelation = heapRelation;
 	scan->xs_snapshot = snapshot;
 
+#ifdef USE_PREFETCH
+	scan->xs_prefetch_head = scan->xs_prefetch_tail = -1;
+	scan->xs_last_prefetch = -1;
+	scan->xs_done = false;
+#endif
+
 	return scan;
 }
 
@@ -432,6 +438,55 @@ index_restrpos(IndexScanDesc scan)
 	FunctionCall1(procedure, PointerGetDatum(scan));
 }
 
+static int
+index_prefetch_queue_space(IndexScanDesc scan)
+{
+	if (scan->xs_prefetch_tail < 0)
+		return INDEXSCAN_PREFETCH_COUNT;
+
+	Assert(scan->xs_prefetch_head >= 0);
+
+	return (INDEXSCAN_PREFETCH_COUNT
+			- (scan->xs_prefetch_tail - scan->xs_prefetch_head + 1))
+		% INDEXSCAN_PREFETCH_COUNT;
+}
+
+/* makes copy of ItemPointerData */
+static 

Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-06 Thread Syed, Rahila

In any case, those things have been introduced by what I did in previous 
versions... And attached is a new patch.
Thank you for the feedback.

   /* allocate scratch buffer used for compression of block images */
+  if (compression_scratch == NULL)
+  compression_scratch = MemoryContextAllocZero(xloginsert_cxt,
+  
 BLCKSZ);
 }
The compression patch can use the latest interface MemoryContextAllocExtended
to proceed without compression when sufficient memory is not available for the
scratch buffer.
The attached patch introduces an OutOfMem flag, which is set when
MemoryContextAllocExtended returns NULL.

Thank you,
Rahila Syed

-Original Message-
From: Michael Paquier [mailto:michael.paqu...@gmail.com] 
Sent: Friday, February 06, 2015 12:46 AM
To: Syed, Rahila
Cc: PostgreSQL mailing lists
Subject: Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

On Thu, Feb 5, 2015 at 11:06 PM, Syed, Rahila rahila.s...@nttdata.com wrote:
/*
+* We recheck the actual size even if pglz_compress() report success,
+* because it might be satisfied with having saved as little as one byte
+* in the compressed data.
+*/
+   *len = (uint16) compressed_len;
+   if (*len >= orig_len - 1)
+   return false;
+   return true;
+}

 As per latest code ,when compression is 'on' we introduce additional 2 bytes 
 in the header of each block image for storing raw_length of the compressed 
 block.
 In order to achieve compression while accounting for these two additional 
 bytes, we must ensure that compressed length is less than original length - 2.
 So , IIUC the above condition should rather be

 if (*len >= orig_len - 2)
 return false;
 return true;
 The attached patch contains this. It also has a cosmetic change: renaming
 compressBuf to uncompressBuf, as it is used to store the uncompressed page.

Agreed on both things.
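The acceptance rule agreed on above can be sketched as a standalone C function (an illustrative model, not the patch's exact code): storing the compressed image costs two extra header bytes for raw_length, so compression only pays off when the compressed length is at least three bytes smaller than the original, i.e. we reject when compressed_len >= orig_len - 2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Sketch of the acceptance test: reject the compressed image unless it
 * saves more than the 2-byte raw_length overhead added to the header.
 */
static bool
compression_is_worthwhile(uint16_t compressed_len, uint16_t orig_len)
{
    if (compressed_len >= orig_len - 2)
        return false;
    return true;
}
```

With BLCKSZ = 8192 and no hole, a compressed length of 8189 is accepted while 8190 is rejected, since 8190 bytes of data plus 2 header bytes saves nothing.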

Just looking at your latest patch after some time to let it cool down, I 
noticed a couple of things.

 #define MaxSizeOfXLogRecordBlockHeader \
 (SizeOfXLogRecordBlockHeader + \
- SizeOfXLogRecordBlockImageHeader + \
+ SizeOfXLogRecordBlockImageHeader, \
+ SizeOfXLogRecordBlockImageCompressionInfo + \
There is a comma here instead of a sum sign. We should really sum up all those 
sizes to evaluate the maximum size of a block header.

+ * Permanently allocate readBuf uncompressBuf.  We do it this way,
+ * rather than just making a static array, for two reasons:
This comment is a bit weird; readBuf AND uncompressBuf is more appropriate.

+ * We recheck the actual size even if pglz_compress() report success,
+ * because it might be satisfied with having saved as little as one byte
+ * in the compressed data. We add two bytes to store raw_length with the
+ * compressed image. So for compression to be effective
compressed_len should
+ * be atleast < orig_len - 2
This comment block should be reworked, and misses a dot at its end. I rewrote 
it like that, hopefully that's clearer:
+   /*
+* We recheck the actual size even if pglz_compress() reports
success and see
+* if at least 2 bytes of length have been saved, as this
corresponds to the
+* additional amount of data stored in WAL record for a compressed block
+* via raw_length.
+*/

In any case, those things have been introduced by what I did in previous 
versions... And attached is a new patch.
--
Michael


Support-compression-for-full-page-writes-in-WAL_v17.patch
Description: Support-compression-for-full-page-writes-in-WAL_v17.patch

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-02-06 Thread Amit Kapila
On Thu, Jan 22, 2015 at 10:30 AM, Amit Kapila amit.kapil...@gmail.com
wrote:

 On Thu, Jan 22, 2015 at 6:37 AM, Kouhei Kaigai kai...@ak.jp.nec.com
wrote:
 
  (Please point out me if my understanding is incorrect.)
 
  What happen if dynamic background worker process tries to reference
temporary
  tables? Because buffer of temporary table blocks are allocated on
private
  address space, its recent status is not visible to other process unless
it is
  not flushed to the storage every time.
 
  Do we need to prohibit create_parallelscan_paths() to generate a path
when
  target relation is temporary one?
 

 Yes, we need to prohibit parallel scans on temporary relations.  Will fix.


Here is the latest patch which fixes reported issues and supported
Prepared Statements and Explain Statement for parallel sequential
scan.

The main purpose is to get feedback, if possible, on the overall
structure/design of the code before I go ahead.

Note -
a. it is still based on Robert's parallel-mode-v1 patch [1].
b. based on CommitId fd496129 [on top of this commit, apply
Robert's patch and then the attached patch].
c. just built and tested on Windows; my Linux box has some
problems, which I will fix soon and then verify this on Linux as well.

[1]
http://www.postgresql.org/message-id/ca+tgmozduk4k3xhbxc9vm-82khourezdvqwtfglhwsd2r2a...@mail.gmail.com

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


parallel_seqscan_v6.patch
Description: Binary data



Re: [HACKERS] ExplainModifyTarget doesn't work as expected

2015-02-06 Thread Ashutosh Bapat
Well, let's see what others think. Also, we might want to separate that
information under a result relations heading, probably.

On Fri, Feb 6, 2015 at 1:35 PM, Etsuro Fujita fujita.ets...@lab.ntt.co.jp
wrote:

 Hi Ashutosh,

 Thank you for the review!


 On 2015/02/03 15:32, Ashutosh Bapat wrote:

 I agree that it's a problem, and it looks more severe when there are
 multiple children
 postgres=# create table parent (a int check (a > 0) no inherit);
 CREATE TABLE
 postgres=# create table child1 (a int check (a <= 0));
 CREATE TABLE
 postgres=# create table child2 (a int check (a <= 0));
 CREATE TABLE
 postgres=# create table child3 (a int check (a <= 0));
 CREATE TABLE
 postgres=# alter table child1 inherit parent;
 ALTER TABLE
 postgres=# alter table child2 inherit parent;
 ALTER TABLE
 postgres=# alter table child3 inherit parent;
 ALTER TABLE
 postgres=# explain update parent set a = a * 2 where a <= 0;
 QUERY PLAN
 ------------------------------------------------------------
   Update on child1  (cost=0.00..126.00 rows=2400 width=10)
 ->  Seq Scan on child1  (cost=0.00..42.00 rows=800 width=10)
   Filter: (a <= 0)
 ->  Seq Scan on child2  (cost=0.00..42.00 rows=800 width=10)
   Filter: (a <= 0)
 ->  Seq Scan on child3  (cost=0.00..42.00 rows=800 width=10)
   Filter: (a <= 0)
 (7 rows)

 It's certainly confusing why an update on child1 would cause a scan on
 child*.


 Yeah, I think so too.

 But I also think that showing parent's name with Update would be
 misleading esp. when user expects it to get filtered because of
 constraint exclusion.

 Instead, can we show all the relations that are being modified e.g
 Update on child1, child2, child3. That will disambiguate everything.


 That's an idea, but my concern about that is the cases where there are a
 large number of child tables as the EXPLAIN would be difficult to read in
 such cases.

 Best regards,
 Etsuro Fujita




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Simplify sleeping while reading/writing from client

2015-02-06 Thread Andres Freund
On 2015-02-05 16:45:50 +0200, Heikki Linnakangas wrote:
 Looking again at the code after Andres' interrupt-handling patch series, I
 got confused by the fact that there are several wait-retry loops in
 different layers, and reading and writing works slightly differently.

They don't really work all that differently?

 I propose the attached, which pulls all the wait-retry logic up to
 secure_read() and secure_write(). This makes the code a lot more
 understandable.

Generally a good idea. Especially if we get more ssl implementations.

But I think it'll make the already broken renegotiation handling even
worse. Because now we're always entering do_handshake in nonblocking
mode (essentially).

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] New CF app deployment

2015-02-06 Thread Andres Freund
On 2015-02-06 11:51:50 +0100, Magnus Hagander wrote:
 So in an attempt to actually move this forward in a constructive way I'm
going to ignore a bunch of what happened after this email, and fork the
 discussion at this point.

Sounds good.

 First of all - assuming we'lI fix this particular thing. Are there *other*
 things that you would also say makes the tool essentially unusable?

I personally don't have any really fundamental complaints but this
one. I'd like to be able to add some CF app only annotations to the
whole patch, but that's not that important.

 What kind of annotation are we actually talking about here?

Well, to explain where, at least I, am coming from, let's have a look at
an example:
https://commitfest.postgresql.org/4/17/

a) The linked messages aren't going to help either the author, a
   committer or prospective reviewers to find existing reviews. And
   looking through the whole thread isn't realistic at that message
   count. And all of them need to.
b) The 'latest attachment' isn't actually the patch that's supposed to
   be reviewed. The first patch is completely outdated. So one would
   need to grovel through all the linked patches.
c) By just looking at the CF entry I have no clue what the status of the
   patch actually is. Sure it Needs review, but that can mean minor
   changes or major reworks. In the old app the comments to the patches
   and reviews could say things like major issues identified,
   significant work needed, some cosmetic issues remain, new version
   of patch with major redesign or rebased patch without other changes.

I don't think any of these questions can be answered by improved
automatisms alone.

This really doesn't matter much on a 5-10 message thread about a simple
patch. But we regularly have thread with hundreds of messages.


Makes sense?


 Are you looking for something that's basically star a message (like
 you'd do in a MUA in a lot of cases)? Or assign tag (from a set of -
 editable of course - pre-existing tags). Or a complete free-text
 field?

I think it has to be free form. If you look at the above, it's hard to
do that sanely with tags.

 Then, what do you need to be able to annotate on?

I think both individual messages and the CF entry itself, without an
email, make sense.

 Right now the app only shows messages that actually have attachments on
 them, assuming that you'd want to read the whole thread if you want to look
 at other things.

I don't think anybody can realistically do that on more complex
patches. Often the discussion in the beginning has little bearing on
later state, so even if you had the time to do so, it'd better be spent
doing something else.

 Are you saying you want to be able to attach an annotation to *any*
 message in the thread?

Yes.

 If so, we'd have to pull in and show every single message on the
 thread in the UI - I'm thinking that might become more than a little
 cluttered in a lot of cases.

Well, we would only need to show commented-on messages in the UI when
viewing the entry.

 If that is what you're talking about, any
 suggestions for how to actually make that into an UI that's not that
 cluttered?

The simplest seems to be to simply allow comments using the message id
:(. I've no really good idea how to allow to select the message to
comment on. Even if we find something else, it should probably be at
least an option - it certainly is quicker for me to grab the message id
of an email I just read or wrote and paste it than pretty much any other
method.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Patch: add recovery_timeout option to control timeout of restore_command nonzero status code

2015-02-06 Thread Fujii Masao
On Fri, Feb 6, 2015 at 1:23 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Feb 5, 2015 at 11:58 PM, Michael Paquier wrote:
 An updated patch is attached.
 I just noticed that the patch I sent was incorrect:
 - Parameter name was still wal_availability_check_interval and not
 wal_retrieve_retry_interval
 - Documentation was incorrect.
 Please use the patch attached instead for further review.

Thanks for updating the patch!

timestamp.c:1708: warning: implicit declaration of function
'HandleStartupProcInterrupts'

I got the above warning at the compilation.

+pg_usleep(wait_time);
+HandleStartupProcInterrupts();
+total_time -= wait_time;

It seems strange to call the startup-process-specific function in
the generic function like TimestampSleepDifference().

 * 5. Sleep 5 seconds, and loop back to 1.

In WaitForWALToBecomeAvailable(), the above comment should
be updated.

- * Wait for more WAL to arrive. Time out after 5 seconds,
- * like when polling the archive, to react to a trigger
- * file promptly.
+ * Wait for more WAL to arrive. Time out after
the amount of
+ * time specified by wal_retrieve_retry_interval, like
+ * when polling the archive, to react to a
trigger file promptly.
  */
 WaitLatch(&XLogCtl->recoveryWakeupLatch,
   WL_LATCH_SET | WL_TIMEOUT,
-  5000L);
+  wal_retrieve_retry_interval * 1000L);

This change can prevent the startup process from reacting to
a trigger file. Imagine the case where the large interval is set
and the user want to promote the standby by using the trigger file
instead of pg_ctl promote. I think that the sleep time should be 5s
if the interval is set to more than 5s. Thought?
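The capping suggested here can be sketched as a small standalone helper (illustrative only — this is the proposal under discussion, not committed behavior; the parameter name comes from the patch):

```c
#include <assert.h>

/* React to a promotion trigger file at least every 5 seconds. */
#define MAX_TRIGGER_POLL_MS 5000L

/*
 * Sleep for wal_retrieve_retry_interval between WAL retrieval attempts,
 * but never longer than 5s, so a trigger file is still noticed promptly
 * even when a large interval is configured.
 */
static long
standby_sleep_ms(int wal_retrieve_retry_interval_s)
{
    long requested = wal_retrieve_retry_interval_s * 1000L;

    return requested < MAX_TRIGGER_POLL_MS ? requested : MAX_TRIGGER_POLL_MS;
}
```

So an interval of 2s sleeps 2000ms per iteration, while an interval of 30s still wakes up every 5000ms to check for the trigger file.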

+# specifies an optional internal to wait for WAL to become available when

s/internal/interval?

+This parameter specifies the amount of time to wait when a failure
+occurred when reading WAL from a source (be it via streaming
+replication, local filenamepg_xlog/ or WAL archive) for a node
+in standby mode, or when WAL is expected from a source.

At least to me, it seems not easy to understand what the parameter is
from this description. What about the following, for example?

This parameter specifies the amount of time to wait when WAL is not
available from any sources (streaming replication, local
filenamepg_xlog/ or WAL archive) before retrying to retrieve WAL

Isn't it better to place this parameter in postgresql.conf rather than
recovery.conf? I'd like to change the value of this parameter without
restarting the server.

Regards,

-- 
Fujii Masao




Re: [HACKERS] ExplainModifyTarget doesn't work as expected

2015-02-06 Thread Tom Lane
Etsuro Fujita fujita.ets...@lab.ntt.co.jp writes:
 On 2015/02/03 15:32, Ashutosh Bapat wrote:
 Instead, can we show all the relations that are being modified e.g
 Update on child1, child2, child3. That will disambiguate everything.

 That's an idea, but my concern about that is the cases where there are a 
 large number of child tables as the EXPLAIN would be difficult to read 
 in such cases.

I concur, that would *not* be an improvement in readability.  Moreover,
I don't think it really fixes the issue: what we want to show is a table
name in Modify that matches what the user wrote in the query.  Given that
context, the user should be able to understand why some tables are listed
below that and others are not.

IIRC, this code was written at a time when we didn't have NO INHERIT check
constraints and so it was impossible for the parent table to get optimized
away while leaving children.  So the comment in ExplainModifyTarget was
good at the time.  But it no longer is.

I think your basic idea of preserving the original parent table's relid
is correct; but instead of doing it like this patch does, I'd be inclined
to make ModifyTable inherit from Scan not Plan, and use the scan.scanrelid
field to carry the parent RTI.  Then you would probably end up with a net
savings of code rather than net addition; certainly ExplainModifyTarget
would go away entirely since you'd just treat ModifyTable like any other
Scan in this part of EXPLAIN.
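Tom's suggestion can be illustrated with a toy model of the planner node layout (these are simplified stand-in structs, not the real PostgreSQL definitions, and the field set is invented for illustration):

```c
#include <assert.h>

/* Toy versions of the planner node hierarchy; real fields omitted. */
typedef struct Plan
{
    double startup_cost;
    double total_cost;
} Plan;

typedef struct Scan
{
    Plan         plan;
    unsigned int scanrelid;    /* range-table index of the scanned relation */
} Scan;

/*
 * If ModifyTable embedded Scan rather than Plan, scan.scanrelid could
 * carry the original parent's RTI, and EXPLAIN could print the target
 * name the same way it does for any other Scan node.
 */
typedef struct ModifyTable
{
    Scan scan;                 /* scan.scanrelid = original parent RTI */
    int  operation;            /* INSERT/UPDATE/DELETE, illustrative */
} ModifyTable;
```

Because Scan is the first member, generic EXPLAIN code that handles `Scan *` nodes would pick up the parent RTI from a ModifyTable without a special-case function like ExplainModifyTarget.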

regards, tom lane




Re: [HACKERS] New CF app deployment

2015-02-06 Thread Robert Haas
On Fri, Feb 6, 2015 at 5:51 AM, Magnus Hagander mag...@hagander.net wrote:
 So in an attempt to actually move this forward in a constructive way I'm
 going to ignore a bunch of what happened after this email, and fork the
 discussion at this point.

Thanks, and I probably owe you an apology for some of that, so, sorry
about that.

I think the core of the problem here is that the old application saw
its goal in life as *summarizing* the thread.  The idea is that people
would go in and add comments (which could be flagged as comment,
patch, or review) pointing to particularly important messages in the
discussion.  The problem with this is that it had to be manually
updated, and some people didn't like that.[1]  The new app attaches
the entire thread, which has the advantage that everything is always
there.  The problem with that is that the unimportant stuff is there,
too, and there's no way to mark the important stuff so that you can
distinguish between that and the unimportant stuff.  I think that's
the problem we need to solve.

One thing that would probably *help* is if the list of attachments
mentioned the names of the files that were attached to each message
rather than just noting that they have some kind of attachment.  If
people name their attachments sensibly, then you'll be able to
distinguish parallel-seqscan-v23.patch from
test-case-that-breaks-parallel-seqscan.sql, and that would be nice.
But that doesn't help with, say, distinguishing useful reviews from
general discussion on the thread.  I don't think there's any way to do
that in an automated way, so it's got to be something that somebody
does manually.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] My personal opinion is that this complaint was misguided.  I have
yet to see an issue tracker that didn't require scads of manual work
to keep the information in it relevant, and ours had a higher
signal-to-noise ratio than many I've had to use professionally.  That
having been said, I understand the frustration.




Re: [HACKERS] pg_basebackup, tablespace mapping and path canonicalization

2015-02-06 Thread Robert Haas
On Thu, Feb 5, 2015 at 10:21 PM, Ian Barwick i...@2ndquadrant.com wrote:
 I stumbled on what appears to be inconsistent handling of double slashes
 in tablespace paths when using pg_basebackup with the -T/--tablespace-mapping
 option:

 ibarwick:postgresql (master)$ mkdir /tmp//foo-old
[...]
 The attached patch adds the missing canonicalization; I can't see any
 reason not to do this. Thoughts?

Seems OK to me.  Anyone think differently?
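For illustration, here is a toy sketch of one thing path canonicalization does — collapsing runs of '/' so that /tmp//foo-old and /tmp/foo-old compare equal. This is not PostgreSQL's real canonicalize_path(), which also handles trailing slashes, "." and ".." components, and Windows separators.

```c
#include <assert.h>
#include <string.h>

/* Collapse consecutive '/' characters in place. */
static void
collapse_slashes(char *path)
{
    char   *src = path;
    char   *dst = path;

    while (*src)
    {
        *dst++ = *src;          /* keep the current character */
        if (*src == '/')
        {
            while (*src == '/') /* skip the whole run of slashes */
                src++;
        }
        else
            src++;
    }
    *dst = '\0';
}
```

After this normalization, comparing a user-supplied -T old-dir against the tablespace path recorded by the server no longer depends on how many slashes the user typed.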

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-06 Thread Michael Paquier
On Fri, Feb 6, 2015 at 4:30 PM, Michael Paquier wrote:
 On Fri, Feb 6, 2015 at 3:03 PM, Fujii Masao wrote:
 Do we always need extra two bytes for compressed backup block?
 ISTM that extra bytes are not necessary when the hole length is zero.
 In this case the length of the original backup block (i.e., uncompressed)
 must be BLCKSZ, so we don't need to save the original size in
 the extra bytes.

 Yes, we would need an additional bit to identify that. We could steal
 it from length in XLogRecordBlockImageHeader.

 Furthermore, when fpw compression is disabled and the hole length
 is zero, we seem to be able to save one byte from the header of
 backup block. Currently we use 4 bytes for the header, 2 bytes for
 the length of backup block, 15 bits for the hole offset and 1 bit for
 the flag indicating whether block is compressed or not. But in that case,
 the length of backup block doesn't need to be stored because it must
 be BLCKSZ. Shouldn't we optimize the header in this way? Thought?

 If we do it, that's something to tackle even before this patch on
 HEAD, because you could use the 16th bit of the first 2 bytes of
 XLogRecordBlockImageHeader to do necessary sanity checks, to actually
 not reduce record by 1 byte, but 2 bytes as hole-related data is not
 necessary. I imagine that a patch optimizing that wouldn't be that
 hard to write as well.

Actually, as Heikki pointed out to me... A block image is 8k and pages
without holes are rare, so it may not be worth sacrificing code
simplicity for a record-size reduction on the order of 0.1% or so,
and the current patch is light because it keeps things simple.
-- 
Michael




Re: [HACKERS] New CF app deployment

2015-02-06 Thread Robert Haas
On Fri, Feb 6, 2015 at 5:55 AM, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Jan 26, 2015 at 10:29 PM, Robert Haas robertmh...@gmail.com wrote:
 (While I'm complaining, the links only go to the flat version of the
 thread, while I happen to prefer the version that shows one message at
 a time with a message-ID selector to switch messages.)

 Then you're clicking the wrong link :) As long as you click the link for the
 message (first or last) you get the regular view. The thread link always
 goes to the same message as the first one, except the flat view. So you
 have a choice of both (and these links are both always visible, none of them
 is in the collapsed area which holds the attachments)

OK.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-06 Thread Michael Paquier
On Fri, Feb 6, 2015 at 3:03 PM, Fujii Masao wrote:
 Do we always need extra two bytes for compressed backup block?
 ISTM that extra bytes are not necessary when the hole length is zero.
 In this case the length of the original backup block (i.e., uncompressed)
 must be BLCKSZ, so we don't need to save the original size in
 the extra bytes.

Yes, we would need an additional bit to identify that. We could steal
it from length in XLogRecordBlockImageHeader.

 Furthermore, when fpw compression is disabled and the hole length
 is zero, we seem to be able to save one byte from the header of
 backup block. Currently we use 4 bytes for the header, 2 bytes for
 the length of backup block, 15 bits for the hole offset and 1 bit for
 the flag indicating whether block is compressed or not. But in that case,
 the length of backup block doesn't need to be stored because it must
 be BLCKSZ. Shouldn't we optimize the header in this way? Thought?

If we do it, that's something to tackle even before this patch on
HEAD, because you could use the 16th bit of the first 2 bytes of
XLogRecordBlockImageHeader to do necessary sanity checks, to actually
not reduce record by 1 byte, but 2 bytes as hole-related data is not
necessary. I imagine that a patch optimizing that wouldn't be that
hard to write as well.
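The bit-stealing idea can be sketched with portable shift/mask arithmetic — illustrative only; the real XLogRecordBlockImageHeader layout and flag names differ:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack a 15-bit hole_offset and a 1-bit "is compressed" flag into the
 * 16 bits discussed above.  A second flag bit could likewise be stolen
 * from the 16-bit length field to mean "length is implicitly BLCKSZ",
 * letting the hole/length bytes be dropped when there is no hole.
 */
static uint16_t
pack_hole_info(uint16_t hole_offset, int is_compressed)
{
    assert(hole_offset < (1u << 15));   /* must fit in 15 bits */
    return (uint16_t) ((hole_offset << 1) | (is_compressed ? 1 : 0));
}

static uint16_t
unpack_hole_offset(uint16_t packed)
{
    return (uint16_t) (packed >> 1);
}

static int
unpack_is_compressed(uint16_t packed)
{
    return packed & 1;
}
```

The round trip is lossless for any 15-bit offset, which is enough since BLCKSZ is at most 32k in standard builds.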

 +int page_len = BLCKSZ - hole_length;
 +char *scratch_buf;
 +if (hole_length != 0)
 +{
 +scratch_buf = compression_scratch;
 +memcpy(scratch_buf, page, hole_offset);
 +memcpy(scratch_buf + hole_offset,
 +   page + (hole_offset + hole_length),
 +   BLCKSZ - (hole_length + hole_offset));
 +}
 +else
 +scratch_buf = page;
 +
 +/* Perform compression of block */
 +if (XLogCompressBackupBlock(scratch_buf,
 +page_len,
 +regbuf->compressed_page,
 +&compress_len))
 +{
 +/* compression is done, add record */
 +is_compressed = true;
 +}

 You can refactor XLogCompressBackupBlock() and move all the
 above code to it for more simplicity.

Sure.
-- 
Michael




Re: [HACKERS] Possible problem with pgcrypto

2015-02-06 Thread Jan Wieck

On 02/05/2015 02:15 PM, Jan Wieck wrote:

On 02/05/2015 01:18 PM, Marko Tiikkaja wrote:



  pgcrypto bug


That doesn't look too good, but I can't reproduce it against 9.3.6 either.


Attached is an improved script and the final output from it. I ran it
overnight and it did not reproduce the pgcrypto bug message, but
instead produced the expected Corrupt data plus one error that I hadn't
seen before, which is Unsupported compression algorithm.



Jan

--
Jan Wieck
Senior Software Engineer
http://slony.info


pgcrypto_test2.sh
Description: application/shellscript
select * from pgp_test2 order by 1;
 id | vdata | vkey | verrmsg
----+-------+------+---------
  1 | \xc30d040703027123de71fa32937175d23a01cc0377628c3b58119e4c8e51804b74cccf961c8b500b8a283db7084ed809833f9bfd7827b70cf06aa0254707e1c08b3db8419e6e2eda697637 | key_1   | ERROR:  Wrong key or corrupt data
  2 | \xc30d040703027123de71fa32937175d23a01cc0377628c3b58119e4c8e51804b74cccf961c8b500b8a283db7084ed809833f9bfd7827b70cf06aa0254707e1c08b3db8419e6e2eda697637 | key_686 | ERROR:  Not text data
  3 | \xc30d04070302b9688f5b4b1db6bb77d23a01ee1cad33770344503c564496fb6463d9b11a30013424fcae8b80cfbafba415ba3c7047aec7499b8e74254069b29390990c0f2a34740a7d2085 | key_972 | ERROR:  Unsupported compression algorithm
  4 | \xc30d04070302c011ec7879402f4178d23a0165d2462bb4e6a7a204f8af20fc0cfadfc72f6a55902f013479697f316659bbc028ddfe624578e462aee3279b65f3e66f0993305d378ae35593 | key_816 | ERROR:  Corrupt data
(4 rows)



Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE} 2.0

2015-02-06 Thread Thom Brown
On 29 January 2015 at 23:38, Peter Geoghegan p...@heroku.com wrote:

 On Sat, Jan 17, 2015 at 6:48 PM, Peter Geoghegan p...@heroku.com wrote:
  I continued with this since posting V2.0.

 Attached version (V2.1) fixes bit-rot caused by the recent changes by
 Stephen (Fix column-privilege leak in error-message paths). More
 precisely, it is rebased on top of today's 17792b commit.


Patch 0002 no longer applies due to a conflict in
src/backend/executor/execUtils.c.

Thom


Re: [HACKERS] RangeType internal use

2015-02-06 Thread Tom Lane
Amit Langote langote_amit...@lab.ntt.co.jp writes:
 I wonder why I cannot find a way to get a range type for a given (sub-)
 type. I would like to build a RangeType from Datum's of lower and upper
 bounds. Much like how construct_array() builds an ArrayType from a Datum
 array of elements given elements' type info.

 Is there some way I do not seem to know? If not, would it be worthwhile
 to make something like construct_range() that returns a RangeType given
 Datum's of lower and upper bounds and subtype info?

There is no good reason to assume that a range type exists at all, much
less that it is unique for a subtype.  (Read the CREATE TYPE documentation
if you're unclear as to why not.)  You have not said what you're trying to
accomplish, but this seems like a dead end.

regards, tom lane




Re: [HACKERS] [REVIEW] Re: Compression of full-page-writes

2015-02-06 Thread Michael Paquier
On Fri, Feb 6, 2015 at 6:35 PM, Syed, Rahila wrote:
 The compression patch can  use the latest interface 
 MemoryContextAllocExtended to proceed without compression when sufficient 
 memory is not available for
 scratch buffer.
 The attached patch introduces OutOfMem flag which is set on when 
 MemoryContextAllocExtended returns NULL .

TBH, I don't think that brings much, as this allocation is done once
and the process would surely fail before reaching the first code path
doing a WAL record insertion. In any case, OutOfMem is useless; you
could simply check if compression_scratch is NULL when assembling a
record.
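Michael's point — no separate flag, the pointer itself is the state — can be sketched like this, using plain malloc as a stand-in for MemoryContextAllocExtended with a no-OOM flag (illustrative only; the real PostgreSQL allocator API differs):

```c
#include <stdbool.h>
#include <stdlib.h>

#define BLCKSZ 8192

static char *compression_scratch = NULL;

/* One-time setup: a NULL result simply means "don't compress later". */
static void
init_scratch_buffer(void)
{
    if (compression_scratch == NULL)
        compression_scratch = malloc(BLCKSZ);   /* may be NULL on OOM */
}

/* When assembling a record, test the pointer instead of a flag. */
static bool
can_compress_block(void)
{
    return compression_scratch != NULL;
}
```

This keeps the out-of-memory fallback to a single NULL check at record-assembly time, with no extra state to keep in sync.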
-- 
Michael

