Re: [HACKERS] CREATE INDEX CONCURRENTLY?

2014-11-01 Thread Simon Riggs
On 31 October 2014 17:46, Michael Banck michael.ba...@credativ.de wrote:

 I wonder whether that is pilot error (fair enough), or whether something
 could be done about this?

When originally written the constraints were tighter, but have since
been relaxed.

Even so, a CIC waits until all snapshots that can see it have gone. So
what you observe is correct and known.


Can it be changed? Maybe.

CREATE INDEX gets around the wait by using indcheckxmin to see whether
the row is usable. So the command completes, even if the index is not
usable by all current sessions.

We perform the wait in a completely different way for CIC, for this
reason (from the comments):

  We also need not set indcheckxmin during a concurrent index build,
  because we won't set indisvalid true until all transactions that care
  about the broken HOT chains are gone.

Reading that again, I can't see why we do it that way. If CREATE INDEX
can exit once the index is built, so could CONCURRENTLY.

ISTM that we could turn indcheckxmin into an Xid, not a boolean:
   For CREATE INDEX, set the indcheckxmin = xid of the creating transaction
   For CREATE INDEX CONCURRENTLY, set the indcheckxmin = xid of the
completing transaction

The apparent reason it does this is that the Xmin value used currently
is the Xmin of the index row. The index row is inserted prior to the
index being valid so that technique cannot work. So I am suggesting
for CIC that we use the xid of the transaction that completes the
index, not the xid that originally created the index row. Plus handle
the difference between valid and not.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Amit Langote
Hi,

On Sat, Nov 1, 2014 at 1:21 PM, Eric Ridge eeb...@gmail.com wrote:
 On Fri, Oct 31, 2014 at 6:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I don't know if/when that will happen as such, but Simon was making noises
 about writing code to treat hash indexes as unlogged automatically, which
 would more or less fix the worst risks.  That's not just a special case
 for hash indexes, but any index AM that lacks WAL support, as third-party
 AMs might well do.


 As someone writing a 3rd-party AM, literally right this moment, do you have
 a link to that thread?  While I follow this list fairly closely I don't
 remember seeing this.  I'd love to understand the thoughts around handling
 extension-based AMs.


You are looking for this:

http://www.postgresql.org/message-id/capphfdsxwzmojm6dx+tjnpyk27kt4o7ri6x_4oswcbyu1rm...@mail.gmail.com

Thanks,
Amit




Re: [HACKERS] Patch: Add launchd Support

2014-11-01 Thread Peter Eisentraut
On 10/20/14 8:53 PM, Wim Lewis wrote:
 Apple has published their changes to Postgres (since they ship it in recent 
 versions of OSX) here, fwiw, including the launchd plist they use: 
 http://www.opensource.apple.com/source/PostgreSQL/
 
 One thing I noticed is that Apple also used the label “org.postgres.postgres” 
 for their launchd job. I don’t know if that will collide in some way with a 
 second job with the same label. Launchctl load/unload takes a pathname, not a 
 job label, so I don’t think it’d be a problem unless you actually do want to 
 run both copies of postgres at the same time.

I have a file /System/Library/LaunchAgents/com.apple.postgres.plist that
uses the label com.apple.postgres instead.

 MacPorts also has a launchd job for their postgresql port, which invokes 
 daemondo, which invokes a wrapper script, which invokes postgres. I’m not 
 sure why they did it that way.

Probably because it's apparently really difficult to do it in a
different way.

 2) AFAICS, this .plist file doesn't do anything about launchd's habit of not 
 waiting for the network to come up. 
 
 Have you experimented with this setting?:
 
<key>KeepAlive</key>
<dict><key>NetworkState</key><true/></dict>
 
 The launchd.plist man page claims that if you set that key in the 
 sub-dictionary:
 If true, the job will be kept alive as long as the network is up, where "up" 
 is defined as at least one non-loopback interface being up and having IPv4 
 or IPv6 addresses assigned to them.  If false, the job will be kept alive in 
 the inverse condition.

My launchd.plist man page says:

   NetworkState <boolean>
   This key is no longer implemented as it never acted how most users
   expected.





Re: [HACKERS] tracking commit timestamps

2014-11-01 Thread Petr Jelinek

Hi,

thanks for review.

On 01/11/14 05:45, Michael Paquier wrote:

Now here are a couple of comments at code level, this code seems not
enough baked for a commit:
1) The following renaming should be done:
- pg_get_transaction_committime to pg_get_transaction_commit_time
- pg_get_transaction_extradata to pg_get_transaction_extra_data
- pg_get_transaction_committime_data to pg_get_transaction_commit_time_data
- pg_get_latest_transaction_committime_data to
pg_get_latest_transaction_commit_time_data


Makes sense.


3) General remark: committs is not a suitable name IMO (ts for
transaction??). What if the code is changed to use commit_time instead?
This remark counts as well for the file names committs.c and committs.h,
and for pg_committs.


The ts is for timestamp; tx would be the shorthand for transaction. Looking 
at your remarks, it seems there is some general inconsistency with time 
vs timestamp in this patch; we should pick one and stick with it.



6) To be consistent with everything, shouldn't track_commit_timestamp be
renamed to track_commit_time


(see above)


9) Hm?! oldest commit time xid, no?
-   "oldest running xid %u; %s",
+   "oldest CommitTs xid: %u; oldest running xid %u; %s",


Again, timestamp vs time.


12) In committs_desc@committsdesc.c, isn't this block overthinking a bit:
+   else
+   appendStringInfo(buf, "UNKNOWN");
It may be better to remove it, no?


Should be safe, indeed.


13) Isn't that 2014?
+ * Portions Copyright (c) 1996-2013, PostgreSQL Global Development Group


Hah, I forgot to update that (shows how long this patch has been waiting 
:) )



16) make installcheck (test committs_off) fails on a server that has
track_committs set to on. You should use an alternate output. I would


Well, it is supposed to fail; that's the whole point: the output should 
be different depending on the value of the GUC.



recommend removing as well the _off suffix in the test name. Let's use
commit_time. Also, it should be mentioned in parallel_schedule with a
comment that this test should always run alone and never in parallel
with other tests. Honestly, I also think that test_committs brings no
additional value and results in duplication code between
src/test/regress and contrib/test_committs. So I'd just rip it off. On


Those tests are different though: one tests that the default (off) works 
as expected; the contrib one tests that the feature, when turned on, works 
as expected. Since we can only set config values for contrib tests, I 
don't see how else to do this, but I am open to suggestions.



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-01 Thread Petr Jelinek

On 01/11/14 12:19, Petr Jelinek wrote:

Hi,

thanks for review.

On 01/11/14 05:45, Michael Paquier wrote:

Now here are a couple of comments at code level, this code seems not
enough baked for a commit:
1) The following renaming should be done:
- pg_get_transaction_committime to pg_get_transaction_commit_time
- pg_get_transaction_extradata to pg_get_transaction_extra_data
- pg_get_transaction_committime_data to
pg_get_transaction_commit_time_data
- pg_get_latest_transaction_committime_data to
pg_get_latest_transaction_commit_time_data


Makes sense.



On second thought, maybe those should be pg_get_transaction_committs, 
pg_get_transaction_committs_data, etc.


For me the commit time thing feels problematic in the way I perceive it - 
I see commit time as a point in time, whereas I see commit timestamp (or 
committs for short) as something that can be recorded. So I would prefer to 
stick with commit timestamp/committs.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-01 Thread Michael Paquier
On Sat, Nov 1, 2014 at 9:04 PM, Petr Jelinek p...@2ndquadrant.com wrote:

 On 01/11/14 12:19, Petr Jelinek wrote:

 Hi,

 thanks for review.

 On 01/11/14 05:45, Michael Paquier wrote:

 Now here are a couple of comments at code level, this code seems not
 enough baked for a commit:
 1) The following renaming should be done:
 - pg_get_transaction_committime to pg_get_transaction_commit_time
 - pg_get_transaction_extradata to pg_get_transaction_extra_data
 - pg_get_transaction_committime_data to
 pg_get_transaction_commit_time_data
 - pg_get_latest_transaction_committime_data to
 pg_get_latest_transaction_commit_time_data


 Makes sense.


 On second thought, maybe those should be pg_get_transaction_committs,
 pg_get_transaction_committs_data, etc.
 For me the commit time thing feels problematic in the way I perceive it -
 I see commit time as a point in time, whereas I see commit timestamp (or
 committs for short) as something that can be recorded. So I would prefer to
 stick with commit timestamp/committs.

Hehe, I got exactly the opposite impression while reading the patch, but
let's rely on your judgement for the namings. I am not the one writing this
code.
-- 
Michael


Re: [HACKERS] tracking commit timestamps

2014-11-01 Thread Michael Paquier
On Sat, Nov 1, 2014 at 1:45 PM, Michael Paquier michael.paqu...@gmail.com
wrote:

 I am still planning to do more extensive tests, and study a bit more
 committs.c (with even more comments) as it is the core part of the feature.

More comments:
- Heikki already mentioned it, but after reading the code I see little
point in having the extra field implemented like that in core, for many
reasons, even if it is *just* 4 bytes:
1) It is untested and actually there is no direct use for it in core.
2) Pushing code that we know is dead is no good; that's a feature more or
less defined as maybe-useful-but-we-are-not-sure-yet-what-to-do-with-it.
3) If you're going to re-use this API in BDR, which is a fork of Postgres,
you'd better complete this API in BDR by yourself and not bother core with
that.
For those reasons I think that this extra field should be ripped off from
the patch.
- The API to get the commit timestamp is not that user-friendly, and I
think it could really be improved, to something like this, for example:
pg_get_commit_timestamp(from_xact xid, number_of_xacts int);
pg_get_commit_timestamp(from_xact xid);
pg_get_commit_timestamp(); or pg_get_latest_commit_timestamp();
from_xact to NULL means latest. number_of_xacts to NULL means 1.
This comment is in line with the fact that the extra field is, well, not
really useful.
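
(To picture the shape of that API from the client side, here is a minimal
sketch; pg_get_commit_timestamp() is only the signature proposed above, not
an existing server function, and the xid literal is invented:)

import java.sql.*;

// Sketch only: pg_get_commit_timestamp() is the proposed signature above,
// not an existing server function; the xid literal is invented.
class CommitTsSketch {
    static void printCommitTs(Connection con) throws SQLException {
        try (Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(
                 "SELECT pg_get_commit_timestamp('1234'::xid)")) {
            while (rs.next())
                System.out.println(rs.getTimestamp(1));
        }
    }
}
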
Regards,
-- 
Michael


Re: [HACKERS] tracking commit timestamps

2014-11-01 Thread Petr Jelinek

On 01/11/14 14:00, Michael Paquier wrote:


More comments:
- Heikki already mentioned it, but after reading the code I see little
point in having the extra field implemented like that in core, for many
reasons, even if it is *just* 4 bytes:
1) It is untested and actually there is no direct use for it in core.
2) Pushing code that we know is dead is no good, that's a feature more
or less defined as maybe-useful-but-we-are-not-sure-yet-what-to-do-with-it.
3) If you're going to re-use this API in BDR, which is a fork of
Postgres, you'd better complete this API in BDR by yourself and not
bother core with that.
For those reasons I think that this extra field should be ripped off
from the patch.


Well, this is not a BDR-specific thing. The idea is that with logical 
replication, the commit timestamp is not enough for conflict handling; you 
also need additional info in order to identify some types of 
conflicts (local update vs remote update, for example). So the 
extradata field was meant as something that could be used to attach the 
additional info to the xid.


But I see your point; I think solving this issue can be left to the 
replication identifier patch that is discussed in a separate thread.


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_ctl non-idempotent behavior change

2014-11-01 Thread Peter Eisentraut
On 10/11/14 6:54 PM, Bruce Momjian wrote:
 Are we going to unrevert this patch for 9.5?
 Seems no one is thinking of restoring this patch and working on the
 issue.

I had postponed work on this issue and set out to create a test
infrastructure so that all the subtle behavioral dependencies mentioned
in the thread could be expressed in code rather than prose.





[HACKERS] Pipelining executions to postgresql server

2014-11-01 Thread Mikko Tiihonen
Hi,

I created a proof of concept patch for the PostgreSQL JDBC driver that allows 
the caller to do pipelining of requests within a transaction. The pipelining 
here means the same as for HTTP: the client can send the next execution 
already before waiting for the response of the previous request to be fully 
processed.

The goal is to reduce the effects of latency between server and client. In my 
test against a localhost PostgreSQL, pipelining gave a JDBC test that queries 
a single value from the database 200 times a more than 20% speed-up. The 
pipelined version processes the responses every 10 queries. With actual 
latency between the server and client, larger speed-ups are of course possible.

I think pipelining + jsonb support would make PostgreSQL an even more 
competitive key/value and document store.

Example use cases (a batching sketch for comparison follows the list):
1) insert into the shopping cart table, and 3 inserts into the shopping cart 
rows table, in one pipeline
  - only one round-trip penalty instead of 4
2) query the shopping cart row and query the shopping cart rows in one pipeline
  - only one round-trip penalty instead of 2
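
(For comparison, a minimal sketch of use case 1 using only stock JDBC
batching, which already collapses the row inserts into a single round trip;
the table and column names are invented for illustration, and the proposed
pipelining would generalize this to arbitrary statement mixes whose responses
are read later.)

import java.sql.*;

// Minimal sketch using only stock JDBC batching; table/column names invented.
class CartBatchSketch {
    static void addCart(Connection con, int cartId, String[] items)
            throws SQLException {
        con.setAutoCommit(false);
        try (PreparedStatement cart = con.prepareStatement(
                 "INSERT INTO shopping_cart(id) VALUES (?)");
             PreparedStatement line = con.prepareStatement(
                 "INSERT INTO shopping_cart_rows(cart_id, item) VALUES (?, ?)")) {
            cart.setInt(1, cartId);
            cart.executeUpdate();        // round trip 1
            for (String item : items) {  // the row inserts are batched ...
                line.setInt(1, cartId);
                line.setString(2, item);
                line.addBatch();
            }
            line.executeBatch();         // ... and sent in round trip 2
            con.commit();                // round trip 3
        }
    }
}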

The only alternative way to reduce the round-trip latency is to perform every 
operation in a single round trip, and that can only be done with PL functions 
and by passing in more complex objects, for example the whole shopping cart 
with its rows as JSON data (sketched below).
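
(A minimal sketch of that alternative, assuming an invented server-side
function cart_insert(jsonb) that absorbs the whole cart in one call:)

import java.sql.*;

// Sketch of the PL-function alternative: the server function cart_insert(jsonb)
// is invented for illustration and would insert the cart plus its rows.
class CartJsonSketch {
    static void addCart(Connection con, String cartJson) throws SQLException {
        try (PreparedStatement ps = con.prepareStatement(
                 "SELECT cart_insert(?::jsonb)")) {
            ps.setString(1, cartJson);   // e.g. {"id":1,"rows":[...]}
            ps.execute();                // single round trip
        }
    }
}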

What kind of problems could pipelining cause (assuming we limit it to rather 
simple use cases only)?

-Mikko

From c662b8865c58cba714655148401ac86a21c10f3c Mon Sep 17 00:00:00 2001
From: Mikko Tiihonen mikko.tiiho...@nitorcreations.com
Date: Sat, 1 Nov 2014 15:43:19 +0200
Subject: [PATCH] Example pipelined single-shot query

---
 org/postgresql/core/QueryExecutor.java |  13 +++
 org/postgresql/core/v2/QueryExecutorImpl.java  |   5 +
 org/postgresql/core/v3/QueryExecutorImpl.java  |  41 +++
 org/postgresql/jdbc2/AbstractJdbc2Statement.java   |  79 +
 .../test/jdbc2/PipelineExecutionTest.java  | 129 +
 5 files changed, 267 insertions(+)
 create mode 100644 org/postgresql/test/jdbc2/PipelineExecutionTest.java

diff --git a/org/postgresql/core/QueryExecutor.java b/org/postgresql/core/QueryExecutor.java
index e80a23c..b8e46a6 100644
--- a/org/postgresql/core/QueryExecutor.java
+++ b/org/postgresql/core/QueryExecutor.java
@@ -8,6 +8,7 @@
 */
 package org.postgresql.core;
 
+import java.io.IOException;
 import java.sql.SQLException;
 
 import org.postgresql.copy.CopyOperation;
@@ -101,6 +102,16 @@ public interface QueryExecutor {
 static int QUERY_NO_BINARY_TRANSFER = 256;
 
 /**
+ * Flag for pipeline executions with responses read later.
+ */
+static int QUERY_PIPELINE = 512;
+
+/**
+ * Flag for reading back the responses of earlier pipelined executions.
+ */
+static int QUERY_DEQUEUE_PIPELINE = 1024;
+
+/**
  * Execute a Query, passing results to a provided ResultHandler.
  *
  * @param query the query to execute; must be a query returned from
@@ -125,6 +136,8 @@ public interface QueryExecutor {
  int flags)
 throws SQLException;
 
+void processPipelinedResult(ResultHandler handler) throws SQLException;
+
 /**
  * Execute several Query, passing results to a provided ResultHandler.
  *
diff --git a/org/postgresql/core/v2/QueryExecutorImpl.java b/org/postgresql/core/v2/QueryExecutorImpl.java
index 33c0048..5a6f607 100644
--- a/org/postgresql/core/v2/QueryExecutorImpl.java
+++ b/org/postgresql/core/v2/QueryExecutorImpl.java
@@ -616,4 +616,9 @@ public class QueryExecutorImpl implements QueryExecutor {
 public CopyOperation startCopy(String sql, boolean suppressBegin) throws SQLException {
throw new PSQLException(GT.tr("Copy not implemented for protocol version 2"), PSQLState.NOT_IMPLEMENTED);
 }
+
+@Override
+public void processPipelinedResult(ResultHandler handler) throws SQLException {
+throw new PSQLException(GT.tr("Pipelining not implemented for protocol version 2"), PSQLState.NOT_IMPLEMENTED);
+}
 }
diff --git a/org/postgresql/core/v3/QueryExecutorImpl.java b/org/postgresql/core/v3/QueryExecutorImpl.java
index 966a6f6..7297764 100644
--- a/org/postgresql/core/v3/QueryExecutorImpl.java
+++ b/org/postgresql/core/v3/QueryExecutorImpl.java
@@ -11,6 +11,7 @@ package org.postgresql.core.v3;
 import org.postgresql.core.*;
 
 import java.util.ArrayList;
+import java.util.LinkedList;
 import java.util.List;
 import java.util.HashMap;
 import java.util.Properties;
@@ -1713,7 +1714,33 @@ public class QueryExecutorImpl implements QueryExecutor {
 }
 }
 
+public void processPipelinedResult(ResultHandler handler) throws SQLException {
+ResultHandlerHolder holder;
+while ((holder = pipelineResultHandlers.remove(0)) != null) {
+try {
+processResults(holder.handler, holder.flags & (~QUERY_PIPELINE) | QUERY_DEQUEUE_PIPELINE);
+} catch (IOException e) {
+

Re: [HACKERS] Switch pg_basebackup to use -X stream instead of -X fetch by default?

2014-11-01 Thread Peter Eisentraut
On 10/13/14 3:16 PM, Bruce Momjian wrote:
 Where are we on this?

I think we discovered that the pros and cons of the different options
are not as clear-cut, and it's better to leave the default as is until
additional features make one option a clear winner.





Re: [HACKERS] Pipelining executions to postgresql server

2014-11-01 Thread Tom Lane
Mikko Tiihonen mikko.tiiho...@nitorcreations.com writes:
 I created a proof of concept patch for the PostgreSQL JDBC driver that allows 
 the caller to do pipelining of requests within a transaction. The pipelining 
 here means same as for HTTP: the client can send the next execution already 
 before waiting for the response of the previous request to be fully processed.

In principle this can work if you think through the error handling
carefully.  The way the FE/BE protocol is intended to handle the case
is that all the pipelined statements are part of one transaction so
that they all succeed or fail together.  Otherwise the user has to work
through and predict the outcome of all the combinations of "Q1 succeeded,
Q2 failed, Q3 succeeded", which is mighty error-prone, especially if the
commands are interdependent in any way at all.  Having legislated that
(and forbidden any transaction-boundary commands in a pipelined group),
what you do is not issue a Sync until after the last command of the set.
Then, if any command fails, the backend just automatically discards
remaining messages until it gets the Sync, and so you can safely issue
new commands before knowing the results of the previous ones.
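
(As a sketch of that flow - Parse/Bind/Execute/Sync are the real v3 protocol
messages, while the Wire helper below is purely illustrative:)

// Illustrative pseudocode of the FE/BE flow described above; the Wire
// interface is invented, the message names are the real protocol ones.
interface Wire {
    void sendParse(String q);
    void sendBind(String q);
    void sendExecute(String q);
    void sendSync();
}

class PipelineFlowSketch {
    static void run(Wire w) {
        w.sendParse("Q1"); w.sendBind("Q1"); w.sendExecute("Q1");
        w.sendParse("Q2"); w.sendBind("Q2"); w.sendExecute("Q2");
        w.sendParse("Q3"); w.sendBind("Q3"); w.sendExecute("Q3");
        w.sendSync(); // single Sync: if Q2 fails, the backend discards the
                      // already-queued Q3 messages and resumes at this Sync,
                      // so the group succeeds or fails as a unit.
    }
}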

I mention this because, although I don't know the innards of the JDBC
driver at all, it sure doesn't look like that's the way you've done it.
If I'm right that you've left the error recovery strategy to the user,
I think this is gonna be very hard to use safely.

regards, tom lane




Re: [HACKERS] _mdfd_getseg can be expensive

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-10-31 18:48:45 -0400, Tom Lane wrote:
 While the basic idea is sound, this particular implementation seems
 pretty bizarre.  What's with the md_seg_no stuff, and why is that
 array typed size_t?

 It stores the length of the array of _MdfdVec entries.

Oh.  "seg_no" seems like not a very good choice of name then.
Perhaps md_seg_count or something like that would be more intelligible.

And personally I'd have made it an int, because we are certainly not doing
segment-number arithmetic in anything wider than int anywhere else.
Introducing size_t into the mix won't do anything except create a risk of
signed-vs-unsigned logic bugs.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Josh Berkus
On 10/31/2014 03:07 PM, Tom Lane wrote:
 I don't care one way or the other about the money type, but I will defend
 hash indexes, especially seeing that we've already added a pretty
 in-your-face warning as of 9.5:
 
 regression=# create table foo(f1 int);
 CREATE TABLE
 regression=# create index on foo using hash (f1);
 WARNING:  hash indexes are not WAL-logged and their use is discouraged
 CREATE INDEX

Yes, and I'm arguing that is the wrong decision.  If hash indexes are
discouraged, then they shouldn't be in core in the first place.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] [BUGS] ltree::text not immutable?

2014-11-01 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 Tom Lane wrote:
 Not entirely sure what to do about this.  It seems like for the purposes
 of contrib/chkpass, it's a good thing that chkpass_in() won't reuse the
 same salt.  Weak as a 12-bit salt might be nowadays, it's still better
 than no salt.  Nonetheless, this behavior is breaking assumptions made
 in places like array_in and record_in.
 
 For the moment I'm tempted to mark chkpass_in as stable (with a comment
 explaining that it isn't really) just so we can put in the error check
 in CREATE TYPE.  But I wonder if anyone has a better idea.

 Can we have a separate function that accepts unencrypted passwords, and
 change chkpass_in to only accept encrypted ones?  Similar to
 to_tsquery() et al.

Well, that would undoubtedly have been a better way to design the module
to start with, but we're not working in a green field.  I doubt we can
get away with changing the type's behavior that much.  That is, assuming
there are any users of it at all, which there might not be ;-)

An alternative that just occurred to me is to put the no-volatile-
I/O-functions check into CREATE TYPE, but make it just WARNING not
ERROR.  That would be nearly as good as an ERROR in terms of nudging
people who'd accidentally omitted a volatility marking from their
custom types.  But we could leave chkpass as-is and wait to see if
anyone complains "hey, why am I getting this warning?".  If we don't
hear anyone complaining, maybe that means we can get away with changing
the type's behavior in 9.6 or later.

regards, tom lane




Re: [HACKERS] _mdfd_getseg can be expensive

2014-11-01 Thread Andres Freund
On 2014-11-01 12:57:40 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-10-31 18:48:45 -0400, Tom Lane wrote:
  While the basic idea is sound, this particular implementation seems
  pretty bizarre.  What's with the md_seg_no stuff, and why is that
  array typed size_t?
 
  It stores the length of the array of _MdfdVec entries.
 
 Oh.  "seg_no" seems like not a very good choice of name then.
 Perhaps md_seg_count or something like that would be more intelligible.

That's fine with me.

 And personally I'd have made it an int, because we are certainly not doing
 segment-number arithmetic in anything wider than int anywhere else.

Fine with me too. I picked size_t by habit, because there are projects
that don't allow anything else to be used for lengths of memory...

I've, during testing, also noticed it has accidentally introduced a
vfd/memory leak...

So I'll repost a version with those fixes.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 10:18:03 -0700, Josh Berkus wrote:
 On 10/31/2014 03:07 PM, Tom Lane wrote:
  I don't care one way or the other about the money type, but I will defend
  hash indexes, especially seeing that we've already added a pretty
  in-your-face warning as of 9.5:
  
  regression=# create table foo(f1 int);
  CREATE TABLE
  regression=# create index on foo using hash (f1);
  WARNING:  hash indexes are not WAL-logged and their use is discouraged
  CREATE INDEX
 
 Yes, and I'm arguing that is the wrong decision.  If hash indexes are
 discouraged, then they shouldn't be in core in the first place.

Last time we discussed it there were people (IIRC Andrew was one of
them) commenting that they use hash indexes *precisely* because they're
not WAL logged and that they can live with the dangers that creates. I
don't think that's sufficient justification for introducing the feature
at all. But it's nothing new that removing a feature has to fit quite
different criteria than adding one.

So, by that argument we could remove hash indexes once we have unlogged
indexes on logged tables. But then there's no need to remove them
anymore...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 On 10/31/2014 03:07 PM, Tom Lane wrote:
 I don't care one way or the other about the money type, but I will defend
 hash indexes, especially seeing that we've already added a pretty
 in-your-face warning as of 9.5:
 
 regression=# create table foo(f1 int);
 CREATE TABLE
 regression=# create index on foo using hash (f1);
 WARNING:  hash indexes are not WAL-logged and their use is discouraged
 CREATE INDEX

 Yes, and I'm arguing that is the wrong decision.  If hash indexes are
 discouraged, then they shouldn't be in core in the first place.

There's an awful lot of stuff in core that probably shouldn't be there,
but we've not made a move to rip it out, especially not if there wasn't
a way to make it an external module.

In the case of hash indexes, because we still have to have the hash
opclasses in core, there's no way that it could be pushed out as an
extension module even if we otherwise had full support for AMs as
extensions.  So what I hear you proposing is "let's break this so
thoroughly that it *can't* be fixed".  I'm not on board with that.
I think the WARNING will do just fine to discourage novices who are
not familiar with the state of the hash AM.  In the meantime, we
could push forward with the idea of making hash indexes automatically
unlogged, so that recovering from a crash wouldn't be quite so messy/
dangerous.

regards, tom lane




Re: [HACKERS] Pipelining executions to postgresql server

2014-11-01 Thread Andres Freund
On 2014-11-01 14:04:05 +, Mikko Tiihonen wrote:
 I created a proof of concept patch for the PostgreSQL JDBC driver that
 allows the caller to do pipelining of requests within a
 transaction. The pipelining here means same as for HTTP: the client
 can send the next execution already before waiting for the response of
 the previous request to be fully processed.

Slightly confused here. To my knowledge the jdbc driver already employs
some pipelining? There are some conditions where it's disabled (IIRC
RETURNING for DML is one of them), but otherwise it's available.

I'm very far from a pgjdbc expert, but that's what I gathered from the
code when investigating issues a while back and from my colleague Craig.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-01 Thread Andres Freund
On 2014-11-01 13:45:44 +0900, Michael Paquier wrote:
 14) I'd put the two checks in the reverse order:
 +   Assert(xid != InvalidTransactionId);
 +
 +   if (!commit_ts_enabled)
 +   return;

Please don't. The order is correct right now. Why, you ask? This way the
correctness of the callsites is checked even when committs is
disabled - which it'll likely be on the majority of developer setups. And
what's the upside of changing the order? There's no difference in the
generated code in production builds, and the overhead in assert-enabled
ones is negligible.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tracking commit timestamps

2014-11-01 Thread Andres Freund
On 2014-11-01 22:00:40 +0900, Michael Paquier wrote:
 On Sat, Nov 1, 2014 at 1:45 PM, Michael Paquier michael.paqu...@gmail.com
 wrote:
 
  I am still planning to do more extensive tests, and study a bit more
  committs.c (with even more comments) as it is the core part of the feature.
 
 More comments:
 - Heikki already mentioned it, but after reading the code I see little
 point in having the extra field implemented like that in core for many
 reasons even if it is *just* 4 bytes:
 1) It is untested and actually there is no direct use for it in core.

Meh. The whole feature is only there for extensions, not core.

 2) Pushing code that we know is dead is no good, that's a feature more or
 less defined as maybe-useful-but-we-are-not-sure-yet-what-to-do-with-it.

Uh. It's not more/less dead than the whole of committs.

 3) If you're going to re-use this API in BDR, which is a fork of Postgres,
 you'd better complete this API in BDR by yourself and not bother core with
 that.

I think that's a fundamentally wrong position. The only reason BDR isn't
purely stock postgres is that some features couldn't sanely be made to work
without patches. I *hate* the fact that we had to do so. And I really
hope that we don't need any of the patches we have when building against
9.5.

So, now you might argue that the additional data is useless. But I think
that's just not thought through far enough. If you think about it, in which
scenarios do you want to map xids to the commit timestamp? Primarily
that's going to be replication, right? One of the most obvious usecases
is allowing you to detect/analyze/resolve conflicts in a multimaster setup,
right? To make sensible decisions you'll often want to have more
information about the involved transactions. Makes sense so far?

Now, you might argue that could just be done with some table
transaction_metadata(xid DEFAULT txid_current(), meta, data). But that
has *significant* disadvantages: For one, it'll not work correctly once
subtransactions are used. Not good.  For another, it has about an order of
magnitude higher overhead than the committs way.

And it's not like the extra field is in any way bdr specific - even
if you actually want to store much more information about the
transaction than just the origin (which is what bdr does), you can use
it to correctly solve the subtransaction problem and refer to some
transaction metadata table.

 - The API to get the commit timestamp is not that user-friendly, and I
 think it could really be improved, to something like that for example:
 pg_get_commit_timestamp(from_xact xid, number_of_xacts int);

What'd be the point of this?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 13:58:02 -0400, Tom Lane wrote:
 Yeah.  When we last discussed this, the difficulty was around how to make
 that combination work.  An unlogged index on an unlogged table is no
 problem: the init-fork mechanism is able to make them both go to empty
 after a crash.  But for an unlogged index on a logged table, just making
 the index go to empty is not the correct recovery action.

Just removing the relfilenode would probably be better than what we have
today. It sure is pretty unsatisfactory to get an ERROR after a
crash/immediate restart, but it's far better than getting wrong results
back (or even a crash).

 We'd been wondering how to make crash recovery change the index's pg_index
 entry into not indisvalid/indisready status.  That's hard to do.

It really is hard.

What I've been wondering about is whether we should be boring and do it on
backend startup. Roughly similar to the way the relcache init file is
handled. The first backend that starts up for a certain database* scans
pg_index and rechecks all unlogged indexes of logged tables and marks
them as invalid.

* There are a couple of ways we could detect that. The most trivial is to
just have an 'initialized' marker file in the database whose presence we
test on startup. That's probably acceptable costwise in comparison to
all the other stuff happening at backend init. We could get more fancy,
but it's probably not needed.

 But
 maybe we don't have to.  What about having the init-fork mechanism restore
 a hash index into a state with the metapage marked as invalid?  hashinsert
 etc could simply do nothing when they see this metapage state.
 hashgettuple could throw an error saying the index isn't usable until it's
 been REINDEXed.

 This is not quite as nice as an indisvalid-based solution, because it
 requires some extra cooperation from the index AM's code.  But setting up
 an init fork requires work from the index AM anyway, so that objection
 doesn't seem terribly strong to me.

The most annoying thing I see with that kind of approach is that we'd
need to read the metapage pretty early during planning. The second most
annoying is that it's harder for the user to detect which indexes are
in that state - we'd probably need to provide a SQL-level function to
detect it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andrew Dunstan


On 11/01/2014 01:26 PM, Andres Freund wrote:

On 2014-11-01 10:18:03 -0700, Josh Berkus wrote:

On 10/31/2014 03:07 PM, Tom Lane wrote:

I don't care one way or the other about the money type, but I will defend
hash indexes, especially seeing that we've already added a pretty
in-your-face warning as of 9.5:

regression=# create table foo(f1 int);
CREATE TABLE
regression=# create index on foo using hash (f1);
WARNING:  hash indexes are not WAL-logged and their use is discouraged
CREATE INDEX

Yes, and I'm arguing that is the wrong decision.  If hash indexes are
discouraged, then they shouldn't be in core in the first place.

Last time we discussed it there were people (IIRC Andrew was one of
them) commenting that they use hash indexes *precisely* because they're
not WAL logged and that they can live with the dangers that creates. I
don't think that's sufficient justification for introducing the feature
at all. But it's nothing new that removing a feature has to fit quite
different criteria than adding one.

So, by that argument we could remove hash indexes once we have unlogged
indexes on logged tables. But then there's no need to remove them
anymore...





Yes, although there might not be much reason to use them either. I think 
Tom's first step of making hash indexes automatically unlogged makes 
sense. Longer term I'd like to see unlogged as an option for all AMs - 
especially btree. It makes plenty of sense to me to be able to make the 
data resilient even if the indexes, which can after all be recreated in 
the unlikely event of a crash, are not.


cheers

andrew





Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 14:13:02 -0400, Andrew Dunstan wrote:
 Yes, although there might not be much reason to use them either. I think
 Tom's first step of making hash indexes automatically unlogged makes sense.
 Longer term I'd like to see unlogged as an option for all AMs - especially
 btree.

If we implement unlogged indexes, it should definitely work for all
builtin indexes.

 It makes plenty of sense to me to be able to make the data resilient
 even if the indexes, which can after all be recreated in the unlikely event
 of a crash, are not.

While I agree that it can be useful, I think the fact that replicas/HS
can't use such indexes makes them far less useful.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andrew Dunstan


On 11/01/2014 01:58 PM, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-11-01 10:18:03 -0700, Josh Berkus wrote:

Yes, and I'm arguing that is the wrong decision.  If hash indexes are
discouraged, then they shouldn't be in core in the first place.

Last time we discussed it there were people (IIRC Andrew was one of
them) commenting that they use hash indexes *precisely* because they're
not WAL logged and that they can live with the dangers that creates. I
don't think that's sufficient justification for introducing the feature
at all. But it's nothing new that removing a feature has to fit quite
different criteria than adding one.
So, by that argument we could remove hash indexes once we have unlogged
indexes on logged tables. But then there's no need to remove them
anymore...

Yeah.  When we last discussed this, the difficulty was around how to make
that combination work.  An unlogged index on an unlogged table is no
problem: the init-fork mechanism is able to make them both go to empty
after a crash.  But for an unlogged index on a logged table, just making
the index go to empty is not the correct recovery action.

We'd been wondering how to make crash recovery change the index's pg_index
entry into not indisvalid/indisready status.  That's hard to do.  But
maybe we don't have to.  What about having the init-fork mechanism restore
a hash index into a state with the metapage marked as invalid?  hashinsert
etc could simply do nothing when they see this metapage state.
hashgettuple could throw an error saying the index isn't usable until it's
been REINDEXed.

This is not quite as nice as an indisvalid-based solution, because it
requires some extra cooperation from the index AM's code.  But setting up
an init fork requires work from the index AM anyway, so that objection
doesn't seem terribly strong to me.

Thoughts?




Isn't the planner still going to try to use the index in that case? If 
it's not then I'd be OK with it, but if it's going to make the table 
largely unusable until it's reindexed that would be rather sad.


cheers

andrew




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-11-01 13:58:02 -0400, Tom Lane wrote:
 maybe we don't have to.  What about having the init-fork mechanism restore
 a hash index into a state with the metapage marked as invalid?  hashinsert
 etc could simply do nothing when they see this metapage state.
 hashgettuple could throw an error saying the index isn't usable until it's
 been REINDEXed.

 The most annoying thing I see with that kind of approach is that we'd
 need to read the metapage pretty early during planning.

No, I was specifically *not* proposing that.  What I proposed was if the
planner picks the index for use in a query, you get an error.

Yeah, if we were trying to duplicate the behavior of indisvalid, there'd
need to be a way to detect the invalid index at plan time and not use it.
But I'm not sure that that's actually an improvement from the user's
standpoint: what they'd see is queries suddenly, and silently, performing
a lot worse than they expect.  An explicit complaint about the necessary
REINDEX seems more user-friendly from where I sit.

However, if the consensus is that silently ignoring the index is the best
behavior, I would not be too concerned about the cost of checking the
metapage to see if the index is valid.  A hash index's metapage would tend
to stay pinned in shared buffers anyway, because it's necessarily touched
on each use of the index.  If that opinion proves wrong, the AM could take
steps to cache the state in the index's relcache entry (btree already
maintains similar cached state).

 The second most
 annoying is that it's harder for the user to detect which indexes are
 in that state - we'd probably need to provide a SQL-level function to
 detect it.

Sure, we could do that.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 14:19:22 -0400, Andrew Dunstan wrote:
 Isn't the planner still going to try to use the index in that case? If it's
 not then I'd be OK with it, but if it's going to make the table largely
 unusable until it's reindexed that would be rather sad.

Both the planner (for querying) and the executor (to avoid inserting
tuples into the index) would have to query the state of such indexes. I
don't think it can reasonably work otherwise.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-11-01 14:19:22 -0400, Andrew Dunstan wrote:
 Isn't the planner still going to try to use the index in that case? If it's
 not then I'd be OK with it, but if it's going to make the table largely
 unusable until it's reindexed that would be rather sad.

 Both the planner (for querying) and the executor (to avoid inserting
 tuples into the index) would have to query the state of such indexes. I
 don't think it can reasonably work otherwise.

The executor doesn't need to know anything, since the AM can trivially
make aminsert be a no-op if the index is internally invalid.  The planner
only needs to know something if we think that silently being slow for a
query meant to search the index is better than throwing an error reminding
the user that the index needs to be reindexed.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 14:23:07 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-11-01 13:58:02 -0400, Tom Lane wrote:
  maybe we don't have to.  What about having the init-fork mechanism restore
  a hash index into a state with the metapage marked as invalid?  hashinsert
  etc could simply do nothing when they see this metapage state.
  hashgettuple could throw an error saying the index isn't usable until it's
  been REINDEXed.
 
  The most annoying thing I see with that kind of approach is that we'd
  need to read the metapage pretty early during planning.
 
 No, I was specifically *not* proposing that.  What I proposed was if the
 planner picks the index for use in a query, you get an error.

Ugh. I think that'll be pretty ugly. At the very least we'd need to
provide a way to mark such indexes as 'actually invalid' in the sense of
indisready. It'll not only block queries that sensibly would end up
using that index, but also all modifications - making the relation
essentially read only.

If we just want this, we can just remove the main fork of such relations
on startup, and not bother with index specific stuff.

 Yeah, if we were trying to duplicate the behavior of indisvalid, there'd
 need to be a way to detect the invalid index at plan time and not use it.
 But I'm not sure that that's actually an improvement from the user's
 standpoint: what they'd see is queries suddenly, and silently, performing
 a lot worse than they expect.  An explicit complaint about the necessary
 REINDEX seems more user-friendly from where I sit.

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

 However, if the consensus is that silently ignoring the index is the best
 behavior, I would not be too concerned about the cost of checking the
 metapage to see if the index is valid.  A hash index's metapage would tend
 to stay pinned in shared buffers anyway, because it's necessarily touched
 on each use of the index.  If that opinion proves wrong, the AM could take
 steps to cache the state in the index's relcache entry (btree already
 maintains similar cached state).

We also could just put it in the generic relcache code...

Yea.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 A REINDEX is imo unlikely to be acceptable. It takes long (why would you
 bother on a small table?) and locks the relation/indexes.

I think the goalposts just took a vacation to Acapulco.

What exactly do you think is going to make a crashed unlogged index valid
again without a REINDEX?  Certainly the people who are currently using
hash indexes in the way Andrew describes are expecting to have to REINDEX
them after a crash.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 14:39:21 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  A REINDEX is imo unlikely to be acceptable. It takes long (why would you
  bother on a small table?) and locks the relation/indexes.
 
 I think the goalposts just took a vacation to Acapulco.

I think that might be caused by a misunderstanding.

 What exactly do you think is going to make a crashed unlogged index valid
 again without a REINDEX?  Certainly the people who are currently using
 hash indexes in the way Andrew describes are expecting to have to REINDEX
 them after a crash.

Obviously that individual index needs to be recreated. What I mean is
that I don't think it'll be acceptable that the table essentially can't
be queried before that's done. The situations in which I'd found
unlogged indexes useful are where some indexes are critical for
the OLTP business (those would continue to be logged), but some other
large ones are for things that aren't absolutely essential. Reports and
such.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andrew Dunstan


On 11/01/2014 02:34 PM, Andres Freund wrote:

Yeah, if we were trying to duplicate the behavior of indisvalid, there'd
need to be a way to detect the invalid index at plan time and not use it.
But I'm not sure that that's actually an improvement from the user's
standpoint: what they'd see is queries suddenly, and silently, performing
a lot worse than they expect.  An explicit complaint about the necessary
REINDEX seems more user-friendly from where I sit.

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.



It's a bit of a pity we don't have REINDEX CONCURRENTLY.


cheers

andrew




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-11-01 14:39:21 -0400, Tom Lane wrote:
 What exactly do you think is going to make a crashed unlogged index valid
 again without a REINDEX?  Certainly the people who are currently using
 hash indexes in the way Andrew describes are expecting to have to REINDEX
 them after a crash.

 Obviously that individual index needs to be recreated. What I mean is
 that I don't think it'll be acceptable that the table essentially can't
 be queried before that's done. The situations in which I'd found
 unlogged indexes useful are where some indexes are critical for
 the OLTP business (those would continue to be logged), but some other
 large ones are for things that aren't absolutely essential. Reports and
 such.

Sure.  And as long as you aren't issuing queries that would want to scan
the crashed index, it won't matter either way.  The question is whether
you'd rather that your inessential reporting queries fail without the
broken index, or that they take extreme amounts of time/resources.
I don't think it's obvious that the first alternative is bad.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 14:48:20 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-11-01 14:39:21 -0400, Tom Lane wrote:
  What exactly do you think is going to make a crashed unlogged index valid
  again without a REINDEX?  Certainly the people who are currently using
  hash indexes in the way Andrew describes are expecting to have to REINDEX
  them after a crash.
 
  Obviously that individual index needs to be recreated. What I mean is
  that I don't think it'll be acceptable that the table essentially can't
  be queried before that's done. The situations in which I'd found
  unlogged indexes useful are where some indexes are critical for
  the OLTP business (those would continue to be logged), but some other
  large ones are for things that aren't absolutely essential. Reports and
  such.
 
 Sure.  And as long as you aren't issuing queries that would want to scan
 the crashed index, it won't matter either way.  The question is whether
 you'd rather that your inessential reporting queries fail without the
 broken index, or that they take extreme amounts of time/resources.
 I don't think it's obvious that the first alternative is bad.

In some of these cases the unlogged indexes would still be used for a
subset of the OLTP workload, e.g. because they're smaller. We e.g. have
a client that has smaller (as in 50GB instead of 600GB) indexes for rows
of a certain type in the table, but also one that spans the whole thing.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andrew Dunstan


On 11/01/2014 02:39 PM, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

A REINDEX is imo unlikely to be acceptable. It takes long (why would you
bother on a small table?) and locks the relation/indexes.

I think the goalposts just took a vacation to Acapulco.

What exactly do you think is going to make a crashed unlogged index valid
again without a REINDEX?  Certainly the people who are currently using
hash indexes in the way Andrew describes are expecting to have to REINDEX
them after a crash.





That's certainly true. They were warned of the risks and found them 
acceptable.


The real question here is whether the table should continue to be usable 
in a degraded state until it's reindexed.


cheers

andrew




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 14:56:35 -0400, Andrew Dunstan wrote:
 
 On 11/01/2014 02:39 PM, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
 A REINDEX is imo unlikely to be acceptable. It takes long (why would you
 bother on a small table?) and locks the relation/indexes.
 I think the goalposts just took a vacation to Acapulco.
 
 What exactly do you think is going to make a crashed unlogged index valid
 again without a REINDEX?  Certainly the people who are currently using
 hash indexes in the way Andrew describes are expecting to have to REINDEX
 them after a crash.

 That's certainly true. They were warned of the risks and found them
 acceptable.
 
 The real question here is whether the table should continue to be usable in
 a degraded state until it's reindexed.

One argument in that direction imo is HS. We certainly would just
generally ignore unlogged indexes for querying while InRecovery, right?
Because otherwise HS would become pretty useless. And I think it'd be
pretty weird if things worked on HS and not on the primary (or the HS
after promotion).

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-11-01 14:48:20 -0400, Tom Lane wrote:
 Sure.  And as long as you aren't issuing queries that would want to scan
 the crashed index, it won't matter either way.  The question is whether
 you'd rather that your inessential reporting queries fail without the
 broken index, or that they take extreme amounts of time/resources.
 I don't think it's obvious that the first alternative is bad.

 In some of these cases the unlogged indexes would still be used for a
 subset of the OLTP workload, e.g. because they're smaller. We e.g. have
 a client that has smaller (as in 50GB instead of 600GB) indexes for rows
 of a certain type in the table, but also one that spans the whole thing.

Meh.  There are no existing use-cases where anyone would dare use a hash
index in that way, because they couldn't be sure it would be valid.

However, it seems like we're arguing over not much.  We'd certainly need a
new AM entry point to test whether the index is internally valid, so as to
support a SQL-level function that can report that.  This argument boils
down to whether or not the planner should expend a few cycles to call
that, which in the end is unlikely to cost enough to notice (especially if
the AM arranges to cache that state in indexes' relcache entries).
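
Concretely, I'd imagine the shape to be something like this (every name
below is invented for illustration; nothing like it exists today):

/* Hypothetical sketch: "amcheckvalid" stands in for the proposed pg_am
 * entry point; a real patch would cache the result in the index's
 * relcache entry so the planner-side cost stays negligible. */
static bool
index_is_internally_valid(Relation indexRel)
{
	RegProcedure checkproc = indexRel->rd_am->amcheckvalid;

	if (!RegProcedureIsValid(checkproc))
		return true;			/* AM has no notion of crash-invalidity */

	return DatumGetBool(OidFunctionCall1(checkproc,
										 PointerGetDatum(indexRel)));
}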

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 The real question here is whether the table should continue to be usable 
 in a degraded state until it's reindexed.

It certainly will be, as long as your notion of "usable in a degraded
state" doesn't include issuing queries that would prefer to use the broken
index.  The discussion is about what exactly should happen if you do that.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 One argument in that direction imo is HS. We certainly would just
 generally ignore unlogged indexes for querying while InRecovery, right?
 Because otherwise HS would become pretty useless. And I think it'd be
 pretty weird if things worked on HS and not on the primary (or the HS
 after promotion).

I don't see how HS has anything to do with this discussion.  We would
certainly have the index marked as unlogged in the catalogs, and we
would therefore not use it while InRecovery.  Once you promote a
slave to live, it would be in the same state as a post-crash-recovery
master.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 15:11:40 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  One argument in that direction imo is HS. We certainly would just
  generally ignore unlogged indexes for querying while InRecovery, right?
  Because otherwise HS would become pretty useless. And I think it'd be
  pretty weird if things worked on HS and not on the primary (or the HS
  after promotion).
 
 I don't see how HS has anything to do with this discussion.  We would
 certainly have the index marked as unlogged in the catalogs, and we
 would therefore not use it while InRecovery.

Consider:
SELECT * FROM tbl WHERE active AND value = $1;
that can be satisfied by two indexes. One on (value), and an unlogged
index on (value) WHERE active. While in HS only the logged one will be
used.  But if we don't silently ignore invalid unlogged indexes, hell
will break loose after promotion because suddenly the partial index
will be used in plans.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Temp tables, pg_class_temp and AccessExclusiveLocks

2014-11-01 Thread Pavel Stehule
Hi

Good idea. A mix of both strategies could be a base for a global temp
tables implementation.

Pavel

2014-10-31 15:53 GMT+01:00 Simon Riggs si...@2ndquadrant.com:

 While investigating how to reduce logging of AccessExclusiveLocks for
 temp tables, I came up with the attached patch.

 My feeling was that it's an ugly patch, but it set me thinking that a
 more specialised code path around temp tables could be useful in other
 ways, and therefore worth pursuing further.

 The patch creates a relation_open_temp(), which could then allow a
 RelationIdGetTempRelation() which could have a path that calls
 ScanPgRelation on a new catalog table called pg_class_temp, and other
 _temp catalog table accesses.

 So we could isolate all temp table access to specific tables. That would
 * cause less bloat in the main catalog tables
 * avoid logging AccessExclusiveLocks on temp tables
 * less WAL traffic

 Why less WAL traffic? Because we can then choose to make *_temp catalog
 tables
 1. use the unlogged table mechanism
 2. use local tables inside the local buffer manager

 My preference would be (1) because all we need to do is change the
 catalog table oid when searching, we don't need to do anything else.

 --
  Simon Riggs   http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-11-01 15:11:40 -0400, Tom Lane wrote:
 I don't see how HS has anything to do with this discussion.

 Consider:
 SELECT * FROM tbl WHERE active AND value = $1;
 that can be satisfied by two indexes. One on (value), and an unlogged
 index on (value) WHERE active. While in HS only the logged one will be
 used.  But if we don't silently ignore invalid unlogged indexes, hell
 will break loose after promotion because suddenly the partial index
 will be used in plans.

This example is utterly unrelated to HS: the same hazard would exist
on the master after a crash, with or without HS.

With current usage of hash indexes, nobody would ever construct such an
arrangement, or at least they would not expect to be able to start up
querying without having reindexed first.  My proposal to throw errors
was based on the assumption that unlogged indexes would only be used to
support a readily identified subset of inessential queries.  If we think
examples like the above are things people would actually want to do, then
yeah we would want to silently skip broken indexes.  I personally find
this example less than compelling and think it's far more likely that
we'll get complaints about the system *not* complaining when a broken
index would be needed.  However, it's only one line of code either way,
so I'm not going to argue about it further.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Andres Freund
On 2014-11-01 15:33:27 -0400, Tom Lane wrote:
 With current usage of hash indexes, nobody would ever construct such an
 arrangement

Do you think this should only be implemented for hash indexes? I'd think
we'd do it for all existing index AMs?

I'm not all that excited about unlogged hash indexes. Yes, that'll
remove an annoying hazard, but I probably won't use them anyway. I am
somewhat excited about the more general unlogged indexes feature.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-11-01 15:33:27 -0400, Tom Lane wrote:
 With current usage of hash indexes, nobody would ever construct such an
 arrangement

 Do you think this should only be implemented for hash indexes? I'd think
 we'd do it for all existing index AMs?

I think we should eventually get to that point, perhaps.  I'd want to do
it first for hash, since that's what direly needs it.  I can see the value
of doing it for btree as well, but I'm less sure that the other index
types would have enough usage to justify the work.

regards, tom lane




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Peter Geoghegan
On Sat, Nov 1, 2014 at 12:40 PM, Andres Freund and...@2ndquadrant.com wrote:
 I'm not all that excited about unlogged hash indexes. Yes, that'll
 remove an annoying hazard, but I probably won't use them anyway. I am
 somewhat excited about the more general unlogged indexes feature.

I strongly agree.


-- 
Peter Geoghegan




Re: [HACKERS] group locking: incomplete patch, just for discussion

2014-11-01 Thread Robert Haas
On Sat, Nov 1, 2014 at 1:55 PM, Andres Freund and...@2ndquadrant.com wrote:
 Where will those preexisting cache entries come from, exactly?  The
 postmaster is forking the parallel worker, not the user backend.

 Several things:
 1) The relcache init files load a fair bit
 2) There's cache entries made just during startup while we look up user
information and such
 3) There's lots of places you can hook into where it's perfectly legal
and expected that you access system caches. I don't think you can
reasonably define those away.
 4) I'm pretty sure that one of the early next steps will be to reuse a
bgworker for multiple tasks. So there'll definitely be preexisting
cache entries.

 I'm pretty sure that it's impossible to define the problem away. We
 *might* be able to say that we'll just do an InvalidateSystemCaches() at
 start of the parallelized task.

I see.  I haven't thought deeply about those things, and there could
indeed be problems there.

  What I have serious doubts about is 'coowning' locks. Especially if two
  backends normally wouldn't be able to both get that lock.

 Perhaps we should discuss that more.  To me it seems pretty obvious
 that's both safe and important.

 I completely fail to see why you'd generally think it's safe for two
 backends to hold the same self conflicting lock. Yes, within carefully
 restricted sections of code that might be doable. But I don't think you
 can fully enforce that. People *will* mark their own functions as
 parallel safe. And do stupid stuff. That shouldn't cause
 segfaults/exploits/low level data corruption/.. as long as it's done
 from !C functions.

I agree that people will mark their own functions as parallel-safe,
and I also agree that shouldn't cause segfaults, exploits, or
low-level data corruption.  Clearly, we'll need to impose a bunch of
restrictions to achieve that.  But if you put a LOCK TABLE statement
in an SQL function, it won't conflict with other locking requests made
by the same or another function elsewhere in the same transaction;
surely you don't want that behavior to be *different* in parallel
mode.  That would be a blatant, user-visible behavior difference
justified by ... nothing at all.  There's no hazard there.  Where you
start getting into crash/exploit/data corruption territory is when you
are talking about DDL operations that change the physical structure of
the table.  That's why we have stuff like CheckTableNotInUse() to
verify that, for example, there are no old cursors around that are
still expecting the old relfilenode and tuple descriptor to be valid.
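For reference, that guard looks roughly like this (paraphrased from
src/backend/commands/tablecmds.c; the real function also errors out
when AFTER triggers are pending on the rel):

void
CheckTableNotInUse(Relation rel, const char *stmt)
{
	int			expected_refcnt;

	/* nailed relations are kept open by the relcache itself */
	expected_refcnt = rel->rd_isnailed ? 2 : 1;
	if (rel->rd_refcnt != expected_refcnt)
		ereport(ERROR,
				(errcode(ERRCODE_OBJECT_IN_USE),
				 errmsg("cannot %s \"%s\" because it is being used by active queries in this session",
						stmt, RelationGetRelationName(rel))));
}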

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] group locking: incomplete patch, just for discussion

2014-11-01 Thread Robert Haas
On Sat, Nov 1, 2014 at 1:55 PM, Andres Freund and...@2ndquadrant.com wrote:
 I doubt it. There's a whole bunch of problems that just exist by virtue
 of having a group leader. What if the leader dies but the worker backend
 isn't in a section of code that does a CHECK_FOR_INTERRUPTS()?

In between all of the heat about whether I'm writing the correct
patch, or whether I'm overall making progress fast enough, somebody
could, I don't know, read the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] group locking: incomplete patch, just for discussion

2014-11-01 Thread Andres Freund
On 2014-11-01 17:00:59 -0400, Robert Haas wrote:
 On Sat, Nov 1, 2014 at 1:55 PM, Andres Freund and...@2ndquadrant.com wrote:
  I doubt it. There's a whole bunch of problems that just exist by virtue
  of having a group leader. What if the leader dies but the worker backend
  isn't in a section of code that does a CHECK_FOR_INTERRUPTS()?
 
 In between all of the heat about whether I'm writing the correct
 patch, or whether I'm overall making progress fast enough, somebody
 could, I don't know, read the patch.

I plan to, but that takes a fair amount of time. And there's at least
one patch of yours ahead in the queue... And I do think that most of the
concerns are more general than a specific implementation.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] group locking: incomplete patch, just for discussion

2014-11-01 Thread Robert Haas
On Sat, Nov 1, 2014 at 5:06 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-11-01 17:00:59 -0400, Robert Haas wrote:
 On Sat, Nov 1, 2014 at 1:55 PM, Andres Freund and...@2ndquadrant.com wrote:
  I doubt it. There's a whole bunch of problems that just exist by virtue
  of having a group leader. What if the leader dies but the worker backend
  isn't in a section of code that does a CHECK_FOR_INTERRUPTS()?

 In between all of the heat about whether I'm writing the correct
 patch, or whether I'm overall making progress fast enough, somebody
 could, I don't know, read the patch.

 I plan to, but that takes a fair amount of time. And there's at least
 one patch of yours ahead in the queue... And I do think that most of the
 concerns are more general than a specific implementation.

Fair enough.  Well, the way I solved that specific problem is to allow
the group leader to terminate but not to return its PGPROC to the free
list until the last group member exits.
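In pseudo-code, with invented names throughout (the actual patch
differs in detail):

/* Loose sketch only: on process exit, a group member decrements the
 * leader's member count; whoever brings it to zero, leader included,
 * returns the leader's PGPROC to the free list. */
static void
GroupProcKill(PGPROC *proc)
{
	PGPROC	   *leader = proc->groupLeader;
	bool		last;

	SpinLockAcquire(&leader->groupLock);
	last = (--leader->nGroupMembers == 0);
	SpinLockRelease(&leader->groupLock);

	if (last)
		ReturnProcToFreeList(leader);	/* invented helper */
}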

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] group locking: incomplete patch, just for discussion

2014-11-01 Thread Robert Haas
On Sat, Nov 1, 2014 at 1:09 AM, Simon Riggs si...@2ndquadrant.com wrote:
 What are the specific restrictions you are suggesting we place on
 parallel workers? We need that definition clear so we can decide how
 to mark the functions. If we don't know, which is easily
 understandable, then the best way to find that out is to build a
 restricted solution and to expand slowly outwards to find problems.

 An obvious milestone there is whether a function contains SQL, which
 is the point chosen by some other databases. I personally hope to
 expand upon that, but it would be useful to reach it first and then
 continue from there.

 I was already thinking of implementing CONTAINS NO SQL as a SQL
 Standard function marking since it can be used to optimize COPY
 further.

I've written some fairly extensive thoughts on this topic at
http://wiki.postgresql.org/wiki/Parallel_Sort

Personally, I tend to focus more on what needs to be allowed in
parallel workers than what needs to be prohibited.  The motivation for
this patch is simple:

1. Any non-trivial piece of PostgreSQL code is likely to contain
syscache lookups.
2. Syscache lookups had better work in parallel workers, or they'll be
all but useless.
3. Some of those syscache lookups will be satisfied from cache, but
sometimes we'll need to actually read the system catalogs.
4. That requires locks.
5. Gosh, we'd better not deadlock.
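
To be concrete about (1), the pattern in question is the utterly
routine (real API, usage illustrative; funcid would come from whatever
expression the worker is evaluating):

/* On a cache miss this reads pg_proc, which takes locks -- hence
 * points (4) and (5). */
HeapTuple	tup;

tup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcid));
if (!HeapTupleIsValid(tup))
	elog(ERROR, "cache lookup failed for function %u", funcid);
/* ... examine the pg_proc fields via GETSTRUCT(tup) ... */
ReleaseSysCache(tup);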

There's a lot of infrastructure that I think will work fine in
parallel mode without any changes.  Buffer locking, buffer pins,
buffer lookups, and lwlocks, for example, I expect to all just work
without any real changes.  But heavyweight locks and invalidation messages
seem like they'll need a bit of surgery to work correctly in this
environment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] group locking: incomplete patch, just for discussion

2014-11-01 Thread Andres Freund
On 2014-11-01 16:59:34 -0400, Robert Haas wrote:
  I completely fail to see why you'd generally think it's safe for two
  backends to hold the same self conflicting lock. Yes, within carefully
  restricted sections of code that might be doable. But I don't think you
  can fully enforce that. People *will* mark their own functions as
  parallel safe. And do stupid stuff. That shouldn't cause
  segfaults/exploits/low level data corruption/.. as long as it's done
  from !C functions.
 
 I agree that people will mark their own functions as parallel-safe,
 and I also agree that shouldn't cause segfaults, exploits, or
 low-level data corruption.  Clearly, we'll need to impose a bunch of
 restrictions to achieve that.

 But if you put a LOCK TABLE statement
 in an SQL function, it won't conflict with other locking requests made
 by the same or another function elsewhere in the same transaction;
 surely you don't want that behavior to be *different* in parallel
 mode.  That would be a blatant, user-visible behavior difference
 justified by ... nothing at all.

That's just a mislabeled function. It's obviously not parallel safe at
all. I see absolutely no problem with erroring out.

 There's no hazard there.  Where you
 start getting into crash/exploit/data corruption territory is when you
 are talking about DDL operations that change the physical structure of
 the table.  That's why we have stuff like CheckTableNotInUse() to
 verify that, for example, there are no old cursors around that are
 still expecting the old relfilenode and tuple descriptor to be valid.

It's not just fully structural changes, although they are a concern.
It's also that we've amassed a number of hacks to deal with local state
that just won't be nicely transported. What about stuff like
RelationSetIndexList() (which is used infrequently enough not to be a
big problem in practice...)? If we only allow parallel access while
independent backends could acquire the relevant locks, we can rely on us
already having taken care of the concurrency hazards. Otherwise not.

And before you say that'd prohibit a number of useful things - yes, it
might. But e.g. CREATE INDEX can still be parallelized. The primary
backend takes an Exclusive/ShareUpdateExclusive lock (depending on
CONCURRENTLY), workers just an AccessShare.
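
I.e. roughly (fragment only, illustrative; the leader's stronger lock
fences off concurrent DDL):

/* in each worker */
Relation	heapRel = heap_open(heapOid, AccessShareLock);

/* ... scan the worker's share of the heap, feed the sort ... */

heap_close(heapRel, AccessShareLock);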

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread Josh Berkus
All,

While there's argument about hash indexes, it looks like nobody minds if
the MONEY type goes bye-bye.  So, time for a patch ...


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] Locked stdio has noticeable performance cost for COPY

2014-11-01 Thread Andres Freund
Hi,

Every now and then I've noticed that glibc's stdio functions show up
prominently in profiles, which seems somewhat odd, given they pretty
much should delegate all work to the kernel and memcpy(). By accident I
looked at the disassembly and oops: a large part is due to
locking. Especially in ferror()'s case that's quite noticeable...

Locked it's:

int
_IO_ferror (fp)
 _IO_FILE* fp;
{
  int result;
  CHECK_FILE (fp, EOF);
  _IO_flockfile (fp);
  result = _IO_ferror_unlocked (fp);
  _IO_funlockfile (fp);
  return result;
}
unlocked it's just:
#define _IO_ferror_unlocked(__fp) (((__fp)->_flags & _IO_ERR_SEEN) != 0)

Fun...

glibc also provides fread/fwrite/ferror_unlocked. Some quick performance
results (quickest of three) show:

before:
postgres[1]=# COPY notsolargedata (data) TO '/dev/null' WITH BINARY;
COPY 1000
Time: 2811.623 ms
after replacing ferror/fread/fwrite in copy.c:
postgres[1]=# COPY notsolargedata (data) TO '/dev/null' WITH BINARY;
COPY 1000
Time: 2593.969 ms

That's not nothing.

As we really don't need the locking, especially not in COPY, I guess we
should do something about it. I'm unsure what exactly. Getting rid of
stdio seems like more work than it's worth. So maybe just add a configure
check for fread/fwrite/ferror_unlocked and use them in copy.c?
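
As a strawman, assuming configure grows checks that define the HAVE_
symbols below, copy.c could do:

#ifdef HAVE_FERROR_UNLOCKED
#define COPY_FERROR(fp)				ferror_unlocked(fp)
#else
#define COPY_FERROR(fp)				ferror(fp)
#endif
#ifdef HAVE_FREAD_UNLOCKED
#define COPY_FREAD(p, sz, n, fp)	fread_unlocked(p, sz, n, fp)
#else
#define COPY_FREAD(p, sz, n, fp)	fread(p, sz, n, fp)
#endif
#ifdef HAVE_FWRITE_UNLOCKED
#define COPY_FWRITE(p, sz, n, fp)	fwrite_unlocked(p, sz, n, fp)
#else
#define COPY_FWRITE(p, sz, n, fp)	fwrite(p, sz, n, fp)
#endif

COPY's FILE handles are private to one backend, so skipping stdio's
per-call locking should be safe there.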

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] group locking: incomplete patch, just for discussion

2014-11-01 Thread Robert Haas
On Sat, Nov 1, 2014 at 5:38 PM, Andres Freund and...@2ndquadrant.com wrote:
 That's just a mislabeled function. It's obviously not parallel safe at
 all. I see absolutely no problem with erroring out.

I disagree.  It's entirely parallel-safe, as long as you don't
arbitrarily decide to have the lock manager break it.

 There's no hazard there.  Where you
 start getting into crash/exploit/data corruption territory is when you
 are talking about DDL operations that change the physical structure of
 the table.  That's why we have stuff like CheckTableNotInUse() to
 verify that, for example, there are no old cursors around that are
 still expecting the old relfilenode and tuple descriptor to be valid.

 It's not just fully structural changes, although they are a concern.
 It's also that we've amassed a number of hacks to deal with local state
 that just won't be nicely transported. What about stuff like
 RelationSetIndexList() (which is used infrequently enough not to be a
 big problem in practice...)? If we only allow parallel access while
 independent backends could acquire the relevant locks, we can rely on us
 already having taken care of the concurrency hazards. Otherwise not.

RelationSetIndexList() is only used inside REINDEX, which I think
illustrates my point that it's mostly DDL we need to be worried about.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Let's drop two obsolete features which are bear-traps for novices

2014-11-01 Thread David Rowley
On Sun, Nov 2, 2014 at 12:59 PM, Josh Berkus j...@agliodbs.com wrote:

 All,

 While there's argument about hash indexes, it looks like nobody minds if
 the MONEY type goes bye-bye.  So, time for a patch ...


Will the patch move the feature to a contrib module?

Or will our release notes state that all MONEY columns must be changed to
an appropriate NUMERIC type pre-upgrade?

Regards

David Rowley


[HACKERS] Silly coding in pgcrypto

2014-11-01 Thread Marko Tiikkaja

Hi,

Patch attached to fix some silliness in pgp_expect_packet_end() and 
pgp_skip_packet().  Will add to the next commitfest unless someone picks 
this one up before that.



.marko
*** a/contrib/pgcrypto/pgp-decrypt.c
--- b/contrib/pgcrypto/pgp-decrypt.c
*************** pgp_skip_packet(PullFilter *pkt)
*** 1069,1075 ****
  
  	while (res > 0)
  		res = pullf_read(pkt, 32 * 1024, &tmp);
! 	return res < 0 ? res : 0;
  }
  
  /*
--- 1069,1075 ----
  
  	while (res > 0)
  		res = pullf_read(pkt, 32 * 1024, &tmp);
! 	return res;
  }
  
  /*
*************** pgp_skip_packet(PullFilter *pkt)
*** 1078,1096 ****
  int
  pgp_expect_packet_end(PullFilter *pkt)
  {
! 	int			res = 1;
  	uint8	   *tmp;
  
! 	while (res > 0)
  	{
! 		res = pullf_read(pkt, 32 * 1024, &tmp);
! 		if (res > 0)
! 		{
! 			px_debug("pgp_expect_packet_end: got data");
! 			return PXE_PGP_CORRUPT_DATA;
! 		}
  	}
! 	return res < 0 ? res : 0;
  }
  
  int
--- 1078,1093 ----
  int
  pgp_expect_packet_end(PullFilter *pkt)
  {
! 	int			res;
  	uint8	   *tmp;
  
! 	res = pullf_read(pkt, 32 * 1024, &tmp);
! 	if (res > 0)
  	{
! 		px_debug("pgp_expect_packet_end: got data");
! 		return PXE_PGP_CORRUPT_DATA;
  	}
! 	return res;
  }
  
  int
