Re: [HACKERS] dynloader.h missing in prebuilt package for Windows?

2015-12-30 Thread Alvaro Herrera
Bruce Momjian wrote:
> On Tue, Dec 29, 2015 at 02:08:19PM -0500, Bruce Momjian wrote:
> > In the Windows MSVC build, we handle pg_config_os.h in the Perl scripts,
> > but not dynloader.h.  The attached patch copies the process used for
> > pg_config_os.h to handle dynloader.h.  I believe this is the only *.h
> > file that has this problem.
> > 
> > This should fix the PL/Java Windows build problem.  I don't think I will
> > get this patch into 9.5.0 but will put it in after 9.5.0 is released and
> > it will appear in all the next minor version releases, including 9.5.1,
> > which should happen in the next 60 days.
> 
> Oops.  Once this patch is applied, it is only going to take effect when
> someone _installs_ Postgres.  Even an initdb will not add the file. 
> This means that upgrading to the next minor release will _not_ fix this.

They will, unless they build from source -- which most people don't.

> This suggests that we perhaps should consider this for 9.5.0, and that
> your hack will have to be active until 9.4 gets to end-of-life, or
> perhaps 9.5 if we can't get this into 9.5.0.  People who are using 9.5
> Beta or RC will still not have the file, meaning 9.5 end-of-life might
> still be a requirement.

I think this should be fixed in 9.5.0 since you already have the patch --
what's the point of waiting for 9.5.1?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[HACKERS] Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

2015-12-30 Thread Tomas Vondra



On 12/30/2015 10:30 PM, David Rowley wrote:

On 31 December 2015 at 01:24, Tomas Vondra wrote:

Personally I'd like to see automatic plan caching occur in
PostgreSQL. There is a bit of a problem with it in regards to a query
such as: select * from t where mySkewedColumn = 1; where the value 1
appears 99% of the time. Initially we may plan to seqscan, where with
other values we'd likely prefer to index scan. I imagine with my
unique joins patch, that it could be expanded to test
baserestrictinfos to see if they contain quals with a unique index.
This knowledge could later permit plan caching to occur on queries
which are safe from having any skew in the results. It might sound
rather restrictive, but likely this would cover 99% of UPDATE and
DELETE operations in an OLTP workload, and likely a good deal of the
SELECTs too. The safety of caching could be analyzed during planning,
and a flag could be set somewhere, perhaps in PlannedStmt to mark if
the plan is safe to cache. The planner() function could initially
hash the query string and check if any cached plan exists with that
hash, if not, plan the statement, and then check if the "safeToCache"
flag is set, and if so, stick that plan into a hash table. Also plans
with no baserestrictinfos could be "safeToCache" too.


Yeah, that's what I meant by "non-trivial". I don't know a good approach 
to this problem, or if such a "good" approach even exists, but I'd say 
being able to decide whether to rebuild a plan in such cases is a 
"must-have" feature. Otherwise we could easily loose any gains from the 
more thorough optimization because of poor plans.


In other words, we'd have to come up with a way to decide whether to use 
the same plan as before, or try building another plan (for the same 
query with different parameter values). I can think of two approaches:


(1) Try to measure how "different" are the parameter values used in the
new query, compared to the existing plan(s). This probably means
difference in terms of probability / frequencies etc.

(2) Compute the cost of the existing plan for the new parameters. I.e.
don't perform the whole optimization, just the costing for the
single plan. If the costs are "close" then use the existing plan.

Of course, none of this is trivial and still may fail for some cases.

I wonder what the other databases do, or if there are papers about this 
topic (I'd guess there are, but I haven't looked too much).


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] dynloader.h missing in prebuilt package for Windows?

2015-12-30 Thread Bruce Momjian
On Wed, Dec 30, 2015 at 11:57:45PM -0500, Tom Lane wrote:
> Bruce Momjian  writes:
> > Oops.  Once this patch is applied, it is only going to take effect when
> > someone _installs_ Postgres.  Even an initdb will not add the file. 
> > This means that upgrading to the next minor release will _not_ fix this.
> 
> Uh, what?  Surely an upgrade to a new package *would* fix it, because
> that is a reinstall at the filesystem level.  This patch has nothing
> to do with system catalog contents, which is what initdb would be
> concerned with.

Uh, do we install new lib files and stuff in a minor upgrade?  I guess
we do, yeah.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +




Re: [HACKERS] dynloader.h missing in prebuilt package for Windows?

2015-12-30 Thread Tom Lane
Bruce Momjian  writes:
> Oops.  Once this patch is applied, it is only going to take effect when
> someone _installs_ Postgres.  Even an initdb will not add the file. 
> This means that upgrading to the next minor release will _not_ fix this.

Uh, what?  Surely an upgrade to a new package *would* fix it, because
that is a reinstall at the filesystem level.  This patch has nothing
to do with system catalog contents, which is what initdb would be
concerned with.

I would not be terribly worried about you pushing it into 9.5.0, but
I also think that it could wait for 9.5.1.

regards, tom lane




Re: [HACKERS] [POC] FETCH limited by bytes.

2015-12-30 Thread Corey Huinker
>
>
>> I believe it won't be needed as a regression test, but DEBUGn
>> messages could be usable to see "what was sent to the remote
>> side". They can be shown on the console, so this is easily done in a
>> regression test. See the documentation for client_min_messages
>> for details. (It could receive far more messages than expected,
>> though, and may be fragile against changes in other parts.)
>>
>> regards,
>>
>> --
>> Kyotaro Horiguchi
>> NTT Open Source Software Center
>>
>>
>>
> Ok, I'll see what debug-level messages reveal.
>
>
>
Here's a re-based patch. Notable changes since the last patch are:

- some changes had to move to postgres_fdw.h
- the regression tests are in their own script fetch_size.sql /
fetch_size.out. That should make things easier for the reviewer(s), even if
we later merge it into postgres_fdw.sql/.out.
- I put braces around a multi-line error ereport(). That might be a style
violation, but the statement seemed confusing without them.
- The fetch SQL is reported as a DEBUG1-level ereport(), and confirms that
server-level options are respected, and table-level options are used in
preference to server-level options.

I left the DEBUG1-level message in this patch so that others can see the
output, but I assume we would omit that for production code, with
corresponding deletions from fetch_size.sql and fetch_size.out.
diff --git a/contrib/postgres_fdw/Makefile b/contrib/postgres_fdw/Makefile
index 3543312..5d50b30 100644
--- a/contrib/postgres_fdw/Makefile
+++ b/contrib/postgres_fdw/Makefile
@@ -10,7 +10,7 @@ SHLIB_LINK = $(libpq)
 EXTENSION = postgres_fdw
 DATA = postgres_fdw--1.0.sql
 
-REGRESS = postgres_fdw
+REGRESS = postgres_fdw fetch_size
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/postgres_fdw/option.c b/contrib/postgres_fdw/option.c
index 380ac80..f482dcd 100644
--- a/contrib/postgres_fdw/option.c
+++ b/contrib/postgres_fdw/option.c
@@ -162,6 +162,12 @@ InitPgFdwOptions(void)
 		/* updatable is available on both server and table */
 		{"updatable", ForeignServerRelationId, false},
 		{"updatable", ForeignTableRelationId, false},
+		/*
+		 * fetch_size is available on both server and table, the table setting
+		 * overrides the server setting.
+		 */
+		{"fetch_size", ForeignServerRelationId, false},
+		{"fetch_size", ForeignTableRelationId, false},
 		{NULL, InvalidOid, false}
 	};
 
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index f501c6c..d2595f6 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -68,7 +68,9 @@ enum FdwScanPrivateIndex
 	/* SQL statement to execute remotely (as a String node) */
 	FdwScanPrivateSelectSql,
 	/* Integer list of attribute numbers retrieved by the SELECT */
-	FdwScanPrivateRetrievedAttrs
+	FdwScanPrivateRetrievedAttrs,
+	/* Integer representing the desired fetch_size */
+	FdwScanPrivateFetchSize
 };
 
 /*
@@ -126,6 +128,8 @@ typedef struct PgFdwScanState
 	/* working memory contexts */
 	MemoryContext batch_cxt;	/* context holding current batch of tuples */
 	MemoryContext temp_cxt;		/* context for per-tuple temporary data */
+
+	int			fetch_size;		/* number of tuples per fetch */
 } PgFdwScanState;
 
 /*
@@ -303,6 +307,8 @@ static HeapTuple make_tuple_from_result_row(PGresult *res,
 		   List *retrieved_attrs,
 		   MemoryContext temp_context);
 static void conversion_error_callback(void *arg);
+static int get_fetch_size(ForeignTable	*table,
+	ForeignServer *server);
 
 
 /*
@@ -371,6 +377,8 @@ postgresGetForeignRelSize(PlannerInfo *root,
 	/* Look up foreign-table catalog info. */
 	fpinfo->table = GetForeignTable(foreigntableid);
 	fpinfo->server = GetForeignServer(fpinfo->table->serverid);
+	/* Look up any table-specific fetch size */
+	fpinfo->fetch_size = get_fetch_size(fpinfo->table,fpinfo->server);
 
 	/*
 	 * Extract user-settable option values.  Note that per-table setting of
@@ -1069,6 +1077,9 @@ postgresGetForeignPlan(PlannerInfo *root,
 	 */
 	fdw_private = list_make2(makeString(sql.data),
 			 retrieved_attrs);
+	fdw_private = list_make3(makeString(sql.data),
+			 retrieved_attrs,
+			 makeInteger(fpinfo->fetch_size));
 
 	/*
 	 * Create the ForeignScan node from target list, filtering expressions,
@@ -1147,6 +1158,8 @@ postgresBeginForeignScan(ForeignScanState *node, int eflags)
 	 FdwScanPrivateSelectSql));
 	fsstate->retrieved_attrs = (List *) list_nth(fsplan->fdw_private,
 			   FdwScanPrivateRetrievedAttrs);
+	fsstate->fetch_size = intVal(list_nth(fsplan->fdw_private,
+			  FdwScanPrivateFetchSize));
 
 	/* Create contexts for batches of tuples and per-tuple temp workspace. */
 	fsstate->batch_cxt = AllocSetContextCreate(estate->es_query_cxt,
@@ -2275,15 +2288,13 @@ fetch_more_data(ForeignScanState *node)
 	{
 		PGconn	   *conn = fsstate->conn;
 		char		sql[64];
-		int			fetch_size;
 		int			numrows;
 		int			i;
 
-		/* The fetch size is arbitrary, but shouldn't be enormous. */
-		fetch_size = 100;
-
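
For reference, the table-over-server precedence described above can be resolved
with a lookup along these lines (a sketch only, not necessarily the patch's exact
code; 100 is the previous hard-coded default):

#include "postgres.h"

#include "commands/defrem.h"
#include "foreign/foreign.h"

/* table-level fetch_size wins over the server-level one; fall back to 100 */
static int
get_fetch_size(ForeignTable *table, ForeignServer *server)
{
	ListCell   *lc;

	foreach(lc, table->options)
	{
		DefElem    *def = (DefElem *) lfirst(lc);

		if (strcmp(def->defname, "fetch_size") == 0)
			return atoi(defGetString(def));
	}

	foreach(lc, server->options)
	{
		DefElem    *def = (DefElem *) lfirst(lc);

		if (strcmp(def->defname, "fetch_size") == 0)
			return atoi(defGetString(def));
	}

	return 100;					/* the old hard-coded default */
}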

Re: [HACKERS] Additional role attributes && superuser review

2015-12-30 Thread Noah Misch
On Tue, Dec 29, 2015 at 08:35:50AM -0500, Stephen Frost wrote:
> * Noah Misch (n...@leadboat.com) wrote:

> The one argument which you've put forth for adding the complexity of
> dumping catalog ACLs is that we might reduce the number of default
> roles provided to the user.

Right.  If "GRANT EXECUTE ON FUNCTION pg_rotate_logfile() TO mydba" worked as
well as it works on user-defined functions, the community would not choose to
add a pg_rotate_logfile role holding just that one permission.

> I disagree that we would.  Having a single
> set of default roles which provide a sensible breakdown of permissions
> is a better approach than asking every administrator and application
> developer who is building tools on top of PG to try and work through
> what makes sense themselves, even if that means we have a default role
> with a small, or even only an individual, capability.

The proposed pg_replication role introduces abstraction that could, as you
hope, spare a DBA from studying sets of functions to grant together.  The
pg_rotate_logfile role, however, does not shield the DBA from complexity.
Being narrowly tied to a specific function, it's just a suboptimal spelling of
GRANT.  The gap in GRANT has distorted the design for these predefined roles.
I do not anticipate a sound design discussion about specific predefined roles
so long as the state of GRANT clouds the matter.

> > To summarize, I think the right next step is to resume designing pg_dump
> > support for system object ACLs.  I looked over your other two patches and 
> > will
> > unshelve those reviews when their time comes.
> 
> To be clear, I don't believe the two patches are particularly involved
> with each other and don't feel that one needs to wait for the other.

Patch 2/3 could stand without patch 3/3, but not vice-versa.  It's patch 2/3
that makes pg_dumpall skip ^pg_ roles, and that must be in place no later than
the first patch that adds a predefined ^pg_ role.

> Further, I'm not convinced that adding support for dumping ACLs or, in
> general, encouraging users to define their own ACLs on catalog objects
> is a good idea.  We certainly have no mechanism in place today for those
> ACLs to be respected by SysCache and encouraging their use when we won't
> actually respect them is likely to be confusing.

What's this problem with syscache?  It sounds important.

nm




Re: [HACKERS] dynloader.h missing in prebuilt package for Windows?

2015-12-30 Thread Bruce Momjian
On Wed, Dec 30, 2015 at 11:18:45PM -0300, Alvaro Herrera wrote:
> > This suggests that we perhaps should consider this for 9.5.0, and that
> > your hack will have to be active until 9.4 gets to end-of-life, or
> > perhaps 9.5 if we can't get this into 9.5.0.  People who are using 9.5
> > Beta or RC will still not have the file, meaning 9.5 end-of-life might
> > still be a requirement.
> 
> I think this should be fixed in 9.5.0 since you already have the patch --
> what's the point of waiting for 9.5.1?

True.  We are in RC mode now though, and this is a Windows build issue
which I cannot test.  I would need someone else to apply it after
testing.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +




Re: [HACKERS] Better detail logging for password auth failures

2015-12-30 Thread Noah Misch
On Wed, Dec 30, 2015 at 10:18:35AM -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-12-29 11:07:26 -0500, Tom Lane wrote:
> >> In passing, the patch gets rid of a vestigial CHECK_FOR_INTERRUPTS()
> >> call; it was added by e710b65c and IMO should have been removed again
> >> by 6647248e.  There's certainly no very good reason to have one right
> >> at that spot anymore.
> 
> > Why? Doesn't seem like the worst place for an explicit interrupt check?
> > I think we don't really have a problem with too many such checks... We
> > surely could move it, but I don't really see how it's related to the
> > topic at hand nor do I think it's really worth pondering about
> > extensively.

Agreed.

> The only reason there was one there at all was that e710b65c added
> code like this:
> 
> +   /*
> +* Disable immediate interrupts while doing database access.  (Note
> +* we don't bother to turn this back on if we hit one of the failure
> +* conditions, since we can expect we'll just exit right away anyway.)
> +*/
> +   ImmediateInterruptOK = false;
> 
> ... some catalog access here ...
> 
> +   /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
> +   ImmediateInterruptOK = true;
> +   /* And don't forget to detect one that already arrived */
> +   CHECK_FOR_INTERRUPTS();
> 
> In 6647248e you got rid of nine of these ten lines, leaving something
> that's both pointless and undocumented.  There are more than enough
> CHECK_FOR_INTERRUPTS calls already in the auth code; there's not a
> reason to expend code space on one here.  (If MD5 ran long enough to
> be worth interrupting, there would be an argument for a check inside
> its hashing loop, but that still wouldn't be this check.)

I see no general benefit from being parsimonious with CHECK_FOR_INTERRUPTS
calls or documenting them.  As you explain, it's probably fine to remove the
two calls that commit e710b65 had added.  However, the sole connection to
$SUBJECT is one of those two calls sharing a screenful with lines $SUBJECT
changed.  The removal, if worthwhile, is worth a freestanding patch.
Squashing the changes makes both topics harder to review.

nm




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Tom Lane
Andres Freund  writes:
> FWIW, the
>   if (sock == PGINVALID_SOCKET)
>   wakeEvents &= ~(WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE);
> block in both latch implementations looks like a problem waiting to happen.

You think it should throw an error instead?  Seems reasonable to me.

regards, tom lane




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
On 2015-12-30 19:38:23 +0200, Shay Rojansky wrote:
> > Hm. So that seems to indicate that, on windows, we're not properly
> > recognizing dead sockets in the latch code. Could you check, IIRC with
> > netstat or something like it, in what state the connections are?

> netstat shows the socket is in FIN_WAIT_2.


> > Any chance you could single-step through WaitLatchOrSocket() with a
> > debugger? Without additional information this is rather hard to
> > diagnose.
> >
> 
> Uh I sure can, but I have no idea what to look for :) Anything
> specific?

Things that'd be interesting:
1) what are the arguments passed to WaitLatchOrSocket(), most
   importantly wakeEvents and sock
2) are we busy looping, or is WaitForMultipleObjects() blocking
   endlessly
3) If you kill -9 (well, terminate in the task manager) a client, while
   stepping serverside in WaitLatchOrSocket, does
   WaitForMultipleObjects() return? If so, what paths are we taking?

Greetings,

Andres Freund




Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Oleksii Kliukin

> On 30 Dec 2015, at 18:38, Emre Hasegeli  wrote:
> 
>> which is much closer to the actual number of rows removed by the index
>> recheck + the one left.
> 
> Is it better to be closer?  We are saying those are the "actual"
> values not the estimates.  If we cannot provide the actual rows, I
> think it is better to provide nothing.  Something closer to the
> reality would create more confusion.  Maybe we should just return the
> number of blocks, and put somewhere a note about it.  The actual row
> count is already available on the upper part of the plan.

I don’t see how to solve this problem without changing the explain analyze output 
to accommodate an “unknown” value. I don’t think “0” is a non-confusing 
representation of “unknown” for most people, and from a practical standpoint, 
a “best effort” estimate is better than 0 (i.e. I will be able to estimate how 
efficient a BRIN index is for my tables in terms of the number of tuples 
retrieved/thrown away).

We might still reflect in the documentation that the BRIN index cannot produce 
the exact number of rows during the bitmap scan and point people asking similar 
questions there.






Re: --enable-depend by default (was Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex)

2015-12-30 Thread Andres Freund
On 2015-12-30 10:49:27 -0500, Tom Lane wrote:
> > On Wed, Dec 30, 2015 at 5:44 PM, Andres Freund  wrote:
> >> I still maintain that --enable-depend should be on by default. We're
> >> absurdly optimizing towards saving a handful of cycles in scenarios
> >> which are usually bottlenecked by other things (build boxes spend more
> >> times on tests and such)
>
> Nonsense.  The buildfarm will simply get slower, and at least for my
> animals it's too damn slow already.

I think you're overstating the performance impact of
--enable-depend. It's just an additional output file for the compiler,
given the way we're doing it (-MMD -MP -MF $outfile). Let's make a quick test.

I've built both trees once before, to prime ccache. The test is:
for i in $(seq 1 5); do git clean -dfxq && ./configure --(dis|en)able-depend 
--quiet && time make -j4 -s;done

(coffee break)

no ccache, --disable-depend: 0m55.810s, 0m58.361s, 0m58.517s, 0m58.674s, 0m58.466s
ccache,    --disable-depend: 0m5.248s,  0m5.279s,  0m5.244s,  0m5.771s,  0m5.296s
no ccache, --enable-depend:  0m56.443s, 0m58.507s, 0m58.587s, 0m58.866s, 0m58.429s
ccache,    --enable-depend:  0m5.538s,  0m5.518s,  0m5.572s,  0m5.555s,  0m5.528s


Yes, this is on a much faster machine (my laptop) than what some
buildfarm animals are running. But given it's really just some
additional, *small*, files that are written (compiler) and read (make),
I don't see why the impact would be significantly different on slower
machines.


> We could fix that by having the
> buildfarm script automatically add --disable-depend; but if we're going
> to change this, we should roll out that script update BEFORE we change
> it, not after.

Sounds reasonable, despite the above. This isn't an urgent change, just
something to make newer developers waste less time.




Re: [HACKERS] --enable-depend by default (was Re: Patch: fix lock contention for HASHHDR.mutex)

2015-12-30 Thread Tom Lane
Greg Stark  writes:
> Alternately the buildfarm could be changed to retain earlier builds and use
> dependencies to reduce build times. This would have the benefit of
> detecting dependency failures. At the expense of a lot more noise, at least
> until we iron out whatever dependency failures are there now.

The buildfarm already made its choice in this area, and that's ccache.
Layering --enable-depend on top of builds that are using ccache is
simply a waste.

(Admittedly, ccache is gcc-specific, but TTBOTMK so is --enable-depend.)

regards, tom lane




Re: [HACKERS] --enable-depend by default (was Re: Patch: fix lock contention for HASHHDR.mutex)

2015-12-30 Thread Andres Freund
On 2015-12-30 11:13:31 -0500, Tom Lane wrote:
> The buildfarm already made its choice in this area, and that's ccache.
> Layering --enable-depend on top of builds that are using ccache is
> simply a waste.
> 
> (Admittedly, ccache is gcc-specific, but TTBOTMK so is --enable-depend.)

For posterity's sake: clang supports the dependency stuff by default,
and it's harder but doable to get clang running with ccache.




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
Hi,

On 2015-12-30 19:01:10 +0200, Shay Rojansky wrote:
> OK, I finally found some time to dive into this.
> 
> The backends seem to hang when the client closes a socket without first
> sending a Terminate message - some of the tests make this happen. I've
> confirmed this happens with 9.5rc1 running on Windows (versions 10 and 7),
> but this does not occur on Ubuntu 15.10. The client runs on Windows as well
> (although I doubt that's important).

Hm. So that seems to indicate that, on windows, we're not properly
recognizing dead sockets in the latch code. Could you check, IIRC with
netstat or something like it, in what state the connections are?

Any chance you could single-step through WaitLatchOrSocket() with a
debugger? Without additional information this is rather hard to
diagnose.

> On Wed, Dec 30, 2015 at 5:32 AM, Amit Kapila 
> wrote:
> > What procedure do you use to kill backends?  Normally, if we kill
> > via task manager using "End Process", it is considered as backend
> > crash and the server gets restarted and all other backends got
> > disconnected.

Unless I miss something major here the problem is clients disconnecting
and leaving backends hanging. The killing of backends only comes into
play after that's already the case.

Regards,

Andres




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Tom Lane
Shay Rojansky  writes:
> The backends seem to hang when the client closes a socket without first
> sending a Terminate message - some of the tests make this happen. I've
> confirmed this happens with 9.5rc1 running on Windows (versions 10 and 7),
> but this does not occur on Ubuntu 15.10.

Nor OS X.  Ugh.  My first thought was that ac1d7945f broke this, but
that's only in HEAD not 9.5, so some earlier change must be responsible.

regards, tom lane




Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Tom Lane
Emre Hasegeli  writes:
>> which is much closer to the actual number of rows removed by the index
>> recheck + the one left.

> Is it better to be closer?  We are saying those are the "actual"
> values not the estimates.  If we cannot provide the actual rows, I
> think it is better to provide nothing.

Return zero you mean?  Good point; there's precedent for that elsewhere
in EXPLAIN ANALYZE, IIRC.  If we do what I propose in my patch, it would
possibly just lead to more questions like Oleksii's, because people would
be even more likely to believe that the number is accurate.

regards, tom lane




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
On 2015-12-30 12:41:56 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > FWIW, the
> > if (sock == PGINVALID_SOCKET)
> > wakeEvents &= ~(WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE);
> > block in both latch implementations looks like a problem waiting to happen.
> 
> You think it should throw an error instead?  Seems reasonable to me.

Yea. Error or maybe just an assert. That path seems to always indicate
something having gone wrong.
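
Concretely, the block at the top of WaitLatchOrSocket() could become something
like this (sketch only, nothing committed):

	if (sock == PGINVALID_SOCKET &&
		(wakeEvents & (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE)))
		elog(ERROR, "cannot wait on socket readiness without a valid socket");

or, if an assertion is considered enough:

	Assert(sock != PGINVALID_SOCKET ||
		   (wakeEvents & (WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE)) == 0);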




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Tom Lane
Andres Freund  writes:
> On 2015-12-30 12:30:43 -0500, Tom Lane wrote:
>> Nor OS X.  Ugh.  My first thought was that ac1d7945f broke this, but
>> that's only in HEAD not 9.5, so some earlier change must be responsible.

> The backtrace in
> http://archives.postgresql.org/message-id/CADT4RqBo79_0Vx%3D-%2By%3DnFv3zdnm_-CgGzbtSv9LhxrFEoYMVFg%40mail.gmail.com
> seems to indicate that it's really WaitLatchOrSocket() not noticing the
> socket is closed.

Right, and what I was wondering was whether adding the additional wait-for
condition had exposed some pre-existing flaw in the Windows latch code.
But that's not it, so we're left with the conclusion that we broke
something that used to work.

Are we sure this is a 9.5-only bug?  Shay, can you try 9.4 branch tip
and see if it misbehaves?  Can anyone else reproduce the problem?

regards, tom lane




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
On 2015-12-30 12:50:58 -0500, Tom Lane wrote:
> Right, and what I was wondering was whether adding the additional wait-for
> condition had exposed some pre-existing flaw in the Windows latch code.
> But that's not it, so we're left with the conclusion that we broke
> something that used to work.

4bad60e is another suspect. Besides wondering why I moved the FD_CLOSE
case out of the existing if cases, I don't see anything suspicious
though.  If we were hitting the write-path here, it'd be plausible that
we're hitting an issue with FD_CLOSE and waiting for writability; but
we're not.




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
On 2015-12-30 20:12:07 +0200, Shay Rojansky wrote:
> >
> > Is this in a backend with ssl?
> >
> 
> No.

There goes that theory. Amongst others. The aforementioned problem with
waitfor doesn't seem to be actually armed because waitfor is only used
if errno == EWOULDBLOCK || errno == EAGAIN.

> If you go up one frame, what value does port->sock have?

> For some reason VS is telling me "Unable to read memory" on port->sock... I
> have no idea why that is...

Hm. Is this with a self compiled postgres? If so, is it with assertions
enabled?

Greetings,

Andres Freund




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
On 2015-12-30 20:18:52 +0200, Shay Rojansky wrote:
> Tom's probably right about the optimized code. I could try compiling a
> debug version..

Seems to be the next step, unfortunately. Sorry.




Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Oleksii Kliukin

> On 30 Dec 2015, at 17:02, Tom Lane  wrote:
> 
> Oleksii Kliukin  writes:
>> Bitmap Heap Scan on example  (cost=744.44..757.64 rows=6 width=0) (actual 
>> time=73.895..73.895 rows=0 loops=1)
>>   Output: 1
>>   Recheck Cond: (example.event_time = (now() - '5 mons'::interval))
>>   Rows Removed by Index Recheck: 4030
>>   Heap Blocks: lossy=128
>>   Buffers: shared hit=629
>>   ->  Bitmap Index Scan on example_event_time_idx1  (cost=0.00..744.41 
>> rows=6 width=0) (actual time=70.335..70.335 rows=1280 loops=1)
>> Index Cond: (example.event_time = (now() - '5 mons'::interval))
>> Buffers: shared hit=501
> 
>> - how does it get 1280 rows from the BRIN index scan, given that BRIN only 
>> stores pointers to the heap blocks, not individual rows. Does it calculate 
>> the number of rows in the blocks returned?
> 
> It evidently returned 128 block IDs to the heapscan logic.  I have not
> looked at the code, but a reasonable bet is that it's just guessing that
> there are 10 rows per block.

You are right, this is at the end of bringetbitmap in brin.c
	/*
	 * XXX We have an approximation of the number of *pages* that our scan
	 * returns, but we don't have a precise idea of the number of heap tuples
	 * involved.
	 */
	PG_RETURN_INT64(totalpages * 10);


> 
> That seems like an awfully low number, though; it equates to assuming
> that rows are 800 bytes wide on average.  If we're going to use a fixed
> number, 100 rows per block would probably be more nearly the correct
> order of magnitude.
> 
> Another idea would be to use the heap's row density as calculated
> by the last ANALYZE (ie, reltuples/relpages), with a fallback to 100
> if relpages=0.  This'd only be convenient if the bitmap scan node has
> the parent heap rel open, which it might not.

+1

Kind regards,
--
Oleksii



Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-30 Thread Robert Haas
On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
 wrote:
> I didn't check out earlier versions of this patch, but the latest one still
> changes pg_size_pretty() to emit PB suffix.
>
> I don't think it is worth it to throw a number of changes together like
> that.  We should focus on adding pg_size_bytes() first and make it
> compatible with both pg_size_pretty() and existing GUC units: that is
> support suffixes up to TB and make sure they have the meaning of powers of
> 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
>
> Next, we could think about adding handling of PB suffix on input and output,
> but I don't see a big problem if that is emitted as 1024TB or the user has
> to specify it as 1024TB in a GUC or argument to pg_size_bytes(): a minor
> inconvenience only.

+1 to everything in this email.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Tom Lane
Oleksii Kliukin  writes:
>> On 30 Dec 2015, at 17:02, Tom Lane  wrote:
>> Another idea would be to use the heap's row density as calculated
>> by the last ANALYZE (ie, reltuples/relpages), with a fallback to 100
>> if relpages=0.  This'd only be convenient if the bitmap scan node has
>> the parent heap rel open, which it might not.

> +1

Any objections to the attached?

regards, tom lane

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index 2622a7e..64bf788 100644
*** a/src/backend/access/brin/brin.c
--- b/src/backend/access/brin/brin.c
*** bringetbitmap(PG_FUNCTION_ARGS)
*** 279,286 
  	Relation	heapRel;
  	BrinOpaque *opaque;
  	BlockNumber nblocks;
  	BlockNumber heapBlk;
! 	int			totalpages = 0;
  	FmgrInfo   *consistentFn;
  	MemoryContext oldcxt;
  	MemoryContext perRangeCxt;
--- 279,287 
  	Relation	heapRel;
  	BrinOpaque *opaque;
  	BlockNumber nblocks;
+ 	int			heap_tuples_per_page;
  	BlockNumber heapBlk;
! 	int64		totalpages = 0;
  	FmgrInfo   *consistentFn;
  	MemoryContext oldcxt;
  	MemoryContext perRangeCxt;
*** bringetbitmap(PG_FUNCTION_ARGS)
*** 291,301 
  
  	/*
  	 * We need to know the size of the table so that we know how long to
! 	 * iterate on the revmap.
  	 */
  	heapOid = IndexGetRelation(RelationGetRelid(idxRel), false);
  	heapRel = heap_open(heapOid, AccessShareLock);
  	nblocks = RelationGetNumberOfBlocks(heapRel);
  	heap_close(heapRel, AccessShareLock);
  
  	/*
--- 292,309 
  
  	/*
  	 * We need to know the size of the table so that we know how long to
! 	 * iterate on the revmap.  While we have it open, estimate the number of
! 	 * tuples per heap page for use later.
  	 */
  	heapOid = IndexGetRelation(RelationGetRelid(idxRel), false);
  	heapRel = heap_open(heapOid, AccessShareLock);
  	nblocks = RelationGetNumberOfBlocks(heapRel);
+ 	if (heapRel->rd_rel->relpages != 0 && heapRel->rd_rel->reltuples > 0)
+ 		heap_tuples_per_page = (int)
+ 			((double) heapRel->rd_rel->reltuples /
+ 			 (BlockNumber) heapRel->rd_rel->relpages);
+ 	else	/* if no info, assume 100-byte tuples */
+ 		heap_tuples_per_page = BLCKSZ / 100;
  	heap_close(heapRel, AccessShareLock);
  
  	/*
*** bringetbitmap(PG_FUNCTION_ARGS)
*** 447,457 
  		ReleaseBuffer(buf);
  
  	/*
! 	 * XXX We have an approximation of the number of *pages* that our scan
! 	 * returns, but we don't have a precise idea of the number of heap tuples
! 	 * involved.
  	 */
! 	PG_RETURN_INT64(totalpages * 10);
  }
  
  /*
--- 455,465 
  		ReleaseBuffer(buf);
  
  	/*
! 	 * We have an approximation of the number of pages that our scan returns,
! 	 * but we don't have a precise idea of the number of heap tuples involved.
! 	 * We have to estimate based on average tuple density.
  	 */
! 	PG_RETURN_INT64(totalpages * heap_tuples_per_page);
  }
  
  /*



Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Emre Hasegeli
> which is much closer to the actual number of rows removed by the index
> recheck + the one left.

Is it better to be closer?  We are saying those are the "actual"
values not the estimates.  If we cannot provide the actual rows, I
think it is better to provide nothing.  Something closer to the
reality would create more confusion.  Maybe we should just return the
number of blocks, and put somewhere a note about it.  The actual row
count is already available on the upper part of the plan.

By the way, the estimation is a bigger problem than that.  Please see
my patch [1] about it.

[1] 
http://www.postgresql.org/message-id/cae2gyzzjvzpy-1csgzjjyh69izsa13segfc4i4r2z0qbq2p...@mail.gmail.com




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> > The backends seem to hang when the client closes a socket without first
> > sending a Terminate message - some of the tests make this happen. I've
> > confirmed this happens with 9.5rc1 running on Windows (versions 10 and
> 7),
> > but this does not occur on Ubuntu 15.10. The client runs on Windows as
> well
> > (although I doubt that's important).
>
> Hm. So that seems to indicate that, on windows, we're not properly
> recognizing dead sockets in the latch code. Could you check, IIRC with
> netstat or something like it, in what state the connections are?
>

netstat shows the socket is in FIN_WAIT_2.


> Any chance you could single-step through WaitLatchOrSocket() with a
> debugger? Without additional information this is rather hard to
> diagnose.
>

Uh I sure can, but I have no idea what to look for :) Anything specific?


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> Hm. Is this with a self compiled postgres? If so, is it with assertions
> enabled?
>

No, it's just the EnterpriseDB 9.5rc1 installer...

Tom's probably right about the optimized code. I could try compiling a
debug version..


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Tom Lane
Andres Freund  writes:
> On 2015-12-30 19:54:19 +0200, Shay Rojansky wrote:
>> wakeEvents is 8387808 and so is sock.

> Hm. That seems like an extremely weird value.

Probably just means the debugger is confused by optimized code.

> I think it's indicative of
> a bug in 80788a431e. Specifically secure_read/write's waitfor isn't
> necessarily set in the ssl case. Importantly not in the
> SSL_ERROR_SYSCALL case, which we're likely hitting here.

Hmm, that does look bogus ... and the Assert wouldn't fire in a non-assert
build.  But waitfor would be zero if that was what was happening.

> Is this in a backend with ssl?

I thought we'd eliminated SSL already, or have I lost track of the
bidding?  But I suspect you've identified a different bug.

regards, tom lane




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
Hi,

On 2015-12-30 13:17:47 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > On 2015-12-30 19:54:19 +0200, Shay Rojansky wrote:
> >> wakeEvents is 8387808 and so is sock.
> 
> > Hm. That seems like an extremely weird value.
> 
> Probably just means the debugger is confused by optimized code.

Yea.


> > I think it's indicative of
> > a bug in 80788a431e. Specifically secure_read/write's waitfor isn't
> > necessarily set in the ssl case. Importantly not in the
> > SSL_ERROR_SYSCALL case, which we're likely hitting here.
> 
> Hmm, that does look bogus ... and the Assert wouldn't fire in a non-assert
> build.

After a bit of thinking it seems this can't actually happen because
waitfor is only used if errno is EWOULDBLOCK/EAGAIN. And all the other
branches allowing for n < 0 set errno to a different value.  Not
particularly obvious.
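
The flow in question looks roughly like this (a simplified sketch for readers of
the thread, not the exact be-secure.c code, and it ignores the latch reset and
interrupt handling around the wait):

#include "postgres.h"

#include "libpq/libpq-be.h"
#include "miscadmin.h"
#include "storage/latch.h"

static ssize_t
sketch_secure_read(Port *port, void *ptr, size_t len)
{
	ssize_t		n;
	int			waitfor = 0;	/* may stay unset on e.g. SSL_ERROR_SYSCALL */

retry:
	n = be_tls_read(port, ptr, len, &waitfor);

	/* waitfor is consulted only on these errnos, which the retryable paths set */
	if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
	{
		Assert(waitfor != 0);	/* would not fire in a non-assert build */
		WaitLatchOrSocket(MyLatch, WL_LATCH_SET | waitfor, port->sock, 0);
		goto retry;
	}

	return n;
}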


> > Is this in a backend with ssl?
> 
> I thought we'd eliminated SSL already, or have I lost track of the
> bidding?

We know a very similar problem happens without ssl. But we could have
ended up two different bugs...


Andres




[HACKERS] custom parameters cannot be removed once set

2015-12-30 Thread Joe Conway
Today I was reminded of an issue I have run across before, namely that
in a given postgres session, once a custom parameter has been set, there
is no way to remove it entirely. For example:

8<---
# psql regression
psql (9.5rc1)
Type "help" for help.

regression=# SELECT current_setting('app_name.app_user');
ERROR:  unrecognized configuration parameter "app_name.app_user"
regression=# BEGIN;
BEGIN
regression=# SET LOCAL app_name.app_user = 'bob';
SET
regression=# SELECT current_setting('app_name.app_user');
 current_setting
-----------------
 bob
(1 row)

regression=# ROLLBACK;
ROLLBACK
regression=# SELECT current_setting('app_name.app_user');
 current_setting
-----------------

(1 row)

regression=# RESET app_name.app_user;
RESET
regression=# SELECT current_setting('app_name.app_user');
 current_setting
-----------------

(1 row)

regression=# SELECT current_setting('app_name.app_user') = '';
 ?column?
----------
 t
(1 row)
8<---

Note that before app_name.app_user has been set the first time, an error
is thrown if we try to access it. However once it has been set, even
when done as SET LOCAL and inside a rolled back transaction, the
parameter continues to exist and no error is thrown when reading it. And
it is not even NULL, it is actually an empty string.

This strikes me as, at least, surprising, and possibly should be
considered a bug. Thoughts?

A side issue is that it would be nice if there were a way to check for a
custom parameter value without getting an error if it does not exist.
There is a missing_ok option to GetConfigOptionByName(), but we
currently don't expose it from SQL. I'd like to add a variant of
current_setting() with a second argument for missing_ok. Objections?
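
A sketch of what that variant could look like (illustrative only -- the function
name and the behavior for the missing case are up for discussion; it simply passes
the second argument through as missing_ok and returns NULL when the parameter is
unknown):

#include "postgres.h"

#include "fmgr.h"
#include "utils/builtins.h"
#include "utils/guc.h"

PG_FUNCTION_INFO_V1(current_setting_missing_ok);

Datum
current_setting_missing_ok(PG_FUNCTION_ARGS)
{
	char	   *name = TextDatumGetCString(PG_GETARG_TEXT_PP(0));
	bool		missing_ok = PG_GETARG_BOOL(1);
	char	   *value;

	/* with missing_ok = true this returns NULL instead of erroring out */
	value = GetConfigOptionByName(name, NULL, missing_ok);
	if (value == NULL)
		PG_RETURN_NULL();

	PG_RETURN_TEXT_P(cstring_to_text(value));
}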

Thanks,

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development





Re: [HACKERS] More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

2015-12-30 Thread Evgeniy Shishkin

> On 30 Dec 2015, at 10:16, David Rowley  wrote:
> 
> Hi,
> 
> On [1] I suggested an idea to make improvements to the planner around the 
> Equivalence Class code. Later in [2] Tom raised concerns with this adding too 
> many planning cycles for a perhaps not common enough situation.  I don't want 
> to discuss that particular patch here, I want to discuss more generally about 
> the dilemma about adding more smarts to the planner to allow it to generate a 
> more optimal plan in order to save on execution time.
> 
> In the case of the Equivalence Class Filters code, I quoted an example where 
> pushing these filters down into the joined relation caused a significant 
> performance improvement to a query. Now, I understand Tom's concerns with 
> slowing down the planner, as in cases where the query is short running, or 
> the optimisations don't apply, then we could cause the query to overall 
> (including planning time) perform worse. Nobody wants that, but on the other 
> hand, if spending 5-10 extra microseconds during planning equates to 6 hours 
> shaved off execution time, then nobody would think to grudge that extra 5-10 
> microseconds during planning.
> 
> What I'd like to discuss here is what was touched on on that other thread on 
> ways to get around this problem:
> 
> A number of ideas were suggested on the other thread about how we might go 
> about solving this problem. In [3] Simon talked about perhaps enabling extra 
> optimisations when the planner sees that the plan will cost more than some 
> given threshold. That's perhaps an option, but may not work well for 
> optimisations which must take place very early in planning, for example [4].
> Another idea which came up was from Evgeniy [5], which was more of a request 
> not to do it this way, but never-the-less, the idea was basically to add lots 
> of GUCs to enable/disable each extra planner feature.
> 

Well, my idea was to track planning/execution cost in something like 
pg_stat_statements.
That way we can track actual time, not estimated cost like Simon proposed.

This table can be combined with Tomas's proposal of plan caching.


> Another option which I've thought about previously was a planner_strength 
> GUC, at which various additional optimisations are enabled at various 
> predefined strength levels, so that databases which tend to spend a great 
> deal more execution time compared to planning time can turn this up a bit to 
> see if that helps change that ratio a bit.  This idea is far from perfect 
> though, as who's to say that planner feature X should "kick in" before 
> planner feature Y? I've also often thought that it might be nice to have it 
> so the planner does not modify the Parse object, so that the planner has the 
> option to throw away what it's done so far and start planning all over again 
> with the "planner_strength" knob turned up to the maximum, if the cost 
> happened to indicate that the query was going to take a long time to execute.
> 
> In reality we already have some planner features which are possible 
> candidates for non essential optimisations. For example join removals likely 
> don't apply in all that many cases, but when they do, this feature is a great 
> win. So by having some sort of ability to enable/disable planner features we 
> also stand to actually speed the planner up for fast simple queries. 
> 
> I do strongly believe that we need to come up with something to solve this 
> problem. I already summarised my thoughts on the other thread.
> 
> I wrote:
> > I believe that with parallel query on the horizon for 9.6 that we're now
> > aiming to support bigger OLAP type database than ever before. So if we
> > ignore patches like this one then it appears that we have some conflicting
> > goals in the community as it seems that we're willing to add the brawn, but
> > we're not willing to add the brain. If this is the case then it's a shame,
> > as I think we can have both. So I very much agree on the fact that we must
> > find a way to maintain support and high performance of small OLTP databases
> > too.
> 
> So here I'd very much like to kick off discussion on an acceptable way to 
> solve this problem, in a realistic way which we're all happy with.
> 
> Comments are of course welcome.
> 
> [1] 
> http://www.postgresql.org/message-id/cakjs1f9fk_x_5hkcpcseimy16owe3empmmgsgwlckkj_rw9...@mail.gmail.com
> [2] http://www.postgresql.org/message-id/30810.1449335...@sss.pgh.pa.us
> [3] 
> http://www.postgresql.org/message-id/canp8+jlrprn4ynmsrkoqhyi-dw5jrodmot05qejhrayrsex...@mail.gmail.com
> [4] 
> http://www.postgresql.org/message-id/cakjs1f_uz_mxtpot6epxsghsujoucrkuxyhlh06h072rdxs...@mail.gmail.com
> [5] 
> http://www.postgresql.org/message-id/2f30ba8b-dab9-4907-9e4e-102d24256...@gmail.com
> 
> -- 
>  David Rowley   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
OK, I finally found some time to dive into this.

The backends seem to hang when the client closes a socket without first
sending a Terminate message - some of the tests make this happen. I've
confirmed this happens with 9.5rc1 running on Windows (versions 10 and 7),
but this does not occur on Ubuntu 15.10. The client runs on Windows as well
(although I doubt that's important).

In case it helps, here's a gist with some .NET code that uses Npgsql 3.0.4
to reproduce this.

If there's anything else I can do please let me know.

Shay

On Wed, Dec 30, 2015 at 5:32 AM, Amit Kapila 
wrote:

>
>
> On Tue, Dec 29, 2015 at 7:04 PM, Shay Rojansky  wrote:
>
>> Could you describe the workload a bit more? Is this rather concurrent? Do
>>> you use optimized or debug builds? How long did you wait for the
>>> backends to die? Is this all over localhost, external ip but local,
>>> remotely?
>>>
>>
>> The workload is a rather diverse set of integration tests executed with
>> Npgsql. There's no concurrency whatsoever - tests are executed serially.
>> The backends stay alive indefinitely, until they are killed. All this is
>> over localhost with TCP. I can try other scenarios if that'll help.
>>
>>
>
> What procedure do you use to kill backends?  Normally, if we kill
> via task manager using "End Process", it is considered as backend
> crash and the server gets restarted and all other backends got
> disconnected.
>
>
>> > Note that the number of backends that stay stuck after the tests is
>>> > constant (always 12).
>>>
>>> Can you increase the number of backends used in the test? And check
>>> whether it's still 12?
>>>
>>
>> Well, I ran the testsuite twice in parallel, and got... 23 backends stuck
>> at the end.
>>
>>
>>> How are your clients disconnecting? Possibly without properly
>>> disconnecting?
>>>
>>
>> That's possible, definitely in some of the test cases.
>>
>> What I can do is try to isolate things further by playing around with the
>> tests and trying to see if a more minimal repro can be done - I'll try
>> doing this later today or tomorrow. If anyone has any other specific tests
>> or checks I should do let me know.
>>
>
> I think first we should try to isolate whether the hung backends
> are due to the reason that they are not disconnected properly or
> there is some other factor involved as well, so you can try to kill/
> disconnect the sessions connected via psql in the same way as
> you are doing for connections with Npgsql and see if you can
> reproduce the same behaviour.
>
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>


Re: [HACKERS] Tsvector editing functions

2015-12-30 Thread Stas Kelvich
Hi, Tomáš! Thanks for the comprehensive review.

> On 15 Dec 2015, at 06:07, Tomas Vondra  wrote:
> 
> 1) It's a bit difficult to judge the usefulness of the API, as I've
>   always been a mere user of full-text search, and I never had a need
>   (or courage) to mess with the tsvectors. OTOH I don't see a good
>   reason no to have such API, when there's a need for it.
> 
>   The API seems to be reasonably complete, with one exception - when
>   looking at editing function we have for 'hstore', we do have these
>   variants for delete()
> 
>  delete(hstore,text)
>  delete(hstore,text[])
>  delete(hstore,hstore)
> 
>   while this patch only adds delete(tsvector,text). Would it make
>   sense to add variants similar to hstore? It probably does not make
>   much sense to add delete(tsvector,tsvector), right? But being able
>   to delete a bunch of lexemes in one go seems like a good thing.
> 
>   What do you think?

That’s a good idea, and deleting a tsvector from a tsvector actually makes perfect 
sense. In the delete function I used an exact string match between the string and 
the lexemes in the tsvector, but if somebody wants to delete, for example, “Cats” 
from a tsvector, they should downcase and singularize this word first. The easiest 
way to do that is to just use the to_tsvector() function. We can also use this 
function to delete specific positions, e.g. delete('cat:3 fat:2,4'::tsvector, 
'fat:2'::tsvector) -> 'cat:3 fat:4'::tsvector.

So in the attached patch I’ve implemented the following:

delete(tsin tsvector, lexarr text[]) — remove any occurrence of the lexemes in 
lexarr from tsin

delete(tsin tsvector, tsv_filter tsvector) — delete lexemes and/or positions of 
tsv_filter from tsin. When a lexeme in tsv_filter has no positions, the function 
will delete any occurrence of the same lexeme in tsin. When a tsv_filter lexeme 
has positions, the function will delete them from the positions of the matching 
lexeme in tsin. If the resulting set of positions is empty after such removal, 
the function will delete that lexeme from the resulting tsvector.

Also, if we want some level of completeness of the API, and taking into account 
that the concat() function shifts the positions of its second argument, I thought 
it could be useful to add a function that shifts all positions by a specific value. 
This helps to undo a concatenation: delete one of the concatenated tsvectors and 
then shift the positions in the resulting tsvector. So I also wrote another small 
function:

shift(tsin tsvector, offset int16) — shift all positions in tsin by the given offset

> 
> 
> 2) tsvector_op.c needs a bit of love, to eliminate the two warnings it
>   currently triggers:
> 
>tsvector_op.c:211:2: warning: ISO C90 forbids mixed ...
>tsvector_op.c:635:9: warning: variable ‘lexeme2copy’ set but …
> 

fixed

> 3) the patch also touches tsvector_setweight(), only to do change:
> 
>  elog(ERROR, "unrecognized weight: %d", cw);
> 
>   to
> 
>  elog(ERROR, "unrecognized weight: %c", cw);
> 
>   That should probably go get committed separately, as a bugfix.
> 

Okay, I’ll submit that as a separate patch.

> 
> 4) I find it rather annoying that there are pretty much no comments in
>   the code. Granted, there are pretty much no comments in the
>   surrounding code, but I doubt that's a good reason for not having
>   any comments in new code. It makes reviews unnecessarily difficult.
> 

Fixed, I think.

> 
> 5) tsvector_concat() is not mentioned in docs at all
> 

Concat is mentioned in the docs as the || operator.

> 
> 6) Docs don't mention names of the new parameters in function
>   signatures, just data types. The functions with a single parameter
>   probably don't need to do that, but multi-parameter ones should.
> 

Fixed.

> 7) Some of the functions use an indexterm that does not match the function
>   name. I see two such cases - to_tsvector and setweight. Is there a
>   reason for that?
> 

Because the SGML compiler wants a unique indexterm. Both functions that you 
mentioned overload their arguments and have non-unique names.




tsvector_ops.diff
Description: Binary data


---
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
On 2015-12-30 19:54:19 +0200, Shay Rojansky wrote:
> >
> > Things that'd be interesting:
> > 1) what are the arguments passed to WaitLatchOrSocket(), most
> >importantly wakeEvents and sock
> >
> 
> wakeEvents is 8387808 and so is sock.

Hm. That seems like an extremely weird value. I think it's indicative of
a bug in 80788a431e. Specifically secure_read/write's waitfor isn't
necessarily set in the ssl case. Importantly not in the
SSL_ERROR_SYSCALL case, which we're likely hitting here.

Is this in a backend with ssl?

If you go up one frame, what value does port->sock have?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Tom Lane
Oleksii Kliukin  writes:
>  Bitmap Heap Scan on example  (cost=744.44..757.64 rows=6 width=0) (actual 
> time=73.895..73.895 rows=0 loops=1)
>Output: 1
>Recheck Cond: (example.event_time = (now() - '5 mons'::interval))
>Rows Removed by Index Recheck: 4030
>Heap Blocks: lossy=128
>Buffers: shared hit=629
>->  Bitmap Index Scan on example_event_time_idx1  (cost=0.00..744.41 
> rows=6 width=0) (actual time=70.335..70.335 rows=1280 loops=1)
>  Index Cond: (example.event_time = (now() - '5 mons'::interval))
>  Buffers: shared hit=501

> - how does it get 1280 rows from the BRIN index scan, given that BRIN only 
> stores pointers to the heap blocks, not individual rows. Does it calculate 
> the number of rows in the blocks returned?

It evidently returned 128 block IDs to the heapscan logic.  I have not
looked at the code, but a reasonable bet is that it's just guessing that
there are 10 rows per block.

That seems like an awfully low number, though; it equates to assuming
that rows are 800 bytes wide on average.  If we're going to use a fixed
number, 100 rows per block would probably be more nearly the correct
order of magnitude.

Another idea would be to use the heap's row density as calculated
by the last ANALYZE (ie, reltuples/relpages), with a fallback to 100
if relpages=0.  This'd only be convenient if the bitmap scan node has
the parent heap rel open, which it might not.
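
For reference, that per-table density can be checked by hand from the catalog
(just a sanity check against the example table upthread, not part of any patch):

SELECT reltuples, relpages,
       reltuples / NULLIF(relpages, 0) AS rows_per_page
FROM pg_class WHERE relname = 'example';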

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: --enable-depend by default (was Re: Patch: fix lock contention for HASHHDR.mutex)

2015-12-30 Thread Greg Stark
Alternately the buildfarm could be changed to retain earlier builds and use
dependencies to reduce build times. This would have the benefit of
detecting dependency failures, at the expense of a lot more noise, at least
until we iron out whatever dependency failures are there now.

Or we could add extra rules so that if you don't configure with
dependencies we check for unclean builds and error out. This would make at
least accidental unclean builds much less likely.
On 30 Dec 2015 16:49, "Tom Lane"  wrote:

> [ A change as significant as this should not be debated in a thread with
> a title that suggests it's of interest to only one or two people ]
>
> > On Wed, Dec 30, 2015 at 5:44 PM, Andres Freund 
> wrote:
> >> I still maintain that --enable-depend should be on by default. We're
> >> absurdly optimizing towards saving a handful of cycles in scenarios
> >> which are usually bottlenecked by other things (build boxes spend more
> >> times on tests and such)
>
> Nonsense.  The buildfarm will simply get slower, and at least for my
> animals it's too damn slow already.  We could fix that by having the
> buildfarm script automatically add --disable-depend; but if we're going
> to change this, we should roll out that script update BEFORE we change
> it, not after.
>
> regards, tom lane
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Additional role attributes && superuser review

2015-12-30 Thread Robert Haas
On Tue, Dec 29, 2015 at 5:35 AM, Stephen Frost  wrote:
> * Noah Misch (n...@leadboat.com) wrote:
>> > Updated patch attached.  I'll give it another good look and then commit
>> > it, barring objections.
>>
>> This thread and its satellite[1] have worked their way through a few designs.
>> At first, it was adding role attributes, alongside existing attributes like
>> REPLICATION and BYPASSRLS.  It switched[2] to making pg_dump preserve ACLs on
>> system objects.  Built-in roles joined[3] the pg_dump work to offer 
>> predefined
>> collections of ACL grants.  Finally, it dropped[4] the pg_dump side and
>> hard-coded the roles into the features they govern.
>
> Correct, after quite a bit of discussion and the conclusion that, while
> pg_dump support for dumping ACLs might be interesting, it was quite a
> bit more complex an approach than this use-case justified.

Hmm.  I don't think I agree with that conclusion.  Who were the
participants in that discussion, and how many people spoke in favor
of moving on from that proposal - which I rather liked - to what
you've got now?  Do you have links to the relevant portion of the
relevant thread?

I think it's not a very good thing if we add roles that allow, say,
execution of exactly one SQL function.  The
dump-grants-on-system-objects proposal would accomplish the same thing
in a much more flexible way, and with less catalog clutter.  More
broadly, I'm not very convinced that even the roles which allow for
rights on multiple objects are going to meet with general approval.
There has been enough discussion of which roles should be created and
which things should be included in each one, and the overall tenor of
that discussion seems to be that different people have different
ideas.  Under those circumstances, it seems very dubious to proceed
with this.  Michael seems to think that we can go ahead and start
changing things and sort out whatever is broken later, but that
doesn't sound like a very good plan to me.  We can change the
internals of a bad implementation later and replace it with a good
implementation, but changing user-visible behavior once people have
started relying on it is a lot harder.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Rationalizing Query.withCheckOptions

2015-12-30 Thread Tom Lane
The RLS patch added this to struct Query:

List  *withCheckOptions;/* a list of WithCheckOption's, which
 * are only added during rewrite and
 * therefore are not written out as
 * part of Query. */

As per the comment, this field is ignored by outfuncs.c and readfuncs.c.
That's pretty horrid, because:

1. Omitting the field from the output of pprint() seriously impedes
debugging.  I could probably have wasted significantly fewer hours
yesterday before figuring out fd195257 if pprint() hadn't been lying
to me about what was in the query data structures.

2. At some point, this will break parallel queries, or perhaps just
cause them to silently fail to enforce RLS, because we depend on
outfuncs/readfuncs to transfer node trees to parallel workers.
(I think there's no live bug today because we only transfer Plans
not Queries, but surely this is a loaded foot-gun.)

3. A node type as fundamental as Query ought not have weird gotchas
like this.

Furthermore, as far as I can see there's actually no point at all to the
special exception.  As the comment says, the list does not get populated
until rewrite time, which means that if outfuncs/readfuncs just process it
normally, it would always be NIL in stored views anyway.

It's too late to change this in 9.5, but I think we should do so
in HEAD.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> > > Any chance you could single-step through WaitLatchOrSocket() with a
> > > debugger? Without additional information this is rather hard to
> > > diagnose.
> > >
> >
> > Uh I sure can, but I have no idea what to look for :) Anything
> > specific?
>
> Things that'd be interesting:
> 1) what are the arguments passed to WaitLatchOrSocket(), most
>importantly wakeEvents and sock
> 2) are we busy looping, or is WaitForMultipleObjects() blocking
>endlessly
> 3) If you kill -9 (well, terminate in the task manager) a client, while
>stepping serverside in WaitLatchOrSocket, does
>WaitForMultipleObjects() return? If so, what paths are we taking?
>

The process definitely isn't busy looping - zero CPU usage.

I'll try to set up debugging, it may take some time though (unfamiliar with
PostgreSQL internals and Windows debugging techniques).


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> Things that'd be interesting:
> 1) what are the arguments passed to WaitLatchOrSocket(), most
>importantly wakeEvents and sock
>

wakeEvents is 8387808 and so is sock.

Tom, this bug doesn't occur with 9.4.4 (will try to download 9.4.5 and
test).


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> Are we sure this is a 9.5-only bug?  Shay, can you try 9.4 branch tip
> and see if it misbehaves?  Can anyone else reproduce the problem?
>
>
Doesn't occur with 9.4.5 either. The first version I tested which exhibited
this was 9.5beta2.


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Shay Rojansky
>
> Is this in a backend with ssl?
>

No.

If you go up one frame, what value does port->sock have?
>

For some reason VS is telling me "Unable to read memory" on port->sock... I
have no idea why that is...


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Tom Lane
Andres Freund  writes:
> There goes that theory. Amongst others. The aforementioned problem with
> waitfor doesn't seem to be actually armed because waitfor is only used
> if errno == EWOULDBLOCK || errno == EAGAIN.

Mumble.  It is clearly possible that we'd reach the Assert failure if
SSL_read were to return -1 and set errno to EWOULDBLOCK or EAGAIN.
I doubt that is what is happening here, because those errnos don't
seem sensible for an EOF condition, but I'd still feel more comfortable
if be_tls_read/be_tls_write handled SSL_ERROR_SYSCALL like this:

if (n != -1)
{
errno = ECONNRESET;
n = -1;
}
+   else
+   {
+   /* just in case errno is EWOULDBLOCK/EAGAIN */
+   *waitfor = WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE;
+   }

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Alvaro Herrera
Shay Rojansky wrote:
> >
> > Are we sure this is a 9.5-only bug?  Shay, can you try 9.4 branch tip
> > and see if it misbehaves?  Can anyone else reproduce the problem?
> >
> >
> Doesn't occur with 9.4.5 either. The first version I tested which exhibited
> this was 9.5beta2.

Maybe it's time for some git bisect'ing?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] custom parameters cannot be removed once set

2015-12-30 Thread Joe Conway
On 12/30/2015 10:36 AM, Joe Conway wrote:
> A side issue is that it would be nice if there were a way to check for a
> custom parameter value without getting an error if it does not exist.
> There is a missing_ok option to GetConfigOptionByName(), but we
> currently don't expose it from SQL. I'd like to add a variant of
> current_setting() with a second argument for missing_ok. Objections?

Nevermind this part -- I guess someone already beat me to it :-)
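
For the archives: the two-argument form being referred to would be used roughly like
this, assuming it is available in the server at hand (the parameter name below is
just a placeholder):

-- returns NULL instead of raising an error when the parameter is not defined
SELECT current_setting('myapp.undefined_param', true);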

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
On 2015-12-30 13:26:34 -0500, Tom Lane wrote:
> I doubt that is what is happening here, because those errnos don't
> seem sensible for an EOF condition, but I'd still feel more comfortable
> if be_tls_read/be_tls_write handled SSL_ERROR_SYSCALL like this:
> 
> if (n != -1)
> {
> errno = ECONNRESET;
> n = -1;
> }
> +   else
> +   {
> +   /* just in case errno is EWOULDBLOCK/EAGAIN */
> +   *waitfor = WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE;
> +   }

Being a bit more defensive here sounds sensible. But I think it might be
better to only set WL_SOCKET_READABLE/WL_SOCKET_WRITEABLE in
be_tls_read/be_tls_write correspondingly. It's possible that we'd
otherwise just end up in an endless loop.


Shay, if you trace the backend with process explorer or something like
that, what system calls do you seen when you kill the client?

Hm, so pgwin32_recv() has the following block:
	if (pgwin32_waitforsinglesocket(s, FD_READ | FD_CLOSE | FD_ACCEPT,
									INFINITE) == 0)
		return -1;			/* errno already set */
and there are several paths in pgwin32_waitforsinglesocket() that return
0. But I can't see any of those reasonably returning EAGAIN/EWOULDBLOCK.


Shay earlier said that he sees the connection in state FIN_WAIT_2. The
WSAEventSelect() docs say:
"The FD_CLOSE network event is recorded when a close indication is
received for the virtual circuit corresponding to the socket. In TCP
terms, this means that the FD_CLOSE is recorded when the connection goes
into the TIME WAIT or CLOSE WAIT states.".
To me that seems to imply that the socket Shay observed is the client's
and that the server should at least see CLOSE WAIT.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

2015-12-30 Thread David Rowley
On 30 December 2015 at 21:12, Benedikt Grundmann 
wrote:

> On Wed, Dec 30, 2015 at 7:16 AM, David Rowley <
> david.row...@2ndquadrant.com> wrote:
>
>>
>> A number of ideas were suggested on the other thread about how we might
>> go about solving this problem. In [3] Simon talked about perhaps enabling
>> extra optimisations when the planner sees that the plan will cost more than
>> some given threshold. That's perhaps an option, but may not work well for
>> optimisations which must take place very early in planning, for example [4].
>>
>
> A small tweak on 3 to deal with 4.  If the returned plan cost is quite
> high (say you estimate minutes+) you could just restart planning from
> scratch with all costly planning enabled, because even in the worst case
> (that is the additional options don't find a better plan), the total
> planning cost won't matter much in the grand scheme of things.
>

I do personally quite like this idea. Quite likely the extra logic could be
added to the planner() function so that it calls standard_planner() again
in the event that the cost exceeds some specified threshold. I think the
planner might need a little bit of work before replanning on the same parse
is ok, as there are places where the planner makes changes to this object
which cause things not to work well during the replan. So I think if we
went down this route, then the first steps should be to find alternative
ways to do things so that the parse is never edited, and set new standards
that the parse cannot be changed within the planner anymore.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Tom Lane
Andres Freund  writes:
> On 2015-12-30 13:26:34 -0500, Tom Lane wrote:
>> I doubt that is what is happening here, because those errnos don't
>> seem sensible for an EOF condition, but I'd still feel more comfortable
>> if be_tls_read/be_tls_write handled SSL_ERROR_SYSCALL like this:
>> ...

> Being a bit more defensive here sounds sensible. But I think it might be
> better to only set WL_SOCKET_READABLE/WL_SOCKET_WRITEABLE in
> be_tls_read/be_tls_write correspondingly. It's possible that we'd
> otherwise just end up in an endless loop.

Mumble.  We don't have enough visibility into SSL's internal protocol
to know whether it's currently trying to read the socket or write it,
unless it tells us.  I think if we wait for only readable or only
writable based on what the high-level request is, we risk getting stuck.

(In practice, if the syscall error is persistent, we're likely to fail
pretty soon anyway.  But if it were some transient error, we'd be
more likely to successfully recover if we wait for either condition.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Emre Hasegeli
> I don’t see how to solve this problem without changing explain analyze output 
> to accommodate for “unknown” value. I don’t think “0” is a non-confusing 
> representation of “unknown” for most people, and from the practical 
> standpoint, a “best effort” estimate is better than 0 (i.e. I will be able to 
> estimate how efficient BRIN index is for my tables in terms of the number of 
> tuples retrieved/thrown away)

The number of retrieved and thrown away rows are already available on
the upper part of the plan.  Bitmap Index Scan should provide the rows
that matched the index.  Another alternative would be just returning
the number of matching pages (by not multiplying with 10).  It might
be better understood.  The users who can understand the EXPLAIN
ANALYZE output shouldn't be expecting BRIN to return rows.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Oleksii Kliukin

> On 30 Dec 2015, at 21:12, Tom Lane  wrote:
> 
> Emre Hasegeli  writes:
>>> I don’t see how to solve this problem without changing explain analyze 
>>> output to accommodate for “unknown” value. I don’t think “0” is a 
>>> non-confusing representation of “unknown” for most people, and from the 
>>> practical standpoint, a “best effort” estimate is better than 0 (i.e. I 
>>> will be able to estimate how efficient BRIN index is for my tables in terms 
>>> of the number of tuples retrieved/thrown away)
> 
> We do already have a nearby precedent for returning zero when we don't
> have an accurate answer: that's what BitmapAnd and BitmapOr plan nodes
> do.  (This is documented btw, at the bottom of section 14.1.)

+1 for following a precedent.

> 
>> The number of retrieved and thrown away rows are already available on
>> the upper part of the plan.  Bitmap Index Scan should provide the rows
>> that matched the index.
> 
> It doesn't have that information.
> 
>> Another alternative would be just returning
>> the number of matching pages (by not multiplying with 10).  It might
>> be better understood.
> 
> No, it would not, at least not unless we found a way to explicitly mark
> the output as being blocks not rows (which would doubtless break a lot of
> existing client-side code).  Zero is fairly clearly an impossible value,
> whereas anything that's not zero is going to be taken at face value by
> many users.

But is it? Is it impossible for the BRIN bitmap index scan to return 0 rows 
(say, if the value being matched is outside the min/max boundary for every 
block range?) Granted, if we document that it always returns 0 and should be
ignored, then confusing an actual 0 with the 0 used as a representation of
“unknown” would be less of a problem.

--
Oleksii



Fwd: [HACKERS] Avoid endless futile table locks in vacuuming.

2015-12-30 Thread Jeff Janes
Forgetting to CC the list is starting to become a bad habit, sorry.
Forwarding to the list:

-- Forwarded message --
From: Jeff Janes 
Date: Wed, Dec 30, 2015 at 10:03 AM
Subject: Re: [HACKERS] Avoid endless futile table locks in vacuuming.
To: Tom Lane 


On Dec 29, 2015 4:47 PM, "Tom Lane"  wrote:
>
> Jeff Janes  writes:
> > If we are not doing a scan_all and we fail to acquire a clean-up lock on
> > the last block, and the last block reports that it needs freezing, then we
> > continue on to wait for the clean-up lock. But there is no need, we don't
> > really need to freeze the block, and we already know whether it has tuples
> > or not without the clean up lock.  Couldn't we just set the flag based on
> > hastup, then 'continue'?
>
> Uh, isn't that what my patch is doing?

My reading was it does that only if there are no tuples that could be
frozen.  If there are tuples that could be frozen, it actually does
the freezing, even though that is not necessary unless scan_all is
true.

So like the attached, although it is a bit weird to call
lazy_check_needs_freeze if, under !scan_all, we don't actually care
about whether it needs freezing but only about hastup.

Cheers,

Jeff
diff --git a/src/backend/commands/vacuumlazy.c b/src/backend/commands/vacuumlazy.c
new file mode 100644
index 2429889..e3cfa59
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
*** static BufferAccessStrategy vac_strategy
*** 138,144 
  static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
  			   Relation *Irel, int nindexes, bool scan_all);
  static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
! static bool lazy_check_needs_freeze(Buffer buf);
  static void lazy_vacuum_index(Relation indrel,
    IndexBulkDeleteResult **stats,
    LVRelStats *vacrelstats);
--- 138,144 
  static void lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
  			   Relation *Irel, int nindexes, bool scan_all);
  static void lazy_vacuum_heap(Relation onerel, LVRelStats *vacrelstats);
! static bool lazy_check_needs_freeze(Buffer buf, bool *hastup);
  static void lazy_vacuum_index(Relation indrel,
    IndexBulkDeleteResult **stats,
    LVRelStats *vacrelstats);
*** static void lazy_cleanup_index(Relation
*** 147,152 
--- 147,153 
     LVRelStats *vacrelstats);
  static int lazy_vacuum_page(Relation onerel, BlockNumber blkno, Buffer buffer,
   int tupindex, LVRelStats *vacrelstats, Buffer *vmbuffer);
+ static bool should_attempt_truncation(LVRelStats *vacrelstats);
  static void lazy_truncate_heap(Relation onerel, LVRelStats *vacrelstats);
  static BlockNumber count_nondeletable_pages(Relation onerel,
  		 LVRelStats *vacrelstats);
*** lazy_vacuum_rel(Relation onerel, int opt
*** 175,181 
  	LVRelStats *vacrelstats;
  	Relation   *Irel;
  	int			nindexes;
- 	BlockNumber possibly_freeable;
  	PGRUsage	ru0;
  	TimestampTz starttime = 0;
  	long		secs;
--- 176,181 
*** lazy_vacuum_rel(Relation onerel, int opt
*** 263,276 
  
  	/*
  	 * Optionally truncate the relation.
- 	 *
- 	 * Don't even think about it unless we have a shot at releasing a goodly
- 	 * number of pages.  Otherwise, the time taken isn't worth it.
  	 */
! 	possibly_freeable = vacrelstats->rel_pages - vacrelstats->nonempty_pages;
! 	if (possibly_freeable > 0 &&
! 		(possibly_freeable >= REL_TRUNCATE_MINIMUM ||
! 		 possibly_freeable >= vacrelstats->rel_pages / REL_TRUNCATE_FRACTION))
  		lazy_truncate_heap(onerel, vacrelstats);
  
  	/* Vacuum the Free Space Map */
--- 263,270 
  
  	/*
  	 * Optionally truncate the relation.
  	 */
! 	if (should_attempt_truncation(vacrelstats))
  		lazy_truncate_heap(onerel, vacrelstats);
  
  	/* Vacuum the Free Space Map */
*** lazy_scan_heap(Relation onerel, LVRelSta
*** 498,503 
--- 492,504 
  	 * started skipping blocks, we may as well skip everything up to the next
  	 * not-all-visible block.
  	 *
+ 	 * We don't skip the last page, unless we've already determined that it's
+ 	 * not worth trying to truncate the table.  This avoids having every
+ 	 * vacuum take access exclusive lock on the table to attempt a truncation
+ 	 * that just fails immediately (if there are tuples in the last page).
+ 	 * This is worth avoiding mainly because such a lock must be replayed on
+ 	 * any hot standby, where it can be disruptive.
+ 	 *
  	 * Note: if scan_all is true, we won't actually skip any pages; but we
  	 * maintain next_not_all_visible_block anyway, so as to set up the
  	 * all_visible_according_to_vm flag correctly for each page.
*** lazy_scan_heap(Relation onerel, LVRelSta
*** 530,536 
  		Page		page;
  		OffsetNumber offnum,
  	maxoff;
! 		bool		tupgone,
  	hastup;
  		int			prev_dead_count;
  		int			nfrozen;
--- 531,538 
  		Page		page;
 

Re: [HACKERS] custom parameters cannot be removed once set

2015-12-30 Thread Joe Conway
On 12/30/2015 10:44 AM, Tom Lane wrote:
> Meh.  The real problem here is that people are abusing the custom-GUC
> mechanism to implement session-lifespan variables.  I do not think we
> should encourage that; GUC offers neither adequate features for that
> (eg, no way to declare a variable's type) nor adequate performance
> (it's not going to scale to very many variables).

All true, but it works well enough today that it serves an often needed
role. And it is plenty fast, at least for use cases for which I've
needed it.
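
For illustration, the kind of usage being discussed relies only on the existing
GUC functions (the parameter name below is just a placeholder):

-- stash a value for the rest of the session (false = not local to the transaction)
SELECT set_config('myapp.current_user_id', '42', false);

-- read it back later; note that the value is always text
SELECT current_setting('myapp.current_user_id');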

> I'd rather see us invent a real session-variable mechanism instead
> of putting yet more demands on GUC that have nothing to do with its
> primary mission, and indeed are antithetical to it.

Also true, but since no such effort exists today that I'm aware of,
there is little chance we will have that real mechanism any time in the
next 24 months at the least, because I doubt even the bikeshedding could
be finished before we lock down 9.6 :-(

Joe

-- 
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development



signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-30 Thread Pavel Stehule
2015-12-30 17:33 GMT+01:00 Robert Haas :

> On Mon, Dec 28, 2015 at 8:45 AM, Shulgin, Oleksandr
>  wrote:
> > I didn't check out earlier versions of this patch, but the latest one
> still
> > changes pg_size_pretty() to emit PB suffix.
> >
> > I don't think it is worth it to throw a number of changes together like
> > that.  We should focus on adding pg_size_bytes() first and make it
> > compatible with both pg_size_pretty() and existing GUC units: that is
> > support suffixes up to TB and make sure they have the meaning of powers
> of
> > 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
> >
> > Next, we could think about adding handling of PB suffix on input and
> output,
> > but I don't see a big problem if that is emitted as 1024TB or the user
> has
> > to specify it as 1024TB in a GUC or argument to pg_size_bytes(): an minor
> > inconvenience only.
>
> +1 to everything in this email.
>

So I removed support for PB and SI units. Now the
memory_unit_conversion_table is shared.
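
Expected usage, as a rough sketch (the exact results depend on the final patch; per
the discussion above, the units are interpreted as powers of 2^10):

SELECT pg_size_bytes('1kB');     -- expected: 1024
SELECT pg_size_bytes('1.5 GB');  -- expected: 1610612736
SELECT pg_size_bytes('2 TB');    -- expected: 2199023255552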

Regards

Pavel


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index 8ef9fce..6b921ae
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17607,17612 
--- 17607,17615 
  pg_relation_size
 
 
+ pg_size_bytes
+
+
  pg_size_pretty
 
 
*** postgres=# SELECT * FROM pg_xlogfile_nam
*** 17677,17682 
--- 17680,17696 
 


+
+ pg_size_bytes(text)
+ 
+bigint
+
+  Converts a size in human-readable format with size units
+  into bytes.  The parameter is a case-insensitive string.  The following
+  units are supported: kB, MB, GB, TB.
+
+   
+   
 
  pg_size_pretty(bigint)
  
diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
new file mode 100644
index 5ee59d0..3ec6d0c
*** a/src/backend/utils/adt/dbsize.c
--- b/src/backend/utils/adt/dbsize.c
***
*** 25,30 
--- 25,31 
  #include "storage/fd.h"
  #include "utils/acl.h"
  #include "utils/builtins.h"
+ #include "utils/guc.h"
  #include "utils/numeric.h"
  #include "utils/rel.h"
  #include "utils/relfilenodemap.h"
*** pg_size_pretty_numeric(PG_FUNCTION_ARGS)
*** 699,704 
--- 700,807 
  	PG_RETURN_TEXT_P(cstring_to_text(result));
  }
  
+ #define MAX_DIGITS		20
+ #define MAX_UNIT_LEN		3
+ 
+ /*
+  * Convert a human-readable size to an int64 number of bytes.
+  *
+  * Due to the support for decimal values and the case insensitivity of
+  * units, the parse_int() function cannot be used here.
+  */
+ Datum
+ pg_size_bytes(PG_FUNCTION_ARGS)
+ {
+ 	text	   *arg = PG_GETARG_TEXT_PP(0);
+ 	const char	   *cp = text_to_cstring(arg);
+ 	char		digits[MAX_DIGITS + 1];
+ 	int		ndigits = 0;
+ 	Numeric			num;
+ 	int64 result;
+ 
+ 	/* Skip leading spaces */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	switch (*cp)
+ 	{
+ 		/* ignore plus symbol */
+ 		case '+':
+ 			cp++;
+ 			break;
+ 		case '-':
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("size cannot be negative")));
+ 	}
+ 
+ 	while (*cp)
+ 	{
+ 		if ((isdigit(*cp) || *cp == '.') && ndigits < MAX_DIGITS)
+ 			digits[ndigits++] = *cp++;
+ 		else
+ 			break;
+ 	}
+ 
+ 	if (ndigits == 0)
+ 		ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid format")));
+ 
+ 	digits[ndigits] = '\0';
+ 	num = DatumGetNumeric(DirectFunctionCall3(numeric_in,
+ 			CStringGetDatum(digits), 0, -1));
+ 
+ 	/* allow whitespace between integer and unit */
+ 	while (isspace((unsigned char) *cp))
+ 		cp++;
+ 
+ 	/* Handle possible unit */
+ 	if (*cp != '\0')
+ 	{
+ 		char		unit[MAX_UNIT_LEN + 1];
+ 		int		unitlen = 0;
+ 		int		multiplier;
+ 		Numeric			mul_num;
+ 		const char	*hintmsg;
+ 
+ 		while (*cp && !isspace(*cp) && unitlen < MAX_UNIT_LEN)
+ 			unit[unitlen++] = *cp++;
+ 
+ 		unit[unitlen] = '\0';
+ 
+ 		/* allow whitespace after unit */
+ 		while (isspace((unsigned char) *cp))
+ 			cp++;
+ 
+ 		if (*cp != '\0')
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid format")));
+ 
+ 		if (!parse_memory_unit(unit, &multiplier, &hintmsg))
+ 			ereport(ERROR,
+ 	(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ 	 errmsg("invalid unit \"%s\"", unit),
+ 	 errhint("%s", _(hintmsg))));
+ 
+ 		/*
+ 		 * Now, the multiplier is in KB unit. It should be multiplied by 1024
+ 		 * before usage
+ 		 */
+ 		mul_num = DatumGetNumeric(DirectFunctionCall1(int8_numeric,
+ 		Int64GetDatum(multiplier * 1024L)));
+ 
+ 		num = DatumGetNumeric(DirectFunctionCall2(numeric_mul,
+ 			NumericGetDatum(mul_num),
+ 			NumericGetDatum(num)));
+ 	}
+ 
+ 	result = 

Re: [HACKERS] custom parameters cannot be removed once set

2015-12-30 Thread Tom Lane
Joe Conway  writes:
> Today I was reminded of an issue I have run across before, namely that
> in a given postgres session, once a custom parameter has been set, there
> is no way to remove it entirely.

True.

> This strikes me as, at least, surprising, and possibly should be
> considered a bug. Thoughts?

Meh.  The real problem here is that people are abusing the custom-GUC
mechanism to implement session-lifespan variables.  I do not think we
should encourage that; GUC offers neither adequate features for that
(eg, no way to declare a variable's type) nor adequate performance
(it's not going to scale to very many variables).

I'd rather see us invent a real session-variable mechanism instead
of putting yet more demands on GUC that have nothing to do with its
primary mission, and indeed are antithetical to it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Fwd: [HACKERS] Avoid endless futile table locks in vacuuming.

2015-12-30 Thread Tom Lane
Jeff Janes  writes:
> On Dec 29, 2015 4:47 PM, "Tom Lane"  wrote:
>> Uh, isn't that what my patch is doing?

> My reading was it does that only if there are no tuples that could be
> frozen.  If there are tuples that could be frozen, it actually does
> the freezing, even though that is not necessary unless scan_all is
> true.

Ah, now I see.

> So like the attached, although it is a bit weird to call
> lazy_check_needs_freeze if , under !scan_all, we don't actually care
> about whether it needs freezing but only the hastup.

True, but this is such a corner case that it doesn't seem worth expending
additional code to have a special-purpose page scan for it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Tom Lane
Emre Hasegeli  writes:
>> I don’t see how to solve this problem without changing explain analyze 
>> output to accommodate for “unknown” value. I don’t think “0” is a 
>> non-confusing representation of “unknown” for most people, and from the 
>> practical standpoint, a “best effort” estimate is better than 0 (i.e. I 
>> will be able to estimate how efficient BRIN index is for my tables in terms 
>> of the number of tuples retrieved/thrown away)

We do already have a nearby precedent for returning zero when we don't
have an accurate answer: that's what BitmapAnd and BitmapOr plan nodes
do.  (This is documented btw, at the bottom of section 14.1.)

> The number of retrieved and thrown away rows are already available on
> the upper part of the plan.  Bitmap Index Scan should provide the rows
> that matched the index.

It doesn't have that information.

> Another alternative would be just returning
> the number of matching pages (by not multiplying with 10).  It might
> be better understood.

No, it would not, at least not unless we found a way to explicitly mark
the output as being blocks not rows (which would doubtless break a lot of
existing client-side code).  Zero is fairly clearly an impossible value,
whereas anything that's not zero is going to be taken at face value by
many users.

On balance I think likely the best thing to do is return zero, and
document that behavior.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Additional role attributes && superuser review

2015-12-30 Thread Michael Paquier
On Thu, Dec 31, 2015 at 1:50 AM, Robert Haas  wrote:
> Under those circumstances, it seems very dubious to proceed
> with this.  Michael seems to think that we can go ahead and start
> changing things and sort out whatever is broken later, but that
> doesn't sound like a very good plan to me.

I meant tuning those roles during this development cycle.
But I'll just withdraw, as there are enough concerns, from two
committers on top of it. My point was just to move on with this patch
in the direction where the overall consensus seemed to be at the point
I began participating in this thread.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-30 Thread Michael Paquier
On Wed, Dec 30, 2015 at 11:21 PM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>
>> OK, here are new patches.
>> - 0001 switches a bunch of TailMatches to Matches. Do we want to care
>> about the case where a schema is created following by a bunch of
>> objects? I mean stuff like "CREATE SCHEMA hoge CREATE TABLE ..." where
>> the current completion would work fine. The performance gains seem
>> worth it compared to the number of people actually using it, the point
>> has just not been raised yet.
>
> I'd rather have the completion work for that case than get a few
> microseconds speedup.  As far as I recall, it's only four commands that
> must retain the old coding.

Fine for me this way.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Alvaro Herrera
Tom Lane wrote:
> Emre Hasegeli  writes:
> >> I don’t see how to solve this problem without changing explain analyze 
> >> output to accommodate for “unknown” value. I don’t think “0” is a 
> >> non-confusing representation of “unknown” for most people, and from the 
> >> practical standpoint, a “best effort” estimate is better than 0 (i.e. I 
> >> will be able to estimate how efficient BRIN index is for my tables in 
> >> terms of the number of tuples retrieved/thrown away)
> 
> We do already have a nearby precedent for returning zero when we don't
> have an accurate answer: that's what BitmapAnd and BitmapOr plan nodes
> do.  (This is documented btw, at the bottom of section 14.1.)

Hmm, but amgetbitmap is documented thusly:

The number of tuples fetched is returned (this might be just an
approximate count, for instance some AMs do not detect
duplicates).
http://www.postgresql.org/docs/9.5/static/index-functions.html

so I'm not sure we're actually violating an expectation that the number
of rows is exact.  I mean, if we zero out this one, shouldn't we set it
to zero for these other documented cases too?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> We do already have a nearby precedent for returning zero when we don't
>> have an accurate answer: that's what BitmapAnd and BitmapOr plan nodes
>> do.  (This is documented btw, at the bottom of section 14.1.)

> Hmm, but amgetbitmap is documented thusly:

>   The number of tuples fetched is returned (this might be just an
>   approximate count, for instance some AMs do not detect
>   duplicates).
>   http://www.postgresql.org/docs/9.5/static/index-functions.html

> so I'm not sure we're actually violating an expectation that the number
> of rows is exact.  I mean, if we zero out this one, shouldn't we set it
> to zero for these other documented cases too?

Well, the code comments may say that, but the user-facing docs don't ...

More generally, people are accustomed to the idea that the planner's
estimated rowcounts are just estimates, but they're not accustomed to
the idea that the runtime "actual" rowcounts might be just estimates.
That seems like it's moving EXPLAIN ANALYZE's contract quite a bit,
even if there are some documents for internal APIs that say something
else.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-30 Thread Haribabu Kommi
On Wed, Dec 30, 2015 at 9:48 PM, Shulgin, Oleksandr
 wrote:
> On Wed, Dec 30, 2015 at 4:31 AM, Haribabu Kommi 
> wrote:
>>
>>
>> Adding quotes to pg_hba_lookup function makes it different from others.
>> The issues regarding the same is already discussed in [1].
>>
>> select a.database[1], b.datname from
>> pg_hba_lookup('postgres','kommih','::1')
>> as a, pg_database as b where a.database[1]
>> = b.datname;
>>
>> The queries like above are not possible with quoted output. It is very
>> rare that the
>> pg_hba_lookup function used in join operations, but still it is better
>> to provide
>> data without quotes. so I reverted these changes in the attached latest
>> patch.
>
>
> That's a good point.  I wonder that maybe instead of re-introducing quotes
> we could somehow make the unquoted keywords that have special meaning stand
> out, e.g:
>
> database  | {$sameuser}
> user_name | {$all}
>
> That should make it obvious which of the values are placeholders and doesn't
> interfere with joining database or user catalogs (while I would call
> "sameuser" a very unlikely name for a database, "all" might be not that
> unlikely name for a user, e.g. someone called like "Albert L. Lucky" could
> use that as a login name).

It is not only a problem with joins; the following two cases work only
without quotes. With quotes, the query doesn't match the database name.

select * from pg_hba_lookup('Test', 'kommih','127.0.0.1') where
database = '{"Test"}';
select * from pg_hba_lookup('Test', 'kommih','127.0.0.1') where
database = '{Test}';


Regards,
Hari Babu
Fujitsu Australia


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


--enable-depend by default (was Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex)

2015-12-30 Thread Tom Lane
[ A change as significant as this should not be debated in a thread with
a title that suggests it's of interest to only one or two people ]

> On Wed, Dec 30, 2015 at 5:44 PM, Andres Freund  wrote:
>> I still maintain that --enable-depend should be on by default. We're
>> absurdly optimizing towards saving a handful of cycles in scenarios
>> which are usually bottlenecked by other things (build boxes spend more
>> times on tests and such)

Nonsense.  The buildfarm will simply get slower, and at least for my
animals it's too damn slow already.  We could fix that by having the
buildfarm script automatically add --disable-depend; but if we're going
to change this, we should roll out that script update BEFORE we change
it, not after.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Oleksii Kliukin

> On 30 Dec 2015, at 17:44, Tom Lane  wrote:
> 
> Oleksii Kliukin  writes:
>>> On 30 Dec 2015, at 17:02, Tom Lane  wrote:
>>> Another idea would be to use the heap's row density as calculated
>>> by the last ANALYZE (ie, reltuples/relpages), with a fallback to 100
>>> if relpages=0.  This'd only be convenient if the bitmap scan node has
>>> the parent heap rel open, which it might not.
> 
>> +1
> 
> Any objections to the attached?

Looks good to me. On my sample system with 100K rows, the new version gives me:

-- CREATE TABLE test AS SELECT id FROM generate_series(1,100000) id;
-- CREATE INDEX ON test USING brin(id);

postgres=# explain analyze select 1 from test where id = 500;
   QUERY PLAN
--------------------------------------------------------------------------------------------------------------
 Bitmap Heap Scan on test  (cost=12.01..16.02 rows=1 width=0) (actual 
time=0.199..4.220 rows=1 loops=1)
   Recheck Cond: (id = 500)
   Rows Removed by Index Recheck: 28927
   Heap Blocks: lossy=128
   ->  Bitmap Index Scan on test_id_idx  (cost=0.00..12.01 rows=1 width=0) 
(actual time=0.072..0.072 rows=28800 loops=1)
 Index Cond: (id = 500)
 Planning time: 0.433 ms
 Execution time: 4.323 ms
(8 rows)

which is much closer to the actual number of rows removed by the index recheck 
+ the one left.

--
Oleksii



Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Andres Freund
On 2015-12-30 12:30:43 -0500, Tom Lane wrote:
> Nor OS X.  Ugh.  My first thought was that ac1d7945f broke this, but
> that's only in HEAD not 9.5, so some earlier change must be responsible.

The backtrace in
http://archives.postgresql.org/message-id/CADT4RqBo79_0Vx%3D-%2By%3DnFv3zdnm_-CgGzbtSv9LhxrFEoYMVFg%40mail.gmail.com
seems to indicate that it's really WaitLatchOrSocket() not noticing the
socket is closed.

For a moment I had the theory that Port->sock might be invalid because
it somehow got closed. That'd then remove the socket from the waited-on
events, which would explain the behaviour. But afaics that's really only
possible via pq_init()'s on_proc_exit(socket_close, 0); And I can't see
how that could be reached.

FWIW, the
if (sock == PGINVALID_SOCKET)
wakeEvents &= ~(WL_SOCKET_READABLE | WL_SOCKET_WRITEABLE);
block in both latch implementations looks like a problem waiting to happen.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Some 9.5beta2 backend processes not terminating properly?

2015-12-30 Thread Tom Lane
Andres Freund  writes:
> On 2015-12-30 19:01:10 +0200, Shay Rojansky wrote:
>> The backends seem to hang when the client closes a socket without first
>> sending a Terminate message - some of the tests make this happen. I've
>> confirmed this happens with 9.5rc1 running on Windows (versions 10 and 7),
>> but this does not occur on Ubuntu 15.10. The client runs on Windows as well
>> (although I doubt that's important).

> Hm. So that seems to indicate that, on windows, we're not properly
> recognizing dead sockets in the latch code.

Or we just broke EOF detection on Windows sockets in general.  It might be
worth checking if the problem appears on the client side; that is, given a
psql running on Windows, do local-equivalent-of-kill-9 on the connected
backend, and see if psql notices.  (Hm, although if it's idle psql wouldn't
notice until you next try a command, so it might be hard to tell.  Maybe
kill -9 while the backend is in process of a long query?)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

2015-12-30 Thread David Rowley
On 31 December 2015 at 01:24, Tomas Vondra 
wrote:

> On 12/30/2015 08:16 AM, David Rowley wrote:
>
>>
>> I do strongly believe that we need to come up with something to
>> solve this problem. I already summarised my thoughts on the other
>> thread.
>>
>
> One approach that I don't see mentioned on any of the threads is plan
> caching, which allows reusing the plan across many query executions,
> hopefully amortizing the planning costs.
>
> I'm sure implementing such caching is non-trivial and there are cases
> where it may not help, but perhaps it's not entirely futile (AFAIK it's
> used by some databases exactly to address the higher planning costs).
>
> I imagine a single GUC enabling or disabling this (possibly not just
> globally but per session, user or database).
>
> We already have some form of plan caching, although only for prepared
> statements within a single session - maybe that could be a good starting
> point? For example what if we only enabled those "expensive" optimizations
> for prepared statements, which are assumed to be executed multiple times?
> Of course, this may not be entirely true (say, PL/pgSQL uses prepared
> statements all the time).
>
> Of course, the annoying consequence of this would be that the planning may
> get somewhat unpredictable - the plan will depend on whether the query was
> planned directly or as a prepared statement, or whether plan caching is
> enabled. However, the same mostly applies to solutions proposed in the
> other threads so far.
>

Personally I'd like to see automatic plan caching occur in PostgreSQL.
There is a bit of a problem with it in regards to a query such as: select *
from t where mySkewedColumn = 1;  where the value 1 appears 99% of the
time. Initially we may plan to seqscan, where with other values we'd likely
prefer to index scan. I imagine with my unique joins patch, that it could
be expanded to test baserestrictinfos to see if they contain quals with a
unique index.   This knowledge could later permit plan caching to occur on
queries which are safe from having any skew in the results.  It might sound
rather restrictive, but likely this would cover 99% of UPDATE and DELETE
operations in an OLTP workload, and likely a good deal of the SELECTs too.
The safety of caching could be analyzed during planning, and a flag could
be set somewhere, perhaps in PlannedStmt to mark if the plan is safe to
cache. The planner() function could initially hash the query string and
check if any cached plan exists with that hash, if not, plan the statement,
and then check if the "safeToCache" flag is set, and if so, stick that plan
into a hash table. Also plans with no baserestrictinfos could be
"safeToCache" too.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Fwd: [HACKERS] Avoid endless futile table locks in vacuuming.

2015-12-30 Thread Tom Lane
Jeff Janes  writes:
> So like the attached, although it is a bit weird to call
> lazy_check_needs_freeze if , under !scan_all, we don't actually care
> about whether it needs freezing but only the hastup.

I think this misses unpinning the buffer in the added code path.
I rearranged to avoid that, did some other cosmetic work, and committed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

2015-12-30 Thread David Rowley
On 31 December 2015 at 04:39, Evgeniy Shishkin  wrote:

>
> > On 30 Dec 2015, at 10:16, David Rowley 
> wrote:
> >
> > A number of ideas were suggested on the other thread about how we might
> go about solving this problem. In [3] Simon talked about perhaps enabling
> extra optimisations when the planner sees that the plan will cost more than
> some given threshold. That's perhaps an option, but may not work well for
> optimisations which must take place very early in planning, for example [4].
> > Another idea which came up was from Evgeniy [5], which was more of a
> request not to do it this way, but never-the-less, the idea was basically
> to add lots of GUCs to enable/disable each extra planner feature.
> >
>
> Well, my idea was to track planning/execution cost in something like
> pg_stat_statements.
> That way we can track actual time, not estimated cost like Simon proposed.
>
>
I like this idea also, but I do see the use for this as more of a guide
which could be used by the DBA to tweak some sort of planner_effort GUCs. I
don't think that the data of pg_stat_statements could be used by the
planner as this is an extension rather than core code.


-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Revisiting pg_stat_statements and IN() (Was: Re: pg_stat_statements fingerprinting logic and ArrayExpr)

2015-12-30 Thread Bruce Momjian
On Mon, Nov 30, 2015 at 02:41:02PM +0100, Shulgin, Oleksandr wrote:
> On Wed, Nov 25, 2015 at 9:13 AM, Lukas Fittl  wrote:
> 
> On Mon, Nov 23, 2015 at 11:53 PM, Peter Geoghegan  wrote:
> 
> One specific justification he gave for not using pg_stat_statements
> was:
> 
> "Doesn’t merge bind vars in IN()" (See slide #11)
> 
> 
> I wonder:
> 
> * How do other people feel about this? Personally, I've seen enough
> problems of this kind in the field that "slippery slope" arguments
> against this don't seem very compelling.
> 
> 
> As someone who runs a little monitoring service thats solely based on
> pg_stat_statements data, ignoring IN list length would certainly be a good
> change.
> 
> We currently do this in post-processing, together with other data cleanup
> (e.g. ignoring the length of a VALUES list in an INSERT statement).
> 
> Given the fact that pgss data is normalized & you don't know which plan 
> was
> chosen, I don't think distinguishing based on the length of the list helps
> anyone really.
> 
> I see pg_stat_statements as a high-level overview of which queries have
> run, and which ones you might want to look into closer using e.g.
> auto_explain.
> 
> 
> I still have the plans to try to marry pg_stat_statements and auto_explain for
> the next iteration of "online query plans" extension I was proposing a few
> months ago, and the first thing I was going to look into is rectifying this
> problem with IN() jumbling.  So, have a +1 from me.

Is this a TODO?

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Still more sanity checking in gincostestimate()

2015-12-30 Thread Tom Lane
I looked into the problem reported in
http://www.postgresql.org/message-id/flat/56830ea5.7080...@squeakycode.net
which briefly is that gincostestimate can produce ridiculously large index
scan cost estimates for partial-match queries.

It appears that there are two basic problems contributing to this:

1. gincostestimate will scale up the statistics found in the index
metapage, no matter what, as long as they're not zero (but it doesn't
insist on nEntries being > 0).  If the stats date back to the index being
empty or nearly so, we can come out with some page counts that are pretty
silly, and we can also come out with numEntries = 0, which we then clamp
to 1, but this is still orders of magnitude off of reality.

2. The partial-match logic in gincost_pattern is pretty bogus and can
produce partialEntries values that are not very realistic.

The direct cause of the bad cost estimate is that some of the cost factors
get multiplied by partialEntries / numEntries, which can be quite a lot
larger than one, a scaling that was certainly never intended.

The attached proposed patch deals with this first by not applying the
scaling logic if it would be scaling up the index size more than 4X;
if it would be, we fall back to the rule-of-thumb estimates I
introduced recently in commit 7fb008c5ee59b040.  Secondly, we clamp
the partialEntries / numEntries ratio to not more than 1.

This is obviously all a bit ad-hoc, but it seems less likely to produce
insane estimates than what's there now.  On the other hand, those
rule-of-thumb estimates are too new to have any field experience behind
them, so maybe doubling down on our dependence on them isn't bright.

Comments?

regards, tom lane

diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 15121bc..8d109aa 100644
*** a/src/backend/utils/adt/selfuncs.c
--- b/src/backend/utils/adt/selfuncs.c
*** gincostestimate(PG_FUNCTION_ARGS)
*** 7246,7251 
--- 7246,7252 
  numEntries;
  	GinQualCounts counts;
  	bool		matchPossible;
+ 	double		partialScale;
  	double		entryPagesFetched,
  dataPagesFetched,
  dataPagesFetchedBySel;
*** gincostestimate(PG_FUNCTION_ARGS)
*** 7274,7315 
  		memset(, 0, sizeof(ginStats));
  	}
  
! 	if (ginStats.nTotalPages > 0 && ginStats.nEntryPages > 0 && numPages > 0)
  	{
  		/*
! 		 * We got valid stats.  nPendingPages can be trusted, but the other
! 		 * fields are data as of the last VACUUM.  Scale them by the ratio
! 		 * numPages / nTotalPages to account for growth since then.
  		 */
  		double		scale = numPages / ginStats.nTotalPages;
  
! 		numEntryPages = ginStats.nEntryPages;
! 		numDataPages = ginStats.nDataPages;
! 		numPendingPages = ginStats.nPendingPages;
! 		numEntries = ginStats.nEntries;
! 
! 		numEntryPages = ceil(numEntryPages * scale);
! 		numDataPages = ceil(numDataPages * scale);
! 		numEntries = ceil(numEntries * scale);
  		/* ensure we didn't round up too much */
! 		numEntryPages = Min(numEntryPages, numPages);
! 		numDataPages = Min(numDataPages, numPages - numEntryPages);
  	}
  	else
  	{
  		/*
! 		 * It's a hypothetical index, or perhaps an index created pre-9.1 and
! 		 * never vacuumed since upgrading.  Invent some plausible internal
! 		 * statistics based on the index page count.  We estimate that 90% of
! 		 * the index is entry pages, and the rest is data pages.  Estimate 100
! 		 * entries per entry page; this is rather bogus since it'll depend on
! 		 * the size of the keys, but it's more robust than trying to predict
! 		 * the number of entries per heap tuple.
  		 */
  		numPages = Max(numPages, 10);
! 		numEntryPages = floor(numPages * 0.90);
! 		numDataPages = numPages - numEntryPages;
! 		numPendingPages = 0;
  		numEntries = floor(numEntryPages * 100);
  	}
  
--- 7275,7333 
  		memset(, 0, sizeof(ginStats));
  	}
  
! 	/*
! 	 * Assuming we got valid (nonzero) stats at all, nPendingPages can be
! 	 * trusted, but the other fields are data as of the last VACUUM.  We can
! 	 * scale them up to account for growth since then, but that method only
! 	 * goes so far; in the worst case, the stats might be for a completely
! 	 * empty index, and scaling them will produce pretty bogus numbers.
! 	 * Somewhat arbitrarily, set the cutoff for doing scaling at 4X growth; if
! 	 * it's grown more than that, fall back to estimating things only from the
! 	 * assumed-accurate index size.  But we'll trust nPendingPages in any case
! 	 * so long as it's not clearly insane, ie, more than the index size.
! 	 */
! 	if (ginStats.nPendingPages < numPages)
! 		numPendingPages = ginStats.nPendingPages;
! 	else
! 		numPendingPages = 0;
! 
! 	if (numPages > 0 && ginStats.nTotalPages <= numPages &&
! 		ginStats.nTotalPages > numPages / 4 &&
! 		ginStats.nEntryPages > 0 && ginStats.nEntries > 0)
  	{
  		/*
! 		 * OK, the stats seem close enough to sane to be trusted.  But we
! 		 * still need to scale them by the ratio 

Re: [HACKERS] Revisiting pg_stat_statements and IN() (Was: Re: pg_stat_statements fingerprinting logic and ArrayExpr)

2015-12-30 Thread Peter Geoghegan
On Wed, Dec 30, 2015 at 5:20 PM, Bruce Momjian  wrote:
> Is this a TODO?

It's on my (very long) personal TODO list. It would be great if
someone else took it, though. So, yes.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Revisiting pg_stat_statements and IN() (Was: Re: pg_stat_statements fingerprinting logic and ArrayExpr)

2015-12-30 Thread Bruce Momjian
On Wed, Dec 30, 2015 at 05:21:05PM -0800, Peter Geoghegan wrote:
> On Wed, Dec 30, 2015 at 5:20 PM, Bruce Momjian  wrote:
> > Is this a TODO?
> 
> It's on my (very long) personal TODO list. It would be great if
> someone else took it, though. So, yes.

TODO added.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynloader.h missing in prebuilt package for Windows?

2015-12-30 Thread Bruce Momjian
On Tue, Dec 29, 2015 at 02:08:19PM -0500, Bruce Momjian wrote:
> In the Windows MSVC build, we handle pg_config_os.h in the Perl scripts,
> but not dynloader.h.  The attached patch copies the process used for
> pg_config_os.h to handle dynloader.h.  I believe this is the only *.h
> file that has this problem.
> 
> This should fix the PL/Java Windows build problem.  I don't think I will
> get this patch into 9.5.0 but will put it in after 9.5.0 is released and
> it will appear in all the next minor version releases, including 9.5.1,
> which should happen in the next 60 days.

Oops.  Once this patch is applied, it is only going to take effect when
someone _installs_ Postgres.  Even an initdb will not add the file. 
This means that upgrading to the next minor release will _not_ fix this.

This suggests that we perhaps should consider this for 9.5.0, and that
your hack will have to be active until 9.4 gets to end-of-life, or
perhaps 9.5 if we can't get this into 9.5.0.  People who are using 9.5
Beta or RC will still not have the file, meaning 9.5 end-of-life might
still be a requirement.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Roman grave inscription +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dynloader.h missing in prebuilt package for Windows?

2015-12-30 Thread Chapman Flack
On 12/30/15 20:40, Bruce Momjian wrote:

> your hack will have to be active until 9.4 gets to end-of-life, or
> perhaps 9.5 if we can't get this into 9.5.0.  People who are using 9.5
> Beta or RC will still not have the file, meaning 9.5 end-of-life might
> still be a requirement.

... by which time PL/Java will be sporting the javax.ai.selfaware
features of Java SE 11 or 12, and no one will even have to tell it to
drop the fallback directory.

Well, so it's one directory and one file and a few extra lines in pom.xml.
There are already more onerous backward compatibility hacks in PL/Java
than that. :)

-Chap


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CREATE INDEX CONCURRENTLY?

2015-12-30 Thread Tim Kane
This just hit us today... Admittedly on an old cluster still running 9.2,
though I can't see any mention of it being addressed since.

Any chance of getting this onto the to-do list?
On Sat, 1 Nov 2014 at 07:45, Simon Riggs  wrote:

> On 31 October 2014 17:46, Michael Banck  wrote:
>
> > I wonder whether that is pilot error (fair enough), or whether something
> > could be done about this?
>
> When originally written the constraints were tighter, but have since
> been relaxed.
>
> Even so a CIC waits until all snapshots that can see it have gone. So
> what you observe is correct and known.
>
>
> Can it be changed? Maybe.
>
> CREATE INDEX gets around the wait by using indcheckxmin to see whether
> the row is usable. So the command completes, even if the index is not
> usable by all current sessions.
>
> We perform the wait in a completely different way for CIC, for this
> reason (in comments)
>
>   We also need not set indcheckxmin during a concurrent index build,
>   because we won't set indisvalid true until all transactions that care
>   about the broken HOT chains are gone.
>
> Reading that again, I can't see why we do it that way. If CREATE INDEX
> can exit once the index is built, so could CONCURRENTLY.
>
> ISTM that we could turn indcheckxmin into an Xid, not a boolean
>    For CREATE INDEX, set the indcheckxmin = xid of the creating transaction
>    For CREATE INDEX CONCURRENTLY, set the indcheckxmin = xid of the
>    completing transaction
>
> The apparent reason it does this is that the Xmin value used currently
> is the Xmin of the index row. The index row is inserted prior to the
> index being valid so that technique cannot work. So I am suggesting
> for CIC that we use the xid of the transaction that completes the
> index, not the xid that originally created the index row. Plus handle
> the difference between valid and not.
>
> --
>  Simon Riggs   http://www.2ndQuadrant.com/
>  PostgreSQL Development, 24x7 Support, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] pg_hba_lookup function to get all matching pg_hba.conf entries

2015-12-30 Thread Shulgin, Oleksandr
On Wed, Dec 30, 2015 at 4:31 AM, Haribabu Kommi 
wrote:

>
> Adding quotes to the pg_hba_lookup function makes it different from the others.
> The issues with this have already been discussed in [1].
>
> select a.database[1], b.datname from
> pg_hba_lookup('postgres','kommih','::1')
> as a, pg_database as b where a.database[1]
> = b.datname;
>
> Queries like the one above are not possible with quoted output. It is very
> rare that the pg_hba_lookup function is used in join operations, but it is
> still better to provide the data without quotes, so I reverted these changes
> in the attached latest patch.
>

That's a good point.  I wonder whether, instead of re-introducing quotes,
we could somehow make the unquoted keywords that have special meaning stand
out, e.g.:

database  | {$sameuser}
user_name | {$all}

That should make it obvious which of the values are placeholders and
wouldn't interfere with joining against the database or user catalogs (while
I would call "sameuser" a very unlikely name for a database, "all" is not
that unlikely a name for a user; e.g. someone named "Albert L. Lucky"
could use it as a login name).

--
Alex


Re: [HACKERS] custom function for converting human readable sizes to bytes

2015-12-30 Thread Shulgin, Oleksandr
On Tue, Dec 29, 2015 at 7:15 PM, Pavel Stehule 
wrote:

>
>> I didn't check out earlier versions of this patch, but the latest one
>> still changes pg_size_pretty() to emit PB suffix.
>>
>> I don't think it is worth it to throw a number of changes together like
>> that.  We should focus on adding pg_size_bytes() first and make it
>> compatible with both pg_size_pretty() and existing GUC units: that is
>> support suffixes up to TB and make sure they have the meaning of powers of
>> 2^10, not 10^3.  Re-using the table present in guc.c would be a plus.
>>
>> Next, we could think about adding handling of PB suffix on input and
>> output, but I don't see a big problem if that is emitted as 1024TB or the
>> user has to specify it as 1024TB in a GUC or argument to pg_size_bytes():
>> a minor inconvenience only.
>>
>
> The last version still supports PB in pg_size_pretty. It isn't a big change.
> PB isn't the issue.
>
> We have to make a significant decision - should we support SI units in
> pg_size_bytes? We cannot change it later. There is disagreement about SI
> units in pg_size_pretty, so SI units in pg_size_bytes could be inconsistent.
>

There is no way at this point to add support for SI units in a consistent
and backwards-compatible manner: both GUC and pg_size_pretty() use SI
suffixes (kB, MB, GB, TB) with the meaning of 2^(10*n) (KiB, MiB, GiB,
TiB).  But given that the size relates to memory or disk space, it is quite
customary *not* to use SI units, so I don't see a point in adding those.
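
For illustration only, the unit convention being argued for (SI-looking
suffixes with power-of-two meanings, mirroring the existing GUC memory units)
could be captured by a table like this; it is a sketch of the convention, not
the proposed pg_size_bytes() implementation:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    struct size_unit
    {
        const char *suffix;
        int64_t     multiplier;
    };

    static const struct size_unit size_units[] = {
        {"kB", INT64_C(1) << 10},   /* 1024 bytes, not 1000 */
        {"MB", INT64_C(1) << 20},
        {"GB", INT64_C(1) << 30},
        {"TB", INT64_C(1) << 40},
    };

    /* Return the multiplier for a suffix, or 0 if it is not recognized. */
    static int64_t
    unit_multiplier(const char *suffix)
    {
        for (size_t i = 0; i < sizeof(size_units) / sizeof(size_units[0]); i++)
        {
            if (strcmp(size_units[i].suffix, suffix) == 0)
                return size_units[i].multiplier;
        }
        return 0;
    }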

--
Alex


Re: [HACKERS] More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

2015-12-30 Thread Benedikt Grundmann
On Wed, Dec 30, 2015 at 7:16 AM, David Rowley 
wrote:

>
> A number of ideas were suggested on the other thread about how we might go
> about solving this problem. In [3] Simon talked about perhaps enabling
> extra optimisations when the planner sees that the plan will cost more than
> some given threshold. That's perhaps an option, but may not work well for
> optimisations which must take place very early in planning, for example [4].
>

A small tweak on 3 to deal with 4: if the returned plan cost is quite high
(say, estimated at minutes or more) you could just restart planning from
scratch with all costly optimizations enabled, because even in the worst case
(that is, the additional options don't find a better plan), the total planning
cost won't matter much in the grand scheme of things.
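
Put as pseudo-C, the idea is roughly the following; every type and function
name here is invented for illustration, nothing like it exists in the planner
today:

    /* Hypothetical "replan if it looks expensive" flow; sketch only. */
    typedef struct HypotheticalPlan
    {
        double      total_cost;
    } HypotheticalPlan;

    extern HypotheticalPlan *cheap_plan(const char *query);    /* today's planning */
    extern HypotheticalPlan *thorough_plan(const char *query); /* extra optimizations on */

    HypotheticalPlan *
    plan_with_optional_second_pass(const char *query, double replan_threshold)
    {
        HypotheticalPlan *plan = cheap_plan(query);

        /*
         * If the cheap plan already looks like it will run for minutes, the
         * extra planning time is noise, so replan with everything enabled and
         * keep the cheaper result.  (Sketch only: the discarded plan leaks.)
         */
        if (plan->total_cost > replan_threshold)
        {
            HypotheticalPlan *better = thorough_plan(query);

            if (better->total_cost < plan->total_cost)
                plan = better;
        }
        return plan;
    }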


Re: [HACKERS] Better detail logging for password auth failures

2015-12-30 Thread Andres Freund
On 2015-12-29 11:07:26 -0500, Tom Lane wrote:
> In passing, the patch gets rid of a vestigial CHECK_FOR_INTERRUPTS()
> call; it was added by e710b65c and IMO should have been removed again
> by 6647248e.  There's certainly no very good reason to have one right
> at that spot anymore.

Why? Doesn't seem like the worst place for an explicit interrupt check?
I think we don't really have a problem with too many such checks... We
surely could move it, but I don't really see how it's related to the
topic at hand nor do I think it's really worth pondering about
extensively.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-30 Thread Alvaro Herrera
Michael Paquier wrote:

> OK, here are new patches.
> - 0001 switches a bunch of TailMatches to Matches. Do we want to care
> about the case where a schema is created followed by a bunch of
> objects? I mean stuff like "CREATE SCHEMA hoge CREATE TABLE ..." where
> the current completion would work fine. The performance gains seem
> worth it compared to the number of people actually using it; the point
> has just not been raised yet.

I'd rather have the completion work for that case than get a few
microseconds speedup.  As far as I recall, it's only four commands that
must retain the old coding.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Making tab-complete.c easier to maintain

2015-12-30 Thread Michael Paquier
On Wed, Dec 30, 2015 at 9:14 AM, Michael Paquier wrote:
> On Wed, Dec 30, 2015 at 6:26 AM, Thomas Munro wrote:
>> I see that you changed INSERT and DELETE (but not UPDATE) to use
>> MatchesN rather than TailMatchesN.  Shouldn't these stay with
>> TailMatchesN for the reason Tom gave above?
>
> Er, yeah. They had better be TailMatches, or even COPY DML stuff will be 
> broken.

OK, here are new patches.
- 0001 switches a bunch of TailMatches to Matches. Do we want to care
about the case where a schema is created followed by a bunch of
objects? I mean stuff like "CREATE SCHEMA hoge CREATE TABLE ..." where
the current completion would work fine. The performance gains seem
worth it compared to the number of people actually using it; the point
has just not been raised yet.
- 0002 that implements the new tab completion for backslash commands,
with the wildcard "*" as suggested by Tom.

In 0001 I fixed the handling of DML queries, and also found a bug in
another query's completion while re-reading the code.
Regards,
-- 
Michael
From 3a2e63b984548d2f4826dabd61e0efcae3aabe22 Mon Sep 17 00:00:00 2001
From: Michael Paquier 
Date: Wed, 30 Dec 2015 20:32:18 +0900
Subject: [PATCH 1/2] Improve performance of psql tab completion

TailMatches is based on a lower-bound check, while Matches requires an exact
number of words. It happens that the former is used in many places where the
latter could be used. Doing the switch improves the performance of tab
completion by matching against an exact number of words for many commands.
---
 src/bin/psql/tab-complete.c | 547 +++-
 1 file changed, 281 insertions(+), 266 deletions(-)

diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4c93ae9..b36bd73 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1137,6 +1137,21 @@ psql_completion(const char *text, int start, int end)
 #define Matches4(p1, p2, p3, p4) \
 	(previous_words_count == 4 && \
 	 TailMatches4(p1, p2, p3, p4))
+#define Matches5(p1, p2, p3, p4, p5) \
+	(previous_words_count == 5 && \
+	 TailMatches5(p1, p2, p3, p4, p5))
+#define Matches6(p1, p2, p3, p4, p5, p6) \
+	(previous_words_count == 6 && \
+	 TailMatches6(p1, p2, p3, p4, p5, p6))
+#define Matches7(p1, p2, p3, p4, p5, p6, p7) \
+	(previous_words_count == 7 && \
+	 TailMatches7(p1, p2, p3, p4, p5, p6, p7))
+#define Matches8(p1, p2, p3, p4, p5, p6, p7, p8) \
+	(previous_words_count == 8 && \
+	 TailMatches8(p1, p2, p3, p4, p5, p6, p7, p8))
+#define Matches9(p1, p2, p3, p4, p5, p6, p7, p8, p9) \
+	(previous_words_count == 9 && \
+	 TailMatches9(p1, p2, p3, p4, p5, p6, p7, p8, p9))
 
 	/*
 	 * Macros for matching N words at the start of the line, regardless of
@@ -1266,10 +1281,10 @@ psql_completion(const char *text, int start, int end)
 	else if (TailMatches7("ALL", "IN", "TABLESPACE", MatchAny, "OWNED", "BY", MatchAny))
 		COMPLETE_WITH_CONST("SET TABLESPACE");
 	/* ALTER AGGREGATE,FUNCTION  */
-	else if (TailMatches3("ALTER", "AGGREGATE|FUNCTION", MatchAny))
+	else if (Matches3("ALTER", "AGGREGATE|FUNCTION", MatchAny))
 		COMPLETE_WITH_CONST("(");
 	/* ALTER AGGREGATE,FUNCTION  (...) */
-	else if (TailMatches4("ALTER", "AGGREGATE|FUNCTION", MatchAny, MatchAny))
+	else if (Matches4("ALTER", "AGGREGATE|FUNCTION", MatchAny, MatchAny))
 	{
 		if (ends_with(prev_wd, ')'))
 			COMPLETE_WITH_LIST3("OWNER TO", "RENAME TO", "SET SCHEMA");
@@ -1278,49 +1293,49 @@ psql_completion(const char *text, int start, int end)
 	}
 
 	/* ALTER SCHEMA  */
-	else if (TailMatches3("ALTER", "SCHEMA", MatchAny))
+	else if (Matches3("ALTER", "SCHEMA", MatchAny))
 		COMPLETE_WITH_LIST2("OWNER TO", "RENAME TO");
 
 	/* ALTER COLLATION  */
-	else if (TailMatches3("ALTER", "COLLATION", MatchAny))
+	else if (Matches3("ALTER", "COLLATION", MatchAny))
 		COMPLETE_WITH_LIST3("OWNER TO", "RENAME TO", "SET SCHEMA");
 
 	/* ALTER CONVERSION  */
-	else if (TailMatches3("ALTER", "CONVERSION", MatchAny))
+	else if (Matches3("ALTER", "CONVERSION", MatchAny))
 		COMPLETE_WITH_LIST3("OWNER TO", "RENAME TO", "SET SCHEMA");
 
 	/* ALTER DATABASE  */
-	else if (TailMatches3("ALTER", "DATABASE", MatchAny))
+	else if (Matches3("ALTER", "DATABASE", MatchAny))
 		COMPLETE_WITH_LIST7("RESET", "SET", "OWNER TO", "RENAME TO",
 			"IS_TEMPLATE", "ALLOW_CONNECTIONS",
 			"CONNECTION LIMIT");
 
 	/* ALTER EVENT TRIGGER */
-	else if (TailMatches3("ALTER", "EVENT", "TRIGGER"))
+	else if (Matches3("ALTER", "EVENT", "TRIGGER"))
 		COMPLETE_WITH_QUERY(Query_for_list_of_event_triggers);
 
 	/* ALTER EVENT TRIGGER  */
-	else if (TailMatches4("ALTER", "EVENT", "TRIGGER", MatchAny))
+	else if (Matches4("ALTER", "EVENT", "TRIGGER", MatchAny))
 		COMPLETE_WITH_LIST4("DISABLE", "ENABLE", "OWNER TO", "RENAME TO");
 
 	/* ALTER EVENT TRIGGER  ENABLE */
-	else if (TailMatches5("ALTER", "EVENT", "TRIGGER", MatchAny, "ENABLE"))
+	else if (Matches5("ALTER", "EVENT", "TRIGGER", MatchAny, "ENABLE"))
 		

[HACKERS] Re: More thorough planning for OLAP queries (was: [PATCH] Equivalence Class Filters)

2015-12-30 Thread Tomas Vondra

On 12/30/2015 08:16 AM, David Rowley wrote:


> I do strongly believe that we need to come up with something to
> solve this problem. I already summarised my thoughts on the other
> thread.


One approach that I don't see mentioned on any of the threads is plan 
caching, which allows reusing the plan across many query executions, 
hopefully amortizing the planning costs.


I'm sure implementing such caching is non-trivial and there are cases 
where it may not help, but perhaps it's not entirely futile (AFAIK it's 
used by some databases exactly to address the higher planning costs).


I imagine a single GUC enabling or disabling this (possibly not just 
globally but per session, user or database).


We already have some form of plan caching, although only for prepared 
statements within a single session - maybe that could be a good starting 
point? For example what if we only enabled those "expensive" 
optimizations for prepared statements, which are assumed to be executed 
multiple times? Of course, this may not be entirely true (say, PL/pgSQL 
uses prepared statements all the time).
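
For reference, that existing session-local caching is what you get today by
preparing once and executing repeatedly; a minimal libpq sketch (connection
string, statement name and query are placeholders):

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        /* Placeholder connection string, statement name and query. */
        PGconn     *conn = PQconnectdb("dbname=postgres");
        PGresult   *res;
        const char *params[1] = {"42"};

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        /* Plan once ... */
        res = PQprepare(conn, "get_item", "SELECT * FROM t WHERE id = $1", 1, NULL);
        PQclear(res);

        /* ... execute many times; the session reuses the prepared statement. */
        for (int i = 0; i < 10; i++)
        {
            res = PQexecPrepared(conn, "get_item", 1, params, NULL, NULL, 0);
            PQclear(res);
        }

        PQfinish(conn);
        return 0;
    }

After a few executions the server may switch such a statement to a cached
generic plan, which is exactly the spot where spending more planner effort
once could pay off repeatedly.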


Of course, the annoying consequence of this would be that the planning 
may get somewhat unpredictable - the plan will depend on whether the 
query was planned directly or as a prepared statement, or whether plan 
caching is enabled. However, the same mostly applies to solutions 
proposed in the other threads so far.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_controldata/pg_resetxlog "Latest checkpoint's NextXID" format

2015-12-30 Thread José Luis Tallón

On 12/30/2015 06:46 AM, Simon Riggs wrote:
On 30 December 2015 at 00:17, Joe Conway wrote:


On 12/29/2015 07:15 AM, Tom Lane wrote:
> Yeah.  Use of the same x/y notation with two different bases seems like
> a recipe for confusion.  It's probably too late to do anything about
> this for 9.5, but I'd be +1 for adopting Jose's suggestion or some
> other formatting tweak in HEAD.

I made the "%u/%u" -> "%u:%u" change in the controldata patch I just
posted, but I suppose I should commit that separately. Any complaints
about that?


There is already long precedent about how to represent an XID with an 
epoch... and it is neither of those two formats.


http://www.postgresql.org/docs/devel/static/functions-info.html
"Table 9-63. Transaction IDs and Snapshots"
"The internal transaction ID type (xid) is 32 bits wide and wraps 
around every 4 billion transactions. However, these functions export a 
64-bit format that is extended with an "epoch" counter so it will not 
wrap around during the life of an installation."


So? If I am guessing correctly, you propose to use %llu (more precisely,
UINT64_FORMAT)?

(please correct me if I got it wrong)

IMHO, we have been telling users that XIDs are 32 bits forever, so
showing a 64-bit int where an XID is expected can easily induce confusion.
Moreover, the "epoch : whatever" format is widely used (Debian &
derivatives, just to name some) and understood...

... but I might be wrong.

Any format different from %X/%X seems better than what we have
now, in any case



Thanks,

/ J.L.



Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-30 Thread Aleksander Alekseev
Here is a clean version of the patch. Step 1 is an optimization. Step 2
refactors this:

 HTAB *
 ShmemInitHash(const char *name, /* table string name for shmem index */
- long init_size,   /* initial table size */
+ long init_size,   /* initial table size XXX ignored, refactor! */


As I understand it, there is no performance degradation:

Before
==

Core i7, pgbench -j 8 -c 8 -T 50 pgbench:

```
number of transactions actually processed: 14315
latency average: 27.945 ms
latency stddev: 44.920 ms
tps = 286.097043 (including connections establishing)
tps = 286.127642 (excluding connections establishing)
```

60-core server, pgbench -j 64 -c 64 -T 50 pgbench:

```
number of transactions actually processed: 176127
latency average: 18.162 ms
latency stddev: 23.861 ms
tps = 3521.069672 (including connections establishing)
tps = 3522.166706 (excluding connections establishing)
```


After
=

Core i7, pgbench -j 8 -c 8 -T 50 pgbench:

```
number of transactions actually processed: 14212
latency average: 27.984 ms
latency stddev: 43.250 ms
tps = 284.038588 (including connections establishing)
tps = 285.112772 (excluding connections establishing)
```

60-core server, pgbench -j 64 -c 64 -T 50 pgbench:

```
number of transactions actually processed: 172135
latency average: 17.767 ms
latency stddev: 22.692 ms
tps = 3590.410302 (including connections establishing)
tps = 3594.607165 (excluding connections establishing)
```

diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index 78f15f0..d361dbb 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -265,7 +265,7 @@ InitShmemIndex(void)
  */
 HTAB *
 ShmemInitHash(const char *name, /* table string name for shmem index */
-			  long init_size,	/* initial table size */
+			  long init_size,	/* initial table size XXX ignored, refactor! */
 			  long max_size,	/* max size of the table */
 			  HASHCTL *infoP,	/* info about key and bucket size */
 			  int hash_flags)	/* info about infoP */
@@ -299,7 +299,7 @@ ShmemInitHash(const char *name, /* table string name for shmem index */
 	/* Pass location of hashtable header to hash_create */
 	infoP->hctl = (HASHHDR *) location;
 
-	return hash_create(name, init_size, infoP, hash_flags);
+	return hash_create(name, max_size, infoP, hash_flags);
 }
 
 /*
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index eacffc4..a50a2d3 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -15,7 +15,7 @@
  * to hash_create.  This prevents any attempt to split buckets on-the-fly.
  * Therefore, each hash bucket chain operates independently, and no fields
  * of the hash header change after init except nentries and freeList.
- * A partitioned table uses a spinlock to guard changes of those two fields.
+ * A partitioned table uses spinlocks to guard changes of those fields.
  * This lets any subset of the hash buckets be treated as a separately
  * lockable partition.  We expect callers to use the low-order bits of a
  * lookup key's hash value as a partition number --- this will work because
@@ -87,6 +87,7 @@
 #include "access/xact.h"
 #include "storage/shmem.h"
 #include "storage/spin.h"
+#include "storage/lock.h"
 #include "utils/dynahash.h"
 #include "utils/memutils.h"
 
@@ -128,12 +129,21 @@ typedef HASHBUCKET *HASHSEGMENT;
  */
 struct HASHHDR
 {
-	/* In a partitioned table, take this lock to touch nentries or freeList */
-	slock_t		mutex;			/* unused if not partitioned table */
+	/*
+	 * In a partitioned table take these locks to touch nentries or freeList.
+	 * If hash table is not partitioned mutexes are not used.
+	 */
+	slock_t		mutex[NUM_LOCK_PARTITIONS];
+
+	/*
+	 * These fields change during entry addition/deletion. If hash table is
+	 * not partitioned only nentries[0] and freeList[0] are used.
+	 */
 
-	/* These fields change during entry addition/deletion */
-	long		nentries;		/* number of entries in hash table */
-	HASHELEMENT *freeList;		/* linked list of free elements */
+	/* number of entries in hash table */
+	long		nentries[NUM_LOCK_PARTITIONS];
+	/* linked list of free elements */
+	HASHELEMENT *freeList[NUM_LOCK_PARTITIONS];
 
 	/* These fields can change, but not in a partitioned table */
 	/* Also, dsize can't change in a shared table, even if unpartitioned */
@@ -166,6 +176,8 @@ struct HASHHDR
 
 #define IS_PARTITIONED(hctl)  ((hctl)->num_partitions != 0)
 
+#define PARTITION_IDX(hctl, hashcode) (IS_PARTITIONED(hctl) ? LockHashPartition(hashcode) : 0)
+
 /*
  * Top control structure for a hashtable --- in a shared table, each backend
  * has its own copy (OK since no fields change at runtime)
@@ -219,10 +231,10 @@ static long hash_accesses,
  */
 static void *DynaHashAlloc(Size size);
 static HASHSEGMENT seg_alloc(HTAB *hashp);
-static bool element_alloc(HTAB *hashp, int nelem);
+static bool element_alloc(HTAB *hashp, int nelem, int partition_idx);
 static bool 

Re: [HACKERS] Better detail logging for password auth failures

2015-12-30 Thread Michael Paquier
On Wed, Dec 30, 2015 at 1:07 AM, Tom Lane  wrote:
> We often tell people to look in the postmaster log for more information
> about authentication problems; but an off-list question prompted me to
> notice that many of the common authentication failure cases don't actually
> get any useful commentary in the log.  The attached proposed patch
> remedies this by adding specific log detail messages for all the
> non-out-of-memory cases processed by md5_crypt_verify().  Essentially,
> this is just covering cases that I omitted to cover in commit 64e43c59,
> for reasons that no longer seem very good to me.
>
> Any objections?

Not really. +1. And the patch looks good.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Improved tab completion for FDW DDL

2015-12-30 Thread Andreas Karlsson

Hi,

Here is a patch which adds the below missing tab completions for the FDW 
DDL commands. I noticed these were missing while playing around with 
FDWs a while ago.


"ALTER SERVER " with "RENAME TO"
"CREATE SERVER  FOREIGN DATA WRAPPER" with fdw name
"CREATE|ALTER USER MAPPING FOR  SERVER " with "OPTIONS ("

Another completion which is currently missing, but which I am not sure we
should add, is completing "ALTER|CREATE|DROP USER" with "MAPPING FOR";
since it might interfere with completing to a username for
"ALTER|DROP USER", I am not sure we want it. What do you think?


Andreas
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index 4c93ae9..46b555c 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1444,7 +1444,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_LIST3("MINVALUE", "MAXVALUE", "CYCLE");
 	/* ALTER SERVER  */
 	else if (TailMatches3("ALTER", "SERVER", MatchAny))
-		COMPLETE_WITH_LIST3("VERSION", "OPTIONS", "OWNER TO");
+		COMPLETE_WITH_LIST4("VERSION", "OPTIONS", "OWNER TO", "RENAME TO");
 	/* ALTER SYSTEM SET, RESET, RESET ALL */
 	else if (TailMatches2("ALTER", "SYSTEM"))
 		COMPLETE_WITH_LIST2("SET", "RESET");
@@ -2015,8 +2015,12 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_LIST3("MINVALUE", "MAXVALUE", "CYCLE");
 
 /* CREATE SERVER  */
+	/* Complete "CREATE SERVER " with "FOREIGN DATA WRAPPER" */
 	else if (TailMatches3("CREATE", "SERVER", MatchAny))
 		COMPLETE_WITH_LIST3("TYPE", "VERSION", "FOREIGN DATA WRAPPER");
+	/* Complete "CREATE SERVER  FOREIGN DATA WRAPPER" with fdw name */
+	else if (TailMatches6("CREATE", "SERVER", MatchAny, "FOREIGN", "DATA", "WRAPPER"))
+		COMPLETE_WITH_QUERY(Query_for_list_of_fdws);
 
 /* CREATE TABLE */
 	/* Complete "CREATE TEMP/TEMPORARY" with the possible temp objects */
@@ -2740,6 +2744,8 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_QUERY(Query_for_list_of_user_mappings);
 	else if (TailMatches5("CREATE|ALTER|DROP", "USER", "MAPPING", "FOR", MatchAny))
 		COMPLETE_WITH_CONST("SERVER");
+	else if (TailMatches7("CREATE|ALTER", "USER", "MAPPING", "FOR", MatchAny, "SERVER", MatchAny))
+		COMPLETE_WITH_CONST("OPTIONS (");
 
 /*
  * VACUUM [ FULL | FREEZE ] [ VERBOSE ] [ table ]

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-30 Thread Alvaro Herrera
Aleksander Alekseev wrote:

> Here is a funny thing - benchmark results I shared on 22.12.2015 are wrong
> because I forgot to run `make clean` after changing lwlock.h (autotools
> doesn't rebuild the project properly after changing .h files).

Running configure with --enable-depend should avoid this problem.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-30 Thread Andres Freund
On 2015-12-30 11:37:19 -0300, Alvaro Herrera wrote:
> Aleksander Alekseev wrote:
> 
> > Here is a funny thing - benchmark results I shared on 22.12.2015 are wrong
> > because I forgot to run `make clean` after changing lwlock.h (autotools
> > doesn't rebuild the project properly after changing .h files).
> 
> Running configure with --enable-depend should avoid this problem.

I still maintain that --enable-depend should be on by default. We're
absurdly optimizing towards saving a handful of cycles in scenarios
which are usually bottlenecked by other things (build boxes spend more
time on tests and such), rather than optimizing for developer time. I
don't know how many people have failed to set --enable-depend by now, but it
definitely goes into several hundreds of wasted hours territory.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] rows estimate in explain analyze for the BRIN index

2015-12-30 Thread Oleksii Kliukin
Hi,

While experimenting with BRIN on PostgreSQL 9.5RC1 I came across the following
plan (which is, btw, a very good example of how BRIN rocks for clustered
data; the size of this table is around 90GB, and the size of the index is
around 3MB):

explain (analyze, buffers, verbose) select 1 from example where event_time = 
now() - interval '5 months';
  QUERY PLAN
---
 Bitmap Heap Scan on example  (cost=744.44..757.64 rows=6 width=0) (actual 
time=73.895..73.895 rows=0 loops=1)
   Output: 1
   Recheck Cond: (example.event_time = (now() - '5 mons'::interval))
   Rows Removed by Index Recheck: 4030
   Heap Blocks: lossy=128
   Buffers: shared hit=629
   ->  Bitmap Index Scan on example_event_time_idx1  (cost=0.00..744.41 rows=6 
width=0) (actual time=70.335..70.335 rows=1280 loops=1)
 Index Cond: (example.event_time = (now() - '5 mons'::interval))
 Buffers: shared hit=501
 Planning time: 0.125 ms
 Execution time: 73.943 ms
(11 rows)

Time: 74.642 ms


Here EXPLAIN ANALYZE reports 1280 rows from the Bitmap Index Scan, but on the 
higher level, 4030 rows were removed by Index Recheck. 

The questions are:

- how does it get 1280 rows from the BRIN index scan, given that BRIN only
stores pointers to the heap blocks, not individual rows? Does it calculate the
number of rows in the blocks returned?

- how come the subsequent recheck filters out 4030 rows, out of the
supposedly 1280 returned?

Kind regards,
--
Oleksii



Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-30 Thread Alvaro Herrera
Andres Freund wrote:
> On 2015-12-30 11:37:19 -0300, Alvaro Herrera wrote:
> > Aleksander Alekseev wrote:
> > 
> > > Here is a funny thing - benchmark results I shared on 22.12.2015 are wrong
> > > because I forgot to run `make clean` after changing lwlock.h (autotools
> > > doesn't rebuild the project properly after changing .h files).
> > 
> > Running configure with --enable-depend should avoid this problem.
> 
> I still maintain that --enable-depend should be on by default.

+1  Let's do it.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-30 Thread Aleksander Alekseev
Agreed. --enable-depend turned on by default could have saved me a lot of
time. Now I'm aware of --enable-depend and `make clean`. Still,
other people will definitely make the same mistake over and over again.

On Wed, 30 Dec 2015 11:49:13 -0300
Alvaro Herrera  wrote:

> Andres Freund wrote:
> > On 2015-12-30 11:37:19 -0300, Alvaro Herrera wrote:
> > > Aleksander Alekseev wrote:
> > > 
> > > > Here is a funny thing - benchmark results I shared on 22.12.2015
> > > > are wrong because I forgot to run `make clean` after changing
> > > > lwlock.h (autotools doesn't rebuild the project properly after
> > > > changing .h files).
> > > 
> > > Running configure with --enable-depend should avoid this problem.
> > 
> > I still maintain that --enable-depend should be on by default.
> 
> +1  Let's do it.
> 



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix lock contention for HASHHDR.mutex

2015-12-30 Thread Oleg Bartunov
On Wed, Dec 30, 2015 at 5:44 PM, Andres Freund  wrote:

> On 2015-12-30 11:37:19 -0300, Alvaro Herrera wrote:
> > Aleksander Alekseev wrote:
> >
> > > Here is a funny thing - benchmark results I shared on 22.12.2015 are wrong
> > > because I forgot to run `make clean` after changing lwlock.h (autotools
> > > doesn't rebuild the project properly after changing .h files).
> >
> > Running configure with --enable-depend should avoid this problem.
>
> I still maintain that --enable-depend should be on by default. We're
>

+1


> absurdly optimizing towards saving a handful of cycles in scenarios
> which are usually bottlenecked by other things (build boxes spend more
> time on tests and such), rather than optimizing for developer time. I
> don't know how many people have failed to set --enable-depend by now, but it
> definitely goes into several hundreds of wasted hours territory.
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: [HACKERS] Better detail logging for password auth failures

2015-12-30 Thread Tom Lane
Andres Freund  writes:
> On 2015-12-29 11:07:26 -0500, Tom Lane wrote:
>> In passing, the patch gets rid of a vestigial CHECK_FOR_INTERRUPTS()
>> call; it was added by e710b65c and IMO should have been removed again
>> by 6647248e.  There's certainly no very good reason to have one right
>> at that spot anymore.

> Why? Doesn't seem like the worst place for an explicit interrupt check?

The only reason there was one there at all was that e710b65c added
code like this:

+   /*
+* Disable immediate interrupts while doing database access.  (Note
+* we don't bother to turn this back on if we hit one of the failure
+* conditions, since we can expect we'll just exit right away anyway.)
+*/
+   ImmediateInterruptOK = false;

... some catalog access here ...

+   /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
+   ImmediateInterruptOK = true;
+   /* And don't forget to detect one that already arrived */
+   CHECK_FOR_INTERRUPTS();

In 6647248e you got rid of nine of these ten lines, leaving something
that's both pointless and undocumented.  There are more than enough
CHECK_FOR_INTERRUPTS calls already in the auth code; there's not a
reason to expend code space on one here.  (If MD5 ran long enough to
be worth interrupting, there would be an argument for a check inside
its hashing loop, but that still wouldn't be this check.)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_controldata/pg_resetxlog "Latest checkpoint's NextXID" format

2015-12-30 Thread Tom Lane
José Luis Tallón  writes:
> On 12/30/2015 06:46 AM, Simon Riggs wrote:
>> There is already long precedent about how to represent an XID with an 
>> epoch... and it is neither of those two formats.

> IMHO, we have been telling users that XIDs are 32 bits forever, so
> showing a 64-bit int where an XID is expected can easily induce confusion.

Yeah.  Also, if you look at xmin or xmax in any stored row, it's only
going to be 32 bits.  If we make pg_controldata merge the epoch into a
decimal representation, it's going to be a serious PITA to manually
compare stored XIDs to the printout.

We've talked about making on-disk XIDs be effectively 64 bits (with
some trick or other to avoid taking a big space hit).  If that ever
happens, and we start printing xmin/xmax as true 64-bit values, then
it'd be appropriate for pg_controldata to do the same.  But right now
I think a split format is still the way to go.
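
Just to illustrate the readability argument, the same counter printed both
ways (toy values only, not pg_controldata code):

    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        /* Toy values: epoch 2, 32-bit xid 123456. */
        uint32_t    epoch = 2;
        uint32_t    xid = 123456;
        uint64_t    merged = ((uint64_t) epoch << 32) | xid;

        /* The split format keeps the stored 32-bit xid directly visible ... */
        printf("Latest checkpoint's NextXID: %u:%u\n",
               (unsigned) epoch, (unsigned) xid);

        /* ... while the merged decimal hides it inside a much larger number. */
        printf("merged: %llu\n", (unsigned long long) merged);

        return 0;
    }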

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers