Re: [HACKERS] Extension Templates S03E11

2013-12-07 Thread Jeff Davis
On Wed, 2013-12-04 at 14:54 -0500, Tom Lane wrote:
 I think Stephen has already argued why it could be a good idea here.
 But in a nutshell: it seems like there are two use-cases to be
 supported, one where you want CREATE EXTENSION hstore to give you
 some appropriate version of hstore, and one where you want to restore
 exactly what you had on the previous installation.  It seems to me that
 exploding the extension by dumping, rather than suppressing, its
 component objects is by far the most reliable way of accomplishing the
 latter.

The behavior of an extension should not depend on how it was installed.

The kind of extension being described by Stephen will:

* Not be updatable by doing ALTER EXTENSION foo UPDATE TO '2.0'
* Dump out objects that wouldn't be dumped if they had installed the
extension using the filesystem

So if we do it this way, then we should pick a new name, like package.

And then we'll need to decide whether it still makes sense to use an
external tool to transform a PGXN extension into a form that could be
loaded as a package.

Regards,
Jeff Davis






Re: [HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-07 Thread MauMau

From: David Johnston pol...@yahoo.com

5. FATAL:  terminating walreceiver process due to administrator command
6. FATAL:  terminating background worker "%s" due to administrator command

5 and 6: I don't fully understand when they would happen but likely fall
into the same "the DBA should know what is going on with their server and
confirm any startup/shutdown activity it is involved with" category.

They might be better categorized at NOTICE level if they were in response to
an administrator action, versus in response to a crashed process, but even
for the user-initiated situation making sure they hit the log but using
FATAL is totally understandable and IMO desirable.


#5 is output when the DBA shuts down the replication standby server.
#6 is output when the DBA shuts down the server while any custom
background worker is in use.
These messages are always output.  What I see as a problem is that FATAL
messages are emitted in a perfectly normal situation; they don't help the
DBA with anything and only cause needless worry.


Regards
MauMau





Re: [HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-07 Thread MauMau

From: David Johnston pol...@yahoo.com
PITR/Failover is not generally that frequent an occurrence but I will agree
that these events are likely common during such.

Maybe PITR/Failover mode can output something in the logs to alleviate user
angst about these frequent events?  I'm doubting that a totally separate
mechanism can be used for this mode, but instead of looking for things to
remove how about adding some additional coddling to the logs at the
beginning and end of the mode change?


Yes, those messages are not output frequently, because they only appear 
during planned or unplanned downtime.  But frequency is not the point. 
Put yourself in the DBA's shoes: they need to perform maintenance or a 
recovery operation in a hurry and hope not to run into any trouble.  They 
would panic if something appears to go wrong at the very moment they expect 
everything to be clean.  The FATAL messages only add to that worry.


The extra FATAL messages can also be a problem for support staff.  How would 
you feel if a DBA asked you for help with an emergency that occurred during 
recovery, and the few important messages were buried in a pile of unnecessary 
FATAL ones?


Regards
MauMau





Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory

2013-12-07 Thread MauMau

From: Amit Kapila amit.kapil...@gmail.com

Today, I had again gone through all the discussion that happened at
that time related to this problem, and I found that later in the discussion
something along the lines of your Approach-2 was considered;
please see the link
http://www.postgresql.org/message-id/503a879c.6070...@dunslane.net


Thank you again.  I'm a bit relieved to see that a similar discussion took place.



Wouldn't the other way to resolve this problem be to re-invoke pg_ctl in
non-restricted mode for the case in question?


It's effectively the same as not checking admin privileges.  And it still 
doesn't help with direct invocation of postgres -C/--describe-config.

I think it is important to resolve this problem, so please go ahead and
add this patch to the next CF.


Thanks.

Regards
MauMau





Re: [HACKERS] Recovery to backup point

2013-12-07 Thread MauMau

From: Michael Paquier michael.paqu...@gmail.com

On Sat, Dec 7, 2013 at 9:06 AM, MauMau maumau...@gmail.com wrote:

Recovery target 'immediate'
http://www.postgresql.org/message-id/51703751.2020...@vmware.com

May I implement this feature and submit a patch for the next commitfest 
if I have time?

Please feel free. I might as well participate in the review.


Thanks.  I'm feeling inclined to make the configuration recovery_target = 
'backup_point' instead of recovery_target = 'immediate', because:

* The meaning of this feature for users is to recover the database to the 
backup point.
* It doesn't seem to need a new parameter; recovery_target_time sounds 
appropriate because users want to restore the database as of the time of 
the backup.


Regards
MauMau





Re: [HACKERS] [PATCH 1/2] SSL: GUC option to prefer server cipher order

2013-12-07 Thread Peter Eisentraut
Committed your v2 patch (with the default set to on).  I added a small snippet
of documentation explaining that this setting is mainly for backward
compatibility.






Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-07 Thread Greg Stark
On Sat, Dec 7, 2013 at 12:27 AM, David Johnston pol...@yahoo.com wrote:

 1. FATAL:  the database system is starting up

 How about altering the message to tone down the severity by a half-step...

 FATAL: (probably) not! - the database system is starting up

Well, it is fatal: the backend for that client doesn't continue.

FATAL means a backend died. It is kind of vague how FATAL and PANIC
differ. They both sound like the end of the world. If you read FATAL
thinking it means the whole service is quitting -- ie what PANIC means
-- then these sound like they're noise.


-- 
greg




[HACKERS] Re: [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-07 Thread David Johnston
MauMau wrote:
 From: David Johnston polobo@
 5. FATAL:  terminating walreceiver process due to administrator command
 6. FATAL:  terminating background worker "%s" due to administrator
 command
 5 and 6: I don't fully understand when they would happen but likely fall
 into the same "the DBA should know what is going on with their server and
 confirm any startup/shutdown activity it is involved with" category.

 They might be better categorized at NOTICE level if they were in response to
 an administrator action, versus in response to a crashed process, but even
 for the user-initiated situation making sure they hit the log but using
 FATAL is totally understandable and IMO desirable.
 
 #5 is output when the DBA shuts down the replication standby server.
 #6 is output when the DBA shuts down the server while any custom
 background worker is in use.
 These messages are always output.  What I see as a problem is that FATAL
 messages are emitted in a perfectly normal situation; they don't help the
 DBA with anything and only cause needless worry.

While worry is something to be avoided, not logging messages is not the
correct solution.  On the project side, looking for ways and places to better
communicate with the user is worthwhile in the absence of adding additional
complexity to the logging system.  The user/DBA, though, is expected to
learn how the process works, ideally in a testing environment where worry is
much less of a problem, and to understand that some of what they are seeing
is client-oriented messages that they should be aware of but that do not
indicate server-level issues by themselves.

They should probably run the log through a filter (a template of which the
project should provide) that takes the raw log and filters out those things
that, relative to the role and context, are really better classified as
NOTICE even if the log-level is shown as FATAL.

Raw logs should be verbose so that tools can have more data with which to
make decisions.  You can always hide what is provided but you can never show
what has been omitted.  I get that people will scan raw logs, but targeting
that lowest common denominator isn't necessarily the best thing to do.  Instead,
getting those people on board with better practices and showing (and
providing) them helpful tools should be the goal.

David J.







Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-07 Thread Andres Freund
On 2013-12-07 13:58:11 +, Greg Stark wrote:
 FATAL means a backend died. It is kind of vague how FATAL and PANIC
 differ.

I don't really see much vagueness there. FATAL is an unexpected but
orderly shutdown. PANIC is for the situations where we can't handle the
problem that occurred in any orderly way.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [RFC] Shouldn't we remove annoying FATAL messages from server log?

2013-12-07 Thread Greg Stark
On Sat, Dec 7, 2013 at 3:56 PM, Andres Freund and...@2ndquadrant.com wrote:

 I don't really see much vagueness there. FATAL is an unexpected but
 orderly shutdown. PANIC is for the situations where we can't handle the
 problem that occurred in any orderly way.

Sorry, I was unclear. I meant that, without context, if someone asked
you which was more severe, "fatal" or "panic", you would have no
particular way to know. After all, for a person it's easier to cure a
panic than a fatality :)

On the client end the FATAL is pretty logical, but in the logs it makes
it sound like the entire server died, especially in this day of
multithreaded servers. I was suggesting that that is the origin of
the confusion here. Anyone who has seen these messages on the client
end many times might interpret them correctly in the server logs, but
someone who has only been a DBA, not a database user, might never have
seen them except in the server logs, and without that context might not
realize that FATAL is a term of art peculiar to Postgres.


-- 
greg




Re: [HACKERS] pg_stat_statements: calls under-estimation propagation

2013-12-07 Thread Fujii Masao
On Sat, Dec 7, 2013 at 6:31 AM, Peter Geoghegan p...@heroku.com wrote:
 On Fri, Dec 6, 2013 at 12:24 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 There seems to be no problem even if we use bigint as the type of
 unsigned 32-bit integer like queryid. For example, txid_current()
 returns the transaction ID, i.e., unsigned 32-bit integer, as bigint.
 Could you tell me what the problem is when using bigint for queryid?

 We're talking about the output of some view, right, not internal storage?
 +1 for using bigint for that.  Using OID is definitely an abuse, because
 the value *isn't* an OID.  And besides, what if we someday decide we need
 64-bit keys not 32-bit?

 Fair enough. I was concerned about the cost of external storage of
 64-bit integers (unlike query text, they might have to be stored many
 times for many distinct intervals or something like that), but in
 hindsight that was fairly miserly of me.

 Attached revision displays signed 64-bit integers instead.

Thanks! Looks good to me. Committed!
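
For illustration, a minimal query against the updated view; a sketch that
assumes the column keeps the name queryid used in the patch:

select queryid, calls, total_time, rows, query
  from pg_stat_statements
 order by total_time desc
 limit 5;
-- queryid is now reported as a signed bigint rather than an oid-like value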

Regards,

-- 
Fujii Masao




Re: [HACKERS] Extension Templates S03E11

2013-12-07 Thread Stephen Frost
* Jeff Davis (pg...@j-davis.com) wrote:
 On Wed, 2013-12-04 at 14:54 -0500, Tom Lane wrote:
  I think Stephen has already argued why it could be a good idea here.
  But in a nutshell: it seems like there are two use-cases to be
  supported, one where you want CREATE EXTENSION hstore to give you
  some appropriate version of hstore, and one where you want to restore
  exactly what you had on the previous installation.  It seems to me that
  exploding the extension by dumping, rather than suppressing, its
  component objects is by far the most reliable way of accomplishing the
  latter.
 
 The behavior of an extension should not depend on how it was installed.
 
 The kind of extension being described by Stephen will:
 
 * Not be updatable by doing ALTER EXTENSION foo UPDATE TO '2.0'

That's correct, but consider when the above command actually works
today: when the new version is magically made available to the backend
without any action happening in PG.  That works when the filesystem can
be updated independently or the backend can reach out to some other
place and pull things down but that's, really, a pretty special
situation.  It works today specifically because we expect the OS
packaging system to make changes to the filesystem for us but this whole
'extension template' proposal is about getting away from the filesystem.
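
For a concrete picture of the flow being contrasted here, a sketch only,
using a hypothetical extension foo packaged by the OS:

-- 1. the OS package drops foo--1.0--2.0.sql (and any new .so) into
--    SHAREDIR/extension, entirely outside PostgreSQL's control
-- 2. only then does the in-database step succeed, because the backend
--    can now find the update script:
ALTER EXTENSION foo UPDATE TO '2.0';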

Instead, I'm suggesting an external tool which can pull down the new
version from an external repo and then apply it to the backend.
Clearly, my approach supports the general action of updating an
extension, it just doesn't expect the filesystem on the server to be
changed underneath PG nor that PG will reach out to some external
repository on the basis of a non-superuser request to get the update
script.

 * Dump out objects that wouldn't be dumped if they had installed the
 extension using the filesystem

Correct, but the general presumption here is that many of these
extensions wouldn't even be available for installation on the
filesystem anyway.

 So if we do it this way, then we should pick a new name, like package.

I've been on the fence about this for a while.  There are definitely pros
and cons to consider, but it would go very much against one of the goals
here, which is to avoid asking extension authors (or their users, to
some extent..) to change, and I expect it'd also be a lot more work to
invent something which is 90% the same as extensions.  Perhaps there's
no help for it and we'll need "extensions" for OS-managed extensions
(which can include .so files and friends) and then "packages" for
entirely-in-catalog sets of objects with a certain amount
of metadata included (version and the like).

 And then we'll need to decide whether it still makes sense to use an
 external tool to transform a PGXN extension into a form that could be
 loaded as a package.

I'd certainly think it would be but if we're moving away from calling
them extensions then I'm afraid extension authors and users would end up
having to change something anyway, no matter the tool.  Perhaps that's
reasonable and we can at least minimize the impact but much of what
extensions offer are, in fact, what's also needed for packages and I'm
not thrilled with the apparent duplication.

It just occurred to me that perhaps we can call these something
different towards the end user but use the existing catalogs and code
for extensions to handle our representation, with a few minor tweaks...
Not sure if I like that idea, but it's a thought.

Thanks,

Stephen




Re: [HACKERS] Storing pg_stat_statements query texts externally, pg_stat_statements in core

2013-12-07 Thread Fujii Masao
On Sun, Nov 17, 2013 at 1:15 PM, Peter Geoghegan p...@heroku.com wrote:
 On Sat, Nov 16, 2013 at 4:36 PM, Peter Geoghegan p...@heroku.com wrote:
 I'll post a revision with fixes for those. Another concern is that I
 see some race conditions that only occur when pg_stat_statements cache
 is under a lot of pressure with a lot of concurrency, that wasn't
 revealed by my original testing. I'm working on a fix for that.

 Revision attached.

The patch doesn't apply cleanly against the master due to a recent change in
pg_stat_statements. Could you update the patch?

Regards,

-- 
Fujii Masao




Re: [HACKERS] stats for network traffic WIP

2013-12-07 Thread Fujii Masao
On Wed, Nov 20, 2013 at 3:18 AM, Atri Sharma atri.j...@gmail.com wrote:
 On Tue, Nov 19, 2013 at 11:43 PM, Mike Blackwell mike.blackw...@rrd.com 
 wrote:
 This patch looks good to me.  It applies, builds, and runs the regression
 tests.  Documentation is included and it seems to do what it says.  I don't
 consider myself a code expert, but as far as I can see it looks fine.  This
 is a pretty straightforward enhancement to the existing pg_stat_* code.

 If no one has any objections, I'll mark it ready for committer.

 Mike

 I agree.

 I had a discussion with Mike yesterday, and took the performance areas
 in the patch. I think the impact would be pretty low and since the
 global counter being incremented is incremented with keeping race
 conditions in mind, I think that the statistics collected will be
 valid.

 So, I have no objections to the patch being marked as ready for committer.

Could you share the performance numbers? I'm really concerned about
the performance overhead caused by this patch.

Here are the comments from me:

All the restrictions of this feature should be documented. For example,
this feature doesn't track the bytes of the data transferred by FDW.
It's worth documenting that kind of information.

ISTM that this feature doesn't support the SSL case. Why not?

The amount of data transferred by walreceiver also should be tracked,
I think.

I just wonder how conn_received, conn_backend and conn_walsender
are useful.

Regards,

-- 
Fujii Masao




Re: [HACKERS] stats for network traffic WIP

2013-12-07 Thread Atri Sharma


Sent from my iPad

 On 07-Dec-2013, at 23:47, Fujii Masao masao.fu...@gmail.com wrote:
 
 On Wed, Nov 20, 2013 at 3:18 AM, Atri Sharma atri.j...@gmail.com wrote:
 On Tue, Nov 19, 2013 at 11:43 PM, Mike Blackwell mike.blackw...@rrd.com 
 wrote:
 This patch looks good to me.  It applies, builds, and runs the regression
 tests.  Documentation is included and it seems to do what it says.  I don't
 consider myself a code expert, but as far as I can see it looks fine.  This
 is a pretty straightforward enhancement to the existing pg_stat_* code.
 
 If no one has any objections, I'll mark it ready for committer.
 
 Mike
 
 I agree.
 
 I had a discussion with Mike yesterday, and took the performance areas
 in the patch. I think the impact would be pretty low and since the
 global counter being incremented is incremented with keeping race
 conditions in mind, I think that the statistics collected will be
 valid.
 
 So, I have no objections to the patch being marked as ready for committer.
 
 Could you share the performance numbers? I'm really concerned about
 the performance overhead caused by this patch.
I did some pgbench tests, specifically with an increasing number of clients, as 
those are the kinds of workloads that can expose slowness due to the increase 
in work in the commonly used functions. Let me see if I can find the 
numbers and where I kept them.

 
 Here are the comments from me:
 
 All the restrictions of this feature should be documented. For example,
 this feature doesn't track the bytes of the data transferred by FDW.
 It's worth documenting that kind of information.

+1

 
 ISTM that this feature doesn't support SSL case. Why not?
 
 The amount of data transferred by walreceiver also should be tracked,
 I think.
 
Yes, I agree. WAL receiver data transfer can be problematic at times as well, 
so it should be tracked.

Regards,

Atri



[HACKERS] Index overhead cost reporting

2013-12-07 Thread Thom Brown
Hi,

I was wondering whether anyone has any insight with regards to
measuring and reporting the overhead of maintaining indexes on
relations.  If an UPDATE is issued to a table with, say, 6 indexes, it
would be useful to determine how much time is spent updating each of
those indexes.  And perhaps such timings would not be confined to
indexes, but also other dependants that add overhead, such as
triggers, rules, and in future, eager incremental materialised view
updates.

One use case I had in mind is observing whether any index is
particularly burdensome to the overall plan.  Then an analysis of
those numbers could show that, for example, 25% of the time spent on
DML on table my_table was spent on maintaining index
idx_my_table_a... an index that's not often used.

Is that something that could be provided in an EXPLAIN ANALYSE node?

Thom




Re: [HACKERS] Index overhead cost reporting

2013-12-07 Thread Tom Lane
Thom Brown t...@linux.com writes:
 I was wondering whether anyone has any insight with regards to
 measuring and reporting the overhead of maintaining indexes on
 relations.  If an UPDATE is issued to a table with, say, 6 indexes, it
 would be useful to determine how much time is spent updating each of
 those indexes.  And perhaps such timings would not be confined to
 indexes, but also other dependants that add overhead, such as
 triggers, rules, and in future, eager incremental materialised view
 updates.

We already do track the time spent in triggers.  Although Kevin might
have a different idea, I'd think that matview updates should also be
driven by triggers, so that the last item there would be covered.
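
For comparison, that existing trigger instrumentation shows up as extra
lines after the plan in EXPLAIN ANALYZE output; an illustrative sketch only,
with a hypothetical table and trigger:

EXPLAIN ANALYZE UPDATE my_table SET val = val + 1;
--  Update on my_table  (actual time=... rows=0 loops=1)
--    ->  Seq Scan on my_table  (actual time=... rows=... loops=1)
--  Trigger my_table_audit: time=1.234 calls=100
--  Total runtime: ...

Thom's suggestion would presumably add analogous per-index lines.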

 Is that something that could be provided in an EXPLAIN ANALYSE node?

Well, it'd not be particularly difficult to add measurements of the time
spent in ExecInsertIndexTuples, but I'd have some concerns about that:

* Instrumentation overhead.  That's already painful on machines with
slow gettimeofday, and this has the potential to add a lot more,
especially with the more expansive readings of your proposal.

* Is it really measuring the right thing?  To a much greater degree
than for some other things you might try to measure, just counting
time spent in ExecInsertIndexTuples is going to understate the true
cost of updating an index, because so much of the true cost is paid
asynchronously; viz, writing WAL as well as the actual index pages.
We already have that issue with measuring the runtime of a
ModifyTable node as a whole, but slicing and dicing that time
ten ways would make the results even more untrustworthy, IMO.

* There are also other costs to think of, such as the time spent by
VACUUM on maintaining the index, and the time spent by the planner
considering (perhaps vainly) whether it can use the index for each
query that reads the table.  In principle you could instrument
VACUUM to track the time it spends updating each index, and log
that in the pgstats infrastructure.  (I'd even think that might be
a good idea, except for the bloat effect on the pgstats files.)
I'm not at all sure there's any practical way to measure the distributed
planner overhead; it's not paid in discrete chunks large enough to be
timed easily.  Perhaps it's small enough to ignore, but I'm not sure.

Bottom line, I think it's a harder problem than it might seem at
first glance.

regards, tom lane




Re: [HACKERS] ANALYZE sampling is too good

2013-12-07 Thread Robert Haas
On Tue, Dec 3, 2013 at 6:30 PM, Greg Stark st...@mit.edu wrote:
 I always gave the party line that ANALYZE only takes a small
 constant-sized sample so even very large tables should be very quick.
 But after hearing the same story again in Heroku I looked into it a
 bit further. I was kind of shocked by the numbers.

 ANALYZE takes a sample of 300 * statistics_target rows. That sounds
 pretty reasonable but with default_statistics_target set to 100 that's
 30,000 rows. If I'm reading the code right, it takes this sample by
 sampling 30,000 blocks and then (if the table is large enough) taking
 an average of one row per block. Each block is 8192 bytes so that
 means it's reading 240MB of each table. That's a lot more than I
 realized.
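
For illustration, the arithmetic behind the quoted figure, runnable as plain SQL:

SELECT 300 * 100                                  AS sample_rows,
       pg_size_pretty((300 * 100 * 8192)::bigint) AS worst_case_read;
-- sample_rows = 30000, worst_case_read = 234 MB (roughly the 240MB cited above)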

That is a lot.  On the other hand, I question the subject line:
sometimes, our ANALYZE sampling is not good enough.  Before we raised
the default statistics target from 10 to 100, complaints about bad
plan choices due to insufficiently-precise statistics were legion --
and we still have people periodically proposing to sample a fixed
percentage of the table instead of a fixed amount of the table, even
on large tables, which is going the opposite direction.  I think this
is because they're getting really bad n_distinct estimates, and no
fixed-size sample can reliably give a good one.

More generally, I think the basic problem that people are trying to
solve by raising the statistics target is avoid index scans on
gigantic tables.  Obviously, there are a lot of other situations where
inadequate statistics can cause problems, but that's a pretty
easy-to-understand one that we do not always get right.  We know that
an equality search looking for some_column = 'some_constant', where
some_constant is an MCV, must be more selective than a search for the
least-frequent MCV.  If you store more and more MCVs for a table,
eventually you'll have enough that the least-frequent one is pretty
infrequent, and then things work a lot better.
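
The MCV list and its frequencies in question are visible per column via
pg_stats; for example, for a hypothetical table t and column c:

SELECT most_common_vals, most_common_freqs
  FROM pg_stats
 WHERE tablename = 't' AND attname = 'c';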

This is more of a problem for big tables than for small tables.  MCV
#100 can't have a frequency of greater than 1/100 = 0.01, but that's a
lot more rows on a big table than on a small one.  On a table with 10
million rows we might estimate something close to 100,000 rows when
the real number is just a handful; when the table has only 10,000
rows, we just can't be off by as many orders of magnitude.  Things
don't always work out that badly, but in the worst case they do.

Maybe there's some highly-principled statistical approach which could
be taken here, and if so that's fine, but I suspect not.  So what I
think we should do is auto-tune the statistics target based on the
table size.  If, say, we think that the generally useful range for the
statistics target is something like 10 to 400, then let's come up with
a formula based on table size that outputs 10 for small tables, 400
for really big tables, and intermediate values for tables in the
middle.
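
Today that target can only be set by hand; a sketch of the existing knobs
such a formula would effectively be turning automatically, using a
hypothetical table and column:

ALTER TABLE big_table ALTER COLUMN some_column SET STATISTICS 400;
-- or, for everything: SET default_statistics_target = 400; followed by ANALYZE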

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Show lossy heap block info in EXPLAIN ANALYZE for bitmap heap scan

2013-12-07 Thread Robert Haas
On Fri, Dec 6, 2013 at 5:02 AM, Etsuro Fujita
fujita.ets...@lab.ntt.co.jp wrote:
 Though at first I agreed on this, while working on this I started to think 
 information about (2) is enough for tuning work_mem.  Here are examples using 
 a version under development, where "Bitmap Memory Usage" means (peak) memory 
 space used by a TIDBitmap, and "Desired" means the memory required to 
 guarantee non-lossy storage of a TID set, which is shown only when the 
 TIDBitmap has been lossified.  (work_mem = 1MB.)

I'd be wary of showing a desired value unless it's highly likely to be accurate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2013-12-07 Thread Robert Haas
On Fri, Dec 6, 2013 at 8:12 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-05 14:07:39 -0500, Robert Haas wrote:
 On Thu, Dec 5, 2013 at 8:29 AM, Andres Freund and...@2ndquadrant.com wrote:
  Hm. The API change of on_shmem_exit() is going to cause a fair bit of
  pain. There are a fair number of extensions out there using it and all
  would need to be littered by version dependent #ifdefs. What about
  providing a version of on_shmem_exit() that allows to specify the exit
  phase, and make on_shmem_exit() a backward compatible wrapper?

 Or, we could have on_user_exit() or so and leave on_shmem_exit() alone
 altogether, which would probably be transparent for nearly everyone.
 I kind of like that better, except that I'm not sure whether
 on_user_exit() is any good as a name.

 Leaving it as separate calls sounds good to me as well - but like you I
 don't like on_user_exit() that much. Not sure if I can up with something
 really good...
 on_shmem_exit_phase() or at_shmem_exit_phase() if we go for a function
 allowing to specify phases, and just before_shmem_exit() for the user
 level things?

Hmm, before_shmem_exit() and on_shmem_exit() doesn't sound bad.

  Hm. A couple of questions/comments:
  * How do you imagine keys to be used/built?
  * Won't the sequential search over toc entries become problematic?
  * Some high level overview of how it's supposed to be used wouldn't
hurt.
  * the estimator stuff doesn't seem to belong in the public header?

 The test-shm-mq patch is intended to illustrate these points.  In that
 case, for an N-worker configuration, there are N+1 keys; that is, N
 message queues and one fixed-size control area.  The estimator stuff
 is critically needed in the public header so that you can size your
 DSM to hold the stuff you intend to store in it; again, see
 test-shm-mq.

 I still think shm_mq.c needs to explain more of that. That's where I
 look for a brief high-level explanation, no?

That stuff mostly pertains to shm_toc.c, not shm_mq.c.  I can
certainly look at expanding the explanation in the former.

 I had a very quick look at shm_mq_receive()/send() and
 shm_mq_receive_bytes()/shm_mq_send_bytes() - while the low level steps
 seem to be explained fairly well, the high level design of the queue
 doesn't seem to be explained much. I was first thinking that you were
 trying to implement a lockless queue (which is quite efficiently
 possible for 1:1 queues) even because I didn't see any locking in them
 till I noticed it's just delegated to helper functions.

I did initially implement it as lockless, but I figured I needed a
fallback with spinlocks for machines that didn't have atomic 8-byte
writes and/or memory barriers, both of which are required for this to
work without locks.  When I implemented the fallback, I discovered
that it wasn't any slower than the lock-free implementation, so I
decided to simplify my life (and the code) by not bothering with it.

So the basic idea is that you have to take the spinlock to modify the
count of bytes read - if you're the reader - or written - if you're
the writer.  You also need to take the spinlock to read the counter
the other process might be modifying, but you can read the counter
that only you can modify without the lock.  You also don't need the
lock to read or write the data in the buffer, because you can only
write to the portion of the buffer that the reader isn't currently
allowed to read, and you can only read from the portion of the buffer
that the writer isn't currently allowed to write.

 I wonder if we couldn't just generally store a generation number for
 each PGPROC which is increased every time the slot is getting
 reused. Then one could simply check whether mq->mq_sender->generation ==
 mq->mq_sender_generation.

I think you'd want to bump the generation every time the slot was
released rather than when it was reacquired, but it does sound
possibly sensible.  However, it wouldn't obviate the separate need to
watch a BackgroundWorkerHandle, because what this lets you do is (say)
write to a queue after you've registered the reader but before it's
actually been started by the postmaster, let alone acquired a PGPROC.
I won't object if someone implements such a system for their needs,
but it's not on my critical path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Index overhead cost reporting

2013-12-07 Thread Thom Brown
On 7 December 2013 19:41, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown t...@linux.com writes:
 I was wondering whether anyone has any insight with regards to
 measuring and reporting the overhead of maintaining indexes on
 relations.  If an UPDATE is issued to a table with, say, 6 indexes, it
 would be useful to determine how much time is spent updating each of
 those indexes.  And perhaps such timings would not be confined to
 indexes, but also other dependants that add overhead, such as
 triggers, rules, and in future, eager incremental materialised view
 updates.

 We already do track the time spent in triggers.  Although Kevin might
 have a different idea, I'd think that matview updates should also be
 driven by triggers, so that the last item there would be covered.

Oh, of course. :)

 Is that something that could be provided in an EXPLAIN ANALYSE node?

 Well, it'd not be particularly difficult to add measurements of the time
 spent in ExecInsertIndexTuples, but I'd have some concerns about that:

 * Instrumentation overhead.  That's already painful on machines with
 slow gettimeofday, and this has the potential to add a lot more,
 especially with the more expansive readings of your proposal.

 * Is it really measuring the right thing?  To a much greater degree
 than for some other things you might try to measure, just counting
 time spent in ExecInsertIndexTuples is going to understate the true
 cost of updating an index, because so much of the true cost is paid
 asynchronously; viz, writing WAL as well as the actual index pages.
 We already have that issue with measuring the runtime of a
 ModifyTable node as a whole, but slicing and dicing that time
 ten ways would make the results even more untrustworthy, IMO.

 * There are also other costs to think of, such as the time spent by
 VACUUM on maintaining the index, and the time spent by the planner
 considering (perhaps vainly) whether it can use the index for each
 query that reads the table.  In principle you could instrument
 VACUUM to track the time it spends updating each index, and log
 that in the pgstats infrastructure.  (I'd even think that might be
 a good idea, except for the bloat effect on the pgstats files.)
 I'm not at all sure there's any practical way to measure the distributed
 planner overhead; it's not paid in discrete chunks large enough to be
 timed easily.  Perhaps it's small enough to ignore, but I'm not sure.

 Bottom line, I think it's a harder problem than it might seem at
 first glance.

Thanks for taking the time to explain the above.

Perhaps I may have misunderstood, or not explained my question with
enough detail, but you appear to be including activity that would, in
all likelihood, occur after the DML has returned confirmation to the
user that it has completed; in particular, VACUUM.  What I was
thinking of was an execution plan node to communicate the index
modifications that are carried out prior to confirmation of the query
completing.  The bgwriter, WAL writer et al. that spring into action
as a result of the index being updated wouldn't, as I see it, be
included.

So in essence, I'd only be looking for a breakdown of anything that
adds to the duration of the DML statement.  However, it sounds like
even that isn't straightforward from what you've written.

Thom




Re: [HACKERS] [PATCH 2/2] SSL: Support ECDH key exchange.

2013-12-07 Thread Peter Eisentraut
On Thu, 2013-11-07 at 01:59 +0200, Marko Kreen wrote:
 This sets up ECDH key exchange, when compiling against OpenSSL
 that supports EC.  Then ECDHE-RSA and ECDHE-ECDSA ciphersuites
 can be used for SSL connections.  The latter means that EC keys
 are now usable.

Committed v2.





Re: [HACKERS] plpgsql_check_function - rebase for 9.3

2013-12-07 Thread Steve Singer

On 08/22/2013 02:02 AM, Pavel Stehule wrote:

rebased

Regards

Pavel

This patch again needs a rebase, it no longer applies cleanly. 
plpgsql_estate_setup now takes a shared estate parameter and the call in 
pl_check needs that.   I passed NULL in and things seem to work.




I think this is another example where our processes aren't working as 
we'd like.


The last time this patch got a review was in March, when Tom said: 
http://www.postgresql.org/message-id/27661.1364267...@sss.pgh.pa.us


Shortly afterwards, Pavel responded with a new patch that changed the output 
format. 
http://www.postgresql.org/message-id/CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_jjop21fj...@mail.gmail.com


I don't think much progress has been made on this patch in the 9 months since.

In Tom's review he raises the issue of code duplication and asks:
"do we want to accept that this is the implementation pathway we're 
going to settle for, or are we going to hold out for a better one? I'd 
be the first in line to hold out if I had a clue how to move towards a 
better implementation, but I have none."


This question is still outstanding.

Here are my comments on this version of the patch:



This version of the patch gives output as a table with the structure:

 functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------

I like this much better than the previous format.

I think the patch needs more documentation.

The extent of the documentation is

 <title>Checking of embedded SQL</title>
+<para>
+ The SQL statements inside <application>PL/pgSQL</application> functions are
+ checked by validator for semantic errors. These errors
+ can be found by <function>plpgsql_check_function</function>:



I don't think that this adequately describes the function.  It isn't clear to 
me what we mean by 'embedded SQL' and 'semantic errors'.


I think this would read better as

  <sect2 id="plpgsql-check-function">
   <title>Checking SQL Inside Functions</title>
   <para>
 The SQL statements inside of <application>PL/pgSQL</application> functions can be
 checked by a validation function for semantic errors. You can check
 <application>PL/pgSQL</application> functions by passing the name of the
 function to <function>plpgsql_check_function</function>.
   </para>
   <para>
   The <function>plpgsql_check_function</function> will check the SQL
   inside of a <application>PL/pgSQL</application> function for certain
   types of errors and warnings.
   </para>

but that needs to be expanded on.

I was expecting something like:

create function x() returns void as $$ declare a int4; begin a='f'; end $$ 
language plpgsql;

to return an error but it doesn't

but

create temp table x(a int4);
create or replace function x() returns void as $$ declare a int4; begin insert 
into x values('h'); end $$ language plpgsql;

does give an error when I pass it to the validator.   Is this the intended 
behavior of the patch? If so we need to explain why the first function doesn't 
generate an error.


I feel we need to document what each of the columns in the output means.  What 
is the difference between statement, query and context?

Some random comments on the messages you return:

Line 816:

if (is_expression && tupdesc->natts != 1)
    ereport(ERROR,
            (errcode(ERRCODE_SYNTAX_ERROR),
             errmsg("query \"%s\" returned %d columns",
                    query->query,
                    tupdesc->natts)));

Should this be "query \"%s\" returned %d columns but only 1 is allowed" or 
something similar?

 Line 837

else
    /* XXX: should be a warning? */
    ereport(ERROR,
            (errmsg("cannot to identify real type for record type variable")));

In what situation does this come up?  Should it be a warning or an error?  
Do we mean "the real type for record variable"?


Line 1000
ereport(ERROR,
+ (errcode(ERRCODE_DATATYPE_MISMATCH),
+  errmsg("function does not return composite type, is not possible to identify composite type")));

Should this be "function does not return a composite type, it is not 
possible to identify the composite type"?


Line 1109:

appendStringInfo(str, "parameter \"%s\" is overlapped",
+                refname);

gives warnings like:

select message, detail FROM plpgsql_check_function('b(int)');
           message           |                   detail
-----------------------------+---------------------------------------------
 parameter "a" is overlapped | Local variable overlap function parameter.


How about instead:
 parameter "a" is duplicated | Local variable is named the same as the 
function parameter



Line 1278:

+ checker_error(cstate,
+   

Re: [HACKERS] ANALYZE sampling is too good

2013-12-07 Thread Josh Berkus
On 12/07/2013 11:46 AM, Robert Haas wrote:
 Maybe there's some highly-principled statistical approach which could
 be taken here, and if so that's fine, but I suspect not.  So what I
 think we should do is auto-tune the statistics target based on the
 table size.  If, say, we think that the generally useful range for the
 statistics target is something like 10 to 400, then let's come up with
 a formula based on table size that outputs 10 for small tables, 400
 for really big tables, and intermediate values for tables in the
 middle.

The only approach which makes sense is to base it on a % of the table.
In fact, pretty much every paper which has examined statistics
estimation for database tables has determined that any estimate based on
a less-than-5% sample is going to be wildly inaccurate.  Not that 5%
samples are 100% accurate, but at least they fit the 80/20 rule.
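
Illustrative arithmetic only: what a 5% sample means next to today's fixed
sample on a 100-million-row table:

SELECT (100000000 * 0.05)::bigint AS five_percent_sample_rows,
       300 * 100                  AS current_fixed_sample_rows;
-- 5000000 vs. 30000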

This is the reason why implementing block-based sampling is critical;
using our current "take one row out of every page" method, sampling 5%
of the table means scanning the whole thing in most tables.  We also
need to decouple the number of MCVs we keep from the sample size.
Certainly our existing sampling algo seems designed to maximize IO for
the sample size.

There are other qualitative improvements we could make, which Nathan Boley
has spoken about.   For example, our stats code has no way to recognize a
normal or exponential distribution -- it assumes that all columns are
randomly distributed.  If we could recognize common distribution
patterns, then not only could we have better query estimates, those
would require keeping *fewer* stats, since all you need for a normal
distribution are the end points and the variance.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] WITHIN GROUP patch

2013-12-07 Thread Tom Lane
Andrew Gierth and...@tao11.riddles.org.uk writes:
 "Tom" == Tom Lane t...@sss.pgh.pa.us writes:
  Tom> I believe that the spec requires that the direct arguments of
  Tom> an inverse or hypothetical-set aggregate must not contain any
  Tom> Vars of the current query level.

 False.

After examining this more closely, ISTM that the direct arguments are
supposed to be processed as if they weren't inside an aggregate call at all.
That being the case, isn't it flat out wrong for check_agg_arguments()
to be examining the agg_ordset list?  It should ignore those expressions
whilst determining the aggregate's semantic level.  As an example, an
upper-level Var in those expressions isn't grounds for deciding that the
aggregate isn't of the current query level.

regards, tom lane




Re: [HACKERS] Index overhead cost reporting

2013-12-07 Thread Tom Lane
Thom Brown t...@linux.com writes:
 Perhaps I may have misunderstood, or not explained my question with
 enough detail, but you appear to be including activity that would, in
 all likelihood, occur after the DML has returned confirmation to the
 user that it has completed; in particular, VACUUM.  What I was
 thinking of was an execution plan node to communicate the index
 modifications that are carried out prior to confirmation of the query
 completing.  The bgwriter, WAL writer et al. that spring into action
 as a result of the index being updated wouldn't, as I see it, be
 included.

 So in essence, I'd only be looking for a breakdown of anything that
 adds to the duration of the DML statement.  However, it sounds like
 even that isn't straightforward from what you've written.

I think that would be reasonably straightforward, though perhaps too
expensive depending on the speed of clock reading.  My larger point was
that I don't think that that alone is a fair measure of the cost of
maintaining an index, which is what you had claimed to be interested in.

regards, tom lane




Re: [HACKERS] Index overhead cost reporting

2013-12-07 Thread Thom Brown
On 7 December 2013 20:44, Tom Lane t...@sss.pgh.pa.us wrote:
 Thom Brown t...@linux.com writes:
 So in essence, I'd only be looking for a breakdown of anything that
 adds to the duration of the DML statement.  However, it sounds like
 even that isn't straightforward from what you've written.

 I think that would be reasonably straightforward, though perhaps too
 expensive depending on the speed of clock reading.  My larger point was
 that I don't think that that alone is a fair measure of the cost of
 maintaining an index, which is what you had claimed to be interested in.

Point taken.  Yes, I don't believe I was that clear.  This was a "how
much of my DML statement time was spent updating indexes?" rather than
a "give me stats on index maintenance times for these indexes".

It's a shame that it comes at a non-trivial price.

Thom




Re: [HACKERS] ANALYZE sampling is too good

2013-12-07 Thread Greg Stark
On Sat, Dec 7, 2013 at 8:25 PM, Josh Berkus j...@agliodbs.com wrote:
 The only approach which makes sense is to base it on a % of the table.
 In fact, pretty much every paper which has examined statistics
 estimation for database tables has determined that any estimate based on
 a less-than-5% sample is going to be wildly inaccurate.  Not that 5%
 samples are 100% accurate, but at least they fit the 80/20 rule.

This is nonsense. The math behind the deductions the planner makes is
well understood and we know how to estimate the precision based on the
sample size. There are some questions that really need a proportional
sample such as n_distinct (and as Robert points out the MCV list) but
the main motivation of the sample size is the histograms and those are
used to answer questions which very clearly do not need a proportional
sample. The statistics is very clear there. Using a proportional
sample for the histograms would just be silly. It would be
substituting a gut feeling for real statistics.

The problems with using a proportional sample for things like
n_distinct or the MCV list is that you're talking about sorting or
hashing an unboundedly large set of values and storing an unboundedly
large array in the statistics table. I don't think that would be
tenable without dramatically changing the way we process and store that
data to be more scalable. Using a lossy counting algorithm and
something more scalable than toasted arrays would be prerequisites I
think.

And as Robert mentions even if we solved those problems it wouldn't
help n_distinct. You really need to read all the values to deal with
n_distinct.

 This is the reason why implementing block-based sampling is critical;
 using our current "take one row out of every page" method, sampling 5%
 of the table means scanning the whole thing in most tables.  We also
 need to decouple the number of MCVs we keep from the sample size.
 Certainly our existing sampling algo seems designed to maximize IO for
 the sample size.

This would be helpful if you could specify what you mean by
"block-based sampling". The reason these previous pleas didn't go
anywhere is not because we can't do math. It's because of the lack of
specifics here. What we do *today* is called block-based sampling by
the literature.

What I'm saying is we need to change the block-based sampling that
we do to extract more rows per block. We currently extract the
correct number of rows to get a strong guarantee of uniformity. If
we could get a weaker guarantee of being close enough to uniform
samples for the deductions we want to make and get many more rows per
block then we could read a lot fewer blocks.

Or to put it another way people could increase the statistics target
dramatically and still be reading the same number of blocks as today.
In an ideal world perhaps we could have people reading 1% of the
blocks they read today and get statistics targets 10x better than
today.

-- 
greg




Re: [HACKERS] WITHIN GROUP patch

2013-12-07 Thread Andrew Gierth
 "Tom" == Tom Lane t...@sss.pgh.pa.us writes:

  Tom> After examining this more closely, ISTM that the direct
  Tom> arguments are supposed to be processed as if they weren't inside
  Tom> an aggregate call at all.  That being the case, isn't it flat
  Tom> out wrong for check_agg_arguments() to be examining the
  Tom> agg_ordset list?  It should ignore those expressions whilst
  Tom> determining the aggregate's semantic level.  As an example, an
  Tom> upper-level Var in those expressions isn't grounds for deciding
  Tom> that the aggregate isn't of the current query level.

Hmm... yes, you're probably right; but we'd still have to check somewhere
for improper nesting, no? since not even the direct args are allowed to
contain nested aggregate calls.

-- 
Andrew (irc:RhodiumToad)




Re: [HACKERS] plpgsql_check_function - rebase for 9.3

2013-12-07 Thread Pavel Stehule
Hello

thank you for review


2013/12/7 Steve Singer st...@ssinger.info

 On 08/22/2013 02:02 AM, Pavel Stehule wrote:

 rebased

 Regards

 Pavel

  This patch again needs a rebase, it no longer applies cleanly.
 plpgsql_estate_setup now takes a shared estate parameter and the call in
 pl_check needs that.   I passed NULL in and things seem to work.



 I think this is another example where our processes aren't working as we'd
 like.

 The last time this patch got  a review was in March when Tom said
 http://www.postgresql.org/message-id/27661.1364267...@sss.pgh.pa.us

 Shortly after Pavel responded with a new patch that changes the output
 format. http://www.postgresql.org/message-id/
 CAFj8pRAqpAEWqL465xkTmwe-DaZ8+7N-oEQ=uv_jjop21fj...@mail.gmail.com

 I don't think much progress has been made on this patch in the 9 months since.

 In Tom's review the issue of code duplication and asks:

 "do we want to accept that this is the implementation pathway we're going
 to settle for, or are we going to hold out for a better one? I'd be the
 first in line to hold out if I had a clue how to move towards a better
 implementation, but I have none."


Yes, I am still waiting to either a) have some better idea, or b) have
significantly more free time for a larger plpgsql refactoring. Neither has
happened :(


 This question is still outstanding.

 Here are my comments on this version of the patch:



 This version of the patch gives output as a table with the structure:

  functionid | lineno | statement | sqlstate | message | detail | hint | level | position | query | context
 ------------+--------+-----------+----------+---------+--------+------+-------+----------+-------+---------

 I like this much better than the previous format.

 I think the patch needs more documentation.

 The extent of the documentation is

   <title>Checking of embedded SQL</title>
  +<para>
  + The SQL statements inside <application>PL/pgSQL</application> functions are
  + checked by validator for semantic errors. These errors
  + can be found by <function>plpgsql_check_function</function>:



 I don't think that this adequately describes the function.  It isn't
 clear to me what we mean by 'embedded SQL' and 'semantic errors'.


 I think this would read better as

    <sect2 id="plpgsql-check-function">
     <title>Checking SQL Inside Functions</title>
     <para>
   The SQL statements inside of <application>PL/pgSQL</application> functions can be
   checked by a validation function for semantic errors. You can check
   <application>PL/pgSQL</application> functions by passing the name of the
   function to <function>plpgsql_check_function</function>.
     </para>
     <para>
     The <function>plpgsql_check_function</function> will check the SQL
     inside of a <application>PL/pgSQL</application> function for certain
     types of errors and warnings.
     </para>

 but that needs to be expanded on.

 I was expecting something like:

 create function x() returns void as $$ declare a int4; begin a='f'; end $$
 language plpgsql;

 to return an error but it doesn't


This is expected. PL/pgSQL uses late casting, so a := '10'; is
acceptable. At this moment the plpgsql check doesn't evaluate constants and
doesn't try to cast them to the target type, though such a check could be
implemented someday. Right now we have no simple way to identify a constant
expression in plpgsql, so any constant is just an expression from plpgsql's
perspective.
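
Roughly, with the same hypothetical function, the difference looks like this:
the bad constant is only rejected when the function actually runs, so the
static check cannot see it:

create or replace function x() returns void as $$
declare a int4;
begin
  a := 'f';   -- accepted by plpgsql_check_function: the cast happens at run time
end $$ language plpgsql;

select x();   -- fails only here, with something like:
              -- ERROR: invalid input syntax for integer: "f"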



 but

 create temp table x(a int4);
 create or replace function x() returns void as $$ declare a int4; begin
 insert into x values('h'); end $$ language plpgsql;


This is expected too. SQL doesn't use late casting: casting is controlled by
the available casts, and constants are evaluated early to allow possible
optimization.



 does give an error when I pass it to the validator.   Is this the intended
 behavior of the patch? If so we need to explain why the first function
 doesn't generate an error.


 I feel we need to document what each of the columns in the output means.
  What is the difference between statement, query and context?

 Some random comments on the messages you return:

 Line 816:

  if (is_expression && tupdesc->natts != 1)
      ereport(ERROR,
              (errcode(ERRCODE_SYNTAX_ERROR),
               errmsg("query \"%s\" returned %d columns",
                      query->query,
                      tupdesc->natts)));

  Should this be "query \"%s\" returned %d columns but only 1 is allowed"
  or something similar?

  Line 837

 else
 /* XXX: should be a warning? */
 ereport(ERROR,
 (errmsg(cannot to identify real
 type for record type variable)));

 In what situation does this come  up?  Should it be a warning or error?
  Do we mean the real type for record variable ?


The most difficult part of the plpgsql check function is the transformation of
dynamic types to a known row 

Re: [HACKERS] WITHIN GROUP patch

2013-12-07 Thread Tom Lane
Andrew Gierth and...@tao11.riddles.org.uk writes:
 Tom == Tom Lane t...@sss.pgh.pa.us writes:
  Tom After examining this more closely, ISTM that the direct
  Tom arguments are supposed to be processed as if they weren't inside
  Tom an aggregate call at all.  That being the case, isn't it flat
  Tom out wrong for check_agg_arguments() to be examining the
  Tom agg_ordset list?  It should ignore those expressions whilst
  Tom determining the aggregate's semantic level.  As an example, an
  Tom upper-level Var in those expressions isn't grounds for deciding
  Tom that the aggregate isn't of the current query level.

 Hmm... yes, you're probably right; but we'd still have to check somewhere
 for improper nesting, no? since not even the direct args are allowed to
 contain nested aggregate calls.

Yeah, I had come to that same conclusion while making the changes;
actually, check_agg_arguments needs to look for aggs but not vars there.

In principle we could probably support aggs there if we really wanted to;
but it would result in evaluation-order dependencies among the aggs of a
single query level, which doesn't seem like something we want to take on
for a feature that's not required by spec.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Templates S03E11

2013-12-07 Thread Jeff Davis
On Sat, 2013-12-07 at 12:27 -0500, Stephen Frost wrote:
 * Jeff Davis (pg...@j-davis.com) wrote:
  The behavior of an extension should not depend on how it was installed.
  
  The kind of extension being described by Stephen will:
  
  * Not be updatable by doing ALTER EXTENSION foo UPDATE TO '2.0'
 
  ... [ reason ] ...

  * Dump out objects that wouldn't be dumped if they had installed the
  extension using the filesystem
  ... [ reason ] ...

I understand there are reasons, but I'm having a hard time getting past
the idea that "I have extension foo v1.2" now needs to be qualified with
"installed using SQL" or "installed using the filesystem" to know what
you actually have and how it will behave.


Stepping back, maybe we need to do some more research on existing
SQL-only extensions. We might be over-thinking this. How many extensions
are really just a collection of functions on existing types? If you
define a new data type, almost all of the functions seem to revolve
around C code -- not just to define the basic data type, but also the
GiST support routines, which then mean you're implementing operators in
C too, etc.

Perhaps we should first focus on making SQL-only extensions more useful?

Regards,
Jeff Davis




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-07 Thread Mark Kirkwood

On 08/12/13 09:25, Josh Berkus wrote:

On 12/07/2013 11:46 AM, Robert Haas wrote:

Maybe there's some highly-principled statistical approach which could
be taken here, and if so that's fine, but I suspect not.  So what I
think we should do is auto-tune the statistics target based on the
table size.  If, say, we think that the generally useful range for the
statistics target is something like 10 to 400, then let's come up with
a formula based on table size that outputs 10 for small tables, 400
for really big tables, and intermediate values for tables in the
middle.


The only approach which makes sense is to base it on a % of the table.
In fact, pretty much every paper which has examined statistics
estimation for database tables has determined that any estimate based on
a less-than-5% sample is going to be wildly inaccurate.  Not that 5%
samples are 100% accurate, but at least they fit the 80/20 rule.

This is the reason why implementing block-based sampling is critical;
using our current "take one row out of every page" method, sampling 5%
of the table means scanning the whole thing in most tables.  We also
need to decouple the number of MCVs we keep from the sample size.
Certainly our existing sampling algo seems designed to maximize IO for
the sample size.

There's other qualitative improvements we could make, which Nathan Boley
has spoken on.   For example, our stats code has no way to recognize a
normal or exponential distribution -- it assumes that all columns are
randomly distributed.  If we could recognize common distribution
patterns, then not only could we have better query estimates, those
would require keeping *fewer* stats, since all you need for a normal
distribution are the end points and the variance.



From src/backend/commands/analyze.c

* As of May 2004 we use a new two-stage method:  Stage one selects up
 * to targrows random blocks (or all blocks, if there aren't so many).
 * Stage two scans these blocks and uses the Vitter algorithm to create
 * a random sample of targrows rows (or less, if there are less in the
 * sample of blocks).  The two stages are executed simultaneously: each
 * block is processed as soon as stage one returns its number and while
 * the rows are read stage two controls which ones are to be inserted
 * into the sample.

I don't think we always read every block (been a while since I looked at 
this code, so I'll recheck). From what I understand this stuff is based 
on reasonable research (Vitter algorithm). Not to say it 
couldn't/shouldn't be looked at again to improve it - but it is not just 
dreamed up with no valid research!
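
For anyone unfamiliar with the technique that comment refers to, here is a
small self-contained C sketch of plain reservoir sampling (Algorithm R),
offered only as an illustration of the idea behind stage two; the real code
uses Vitter's faster Algorithm Z, which skips over rows rather than visiting
each one:

#include <stdio.h>
#include <stdlib.h>

/*
 * Plain reservoir sampling (Algorithm R): once i items have been seen,
 * each of them sits in the k-slot reservoir with probability k/i.
 * (rand() % n has a slight modulo bias; ignored here for brevity.)
 */
static void
reservoir_sample(const int *stream, int n, int *sample, int k)
{
	int			i;

	for (i = 0; i < n; i++)
	{
		if (i < k)
			sample[i] = stream[i];	/* fill the reservoir first */
		else
		{
			int			j = rand() % (i + 1);	/* pick a slot in [0, i] */

			if (j < k)
				sample[j] = stream[i];	/* replace with probability k/(i+1) */
		}
	}
}

int
main(void)
{
	int			stream[100];
	int			sample[10];
	int			i;

	for (i = 0; i < 100; i++)
		stream[i] = i;
	reservoir_sample(stream, 100, sample, 10);
	for (i = 0; i < 10; i++)
		printf("%d ", sample[i]);
	printf("\n");
	return 0;
}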


Cheers

Mark


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WITHIN GROUP patch

2013-12-07 Thread Andrew Gierth
 Tom == Tom Lane t...@sss.pgh.pa.us writes:

  Hmm... yes, you're probably right; but we'd still have to check
  somewhere for improper nesting, no? since not even the direct args
  are allowed to contain nested aggregate calls.

 Tom Yeah, I had come to that same conclusion while making the
 Tom changes; actually, check_agg_arguments needs to look for aggs
 Tom but not vars there.

There's also the question of ungrouped vars, maybe. Consider these two
queries:

select array(select a+sum(x) from (values (0.3),(0.7)) v(a) group by a)
  from generate_series(1,5) g(x);

select array(select percentile_disc(a) within group (order by x)
   from (values (0.3),(0.7)) v(a) group by a)
  from generate_series(1,5) g(x);

In both cases the aggregation query is the outer one; but while the first
can return a value, I think the second one has to fail (at least I can't
see any reasonable way of executing it).

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements: calls under-estimation propagation

2013-12-07 Thread Peter Eisentraut
On Sun, 2013-12-08 at 02:08 +0900, Fujii Masao wrote:
  Attached revision displays signed 64-bit integers instead.
 
 Thanks! Looks good to me. Committed!

32-bit buildfarm members are having problems with this patch.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stat_statements: calls under-estimation propagation

2013-12-07 Thread Peter Geoghegan
On Sat, Dec 7, 2013 at 3:50 PM, Peter Eisentraut pete...@gmx.net wrote:
 32-bit buildfarm members are having problems with this patch.

This should fix that problem. Thanks.


-- 
Peter Geoghegan
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
new file mode 100644
index 4e262b4..9f3e376
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*** pg_stat_statements(PG_FUNCTION_ARGS)
*** 1160,1165 
--- 1160,1166 
  		bool		nulls[PG_STAT_STATEMENTS_COLS];
  		int			i = 0;
  		Counters	tmp;
+ 		int64		queryid = entry->key.queryid;
  
  		memset(values, 0, sizeof(values));
  		memset(nulls, 0, sizeof(nulls));
*** pg_stat_statements(PG_FUNCTION_ARGS)
*** 1172,1178 
  			char	   *qstr;
  
  			if (detected_version >= PGSS_V1_2)
! values[i++] = Int64GetDatumFast((int64) entry->key.queryid);
  
  			qstr = (char *)
  pg_do_encoding_conversion((unsigned char *) entry->query,
--- 1173,1179 
  			char	   *qstr;
  
  			if (detected_version >= PGSS_V1_2)
! values[i++] = Int64GetDatumFast(queryid);
  
  			qstr = (char *)
  pg_do_encoding_conversion((unsigned char *) entry->query,
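
For context, the 32-bit failure is most likely a compile error: on platforms
without USE_FLOAT8_BYVAL, Int64GetDatumFast() takes the address of its
argument, and a cast expression is not an lvalue, so copying the value into a
local int64 first (as the patch does) makes it addressable. A standalone
sketch of the underlying C rule, using a hypothetical ADDR_OF macro rather
than any PostgreSQL code:

#include <stdint.h>

/*
 * ADDR_OF stands in for a macro that, like Int64GetDatumFast() on
 * platforms without pass-by-value int64, takes the address of its
 * argument and therefore needs an lvalue.
 */
#define ADDR_OF(x) ((void *) &(x))

int
main(void)
{
	long long	raw = 42;
	int64_t		copy;
	void	   *p;

	/* p = ADDR_OF((int64_t) raw);   would not compile: a cast yields an rvalue */

	copy = (int64_t) raw;		/* copy into an addressable local, as the patch does */
	p = ADDR_OF(copy);

	return p == (void *) &copy ? 0 : 1;
}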

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extension Templates S03E11

2013-12-07 Thread Stephen Frost
* Jeff Davis (pg...@j-davis.com) wrote:
 I understand there are reasons, but I'm having a hard time getting past
 the idea that "I have extension foo v1.2" now needs to be qualified with
 "installed using SQL" or "installed using the filesystem" to know what
 you actually have and how it will behave.

I can certainly understand that.

 Stepping back, maybe we need to do some more research on existing
 SQL-only extensions. We might be over-thinking this. How many extensions
 are really just a collection of functions on existing types? If you
 define a new data type, almost all of the functions seem to revolve
 around C code -- not just to define the basic data type, but also the
 GiST support routines, which then mean you're implementing operators in
 C too, etc.

Fair point- I'll try and find time to review the 100+ extensions on PGXN
and classify them (unless someone else would like to or happens to
already know..?).  That said, there are certainly cases where people find
the existing extension system stinks for their SQL-only code and
therefore don't use it- which is exactly the case I'm in @work.  We have
a ton of pl/pgsql code (along with schema definitions and the like), our
own build system which builds both 'full/new' databases based on a
certain version *and* will verify that newly built == current version
plus a hand-written update script (of course, we have to write that
update script) with versioned releases, etc.

Point simply being that, were extensions more generally useful, we might
see more of them and we need to consider more than just what's in PGXN,
and therefore already built as an extension, today.

 Perhaps we should first focus on making SQL-only extensions more useful?

I'm not following what you're suggesting here..?  In general, I agree;
do you have specific ideas about how to do that?  Ones which don't
involve a superuser or modifying the filesystem under PG, and which
works with replication and backups..?

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ANALYZE sampling is too good

2013-12-07 Thread Mark Kirkwood

On 08/12/13 12:27, Mark Kirkwood wrote:

On 08/12/13 09:25, Josh Berkus wrote:

On 12/07/2013 11:46 AM, Robert Haas wrote:

Maybe there's some highly-principled statistical approach which could
be taken here, and if so that's fine, but I suspect not.  So what I
think we should do is auto-tune the statistics target based on the
table size.  If, say, we think that the generally useful range for the
statistics target is something like 10 to 400, then let's come up with
a formula based on table size that outputs 10 for small tables, 400
for really big tables, and intermediate values for tables in the
middle.


The only approach which makes sense is to base it on a % of the table.
In fact, pretty much every paper which has examined statistics
estimation for database tables has determined that any estimate based on
a less-than-5% sample is going to be wildly inaccurate.  Not that 5%
samples are 100% accurate, but at least they fit the 80/20 rule.

This is the reason why implementing block-based sampling is critical;
using our current "take one row out of every page" method, sampling 5%
of the table means scanning the whole thing in most tables.  We also
need to decouple the number of MCVs we keep from the sample size.
Certainly our existing sampling algo seems designed to maximize IO for
the sample size.

There's other qualitative improvements we could make, which Nathan Boley
has spoken on.   For example, our stats code has no way to recognize a
normal or exponential distribution -- it assumes that all columns are
randomly distributed.  If we could recognize common distribution
patterns, then not only could we have better query estimates, those
would require keeping *fewer* stats, since all you need for a normal
distribution are the end points and the variance.



 From src/backend/commands/analyze.c

* As of May 2004 we use a new two-stage method:  Stage one selects up
  * to targrows random blocks (or all blocks, if there aren't so many).
  * Stage two scans these blocks and uses the Vitter algorithm to create
  * a random sample of targrows rows (or less, if there are less in the
  * sample of blocks).  The two stages are executed simultaneously: each
  * block is processed as soon as stage one returns its number and while
  * the rows are read stage two controls which ones are to be inserted
  * into the sample.

I don't think we always read every block (been a while since I looked at
this code, so I'll recheck). From what I understand this stuff is based
on reasonable research (Vitter algorithm). Not to say it
couldn't/shouldn't be looked at again to improve it - but it is not just
dreamed up with no valid research!



Since I volunteered to recheck :-)

from analyze.c again:


/*
 * The following choice of minrows is based on the paper
 * Random sampling for histogram construction: how much is enough?
 * by Surajit Chaudhuri, Rajeev Motwani and Vivek Narasayya, in
 * Proceedings of ACM SIGMOD International Conference on Management
 * of Data, 1998, Pages 436-447.  Their Corollary 1 to Theorem 5
 * says that for table size n, histogram size k, maximum relative
 * error in bin size f, and error probability gamma, the minimum
 * random sample size is
 *  r = 4 * k * ln(2*n/gamma) / f^2
 * Taking f = 0.5, gamma = 0.01, n = 10^6 rows, we obtain
 *  r = 305.82 * k
 * Note that because of the log function, the dependence on n is
 * quite weak; even at n = 10^12, a 300*k sample gives = 0.66
 * bin size error with probability 0.99.  So there's no real need to
 * scale for n, which is a good thing because we don't necessarily
 * know it at this point.
 *
 */
stats->minrows = 300 * attr->attstattarget;
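
As a quick sanity check of the arithmetic, here is a tiny standalone C
program (not PostgreSQL code) that simply evaluates the bound
r = 4 * k * ln(2*n/gamma) / f^2 with the parameters quoted in that comment;
it prints roughly 305.82 * k, which is where the 300 * attstattarget rule of
thumb comes from:

#include <math.h>
#include <stdio.h>

/*
 * Evaluate the Chaudhuri/Motwani/Narasayya sample-size bound with the
 * parameters quoted in the analyze.c comment above.
 */
int
main(void)
{
	double		k = 100.0;		/* histogram size, i.e. statistics target */
	double		n = 1e6;		/* assumed table size in rows */
	double		gamma = 0.01;	/* error probability */
	double		f = 0.5;		/* maximum relative error in bin size */
	double		r = 4.0 * k * log(2.0 * n / gamma) / (f * f);

	printf("minimum sample size r = %.2f (%.2f * k)\n", r, r / k);
	return 0;
}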


So for typical settings (default_statistics_target = 100), we try to get 
30000 rows. This means we will sample about 30000 blocks.


Indeed quickly checking with a scale 100 pgbench db and a simple patch 
to make the block sampler say how many blocks it reads (note 
pgbench_accounts has 163935 blocks):


bench=# ANALYZE pgbench_branches;
NOTICE:  acquire sample will need 30000 blocks
NOTICE:  sampled 1 blocks
ANALYZE
Time: 16.935 ms

bench=# ANALYZE pgbench_accounts;
NOTICE:  acquire sample will need 30000 blocks
NOTICE:  sampled 30000 blocks
ANALYZE
Time: 10059.446 ms
bench=# \q


Regards

Mark


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add %z support to elog/ereport?

2013-12-07 Thread Andres Freund
On 2013-12-06 09:54:59 -0500, Peter Eisentraut wrote:
 On 11/11/13, 12:01 PM, Tom Lane wrote:
  I do recall Peter saying that the infrastructure knows how to
  verify conversion specs in translated strings, so it would have to be
  aware of 'z' flags for this to work.
 
 It just checks that the same conversion placeholders appear in the
 translation.  It doesn't interpret the actual placeholders.  I don't
 think this will be a problem.

Ok, so here's a patch.

Patch 01 introduces infrastructure, and elog.c implementation.
Patch 02 converts some elog/ereport() callers to using the z modifier,
  some were wrong at least on 64 bit windows.

In patch 01, I've modified configure to not define [U]INT64_FORMAT
directly, but rather just define INT64_LENGTH_MODIFIER as
appropriate. The formats are now defined in c.h.
INT64_LENGTH_MODIFIER is defined without quotes - I am not sure whether
that was the right choice, it requires using CppAsString2() in some
places. Maybe I should just have defined it with quotes instead. Happy
to rejigger it if somebody thinks the other way would be better.

Note that I have decided to only support the z modifier properly if no
further modifiers (precision, width) are present. That seems sufficient
for now, given the usecases, and more complete support seems to be
potentially expensive without much use.
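
For anyone less used to it, the z length modifier is standard C99 printf
behaviour; a small standalone illustration (plain printf, not elog/ereport)
of what the patch lets elog callers write:

#include <stdio.h>
#include <stddef.h>

/*
 * The z length modifier tells printf-family functions that the argument
 * has the width of size_t, so one format string is correct on both 32-bit
 * and 64-bit platforms.  Patch 01 teaches elog.c to accept the same
 * notation; this example sticks to the standard library.
 */
int
main(void)
{
	size_t		nbytes = (size_t) 1 << 20;

	/* with the modifier: no cast, no platform-specific format */
	printf("allocated %zu bytes\n", nbytes);

	/* the old style: cast to a known-width type and use its format */
	printf("allocated %lu bytes\n", (unsigned long) nbytes);

	return 0;
}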

I wonder if we should also introduce SIZE_T_FORMAT and SSIZE_T_FORMAT,
there's some places that have #ifdef _WIN64 guards and similar that look
like they could be simplified.

Btw, walsender.c/walreceiver.c upcast an int into an unsigned long for
printing? That's a bit strange.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From b31f7c5fc2b2d4451c650a54bb02e656b8fd3bbf Mon Sep 17 00:00:00 2001
From: Andres Freund and...@anarazel.de
Date: Sat, 7 Dec 2013 19:23:15 +0100
Subject: [PATCH 1/2] Add support for printing Size arguments to
 elog()/ereport() using %zu.

Similar to the way %m is special-case supported in elog.c routines, add
support for the z length modifier by replacing it by the platform's modifier
for the pointer size.

To do that, refactor the configure checks that define [U]INT64_FORMAT, to
instead define the used length modifier as INT64_LENGTH_MODIFIER. Currently we
don't support flag, width, precision modifiers in addition to z, but that's
fine for the current callsites.
---
 config/c-library.m4| 36 ++---
 configure  | 51 +-
 configure.in   | 28 +--
 src/backend/utils/error/elog.c | 16 +
 src/include/c.h| 10 ++---
 src/include/pg_config.h.in |  7 ++
 src/include/pg_config.h.win32  |  8 ++-
 7 files changed, 77 insertions(+), 79 deletions(-)

diff --git a/config/c-library.m4 b/config/c-library.m4
index 1e3997b..b836c2a 100644
--- a/config/c-library.m4
+++ b/config/c-library.m4
@@ -221,22 +221,22 @@ HAVE_POSIX_SIGNALS=$pgac_cv_func_posix_signals
 AC_SUBST(HAVE_POSIX_SIGNALS)])# PGAC_FUNC_POSIX_SIGNALS
 
 
-# PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT
+# PGAC_FUNC_SNPRINTF_LONG_LONG_LENGTH_MODIFIER
 # ---
-# Determine which format snprintf uses for long long int.  We handle
-# %lld, %qd, %I64d.  The result is in shell variable
-# LONG_LONG_INT_FORMAT.
+# Determine which format snprintf uses for long long integers.  We
+# handle %ll, %q, %I64.  The detected length modifer is stored in
+# the shell variable LONG_LONG_LENGTH_MODIFIER.
 #
-# MinGW uses '%I64d', though gcc throws an warning with -Wall,
-# while '%lld' doesn't generate a warning, but doesn't work.
+# MinGW uses '%I64', though gcc throws an warning with -Wall,
+# while '%ll' doesn't generate a warning, but doesn't work.
 #
-AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_INT_FORMAT],
-[AC_MSG_CHECKING([snprintf format for long long int])
-AC_CACHE_VAL(pgac_cv_snprintf_long_long_int_format,
-[for pgac_format in '%lld' '%qd' '%I64d'; do
+AC_DEFUN([PGAC_FUNC_SNPRINTF_LONG_LONG_LENGTH_MODIFIER],
+[AC_MSG_CHECKING([snprintf length modifier for long long])
+AC_CACHE_VAL(pgac_cv_snprintf_long_long_length_modifier,
+[for pgac_format in 'll' 'q' 'I64'; do
 AC_TRY_RUN([#include <stdio.h>
 typedef long long int ac_int64;
-#define INT64_FORMAT "$pgac_format"
+#define INT64_FORMAT "%${pgac_format}d"
 
 ac_int64 a = 2001;
 ac_int64 b = 4005;
@@ -258,19 +258,19 @@ int does_int64_snprintf_work()
 main() {
   exit(! does_int64_snprintf_work());
 }],
-[pgac_cv_snprintf_long_long_int_format=$pgac_format; break],
+[pgac_cv_snprintf_long_long_length_modifier=$pgac_format; break],
 [],
-[pgac_cv_snprintf_long_long_int_format=cross; break])
+[pgac_cv_snprintf_long_long_length_modifier=cross; break])
 done])dnl AC_CACHE_VAL
 
-LONG_LONG_INT_FORMAT=''
+LONG_LONG_LENGTH_MODIFIER=''
 
-case $pgac_cv_snprintf_long_long_int_format in

Re: [HACKERS] dblink performance regression

2013-12-07 Thread Joe Conway
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 12/05/2013 07:05 PM, Joe Conway wrote:
 On 12/05/2013 06:53 PM, Tom Lane wrote:
 I seem to remember that at some point we realized that the
 encoding ID assignments are part of libpq's ABI and so can't
 practically be changed ever, so the above may be moot.  Even so,
 I think it's a bad idea to be depending on pg_encoding_to_char()
 here, given the ambiguity in what it references.  It would be
 unsurprising to get build-time or run-time failures on pickier
 platforms, as a consequence of that ambiguity.  So I'd still
 recommend comparing integer IDs as above, rather than this.
 
 Great feedback as always -- thanks! Will make that change.

Committed to all active branches.

Joe

- -- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJSo8ZZAAoJEDfy90M199hlCz4P/R9ngR28e3nPNOxpkKO8R2oE
t32gPhVP2fLhTstumolJHvkAQh8d+7em8aDT8lwGQ6a1DNGtwlJ2cXR64yjI9SXg
Vp4tJyuliZau7kitiDDtGXqbEyM+cCWJ8wFnckToERJtW2uXcC81kqGoac7lz1e9
o34UKizzD8PLA+N6TCG6GsOx8/W2kKEgru8up9jbUzuJDEmzUPkSn8rA2jHVl7oX
LKF0hPcFuNJept9A/iNZmBfPgXUrHL/4s9qJ0UQdzzcJDHZ3kalTgRKs9y035WXl
XQ/YbYry8/Qw5QgWz9g50/FDhK7jAP/ZBGU99lBGO9e4hxV2lTeeiz177hYzgckF
vqWNS/dKN3wtdtXmEwhwmyLodpJKJhLFWOsgfpRPPZZu5Ji+gwPYl3MQHY4THqMp
8kjzK/BL94C15NSCO5E5FBM3CGruYWPVzNZqS4PT+VqCupZ2+qxySgL/0riMbx+J
GmBF/C9EGQzrRq7idz8O7/7Frb9TwjBgj+I0wns5NKF1vJpJ2fbzafmvpLkgJ/oI
bGhJt43D645BgnAKrInIq6mMbsYAr10pK6v03gJhiJUZREeu9wX5pVIzFV1R1PgC
ZS7evjftSP5MwdGs8geEVcQKbx42jW+kQos/HKH6saLKKf5s+cdXNiXaCOj/8cI7
1TZWt31Z9PqS7FuSDkXR
=AWVh
-END PGP SIGNATURE-


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dblink performance regression

2013-12-07 Thread Fabrízio de Royes Mello
On Sat, Dec 7, 2013 at 11:07 PM, Joe Conway m...@joeconway.com wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 12/05/2013 07:05 PM, Joe Conway wrote:
  On 12/05/2013 06:53 PM, Tom Lane wrote:
  I seem to remember that at some point we realized that the
  encoding ID assignments are part of libpq's ABI and so can't
  practically be changed ever, so the above may be moot.  Even so,
  I think it's a bad idea to be depending on pg_encoding_to_char()
  here, given the ambiguity in what it references.  It would be
  unsurprising to get build-time or run-time failures on pickier
  platforms, as a consequence of that ambiguity.  So I'd still
  recommend comparing integer IDs as above, rather than this.
 
  Great feedback as always -- thanks! Will make that change.

 Committed to all active branches.


IMHO is more elegant create a procedure to encapsulate the code to avoid
redundancy.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] dblink performance regression

2013-12-07 Thread Michael Paquier
On Sun, Dec 8, 2013 at 10:16 AM, Fabrízio de Royes Mello
fabriziome...@gmail.com wrote:



 On Sat, Dec 7, 2013 at 11:07 PM, Joe Conway m...@joeconway.com wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 12/05/2013 07:05 PM, Joe Conway wrote:
  On 12/05/2013 06:53 PM, Tom Lane wrote:
  I seem to remember that at some point we realized that the
  encoding ID assignments are part of libpq's ABI and so can't
  practically be changed ever, so the above may be moot.  Even so,
  I think it's a bad idea to be depending on pg_encoding_to_char()
  here, given the ambiguity in what it references.  It would be
  unsurprising to get build-time or run-time failures on pickier
  platforms, as a consequence of that ambiguity.  So I'd still
  recommend comparing integer IDs as above, rather than this.
 
  Great feedback as always -- thanks! Will make that change.

 Committed to all active branches.


 IMHO is more elegant create a procedure to encapsulate the code to avoid
 redundancy.
Yep, perhaps something like PQsetClientEncodingIfDifferent or similar
would make sense.
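
For concreteness, a minimal sketch of what such a helper could look like;
this is hypothetical and not part of libpq, and it leans only on the existing
PQclientEncoding() and PQsetClientEncoding() entry points:

#include "libpq-fe.h"

/*
 * Hypothetical helper, sketched for discussion only; it is not part of
 * libpq.  The caller passes both the encoding ID and its name so that the
 * sketch relies solely on existing libpq calls.  Returns 0 on success
 * (including the no-op case) and -1 on error, mirroring
 * PQsetClientEncoding().
 */
int
PQsetClientEncodingIfDifferent(PGconn *conn, int encoding,
							   const char *encoding_name)
{
	if (PQclientEncoding(conn) == encoding)
		return 0;				/* already matches: skip the round trip */

	return PQsetClientEncoding(conn, encoding_name);
}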
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dblink performance regression

2013-12-07 Thread Josh Berkus
All,

I tested out Joe's original patch, and it does eliminate the 8%
performance regression.

Will try the new one.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dblink performance regression

2013-12-07 Thread Fabrízio de Royes Mello
On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier michael.paqu...@gmail.com
wrote:
 
  IMHO is more elegant create a procedure to encapsulate the code to avoid
  redundancy.
 Yep, perhaps something like PQsetClientEncodingIfDifferent or similar
 would make sense.


Well I think at this first moment we can just create a procedure inside the
dblink contrib and not touch in libpq.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] dblink performance regression

2013-12-07 Thread Joe Conway
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 12/07/2013 05:41 PM, Fabrízio de Royes Mello wrote:
 
 On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier 
 michael.paqu...@gmail.com mailto:michael.paqu...@gmail.com
 wrote:
 
 IMHO is more elegant create a procedure to encapsulate the code
 to avoid redundancy.
 Yep, perhaps something like PQsetClientEncodingIfDifferent or
 similar would make sense.
 
 Well I think at this first moment we can just create a procedure
 inside the dblink contrib and not touch in libpq.

Maybe a libpq function could be done for 9.4, but not for back branches.

I don't think it makes sense to create a new function in dblink either
- -- we're only talking about two lines of added redundancy which is
less lines of code than a new function would add. But if we create
PQsetClientEncodingIfDifferent() (or whatever) we can remove those
extra lines in 9.4 ;-)

Joe

- -- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.12 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJSo9BnAAoJEDfy90M199hluqYP/RyoKh9nvC49H3xDl4wBLh4n
0OQ5nKJk8RkQ5d0MLbgn9t1xiQ+RltMcHQQEDoPlrn388DNTqOP31TqCHtI11S5l
ZOjZw6eKvcp0KEzk723kZCq9d2N1uRb95K2z5jFUXbeZ2pO6yj8ohdnh8J9i1VQE
iI5F76yeUJCO8YRmHMJs39Fy3Qtq0dsXesPYBRuEJqHy7cGh9hpYHPuqHFyW19Kg
0q1nPMa7TYpKRECa1wi+Gt2BJqd70AdnOZZipqqCR2bMJqmcpBy3jo94vedaarAz
Wtunn4mk0/2LCNsAjgAdA33FYNfRpgL2c99IQ1DK5hwW9mdH2WH6G+8Eaf5zGhps
ZWXJRQSYjfUCmMOoGudEKNX3H3prrNpqltQuC978i4ddKK/78yX1wJD10I8qVNHy
MRlSoTFfomVYPW0054Jk6LR1f/RKGD4yuiIPDv8dI/gWINT1HveBGkvSf9wnY578
wjh2iLJ790o4CNK/334ooggPnlbBS4yQ+e+xsDcaJ2pc1cWJr/gCUf3cliUtv+rI
MnFvsbq4vEjhBE3Tr6LYPwivCzKpiEz2L0/oO8sShHhm/dfr9UGPUyeDO43phP+m
NFQXoh6oCuleBXk/N874yAp9EDXtu3g9E1PUMMsbplXjiH6mLp8OWmRbvQ9Qw3zu
BFtOonWViuPz5ILObc3i
=o0XG
-END PGP SIGNATURE-


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dblink performance regression

2013-12-07 Thread Fabrízio de Royes Mello
On Sat, Dec 7, 2013 at 11:41 PM, Fabrízio de Royes Mello 
fabriziome...@gmail.com wrote:


 On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier 
michael.paqu...@gmail.com wrote:
  
   IMHO is more elegant create a procedure to encapsulate the code to
avoid
   redundancy.
  Yep, perhaps something like PQsetClientEncodingIfDifferent or similar
  would make sense.
 

 Well I think at this first moment we can just create a procedure inside
the dblink contrib and not touch in libpq.


Something like the attached.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
diff --git a/contrib/dblink/dblink.c b/contrib/dblink/dblink.c
index a91a547..1feee22 100644
--- a/contrib/dblink/dblink.c
+++ b/contrib/dblink/dblink.c
@@ -112,6 +112,7 @@ static Relation get_rel_from_relname(text *relname_text, LOCKMODE lockmode, AclM
 static char *generate_relation_name(Relation rel);
 static void dblink_connstr_check(const char *connstr);
 static void dblink_security_check(PGconn *conn, remoteConn *rconn);
+static void dblink_set_client_encoding(PGconn *conn);
 static void dblink_res_error(const char *conname, PGresult *res, const char *dblink_context_msg, bool fail);
 static char *get_connect_string(const char *servername);
 static char *escape_param_str(const char *from);
@@ -209,8 +210,7 @@ typedef struct remoteConnHashEnt
 			 errdetail_internal(%s, msg))); \
 } \
 dblink_security_check(conn, rconn); \
-if (PQclientEncoding(conn) != GetDatabaseEncoding()) \
-	PQsetClientEncoding(conn, GetDatabaseEncodingName()); \
+dblink_set_client_encoding(conn); \
 freeconn = true; \
 			} \
 	} while (0)
@@ -290,8 +290,7 @@ dblink_connect(PG_FUNCTION_ARGS)
 	dblink_security_check(conn, rconn);
 
 	/* attempt to set client encoding to match server encoding, if needed */
-	if (PQclientEncoding(conn) != GetDatabaseEncoding())
-		PQsetClientEncoding(conn, GetDatabaseEncodingName());
+	dblink_set_client_encoding(conn);
 
 	if (connname)
 	{
@@ -2622,6 +2621,13 @@ dblink_security_check(PGconn *conn, remoteConn *rconn)
 	}
 }
 
+static void
+dblink_set_client_encoding(PGconn *conn)
+{
+	if (PQclientEncoding(conn) != GetDatabaseEncoding())
+		PQsetClientEncoding(conn, GetDatabaseEncodingName());
+}
+
 /*
  * For non-superusers, insist that the connstr specify a password.	This
  * prevents a password from being picked up from .pgpass, a service file,

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] dblink performance regression

2013-12-07 Thread Michael Paquier

 On 2013/12/08, at 10:50, Joe Conway m...@joeconway.com wrote:
 
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 On 12/07/2013 05:41 PM, Fabrízio de Royes Mello wrote:
 
 On Sat, Dec 7, 2013 at 11:20 PM, Michael Paquier 
 michael.paqu...@gmail.com mailto:michael.paqu...@gmail.com
 wrote:
 
 IMHO is more elegant create a procedure to encapsulate the code
 to avoid redundancy.
 Yep, perhaps something like PQsetClientEncodingIfDifferent or
 similar would make sense.
 
 Well I think at this first moment we can just create a procedure
 inside the dblink contrib and not touch in libpq.
 
 Maybe a libpq function could be done for 9.4, but not for back branches.
Agreed as this would change the spec of libpq between minor releases. Sorry for 
not being clear myself.

  -- we're only talking about two lines of added redundancy which is
 less lines of code than a new function would add. But if we create
 PQsetClientEncodingIfDifferent() (or whatever) we can remove those
 extra lines in 9.4 ;-)



--
Michael
(Sent from my mobile phone)



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ANALYZE sampling is too good

2013-12-07 Thread Gavin Flower

On 08/12/13 10:27, Greg Stark wrote:

On Sat, Dec 7, 2013 at 8:25 PM, Josh Berkus j...@agliodbs.com wrote:

The only approach which makes sense is to base it on a % of the table.
In fact, pretty much every paper which has examined statistics
estimation for database tables has determined that any estimate based on
a less-than-5% sample is going to be wildly inaccurate.  Not that 5%
samples are 100% accurate, but at least they fit the 80/20 rule.

This is nonsense. The math behind the deductions the planner makes is
well understood and we know how to estimate the precision based on the
sample size. There are some questions that really need a proportional
sample such as n_distinct (and as Robert points out the MCV list) but
the main motivation of the sample size is the histograms and those are
used to answer questions which very clearly do not need a proportional
sample. The statistics is very clear there. Using a proportional
sample for the histograms would just be silly. It would be
substituting a gut feeling for real statistics.

The problems with using a proportional sample for things like
n_distinct or the MCV list is that you're talking about sorting or
hashing an unboundedly large set of values and storing an unboundedly
large array in the statistics table. I don't think that would be
tenable without dramatically changing the way process and store that
data to be more scalable. Using a lossy counting algorithm and
something more scalable than toasted arrays would be prerequisites I
think.

And as Robert mentions even if we solved those problems it wouldn't
help n_distinct. You really need to read all the values to deal with
n_distinct.


This is the reason why implementing block-based sampling is critical;
using our current "take one row out of every page" method, sampling 5%
of the table means scanning the whole thing in most tables.  We also
need to decouple the number of MCVs we keep from the sample size.
Certainly our existing sampling algo seems designed to maximize IO for
the sample size.

This would be helpful if you could specify what you mean by
block-based sampling. The reason these previous pleas didn't go
anywhere is not because we can't do math. It's because of the lack of
specifics here. What we do *today* is called block-based sampling by
the literature.

What I'm saying is we need to change the block-based sampling that
we do to extract more rows per block. We currently extract the
correct number of rows to get a strong guarantee of uniformity. If
we could get a weaker guarantee of being close enough to uniform
samples for the deductions we want to make and get many more rows per
block then we could read a lot fewer blocks.

Or to put it another way people could increase the statistics target
dramatically and still be reading the same number of blocks as today.
In an ideal world perhaps we could have people reading 1% of the
blocks they read today and get statistics targets 10x better than
today.


I suspect that the number of rows to sample should be something of the form:


      { N                      where N <= a * 100
n =   { (a * N) / 100          where a * 100 < N <= 10^8
      { (a * SQRT(N)) / 100    where N > 10^8

n: the number of rows to sample
N: total number of rows
a: configurable coefficient  (1 <= a <= 100)


For very large numbers of rows, like 10^8, taking a constant fraction 
will over sample for most purposes, hence the square root.


  a  N      n     sampled%
  1  10^4   100   1%
100  10^4   10^4  100%
  1  10^8   10^4  0.01%
100  10^8   10^6  1%
  1  10^10  10^5  0.001%
100  10^10  10^7  0.01%


For very large values of N, you can still get a representative 
sample with a very small fraction of rows sampled.


Hmm... Yes, I can see some issues, once I tried it out with test values!  
However, it might inspire a more useful and practical approach!



Cheers,
Gavin


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] pg_ctl always uses the same event source

2013-12-07 Thread MauMau

From: MauMau maumau...@gmail.com

I've removed a limitation regarding event log on Windows with the attached
 patch.  I hesitate to admit this is a bug fix and want to regard this as an
 improvement, but maybe it's a bug fix from users' perspective.  Actually, I
 received problem reports from some users.


 I've done a small correction.  I removed the redundant default value from
 pg_ctl.c.



Regards
MauMau



pg_ctl_eventsrc_v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [bug fix] pg_ctl fails with config-only directory

2013-12-07 Thread Amit Kapila
On Sat, Dec 7, 2013 at 6:02 PM, MauMau maumau...@gmail.com wrote:
 From: Amit Kapila amit.kapil...@gmail.com

 Today, I had again gone through all the discussion that happened at
 that time related to this problem
 and I found that later in discussion it was discussed something on
 lines as your Approach-2,
 please see the link
 http://www.postgresql.org/message-id/503a879c.6070...@dunslane.net


 Thank you again.  I'm a bit relieved to see similar discussion.



 Wouldn't the other way to resolve this problem be to reinvoke pg_ctl in
 non-restricted mode for the case in question?


 It's effectively the same as not checking admin privileges.  And direct
 invocation of postgres -C/--describe-config still cannot be helped.

   Yeah, that is the only point of reinvoking pg_ctl in non-restricted
mode: it should only be allowed through
   pg_ctl, but not directly with postgres -C.
   In any case, I think we now have the link for the discussion and the
patch for the other approach as well, so if the consensus
   is to modify the postgres code such that we don't need to check for
admin privs for the -C/--describe-config option, then the
   patch proposed by you can be taken; otherwise the other patch can be used.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extra functionality to createuser

2013-12-07 Thread Amit Kapila
On Fri, Dec 6, 2013 at 10:31 AM, Peter Eisentraut pete...@gmx.net wrote:
 On Wed, 2013-11-20 at 11:23 -0500, Christopher Browne wrote:
 I note that similar (with not quite identical behaviour) issues apply
 to the user name.  Perhaps the
 resolution to this is to leave quoting issues to the administrator.
 That simplifies the problem away.

 How about only one role name per -g option, but allowing the -g option
 to be repeated?

   I think that might simplify the problem and patch, but do you think
it is okay to have an inconsistency
   in the usage of options between the CREATE USER statement and this utility?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] About shared cache invalidation mechanism

2013-12-07 Thread huaicheng Li
I know that all cache invalidation messages are stored in the
shmInvalidationBuffer ring buffer and that they should be consumed by all
other backends to keep their own caches fresh. Since there may be some
stragglers which process the SI messages quite slowly, we use the *catchup*
interrupt (signal) to accelerate their consumption of shared invalidation
messages. Here come my questions:
(1). When the current number of messages in the shmInvalidationBuffer
exceeds a threshold, the queue needs to be cleaned up using SICleanupQueue.
After that, if the number still exceeds MAXNUMMESSAGES/2, the threshold is
recalculated by the following formula:
Threshold = (numMsgs/CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM
(2). For those slow backends, if their *nextMsgNum* value is less than
*lowbound*, they will be reset, where *lowbound* is calculated as
lowbound = maxMsgNum - MAXNUMMESSAGES + minFree,
and if their *nextMsgNum* value is less than *minsig*, they will get a
catchup signal to speed them up, where *minsig* is calculated as
minsig = maxMsgNum - MAXNUMMESSAGES/2

Here, I want to ask why threshold, lowbound and minsig are calculated like
that. Were there any performance considerations behind the design of these
three formulas? I have searched through the old mailing list archives but
found nothing about them (these changes first appeared in PostgreSQL 8.4);
any help would be appreciated.
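
To make the three formulas concrete, here is a small standalone C sketch
that just plugs example numbers into them; MAXNUMMESSAGES and
CLEANUP_QUANTUM below use the values found in sinvaladt.c as of this writing
(4096 and 4096/32 = 128), while the other inputs are invented for the
example:

#include <stdio.h>

/* Constants as in sinvaladt.c; the inputs in main() are invented examples. */
#define MAXNUMMESSAGES	4096
#define CLEANUP_QUANTUM	128

int
main(void)
{
	int			numMsgs = 2500;		/* assumed current number of messages */
	int			maxMsgNum = 100000;	/* assumed head position of the queue */
	int			minFree = 64;		/* assumed number of free slots wanted */

	int			threshold = (numMsgs / CLEANUP_QUANTUM + 1) * CLEANUP_QUANTUM;
	int			lowbound = maxMsgNum - MAXNUMMESSAGES + minFree;
	int			minsig = maxMsgNum - MAXNUMMESSAGES / 2;

	printf("threshold = %d (numMsgs rounded up to the next CLEANUP_QUANTUM multiple)\n",
		   threshold);
	printf("lowbound  = %d (backends behind this are reset)\n", lowbound);
	printf("minsig    = %d (backends behind this get a catchup signal)\n", minsig);
	return 0;
}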