Re: [HACKERS] pgbench throttling latency limit

2014-10-13 Thread Gregory Smith

On 10/13/14, 1:54 PM, Heikki Linnakangas wrote:
Greg Smith, I'd still appreciate it if you could take a look at this, 
to check how this will work for pgbench-tools.


I'll do a QA pass on the committed version looking for issues, and 
update the toolchain I publish to be compatible with it along the way too.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal for 9.5: monitoring lock time for slow queries

2014-10-05 Thread Gregory Smith

On 8/13/14, 5:14 AM, MauMau wrote:
I'm interested in b, too.  I was thinking of proposing a performance 
diagnostics feature like Oracle's wait events (V$SYSTEM_EVENT and 
V$SESSION_EVENT).  So, if you do this, I'd like to contribute to the 
functional design, code and doc review, and testing.


I already wrote up a design like this once: 
http://www.postgresql.org/message-id/509300f7.5000...@2ndquadrant.com


The main problem when I tried to code it was figuring out how to store 
the data.  When I wrote that, I thought I could just stuff it into a 
shared memory block the way pg_stat_statements did.  That didn't really 
work out.  I think it's manageable now because the Dynamic Shared Memory 
management written since then has the right sort of shape to do the job.


This is one of those jobs where I think the coding itself is the hard 
part, not the design nor the review.  What I really want is something 
that dumps this data into memory, then a second process that persists to 
disk in batches.  I think that's the only way we'll get high performance 
on reads while still saving enough data to be useful after a crash.


We're working on getting a few things in this area fully funded to dig 
into them harder.  The idea of designing for high-speed audit logging into 
memory and then persisting to disk has a lot of overlap with this one, 
and that may get picked up as well.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixed xloginsert_locks for 9.4

2014-10-03 Thread Gregory Smith

On 10/3/14, 8:26 AM, Andres Freund wrote:

#define NUM_XLOGINSERT_LOCKS  1
tps = 52.711939 (including connections establishing)
#define NUM_XLOGINSERT_LOCKS  8
tps = 286.496054 (including connections establishing)
#define NUM_XLOGINSERT_LOCKS  16
tps = 346.113313 (including connections establishing)
#define NUM_XLOGINSERT_LOCKS  24
tps = 363.242111 (including connections establishing)


Just to clarify:  that 10% number I threw out was meant as a rough 
estimate for a system with the default configuration, which is all that 
I tested.  It seemed like people would likely need to tune all the usual 
things like checkpoint_segments, shared_buffers, etc. as well before 
seeing much better.  You did all that, and sure enough the gain went up; 
thanks for confirming my guess.
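
For anyone trying to reproduce this, the sort of non-default configuration 
being alluded to looks something like the below; the values are purely 
illustrative, not a recommendation:

ALTER SYSTEM SET shared_buffers = '8GB';        -- requires a server restart
ALTER SYSTEM SET checkpoint_segments = 64;      -- 9.4-era parameter; a reload is enough for this one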


I still don't think that means this needs a GUC for 9.4.  Look at that 
jump from 1 to 8.  The low-hanging fruit here hasn't just been knocked 
off.  It's been blended into a frozen drink, poured into a glass, and 
had a little paper umbrella put on top.  I think that's enough for 9.4.  
But, yes, let's see if we can add delivery to the side of the pool in 
the next version too.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixed xloginsert_locks for 9.4

2014-10-03 Thread Gregory Smith

On 10/3/14, 10:11 AM, Andres Freund wrote:
So 25% performance on a relatively small machine improvements aren't 
worth a GUC? That are likely to be larger on a bigger machine? I 
utterly fail to see why that's a service to our users.


I didn't say that.  I said I don't think they're worth a GUC today if it 
can be quietly and automatically slipped into the next release--and that 
seems quite feasible.  I have introduced GUCs into the system before that 
almost no one can tune properly.  Can't say I was pleased with how 
that played out.


Another thing I don't know yet, and this is going to take me a while to 
characterize, is how that 25% gain on a benchmark that is specifically 
designed to highlight the problem use case impacts the various mixes of 
more average cases I try as well.  Is it 0.1% for a typical pgbench 
workload?  Then that GUC isn't so exciting anymore either.


And there's one more big issue I'd prefer not to discover from 
real-world complaints:  is there any downside to making this number very 
large on a system where it shouldn't be?


The history of settings like this says that providing an exposed knob 
will result in some people tinkering with it and making the system 
slower.  The gain of going from 1 to 8 is so clear and simple that I'm 
not worried too hard about performance regressions like that.  But if we 
make it a GUC and it can be set to 100 to extract maximum performance on 
big machines...I'd take a bet that we'll find people setting it to 100 
saying more must be better where it isn't.  Every time someone wanders 
into pgsql-performance with a complaint, each one of these obscure GUCs 
they tweaked magnifies the troubleshooting mess a little.


I do not disagree with you fundamentally here: this *is* worth refining 
further, for sure, and the gains might be even greater if that keeps 
going.  My guess is just that the Postgres community would not get a net 
benefit chasing that as a GUC in 9.4, not by the time you try to account 
for all the future overhead and risk that adds to the release.  That was 
Heikki's gut feel on this when he yanked it out already; I was mainly 
trying to do sanity checking on that.  You've made a good case that 
wasn't the ideal answer.  Even with that new data, I still don't think 
it was an outright bad decision though--especially not in an October 
where there's no new version out yet.  This thread spun out of Open 
Items, and cutting complexity should be the preferred direction for 
everything left on there now.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixed xloginsert_locks for 9.4

2014-10-03 Thread Gregory Smith

On 10/3/14, 5:23 PM, Andres Freund wrote:
How are we ever going to be able to tune it further without feedback 
from actual production usage?


With improved targeted synthetic test cases that isolate the bottleneck 
to where it's really obvious, way more obvious than it will ever be in 
production?  I just watched you do a very successful round of chewing 
right through this problem that way.  And haven't the synthetic tests 
already been more useful to guide development than feedback from 
production for some time?


An alternate way to state one of your questions along this angle is:  how 
can we tell if the fixed limit of 8 is still a bottleneck on a 9.4 
server unless there's a GUC to increase past that?  In my first message 
here I was trying to highlight that we have little idea what that world 
looks like yet.  This change is already so beneficial and large, the 
hotspots on systems impacted by it are probably going to move to 
somewhere *very* different than earlier versions.


And when that happens, the remaining problem is typically not the thing you 
just made way, way faster.  If it turns out 
a large bottleneck in real-world 9.4 deployments is *still* 
xloginsert_locks, and there's no GUC for tuning it beyond 8, then the 
people in the support trenches are going to see removing that GUC as a 
terrible error.  That might happen.  I'm not trying to criticize your 
work here, but you haven't actually made the case yet that this is 
likely--you cheated a little with the I/O part to get around where the 
bottleneck shifts once you have a decent number of slots (8).  That 
tweak was part of why you got a larger gain than I did.


So that's a bad path everyone might see in the future, and if we end up 
there all of us in the support trenches will suffer for having done the 
wrong thing.  I get what you're saying there, and I'll owe you an 
apology if it plays out that way.  In every other potential future I can 
imagine, and in every future I consider probable, eliminating the GUC 
for now makes life easier.  Everyone has already talked a bunch about 
why extra GUCs have a support cost too.  It's one less thing to 
maintain, document, mess with, break in the field, talk about, and 
suffer from unintended regressions.


Do you want to put the GUC right back again in the active branch to keep 
progress on this easier to make, since it's really clear already there's 
still gain to be chased here?  I think you've already justified doing 
that.  I'm still running the commit with the GUC here.  I will join you 
in the "annoyed that it's gone" group the minute I do my next git pull.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time measurement format - more human readable

2014-10-01 Thread Gregory Smith

On 9/29/14, 1:08 AM, Andres Freund wrote:

On 2014-09-28 20:32:30 -0400, Gregory Smith wrote:

There are already a wide range of human readable time interval output
formats available in the database; see the list at 
http://www.postgresql.org/docs/current/static/datatype-datetime.html#INTERVAL-STYLE-OUTPUT-TABLE

He's talking about psql's \timing...


I got that.  My point was that even though psql's timing report is kind 
of a quick thing hacked in there, if it were revised I'd expect two 
things to happen eventually:


-Asking if any of the interval conversion code can be re-used for 
this purpose, rather than adding yet another standard that's custom to 
one code path.


-Asking if this should really just be treated like a full interval 
instead, and then overlapping with a significant amount of that baggage 
so that you have all the existing format choices.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] open items for 9.4

2014-09-30 Thread Gregory Smith

On 9/29/14, 2:30 PM, Andres Freund wrote:

 Can we explain those reasons in the form of documentation?
Yes. Try and benchmark it. It'll be hardware and workload dependent.


I missed this whole thing, and eventually I have to circle back to it.  
I could do it this week.  Could you (or someone else familiar with the 
useful benchmarks) give me more detail on how to replicate one case 
where things should improve?  I can do that, and I have a lab full of 
hardware if it's easier to spot on one type of server. That exercise 
will probably lead to a useful opinion on the feature in its final form, 
any associated GUC, tunables, and necessary level of associated 
documentation in a day or two.


This is a pretty low bar, right?  Examples and documentation sufficient 
for just me to catch up is not asking for very much.  If it isn't 
possible to do that in a very short period of time, it would not bode 
well for broader consumption.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Time measurement format - more human readable

2014-09-28 Thread Gregory Smith

On 9/28/14, 7:49 AM, Bogdan Pilch wrote:

I have created a small patch to postgres source (in particular the
psql part of it) that modifies the way time spent executing the SQL
commands is printed out.

The idea is to have a human readable time printed


There are already a wide range of human readable time interval output 
formats available in the database; see the list at 
http://www.postgresql.org/docs/current/static/datatype-datetime.html#INTERVAL-STYLE-OUTPUT-TABLE
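
For anyone who hasn't looked at that table recently, here's a quick 
illustration of how much variety is already there; intervalstyle is the real 
GUC controlling it:

SET intervalstyle = 'postgres';
SELECT '90.532 seconds'::interval;   -- 00:01:30.532
SET intervalstyle = 'postgres_verbose';
SELECT '90.532 seconds'::interval;   -- @ 1 min 30.532 secs
SET intervalstyle = 'iso_8601';
SELECT '90.532 seconds'::interval;   -- PT1M30.532S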


If none of those are acceptable to you, it would be difficult but not 
impossible to justify something new.  I could see tweaking one of those 
into a slightly updated new style aimed at this specific job, especially 
since it doesn't have to consider things like negative intervals.


There's value in printing time measurements using one of these interval 
styles sometimes, instead of the relatively raw values given right now.  
It would need to be an option though, and one that lets the user choose 
any of the supported interval formats. I personally would 
prefer to never see the existing format the number is reported in go 
away--too much software already expects it to be there, in that format.  
But adding this human readable version after that, when the user asks 
specifically for it, could be an acceptable addition.


So there's a rough spec for the job you'd have to take on here.  I'd 
expect it to expand in scope almost immediately to also consider the 
output of similar time intervals from mechanisms like 
log_min_duration_statement, rather than just focusing on 
psql timing data.  There's a whole second round of almost inevitable 
scope creep to working on this.


If you were hoping what you submitted might be considered directly, 
sorry; that's not going to happen.  Handling input and output of times 
and dates is a very deep topic, and small patches trying to adjust such 
behavior without grappling with the full complexity are normally 
rejected outright.  There are cases where the existing code just has 
simple hacks in there now that could easily be whacked around.  But once 
the topic of cleaning those up appears, swapping to an alternate simple 
hack is rarely how that goes.  It normally heads toward considering the 
full right thing to do to handle all cases usefully.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-27 Thread Gregory Smith

On 9/26/14, 3:22 PM, Tom Lane wrote:
We could alternatively try to split up these cases into multiple GUCs, 
which I guess is what you're imagining as a multi-year battle.


No, I was just pointing out that even the cleanest major refactoring 
possible here is going to result in "broken config file" complaints for 
years.  And no matter how well that goes, a second rewrite in the next 
major version, addressing whatever things nobody saw coming in the first 
redesign, is a very real possibility.  The minute the GUC box is opened 
that far, it will not close up again in a release, or likely even two, 
and the griping from the field is going to take 3+ years to settle.


I have no desire to waste time on partial measures either though.  I 
think I've been clear that's my theme on this--if we're gonna mess with 
this significantly, let's do it right and in a big way.  If there's a 
really trivial fix to apply, that's fine too.  Throw an error when the 
value is between the special one and the useful minimum, no other 
changes; that I could see slipping into a single release without much 
pain.  Might even be possible to write as a useful sub-step toward a 
bigger plan too.  Wouldn't bother breaking that out as a goal unless it 
just happened to fall out of the larger design though.


The rest of the rounding and error handling ideas wandering around seem 
in this uncomfortable middle ground to me.  They are neither large 
enough to feel like a major improvement happened, nor trivial enough to 
apply with minimal work/risk.  And this is not a bug that must be fixed 
ASAP--it's an unfriendly UI surprise.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Last Commitfest patches waiting review

2014-09-27 Thread Gregory Smith

On 9/27/14, 4:18 AM, Heikki Linnakangas wrote:

add latency limit to pgbench throttling (--rate)
---
Rukh: Are you planning to review the latest patch version?


Rukh is free to jump onto this ASAP, but if it's still there next week I 
already had my eye on picking that up and taking it out for a spin.  I 
already updated my pgbench-tools set to incorporate the rate limit for 
9.4, and I use that part all the time now.  I think I understand 
Fabien's entire game plan for this now, and I want the remainder of the 
set to land in 9.5.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-26 Thread Gregory Smith

On 9/26/14, 2:17 PM, Tom Lane wrote:
Well, ok, let's allow zero as a special case, but it has to be written 
as 0 not something else. If you try to set a positive value then we 
should act as though the min_val is 1 unit. So I'm coming around to 
the idea that throw an error if a nonzero input would round (or 
truncate) to zero is a reasonable solution. 


I expressed some distaste for throwing errors before, but I find this 
specific rule reasonable.  I'll even write the patch.  I owe the GUC 
system a re-match after my wal_buffers auto-scaling thing for 9.1 
rippled to cause extra work for you (maybe Robert too).


I just changed this submission from "Ready for Committer" to "Returned 
with Feedback", with the feedback being that we'd like to see special 
values rounded to 0 treated differently altogether first, before 
touching any rounding.  And that's a whole new submission for the next CF.


I think it'd be even more reasonable if we also fixed the rounding 
rule to be round to nearest, but the two changes can be considered 
independently. regards, tom lane 


Personally I'm still -1 on any rounding change, instead preferring to 
work toward eliminating these 0 and -1 special values altogether.  Outside 
of these special cases, I feel you are fundamentally right that if the 
rounding really matters, the unit is simply too large.  And I believe 
that is the case for the log rotation one right now.


I can see my idea for rescaling the rotation age parameter is an 
unpopular one, but I still like it.  That cleanly splits out into a 
third thing though, where it can live or die without being tied to the 
rest of these issues.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-26 Thread Gregory Smith

On 9/26/14, 2:34 PM, David Johnston wrote:


I don't get where we need to do anything else besides that...the 
whole "actual min values" comment is unclear to me.


If you look at pg_settings, there is a minimum value exposed there as 
min_val.  For some of these parameters, that number would normally be 
1.  But since we have decided that 0 is a special flag value, min_val is 
0 instead.


There are others where min_val is -1 for the same reason, where 
functionally the minimum is really 0.  Some of us would like to see 
min_val reflect the useful minimum, period, and move all these special 
case ones out of there.  That is a multi-year battle to engage in 
though, and there's little real value to the user community coming out 
of it relative to that work scope.
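
To make that concrete, here's one way to see the settings in question; this 
is just an illustrative query against the real pg_settings view, not a 
definitive list of the offenders:

SELECT name, setting, unit, min_val
FROM pg_settings
WHERE vartype = 'integer' AND min_val IN ('-1', '0')
ORDER BY min_val, name;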




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-26 Thread Gregory Smith

On 9/26/14, 2:50 PM, David G Johnston wrote:


Like I said I just want to fix the bug and call it a day :)


I'm afraid you've come to the wrong project and mailing list for *that*.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-09-26 Thread Gregory Smith

On 9/26/14, 2:38 PM, Gavin Flower wrote:
Curious: would it be both feasible and useful to have multiple workers 
process a 'large' table, without complicating things too much?  They 
could each start at a different position in the file.


Not really feasible without a major overhaul.  It might be mildly useful 
in one rare case.  Occasionally I'll find very hot single tables that 
vacuum is constantly processing, despite mostly living in RAM because 
the server has a lot of memory.  You can set vacuum_cost_page_hit=0 in 
order to get vacuum to chug through such a table as fast as possible.
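
A sketch of that kind of tuning; the table name is hypothetical, while 
vacuum_cost_page_hit and the per-table autovacuum storage parameter are 
both real settings:

SET vacuum_cost_page_hit = 0;      -- for a manual VACUUM in this session
VACUUM hot_table;
ALTER TABLE hot_table SET (autovacuum_vacuum_cost_delay = 0);  -- same idea for autovacuum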


However, the speed at which that happens will often then be limited by 
how fast a single core can read from memory, for things in 
shared_buffers.  That is limited by the speed of memory transfers from a 
single NUMA memory bank.  Which bank you get will vary depending on the 
core that owns that part of shared_buffers' memory, but it's only one at 
a time.


On large servers, that can be only a small fraction of the total memory 
bandwidth the server is able to reach.  I've attached a graph showing 
how this works on a system with many NUMA banks of RAM, and this is only 
a medium sized system.  This server can hit 40GB/s of memory transfers 
in total; no one process will ever see more than 8GB/s.


If we had more vacuum processes running against the same table, there 
would then be more situations where they were doing work against 
different NUMA memory banks at the same time, therefore making faster 
progress through the hits in shared_buffers possible. In the real world, 
this situation is rare enough compared to disk-bound vacuum work that I 
doubt it's worth getting excited over.  Systems with lots of RAM where 
performance is regularly dominated by one big ugly table are common 
though, so I wouldn't just rule the idea out as not useful either.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-25 Thread Gregory Smith

On 9/25/14, 1:41 AM, David Johnston wrote:
If the error message is written correctly most people upon seeing the 
error will simply fix their configuration and move on - regardless of 
whether they were proactive in doing so having read the release notes.


The important part to realize here is that most people will never see 
such an error message.  There is a person or process who breaks 
postgresql.conf, a process that asks for a configuration restart/reload 
(probably via pg_ctl), and then the postmaster process creates a 
server log entry that shows the error (maybe in pgstartup.log, maybe in 
pg_log, maybe in syslog, maybe in the Windows Event Log).


In practice, the top of that food chain never knows what's happening at 
the bottom unless something goes so seriously wrong the system is down, 
and they are forced to drill down into all of these log destinations.  
That's why a subset of us consider any error message based approaches to 
GUC guidance a complete waste of time.  I won't even address the rest of 
your comments; you're focusing on trivia around something that just 
fundamentally isn't useful at all.


My challenge to anyone who thinks error checking has value for this 
issue is to demonstrate how that would play out usefully on a mainstream 
Postgres system like RHEL/CentOS, Debian, or even Windows.  Put your 
bogus setting in the config file, activate that config file in a 
Postgres build that throws errors for the rounding cases people dislike, and show me how 
that mistake is made apparent to the user who made it.  I've done 
similar exercises myself, and my guess is that if the system is up at 
all, those error messages went by completely unnoticed.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] add modulo (%) operator to pgbench

2014-09-25 Thread Gregory Smith

On 9/25/14, 8:38 AM, Robert Haas wrote:
That's my whole reason for not wanting to adopt Fabien's approach in 
the first place: I was cool with exposing C's modulo operator, but any 
other modulo semantics seem like they should be built up from 
general-purpose primitives. 


Maybe; I don't quite understand his requirements well enough yet to know 
if that's possible, or if it's easier to give him a full special 
operator of his own.  But since what you did makes that easier too, it's 
forward progress regardless.


Anyway, I think the first thing is that somebody needs to spend some 
time testing, polishing, and documenting this patch, before we start 
adding to it. I'm hoping someone else will volunteer - other tasks 
beckon. 


I bounced it over to here for you, and I expect to help with those parts, 
presumably in addition to Fabien's help: 
https://commitfest.postgresql.org/action/patch_view?id=1581



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-25 Thread Gregory Smith

On 9/25/14, 2:02 PM, Peter Eisentraut wrote:
But having the same parameter setting mean different things in 
different versions is the path to complete madness. 


Could we go so far as to remove support for unitless time settings 
eventually?  The fact that people are setting raw numbers in the 
configuration file and have to know the unit to understand what they 
just did has never been something I like.
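
To make the contrast explicit, here's the sort of ambiguity I mean, using an 
arbitrary parameter as the example; postgresql.conf has exactly the same issue:

ALTER SYSTEM SET checkpoint_timeout = '10min';   -- readable without knowing the base unit
ALTER SYSTEM SET checkpoint_timeout = 600;       -- only meaningful if you know this one is in seconds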



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-24 Thread Gregory Smith

On 9/24/14, 6:45 PM, Peter Eisentraut wrote:

But then this proposal is just one of several others that break backward
compatibility, and does so in a more or less silent way.  Then we might
as well pick another approach that gets closer to the root of the problem.


I was responding to some desire to get a quick fix that cut off the 
worst of the problem, and the trade-offs of the other ideas bothered me 
even more.  Obvious breakage is annoying, but small and really subtle 
version to version incompatibilities drive me really crazy. A 60X scale 
change to log_rotation_age will be big enough that impacted users should 
definitely notice, while not being so big it's scary.  Rounding 
differences are small enough to miss.  And I do see whacking the sole 
GUC that's set in minutes, which I was surprised to find is a thing that 
existed at all, as a useful step forward.


I can't agree with Stephen's optimism that some wording cleanup is all 
that's needed here.  I predict it will be an annoying, multiple commit 
job just to get the docs right, describing both the subtle rounding 
change and how it will impact users.  (Cry an extra tear for the 
translators)


Meanwhile, I'd wager the entire work of my log_rotation_age unit 
change idea, from code to docs to release notes, will take less time 
than just getting the docs done on any rounding change.  It would probably 
cause less field breakage and confusion in the end too.


The bad case I threw out is a theorized one that shows why we can't go 
completely crazy with jerking units around, why 1000:1 adjustments are 
hard.  I'm not actually afraid of that example being in the wild in any 
significant quantity.  The resulting regression from a 60X scale change 
should not be so bad that people will suffer greatly from it either.  
It will probably be like the big 10:1 change in default_statistics_target:  
it seemed scary that some people might be nailed by it, but it didn't turn 
out to be such a big deal.


I don't see any agreement on the real root of the problem here yet.  That 
makes gauging whether any smaller change leads that way or not fuzzy.  I 
personally would be fine doing nothing right now, instead waiting until 
that's charted out--especially if the alternative is applying any of the 
rounding or error throwing ideas suggested so far.  I'd say that I hate 
to be that guy who tells everyone else they're wrong, but we all know I 
enjoy it.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] add modulo (%) operator to pgbench

2014-09-24 Thread Gregory Smith

On 9/24/14, 10:10 PM, Robert Haas wrote:
I think you're making a mountain out of a molehill. I implemented this 
today in about three hours.


I think you're greatly understating your own efficiency at 
shift/reducing parser mountains down to molehills.  Fabien even guessed 
the LOC size of the resulting patch with less than a 9% error.  That's 
some top notch software metrics and development work there boys; kudos 
all around.


Let's get this operator support whipped into shape, then we can add the 
2 to 3 versions of the modulo operator needed to make the major use 
cases work.  (There was never going to be just one hacked in with a 
quick patch that satisfied the multiple ways you can do this)


Then onto the last missing pieces of Fabien's abnormally distributed 
test workload vision.  He seems kind of stressed about the process 
lately; not sure what to say about it.  Yes, the PostgreSQL community is 
hostile to short, targeted feature improvements unless they come already 
fit into a large, standards compliant strategy for improving the 
database.  That doesn't work well when approached with "scratch an itch" 
style development.  There's nothing we can really do about it.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] RLS feature has been committed

2014-09-23 Thread Gregory Smith

On 9/23/14, 9:00 AM, Andres Freund wrote:
I also think committers need to be much more careful when committing 
patches which they (or their employer) appear to have a business 
interest in. Rushing ahead to commit the patch of somebody 'unrelated' 
leaves a completely different taste than committing your colleague's 
patch. *INDEPENDENT* of the actual reasons and the state of the patch.


I haven't been doing much personal development work around here lately, 
but I did more than a little of the planning on how to handle this as 
responsibly as I felt it deserved.  I think this is worth talking about 
a little bit because this whole topic, how to deal with a funded project 
where the committer is also a contributor, isn't something a lot of 
people have visibility into. (Not talking about you, Andres, you know 
the deal)


This commit didn't happen until I first succeeded in getting Stephen 
Frost fully funded as a community PostgreSQL contributor (for this job, 
as well as others with a broader community appeal).  Everyone attached 
to the budget side very explicitly understands that includes an 
open-ended responsibility to the community for addressing fall-out from 
RLS, going from unforeseen issues to revisiting the things left on the 
known open items list.  It's hard to reach perfect before commit; 
eventually you just have to shoot and see what happens.


Just to be really safe, I also kicked off training a whole new 
PostgreSQL contributor (Adam Brightwell) five months ago, so that by the 
time we got to here he's also making his own contributions to the 
security code of the database.  And that's included exercises tracking 
down bugs in the early RLS POC deployments already going on here at 
Crunchy, so that Stephen isn't the sole Postgres community expertise 
bottleneck on the code even at this company.  (I am NOT on the hook for 
fixing RLS bugs)


The decision on when to commit was strictly Stephen's.  During our last 
chat, we agreed this seemed like an ideal time of year in the 
development cycle for such a thing to land though.  9.5 is a fresh 
branch, and there is plenty of time to clean up and/or revert if that's 
the unfortunate end for anything going in today.  Plus the ongoing 
CommitFest schedule means everyone in the community already is arranging 
to provide review resources needed in the near future pipeline.


I can understand that Robert feels a procedural sting and/or slight due 
to the exact sequence of events here, and I'm staying out of that.  But 
in general, I don't know what we could have done much better to be a 
responsible contributor, to finally push this long in development 
feature over the finish line.  A description like rushing ahead would 
feel inappropriate to me for this one, given how much care went into 
timing even roughly when this particular commit should happen.  The time 
of year in particular was very specifically aimed at for minimal 
disruption, based on both major version release dates and the related 
community development cycle around them.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Scaling shared buffer eviction

2014-09-23 Thread Gregory Smith

On 9/23/14, 10:31 AM, Robert Haas wrote:

I suggest we count these things:

1. The number of buffers the reclaimer has put back on the free list.
2. The number of times a backend has run the clocksweep.
3. The number of buffers past which the reclaimer has advanced the 
clock sweep (i.e. the number of buffers it had to examine in order to 
reclaim the number counted by #1).
4. The number of buffers past which a backend has advanced the 
clocksweep (i.e. the number of buffers it had to examine in order to 
allocate the number of buffers count by #3).
5. The number of buffers allocated from the freelist which the backend 
did not use because they'd been touched (what you're calling 
buffers_touched_freelist).


All sound reasonable.  To avoid wasting time here, I think it's only 
worth doing all of these as DEBUG level messages for now.  Then only go 
through the overhead of exposing the ones that actually seem relevant.  
That's what I did for the 8.3 work, and most of that data at this level 
was barely relevant to anyone but me then or since. We don't want the 
system views to include so much irrelevant trivia that finding the 
important parts becomes overwhelming.
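
For calibration, this is roughly the level of buffer-related detail the 
system views expose today; the question is which of the new counters deserve 
to join them:

SELECT buffers_alloc, buffers_clean, maxwritten_clean, buffers_backend
FROM pg_stat_bgwriter;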


I'd like to see that level of instrumentation--just the debug level 
messages--used to quantify the benchmarks that people are running 
already, to prove they are testing what they think they are.  That would 
have caught the test error you already stumbled on for example.  Simple 
paranoia says there may be more issues like that hidden in here 
somewhere, and this set you've identified should find any more of them 
around.


If all that matches up so the numbers for the new counters seem sane, I 
think that's enough to commit the first basic improvement here.  Then 
make a second pass over exposing the useful internals that let everyone 
quantify either the improvement or things that may need to be tunable.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core function in Postgres

2014-09-23 Thread Gregory Smith

On 9/23/14, 8:02 PM, Michael Paquier wrote:
pgsql-hackers is as well a mailing list where people have technical 
discussions about features and patches, hence your question would be 
more adapted for pgsql-novice or pgsql-general.


Let's be fair and get the details perfect if we're going to advise 
people on list policy, and that advice isn't quite right if you compare it against 
http://www.postgresql.org/list/


The official description says to consider trying elsewhere first, and "If 
your question cannot be answered by people in the other lists, and it 
is likely that only a developer will know the answer, you may re-post 
your question in this list."


From that, it's completely reasonable that Mingzhe guessed "how main() 
is used in PostgreSQL" is something only a developer would know the 
answer to, and therefore that the question should go here instead of pgsql-general.


My feedback would be to point out that pgsql-general is also a list 
where many of the developers monitor traffic, and that's the right 
place to post first for this sort of question.  The mailing list page 
does not make that completely obvious though, so this is an 
understandable thing to be unsure about.  This is definitely not a 
pgsql-novice question.


The mailing list page could probably use an explicit clarification that 
pgsql-general *is* an appropriate venue for simpler questions that only 
a developer will know the answer to.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-23 Thread Gregory Smith

On 9/23/14, 1:21 AM, David Johnston wrote:
This patch should fix the round-to-zero issue.  If someone wants to 
get rid of zero as a special case let them supply a separate patch for 
that improvement.


I am going to wander into this fresh after just reading everything once 
(but with more than a little practice with real-world GUC mucking) and 
say that, no, it should not even do that.  The original complaint here 
is real and can be straightforward to fix, but I don't like any of the 
proposals so far.


My suggestion:  fix the one really bad one in 9.5 with an adjustment to 
the units of log_rotation_age.  That's a small commit that should credit 
Tomonari Katsumata for discovering the issue and even suggesting a fix 
(though not one we use), plus a release note; sample draft below.  Stop there.


Let someone with a broader objection take on the fact that zero (and -1) 
values have special properties, and that sucks, as a fully reasoned 
redesign.  I have a larger idea for minimizing rounding issues of 
timestamps in particular at the bottom of this too.  I wouldn't dare 
take on changes to rounding of integers without sorting out the special 
flag value issue first, because the variety of those in the database is 
large compared to the time ones.


== log_rotation_age ==

Back to where this started at 
http://www.postgresql.org/message-id/53992ff8.2060...@po.ntts.co.jp : 
log_rotation_age would be disabled if we set it to less than one minute, 
with this example of a surprising behavior:


log_rotation_age = 10s

postgres=# show log_rotation_age ;
 log_rotation_age
------------------
 0

That's bad and the GUC system is not being friendly; fully agreed. But 
this is just one where the resolution for the parameter is poor.  
log_rotation_age is the *only* GUC with a time interval that's set in 
minutes:


postgres=# SELECT name, unit FROM pg_settings WHERE unit = 'min';
       name       | unit
------------------+------
 log_rotation_age | min

For comparison, there are 10 GUCs set in seconds and 13 in ms in HEAD.

We could just change the units for log_rotation_age to seconds, then let 
the person who asks for a 10s rotation time suffer for that decision 
only with many log files.  The person who instead chooses 10ms may find 
log rotation disabled altogether because that rounds to zero, but now we 
are not talking about surprises on something that seems sane--that's a 
fully unreasonable user request.


Following that style of fix all the way through to the sort of release 
notes needed would give something like this:


log_rotation_age is now set in units of seconds instead of minutes. 
Earlier installations of PostgreSQL that set this value directly, to a 
value in minutes, should change that setting to use a time unit before 
migrating to this version.


And we could be done for now.

== Flag values and error handling ==

I would like to see using 0 and -1 as special values in GUCs overhauled 
one day, with full disregard for backward compatibility as a 
straightjacket on doing the right thing.  This entire class of behavior 
is annoying.  But I am skeptical anything less than such an overhaul 
will really be useful.  To me it's not worth adding new code, 
documentation, and associated release notes to guide migration all to 
try and describe new rounding trivia--which I don't see as better nor 
worse than the trade-offs of what happens today.


I can't take the idea of throwing errors for something this minor 
seriously at all.  There are a lot more precedents for spreading 
settings around in a few places now, from include_dir to 
postgresql.auto.conf.  There are so many ways to touch a config now and 
not notice the error message now, I really don't want to see any more 
situations appear where trying to change a setting will result in a 
broken file added into that mix.  None of this broken units due to 
rounding stuff that I've found, now that I went out of my way looking 
for some, even comes close to rising to that level of serious to me.  
Only this log rotation one is so badly out of line that it will act 
quite badly.


== Overhauling all time unit GUCs ==

Here's the complete list of potential ugly time settings to review in 
the future, if my suggestion of only fixing log_rotation_age were followed:


gsmith=# SELECT name, setting, unit, min_val FROM pg_settings where unit
in ('s','ms') and (min_val::integer) <= 0 order by unit, min_val, name;

             name             | setting | unit | min_val
------------------------------+---------+------+---------
 autovacuum_vacuum_cost_delay | 20      | ms   | -1
 log_autovacuum_min_duration  | -1      | ms   | -1
 log_min_duration_statement   | -1      | ms   | -1
 max_standby_archive_delay    | 3       | ms   | -1
 max_standby_streaming_delay  | 3       | ms   | -1
 lock_timeout                 | 0       | ms   | 0
 statement_timeout            | 0       | ms   | 0
 vacuum_cost_delay            | 0       | ms   | 0
 wal_receiver_timeout         | 6

Re: [HACKERS] proposal: rounding up time value less than its unit.

2014-09-23 Thread Gregory Smith

On 9/23/14, 10:52 PM, David Johnston wrote:

Given the following why not just pick ms for log_rotation_age now?


Right now there are people out there who have configurations that look 
like this:


log_rotation_age=60

In order to get hourly rotation.  These users will suffer some trauma 
should they upgrade to a version where this parameter now means a new 
log file pops every 60 seconds instead.  If they didn't catch the 
warning in the release notes and it happens, I'm pretty comfortable 
they'll survive though, just with a bunch of cursing until it's 
resolved.  I'd even wager a beer that more than 10% of PostgreSQL 
installs that get hit won't even notice.


I'd prefer not to make that experience into one where they get a log 
file every 60ms though.  That's seriously bad.  I'm not opposed to 
making major changes like that, I just like them to be as part of 
updates big enough that people are unlikely to miss that something is 
different.  Just doing this log_rotation_age thing is small enough that 
I expect people to miss the change in the release notes.  That limits 
how much I think we can change the magnitude of an easy to miss setting 
with a large unit adjustment.
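
Anyone unsure how their existing raw number will be read can at least check 
the base unit before upgrading; on a stock install today this reports 1440 
with a unit of min:

SELECT name, setting, unit FROM pg_settings WHERE name = 'log_rotation_age';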


Are there any special considerations for people using pg_dump vs. 
those using pg_upgrade?


I don't know that there's any code in pg_upgrade aiming at this sort of 
job.  I'd prefer not to add "find parameters that have changed in 
meaning and flag them" to pg_upgrade's job requirements.  It has enough 
problems to worry about, and we really don't do this very often.


If we are going to go that far why not simply modify the GUC input 
routine to require unit-values on these particular parameters?  Direct 
manipulation of pg_settings probably would need some attention but 
everything else could reject integers and non-unit-specifying text.


That would be fine by me as a long-term direction, but it's more of a 
two to three version release level of change.  To keep that from being 
terrible for users, we'd need to provide a transition release that 
helped people find the problem settings without units.  Only after that was 
in the wild for a while could we have some confidence that enough 
people had flushed the issue out to turn it into a hard requirement.


The project went through this exact sort of exercise with the 
standard_conforming_strings GUC being the way we flagged the bad 
constructs for people.  That was a much harder job because it touched 
everyone's application code too.  But it took many years to play out.  
I'd be shocked if we could herd our flock of old time users fast enough 
to remove unitless GUC values from PostgreSQL in less than 3 years worth 
of new releases.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Scaling shared buffer eviction

2014-09-23 Thread Gregory Smith

On 9/23/14, 7:13 PM, Robert Haas wrote:
I think we expose far too little information in our system views. Just 
to take one example, we expose no useful information about lwlock 
acquire or release, but a lot of real-world performance problems are 
caused by lwlock contention.
I sent over a proposal for what I was calling Performance Events about a 
year ago.  The idea was to provide a place to save data about lock 
contention, weird checkpoint sync events, that sort of thing.  Replacing 
log parsing to get at log_lock_waits data was my top priority.  Once 
that's there, lwlocks was an obvious next target.  Presumably we just 
needed collection to be low enough overhead, and then we can go down to 
whatever shorter locks we want; the lower the overhead, the faster the 
event we can measure.


Sometimes the database will never be able to instrument some of its 
fastest events without blowing away the event itself.  We'll still have 
perf / dtrace / systemtap / etc. for those jobs.  But those are not the 
problems of the average Postgres DBA's typical day.


The data people need to solve this sort of thing in production can't 
always show up in counters.  You'll get evidence the problem is there, 
but you need more details to actually find the culprit.  Some info about 
the type of lock, tables and processes involved, maybe the query that's 
running, that sort of thing.  You can kind of half-ass the job if you 
make per-table counters for everything, but we really need more, both to 
serve our users and to compare well against what other databases provide 
for tools.  That's why I was trying to get the infrastructure to capture 
all that lock detail, without going through the existing logging system 
first.


Actually building Performance Events fell apart on the storage side:  
figuring out where to put it all without waiting for a log file to hit 
disk.  I wanted in-memory storage so clients don't wait for anything, 
then a potentially lossy persistence writer.  I thought I could get away 
with a fixed size buffer like pg_stat_statements uses.  That was 
optimistic.  Trying to do better got me lost in memory management land 
without making much progress.


I think the work you've now done on dynamic shared memory gives the 
right shape of infrastructure that I could pull this off now.  I even 
have funding to work on it again, and it's actually the #2 thing I'd 
like to take on as I get energy for new feature development.  (#1 is the 
simple but time consuming job of adding block write counters, the lack 
of which is just killing me on some fast growing installs)


I have a lot of unread messages on this list to sort through right now.  
I know I saw someone try to revive the idea of saving new sorts of 
performance log data again recently; can't seem to find it again right 
now.  That didn't seem like it went any farther than thinking about the 
specifications though.  The last time, I jumped right over that and hit a 
wall with the one hard part of the implementation instead:  low overhead 
memory management for saving everything.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per table autovacuum vacuum cost limit behaviour strange

2014-09-21 Thread Gregory Smith

On 8/28/14, 12:18 PM, Robert Haas wrote:
At least in situations that I've encountered, it's typical to be able 
to determine the frequency with which a given table needs to be 
vacuumed to avoid runaway bloat, and from that you can work backwards 
to figure out how fast you must process it in MB/s, and from there you 
can work backwards to figure out what cost delay will achieve that 
effect. But if the system tinkers with the cost delay under the hood, 
then you're vacuuming at a different (slower) rate and, of course, the 
table bloats.
The last time I took a whack at this, I worked toward making all of the 
parameters operate in terms of target MB/s, for exactly this style of 
thinking and goal.  Those converted into the same old mechanism under 
the hood and I got the math right to give the same behavior for the 
simple cases, but that could have been simplified eventually.  I 
consider that line of thinking to be the only useful one here.
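
For reference, the arithmetic Robert describes works out like this with the 
stock cost parameters (vacuum_cost_limit=200, vacuum_cost_page_dirty=20, a 
20ms delay, and 8kB pages); a back-of-the-envelope sketch only:

SELECT round(((200.0 / 20) * (1000.0 / 20) * 8192 / (1024 * 1024))::numeric, 1)
       AS dirty_page_mb_per_s;   -- roughly 3.9 MB/s when every page must be dirtied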


The answer I like to these values that don't inherit as expected in the 
GUC tree is to nuke that style of interface altogether in favor of 
a simpler bandwidth-measured one, then perhaps add multiple QoS levels.  
Certainly no interest in treating the overly complicated innards of cost 
computation as a bug and fixing them with even more complicated behavior.


The part of this I was trying hard to find time to do myself by the next 
CF was a better bloat measure tool needed to actually see the problem 
better.  With that in hand, and some nasty test cases, I wanted to come 
back to simplified MB/s vacuum parameters with easier to understand 
sharing rules again.  If other people are hot to go on that topic, I 
don't care if I actually do the work; I just have a pretty clear view of 
what I think people want.


The only plausible use case for setting a per-table rate that I can 
see is when you actually want the system to use that exact rate for 
that particular table.

Yes; that's the main one, for these "must run on schedule or else" jobs.

On 8/29/14, 9:45 AM, Alvaro Herrera wrote:

Anyway it seems to me maybe there is room for a new table storage
parameter, say autovacuum_do_balance which means to participate in the
balancing program or not.


If that eliminates some of the hairy edge cases, sure.
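
In concrete terms, the per-table knobs that exist today plus the flag Alvaro 
is floating might look something like this; the table name is made up, and 
autovacuum_do_balance is only a proposal in this thread, not a parameter that 
exists:

ALTER TABLE big_hot_table SET (
    autovacuum_vacuum_cost_delay = 10,    -- existing storage parameter, in ms
    autovacuum_vacuum_cost_limit = 1000   -- existing storage parameter
    -- , autovacuum_do_balance = off      -- proposed above; does not exist yet
);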

A useful concept to consider is having a soft limit that most things work 
against, along with a total hard limit for the server.  When one of 
these tight schedule queries with !autovacuum_do_balance starts, they 
must run at their designed speed with no concern for anyone else.  Which 
means:


a) Their bandwidth gets pulled out of the regular, soft limit numbers 
until they're done.  Last time I had one of these jobs, once the big 
important boys were running, everyone else in the regular shared set 
was capped at vacuum_cost_limit=5 worth of work.  That was just enough to keep 
up with system catalog things, and over the course of many hours process 
small tables.


b) If you try to submit multiple locked rate jobs at once, and the total 
goes over the hard limit, they have to just be aborted.  If the rush of 
users comes back at 8AM, and you can clean the table up by then if you 
give it 10MB/s, what you cannot do is let some other user decrease your 
rate such that you're unfinished at 8AM.  Then you'll have aggressive AV 
competing against the user load you were trying to prepare for.  It's 
better to just throw a serious error that forces someone to look at the 
hard limit budget and adjust the schedule instead.  The systems with 
this sort of problem are getting cleaned up every single day, almost 
continuously; missing a day is not bad as long as it's noted and fixed 
again before the next cleanup window.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Scaling shared buffer eviction

2014-09-21 Thread Gregory Smith

On 9/16/14, 8:18 AM, Amit Kapila wrote:

I think the main reason for slight difference is that
when the size of shared buffers is almost same as data size, the number
of buffers it needs from clock sweep are very less, as an example in first
case (when size of shared buffers is 12286MB), it actually needs at most
256 additional buffers (2MB) via clock sweep, where as bgreclaimer
will put 2000 (high water mark) additional buffers (0.5% of shared buffers
is greater than 2000 ) in free list, so bgreclaimer does some extra work
when it is not required
This is exactly what I was warning about, as the sort of lesson learned 
from the last round of such tuning.  There are going to be spots where 
trying to tune the code to be aggressive on the hard cases will work 
great.  But you need to make that dynamic to some degree, such that the 
code doesn't waste a lot of time sweeping buffers when the demand for 
them is actually weak.  Otherwise you'll make all sorts of cases that 
look like this one slower.


We should be able to tell these apart if there's enough instrumentation 
and solid logic inside of the program itself though.  The 8.3 era BGW 
coped with a lot of these issues using a particular style of moving 
average with fast reaction time, plus instrumenting the buffer 
allocation rate as accurately as it could. So before getting into 
high/low water mark questions, are you comfortable that there's a clear, 
accurate number that measures the activity level that's important here?  
And have you considered ways it might be averaging over time or have a 
history that's analyzed?  The exact "fast approach / slow decay" weighted 
moving average of the 8.3 BGW, the thing that tried to smooth 
the erratic data set possible here, was a pretty critical part of 
getting itself auto-tuning to workload size.  It ended up being much 
more important than the work of setting the arbitrary watermark levels.




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Scaling shared buffer eviction

2014-09-12 Thread Gregory Smith

On 9/11/14, 7:01 AM, Andres Freund wrote:
I'm not convinced that these changes can be made without also changing 
the bgwriter logic. Have you measured whether there are differences in 
how effective the bgwriter is? Not that it's very effective right now :)


The current background writer tuning went out of its way to do nothing 
when it wasn't clear there was something that always worked.  What 
happened with all of the really clever schemes was that they worked on 
some workloads, and just trashed others.  Most of the gain from the 8.3 
rewrite came from looking at well theorized ideas for how to handle 
things like pre-emptive LRU scanning for writes, and just throwing them 
out altogether in favor of ignoring the problem.  The magic numbers left 
in or added to the code  were tuned to do very little work, 
intentionally.  If anything, since then the pressure to do nothing has 
gone up in the default install, because now people are very concerned 
about extra wakeups using power.


To find bad cases before, I was running about 30 different test 
combinations by the end; Heikki was running another set in the EDB lab, 
and I believe there was a lab at NTT running their own set too. What went in 
was the combination that didn't break any of them badly--not the one 
that performed best on the good workloads.


This looks like it's squashed one of the very fundamental buffer scaling 
issues though; well done Amit.  What I'd like to see is preserving the 
heart of that while touching as little as possible. When in doubt, do 
nothing; let the backends suck it up and do the work themselves.


I had to take a health break from community development for a while, and 
I'm hoping to jump back into review again for the rest of the current 
development cycle.  I'll go back to my notes and try to recreate the 
pathological cases that plagued both the 8.3 BGW rewrite and the aborted 
9.2 fsync spreading effort I did; get those running again and see how 
they do on this new approach.  I have a decent sized 24 core server that 
should be good enough for this job.  I'll see what I can do.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench throttling latency limit

2014-09-12 Thread Gregory Smith

On 9/10/14, 10:57 AM, Fabien COELHO wrote:
Indeed. I think that people do not like it to change. I remember that 
I suggested to change timestamps to .yy instead of the 
unreadable  yyy, and be told not to, because some people have 
tool which process the output so the format MUST NOT CHANGE. So my 
behavior is not to avoid touching anything in this area.


That somewhat hysterical version of events isn't what I said. Heikki has 
the right idea for backpatching, so let me expand on that rationale, 
with an eye toward whether 9.5 is the right time to deal with this.


Not all software out there will process epoch timestamps with 
milliseconds added as a fraction at the end.  Being able to read an 
epoch time in seconds as an integer is a well defined standard; the 
fraction part is not.


Here's an example of the problem, from a Mac OS X system:

$ date -j -f "%a %b %d %T %Z %Y" "`date`" +%s
1410544903
$ date -r 1410544903
Fri Sep 12 14:01:43 EDT 2014
$ date -r 1410544903.532
usage: date [-jnu] [-d dst] [-r seconds] [-t west] [-v[+|-]val[ymwdHMS]] ...
[-f fmt date | [[[mm]dd]HH]MM[[cc]yy][.ss]] [+format]

The current file format allows any random shell script to use a tool 
like cut to pull out the second resolution timestamp column as an epoch 
integer field, then pass it through even a utility as simple as date to 
reformat that.  And for a lot of people, second resolution is perfectly 
fine anyway.


The change you propose will make that job harder for some people, in 
order to make the job you're interested in easier.  I picked the 
simplest possible example, but there are more.  Whether epoch timestamps 
can have millisecond parts depends on your time library in Java; in 
Python some behavior depends on whether you have 2.6 or earlier; I don't 
think gnuplot handles millisecond ones at all yet; the list goes on and 
on.  Some people will just have to apply a second split to the timestamp 
strings pgbench outputs, at the period, and use the left side, where right 
now they can just split the whole thing on a space.
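
For what it's worth, handling either form in something like Python is 
only one extra step.  This is just an illustration of that split, not 
code from pgbench-tools:

# Accept either "1410544903" or "1410544903.532" as the timestamp field.
def epoch_seconds(field):
    # Older tools that only understand integer epoch seconds keep working
    # if you take the part left of the period.
    return int(field.split(".")[0])

def epoch_fractional(field):
    return float(field)      # for tools that do accept fractions

for ts in ("1410544903", "1410544903.532"):
    print(epoch_seconds(ts), epoch_fractional(ts))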


What you want to do is actually fine with me--and as far as I know, I'm 
the producer of the most popular pgbench latency parsing script 
around--but it will be a new sort of headache.  I just wanted the 
benefit to outweigh that.  Breaking the existing scripts and burning 
compatibility with simple utilities like date was not worth the tiny 
improvement you wanted in your personal workflow.  That's just not how 
we do things in PostgreSQL.


If there's a good case that the whole format needs to be changed anyway, 
like adding a new field, then we might as well switch to fractional 
epoch timestamps too now though.  When I added timestamps to the latency 
log in 8.3, parsers that handled milliseconds were even more rare.  
Today it's still inconsistent, but the workarounds are good enough to me 
now.  There's a lot more people using things like Python instead of bash 
pipelines here in 2014 too.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Support for N synchronous standby servers

2014-09-12 Thread Gregory Smith

On 9/12/14, 2:28 PM, Robert Haas wrote:
I hate to be the guy always suggesting a mini-language (cf. recent 
discussion of an expression syntax for pgbench), but we could do much 
more powerful and flexible things here if we had one. For example, 
suppose we let each element of synchronous_standby_names use the 
constructs (X,Y,Z,...)


While I have my old list history hat on this afternoon, when the 9.1 
deadline was approaching I said that some people were not going to be 
happy until "is it safe to commit?" calls an arbitrary function that is 
passed the names of all the active servers, and then they could plug 
whatever consensus rule they wanted into there.  And then I said that if 
we actually wanted to ship something, it should be some stupid simple 
thing like just putting a list of servers in synchronous_standby_names 
and proceeding if one is active.  One of those two ideas worked out...


Can you make a case for why it needs to be a mini-language instead of a 
function?


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-12 Thread Gregory Smith

On 6/11/14, 10:26 AM, Robert Haas wrote:
Now, as soon as we introduce the concept that selecting from a table 
might not really mean read from the table but read from the table 
after applying this owner-specified qual, we're opening up a whole 
new set of attack surfaces. Every pg_dump is an opportunity to hack 
somebody else's account, or at least audit their activity. 


I'm in full agreement we should clearly communicate the issues around 
pg_dump in particular, because they can't necessarily be eliminated 
altogether without some major work that's going to take a while to 
finish.  And if the work-around is some sort of GUC for killing RLS 
altogether, that's ugly but not unacceptable to me as a short-term fix.


One of the difficult design requests in my inbox right now asks how 
pg_dump might be changed both to reduce its overlap with superuser 
permissions and to allow auditing of its activity.  Those requests 
aren't going away; their incoming frequency is actually rising quite 
fast right now.  They're both things people expect from serious SQL 
oriented commercial database products, and I'd like to see PostgreSQL 
continue to displace those as we reach feature parity in those areas.


Any way you implement finer grained user permissions and auditing 
features will be considered a new attack vector when you use those 
features.  The way the proposed RLS feature inserts an arbitrary 
function for reads has a similar new attack vector when you use that 
feature.


I'm kind of surprised to see this turn into a hot button all of a 
sudden though, because my thought on all that so far has been a giant 
"so what?"  This is what PostgreSQL does.


You wanna write your own C code and then link the thing right into the 
server, so that bugs can expose data and crash the whole server?  Not 
only can you shoot yourself in the foot that way, we supply a sample gun 
and bullets in contrib.  How about writing arbitrary code in any one of 
a dozen server-side languages of wildly varying quality, then hooking 
that code so it runs as a trigger function whenever you change a row?  
PostgreSQL is *on it*; we love letting people write some random thing, 
and then running that random thing against your data as a side-effect of 
doing an operation.  And if you like that...just wait until you learn 
about this half-assed rules feature we have too!


And when the database breaks because the functions people inserted were 
garbage, that's their fault, not a cause for a CVE.  And when someone 
blindly installs adminpack because it sounded like a pgAdmin 
requirement, lets a monitoring system run as root so it can watch 
pg_stat_activity, and then discovers that pair of reasonable decisions 
suddenly means any fool with monitoring access can call 
pg_file_unlink...that's their fault too. These are powerful tools with 
serious implications, and they're expected to be used by equally serious 
users.


We as a development community do need to put a major amount of work into 
refactoring all of these security mechanisms.  There should be less of 
these embarrassing incidents where bad software design really forced the 
insecure thing to happen, which I'd argue is the case for that 
pg_stat_activity example.  And luckily so far development resources are 
appearing for organizations I know of working in that direction 
recently, as fast as the requirements are rising.  I think there's a 
good outcome at the end of that road.


But let's not act like RLS is a scary bogeyman because it introduces a 
new way to hack the server or get surprising side-effects.  That's 
expected and possibly unavoidable behavior in a feature like this, and 
there are much worse instances of arbitrary function risk throughout the 
core code already.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-04-10 Thread Gregory Smith

On 4/9/14 9:56 PM, Stephen Frost wrote:

As for docs and testing, those are things we would certainly be better
off with and may mean that this isn't able to make it into 9.4, which is
fair, but I wouldn't toss it out solely due to that.


We have a git repo with multiple worked out code examples, ones that 
have been in active testing for months now.  It's private just to reduce 
mistakes as things are cleared for public consumption, but I (and Mark 
and Jeff here) can pull anything we like from it to submit to hackers.  
There's also a test case set from Yeb Havinga he used for his review.


I expected to turn at least one of those into a Postgres regression 
test.  The whole thing squealed to a stop when the regression tests 
Craig was working on were raising multiple serious questions.  I didn't 
see much sense in bundling more boring, passing tests when the ones we 
already had seemed off--and no one was sure why.


Now that Tom has given some guidance on the row locking weirdness, maybe 
it's time for me to start updating those into make check form.  The 
documentation has been in a similar holding pattern.  I have lots of 
resources to help document what does and doesn't work here to the 
quality expected in the manual.  I just need a little more confidence 
about which feature set is commit worthy.  The documentation that makes 
sense is very different if only updatable security barrier views is 
committed.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pending 9.4 patches

2014-04-08 Thread Gregory Smith

On 4/7/14 2:59 AM, Craig Ringer wrote:

On 04/05/2014 03:57 AM, Andres Freund wrote:


c07) Updatable security barrier views.
  This needs a serious look by a committer.

I've been exercising it via row security and it's been looking pretty
solid. It isn't a huge or intrusive patch, and it's seen several rounds
of discussion during its development and refinement. (Thanks Dean).


Same here, nothing but good feedback from testing.  The updatable 
security barrier views patch has been sitting in Ready For Committer since 
late January.  Unfortunately, I just learned that some of the people who 
might commit in this area--Stephen Frost for example--thought there were 
still major outstanding issues with that part.


I (and I think Craig too) have been waiting for that to be picked up by a 
committer, Stephen was waiting for me or Craig to fix unrelated bugs, 
and that's where the CF process has been deadlocked on this one.


A core bugfix with locking in security barrier views is required 
before the regression tests can be fixed up properly, for one thing. 
Tom also expressed concerns about how plan invalidation works, though 
it's not yet clear whether that was just miscommunication about how it 
works on my part or whether there's a concrete problem there.


This is similarly stuck.  I'm not out of resources, there's just nothing 
I can do here myself.  For this to move forward, a committer needs to 
pick up the security barrier views part.  We also need a bug fix for the 
issue that's breaking the regression tests.  Once all that's done, RLS 
on top of updatable security barrier views might be committable.  But 
there's no way to tell for sure until those other two bits are sorted out.


I'd really love to finish this off for 9.4, but other projects have to 
come first. 


I have no other projects ahead of this for the remainder of this month.  
I just can't figure out what to do next until there's a committer (or 
committers, if someone else is going to take on the locking bug) 
identified.  I looked at the locking problem enough to know that one is 
over my head.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] four minor proposals for 9.5

2014-04-08 Thread Gregory Smith

On 4/6/14 2:46 PM, Pavel Stehule wrote:


Proposed options are interesting for enterprise using, when you have 
a some more smart tools for log entry processing, and when you need a 
complex view about performance of billions queries - when cancel time 
and lock time is important piece in mosaic of server' fitness.


I once sent a design proposal over for something I called Performance 
Events that included this.  It will be difficult to get everything 
people want to track into log_line_prefix macro form.  You're right that 
this particular one can probably be pushed into there, but you're adding 
four macros just for this feature. And that's only a fraction of what 
people expect from database per-query performance metrics.


The problem I got stuck on with the performance event project was where 
to store the data collected.  If you want to keep up with read rates, 
you can't use the existing log infrastructure.  It has to be something 
faster, lighter.  I wanted to push the data into shared memory somewhere 
instead.  Then some sort of logging consumer could drain that queue and 
persist it to disk.
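
The shape I keep coming back to is an ordinary producer/consumer split.  
Here is a toy sketch of that shape in Python, where a thread and a file 
stand in for what would really be backends, dynamic shared memory, and a 
background worker; none of the names are real PostgreSQL APIs:

import queue, threading, time, json

events = queue.Queue(maxsize=10000)   # stand-in for a shared memory ring

def consumer(path="events.log", batch=100, flush_interval=1.0):
    # Drain events in batches so the producers never wait on disk.
    buf, last_flush = [], time.monotonic()
    with open(path, "a") as f:
        while True:
            try:
                item = events.get(timeout=flush_interval)
                if item is None:         # shutdown marker
                    break
                buf.append(item)
            except queue.Empty:
                pass
            if len(buf) >= batch or time.monotonic() - last_flush > flush_interval:
                f.writelines(json.dumps(e) + "\n" for e in buf)
                f.flush()
                buf, last_flush = [], time.monotonic()
        f.writelines(json.dumps(e) + "\n" for e in buf)

t = threading.Thread(target=consumer, daemon=True)
t.start()
for i in range(250):                     # producers just drop events and move on
    events.put({"event": "lock wait", "ms": i % 7})
events.put(None)
t.join()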


Since then, we've had a number of advances, particularly these two:

-Dynamic shared memory allocation.
-Query log data from pg_stat_statements can persist.

With those important fundamentals available, I'm wandering around right 
now trying to get development resources to pick the whole event logging 
idea up again.  The hardest parts of the infrastructure I was stuck on 
in the past are in the code today.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-23 Thread Gregory Smith

On 1/20/14 9:46 AM, Mel Gorman wrote:
They could potentially be used to evalate any IO scheduler changes. 
For example -- deadline scheduler with these parameters has X 
transactions/sec throughput with average latency of Y millieseconds 
and a maximum fsync latency of Z seconds. Evaluate how well the 
out-of-box behaviour compares against it with and without some set of 
patches. At the very least it would be useful for tracking historical 
kernel performance over time and bisecting any regressions that got 
introduced. Once we have a test I think many kernel developers (me at 
least) can run automated bisections once a test case exists. 


That's the long term goal.  What we used to get out of pgbench were 
things like 60 second latencies when a checkpoint hit with GBs of dirty 
memory.  That does happen in the real world, but that's not a realistic 
case you can tune for very well.  In fact, tuning for it can easily 
degrade performance on more realistic workloads.


The main complexity I don't have a clear view of yet is how much 
unavoidable storage level latency there is in all of the common 
deployment types.  For example, I can take a server with a 256MB 
battery-backed write cache and set dirty_background_bytes to be smaller 
than that.  So checkpoint spikes go away, right?  No. Eventually you 
will see dirty_background_bytes of data going into an already full 256MB 
cache.  And when that happens, the latency will be based on how long it 
takes to write the cached 256MB out to the disks.  If you have a single 
disk or RAID-1 pair, that random I/O could easily happen at 5MB/s or 
less, and that makes for a 51 second cache clearing time.  This is a lot 
better now than it used to be because fsync hasn't flushed the whole 
cache in many years now. (Only RHEL5 systems still in the field suffer 
much from that era of code)  But you do need to look at the distribution 
of latency a bit because of how the cache impacts things; you can't just 
consider min/max values.
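
The back-of-the-envelope estimate is simple enough to keep handy; here 
is a tiny Python version of the arithmetic above, with the same 
illustrative numbers:

# Worst case flush latency is roughly (cache that must drain) / (write speed).
def drain_seconds(cache_mb, write_mbs):
    return cache_mb / write_mbs

print(drain_seconds(256, 5))   # ~51 seconds: a full 256MB BBWC at 5MB/s of random I/O
print(drain_seconds(64, 5))    # ~13 seconds: a bare drive with a 64MB cache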


Take the BBWC out of the equation, and you'll see latency proportional 
to how long it takes to clear the disk's cache out. It's fun upgrading 
from a disk with 32MB of cache to 64MB only to watch worst case latency 
double.  At least the kernel does the right thing now, using that cache 
when it can while forcing data out when fsync calls arrive.  (That's 
another important kernel optimization we'll never be able to teach the 
database)


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [v9.4] row level security

2014-01-17 Thread Gregory Smith

On 12/13/13 11:40 PM, Craig Ringer wrote:

You may want to check out the updated writable security-barrier views patch.

http://www.postgresql.org/message-id/52ab112b.6020...@2ndquadrant.com

It may offer a path forward for the CF submission for RLS, letting us
get rid of the var/attr fiddling that many here objected to.


With my advocacy hat on, I'd like to revisit this idea now that there's 
a viable updatable security barrier view submission.  I thought the most 
serious showstopper feedback from the last CF's RLS submission was that 
this needed to be sorted out first.  Reworking KaiGai's submission to 
merge against Dean's new one makes it viable again in my mind, and I'd 
like to continue toward re-reviewing it as part of this CF in that 
light.  Admittedly it's not ideal to try and do that at the same time 
the barrier view patch is being modified, but I see that as a normal CF 
merge of things based on other people's submissions.


I mentioned advocacy because the budding new PostgreSQL test instances 
I'm seeing now will lose a lot of momentum if we end up with no user 
visible RLS features in 9.4.  The pieces we have now can assemble into 
something that's useful, and I don't think that goal is unreasonably far 
away.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Lsf-pc] [HACKERS] Linux kernel impact on PostgreSQL performance

2014-01-17 Thread Gregory Smith

On 1/17/14 10:37 AM, Mel Gorman wrote:
There is not an easy way to tell. To be 100%, it would require an 
instrumentation patch or a systemtap script to detect when a 
particular page is being written back and track the context. There are 
approximations though. Monitor nr_dirty pages over time.


I have a benchmarking wrapper for the pgbench testing program called 
pgbench-tools:  https://github.com/gregs1104/pgbench-tools  As of 
October, on Linux it now plots the Dirty value from /proc/meminfo over 
time.  You get that on the same time axis as the transaction latency 
data.  The report at the end includes things like the maximum amount of 
dirty memory observed during the test sampling. That doesn't tell you 
exactly what's happening to the level someone reworking the kernel logic 
might want, but you can easily see things like the database's checkpoint 
cycle reflected by watching the dirty memory total.  This works really 
well for monitoring production servers too.  I have a lot of data from a 
plugin for the Munin monitoring system that plots the same way.  Once 
you have some history about what's normal, it's easy to see when systems 
fall behind in a way that's ruining writes, and the high water mark 
often correlates with bad responsiveness periods.
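
The sampling side of that is nothing fancy; it boils down to something 
like this simplified Python rendering (not the actual pgbench-tools or 
Munin code):

import time

def dirty_kb():
    # /proc/meminfo carries a line like "Dirty:   123456 kB"
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("Dirty:"):
                return int(line.split()[1])
    return 0

# Sample on the same clock as the latency log so the two share a time axis.
for _ in range(5):
    print(int(time.time()), dirty_kb())
    time.sleep(1)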


Another recent change is that pgbench for the upcoming PostgreSQL 9.4 
now allows you to specify a target transaction rate.  Seeing the write 
latency behavior with that in place is far more interesting than 
anything we were able to watch with pgbench before.  The pgbench write 
tests we've been doing for years mainly told you the throughput rate 
when all of the caches were always as full as the database could make 
them, and tuning for that is not very useful. Turns out it's far more 
interesting to run at 50% of what the storage is capable of, then watch 
what happens to latency when you adjust things like the dirty_* parameters.


I've been working on the problem of how we can make a benchmark test 
case that acts enough like real busy PostgreSQL servers that we can 
share it with kernel developers, and then everyone has an objective way 
to measure changes.  These rate limited tests are working much better 
for that than anything I came up with before.


I am skeptical that the database will take over very much of this work 
and perform better than the Linux kernel does.  My take is that our most 
useful role would be providing test cases kernel developers can add to a 
performance regression suite.  Ugly "we never thought that would happen" 
situations seem to be at the root of many of the kernel performance 
regressions people here get nailed by.


Effective I/O scheduling is very hard, and we are unlikely to ever 
out-innovate the kernel hacking community by pulling more of that into the 
database.  It's already possible to experiment with moving in that 
direction with tuning changes.  Use a larger database shared_buffers 
value, tweak checkpoints to spread I/O out, and reduce things like 
dirty_ratio.  I do some of that, but I've learned it's dangerous to 
wander too far that way.


If instead you let Linux do even more work--give it a lot of memory to 
manage and room to re-order I/O--that can work out quite well. For 
example, I've seen a lot of people try to keep latency down by using the 
deadline scheduler and very low settings for the expire times.  Theory 
is great, but it never works out in the real world for me.  
Here's the sort of deadline setup I deploy instead now:


echo 500 > ${DEV}/queue/iosched/read_expire
echo 30 > ${DEV}/queue/iosched/write_expire
echo 1048576 > ${DEV}/queue/iosched/writes_starved

These numbers look insane compared to the defaults, but I assure you 
they're from a server that's happily chugging through 5 to 10K 
transactions/second around the clock.  PostgreSQL forces writes out with 
fsync when they must go out, but this sort of tuning is basically giving 
up on it managing writes beyond that.  We really have no idea what order 
they should go out in.  I just let the kernel have a large pile of work 
queued up, and trust things like the kernel's block elevator and 
congestion code are smarter than the database can possibly be.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] gaussian distribution pgbench

2013-12-19 Thread Gregory Smith

On 12/19/13 5:52 PM, Gavin Flower wrote:
Curious, wouldn't the common usage pattern tend to favour a skewed 
distribution, such as the  Poisson Distribution (it has been over 40 
years since I studied this area, so there may be better candidates).




Some people like database load testing with a Pareto principle 
distribution, where 80% of the activity hammers 20% of the rows such 
that locking becomes important.  (That's one specific form of Pareto 
distribution)  The standard pgbench load indirectly gets you quite a bit 
of that due to all the contention on the branches table. Targeting all 
of that at a single table can be more realistic.
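
For anyone who wants that specific form, the shape parameter is the 
textbook one; here is a quick Python check of the value that gives the 
classic 80/20 split (nothing pgbench-specific in this, just the standard 
Pareto relationship):

import math

# For a Pareto(alpha) distribution, the busiest fraction p of the rows gets
# p**(1 - 1/alpha) of the activity.  Solving 0.2**(1 - 1/alpha) = 0.8:
alpha = 1 / (1 - math.log(0.8) / math.log(0.2))
print(round(alpha, 3))    # ~1.161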


My last round of reviewing a pgbench change left me pretty worn out on 
extending that code much further.  Adding in some new 
probability distributions would be fine; that's a narrow change.  
We shouldn't get too excited about pgbench remaining a great tool for 
too much longer though.  pgbench is fast approaching a wall nowadays, 
where it's hard for any single client machine to fully overload today's 
larger servers.  You basically need a second large server to generate 
load, whereas what people really want is a bunch of coordinated small 
clients.  (That sort of wall was in early versions too, it just got 
pushed upward a lot by the multi-worker changes in 9.0 coming around the 
same time desktop core counts really skyrocketed)


pgbench started as a clone of a now abandoned Java project called 
JDBCBench.  I've been seriously considering a move back toward that 
direction lately.  Nowadays spinning up ten machines to run load 
generation is trivial.  The idea of extending pgbench's C code to 
support multiple clients running at the same time and collating all of 
their results is not a project I'd be excited about.  It should remain a 
perfectly fine tool for PostgreSQL developers to find code hotspots, but 
that's only so useful.


(At this point someone normally points out Tsung solved all of those 
problems years ago if you'd only give it a chance.  I think it's kind of 
telling that work on sysbench is rewriting the whole thing so you can 
use Lua for your test scripts.)



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row security roadmap proposal

2013-12-19 Thread Gregory Smith

On 12/18/13 10:21 PM, Craig Ringer wrote:
In the end, sometimes I guess there's no replacement for WHERE 
call_some_procedure()


That's where I keep ending up at.  The next round of examples I'm 
reviewing this week plug pl/pgsql code into that model.  And the one 
after that actually references locally cached data that starts stored in 
LDAP on another machine altogether.  That one I haven't even asked for 
permission to share with the community because of my long standing LDAP 
allergy, but the whole thing plugs into the already submitted patch just 
fine.  (Shrug)


I started calling all of the things that generate data for RLS to filter 
on "label providers".  You've been using SELinux as an example future 
label provider.  Things like this LDAP-originated bit are another 
provider.  Making the database itself a richer label provider one day is 
an interesting usability improvement to map out.  But in the proof of 
concept things I've been getting passed, I haven't seen an example where 
I'd use that yet anyway.  The real-world label providers are too 
complicated.





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row security roadmap proposal

2013-12-16 Thread Gregory Smith

On 12/16/13 9:36 AM, Craig Ringer wrote:

- Finish and commit updatable security barrier views. I've still got a
lot of straightening out to do there.
I don't follow why you've put this part first.  It has a lot of new 
development and the risks that go along with that, but the POC projects 
I've been testing are more interested in the view side issues.



- Decide on and implement a structure for row-security functionality its
self. I'm persuaded by Robert's comments here, that whatever we expose
must be significantly more usable than a DIY on top of views (with the
fix for updatable security barrier views to make that possible).


Can't say I agree we should be getting tangled into making this slick on 
the user side right now.  If you open a new can of worms like 
hierarchical access labels, this goes back into basic design, and we'll 
spend months on that before returning to exactly the same implementation 
details.  I tried to make a case for how having a really generic 
mechanism that doesn't presume to know how labels will be assigned can 
be a good thing.


Given the state this is all in right now, I'd much rather publish a hard 
to use but powerful API than to presume you're going to get an easier to 
use design right.  The place I'm at here is trying to figure out the 
simplest useful thing that could be committed and then hammer on the 
details.  (Not the first time I've beat that drum on a feature)  Your 
roadmap goes way past that, which is great to avoid being painted into a 
corner, but I'm thinking more about what a useful feature freeze point 
would look like given it's December 16 now.


--
Greg Smith greg.sm...@crunchydatasolutions.com
Chief PostgreSQL Evangelist - http://crunchydatasolutions.com/


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers