[HACKERS] date_part()/EXTRACT(second) behaviour with time data type

2009-07-29 Thread Gregory Stark

I think we broke date_part for extracting seconds from time arguments. It
appears we leave out the fractional seconds, whereas we don't for timestamp
arguments. This was not the case in 8.3, where we included the fractional
seconds for both data types.

Unless this is intentional? I know we whacked around both the meaning of SECOND
for intervals and the code for date_part in a lot of ways. But I don't see why
it would be different for time versus timestamp.

postgres=# select extract(second from now()::time with time zone);
 date_part 
-----------
        27
(1 row)

postgres=# select extract(second from now()::time without time zone);
 date_part 
-----------
        27
(1 row)

postgres=# select extract(second from now()::timestamp with time zone);
 date_part 
-----------
 27.782458
(1 row)

postgres=# select extract(second from now()::timestamp without time zone);
 date_part 
-----------
 27.782458
(1 row)
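To make the discrepancy concrete, here is a small illustrative sketch (not the actual PostgreSQL timestamp code; the function names are hypothetical) of the two behaviours, modelling a time-of-day value as microseconds since midnight the way integer datetimes do:

```c
#include <stdint.h>

/* Hypothetical sketch, not backend code: the seconds field of a
 * time value stored as microseconds since midnight. */

/* 8.3-style (and timestamp-style) behaviour: keep the fraction */
static double seconds_with_fraction(int64_t usecs_since_midnight)
{
    return (double) (usecs_since_midnight % INT64_C(60000000)) / 1e6;
}

/* The behaviour reported above for time: fraction dropped */
static double seconds_truncated(int64_t usecs_since_midnight)
{
    return (double) ((usecs_since_midnight % INT64_C(60000000)) / INT64_C(1000000));
}
```

For the value shown in the examples (27.782458 seconds past the minute), the first function returns 27.782458 and the second just 27.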


-- 
  Gregory Stark
  http://mit.edu/~gsstark/resume.pdf

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Merge Append Patch merged up to 85devel

2009-07-05 Thread Gregory Stark
 *appendplans, bool isTarget, List *tlist);
+extern Append *make_append(PlannerInfo *root, List *appendplans, 
+		   bool isTarget, List *tlist, List *pathkeys);
 extern RecursiveUnion *make_recursive_union(List *tlist,
 	 Plan *lefttree, Plan *righttree, int wtParam,
 	 List *distinctList, long numGroups);

-- 
  Gregory Stark
  http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-10 Thread Gregory Stark
Andres Freund and...@anarazel.de writes:

 Plan time (averaged) without change:
 cnt: 40 (4 times per session)
 avg: 4572ms

 Plan time (averaged) with change:
 cnt: 40 (4 times per session)
 avg: 4236ms

 ~7% difference. Same with higher number of repetitions and with most other
 planner settings I tried.

What are you comparing here?




Re: [HACKERS] postmaster recovery and automatic restart suppression

2009-06-08 Thread Gregory Stark
Fujii Masao masao.fu...@gmail.com writes:

 On the other hand, the primary postgres might *not* restart automatically.
 So, it's difficult for clusterware to choose whether to do failover when it
 detects the death of the primary postgres, I think.


I think the accepted way to handle this kind of situation is called STONITH --
Shoot The Other Node In The Head.

You need some way when the cluster software decides to initiate failover to
ensure that the first node *cannot* come back up. That could mean shutting the
power to it at the PDU or disabling its network connection at the switch, or
various other options.
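The invariant described here can be sketched in a few lines of C. This is purely illustrative (fence_old_primary() stands in for cutting power at the PDU or disabling the switch port; nothing here is a real clusterware API): the point is simply that promotion must be conditional on fencing having succeeded.

```c
#include <stdbool.h>

/* Hypothetical sketch of the STONITH rule: never promote the standby
 * unless the old primary has been successfully fenced. */

typedef bool (*fence_fn)(void);

/* Returns true only if promotion happened; fencing must succeed first. */
static bool try_failover(fence_fn fence_old_primary, bool *promoted)
{
    *promoted = false;
    if (!fence_old_primary())
        return false;   /* old node might come back up: do NOT promote */
    *promoted = true;   /* safe: the old primary cannot return */
    return true;
}

static bool fence_ok(void)   { return true;  }
static bool fence_fail(void) { return false; }
```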

  Gregory Stark
  http://mit.edu/~gsstark/resume.pdf



Re: [HACKERS] Plan time Improvement - 64bit bitmapset

2009-06-03 Thread Gregory Stark
Kevin Grittner kevin.gritt...@wicourts.gov writes:

 Andres Freund and...@anarazel.de wrote: 
  
 long plan times (minutes and up)
  
 Wow.  I thought I had some pretty complex queries, including some
 which join using several views, each of which has several joins; but
 I've never gone to multiple seconds on plan time (much less multiple
 minutes!) without very high statistics targets and many indexes on the
 tables.  Any rough estimates on those?

My money's still on very large statistics targets. If you have a lot of
columns and 1,000-element arrays for each column, that can get big pretty
quickly.

But that doesn't explain the bitmap ops being important. Hm. Actually having a
lot of columns and then joining a lot of tables could mean having fairly large
bitmapsets.
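For reference, the idea behind the 64-bit bitmapset change in this thread can be sketched with a single-word toy version (this is not PostgreSQL's bitmapset.c; the names are simplified stand-ins). Widening the word from 32 to 64 bits halves the number of words, and therefore loop iterations, needed to cover a given range of members:

```c
#include <stdint.h>

/* Toy single-word bitmapset with a 64-bit word. */
typedef uint64_t bmw;

static bmw bms_add(bmw w, int member)
{
    return w | (UINT64_C(1) << member);     /* member in 0..63 */
}

static int bms_is_member(bmw w, int member)
{
    return (int) ((w >> member) & 1);
}

/* Count set members: one pass over 64 bits instead of two 32-bit words. */
static int bms_num_members(bmw w)
{
    int n = 0;
    while (w)
    {
        w &= w - 1;                          /* clear lowest set bit */
        n++;
    }
    return n;
}
```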



-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



[HACKERS] valgrind error in tsvectorin

2009-05-14 Thread Gregory Stark

Was just running the regression tests under valgrind and aside from the usual
false positives caused by structure padding I noticed this:

==19366== Source and destination overlap in memcpy(0x4BB7FC0, 0x4BB7FC0, 12)
==19366==at 0x4026B12: memcpy (mc_replace_strmem.c:402)
==19366==by 0x8389750: uniqueentry (tsvector.c:128)
==19366==by 0x8389C63: tsvectorin (tsvector.c:265)
==19366==by 0x83B1888: InputFunctionCall (fmgr.c:1878)
==19366==by 0x83B1B46: OidInputFunctionCall (fmgr.c:2009)
==19366==by 0x8171651: stringTypeDatum (parse_type.c:497)
==19366==by 0x8171CAC: coerce_type (parse_coerce.c:239)
==19366==by 0x8171A72: coerce_to_target_type (parse_coerce.c:86)
==19366==by 0x8166DB5: transformTypeCast (parse_expr.c:2016)
==19366==by 0x8162FA8: transformExpr (parse_expr.c:181)
==19366==by 0x8174990: transformTargetEntry (parse_target.c:75)
==19366==by 0x8174B01: transformTargetList (parse_target.c:145)

After a quick glance at the code I suspect res and ptr end up pointing to the
same object, perhaps the loop condition has a fencepost error. But I don't
really understand what it's trying to do at all.
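For illustration, here is the shape of fix such a valgrind report usually calls for. This is a hypothetical dedup loop, loosely modelled on what uniqueentry() appears to be doing but not the real tsvector code: when res and ptr coincide, memcpy with identical source and destination is technically undefined, so either skip the self-copy or use memmove.

```c
#include <string.h>

/* Remove adjacent duplicates from an int array in place; returns the
 * new length.  Guarding against res == ptr avoids the overlapping
 * memcpy that valgrind flags. */
static int unique_ints(int *a, int n)
{
    int *res = a;
    int *ptr = a;
    int *end = a + n;

    while (ptr < end)
    {
        if (ptr == a || *ptr != *(ptr - 1))
        {
            if (res != ptr)                 /* skip the self-copy case */
                memmove(res, ptr, sizeof(int));
            res++;
        }
        ptr++;
    }
    return (int) (res - a);
}
```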

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



[HACKERS] valgrind errors

2009-05-14 Thread Gregory Stark

And lest anyone think Teodor and Oleg are the only ones that have that kind of
problem, here are two in resolve_polymorphic_tupdesc that fire several times
in the regression tests:

==20391== Source and destination overlap in strncpy(0x4BD91DA, 0x4BD91DA, 64)
==20391==at 0x4026CC4: strncpy (mc_replace_strmem.c:291)
==20391==by 0x83224CF: namestrcpy (name.c:217)
==20391==by 0x809635F: TupleDescInitEntry (tupdesc.c:463)
==20391==by 0x83B2BA1: resolve_polymorphic_tupdesc (funcapi.c:500)
==20391==by 0x83B272C: internal_get_result_type (funcapi.c:323)
==20391==by 0x83B256B: get_expr_result_type (funcapi.c:234)
==20391==by 0x816DE41: addRangeTableEntryForFunction (parse_relation.c:1138)
==20391==by 0x816023E: transformRangeFunction (parse_clause.c:594)
==20391==by 0x816059A: transformFromClauseItem (parse_clause.c:709)
==20391==by 0x815F686: transformFromClause (parse_clause.c:123)
==20391==by 0x813ADE3: transformSelectStmt (analyze.c:800)
==20391==by 0x8139B83: transformStmt (analyze.c:183)
==20372== Source and destination overlap in strncpy(0x4BD887D, 0x4BD887D, 64)
==20372==at 0x4026CC4: strncpy (mc_replace_strmem.c:291)
==20372==by 0x83224CF: namestrcpy (name.c:217)
==20372==by 0x809635F: TupleDescInitEntry (tupdesc.c:463)
==20372==by 0x83B2BEA: resolve_polymorphic_tupdesc (funcapi.c:507)
==20372==by 0x83B272C: internal_get_result_type (funcapi.c:323)
==20372==by 0x83B256B: get_expr_result_type (funcapi.c:234)
==20372==by 0x816DE41: addRangeTableEntryForFunction (parse_relation.c:1138)
==20372==by 0x816023E: transformRangeFunction (parse_clause.c:594)
==20372==by 0x816059A: transformFromClauseItem (parse_clause.c:709)
==20372==by 0x815F686: transformFromClause (parse_clause.c:123)
==20372==by 0x813ADE3: transformSelectStmt (analyze.c:800)
==20372==by 0x8139B83: transformStmt (analyze.c:183)

I'm not sure if there are any realistic platforms where strncpy with source and
destination exactly equal is really going to cause an actual problem, but...

And another one in tsearch:

==20349== Source and destination overlap in memcpy(0x71ADAE4, 0x71ADAE4, 8)
==20349==at 0x4026B12: memcpy (mc_replace_strmem.c:402)
==20349==by 0x82DC762: dispell_lexize (dict_ispell.c:141)
==20349==by 0x83B0A66: FunctionCall4 (fmgr.c:1360)
==20349==by 0x82D315F: LexizeExec (ts_parse.c:208)
==20349==by 0x82D3776: parsetext (ts_parse.c:399)
==20349==by 0x82E1EBC: to_tsvector_byid (to_tsany.c:226)
==20349==by 0x81E9016: ExecMakeFunctionResult (execQual.c:1665)
==20349==by 0x81E9932: ExecEvalFunc (execQual.c:2097)
==20349==by 0x81ED49F: ExecEvalExprSwitchContext (execQual.c:4076)
==20349==by 0x8276D89: evaluate_expr (clauses.c:3841)
==20349==by 0x82763C0: evaluate_function (clauses.c:3448)
==20349==by 0x8275EA1: simplify_function (clauses.c:3252)

And I think this is just uninitialized data at the end of the bitfield that is
being masked off anyways:

==20253== Conditional jump or move depends on uninitialised value(s)
==20253==at 0x82F2C72: array_bitmap_copy (arrayfuncs.c:3854)
==20253==by 0x82EF4A3: array_set_slice (arrayfuncs.c:2538)
==20253==by 0x81E6B8C: ExecEvalArrayRef (execQual.c:397)
==20253==by 0x81EF8D7: ExecTargetList (execQual.c:4988)
==20253==by 0x81EFDA2: ExecProject (execQual.c:5203)
==20253==by 0x81EFF1F: ExecScan (execScan.c:143)
==20253==by 0x825: ExecSeqScan (nodeSeqscan.c:131)
==20253==by 0x81E5F3C: ExecProcNode (execProcnode.c:363)
==20253==by 0x81E3597: ExecutePlan (execMain.c:1504)
==20253==by 0x81E16F3: standard_ExecutorRun (execMain.c:309)
==20253==by 0x81E15A4: ExecutorRun (execMain.c:258)
==20253==by 0x82CD9E5: ProcessQuery (pquery.c:196)


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Serializable Isolation without blocking

2009-05-07 Thread Gregory Stark
Albe Laurenz laurenz.a...@wien.gv.at writes:

 So I think one would have to add intention locks for rows considered
 in the WHERE clause to guarantee true serializability.

Does the paper explain how to deal with rows considered in the WHERE clause
which don't yet exist? I.e., SELECT count(*) WHERE foo needs to take out a
lock which would cause any transaction which inserts a new record where foo is
true to abort.

In MSSQL this requires locking the page of the index where such records would
be inserted (or the entire table if there's no index). In Predicate locking
schemes this requires a separate storage structure for storing such predicates
which can be arbitrarily complex expressions to check any new tuple being
inserted against.

Are these intention locks predicate locks, in that they're not associated with
actual pages or records but with potential records which might be inserted in
the future?
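To make the second alternative concrete, here is a toy model of the predicate-locking idea (only a sketch of the shape, under the assumption that predicates can be represented as callable checks; a real implementation must handle arbitrary expressions, concurrency, and cleanup): readers register the predicate they scanned with, and any later insert is checked against the stored predicates, with a match forcing the writer to abort.

```c
#include <stdbool.h>

typedef bool (*predicate_fn)(int new_row_value);

#define MAX_PREDICATES 8

static predicate_fn predicates[MAX_PREDICATES];
static int npredicates = 0;

/* A reader records the predicate its scan depended on. */
static void register_predicate(predicate_fn p)
{
    if (npredicates < MAX_PREDICATES)
        predicates[npredicates++] = p;
}

/* Returns true if the insert may proceed; false means the inserting
 * transaction matches a reader's predicate and would have to abort. */
static bool check_insert(int new_row_value)
{
    for (int i = 0; i < npredicates; i++)
        if (predicates[i](new_row_value))
            return false;
    return true;
}

/* Example predicate standing in for "WHERE foo". */
static bool foo_is_positive(int v) { return v > 0; }
```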

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] [PATCH] unalias of ACL_SELECT_FOR_UPDATE

2009-04-20 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Greg Stark st...@enterprisedb.com writes:
 I think we're talking at cross purposes here. I think Kai Gai's
 descriptions make sense if you start with a different set of
 assumptions. The idea behind SELinux is that each individual object is
 access controlled and each user has credentials which grant access to
 specific operations on specific objects. As I understand it part of
 the goal is to eliminate situations where setuid or other forms of
 privilege escalation is required.

 Well, if so, the idea is a miserable failure.  SELinux has just as many
 setuid programs as any other Unix, and absolutely zero hope of removing
 them.  I am not going to take the idea of remove setuid seriously when
 they haven't been able to accomplish it anywhere else.

But can you remove privileges from users to make these programs ineffective?
So even if you obtain root privileges you're missing the SE privilege which
the program expects to use?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] Unicode support

2009-04-13 Thread Gregory Stark

- - crossroads0...@googlemail.com writes:

 The original post seemed to be a contrived attempt to say you should
 use ICU.

 Indeed.  The OP should go read all the previous arguments about ICU
 in our archives.

 Not at all. I just was making a suggestion. You may use any other
 library or implement it yourself (I even said that in my original
 post). www.unicode.org - the official website of the Unicode
 consortium, have a complete database of all Unicode characters which
 can be used as a basis.

 But if you want to ignore the normalization/multiple code point issue,
 point 2--the collation problem--still remains. And given that even a
 crappy database as MySQL supports Unicode collation, this isn't
 something to be ignored, IMHO.

Sure, supporting multiple collations in a database is definitely a known
missing feature. There is a lot of work required to do it; a patch to do so
was too late to make it into 8.4 and required more work, so hopefully the
issues will be worked out for 8.5.

I suggest you read the old threads and make any contributions you can
suggesting how to solve the problems that arose.


 I don't believe that the standard forbids the use of combining chars at all.
 RFC 3629 says:

  ... This issue is amenable to solutions based on Unicode Normalization
  Forms, see [UAX15].

This is the relevant part. Tom was claiming that the UTF8 encoding required
normalizing the string of unicode codepoints before encoding. I'm not sure
that's true though, is it?
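One small supporting example (my own illustration, not from the thread): both of the byte strings below are well-formed UTF-8 encodings of "é", the precomposed U+00E9 (NFC) and "e" followed by combining acute U+0301 (NFD). UTF-8 happily encodes either; deciding that they are "the same" requires normalization on top of the encoding, which is exactly the distinction at issue.

```c
#include <string.h>

static const char e_acute_nfc[] = "\xc3\xa9";   /* U+00E9 precomposed     */
static const char e_acute_nfd[] = "e\xcc\x81";  /* U+0065 + U+0301 combining */

/* Byte-wise comparison, as strcmp (or a non-normalizing collation)
 * would see the two canonically equivalent strings. */
static int bytes_equal(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```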


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



[HACKERS] Translation conventions

2009-04-09 Thread Gregory Stark

I happened to be doing this grep (the error message could perhaps use a HINT,
as I couldn't figure out how to satisfy it...)

I noticed that the Croatian msgid string seems not to match the other
languages. Did this message change recently? Does this get updated only when
we reach beta and declare a string freeze?

find . -type f -print0 | xargs -0 -e grep -nH -e 'a column definition list is required for functions returning'
./parser/parse_relation.c:1159:  errmsg("a column definition list is required for functions returning \"record\""),
./po/af.po:4942:msgid "a column definition list is required for functions returning \"record\""
./po/de.po:7267:msgid "a column definition list is required for functions returning \"record\""
./po/es.po:7404:msgid "a column definition list is required for functions returning \"record\""
./po/fr.po:7752:msgid "a column definition list is required for functions returning \"record\""
./po/hr.po:4902:msgid "a column definition list is required for functions returning RECORD"


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



[HACKERS] Strange query plan with redundant aggregate nodes

2009-04-09 Thread Gregory Stark

This query surprised me. I expected us to do the Aggregate once for all the
aggregate functions in the select target which is what normally happens. If I
simplify the query further it actually does so. 

I don't really understand what's going on here. It can't be the volatile
random() because in fact even if I make them depend on the random value the
subplans are executed with the same parameter values anyways and the sums end
up being the same.

postgres=# explain select sum(n),sum(n)
from (select (select count(*) as n from a ) as n 
from (select random() as s) as xyzzy) as xyzzy ;

                            QUERY PLAN                            
------------------------------------------------------------------
 Aggregate  (cost=5676.06..5676.07 rows=1 width=0)
   InitPlan
     ->  Aggregate  (cost=2838.00..2838.01 rows=1 width=0)
           ->  Seq Scan on a  (cost=0.00..2588.00 rows=10 width=0)
     ->  Aggregate  (cost=2838.00..2838.01 rows=1 width=0)
           ->  Seq Scan on a  (cost=0.00..2588.00 rows=10 width=0)
   ->  Result  (cost=0.00..0.01 rows=1 width=0)
(7 rows)


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] default parameters for built-in functions

2009-04-06 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 So I think that's out.  The only alternative that comes to mind is
 to have initdb issue an additional SQL command to establish the default
 after the bootstrap phase; that is, something like

 CREATE OR REPLACE FUNCTION
   pg_start_backup(label text, fast boolean DEFAULT false)
   RETURNS text LANGUAGE internal STRICT AS 'start_backup';

 in system_views.sql or some such place.

Well, not that this is appropriate at this moment, but I had been considering
proposing we do this to most of pg_proc and pg_operator. It's really quite a
pain to maintain these files manually, especially pg_operator.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] small but useful patches for text search

2009-03-16 Thread Gregory Stark
Oleg Bartunov o...@sai.msu.su writes:

 We would like to have your opinion what to do with these patches - leave them
 for 8.5 or provide them to  hackers to review for 8.4.

In theory 8.5, though you wouldn't be the first to start sneaking in late
commits, and given how long it will be before the release I can't really think
people would complain too much.

Things have reached a perverse state. We've now been in a holding pattern for
4 1/2 months while development has basically ceased. Aside from committers, no
contributors have been able to make any progress on any work for that long
already, and months remain before any reviewers will pay any attention to
submissions.

I have a bunch of ideas I wanted to follow up posix_fadvise with including
posix_fallocate and more uses of posix_fadvise. I also wanted to return to the
ordered merge-append which I still think is critical for partitioned table
plans.

I think it's clear that stretching feature freezes is a failed policy. Aside
from delaying everyone else's work, it hasn't helped the projects we held the
release for either. Those projects would have hit 8.5 in due course just fine
but now surely we'll delay 8.5 based on the 8.4 release date. So they'll be
delayed just like everyone else's work.

I think in the future we should adopt a policy that only minor patches can be
received after the first commitfest. Major patches are expected to be submitted
in the *first* commitfest and reworked based on feedback in subsequent
commitfests. The last commitfest should be a real feature-freeze where only
bug-fixes and minor changes are accepted, not major features.

There seems to be a lot of resistance on the basis that there would be a long
wait until features are seen in a release. Personally I'm only interested in
when features get committed, not when they hit a release. But in any case I
think experience shows that this would result in hitting the same release
anyways and that release would be sooner as well.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Has anybody think about changing BLCKSZ to an option of initdb?

2009-03-14 Thread Gregory Stark
Joshua D. Drake j...@commandprompt.com writes:

 On Sat, 2009-03-14 at 13:53 +0100, Martijn van Oosterhout wrote:
 On Wed, Mar 11, 2009 at 01:29:43PM +, Greg Stark wrote:
  The main advantage would be for circumstances such as the Windows
  installer where users are installing precompiled binaries. They don't
  get an opportunity to choose the block size at all. (Similarly for
  users of binary-only commercial products such as EDB's but the Windows
  installer makes a pretty good argument on its own).
 
 And all the linux distributions which ship precompiled binaries. I'm
 sure there are people who compile postgres themselves but I think there
 are more who don't.

 I think that is an understatement. I would say 99% of postgresql users
 do NOT compile from source. Heck the only time I compile from source is
 when I need to fix mis-configured defaults in RH packages (which is why
 we now have rpms that fix those defaults) or when we have back patched
 something for a customer.

So has anyone here done any experiments with live systems with different block
sizes? What were your experiences? 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Should SET ROLE inherit config params?

2009-03-13 Thread Gregory Stark
Guillaume Smet guillaume.s...@gmail.com writes:

 On Fri, Mar 13, 2009 at 2:39 AM, Josh Berkus j...@agliodbs.com wrote:
 SET ROLE special WITH SETTINGS

 ... or similar; I'd need to find an existing keyword which works.

 Perhaps something like SET ROLE special NEW SESSION;.

 It solves a problem mentioned by Tom as it's very clear that it's a
 new session so when you reset the settings to what they were at
 session start, you take the default settings of special.

So this is just syntactic sugar for 

SET ROLE;
RESET ALL;

Or is it more or less?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1710)

2009-03-11 Thread Gregory Stark
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:

 KaiGai Kohei wrote:
  * ACL_SELECT_FOR_UPDATE has same value with ACL_UPDATE, so SE-PostgreSQL
checks db_table:{update} permission on SELECT ... FOR SHARE OF,
instead of db_table:{lock} permission.

 This again falls into the category of trying to have more fine-grained
 permissions than vanilla PostgreSQL has. Just give up on the lock permission,
 and let it check update permission instead. Yes, it can be annoying that you
 need update-permission to do SELECT FOR SHARE, but that's an existing problem
 and not in scope for this patch.

Would it make sense, instead of removing and deferring pieces bit by bit, to
work the other way around? Extract just the part of the patch that maps
SELinux capabilities to Postgres privileges as a first patch? Then discuss any
other parts individually at a later date?

That might relieve critics of the sneaking suspicion that there may be some
semantic change that hasn't been identified and discussed and has snuck
through. Some of the changes are probably good ideas, but if they are they're
probably good ideas even for non-SE semantics too.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1704)

2009-03-10 Thread Gregory Stark
KaiGai Kohei kai...@kaigai.gr.jp writes:

 Heikki Linnakangas wrote:
 If we drop the goal of trying to restrict what a superuser can do, is the
 patch still useful?

 I want to keep permission checks on files specified by users, because
 the superuser permission affects very wide scope, and all or nothing
 policy in other word.
 However, the combination of clients and files is not so simple, and
 I think it is necessary to apply permission checks individually.

I would think the big advantage of something like SELinux is precisely in
cases like this. So for example a client that has a capability that allows him
to read a file can pass that capability to the server and be able to use COPY
to read it directly on the server.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



[HACKERS] Potential problem with HOT and indexes?

2009-03-08 Thread Gregory Stark

In answering the recent question on -general I reread README.HOT and had a
couple thoughts:

 Practically, we prevent certain transactions from using the new index by
 setting pg_index.indcheckxmin to TRUE.  Transactions are allowed to use
 such an index only after pg_index.xmin is below their TransactionXmin
 horizon, thereby ensuring that any incompatible rows in HOT chains are
 dead to them. (pg_index.xmin will be the XID of the CREATE INDEX
 transaction.  The reason for using xmin rather than a normal column is
 that the regular vacuum freezing mechanism will take care of converting
 xmin to FrozenTransactionId before it can wrap around.)

So it occurs to me that freezing xmin won't actually do what we want for
indcheckxmin. Namely it'll make the index *never* be used.

I'm not sure what we would want to happen in this admittedly pretty unlikely
scenario. We don't actually have anything protecting against having
transactions with xmin much older than freeze_threshold still hanging around.



 This means in particular that the transaction creating the index will be
 unable to use the index if the transaction has old snapshots.  We
 alleviate that problem somewhat by not setting indcheckxmin unless the
 table actually contains HOT chains with RECENTLY_DEAD members.

In index.c:

  * A side effect is to set indexInfo->ii_BrokenHotChain to true if we detect
  * any potentially broken HOT chains.  Currently, we set this if there are
  * any RECENTLY_DEAD entries in a HOT chain, without trying very hard to
  * detect whether they're really incompatible with the chain tip.

I wonder if this particular case is good evidence that we need to be cleverer
about checking if the indexed fields have actually changed.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] Potential problem with HOT and indexes?

2009-03-08 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Gregory Stark st...@enterprisedb.com writes:
 So it occurs to me that freezing xmin won't actually do what we want for
 indcheckxmin. Namely it'll make the index *never* be used.

 How do you figure that?  FrozenXID is certainly in the past from any
 vantage point.

Uhm, I'm not sure what I was thinking.

Another thought now though. What if someone updates the pg_index entry?
Since we never reset indcheckxmin, the new tuple will have a new xmin and the
index will suddenly become invisible again for no reason.

Couldn't this happen if you set a table WITHOUT CLUSTER for example? Or if
--as possibly happened in the user's case-- you reindex the table and don't
find any HOT update chains but the old pg_index entry had indcheckxmin set
already?
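For anyone following along, the reason FrozenXID is "in the past from any vantage point" can be sketched like this. It is a loose simplification of the XID comparison in transam.c, not the real implementation: permanent XIDs such as FrozenTransactionId are special-cased to sort before every normal XID, so once pg_index.xmin is frozen the indcheckxmin test passes for any transaction's xmin horizon.

```c
#include <stdint.h>
#include <stdbool.h>

typedef uint32_t TransactionId;

#define FrozenTransactionId      ((TransactionId) 2)
#define FirstNormalTransactionId ((TransactionId) 3)

static bool TransactionIdIsNormal(TransactionId xid)
{
    return xid >= FirstNormalTransactionId;
}

/* Is id1 older than id2? */
static bool TransactionIdPrecedes(TransactionId id1, TransactionId id2)
{
    if (!TransactionIdIsNormal(id1) || !TransactionIdIsNormal(id2))
        return id1 < id2;            /* permanent XIDs sort first */
    /* normal XIDs compare modulo 2^31 to survive wraparound */
    return (int32_t) (id1 - id2) < 0;
}
```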

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] Potential problem with HOT and indexes?

2009-03-08 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Gregory Stark st...@enterprisedb.com writes:
 Another thought now though. What if someone updates the pg_index entry --
 since we never reset indcheckxmin then the new tuple will have a new xmin and
 will suddenly become invisible again for no reason.

 Hmm ... if updates to pg_index entries were common then I could get
 worried about that, but they really aren't.

Fixing this for REINDEX is fairly straightforward, I think. It already updates
the pg_index row to fix indisvalid and indisready. See:

diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
index 8b14b96..c12bf6c 100644
--- a/src/backend/catalog/index.c
+++ b/src/backend/catalog/index.c
@@ -2332,6 +2332,9 @@ reindex_index(Oid indexId)
 	 * If the index is marked invalid or not ready (ie, it's from a failed
 	 * CREATE INDEX CONCURRENTLY), we can now mark it valid.  This allows
 	 * REINDEX to be used to clean up in such cases.
+	 *
+	 * Also if the index was originally built using CREATE INDEX CONCURRENTLY
+	 * we want to clear the indcheckxmin field since it's no longer relevant.
 	 */
 	pg_index = heap_open(IndexRelationId, RowExclusiveLock);
 
@@ -2342,10 +2345,11 @@ reindex_index(Oid indexId)
 		elog(ERROR, "cache lookup failed for index %u", indexId);
 	indexForm = (Form_pg_index) GETSTRUCT(indexTuple);
 
-	if (!indexForm->indisvalid || !indexForm->indisready)
+	if (!indexForm->indisvalid || !indexForm->indisready || indexForm->indcheckxmin)
 	{
 		indexForm->indisvalid = true;
 		indexForm->indisready = true;
+		indexForm->indcheckxmin = false;
 		simple_heap_update(pg_index, indexTuple->t_self, indexTuple);
 		CatalogUpdateIndexes(pg_index, indexTuple);
 	}

 Couldn't this happen if you set a table WITHOUT CLUSTER for example? Or if
 --as possibly happened in the user's case-- you reindex the table and don't
 find any HOT update chains but the old pg_index entry had indcheckxmin set
 already?

 This is all useless guesswork until we find out whether he was using
 REINDEX or CREATE INDEX CONCURRENTLY.

Well he said he had a nightly REINDEX script. What's unknown is whether the
index was originally built with CREATE INDEX CONCURRENTLY. But I don't know
any other reason for a newly built index to go unused when the query is very
selective and then to suddenly start being used after a restart.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Out parameters handling

2009-03-07 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Robert Haas robertmh...@gmail.com writes:
 On Sat, Mar 7, 2009 at 9:08 AM, Rod Taylor rod.tay...@gmail.com wrote:
 It wouldn't be so bad if you could assign internal and external column 
 names.

 This is a good point.  Uglifying the parameter names is sort of OK for
 input parameters, but is much more annoying for output parameters.

 How much of this pain would go away if we changed over to the arguably
 correct (as in Or*cle does it that way) scoping for names, wherein the
 parser first tries to match a name against column names of tables of the
 current SQL statement, and only failing that looks to see if they are
 plpgsql variables?

I'm not sure that's any better. The case where I've run into this is when I
have something like:

 balance := new value
 UPDATE tab SET balance = balance

In that case the only way we could get it right is if we default to the local
variable but only in contexts where an expression is valid.
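
A minimal sketch of the ambiguity (table and column names are hypothetical); qualifying the reference with the function name is the usual way to disambiguate:

```sql
CREATE FUNCTION set_balance(acct int, balance numeric) RETURNS void AS $$
BEGIN
    -- Which "balance" does the right-hand side mean: the parameter
    -- or the table column?  Qualifying it resolves the ambiguity.
    UPDATE accounts SET balance = set_balance.balance WHERE id = acct;
END;
$$ LANGUAGE plpgsql;
```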


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Validating problem in the isn contrib module

2009-03-06 Thread Gregory Stark
Andreas 'ads' Scherbaum adsm...@wars-nicht.de writes:

 The validator function should use the weak mode for itself to return
 'f' in case of invalid input. It cannot be the users job to make sure a
 validator function is working as expected.

Well the input function is being called before the validator function ever
gets a chance to act. The validator function receives an already-formed isbn13
datum.

Is there an is_valid() which takes text input? Perhaps there should be --
though I think it would have to be isbn_is_valid() or something like that.
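
A sketch of such a text-input validator, implemented by trapping the input-function error (this assumes the isn module's isbn13 type and is not part of the module itself):

```sql
-- Hypothetical wrapper: validate text without raising an error
CREATE FUNCTION isbn_is_valid(text) RETURNS boolean AS $$
BEGIN
    PERFORM $1::isbn13;   -- runs the strict input function
    RETURN true;
EXCEPTION WHEN OTHERS THEN
    RETURN false;         -- swallow the input error instead of raising
END;
$$ LANGUAGE plpgsql;
```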

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Validating problem in the isn contrib module

2009-03-06 Thread Gregory Stark

Andreas 'ads' Scherbaum adsm...@wars-nicht.de writes:

 So this function is useless. If the syntax is valid, the check is
 already done and is_valid() returns true. If the syntax is invalid, PG
 will raise an error even before this function returns. The invalid
 marker is not checked at all.

This seems pretty clearly explained in the documentation. 

  When you insert invalid numbers in a table using the weak mode, the number
  will be inserted with the corrected check digit, but it will be displayed with
  an exclamation mark (!) at the end, for example 0-11-000322-5!. This invalid
  marker can be checked with the is_valid function and cleared with the
  make_valid function.

  You can also force the insertion of invalid numbers even when not in the weak
  mode, by appending the ! character at the end of the number.
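
Putting the documented pieces together, a sketch of the intended workflow might look like this (table and values are illustrative):

```sql
SELECT isn_weak(true);                        -- enter weak mode
INSERT INTO books VALUES ('0-11-000322-4');   -- wrong check digit
-- stored with corrected digit and displayed as 0-11-000322-5!
-- (the trailing ! is the invalid marker)
SELECT number, is_valid(number) FROM books;   -- is_valid reports false
UPDATE books SET number = make_valid(number); -- clears the marker
```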

 This leads back to my initial question: intended behaviour or just a
 bug? And how can one validate an ISBN without raising an error?



-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TOAST table (or index?) corruption on PITR replica

2009-03-06 Thread Gregory Stark
Alvaro Herrera alvhe...@commandprompt.com writes:

 14903 2009-02-28 22:22:02 EST kangaroo app 49a9fe2e.3a37 ERROR: index
 "pg_toast_42299_index" contains unexpected zero page at block 23768

 There were actually several different instances of the first error
 involving different toast values.

Do you know how big this table was when the backup was taken? Were these
blocks present then or were they added since then? My suspicion would lie with
the relation extension code where there have been bugs before, but I don't see
any found recently.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] cardinality()

2009-03-01 Thread Gregory Stark
Peter Eisentraut pete...@gmx.net writes:

 The standard represents multidimensional arrays as arrays of arrays (like in 
 C).

Uh, C doesn't represent multidimensional arrays as arrays of arrays so you've
lost me already.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] would it be a lot of work, to add optimizations accross unions ?

2009-02-28 Thread Gregory Stark

Grzegorz Jaskiewicz g...@pointblue.com.pl writes:

 Say I have:

 select foo (
  select foo from bar1
   union all
  select foo from bar2
   union all
  select foo from bar3
   ...
 ) a order by foo desc limit X;


 (and I can give you few other examples around the same 'note', say with when
 foo=N in outer subselect)

 Would anyone consider such optimization, when postgres will apply the same
 condition to inner queries, providing that their size is substantial?

Well you haven't said what optimization you're looking for here. 

I posted a patch to look for an ordered path for members of a union a while
back but it still needed a fair amount of work before it was usable.

The LIMIT can't be pushed into the union unless we do have ordered paths so
that's further down the line. (And even then it would only work if there are
no conditions on their append paths.)

And I believe we already do push down where clauses like foo=N.
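
That pushdown is easy to observe with EXPLAIN on a sketch like the following (tables are illustrative):

```sql
EXPLAIN
SELECT foo FROM (
    SELECT foo FROM bar1
    UNION ALL
    SELECT foo FROM bar2
) a
WHERE foo = 42;
-- Each union arm should show the qual foo = 42 as a Filter (or index
-- condition), i.e. it has been pushed below the Append node.
```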

 Same would actually apply for different subqueries, without union/
 intersect/etc:

 select foo( select foo from bar1 ) a where foo in (x,y,z) order by foo  desc
 limit N

huh?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] would it be a lot of work, to add optimizations accross unions ?

2009-02-28 Thread Gregory Stark
Grzegorz Jaskiewicz g...@pointblue.com.pl writes:

 On 28 Feb 2009, at 11:37, Gregory Stark wrote:

 I posted a patch to look for an ordered path for members of a union  a while
 back but it still needed a fair amount of work before it was usable.

 I believe limit itself can't be pushed down, but with order by - why not? 

Because my patch wasn't finished? There were still things about the planner I
didn't understand which blocked me from finishing it at the time.

 select foo( select foo from bar1 ) a where foo in (x,y,z) order by  foo
 desc
 limit N

 huh?
 Just a simple example, perhaps oversimplified.
 The thing is, in case like that - planner could merge two queries  together.

No, I meant I don't understand what you're trying to do with this query or
what you would propose the planner should do with it. Afaict this isn't a
valid query at all and I don't see two queries to merge in it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgsql: Add quotes to message

2009-02-23 Thread Gregory Stark

pet...@postgresql.org (Peter Eisentraut) writes:

 Log Message:
 ---
 Add quotes to message

  errdetail("Returned type %s does not match expected type "
! "%s in column \"%s\".",
   OidIsValid(returned->attrs[i]->atttypid) ?
   format_type_be(returned->attrs[i]->atttypid) :
   _(dropped_column_type),

I'm surprised there isn't a regression test for this case.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding a field in Path Structure and Plan Structure

2009-02-23 Thread Gregory Stark
Robert Haas robertmh...@gmail.com writes:

 Attach to the backend with gdb.  Then you can get a backtrace, esp. if
 you've built with --enable-debug.

 It may be helpful to use lsof to figure out which backend your psql
 session is connected to.

select pg_backend_pid();

is pretty handy for this. Though if you have the backend crashing on every
plan it may not help...
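
The workflow, roughly (the gdb steps are the usual ones, shown here as comments):

```sql
-- In the psql session to be debugged:
SELECT pg_backend_pid();
-- Then, from a shell on the server:
--   gdb -p <pid-from-above>
--   (gdb) continue     -- run the query in psql, wait for the crash
--   (gdb) bt           -- grab the backtrace
```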
-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Multi calendar system for pgsql

2009-02-22 Thread Gregory Stark
Mohsen Alimomeni m.alimom...@gmail.com writes:

 Hi,
 To implement my local calendar, I tried adding a new type (pdate) to pgsql
 as an extension. At first I used a struct of size 6, and I returned a
 pointer to it in pdate_in with no problem. Now I changed the type to int32,
 returning PG_RETURN_INT32. I removed all palloc calls. but the server
 crashes with segmentation fault before returning in pdate_in.

You want to set PASSEDBYVALUE

(And you probably want to adjust alignment though I don't think it's causing
any problem aside from wasted space)
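
For a fixed 4-byte representation the CREATE TYPE entry would look roughly like this (function names follow the poster's pdate example; the exact options are a sketch, not his actual definition):

```sql
CREATE TYPE pdate (
    INPUT          = pdate_in,
    OUTPUT         = pdate_out,
    INTERNALLENGTH = 4,
    PASSEDBYVALUE,            -- the Datum holds the int32 directly
    ALIGNMENT      = int4     -- matches the 4-byte representation
);
```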

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: hooking parser

2009-02-20 Thread Gregory Stark
Pavel Stehule pavel.steh...@gmail.com writes:

 Curiously enough, Oracle has it so that || of null arguments treats the
 arguments as empty string.

 It's beyond comprehension.

 what is result of '' || '' ?

Well the result of this is NULL of course (which is the same as '')

What's more puzzling is what the answer to 'foo' || NULL is...
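
For comparison, the standard-conforming behaviour Postgres implements (Oracle's answers differ because it treats '' as NULL, yet reportedly has 'foo' || NULL yield 'foo'):

```sql
SELECT 'foo' || NULL;                        -- NULL in Postgres
SELECT '' || '';                             -- '' in Postgres, NULL in Oracle
SELECT coalesce('foo' || NULL, 'was null');  -- 'was null' in Postgres
```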


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator progress

2009-02-18 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 No, but this would just be the same situation that prevails after
 OID-counter wraparound, so I don't see a compelling need for us to
 change the OID counter in the new DB.  If the user has done the Proper
 Things (ie, made unique indexes on his OIDs) then it won't matter.
 If he didn't, his old DB was a time bomb anyway.

Well it was a time bomb but it wasn't necessarily about to go off... He may
very well know how close or far he is from oid wraparound and have contingency
plans in place. 

Also I wonder about the performance of skipping over thousands or even
millions of OIDs for something like a toast table.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_migrator progress

2009-02-18 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Gregory Stark st...@enterprisedb.com writes:
 Also I wonder about the performance of skipping over thousands or even
 millions of OIDs for something like a toast table.

 I think that argument is a red herring.  In the first place, it's
 unlikely that there'd be a huge run of consecutive OIDs *in the same
 table*.  

Really? Wouldn't all it take be a single large COPY loading data into a table
with one or more columns receiving large data which need to be toasted?

 In the second place, if he does have such runs, the claim that he can't
 possibly have dealt with OID wraparound before seems pretty untenable ---
 he's obviously been eating lots of OIDs.

Well there's a pretty wide margin between millions and 4 billion. I suppose
you could say it would only be a one-time cost (or a few separate one-time
costs until the oid counter passed the old value). So a few minutes after
doing an in-place upgrade while the oid counter skimmed past all the existing
values would be bearable.

 But having said that, there isn't any real harm in fixing the OID
 counter to match what it was.  You need to run pg_resetxlog to set the
 WAL position and XID counter anyway, and it can set the OID counter too.

Yeah, if it was massive amounts of code I could see arguing that it's not
justified, but given that it's about the same degree of complexity either way
it seems clear to me that it's better to do it than not to do it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] graph representation of data structures in optimizer

2009-02-18 Thread Gregory Stark
Adriano Lange adri...@c3sl.ufpr.br writes:

 Hi,

 I'm interested in data representation and debug of optimizer routines. Thus,
 I've changed the debug functions of allpaths.c to make a graphviz-like output
 of RelOptInfo structure.

 Any idea about this?
 Is there some project or improvement like this?

Several people have asked about ways to see what possible plans were
considered and why they were rejected; it was one of the repeat offenders in
the recent Postgres Pet Peeves thread so this is a very interesting area to
explore.

However I have to say this graph you've generated is amazingly hard to
decipher :) It took me a while to even figure out what information it was
presenting.

Worse, it's not useful unless you add a lot more information to it such as
what relations are actually being scanned or joined at each path which is
going to make it a hell of a lot harder to read.

I'm not sure how to do any better but I would be fascinated to see any new
images you generate :)

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #4660: float functions return -0

2009-02-17 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 The CVS commit message says
   Check for zero in unary minus floating point code (IEEE allows an
   explicit negative zero which looks ugly in a query result!).

 I'm of the opinion that minus zero was put into the IEEE floating point
 standard by people who know a great deal more about the topic than
 anyone on this list does, and that we do not have the expertise to be
 second-guessing how it should work.  Not long ago we took out code that
 was interfering with spec-compliant treatment of IEEE infinity; I think
 we should take out this code too.

If the original complaint was that it looked ugly in query results then the
right way to fix it would surely in float4out and float8out. Interfering with
IEEE floating points may be a bad idea but surely it's up to us how we want to
represent those values in text.

But without a convenient and widely used binary format that kind of restricts
our options. If we squash -0 on float[48]out then dumps will lose information.
So I guess there's nothing we can do about it now. I wonder if we're going to
find users complaining about things like displaying -0 matching results
though...
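
The behaviour being debated is easy to demonstrate (the first result assumes a build that preserves IEEE minus zero rather than squashing it):

```sql
SELECT float8 '-0';                 -- displays -0 once the squashing is removed
SELECT float8 '-0' = float8 '0';    -- true: minus zero compares equal to zero
```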


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SE-PostgreSQL and row level security

2009-02-16 Thread Gregory Stark
KaiGai Kohei kai...@ak.jp.nec.com writes:

 Martijn van Oosterhout wrote:
 On Mon, Feb 16, 2009 at 11:10:19AM +0900, KaiGai Kohei wrote:
 At the previous discussion, two items were pointed out.

 The one is called as covert channel. When a tuple with PK is refered by
 one or more tuples with FK, row-level control prevents to update or delete
 the PK, even if the FK is invisible from users. It allows users to infer
 existence of invisible FK. 
 
 One thing I keep missing in this discussion: the term row-level
 security in the above senstence in not the important part. Right now
 you can revoke SELECT permission on a table with a foreign key and it
 will still prevent UPDATEs and DELETEs of the primary key, allowing
 users to infer the existance of an invisible FK.
 
 This is the same covert channel, so why is it a problem for
 SE-Postgres and not for normal Postgres?

 Please note that I don't consider it is a problem, even if SE-PostgreSQL.

 Both of SE-PostgreSQL and vanilla PostgreSQL don't give an assurance to
 eliminate information leaks via such kind of covert channels, so they
 don't violate any specifications of them. Thus, it is not a problem.

If that's true then I don't see why we would try to automatically hide records
you don't have access to. The only reason to do so is to try to close these
covert channels and if we can't do that then I don't see any benefit to doing
so. 

If users want to select all matching records the user has access to they
should just put that in the WHERE clause (and we should provide a convenient
function to do so). If we implicitly put it in the WHERE clause then
effectively we're providing incorrect answers to the SQL query they did
submit.
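
In other words, under this argument the filtering would be spelled out explicitly, something like the following (the helper function is purely hypothetical):

```sql
-- Explicit, user-visible row filtering instead of implicit hiding:
SELECT * FROM accounts a
WHERE row_is_visible(a.security_label);  -- hypothetical helper function
```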

This is a big part of the breaking SQL semantics argument. Since the
automatic row hiding provides different answers than the SQL query is really
requesting it means we can't trust the results to follow the usual rules.

I think there's more to it though. Tom pointed out some respects in which the
hooks are too late and too low level to really know what privilege set is in
effect. The existing security checks are all performed earlier in plan
execution, not at low level row access routines. This is a more fundamental
change which you'll have to address before for *any* row level security scheme
even without the automatic data hiding.

So, assuming the SELinux integration for existing security checks is committed
for 8.4 I think the things you need to address for 8.5 will be:

1) Row level security checks in general (whether SELinux or native Postgres
   security model) and showing that the hooks are in the right places for
   Tom's concerns.

2) Dealing with the scaling to security labels for billions of objects and
   dealing with garbage collecting unused labels. I think it might be simpler
   to have security labels be explicitly allocated and dropped instead of
   creating them on demand.

3) The data hiding scheme -- which frankly I think is dead in the water. It
   amounts to a major change to the SQL semantics where every query
   effectively has a volatile function in it which produces different answers
   for different users. And it doesn't accomplish anything since the covert
   channels it attempts to address are still open.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] polyphase merge?

2009-02-04 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes:

 On Wed, 2009-02-04 at 10:55 +, Greg Stark wrote:
 Is this basically the same as our current algorithm but without
 multiplexing the tapes onto single files? I have been wondering
 whether we multiplex the tapes any better than filesystems can lay out
 separate files actually.

 I don't think you'll be able to do that more efficiently than we already
 do. Read the top of tuplesort.c

Huh?

The question I posed was whether we do it any better than filesystems do, not
whether we could do a better job. If filesystems can do as good a job then we
might as well create separate files for each tape and let the filesystem
decide where to allocate space. That would mean we could massively simplify
logtape.c and just create a separate file for every tape.

Possible reasons that filesystems could do better than us are that they have
access to more information about actual storage layout on disk, that they have
more information about hardware characteristics, and that it would give the
filesystem cache a better idea of the access pattern which logtape.c might be
confusing the picture of currently.

On the other hand possible reasons why filesystems would suck at this include
creating and deleting new files being a slow or locking operation on many
filesystems, and dealing with directories of large numbers of files being
poorly optimized on others.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hot Standby (v9d)

2009-02-03 Thread Gregory Stark
Hannu Krosing ha...@krosing.net writes:

 Actually we came up with a solution to this - use filesystem level
 snapshots (like LVM2+XFS or ZFS), and redirect backends with
 long-running queries to use fs snapshot mounted to a different
 mountpoint.

Uhm, how do you determine which snapshot to direct the backend to? There could
have been several generations of tuples in that tid since your query started.
Do you take a snapshot every time there's a vacuum-snapshot conflict and
record which snapshot goes with that snapshot?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] adding stuff to parser, question

2009-02-01 Thread Gregory Stark
Grzegorz Jaskiewicz g...@pointblue.com.pl writes:

 On 31 Jan 2009, at 17:17, Gregory Stark wrote:

 I don't see any reason offhand why it should have to be a reserved  word
 though. You should be able to make it an UNRESERVED_KEYWORD. Oh, and  you'll
 want to add it to the list of tokens in unreserved_keyword in gram.y  as
 well.

 removing it from keywords.c and adding it to the unreserved_keywords crowd
 didn't make it... so I'll stick with keywords.c for the time being.

I'm sorry if I was unclear. It needs to be in keywords.c but can probably be
marked as UNRESERVED_KEYWORD there rather than RESERVED_KEYWORD.

It has to be in the grammar rule for the corresponding production, either
reserved_keyword or unreserved_keyword.

In other words there are two places where you have to indicate whether it's
reserved or not, keywords.c and gram.y.

 So far I got mostly critique here, even tho - I haven't started much,  which 
 is
 quite sad in a way - because it is not very pro-creative, but  I'll still
 continue on with the patch - whatever the outcome.

Any change to the grammar meets the question of whether it conflicts with the
standard. That's just the way it is and doesn't reflect on you or your work.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] adding stuff to parser, question

2009-01-31 Thread Gregory Stark
Grzegorz Jaskiewicz g...@pointblue.com.pl writes:

You're going to kick yourself, but:


   {"table", TABLE, RESERVED_KEYWORD},
 + {"table", TABLES, RESERVED_KEYWORD},

   ^

I don't see any reason offhand why it should have to be a reserved word
though. You should be able to make it an UNRESERVED_KEYWORD. Oh, and you'll
want to add it to the list of tokens in unreserved_keyword in gram.y as well.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Space reservation v02

2009-01-30 Thread Gregory Stark
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:

 Zdenek Kotala wrote:
 Bruce Momjian wrote on Fri 2009-01-30 at 10:41 -0500:
 Well, I was thinking the new pg_class column would allow the upgrade to
 verify the pre-upgrade script was run properly, but a flat file works
 just as well if we assume we are going to pre-upgrade in one pass.

 Flat file or special table for pg_upgrade will work fine. 

 Right, there's no difference in what you can achieve, whether you store the
 additional info in a flat file, special table or extra pg_class columns. If 
 you
 can store something in pg_class, you can store it elsewhere just as well.

Well having a column in pg_class does have some advantages. Like, you could
look at the value from an sql session more easily. And if there are operations
which we know are unsafe -- such as adding columns -- we could clear it from
the server side easily.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to get SE-PostgreSQL acceptable

2009-01-29 Thread Gregory Stark
Robert Haas robertmh...@gmail.com writes:

 I'm wondering if this problem could be solved with a sort of
 mark-and-sweep garbage collection: 
...
 Then you can write something which goes through and sets all the rows
 to false and then visits every row of every table in the database and
 forces OID lookups on the security ID of each.  When you get done, any
 rows that still say false are unreferenced and can be killed.

This sounds awfully similar to the bitmap index vacuum problem. I wonder if
security labels could be implemented as some kind of funky special index.

Just thinking out loud. I don't have a well-formed idea based on this.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest infrastructure

2009-01-29 Thread Gregory Stark


I thought reviewboard looked pretty good for code quality patch review. It
would be cool if someone could write a mail filter which automatically added
any patches posted to the list to reviewboard.

Incidentally one issue with reviewboard/patchwork/whatever is that they tend
to encourage the review to do a line-by-line review of the code itself rather
than the overall architecture. That's fine for the kind of code-quality
reviews some authors (like myself I admit :( ) need sometimes. And it would
make it easier to do hit-and-run comments when you see minor issues not worth
pestering authors about on-list. But I don't think it really helps with the
hard reviews.


But that's just a cute tool for one particular part of the work. I don't think
it addresses workflow management like RT or debbugs (or trac?) would.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] 8.4 release planning

2009-01-29 Thread Gregory Stark
Robert Haas robertmh...@gmail.com writes:

 read up-thread, i've already shown that this would not be the case. remember,
 we reduce the pressure from the large, complex patches that bottleneck the
 process, which allows more parallel review/commit.

 I read what you wrote - I just don't believe it.  My own experience is
 that doing more releases is more work.  

Yeah, more releases than once a year is kind of crazy.

 Also, two commitfests per release means that if you can't get your patch up
 to snuff in two iterations, you're bumped.

I wish we could get rid of the whole concept and stigma of being bumped:
your patch will be released in the next release after it's committed. What does it
matter if there's been another release since you started or not?

ISTM there are two ways to make the release step take less time. Either we
sacrifice quality or we change the development process so more testing and
review happens during development. I see the latter as realistic if we hold
off committing (but not reviewing) any major patches submitted after
commitfest#1 until the next commitfest#1. That means when it comes time to
release there won't be any major changes in it that people haven't had months
to experiment with already.

I would still like an answer to my question about what steps there are that
take so many months for a release, but I expect most of them boil down to
(justified) paranoia about testing major features that people haven't already
tested outside of development environments.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Commitfest infrastructure

2009-01-29 Thread Gregory Stark
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:

 Peter Eisentraut wrote:
 On Thursday 29 January 2009 11:40:48 Stefan Kaltenbrunner wrote:
 well from a quick glance there is the bugzilla demo install as well as
 pieces of reviewboard and patchwork on the trackerdemo jail.

 So what's the URL and where can we sign up?

 note the pieces part of my mail :-) As far as I recall the patchworks 
 install
 somehow collided with the reviewboard one so it was disabled because Zdenek 
 was
 still actively using reviewboard.

For what it's worth Google seems to have rolled out reviewboard as a Google
App. We could use it hosted on there if we wanted.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Index Scan cost expression

2009-01-28 Thread Gregory Stark
Amit Gupta amit.pc.gu...@gmail.com writes:

 Moreover it only models a single index scan. It assumes nothing is cached
  prior to the index scan which is very much not true if we're repeatedly
  scanning similar ranges of keys.


 It's reasonable to assume that nothing is cached for estimating the cost.

Not really, but doing otherwise is just hard. There's nothing in the query to
give Postgres a hint about which tables are more heavily used and more likely
to be in cache than others.

 1 block I/O per index probe is a considerable cost.

Well it's closer to 0 than n...

 Well they're all different but I suspect the root of what you're observing are
 all the same thing. Cache doesn't affect any of these nodes unless we start
 with something in the cache from previous queries and we don't model that. We
 assume each query and even each plan node is run on a cold cache.

 Cost of evaluating operators depends heavily on available cache size,
 which is not considered by the Postgres optimizer in many places. For
 instance,
 - # I/O for sorting = T log_M T/M, where T is size of relation, and M
 is available memory.

 However, Postgres assumes a constant available memory of 1MB for sorting.
 Since sort is a blocking operator, which means that execution of other
 operators should halt when sorting is in progress, it should be able
 to hog all the available memory.

Well you're free to raise work_mem. Also, while sorting all happens in one
step the memory remains used after the sort is done. 

It's true that if work_mem is set low but there's lots of cache available it
will speed up spilled hashes and on-disk merge sorts. But you would be better
off raising work_mem in those cases. If they're spilling because they don't
fit in RAM at all then will cache still have an effect?
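As a rough numeric illustration of the T log_M T/M formula quoted above (a toy sketch only -- the function name, the pass-counting, and the 2T read+write factor per pass are my own simplifications, not tuplesort.c's logic):

```python
import math

def external_sort_io(T, M):
    """Rough page I/O for an external merge sort: T = relation size in
    pages, M = working memory in pages.  One run-formation pass plus
    ceil(log_{M-1}(T/M)) merge passes, each reading and writing T pages."""
    if T <= M:
        return 0  # fits in memory: no spill I/O at all
    initial_runs = math.ceil(T / M)
    merge_passes = math.ceil(math.log(initial_runs, M - 1))
    return 2 * T * (1 + merge_passes)  # each pass reads and writes T pages

# Doubling working memory here removes an entire merge pass:
print(external_sort_io(10000, 100))  # 60000 page I/Os
print(external_sort_io(10000, 200))  # 40000 page I/Os
```

Which is the sense in which raising work_mem directly pays off for spilled sorts, rather than leaving the memory to the OS cache.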

  I'm also not clear what kinds of formulas work for this. It has to be
  something that can be composed effectively. That is, even if a plan node
  doesn't want to discount itself at all for repetitions it has to include the
  discount that any subplans have asked for. For example a large sequential scan
  which expects to overflow effective_cache_size might not want to be discounted
  at all, but if it has a subquery which scans a small table it will want to
  discount that 100% for any repetitions since it'll be cached after the first
  scan.

 We also need robust statistics to feed into these complex cost
 expression for accurate estimation. For instance, Oracle lets user to
 analyze each column to create distribution graph using histograms.
 These histograms is used by the optimizer to figure out exact number
 of rows (and their values ) output by an operator.

Well we certainly compute histograms and use them for selectivity estimates.
The challenge in this case is that you need to combine the distribution from
the outer node with the distribution in the inner node to estimate how much
overlap in disk accesses will result. So you need more than histograms, you
need some kind of cross-table statistics.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] posix_fadvise v22

2009-01-28 Thread Gregory Stark

Tom Lane t...@sss.pgh.pa.us writes:

 What I intend to do over the next day or so is commit the prefetch
 infrastructure and the bitmap scan prefetch logic, but I'm bouncing the
 indexscan part back for rework.  I think that it should be implemented
 in or near index_getnext() and pay attention to
 effective_io_concurrency.

The biggest question I have here is about doing it at the index_* abstraction
level. I've looked pretty hard at how to do this and run into some pretty
major problems.

I don't see any way to do it without changing the access method api. The
index_* methods cannot start fiddling with the current scan position without
messing up things like CURRENT OF and mark/restore irrecoverably.

But if we're going to change the index am api then we lose all of the
advantages of putting the logic in indexam.c in the first place. It won't
help any other index am without special code in each one.

The best plan I came up with at this level is to add an am method 

amgetprefetchbitmap(IndexScanDesc scan, 
  ScanDirection direction, 
  TIDBitmap *bitmap, 
  int nblocks)

Which returns up to nblocks worth of bitmap starting from the current scan
position in the specified direction based on whatever's convenient to the
internal representation. I think it would be best if it stopped at the end
of the page at least if the next index page isn't in shared buffers.

Then nodeIndexscan.c would keep track of the value of target_prefetch just
like nodeBitmapHeapScan does, incrementing it whenever the caller continues
the scan and resetting it to zero if the direction changes.

However the getprefetchbitmap() call would only remove duplicates from the
upcoming blocks. It wouldn't know which blocks have already been prefetched
from previous calls. So nodeIndexscan would also have to remove duplicates
itself.

Splitting the work between three layers of abstraction like this is pretty
messy, creates a lot of duplicated work, and doesn't seem to buy us anything.
It *still* wouldn't help any non-btree index types until they add the new
method -- and if they add the new method they might as well just add the
USE_PREFETCH code anyways. I don't see how the new method is useful for
anything else.



I have another plan which would be a lot simpler but is a much more
brute-force approach. We could add a new scan pointer in addition to the
current position and the mark, along with a slew of new methods like
getnext_prefetch() and reset_prefetch() to reset it to the current position.
Also, add a hash table internal to PrefetchBuffer and have it return a boolean
indicating whether it actually did a prefetch. Then index_prefetch would reset
the prefetch pointer and scan forward using it, calling PrefetchBuffer on
*every* pointer and counting how many true returns it sees from PrefetchBuffer.*

*[Hm, not quite, we want to count recent prefetches that probably are still
queued separately from old ones that are probably in cache. And we also have
to think about how long to treat prefetches as probably still being in cache.
But with some additional thought I think this could be made to work.]
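A sketch of that bookkeeping (all names and the fixed-horizon eviction rule are hypothetical, standing in for the hash table inside PrefetchBuffer and the "probably still cached" heuristic):

```python
from collections import OrderedDict

class PrefetchTracker:
    """Remember recently prefetched blocks so re-requesting one is a
    no-op, and forget entries once enough other prefetches have gone
    by -- a crude stand-in for 'probably evicted by now'."""

    def __init__(self, horizon=256):
        self.horizon = horizon
        self.recent = OrderedDict()  # block number -> True, in LRU order

    def prefetch(self, block):
        """Return True if this call actually issued a (simulated) prefetch."""
        if block in self.recent:
            self.recent.move_to_end(block)
            return False  # assumed still queued or cached
        self.recent[block] = True
        if len(self.recent) > self.horizon:
            self.recent.popitem(last=False)  # expire the oldest entry
        return True

t = PrefetchTracker(horizon=2)
print([t.prefetch(b) for b in (10, 11, 10, 12, 10)])
# [True, True, False, True, False] -- the second 10 refreshed its entry
```

The open question flagged above -- distinguishing recent prefetches (still queued) from old ones (probably cached) -- would need a second, shorter horizon on top of this.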





-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] 8.4 release planning

2009-01-28 Thread Gregory Stark
Magnus Hagander mag...@hagander.net writes:

 Josh Berkus wrote:
 
 One client is planning on deploying a rather complex FS cloning
 infrastructure just to have a bunch of reporting, testing and read-only
 search databases they need.  They'd be thrilled with an HS feature which
 produced DBs which were an hour out of date (or even 6 hours out of
 date), but ran read-only queries.

 I have a lot of clients who would be thrilled to have stuff that's been
 in our tree for half a year by now, and they'd be thrilled to have it
 *now*. How much extra should we have them wait for the needs of your
 clients?

I really am unconvinced by the argument that delaying existing features is a
big deal. Logically it's less of a big deal than delaying HS a whole release
cycle which I already said I think isn't a big deal either. This is purely a
question of latency between development and release; we still get just as much
in each release, it's just 6-12 months later than it might have been.

What bothers me is delaying work on things like Bitmap Indexes which won't
really start in earnest until Gianni can get feedback from the lists after the
release. Or Join Removal which Simon isn't going to look at until after HS is
committed (not *released* -- once it's *committed* he'll be free to go on to
other things). This would impact *bandwidth* of development which I think is a
much bigger deal. It reduces the amount of new features in each release, not
just which release they fall in.

I'm a bit shocked by how long Tom expects the release cycle to take even if we
froze the code today. I guess I forget how long it takes and how many steps
there are from past releases. If it's going to be 9+ months between Nov 1st
and the first commitfest I'm worried about how many patches will be
languishing in the queue with their authors having moved on to other more
fruitful pastures in the mean time. If we delay further we're talking about
close to a year with developers left hanging.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Gregory Stark
I don't understand why you need unobserved_xids. We don't need this
in normal running: an xid we don't know for certain is committed is exactly
the same as a transaction we know is currently running or aborted. So why do
you need it during HS?

The comment says:

+  * This is very important because if we leave those xids out of the
+  * snapshot then they will appear to be already complete. Later, when they
+  * have actually completed this could lead to confusion as to whether those
+  * xids are visible or not, blowing a huge hole in MVCC.
+  * We need 'em.

But that doesn't sound rational to me. I'm not sure what confusion this
would cause. If they later actually complete then any existing snapshots would
still not see them. And any later snapshots wouldn't be confused by the
earlier conclusions.
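To spell out the semantics in question, here is a simplified model of the snapshot test (illustrative only -- the real PostgreSQL visibility code handles subtransactions, hint bits, and much more):

```python
def xid_in_progress(xid, xmin, xmax, xip):
    """True if the snapshot treats xid as still in progress (so its
    effects are invisible).  xmin/xmax bound the in-progress window
    and xip lists the in-progress xids inside that window."""
    if xid < xmin:
        return False  # finished before the snapshot was taken
    if xid >= xmax:
        return True   # began after the snapshot was taken
    return xid in xip # inside the window: consult the list

# Leaving a running xid out of xip makes it look 'already complete':
print(xid_in_progress(103, xmin=100, xmax=110, xip={103, 107}))  # True
print(xid_in_progress(103, xmin=100, xmax=110, xip={107}))       # False
```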

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] How to get SE-PostgreSQL acceptable

2009-01-28 Thread Gregory Stark
Joshua Brindle met...@manicmethod.com writes:

 partitions don't help because, once again, the application would be making the
 determination about which partition to query. 

Not necessarily since the database can be programmed to automatically put the
records into the right partition. Right now it's a pain but we're definitely
headed in that direction.

 Further, partitioning isn't fine grained. I can't say user X can read secret
 rows and read/write top secret rows and get that data out in a transparent way
 (the applications would have to be aware of the partitions). Relabeling of data
 also looks like a challenge with partitions (if I correctly understand how they
 work).

I think the transparent part is the source of the problem. The application should
issue a query for the data it wants. It shouldn't transparently get magic
extra clauses attached to the query. That's where the SQL semantics are being
violated.

Row-level security isn't inherently a problem. It's just that the security is
affecting the data returned that's causing a problem.

I don't think partitioning is really the same thing as row-level security. But
I wonder if some of the same infrastructure could be used for both -- once we
have it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Hot Standby (v9d)

2009-01-28 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes:

 On Wed, 2009-01-28 at 14:56 -0500, Tom Lane wrote:

 Well, those unexpectedly cancelled queries could have represented
 critical functionality too.  I think this argument calls the entire
 approach into question.  If there is no safe setting for the parameter
 then we need to find a way to not have the parameter.

 I see the opposite: We don't know what tradeoffs, if any, the user is
 willing to put up with, so we need input. 

Well, if you see it that way then it seems to me you should be arguing for
making max_standby_delay a mandatory parameter. Without it, don't start
allowing connections. I hadn't considered that and am not exactly sure where I
would stand on it.

 The essential choice is What would you like the max failover time to
 be?. Some users want one server with max 5 mins behind, some want two
 servers, one with 0 seconds behind, one with 12 hours behind

Sure. But if they don't configure one then we shouldn't impose one. You're
thinking of precisely one use case and taking positive action to interrupt the
user's requests on the basis of it. But there are plenty of other use cases. I
claim the default has to be to do as the user instructed without intervention.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] How to get SE-PostgreSQL acceptable

2009-01-28 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 I don't believe I will ever think that row-level checks are a good idea; as
 long as those are in the patch I will vote against applying it.

I think we're conflating two behaviours here. 

The data hiding behaviour clearly changes the semantics of queries in ways
that make a lot of deductions about the data incorrect. That's a pretty severe
problem which would cause massive consequences down the road.

The imposing of security restrictions based on the data in the row isn't
really in the same league. I'm not sure I see any semantic problems with it at
all.

It's still true that it would be quite invasive to Postgres to implement. The
current security restrictions are all checked and determined statically based
on the query. Once a query is planned we can execute it without checking any
security constraints at run-time.


I don't think raising partitioning makes a useful contribution to this
discussion. Firstly, it's not like partitioning doesn't change SQL semantics
in its own way anyways. Even aside from that, partitioning is largely about
the location that data is stored. Forcing data to be stored in different
physical places just because our security model is based on the place the data
is stored is kind of silly.

Unless perhaps we implement partitioning which supports having many partitions
share the same underlying, er, physical partition. But then I don't see how
that's any different from row-level permissions.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Index Scan cost expression

2009-01-27 Thread Gregory Stark
Amit Gupta amit.pc.gu...@gmail.com writes:

 While trying to figure out an appropriate cost expression function for
 Thick indexes, I learned that we are using the Mackert and Lohman formula
 (described in their paper "Index Scans Using a Finite LRU Buffer: A
 Validated I/O Model", ACM Transactions on Database Systems).
...
 Please note that the above result only computes _heap_ page reads.

Moreover it only models a single index scan. It assumes nothing is cached
prior to the index scan which is very much not true if we're repeatedly
scanning similar ranges of keys.

Omitting the index pages is a very coarse attempt to model the caching across
multiple plan invocations since upper level index pages will nearly always be
cached and even lower index pages have a good chance of being cached.

The problem is that modeling cross-plan-invocation caching is a hard problem
we have few ideas for.
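For reference, my reading of the Mackert and Lohman approximation as the planner applies it (an illustrative transcription -- see index_pages_fetched() in costsize.c for the authoritative version):

```python
def ml_pages_fetched(tuples_fetched, T, b):
    """Estimated heap page fetches for an index scan.
    T = pages in the table, b = buffer/cache size in pages,
    tuples_fetched = Ns in Mackert & Lohman's notation."""
    Ns = float(tuples_fetched)
    if T <= b:
        # whole table can stay cached: fetches asymptote to T
        return min(2.0 * T * Ns / (2.0 * T + Ns), float(T))
    lim = 2.0 * T * b / (2.0 * T - b)  # point where the cache overflows
    if Ns <= lim:
        return 2.0 * T * Ns / (2.0 * T + Ns)
    # past the overflow point, evicted pages get refetched, so the
    # estimate can exceed T: each extra tuple costs (T - b)/T of a page
    return b + (Ns - lim) * (T - b) / T

print(ml_pages_fetched(200, T=100, b=1000))    # capped at 100 pages
print(ml_pages_fetched(10**6, T=1000, b=100))  # far more than T: refetches
```

Note that only heap pages are modeled; the assumption that index pages are cached is, as described above, a separate coarse adjustment.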

 Upon taking a cursory look at the cost functions of other operators, I
 realized that available memory (effective_cache_size) is not
 considered for estimating the costs of hash/sort/NLjoin/etc. Why is
 that the case?

Well they're all different but I suspect the root of what you're observing are
all the same thing. Cache doesn't affect any of these nodes unless we start
with something in the cache from previous queries and we don't model that. We
assume each query and even each plan node is run on a cold cache.

The problem is that how much to discount the cost of the inner node depends,
not only on the type of node it is, but also on the types of parameters it
will be called with. So it needs very different results for something like a
nested loop between an outer table with few closely spaced values and another
with an outer table with values sparsely spread throughout the inner table.

This is complicated by the fact that the repetitions don't necessarily come
from the parent of the plan in question. You could have, for example, a
subquery several nodes down from the scan that causes repetitions.

I think the only way to tackle it is to come up with some parametric formula
for how much to discount repetitions and carry the parameters of that formula
on every plan node. So when a node needs to add the cost of n repetitions of a
lower node it applies that formula using the parameters advertised by the
sub-node.

The tricky part of that is coming up with a formula and figuring parameters to
model plan nodes using it. Consider that how much to discount repetitions will
depend heavily on what distribution of parameters the plan is executed with.

I'm also not clear what kinds of formulas work for this. It has to be
something that can be composed effectively. That is, even if a plan node
doesn't want to discount itself at all for repetitions it has to include the
discount that any subplans have asked for. For example a large sequential scan
which expects to overflow effective_cache_size might not want to be discounted
at all, but if it has a subquery which scans a small table it will want to
discount that 100% for any repetitions since it'll be cached after the first
scan.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] More FOR UPDATE/FOR SHARE problems

2009-01-27 Thread Gregory Stark
Kevin Grittner kevin.gritt...@wicourts.gov writes:

 I think Greg has it right: without predicate locking we can't really
 achieve the behavior you're expecting. So how would we better approach the
 semantics you want without it?
  
 Well, this thread was talking about dealing with situations where
 queries using FOR UPDATE/FOR SHARE return something other than what is
 requested, or results based on viewing only part of what was committed
 by another transaction.  My feeling is that we should be looking
 harder at recognizing these cases and rolling back a transaction with
 a serialization failure before returning bad data.  

Well that's precisely what our SERIALIZABLE isolation mode does.

What I thought was intriguing was the thought of applying the serializable
logic to individual commands instead of the whole transaction. That is, repeat
the same command with a new snapshot instead of having to restart the whole
transaction over again. That's something you can't actually emulate right now.
The only way to get notified by Postgres that there's been an update is to be
in serializable mode which has to be set for the whole transaction, so even if
you catch the error you don't get a new snapshot.
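What clients do today is retry at whole-transaction granularity; a sketch (the exception class here simulates SQLSTATE 40001 rather than coming from a real driver):

```python
class SerializationFailure(Exception):
    """Stand-in for SQLSTATE 40001 raised by a serializable transaction."""

def run_transaction(txn_body, max_retries=5):
    """Retry the *whole* transaction on serialization failure -- the
    granularity available today.  The statement-level variant discussed
    above would instead re-run only the failed command with a fresh
    snapshot, keeping the rest of the transaction's work."""
    for _attempt in range(max_retries):
        try:
            return txn_body()
        except SerializationFailure:
            continue  # rolled back; start over with a new snapshot
    raise RuntimeError("gave up after %d retries" % max_retries)

# Simulated workload that conflicts twice, then succeeds:
attempts = []
def body():
    attempts.append(1)
    if len(attempts) < 3:
        raise SerializationFailure()
    return "committed"

print(run_transaction(body))  # committed (on the third attempt)
```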

 This doesn't seem to me to be that different from other situations
 where people have said It's easy to return results quickly if you
 don't care whether they're accurate.

Well I think the fundamental point is that SELECT FOR UPDATE gives you the
right data *for a subsequent update* of that row. If you never actually peek
at that data except to use it in a subsequent update then you get the same
results as if you had just issued the UPDATE and the resulting data in the
table is reasonable. If you use the data for purposes external to that row
then strange things do indeed result.

 Regarding the broader issues -- during discussion of documentation for
 the anomalies in snapshot isolation I was made aware of recent work,
 published by the ACM last year, which provides techniques for a more
 general and comprehensive solution.  

Hopefully it's not patent encumbered? Might be better not to check actually.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] 8.4 release planning

2009-01-27 Thread Gregory Stark
Joshua D. Drake j...@commandprompt.com writes:

 This is my take as well. This is very real, very scary things that are
 being worked on. HS should only ship after a very, very long non change
 cycle (meaning no significant bugs (or refactoring) found in HS patch
 for X period of time)... say after a full 8.5 dev cycle. I do not want
 to commit this patch and then have to yank it out 3 months from now.

In general I'm for planning large features with the potential to break
existing functionality going in the beginning of cycles. I don't think that's
the same as no changes though. The reason we make changes is because they're
believed to be for the better. 

The point in my mind is to get more people playing with the new feature in
contexts that *aren't* expected by the developers. Developers are notoriously
bad at testing their work no matter how diligent they are they just don't
think of things they didn't anticipate when they're coding. (Which is only
logical -- surely they would have just written it right the first time if they
anticipated the problems...)

 Lastly, the last time a developer told me two weeks it was 3 months.
 Unless we get a written development plan that describes specifically
 what, when, why and how long I am severely suspect that Heikki or Simon
 have a clue on an actual deliverable time line (no offense guys).

Well, Simon's been pretty impressively bang-on with his estimates for his
*development* projects going back at least to async-commit.

The *review* process, however, is inherently hard to estimate though. I doubt
anyone will give Tom better than even odds on his side bet, even if that's our
best estimate.

Simon has been refactoring and recoding based on Heikki's suggestions as fast
as he's been proposing them though. It seems the question isn't how fast Simon
will get the work done so much as how many items we'll want to change before
committing it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] 8.4 release planning

2009-01-27 Thread Gregory Stark

Tom Lane t...@sss.pgh.pa.us writes:

 Robert Haas robertmh...@gmail.com writes:
 Yeah, people like certification, but they also like products that work.
 Did you stop reading before getting to my non-security-based complaints?

 I read them, but I suspect they are issues that can be addressed.  How
 would any of this affect join removal, anyway?

 It would prevent us from making optimizations that assume foreign key
 constraints hold; which is a performance issue not a covert-channel
 issue.

It does seem weird to simply omit records rather than throw an error and
require the user to use a where clause, even if it's something like WHERE
pg_accessible(tab).

I wonder if we need a special kind of referential integrity trigger which
requires that the privileges on a source row be a superset of the privileges
on the target row. 

Can you even test superset on these privileges? Or are they too general for
that? And would you have trouble adjusting the privileges later because giving
someone access to a label would require checking every row to see if they have
access to every referenced row too?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] V4 of PITR performance improvement for 8.4

2009-01-26 Thread Gregory Stark

Koichi Suzuki koichi@gmail.com writes:

 It's simply because we should not refer to system catalog during the recovery.

I don't understand how this is connected to anything to do with prefetching?


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Gregory Stark

Simon Riggs si...@2ndquadrant.com writes:

 I fail to see how rejecting unreviewed patches provides benefit for
 users, developers or sponsors. 

Nobody has suggested rejecting either sync replication or standby database.
The debate here is over whether to commit into 8.4 or into 8.5.

Put another way, the choice here is whether to have a half-baked delayed 8.4
release in 6 months or a polished on-time 8.5 release in 12 months. Either way
the feature ships and on a not terribly different timeline either.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Gregory Stark
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:

 Not really, except maybe started 6 months too late. Big patches simply take a
 long time to mature.

 Looking back at the timeline for hot standby, it doesn't look unreasonable at
 all:

 September: First discussion about the user-visible behavior, transaction
 isolation etc. 

Is that right? I had the impression Simon had started working on it
previously. If so then given that feature freeze was scheduled to be November
1st starting in September does seem like it was more than a bit unrealistic.

Large patches should really be targeted towards the *first* commitfest of the
cycle, not the last. Targeting the last is putting all your eggs in one basket
as far as getting everything right the first time.

Someone else asked why this works for Linux. The reason it works is because
work is *always* committed early in the release cycle. The first couple weeks
of the release cycle are the *only* time significant patches are taken. 

The entire rest of the cycle is spent in feature freeze with development
happening exclusively on subsystem maintainer's private branches. When the
next release cycle begins subsystem maintainers send up patches for all the
changes that have happened since the last cycle that they think are ready for
release.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Gregory Stark

Tom Lane t...@sss.pgh.pa.us writes:

 Gregory Stark st...@enterprisedb.com writes:
 Put another way, the choice here is whether to have a half-baked delayed 8.4
 release in 6 months or a polished on-time 8.5 release in 12 months. Either way
 the feature ships and on a not terribly different timeline either.

 This is pretty much exactly how I see it.  *Hot standby is not ready*,
 and committing it into 8.4 isn't going to magically make that better.
 The earliest we are going to have a HS feature that I would trust my
 data to is probably ten-twelve months off.  The decision we need to
 make now is whether that release will be called 8.4 or 8.5; in the
 former case meaning that all the stuff already in 8.4 will not reach
 users' hands for close to a year more.

I'm not even so much worried about the stuff already in 8.4. I'm worried about
the queue for the first 8.5 commitfest. Nobody's going to be reviewing new
patches until 8.4 is out which will leave all that stuff sitting in limbo for
months. Worse, it'll take months to sort through it and nobody will be able to
work on anything which depends on stuff in that queue for, uh, 2*months.

This is stuff like sync replication, join removal, the posix_fadvise for WAL
replay, index-only scans, etc. If this stuff isn't ready to be committed until
next September we'll be back in the same boat again.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Dave Page dp...@pgadmin.org writes:
 On Mon, Jan 26, 2009 at 3:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 This is pretty much exactly how I see it.  *Hot standby is not ready*,

 So can you give us an idea of what parts of the code are in need of
 rethinking etc? I assume you've looked at it now if you can estimate
 it's going to take another 10-12 months?

 No, I'm just estimating that based on the amount of design churn that's
 still going on according to the mailing list discussions.  

That's, uh, not a very useful metric. It should argue in *favour* of the patch
being near completion that the patch is getting feedback and Simon is rapidly
taking such suggestions to heart and making changes responsively.

And incidentally most of the changes Heikki's been making have been
simplifications. Every bit which is simplified is one fewer area to find
further problems in. It's not like he's been finding problems which require
new code which in turn could have more problems.

 I haven't looked at the code at all. (If you expect me to sign off on it you
 can figure it'll be a couple of months even for that to happen...)

Well, given that you're basing your judgements on the mailing list traffic and
haven't read the code I guess we have to go on Heikki's judgement that there
are about two known weeks of work and possibly more afterwards.

The real question is how long will the beta period be with or without Hot
Standby? How many users have been pounding on it so far and how many more will
we get in beta?

If we assume it's about a month before the patch is committed, and then the
beta period is an extra month or two long (on top of what I assume will be
about a month) then we're looking at 3-4 months before 8.4 or about May.

I fear that 3-4 months will already be long enough to back up development on
8.5 and cause a big queue for the first commitfest. Frankly, at 3 months I
think feature freeze has _already_ gone on too long, I don't think we need the
doom scenario of 10-12 months before we create a serious problem again.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Gregory Stark

Merlin Moncure mmonc...@gmail.com writes:

 HS is working very well (Simon's ongoing work aside).  I am pretty
 confident based on my personal testing that it would represent the
 project well if committed today.

I think a lot of people weren't aware there was anybody testing this patch
other than Simon and Heikki -- I wasn't until just today. I wonder how many
more people are trying it out?

This is one of the changes for the better from the past. Though I still think
a lot more would try it out once it's committed.

Here's a thought experiment. If it was committable *today* would we be willing
to go with it and plan to release with it? Assume that it would *still* mean a
longer beta process, so it would still mean releasing in, say April instead of
February or March.

While personally I want us to switch to a mode where we commit large patches
early in the development cycle I don't believe we would refuse to commit it
today if it was ready. And I can't imagine two weeks would make the
difference either.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] More FOR UPDATE/FOR SHARE problems

2009-01-26 Thread Gregory Stark
Jeff Davis pg...@j-davis.com writes:

 It seems like it would be a challenge to know that the tuple with i=3
 would be updated to a value that matches the search condition j=10. So
 can you tell me a little more about the mechanism by which Sybase solves
 this problem?

This example is a case of the same issue we were discussing earlier involving
predicate locking. To solve it you need a way to lock records that your
query *isn't* accessing and may not even exist yet to prevent them from being
turned into (or inserted as) records your query should be accessing.

As Kevin described it earlier Sybase locks the index pages containing the key
range you're accessing preventing anyone from inserting new index pointers in
that range. If there's no index it locks the entire table on every select to
prevent any updates or inserts in the table until your transaction finishes.

In any case note that your example is not *serializable*. (Though in Postgres
it can happen even in serializable mode, so that's not much of a defence.) I'm
unclear whether it manifests any of the phenomena which are prohibited
for READ COMMITTED.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] 8.4 release planning

2009-01-26 Thread Gregory Stark

Ron Mayer rm...@cheapcomplexdevices.com writes:

 I realize in the current system (emailed patches), this would be a horrible
 pain to maintain such a branch; but perhaps some of the burden could be
 pushed down to the patch submitters (asking them to merge their own changes
 into this merged branch).   

I've considered maintaining such a repository a few times and dismissed it
when I realized how much work it would be to maintain.

 And I hate bringing up the version control flame war again; but git really
 would make this easier. If all patches were on their own branches; the
 painful merges into this shared branch would be rare, as the source control
 system would remember the painful parts of the merges.

We have git repositories, I still think maintaining a merged tree with dozens
of patches would be a lot of work.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] SEPostgres WAS: 8.4 release planning

2009-01-26 Thread Gregory Stark
Joshua Brindle met...@manicmethod.com writes:

 Yes, I will look at them to the extent I am able. As I am not familiar with 
 the
 postgresql codebase I won't be able to assert the correctness of the hook
 placement (that is, where the security functions are called with respect to 
 the
 data they are protecting being accessed). The postgresql community should be
 more familiar with the hook call sites and hopefully can assist there.

I would suggest looking at the documentation (which I assume the patch does
update). If it's not clear that things are in order from the documentation
then either it needs better documentation or something's wrong...

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] V4 of PITR performance improvement for 8.4

2009-01-25 Thread Gregory Stark

Koichi Suzuki koichi@gmail.com writes:

 Please find enclosed 2nd patch of pg_readahead which include a patch
 to bufer manager to skip prefetch of pages already in shared buffer.

I'm a bit confused by this comment. PrefetchBuffer already checks if the page
is in shared buffers. 

What is tricky to avoid is prefetching the same page twice -- since the first
prefetch doesn't actually put it in shared buffers there's no way to avoid
prefetching it again unless you keep some kind of hash of recently prefetched
buffers.

For the index scan case I'm debating about whether to add such a cache
directly to PrefetchBuffer -- in which case it would remember if some other
scan prefetched the same buffer -- or to keep it in the index scan code.
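
To illustrate the kind of cache I have in mind (the names and sizes here are
made up for illustration, not from any actual patch), something as dumb as a
small direct-mapped table of recently prefetched block numbers would do:

```c
#include <stdbool.h>
#include <stdint.h>

#define PREFETCH_CACHE_SIZE 256     /* illustrative size */

static uint32_t prefetch_cache[PREFETCH_CACHE_SIZE];
static bool     prefetch_cache_valid[PREFETCH_CACHE_SIZE];

/*
 * Return true if this block was prefetched recently; otherwise remember it
 * and return false, telling the caller to go ahead and issue the prefetch.
 */
static bool
already_prefetched(uint32_t blockno)
{
    uint32_t slot = blockno % PREFETCH_CACHE_SIZE;

    if (prefetch_cache_valid[slot] && prefetch_cache[slot] == blockno)
        return true;

    prefetch_cache[slot] = blockno;     /* evict whatever was in the slot */
    prefetch_cache_valid[slot] = true;
    return false;
}
```

A collision only evicts an old entry, and a stale entry only costs one
redundant prefetch, so the cache can stay small and approximate.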

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] [COMMITTERS] pgsql: Automatic view update rules Bernd Helmle

2009-01-24 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 ON UPDATE DO INSTEAD SUBSTITUTE base_table [ (base_column_name, ...) ]
   [ WHERE ... ]

 ON UPDATE DO INSTEAD
   UPDATE base_table SET base_col_1 = new.derived_col_1, base_col_2 ...
   WHERE CURRENT OF VIEW;

What would happen with these if the view is defined with SELECT * and I add
a new column or drop columns from the table? It seems like the former with the
optional list of columns would magically apply to the new columns which would
make it behave differently from the normal select rule. Or would you expand an
omitted column list like we do with select *?

In any case the fact that the latter allows you to extend things with computed
values seems pretty attractive. We could always allow shortcuts like SET *
WHERE CURRENT OF VIEW analogous to SELECT * for manually created views. We
could also allow the rhs of the expressions to be skipped so you could do

UPDATE base_table SET col1, col2, col, base_col = new.derived_col - 1
 WHERE CURRENT OF VIEW

This same machinery isn't present in the normal executor is it? I mean, if I
can update a view then ISTM I should be able to update a view written inline
in the query like:

 UPDATE (select * from a where x=1) set y=2

just like I can with SELECTs. This does incidentally work in Oracle and is its
way of doing what we do with UPDATE...FROM. It's the only way AFAIK to get
merge join update plans out of it.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] More FOR UPDATE/FOR SHARE problems

2009-01-24 Thread Gregory Stark
Jeff Davis pg...@j-davis.com writes:

 On Sat, 2009-01-24 at 19:45 +, Greg Stark wrote:
 There already is quite an extensive discussion of how FOR UPDATE  
 behaves including these kinds of violations.

 Not in the documentation, that I can see. And I think it's important
 that it be there for the reasons I mentioned.

 Can you refer me to the dicussion that you're talking about? I don't
 remember any discussion that points out that FOR UPDATE/FOR SHARE is
 broken in the simple case of a simple WHERE clause.

http://www.postgresql.org/docs/8.3/static/transaction-iso.html#XACT-READ-COMMITTED

 Because of the above rule, it is possible for an updating command to see an
 inconsistent snapshot: it can see the effects of concurrent updating commands
 that affected the same rows it is trying to update, but it does not see
 effects of those commands on other rows in the database. This behavior makes
 Read Committed mode unsuitable for commands that involve complex search
 conditions. However, it is just right for simpler cases. For example, consider
 updating bank balances with transactions like
 ...

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Controlling hot standby

2009-01-23 Thread Gregory Stark
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:

 Merlin Moncure wrote:
 Is 'hot standby' going to be the official moniker for the feature?
 (not 'standby replication', or something else?).  I wonder if we
 should pick something more descriptive.

 Could also be something like allow_connections_during_recovery.

 I'd keep the word replication out of this..

Just wondering, but does this make more sense as a recovery.conf parameter?


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes:

 The original design of Postgres allowed pluggable index access methods,
 but that capability has not been brought forward to allow for WAL. This
 patch would bridge that gap.

Well I think what people do is what GIST did early on -- they just don't
support recoverability until they get merged into core.

Nonetheless this *would* be a worthwhile problem to put effort into solving. I
agree that there are lots of exotic index methods out there that it would be
good to be able to develop externally.

But to do that we need an abstract interface that doesn't depend on internal
data structures, not a generic plugin facility that allows the plugin to
hijack the whole system.

We need something more like indexams which provides a set of call points which
do specific functions, only get called when they're needed, and are expected
to only do the one thing they've been asked to do.

This could be a bit tricky since the catalog isn't available to the WAL replay
system. We can't just store the info needed in the pg_indexam table. And it
has to span all the databases in the cluster in any case.

Perhaps this should be solved along with the plugins thread. Binary modules
could have some way to register their rmgr id so you could guarantee that
there aren't two plugins with conflicting rmgr ids or version mismatches.
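
A sketch of what such a registration call might look like (everything here --
the names, the limit, the module strings -- is hypothetical, nothing from the
actual patch):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_RMGR_ID 255             /* illustrative limit */

static const char *rmgr_registry[MAX_RMGR_ID + 1];

/*
 * Claim an rmgr id for a module at load time.  Returns true on success;
 * false if the id is out of range or already claimed by a different
 * module, so conflicting plugins fail loudly at load time instead of
 * silently producing WAL records neither of them can replay.
 */
static bool
register_rmgr_id(unsigned int id, const char *module_name)
{
    if (id > MAX_RMGR_ID)
        return false;
    if (rmgr_registry[id] != NULL)
        return strcmp(rmgr_registry[id], module_name) == 0;
    rmgr_registry[id] = module_name;
    return true;
}
```

Version mismatches could be caught the same way by registering a version
string alongside the id.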


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] FWD: Re: Updated backslash consistency patch

2009-01-21 Thread Gregory Stark
Martin Pihlak martin.pih...@gmail.com writes:

 Bruce Momjian wrote:
 Bruce Momjian wrote:
 I know we don't like the current behavior, but I think we need to make
 them consistent first for easy testing and so when we change it, it will
 remain consistent.

 I will work on a consensus patch soon for the new behavior.
 

 The \dXU *.* commands also display objects from information_schema. IMHO
 these should also be classified as system objects. It is most annoying to
 run '\dfU *.*' and see a list of information_schema internal functions show 
 up.
 Whereas the intent was to see the user defined functions in all schemas.


You know I think I've come around to agreeing with one of Tom's proposals.

I think we should do the following:

\dX   : list user objects
\dXS  : list system objects
\dX pat : list all matching objects based on search_path
\dX *.*   : list all objects in all schemas

I've basically come to the conclusion that having the output agree with
behaviour at run-time is a big advantage and anything else would actually be
too dangerous.

If you do something like \dt p* or \df a* and are annoyed by the output
you just have to make your pattern something more specific. For tables we
already prefix them all with pg_ so one more letter ought to be enough. 

For functions it would be nice if we could trim the output quite a bit. I
wonder if we could rename all our internal functions which implement operators
and indexam methods without introducing any backwards compatibility issues. We
don't document things like int4gt after all.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] rmgr hooks (v2)

2009-01-21 Thread Gregory Stark

Simon Riggs si...@2ndquadrant.com writes:

 On Wed, 2009-01-21 at 14:28 +, Greg Stark wrote:

 The only advantage that remains, I think, is the real-world concern
 that you can have proprietary plugins 

 ** I have no plans for selling software that has been enabled by this
 patch. **

Hm, I didn't specifically mean this. However I'm not sure why this would be
considered so prejudicial. The Postgres project isn't generally hostile to
commercial use and extensions. If there was something you *did* want to sell
based on this and you needed a clean, generally useful interface to do it then
I think it would be an argument in *favour* of providing it, not against.

But I meant more generally, that the real-world use case for a generic rmgr
plugin function is for providing interfaces for things which cannot -- for
whatever non-code-related reason -- be integrated in core. That is, from a
code point of view they would be best integrated in core. So either they're
not generally useful, not production quality, not license compatible, or
whatever.

 The plugin approach was suggested because it brings together so many use
 cases in one and adds missing robustness to a case where we already have
 extensibility. Extensibility is about doing things for specific
 implementations *without* needing to patch Postgres, not just allowing
 external projects to exist alongside.

I think a generic plugin architecture is *too* many use cases. That is it's
too flexible and doesn't make any promises at all of what it's intended to do.
As a result the system can't be sure it's calling the right method, can't
detect conflicts or catch errors. There's a sweet spot of abstraction where
the interface has to be specific enough to be useful but general enough to
cover all the use cases.

I'm not sure though, your comments in the other email make me think there
might be more to the patch that I had the impression was there. Will now go
read the patch and see if I was mistaken.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Pluggable Indexes

2009-01-21 Thread Gregory Stark
Josh Berkus j...@agliodbs.com writes:

 Right.  And I'm saying that pluggability is PostgreSQL's main reason for
 existence, if you look at our place in the future of databases.  So it's worth
 paying *some* cost, provided that the cost/benefit ratio works for the
 particular patch.

I agree that pluggability is a huge deal for Postgres. But note that the
interface is critical. If we provided a plugin architecture for functions and
operators which was simply a hook where you replaced part of the
infrastructure of the parser and executor it would be pointless. 

Instead we provide an interface where your function has to know as little as
possible about the rest of the system. And the parser and executor get enough
information about your function that they can do most of the work. That you
can create a new operator in Postgres *without* knowing how operators actually
are implemented and without worrying about what other operators exist is what
makes the feature so useful.

This is made a lot harder with WAL because a) it spans the entire cluster, not
just a database so any meta-information has to be stored somewhere global and
b) the consequences for getting something wrong are so much more dire. The
entire cluster is dead and can't even be restored from backup.

 To rephrase: I can't judge the rmgr patch one way or the other.  I'm only
 objecting to the idea expressed by Heikki and others that pluggable indexes 
 are
 stupid and unnecessary.

Well we support pluggable indexes -- they just can't be recoverable right now.
Presumably if they're merged into the core database they would have
recoverability added like how GIST progressed.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] Fixes for compiler warnings

2009-01-18 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 The really nasty cases are like this:

   const char *myfmt = gettext_noop("Some bleat about object \"%s\".");

   ...

   errmsg(myfmt, objectname)

 where there really is no simple way to convince the compiler that you
 know what you're doing without breaking functionality.  This is probably
 why -Wformat-security doesn't warn about the latter type of usage.  It
 does kind of beg the question of why bother with that warning though ...

It makes sense to me: if you have arguments for the format string then
presumably you've at some point had to check that the format string has
escapes for those arguments.

The only danger in the coding style comes from the possibility that there are
escapes you didn't anticipate. It's a lot harder to expect specific non-zero
escapes and find something else than to just not think about it at all and
unknowingly depend on having no escapes.

And it would take willful ignorance to depend on having some specific set of
escapes in an unchecked string provided by an external data source, which is
where the worst danger lies.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] MemoryContextSwitchTo (Re: [GENERAL] Autovacuum daemon terminated by signal 11)

2009-01-17 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Simon Riggs si...@2ndquadrant.com writes:
 Can we add something to memory contexts to make this fail every time?

 No, not really.  AFAICS the reason for Alvaro not seeing it must be that
 on his machine the new transaction happens to allocate its
 TopTransactionContext control block right in the same place where the
 old one was.

 We could have a debugging mode in which pfree'd space is never recycled
 for reuse (just filled with 0xdeadbeef or whatever and left to sit).

Hm, I wonder how much more practical it would be if we recycled it but offset
the pointer by maxalign. We would waste 4/8 bytes per palloc/free cycle
instead of the whole chunk.

(Whether palloc could actually do this would be another question, there are
lots of allocator algorithms that wouldn't be able to, I think.)

If we had a more formalized save_memory_context()/restore_memory_context()
which gave you more than just a pointer we could do better for this particular
case. save_memory_context() could hand you a struct with a pointer as well as
some sanity check values about the context you're saving.
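
Roughly what I'm imagining (the struct and fields below are invented for
illustration; the real MemoryContextData looks nothing like this):

```c
#include <stdint.h>

/*
 * Invented stand-in for a memory context, carrying a generation counter
 * bumped whenever a context is (re)created, so a saved reference can
 * detect that the context it points at has been destroyed and recreated
 * at the same address -- the failure mode discussed above.
 */
typedef struct ToyMemoryContext
{
    uint32_t    generation;
    const char *name;
} ToyMemoryContext;

typedef struct SavedContext
{
    ToyMemoryContext *context;
    uint32_t          generation;   /* value at save time */
} SavedContext;

static SavedContext
save_memory_context(ToyMemoryContext *ctx)
{
    SavedContext saved = { ctx, ctx->generation };
    return saved;
}

/* Returns 1 if the saved context is still the same instance, 0 otherwise. */
static int
restore_memory_context_ok(SavedContext saved)
{
    return saved.context->generation == saved.generation;
}
```

restore_memory_context() could then elog rather than silently switching into
a context that merely happens to live at the old address.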

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] Fixes for compiler warnings

2009-01-17 Thread Gregory Stark

Peter Eisentraut pete...@gmx.net writes:

 You apparently have your compiler configured with -Wformat-security.  Our 
 code 
 doesn't do that.  I think the cases the warning complains about are fine and 
 the way the warning is designed is a bit bogus.

Hm, only a bit. You know, we've had precisely this bug at least once not that
long ago. And the way the warning is designed, it won't fire any false
positives except in cases that are easily avoided.

There's an argument to be made that the code is easier to audit if you put the
%s format string in explicitly too. Even if the current code is correct you
have to trace the variable back up to its source to be sure. If you add the
escape then you can see that the code is safe just from that line of code
alone.
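
For example (a made-up helper, not actual Postgres code), compare the two
snprintf calls:

```c
#include <stdio.h>

/*
 * Hypothetical helper copying an externally supplied object name into a
 * message.  Passing `name` itself as the format string -- the commented-out
 * line -- is what -Wformat-security complains about: any %n or %s in the
 * input would be interpreted.  With an explicit "%s" format the line is
 * auditable on its own, without tracing `name` back to its source.
 */
static void
format_bleat(char *dest, size_t destlen, const char *name)
{
    /* snprintf(dest, destlen, name);    unsafe if name contains escapes */
    snprintf(dest, destlen, "Some bleat about object \"%s\".", name);
}
```

With the safe form, hostile input like "mytable%n" is just copied literally
instead of being treated as format escapes.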

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] Visibility map and freezing

2009-01-16 Thread Gregory Stark
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:

 Jeff Davis wrote:
 On Fri, 2009-01-09 at 13:49 +0200, Heikki Linnakangas wrote:

 I'm now leaning towards:

 autovacuum_freeze_max_age
 vacuum_freeze_table_age
 vacuum_freeze_min_age

 where autovacuum_freeze_max_age and vacuum_freeze_min_age are unchanged, and
 vacuum_freeze_table_age is the new setting that controls when VACUUM or
 autovacuum should perform a full scan of the table to advance relfrozenxid.

 I'm still bothered by the fact that max and min really mean the same
 thing here.

 Yeah. Those are existing names, though, and I don't recall any complaints from
 users.

 I don't think we can perfectly capture the meaning of these GUCs in the
 name. I think our goal should be to avoid confusion between them.

 Agreed.

I was thinking it would be clearer if the options which control *when*
autovacuum fires off a worker consistently had some action word in them like
trigger or start or launch.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] FWD: Re: Updated backslash consistency patch

2009-01-16 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Well, as I said before, I'm not averse to having the default behavior
 *with no pattern* to be that we omit system objects --- and I think we
 could make that apply across the board.  What I'm saying is that when
 you give a pattern it should not matter whether an object is system or
 user.  It would go like this:

 \df   -- all non-system functions
 \df sin   -- the active (visible) definition of sin()
 \df sin*  -- all visible functions starting with sin

 lesser used cases:

 \df * -- all visible functions
 \df *.*   -- all functions, period
 \df pg_catalog.*  -- all system functions
 \df public.*  -- all functions in public

So it seems to me that we made \df consistent with \dt when in fact what we
really wanted is for it to be consistent with \d. I hadn't actually realized
that the behaviour for \d was so different from \dt myself.

The behaviour of \dt in the face of tables which shadow system tables is
actually even stranger:

postgres=# create table pg_proc (t text);
CREATE TABLE
postgres=# commit;
COMMIT
postgres=# \dt pg_proc
No matching relations found.


And I don't see any reason aggregates, operators, etc, shouldn't be any more
susceptible the shadowing problem.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Updates of SE-PostgreSQL 8.4devel patches (r1403)

2009-01-14 Thread Gregory Stark

Robert Haas robertmh...@gmail.com writes:

 Just out of curiosity, does C89, or whatever standard we follow, allow this?

 int
 somefunc(int x)
 {
 int foo[x];
 /* use foo[] for scratch space */
 }

It's not in C89 but look up alloca. 

We don't use it anywhere in postgres currently so it's kind of unlikely we
would start now.

I think C99 does allow what you typed, and I think gcc has an extension to
allow it too.

 Obviously this is a bad plan if x can be a big number because you
 might crash your stack, but suppose we know that's not an issue?  It
 seems a shame to have to do palloc/pfree in a situation like this.

palloc really isn't that expensive, unless you're allocating tons of tiny
objects or you're in a tight loop it's not worth worrying about.
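
To be concrete about the C99 form (a toy function, nothing from Postgres):

```c
/*
 * C99 variable-length array: legal in C99 (and as a GCC extension in
 * earlier modes), but not in C89, and risky if x can be large, since the
 * scratch space lives on the stack.
 */
static int
sum_first(int x)
{
    int foo[x];                 /* scratch space sized at run time */
    int total = 0;

    for (int i = 0; i < x; i++)
        foo[i] = i + 1;
    for (int i = 0; i < x; i++)
        total += foo[i];
    return total;               /* sum of 1..x */
}
```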

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] Visibility map, partial vacuums

2009-01-14 Thread Gregory Stark
Bruce Momjian br...@momjian.us writes:

 Would someone tell me why 'autovacuum_freeze_max_age' defaults to 200M
 when our wraparound limit is around 2B?

I suggested raising it dramatically in the post you quote, and Heikki pointed
out that it controls the maximum amount of space the clog will take. Raising it
to, say, 800M would mean up to 200MB of space, which might be kind of annoying
for a small database.

It would be nice if we could ensure the clog got trimmed frequently enough on
small databases that we could raise the max_age. It's really annoying to see
all these vacuums running 10x more often than necessary.
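
The arithmetic behind that 200MB figure, for anyone following along: the clog
stores two bits of commit status per transaction, so:

```c
/*
 * Each transaction's commit status takes 2 bits in the clog, so a freeze
 * horizon of N transactions can pin down roughly N * 2 / 8 bytes of clog:
 * 200M transactions -> ~50MB, 800M -> ~200MB, the figure above.
 */
static long
clog_bytes(long freeze_max_age)
{
    return freeze_max_age * 2 / 8;
}
```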

The rest of the thread is visible at the bottom of:

http://article.gmane.org/gmane.comp.db.postgresql.devel.general/107525

 Also, is anything being done about the concern about 'vacuum storm'
 explained below?

I'm interested too.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Open item: kerberos warning message

2009-01-13 Thread Gregory Stark
Magnus Hagander mag...@hagander.net writes:

 Now that we have support for mappings, I expect it will be more common
 for a user to authenticate with princ 'A' and then connect using their
 Unix id 'B' to a PG user 'B'.  As such, I'm alright with dropping
 support for this.  Users can always use -U (or equiv) if necessary.

 I have applied this version of the patch.

yay!

Incidentally, this will have to be in the update notes I think.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Recovery Test Framework

2009-01-13 Thread Gregory Stark

Tom Lane t...@sss.pgh.pa.us writes:

 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 Robert Haas wrote:
 (It would be interesting to here how much value people think it has
 added, and get suggestions on how to do things better next time.)

 I'm not sure how much round-robin-review has taken load off committers, 
 you have to read and understand a patch before committing anyway. It has 
 helped, for sure, but not dramatically. However, I think that it has 
 made a big difference from authors point of view; you get feedback earlier.

 I think it's helped from the committers' standpoint too, in the form of
 taking care of some issues that would otherwise have had to be dealt
 with by the committer.  (Which was all we asked for anyway.)

I was pleasantly surprised by how helpful the feedback was on posix_fadvise. I
don't know how much real work it removed from Tom's plate but I suspect it did
reduce the little annoyances significantly.

 In my mind though, the real benefit of the system and the reason we
 should keep it up is to get more people looking at the code.  New
 committers don't grow on trees, they come from people getting involved.

Good point.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] [BUGS] Status of issue 4593

2009-01-13 Thread Gregory Stark
Kevin Grittner kevin.gritt...@wicourts.gov writes:

 Josh Berkus j...@agliodbs.com wrote: 
 we'd break 100,000 existing Java applications if we changed the
 error. 
  
 In what way would an application want to treat deadlocks and update
 conflicts differently?  Both result from conflicts with concurrent
 transactions and can be retried automatically.  It seems like an
 implementation detail with little chance of impact on applications to
 me.  Can anyone provide a contrary example or argument?

Well generally deadlocks are treated differently in that they are treated by
rewriting the application to not cause deadlocks.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] A single escape required for log_filename

2009-01-13 Thread Gregory Stark
Andrew Dunstan and...@dunslane.net writes:

 Then Debian is (surprise!) not doing the smartest thing. Not using the logging
 collector means you miss several possible advantages, including CSV logs and
 protection against multiplexed log lines.

Well it's not the smartest thing by your set of priorities. Debian's
priorities are generally to have all packages behave in consistent ways. In
particular it's very useful to be able to guarantee that all log files will go
to the same place and be governed by a single set of policies defined in a
single place.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] about truncate

2009-01-12 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes:

 On Mon, 2009-01-12 at 11:43 +0200, Peter Eisentraut wrote:
 Peter Eisentraut wrote:
  Tom Lane wrote:
  +1 for making TRUNCATE and LOCK support ONLY.
  
  Patch attached.
 
 This was committed.

 Please could we put in a GUC to allow that to be toggled in this release

That seems like it would just be putting off the pain. It doesn't make it any
easier to migrate in the end.

 and warning issued for non-optional behaviour change in following
 release? 

We do print INFO messages when drops cascade. We could print similar messages
when DDL applies recursively by default. (We can't do DML since it would fill
logs quickly). That seems reasonable to me.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Recovery Test Framework

2009-01-12 Thread Gregory Stark
Simon Riggs si...@2ndquadrant.com writes:

 Is it flaky? Not fundamentally; code wise I see it more as a
 question of time. 

A "question of time" indeed.

 If we insist upon cuts, ...

 Even if we reject replication entirely ...

There's a clear difference between how you're thinking about this and how I do.
The way I see it nobody suggested cutting or rejecting anything, just
committing it into a different branch for a different release date. It would
give us a year of experience seeing the code in action before releasing it on
the world.

I'm not sure whether it's too immature to commit, I haven't read the patch;
from what I see in the mailing list it seems about as ready as other large
patches in the past which were committed. But from my point of view it would
just always be better to commit large patches immediately after forking a
release instead of just before the beta to give them a whole release cycle of
exposure to developers before beta testers.

I'm not sure if this is the right release cycle to start this policy, but I
would like to convince people of this at some point so we can start having a
flood of big commits at the *beginning* of the release cycle and then a whole
release cycle of incremental polishing to those features rather than always
having freshly committed features in our releases that none of us has much
experience with.

 a recovery test framework is going to increase the robustness of what we
 *currently* have. It's not a new requirement; what is new is I now have some
 sponsorship money explicitly earmarked for this, but as part of the HS
 project and indeed the test framework relies upon HS to operate at all.

I agree with you that additional tests don't represent any immaturity in the
patch. They don't affect the run-time behaviour and I love the fact that they
might turn up any problems with our existing recovery process.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] [BUGS] Status of issue 4593

2009-01-12 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Peter Eisentraut pete...@gmx.net writes:
 I can see two ways forward:

 1) We document bluntly that ORDER BY + FOR UPDATE can return unordered 
 results, or

 2) We prohibit ORDER BY + FOR UPDATE, like we do with a number of other 
 clauses.  (There would be no loss of functionality, because you can run 
 the query a second time in the transaction with ORDER BY.)

 That code has been working like this for eight or ten years now and this
 is the first complaint, so taking away functionality on the grounds that
 someone might happen to update the ordering column doesn't seem like the
 answer to me.

Can we detect it at run-time? If a recheck happens, can we somehow identify
which columns could be problematic if updated, and check that they're
unchanged? I'm pretty sure the answer is no, but I figured I would throw it out
there in case it gives anyone an idea.
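
The check hinted at here could be sketched like so (purely illustrative: the
function and the dict-based row representation are invented for this sketch,
not PostgreSQL internals):

```python
def recheck_preserves_order(sort_columns, sorted_row, rechecked_row):
    """Return True if a FOR UPDATE recheck left every ORDER BY column
    unchanged, i.e. the ordering produced by the sort is still valid
    for this row."""
    return all(sorted_row[col] == rechecked_row[col] for col in sort_columns)

# The row was sorted on 'priority'; a concurrent UPDATE bumped it,
# so the already-sorted output may now be out of order:
old = {"id": 1, "priority": 5}
new = {"id": 1, "priority": 9}
# recheck_preserves_order(["priority"], old, new) -> False
```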


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Sample of user-define window function and other things

2009-01-12 Thread Gregory Stark
Peter Eisentraut pete...@gmx.net writes:

 Hitoshi Harada wrote:
 - The CREATE FUNCTION command accepts the WINDOW keyword for non-C languages
 like plpgsql. Don't we need to throw an error?

 The validator procedures of those languages should be made to reject that case
 if they can't support it.

Even if they can support it, shouldn't they reject functions that aren't
actually window functions? What happens if you mark a perfectly normal
function as a window function? Does it behave sanely?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] pgsql: Tweak order of operations in BitmapHeapNext() to avoid the case

2009-01-12 Thread Gregory Stark

t...@postgresql.org (Tom Lane) writes:

 Log Message:
 ---
 Tweak order of operations in BitmapHeapNext() to avoid the case of prefetching
 the same page we are nanoseconds away from reading for real.  There should be
 something left to do on the current page before we consider issuing a 
 prefetch.

Doesn't this break things if, say, there's precisely one tuple on every page?
You'll keep raising the prefetch_target but never actually prefetch anything.
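
A toy model of that concern (invented here to illustrate the worry, not the
committed code): if a prefetch is only ever considered while tuples remain on
the current page, then with exactly one tuple per page the ramp-up target keeps
growing while no prefetch is ever issued.

```python
def scan(tuples_per_page, target_max=32):
    """Toy bitmap-heap-scan loop: prefetching is only considered while
    something is left to do on the current page, but the ramp-up target
    grows on every page regardless."""
    prefetches, target = 0, 1
    for remaining in tuples_per_page:
        while remaining:
            remaining -= 1
            if remaining:          # something left on this page:
                prefetches += 1    # only now would a prefetch be issued
        if target < target_max:    # the target ramps up anyway
            target += 1
    return prefetches, target

# One tuple per page: the target ramps to its maximum, yet no
# prefetch is ever issued.
# scan([1] * 100) -> (0, 32)
```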

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Recovery Test Framework

2009-01-12 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Yeah, but there are already a number of things in 8.4 that are killer
 features for various applications --- window functions and WITH to take
 two recently-committed examples.  Should we sit on those for however
 long it will take to make replication release-worthy?

Do we know it's not release-worthy now? From what I see Heikki is proposing
refactorings which improve the code but hasn't found anything actually broken.
I'm all for cleaner simpler code -- especially in critical backup processes
since simpler means safer -- but just because there are better ways to do
things doesn't mean the current code isn't acceptable.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] Patch for str_numth() in PG 7.4

2009-01-12 Thread Gregory Stark
Andreas 'ads' Scherbaum adsm...@wars-nicht.de writes:

 I tracked the problem down to the function str_numth() in
 src/backend/utils/adt/formatting.c. The fix (attached) is easy: I
 stole the function code from the latest 8.0 version, which looks far
 more sane in handling the string pointers.

I think you attached the wrong patch.

 *** src/bin/psql/describe.c   2008-12-21 13:19:40.0 +0100
 --- src/bin/psql/describe.c.orig  2008-12-21 02:16:31.0 +0100
 ***************
 *** 464,476 ****
                        d.datctype as \"%s\",\n",
                        gettext_noop("Collation"),
                        gettext_noop("Ctype"));
 !     if (pset.sversion >= 80100)
 !         appendPQExpBuffer(buf,
 !                           "       pg_catalog.array_to_string(d.datacl, E'\\n') as \"%s\"",
 !                           gettext_noop("Access Privileges"));
 !     else
 !         appendPQExpBuffer(buf,
 !                           "       pg_catalog.array_to_string(d.datacl, '\\n') as \"%s\"",
                            gettext_noop("Access Privileges"));
       if (verbose && pset.sversion >= 80200)
           appendPQExpBuffer(buf,
 --- 464,471 ----
                        d.datctype as \"%s\",\n",
                        gettext_noop("Collation"),
                        gettext_noop("Ctype"));
 !     appendPQExpBuffer(buf,
 !                       "       d.datacl as \"%s\"",
                            gettext_noop("Access Privileges"));
       if (verbose && pset.sversion >= 80200)
           appendPQExpBuffer(buf,

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] Recovery Test Framework

2009-01-12 Thread Gregory Stark
Bruce Momjian br...@momjian.us writes:

 As for the process used, I think it is useful to understand how
 committers choose what to work on next.  One criteria is that the patch
 has stabilized;  if a patch is still being modified regularly, the
 committer might as well work on another patch that has stabilized.  Now,
 a committer could ask for the patch to stabilize to work on it, but if
 he has other patches that are stable, there is no point in asking for a
 stable version;  he might as well work on just stable ones until only
 unstable ones are left.

 Now, maybe this is unfair to patches that are frequently updated, but
 this is the typical process we follow, and it explains why the patches
 above have not gotten near commit status yet.

It's not just unfair. It's counter-productive. It means you're ignoring the
very patches whose authors are most likely to be responsive to requests to
change them, and who would be most likely to be fertile ground for further
improvements.

Perhaps it would be useful for you to understand how it looks from a
submitter's point of view. As long as the patch sits in limbo only minor
tweaks and refinements are worth bothering with. Any thoughts of continuing on
any subsequent phases of development are all crushed since all that work might
go down the drain when the committer makes changes to the code it's based on.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] pgsql: Re-enable the old code in xlog.c that tried to use

2009-01-11 Thread Gregory Stark

t...@postgresql.org (Tom Lane) writes:

 Log Message:
 ---
 Re-enable the old code in xlog.c that tried to use posix_fadvise(), so that
 we can get some buildfarm feedback about whether that function is still
 problematic.  (Note that the planned async-preread patch will not really
 prove anything one way or the other in buildfarm testing, since it will
 be inactive with default GUC settings.)

! #if defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)

I think if we're not putting the AC_CHECK_RUN check in then this should just
be HAVE_POSIX_FADVISE not USE_POSIX_FADVISE. The latter was the define I made
if the run check passed.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL 
training!



Re: [HACKERS] posix_fadvise v22

2009-01-11 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Robert Haas robertmh...@gmail.com writes:
 OK, here's an update of Greg's patch with the runtime configure test
 ripped out, some minor documentation tweaks, and a few unnecessary
 whitespace diff hunks quashed.  I think this is about ready for
 committer review.

 I've started to look through this, and the only part I seriously don't
 like is the nbtsearch.c changes.  I've got three complaints about that:

 * Doing it inside the index AMs is wrong, or at the very least forces
 us to do it over for each index AM (which the patch fails to do).

ok

 * As coded, it generates prefetch bursts that are much too large and too
 widely spaced to be effective, not to mention that they entirely
 ignore the effective_io_concurrency control knob as well as the order
 in which the pages will actually be needed.  I wonder now whether
 Robert's inability to see any benefit came because he was testing
 indexscans and not bitmap scans.

Well experiments showed that it was very effective, even more so than for
bitmap scans. So much so that it nearly obliterated bitmap scans' advantage
over index scans.

I originally thought like you that all that logic was integral to the thing
but eventually came around to think the opposite. That logic is to overcome a
fundamental problem with bitmap scans -- that there's no logical group to
prefetch and a potentially unbounded stream of pages. Index scans just don't
have that problem so they don't need that extra logic.

 * It's only accidental that it's not kicking in during a bitmap
 indexscan and bollixing up the much-more-carefully-written
 nodeBitmapHeapscan prefetch logic.

ok.

 What I intend to do over the next day or so is commit the prefetch
 infrastructure and the bitmap scan prefetch logic, but I'm bouncing the
 indexscan part back for rework.  I think that it should be implemented
 in or near index_getnext() and pay attention to
 effective_io_concurrency.

So to do that I would have a few questions.

a) ISTM necessary to keep a dynahash of previously prefetched pointers around
   to avoid repeatedly prefetching the same pages. That could get quite large
   though. Or do you think it would be fine to visit the buffer cache,
   essentially using that as the hash, for every index pointer?

b) How would index_getnext keep two read pointers like bitmap heap scans? I
   think this would require adding a new index AM api similar to your
   tuplestore api where the caller can maintain multiple read pointers in the
   scan. And I'm not sure how that would work with mark/restore.

c) I would be afraid that pushing the index scan to reach for the next index
   leaf page prematurely (and not just a async prefetch, but a blocking read)
   would cause extra random i/o which would slow down the critical path of
   reading the current index tuples. So I think we would still want to pause
   when we hit the end of the current leaf page. That would require some form
   of feedback in the index am api as well.
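
For point (a), one possible shape (invented here as a sketch, not proposed
code) is a small bounded LRU of block numbers: it can't grow without bound the
way an unpruned dynahash could, yet it still suppresses most duplicate
prefetches of recently-seen heap pages.

```python
from collections import OrderedDict

class RecentPages:
    """Bounded LRU set of recently prefetched block numbers."""

    def __init__(self, capacity=256):
        self.capacity = capacity
        self.pages = OrderedDict()

    def should_prefetch(self, blockno):
        """True if this block hasn't been prefetched recently."""
        if blockno in self.pages:
            self.pages.move_to_end(blockno)   # refresh recency
            return False
        self.pages[blockno] = True
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)    # evict the oldest entry
        return True
```

The trade-off is that a page evicted from the LRU may be prefetched again,
which is harmless; the alternative of probing the buffer cache for every index
pointer avoids the extra structure at the cost of more lookups.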

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning



Re: [HACKERS] posix_fadvise v22

2009-01-11 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Robert Haas robertmh...@gmail.com writes:
 OK, here's an update of Greg's patch with the runtime configure test
 ripped out, some minor documentation tweaks, and a few unnecessary
 whitespace diff hunks quashed.  I think this is about ready for
 committer review.

 I've applied most of this, with a couple of notable revisions:

 1. The runtime check on whether posix_fadvise works is really gone.
 We'll see whether we need to put anything back, but I think our first
 try should be under the assumption that it works.  (BTW, I was wrong
 in my earlier claim that the buildfarm wouldn't exercise posix_fadvise
 by default --- the patch defaults to io_concurrency = 1 if fadvise
 is available, so it will try to prefetch one page ahead.)

 2. I fixed it so that setting effective_io_concurrency to zero disables
 prefetching altogether; there was no way to do that in the patch as
 submitted.

Hm. The original intent was that effective_io_concurrency = 1 meant no
prefetching because there was only one drive.

I wonder if that worked earlier and got lost along the way or if I always had
this wrong.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] [PATCHES] updated hash functions for postgresql v1

2009-01-10 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 Jeff Davis pg...@j-davis.com writes:
 I ran 5 times on both old and new code, eliminating the top and bottom
 and taking the average of the remaining 3, and I got a 6.9% performance
 improvement with the new code.

 The question that has been carefully evaded throughout the discussion
 of this patch is whether the randomness of the hash result is decreased,

In fairness that doesn't seem to be the case. The original patch was posted
with such an analysis using cracklib and raw binary data:

http://article.gmane.org/gmane.comp.db.postgresql.devel.general/105675

  marginal performance improvement in the hash function itself (which is
 already shown to be barely measurable in the total context of a
 hash-dependent operation...)

If it's a 6% gain in the speed of Hash Join or HashAggregate it would be very
interesting. But I gather it's a 6% speedup in the time spent actually in the
hash function. Is that really where much of our time is going? If it's 10% of
the total time to execute one of these nodes then we're talking about a 0.6%
optimization...
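
The arithmetic, spelled out (the 10% figure is the hypothetical used above,
not a measurement):

```python
def overall_gain(fraction_in_hash_fn, hash_fn_speedup):
    """Fraction of a node's total runtime saved when only the hash
    function itself gets faster."""
    return fraction_in_hash_fn * hash_fn_speedup

# Hashing is 10% of the node's runtime and the hash function is 6% faster:
# overall_gain(0.10, 0.06) -> 0.006, i.e. roughly a 0.6% optimization.
```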

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's PostGIS support!



Re: [HACKERS] Buffer pool statistics in Explain Analyze

2009-01-09 Thread Gregory Stark
Tom Lane t...@sss.pgh.pa.us writes:

 No, I think you misunderstood me entirely.  The reason that I rejected
 those parts of the patch is that I think the statistics that are
 available are wrong/useless.  If the bufmgr.c counters were really good
 for something they'd have been exposed long since (and we'd probably
 never have built a lot of the other stats collection infrastructure).

The collective stats across the whole cluster and the individual stats for a
specific query broken down by plan node are complementary. Depending on the
circumstances, people need one or the other.

I actually also wrote a patch exposing this same data. I think the bufmgr
counters are flawed but still useful. Just as an example think of how often
you have to explain why a sequential scan of a small table can be faster than
an index scan. Seeing the index scan actually require more logical buffer
fetches than a sequential scan would go a long way to clearing up that
confusion. Better yet, users would be in a position to see whether the planner
is actually estimating i/o costs accurately.
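
A back-of-the-envelope model of that explanation (numbers invented; real
costing is far more nuanced, but the shape of the comparison is the point):

```python
def seqscan_fetches(table_pages):
    # A sequential scan reads each heap page exactly once.
    return table_pages

def indexscan_fetches(index_pages, matching_rows):
    # Worst case (heap order uncorrelated with the index): one heap-page
    # fetch per matching row, plus the index pages themselves.
    return index_pages + matching_rows

# A 10-page table where 50 rows match: the "slow" seqscan does 10 logical
# buffer fetches while the index scan can do up to 52.
```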

 The EXPLAIN ANALYZE code you submitted is actually kinda cute, and
 I'd have had no problem with it if I thought it were displaying
 numbers that were useful and unlikely to be obsoleted in future
 releases.  The work that needs to be done is on collecting the
 numbers more than displaying them.

I agree that we need more data -- my favourite direction is to use a
programmatic interface to dtrace to find out how many buffer reads are
satisfied from filesystem cache and how many from physical reads. But doing
that wouldn't obviate the need for these stats; it would enhance them. You
would get a clear view of how many buffer fetches were satisfied from shared
buffers, filesystem cache, and physical reads.


-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's RemoteDBA services!



Re: [HACKERS] QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)

2009-01-05 Thread Gregory Stark

Robert Haas robertmh...@gmail.com writes:

 Regardless of whether we do that or not, no one has offered any
 justification of the arbitrary decision not to compress columns > 1MB,

Er, yes, there was discussion before the change, for instance:

 http://archives.postgresql.org/pgsql-hackers/2007-08/msg00082.php


And do you have any response to this point?

 I think the right value for this setting is going to depend on the
 environment. If the system is starved for cpu cycles then you won't want to
 compress large data. If it's starved for i/o bandwidth but has spare cpu
 cycles then you will.

http://archives.postgresql.org/pgsql-hackers/2009-01/msg00074.php


 and at least one person (Peter) has suggested that it is exactly
 backwards.  I think he's right, and this part should be backed out.

Well the original code had a threshold above which we *always* compressed even
if it saved only a single byte.






-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning

