Re: [HACKERS] COUNT(*) and index-only scans

2011-10-11 Thread jesper
 Jeff Janes  wrote:
 Kevin Grittner  wrote:

 create table t (id int not null primary key);
 insert into t select generate_series(1, 1000000);
 vacuum freeze analyze;
 explain analyze select count(*) from t
 where id between 500000 and 500010;

 That gives you an index-only scan; but without the WHERE clause it
 uses a seq scan.

 If you convert the where clause to "where id is not null" it uses
 the index-only scan again, but only if you nudge it too with
 enable_seqscan=off.

 Clever way to get a full-table test.

 It turns out that for the above, with your trick to use the index
 only scan, it comes out 12% faster to do a seqscan, even when the
 table and index are fully cached (based on the average time of ten
 runs each way).  There's very little overlap, so the difference looks
 real.  But that's on a very narrow record, having just the one column
 used in the index.  I added one wide column like this:

 alter table t add column x text;
 update t set x = (repeat(random()::text, (random() * 100)::int));
 cluster t USING t_pkey;
 vacuum freeze analyze;

 With that change the index-only scan time remained unchanged, while
 the seqscan time grew to about 2.6 times the index only scan time.
 That was mildly surprising for me, considering it was all still
 cached.

Moving data around in memory isn't that cheap, and the extra column has
probably made the row about five times larger. However, if you had added
4000 bytes instead of 100, you should be back to the previous results,
since the data would then (most likely; otherwise add a bit more) end up
in a TOAST table.
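Jesper's threshold argument can be sketched with toy numbers. The ~2 kB figure is the default TOAST_TUPLE_THRESHOLD on stock 8 kB pages; the per-row byte counts below are rough assumptions for illustration, not measurements:

```python
# Rough model of the point above: a wide text column inflates seqscan
# heap reads only while the tuple stays under the TOAST threshold.
TOAST_TUPLE_THRESHOLD = 2032   # bytes; typical value on 8 kB pages


def stays_inline(width_bytes):
    """True if the wide column still sits in the main heap page."""
    return width_bytes <= TOAST_TUPLE_THRESHOLD


# repeat(random()::text, (random() * 100)::int): ~18 chars repeated
# ~50 times on average, so roughly 900 bytes added per row.
avg_added = 18 * 50
assert stays_inline(avg_added)    # inline -> the seqscan slows down
assert not stays_inline(4000)     # ~4 kB would move out to TOAST
```

Compression can keep even a 4 kB value inline (random()::text repeated is quite compressible), which is why the "otherwise add a bit more" hedge above matters.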

Jesper



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Bug in walsender when calling out to do_pg_stop_backup (and others?)

2011-10-11 Thread Magnus Hagander
On Tue, Oct 11, 2011 at 03:29, Florian Pflug f...@phlo.org wrote:
 On Oct10, 2011, at 21:25 , Magnus Hagander wrote:
 On Thu, Oct 6, 2011 at 23:46, Florian Pflug f...@phlo.org wrote:
 It'd be nice to generally terminate a backend if the client vanishes, but so
 far I haven't had any bright ideas. Using FASYNC and F_SETOWN unfortunately
 sends a signal *every time* the fd becomes readable or writeable, not only on
 EOF. Doing select() in CHECK_FOR_INTERRUPTS seems far too expensive. We could
 make the postmaster keep the fds around even after forking a backend, and
 make it watch for broken connections using select(). But with a large
 max_backends setting, we'd risk running out of fds in the postmaster...

 Ugh. Yeah. But at least catching it and terminating it when we *do*
 notice it's down would certainly make sense...

 I'll try to put together a patch that sets a flag if we discover a broken
 connection in pq_flush, and tests that flag in CHECK_FOR_INTERRUPTS. Unless
 you wanna, of course.

Please do, I won't have time to even think about it until after
pgconf.eu anyway ;)
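For illustration, the EOF test being discussed can be sketched in userland: a zero-timeout poll plus a MSG_PEEK recv distinguishes "no data yet" from "peer closed" without consuming anything or blocking. This is only a toy model of the check (POSIX assumed), not the pq_flush/CHECK_FOR_INTERRUPTS patch itself:

```python
# Toy sketch: cheaply detect that the peer of a stream socket vanished,
# the way a CHECK_FOR_INTERRUPTS-time test might. A socketpair stands
# in for the client connection.
import socket
from select import poll, POLLIN


def peer_is_gone(sock):
    p = poll()
    p.register(sock.fileno(), POLLIN)
    if not p.poll(0):              # zero timeout: never blocks
        return False               # nothing pending: peer still there
    # Readable: peek to distinguish pending data from EOF.
    return sock.recv(1, socket.MSG_PEEK) == b""


a, b = socket.socketpair()         # stand-in for backend<->client link
assert not peer_is_gone(a)         # both ends open: connection alive
b.close()                          # client vanishes
assert peer_is_gone(a)             # EOF now visible without a real read
```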


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Thom Brown
On 11 October 2011 02:14, Florian Pflug f...@phlo.org wrote:
 On Oct10, 2011, at 20:06 , Thom Brown wrote:
 Okay, a real example of why discrete should be '[]' and continuous
 should be '[)'.

 If you book a meeting from 09:00 to 11:00 (tsrange), at 11:00
 precisely it either becomes free or is available to someone else, so
 it can be booked 11:00 to 12:00 without conflict.

 If you have raffle tickets numbered 1 to 100 (int4range), and you ask
 for tickets 9 to 11, no-one else can use 11 as it aligns with the last
 one you bought.

 So for me, it's intuitive for them to behave differently.  So yes,
 default behaviour would vary between types, but I didn't previously
 read anything on default_flags, so I don't know where that comes into
 it.  Shouldn't it be the case that if a type has a canonical function,
  it's entirely inclusive, otherwise its upper boundary is exclusive?

 First, there's the type date, which in my book is discrete. So we'd make
 date ranges closed by default, not half-open. And there's timestamp, which
 is continuous so we'd make its default half-open. That doesn't seem exactly
 intuitive to me.

Ah yes, I agree there.  Okay, I see your point.
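For what it's worth, the two examples above boil down to a toy overlap check (the hour numbers and tuple encoding are illustrative assumptions, not range-type internals):

```python
# Half-open [start, end) ranges never collide at a shared boundary
# (the meeting-room case); closed [lo, hi] discrete ranges do
# (the raffle-ticket case).

def overlaps_half_open(a, b):
    """[a0, a1) vs [b0, b1)"""
    return a[0] < b[1] and b[0] < a[1]


def overlaps_closed(a, b):
    """[a0, a1] vs [b0, b1]"""
    return a[0] <= b[1] and b[0] <= a[1]


# 09:00-11:00 and 11:00-12:00 as hours since midnight:
assert not overlaps_half_open((9, 11), (11, 12))  # room frees at 11:00
# raffle tickets 9..11 and 11..13 share ticket 11:
assert overlaps_closed((9, 11), (11, 13))
```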

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-10-11 Thread Kyotaro HORIGUCHI

At Fri, 7 Oct 2011 12:25:08 -0400, Robert Haas robertmh...@gmail.com wrote in 
ca+tgmoao2oozbmusfp3zc0_lgxsv3jbvy9eyr5h+czyez7j...@mail.gmail.com
 OK, I think this is reasonably close to being committable now.  There
 are a few remaining style and grammar mistakes but I can fix those up
 before committing.

Thank you. 

   One thing I still think it would be useful to add,
 though, is some comments to pg_utf8_increment() and
 pg_eucjp_increment() describing the algorithm being used.  Can you
 take a crack at that?

 Yes I'll do it in a day or two.

 Regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



[HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Pavel Stehule
Hello

The current limits on dynamic work with row types in PL/pgSQL could be
reduced by possible casts between rows and arrays. We now have a lot of
tools for arrays, and these tools could be used for rows too.

postgres=# \d mypoint
Composite type public.mypoint
 Column │  Type   │ Modifiers
────────┼─────────┼───────────
 a  │ integer │
 b  │ integer │

postgres=# select cast(mypoint '(10,20)' as int[]);
  array
─────────
 {10,20}
(1 row)

postgres=# select cast(ARRAY[10,20] AS mypoint);
 mypoint
─
 (10,20)
(1 row)

What do you think about this idea?

Regards

Pavel Stehule
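A toy model of the proposed semantics (purely illustrative - it ignores the quoting and escaping rules of real composite literals, and assumes all fields share one element type):

```python
# Sketch of the proposed casts: a composite value round-trips through
# an array when every field can be cast to the array's element type.

def row_to_array(text):
    """'(10,20)' -> [10, 20]  (no quoting rules handled)"""
    return [int(f) for f in text.strip("()").split(",")]


def array_to_row(arr):
    """[10, 20] -> '(10,20)'"""
    return "(%s)" % ",".join(str(v) for v in arr)


assert row_to_array("(10,20)") == [10, 20]
assert array_to_row([10, 20]) == "(10,20)"
```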



Re: [HACKERS] Online base backup from the hot-standby

2011-10-11 Thread Jun Ishiduka

 I can't see a reason why we would use a new WAL record for this,
 rather than modify the XLOG_PARAMETER_CHANGE record type which was
 created for a very similar reason.
 The code would be much simpler if we just extend
 XLOG_PARAMETER_CHANGE, so please can we do that?

Sure.

 The log message "full_page_writes on master is set invalid more than
 once during online backup" should read "at least once" rather than
 "more than once".

Yes.

 lastFpwDisabledLSN needs to be initialized.

I don't think that's needed, because all values in XLogCtl are
initialized to 0.

 Is there a reason to add lastFpwDisabledLSN onto the Control file? If
 we log parameters after every checkpoint then we'll know the values
 when we startup. If we keep logging parameters this way we'll end up
 with a very awkward and large control file. I would personally prefer
 to avoid that, but that thought could go either way. Let's see if
 anyone else thinks that also.

Yes. I will add it to CreateCheckPoint().

Image:
  CreateCheckPoint()
  {
     if (!shutdown && XLogStandbyInfoActive())
     {
        LogStandbySnapshot()
        XLogReportParameters()
     }
  }

  XLogReportParameters()
  {
     if (fpw == 'off' || ... )
        XLOGINSERT()
  }

However, it'll write XLOG_PARAMETER_CHANGE at every checkpoint when FPW is
'off'. (This will increase the amount of WAL.)
Is it OK?


Regards.


Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka@po.ntts.co.jp






Re: [HACKERS] index-only scans

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 12:19 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 I have mostly-working code for approach #3, but I haven't tried to make
 EXPLAIN work yet.  While looking at that I realized that there's a
 pretty good argument for adding the above-mentioned explicit TargetEntry
 list representing the index columns to index-only plan nodes.  Namely,
 that if we don't do it, EXPLAIN will have to go to the catalogs to find
 out what's in that index, and this will fall down for hypothetical
 indexes injected into the planner by index advisors.  We could imagine
 adding some more hooks to let the advisor inject bogus catalog data at
 EXPLAIN time, but on the whole it seems easier and less fragile to just
 have the planner include a data structure it has to build anyway into
 the finished plan.

 The need for this additional node list field also sways me in a
 direction that I'd previously been on the fence about, namely that
 I think index-only scans need to be their own independent plan node type
 instead of sharing a node type with regular indexscans.  It's just too
 weird that a simple boolean indexonly property would mean completely
 different contents/interpretation of the tlist and quals.

 Attached is a draft patch for this.  It needs some more review before
 committing, but it does pass regression tests now.

 One point worth commenting on is that I chose to rename the OUTER and
 INNER symbols for special varnos to OUTER_VAR and INNER_VAR, along with
 adding a new special varno INDEX_VAR.  It's bothered me for some time
 that those macro names were way too generic/susceptible to collision;
 and since I had to look at all the uses anyway to see if the INDEX case
 needed to be added, this seemed like a good time to rename them.

+1 for that renaming, for sure.

Have you given any thought to what would be required to support
index-only scans on non-btree indexes - particularly, GIST?  ISTM we
might have had a thought that some GIST opclasses but not others would
be able to regurgitate tuples, or maybe even that it might vary from
index tuple to index tuple.  But that discussion was a long time ago,
and my memory is fuzzy.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] COUNT(*) and index-only scans

2011-10-11 Thread Robert Haas
On Mon, Oct 10, 2011 at 3:15 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:

 Right now, our costing model for index-only scans is pretty dumb.
 It assumes that using an index-only scan will avoid 10% of the
 heap fetches.  That could easily be low, and on an insert-only
 table or one where only the recently-updated rows are routinely
 accessed, it could also be high.

 As a reality check, I just ran this query on a table in a statewide
 copy of our data:

 select count(*),
  sum(case when xmin = '2'::xid then 0 else 1 end) as read_heap
  from CaseHist;

 and got:

   count   | read_heap
 ---+---
  205765311 |   3934924

 So on our real-world database, it would skip something on the order
 of 98% of the heap reads, right?

Yeah, if it's scanning the whole table.
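(For readers following along: xmin = '2'::xid matches FrozenTransactionId, so read_heap above counts the not-yet-frozen rows. A quick sanity check of the arithmetic:)

```python
# Kevin's figures: frozen rows need no heap visibility check, so the
# fraction of heap fetches an index-only scan could skip is:
total, read_heap = 205_765_311, 3_934_924
skipped = 1 - read_heap / total
assert round(skipped, 3) == 0.981   # roughly 98%, as stated
```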

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: table/index options | was: [HACKERS] COUNT(*) and index-only scans

2011-10-11 Thread Robert Haas
On Mon, Oct 10, 2011 at 3:16 PM, Cédric Villemain
cedric.villemain.deb...@gmail.com wrote:
 2011/10/10 Robert Haas robertmh...@gmail.com:
 On Mon, Oct 10, 2011 at 1:36 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:

 That gives you an index-only scan; but without the WHERE clause it
 uses a seq scan.  I think it's mainly a matter of doing enough
 benchmarks to figure out how best to model the costs of the index
 scan so that it can be picked for that case.

 Right now, our costing model for index-only scans is pretty dumb.  It
 assumes that using an index-only scan will avoid 10% of the heap
 fetches.  That could easily be low, and on an insert-only table or one
 where only the recently-updated rows are routinely accessed, it could
 also be high.  To use an index-only scan for a full-table COUNT(*),
 we're going to have to be significantly smarter, because odds are good
 that skipping 10% of the heap fetches won't be sufficient inducement
 to the planner to go that route; we are going to need a real number.

 I have to raise the point that I think we are going to face the exact same
 issue with the visibility_fraction that we face with the hack of setting
 random_page_cost very low to help the optimizer (when the index/table is
 mostly in cache).

 4 options have been viewed so far:
 1. pg_class (initial proposal to store the cache estimates)
 2. pg_class_nt (revived by Alvaro IIRC)
 3. reloption
 4. GUC (by Tom for visibility_fraction)

 I am in favor of 1 or 2; 4 is a backup option, and 3 is an open door to
 planner hints (the others also let the DBA apply his knowledge if he wants,
 but 3 makes it mandatory for the DBA to decide, and no automatic way can be
 used to update it, except if someone makes ALTER TABLE lock-free)

 (It does not prevent a cost_indexonly() from being written meanwhile...)

 What do you think/prefer/suggest ?

Well, I think a GUC is kind of useless, because you're going to want
to make this per-table.

As to the rest, I think they're all going to have the same problems -
or non-problems - with ALTER TABLE locking the full table.  If that's
a show-stopper, we should try to fix it.  But how to do that is a
topic for another thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] Index only scan paving the way for auto clustered tables?

2011-10-11 Thread Royce Ausburn
Hi all,

I wonder, could the recent work on index only scans pave the way for auto 
clustered tables?  Consider a wide, mostly insert table with some subset of 
columns that I'd like to cluster on.  I'm after locality of tuples that are 
very frequently fetched together, but not keen on the downtime for a cluster, 
nor the maintenance that it requires.  Would it be a stretch to have an index 
that branches on the subset of cluster columns, but still stores all the 
columns, making it a covering index?  Given that we can already index 
concurrently, such an index would not require downtime, and would be self 
maintaining.  From my understanding of the index-only scan implementation, I 
suspect that such an index would effectively give locality, with some caveats… 

I'd expect the overhead of inserting in to such a table would be high, perhaps 
prohibitive.  Perhaps better ways have been discussed.  Stupid idea?

--Royce




[HACKERS] Buildfarm git failures

2011-10-11 Thread Dave Page
The reported git failures from the buildfarm should clear on the next run -
they're the result of an upgrade to the git package on the git server which
left the git-daemon in a non-functioning state. It's back up now.

Apologies for the noise.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters

2011-10-11 Thread Royce Ausburn

On 08/10/2011, at 1:56 AM, Yeb Havinga wrote:

 Attached is v2 of the patch.
 
 Mixed notation now raises an error.
 
 In contrast with what I said above, named-parameter-related errors are thrown
 before any syntax errors. I tested raising syntax errors first, but the
 resulting code was a bit more ugly, and the SQL checking under an error
 condition (i.e. a double named parameter error means there is one parameter
 short) was causing serious errors.
 
 Documentation was also added, regression tests updated.

I've tested this patch out and can confirm mixed notation produces an error:

psql:plsqltest.sql:27: ERROR:  mixing positional and named parameter assignment 
not allowed in cursor cur1
LINE 10:   open cur1( param2 := 4, 2, 5); 

Just one small thing: it'd be nice to have an example for cursor declaration 
with named parameters.  Your patch adds one for opening a cursor, but not for 
declaring one.

Other than that, I think the patch is good.  Everything works as advertised =)

--Royce




Re: [HACKERS] index-only scans

2011-10-11 Thread Alexander Korotkov
On Tue, Oct 11, 2011 at 2:36 PM, Robert Haas robertmh...@gmail.com wrote:

 Have you given any thought to what would be required to support
 index-only scans on non-btree indexes - particularly, GIST?  ISTM we
 might have had a thought that some GIST opclasses but not others would
 be able to regurgitate tuples, or maybe even that it might vary from
 index tuple to index tuple.  But that discussion was a long time ago,
 and my memory is fuzzy.

In some GiST opclasses the original tuple can't be restored from the index
tuple. For example, a polygon can't be restored from its MBR. In other
opclasses, for example box_ops and point_ops, the original tuple can easily
be restored from the index tuple.
I can't remember any implementation where this possibility varies from index
tuple to index tuple. The GiST opclass for tsvector has different leaf index
tuple representations depending on length but, at most, it stores an array
of hashes of words, so it's lossy anyway.
I think the GiST interface could be extended with an optional function which
restores the original tuple. But there is some additional difficulty when
some opclasses of a composite index provide this function and others don't.

--
With best regards,
Alexander Korotkov.
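The polygon/MBR case can be illustrated with a toy compress step (hypothetical code, just to show the information loss, not GiST's actual compress/decompress API):

```python
# Why some GiST opclasses can't regurgitate tuples: the index stores a
# lossy "compressed" key, e.g. a polygon's minimum bounding rectangle.

def mbr(points):
    """Minimum bounding rectangle of a polygon's vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


tri = [(0, 0), (4, 0), (2, 3)]
quad = [(0, 0), (4, 0), (4, 3), (0, 3)]
# Distinct polygons, identical index keys: the original is unrecoverable.
assert mbr(tri) == mbr(quad) == (0, 0, 4, 3)
# A point opclass, by contrast, can store the point itself: lossless.
```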


Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 4:40 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 A current limits of dynamic work with row types in PL/pgSQL can be
 decreased with a possible casts between rows and arrays. Now we have a
 lot of tools for arrays, and these tools should be used for rows too.

 postgres=# \d mypoint
 Composite type public.mypoint
  Column │  Type   │ Modifiers
 ┼─┼───
  a      │ integer │
  b      │ integer │

 postgres=# select cast(mypoint '(10,20)' as int[]);
   array
 
  {10,20}
 (1 row)

 postgres=# select cast(ARRAY[10,20] AS mypoint);
  mypoint
 ─
  (10,20)
 (1 row)

 What do you think about this idea?

Well, a ROW can contain values of different types; an ARRAY can't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters

2011-10-11 Thread Yeb Havinga

Hello Royce,

Thanks again for testing.

On 2011-10-11 13:55, Royce Ausburn wrote:


Just one small thing: it'd be nice to have an example for cursor declaration 
with named parameters.  Your patch adds one for opening a cursor, but not for 
declaring one.


Declaration of cursors with named parameters is already part of 
PostgreSQL (so it is possible to use the parameter names in the cursor 
query instead of $1, $2, etc.) and it also already documented with an 
example, just a few lines above the open examples. See curs3 on 
http://developer.postgresql.org/pgdocs/postgres/plpgsql-cursors.html



Other than that, I think the patch is good.  Everything works as advertised =)


Thanks!

--
Yeb Havinga
http://www.mgrid.net/
Mastering Medical Data




Re: [HACKERS] [REVIEW] Patch for cursor calling with named parameters

2011-10-11 Thread Royce Ausburn

On 11/10/2011, at 11:38 PM, Yeb Havinga wrote:

 Declaration of cursors with named parameters is already part of PostgreSQL 
 (so it is possible to use the parameter names in the cursor query instead of 
 $1, $2, etc.) and it also already documented with an example, just a few 
 lines above the open examples. See curs3 on 
 http://developer.postgresql.org/pgdocs/postgres/plpgsql-cursors.html

Doh - my apologies!

Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread David Fetter
On Mon, Oct 10, 2011 at 10:25:18PM -0700, Jeff Davis wrote:
 On Tue, 2011-10-11 at 03:14 +0200, Florian Pflug wrote:
  Maybe ranges over discrete types are slightly more likely to be
  closed, and ranges over continuous types slightly more likely to
  be open. Still, I very much doubt that the skew in the
  distribution is large enough to warrant the confusion and
  possibility of subtle bugs we introduce by making the semantics of
  a range type's constructor depend on the definition of the range
  and/or base type.
 
 I think you persuaded me on the consistency aspect.
 
 I'm wondering whether to do away with the default argument entirely,
 and just force the user to always specify it during construction. It
 seems like a shame that such pain is caused over the syntax, because
 in a perfect world it wouldn't be a bother to specify it at all. I
 even considered using prefix/postfix operators to try to make it
 nicer, but it seems like every idea I had was just short of
 practical. Maybe a few extra characters at the end aren't so bad.
 
 I'd like to hear from some potential users though to see if anyone
 recoils at the common case.

I'd recoil at not having ranges default to left-closed, right-open.
The use case for that one is so compelling that I'm OK with making it
the default from which deviations need to be specified.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Pavel Stehule
2011/10/11 Robert Haas robertmh...@gmail.com:
 On Tue, Oct 11, 2011 at 4:40 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 A current limits of dynamic work with row types in PL/pgSQL can be
 decreased with a possible casts between rows and arrays. Now we have a
 lot of tools for arrays, and these tools should be used for rows too.

 postgres=# \d mypoint
 Composite type public.mypoint
  Column │  Type   │ Modifiers
 ┼─┼───
  a      │ integer │
  b      │ integer │

 postgres=# select cast(mypoint '(10,20)' as int[]);
   array
 
  {10,20}
 (1 row)

 postgres=# select cast(ARRAY[10,20] AS mypoint);
  mypoint
 ─
  (10,20)
 (1 row)

 What do you think about this idea?

 Well, a ROW can contain values of different types; an ARRAY can't.

yes, I know - but it should be a problem only in a few cases - when it is
not possible to cast a row field to the array's element type.

This is based on user knowledge - the user has to choose an adequate array
type.

Sometimes a numeric or int array can be used; sometimes a text array has to
be selected. The target type is selected by the user, and the cast fails
when the conversion is not possible.

Pavel




 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Pavel Stehule

 sometimes he can use a numeric or int array, sometimes he have to
 select text array. Target type is selected by user, and cast fail when
 conversion is not possible.


Using some selected type for the array elements allows processing in
PL/pgSQL.

The motivation for this feature is to simplify and speed up requests like
this one:
http://stackoverflow.com/questions/7711432/how-to-set-value-of-composite-variable-field-using-dynamic-sql/7722575#7722575

Pavel



Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Boszormenyi Zoltan

Hi,

On 2011-10-11 14:23, Robert Haas wrote:

On Tue, Oct 11, 2011 at 4:40 AM, Pavel Stehulepavel.steh...@gmail.com  wrote:

Hello

A current limits of dynamic work with row types in PL/pgSQL can be
decreased with a possible casts between rows and arrays. Now we have a
lot of tools for arrays, and these tools should be used for rows too.

postgres=# \d mypoint
Composite type public.mypoint
  Column │  Type   │ Modifiers
┼─┼───
  a  │ integer │
  b  │ integer │

postgres=# select cast(mypoint '(10,20)' as int[]);
   array

  {10,20}
(1 row)

postgres=# select cast(ARRAY[10,20] AS mypoint);
  mypoint
─
  (10,20)
(1 row)

What do you think about this idea?

Well, a ROW can contain values of different types; an ARRAY can't.


This reminds me that recently I thought about making anyelement
a real type. anyelement[] would allow you to have different types in
the same array. Both the real type OID and the data would be stored, and
anyelement to cstring would reveal both, e.g. in
'oid,value_converted_by_outfunc' format. The anyelement to real type and
any type to anyelement conversions would be painless.

The problem is that anyelement (when the underlying type is not text) to
text conversion would be ambiguous and give different answers:
anyelement -> cstring -> text gives 'oid,value_converted_by_outfunc'
anyelement -> real type -> cstring -> text gives 'value_converted_by_outfunc'
Stupid idea.

Best regards,
Zoltán Böszörményi

--
--
Zoltán Böszörményi
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de
 http://www.postgresql.at/




Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Pavel Stehule
2011/10/11 Boszormenyi Zoltan z...@cybertec.at:
 Hi,

 On 2011-10-11 14:23, Robert Haas wrote:

 On Tue, Oct 11, 2011 at 4:40 AM, Pavel Stehulepavel.steh...@gmail.com
  wrote:

 Hello

 A current limits of dynamic work with row types in PL/pgSQL can be
 decreased with a possible casts between rows and arrays. Now we have a
 lot of tools for arrays, and these tools should be used for rows too.

 postgres=# \d mypoint
 Composite type public.mypoint
  Column │  Type   │ Modifiers
 ┼─┼───
  a      │ integer │
  b      │ integer │

 postgres=# select cast(mypoint '(10,20)' as int[]);
   array
 
  {10,20}
 (1 row)

 postgres=# select cast(ARRAY[10,20] AS mypoint);
  mypoint
 ─
  (10,20)
 (1 row)

 What do you think about this idea?

 Well, a ROW can contain values of different types; an ARRAY can't.

 this reminds me that recently I thought about making anyelement
 a real type. anyelement[] would allow you to have different types in
 the same array. The real type OID and the data both would be stored and
 anyelement to cstring would reveal both in e.g.
 'oid,value_converted_by_outfunc'
 format. The anyelement to real type and any type to anyelement conversion
 would be painless.


 The problem is that anyelement (when the underlying type in not text) to
 text
 conversion would be ambiguous and give different answers:
 anyelement - cstring - text gives 'oid,value_converted_by_outfunc'
 anyelement - real type - cstring - text gives
 'value_converted_by_outfunc'
 Stupid idea.


it's close to a variant datatype - sometimes I have thought about something
like late binding - but my proposal is significantly simpler, because it
doesn't play with an automatic choice of a common subtype. It is based on
the user's choice.

Regards

Pavel

 Best regards,
 Zoltán Böszörményi

 --
 --
 Zoltán Böszörményi
 Cybertec Schönig  Schönig GmbH
 Gröhrmühlgasse 26
 A-2700 Wiener Neustadt, Austria
 Web: http://www.postgresql-support.de
     http://www.postgresql.at/






Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Merlin Moncure
On Tue, Oct 11, 2011 at 3:40 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 A current limits of dynamic work with row types in PL/pgSQL can be
 decreased with a possible casts between rows and arrays. Now we have a
 lot of tools for arrays, and these tools should be used for rows too.

 postgres=# \d mypoint
 Composite type public.mypoint
  Column │  Type   │ Modifiers
 ┼─┼───
  a      │ integer │
  b      │ integer │

 postgres=# select cast(mypoint '(10,20)' as int[]);
   array
 
  {10,20}
 (1 row)

 postgres=# select cast(ARRAY[10,20] AS mypoint);
  mypoint
 ─
  (10,20)
 (1 row)

 What do you think about this idea?

Not sure what it buys you over the syntax we already have:

select row(foo[1], bar[2]);
select array[(bar).a, (bar).b];

Also, in my coding of composite types, homogeneously typed rows don't
really come up that often...

merlin



Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Pavel Stehule
2011/10/11 Merlin Moncure mmonc...@gmail.com:
 On Tue, Oct 11, 2011 at 3:40 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 A current limits of dynamic work with row types in PL/pgSQL can be
 decreased with a possible casts between rows and arrays. Now we have a
 lot of tools for arrays, and these tools should be used for rows too.

 postgres=# \d mypoint
 Composite type public.mypoint
  Column │  Type   │ Modifiers
 ┼─┼───
  a      │ integer │
  b      │ integer │

 postgres=# select cast(mypoint '(10,20)' as int[]);
   array
 
  {10,20}
 (1 row)

 postgres=# select cast(ARRAY[10,20] AS mypoint);
  mypoint
 ─
  (10,20)
 (1 row)

 What do you think about this idea?

 Not sure what it buys you over the syntax we already have:

 select row(foo[1], bar[2]);
 select array[(bar).a, (bar).b];

You can do it manually for known combinations of rowtype and arraytype.
But the proposed casts do it generally - which makes sense mainly for
plpgsql functions or some sql functions.


 Also, in my coding of composite types, homogeneously typed rows don't
 really come up that often...

you can use everywhere text type.

When I wrote 
http://stackoverflow.com/questions/7711432/how-to-set-value-of-composite-variable-field-using-dynamic-sql/7722575#7722575
then I had to do a lot of string operations. The proposed casts
would make this much simpler - and they are general enough for general
usage.
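
To make the comparison concrete, here is a sketch of the manual conversion
that works today versus the proposed generic casts (the latter are not
existing syntax; they are the subject of this proposal):

```sql
CREATE TYPE mypoint AS (a int, b int);

-- possible today, but must be written out per type:
SELECT ARRAY[(p).a, (p).b] FROM (SELECT '(10,20)'::mypoint AS p) s;  -- {10,20}

-- the proposed generic casts (illustration only):
-- SELECT CAST('(10,20)'::mypoint AS int[]);
-- SELECT CAST(ARRAY[10,20] AS mypoint);
```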

Pavel


 merlin




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Florian Pflug
On Oct11, 2011, at 14:43 , David Fetter wrote:
 I'd recoil at not having ranges default to left-closed, right-open.
 The use case for that one is so compelling that I'm OK with making it
 the default from which deviations need to be specified.

The downside of that is that, as Tom pointed out upthread, we cannot
make [) the canonical representation of ranges. It'd require us to
increment the right boundary of a closed range, but that incremented
boundary might no longer be in the base type's domain.

So we'd end up with [) being the default for range construction,
but [] being the canonical representation, i.e. what you get back
when SELECTing a range (over a discrete base type).

Certainly not the end of the world, but is the convenience of being
able to write somerange(a, b) instead of somerange(a, b, '[)') really
worth it? I kind of doubt that...

best regards,
Florian Pflug




Re: [HACKERS] index-only scans

2011-10-11 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 Have you given any thought to what would be required to support
 index-only scans on non-btree indexes - particularly, GIST?  ISTM we
 might have had a thought that some GIST opclasses but not others would
 be able to regurgitate tuples, or maybe even that it might vary from
 index tuple to index tuple.  But that discussion was a long time ago,
 and my memory is fuzzy.

It would have to be a per-opclass property, for sure, since the basic
criterion is whether the opclass's compress function is lossy.
I don't think we can tolerate a situation where some of the index tuples
might be able to yield data and others not --- that would undo all the
improvements I've been working on over the past few days.

I haven't thought as far ahead as how we might get the information
needed for a per-opclass flag.  A syntax addition to CREATE OPERATOR
CLASS might be the only way.

regards, tom lane



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread David Fetter
On Tue, Oct 11, 2011 at 03:20:05PM +0200, Florian Pflug wrote:
 On Oct11, 2011, at 14:43 , David Fetter wrote:
  I'd recoil at not having ranges default to left-closed,
  right-open.  The use case for that one is so compelling that I'm
  OK with making it the default from which deviations need to be
  specified.
 
 The downside of that is that, as Tom pointed out upthread, we cannot
 make [) the canonical representation of ranges. It'd require us to
 increment the right boundary of a closed range, but that incremented
 boundary might no longer be in the base type's domain.
 
 So we'd end up with [) being the default for range construction, but
 [] being the canonical representation, i.e. what you get back when
 SELECTing a range (over a discrete base type).
 
 Certainly not the end of the world, but is the convenience of being
 able to write somerange(a, b) instead of somerange(a, b, '[)')
 really worth it? I kind of doubt that...

You're making a persuasive argument for the latter based solely on the
clarity.  If people see that 3rd element in the DDL, or need to
provide it, it's *very* obvious what's going on.

Cheers,
David (who suspects that having a syntax like somerange[a,b) just
won't work with the current state of parsers, etc.)
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Merlin Moncure
On Tue, Oct 11, 2011 at 8:18 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 2011/10/11 Merlin Moncure mmonc...@gmail.com:
 On Tue, Oct 11, 2011 at 3:40 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 A current limits of dynamic work with row types in PL/pgSQL can be
 decreased with a possible casts between rows and arrays. Now we have a
 lot of tools for arrays, and these tools should be used for rows too.

 postgres=# \d mypoint
 Composite type public.mypoint
  Column │  Type   │ Modifiers
 ────────┼─────────┼───────────
  a      │ integer │
  b      │ integer │

 postgres=# select cast(mypoint '(10,20)' as int[]);
   array
 ─────────
  {10,20}
 (1 row)

 postgres=# select cast(ARRAY[10,20] AS mypoint);
  mypoint
 ─
  (10,20)
 (1 row)

 What do you think about this idea?

 Not sure what it buys you over the syntax we already have:

 select row(foo[1], bar[2]);
 select array[(bar).a, (bar).b];

 You can do it manually for known combinations of rowtype and
 arraytype. But proposed casts do it generally - what has sense mainly
 for plpgsql functions or some sql functions.


 Also, in my coding of composite types, homogeneously typed rows don't
 really come up that often...

 you can use everywhere text type.

 When I wrote 
 http://stackoverflow.com/questions/7711432/how-to-set-value-of-composite-variable-field-using-dynamic-sql/7722575#7722575
 then I had to do lot of string operations. Proposed casts
 significantly do this simply - and it is enought general for general
 usage.

How does your approach compare to hstore?  hstore to me is just
enhanced generic container type which supports the operations you are
trying to do.  It can be trivially (as of 9.0) moved in an out of both
arrays and record types:

postgres=# create type foo_t as (a int, b text, c float);
CREATE TYPE

postgres=# select row(1, 'abc', 1.0)::foo_t  #= 'b=def';
 ?column?
---
 (1,def,1)
(1 row)
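
For completeness, a hedged sketch of going the other way with hstore
(this assumes the hstore module is installed and 9.0-era semantics;
output ordering is unspecified, so none is shown):

```sql
-- record -> hstore, then inspect its values as a text array
SELECT hstore(ROW(1, 'abc', 1.0)::foo_t);
SELECT avals(hstore(ROW(1, 'abc', 1.0)::foo_t));
```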

merlin



Re: [HACKERS] index-only scans

2011-10-11 Thread Alexander Korotkov
On Tue, Oct 11, 2011 at 5:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 I haven't thought as far ahead as how we might get the information
 needed for a per-opclass flag.  A syntax addition to CREATE OPERATOR
 CLASS might be the only way.

Shouldn't it be implemented through an additional interface function? There are
situations when restoring the original tuple requires some transformation.
For example, in point_ops we store box in the leaf index tuple, while point
can be easily restored from box.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Pavel Stehule
2011/10/11 Merlin Moncure mmonc...@gmail.com:
 On Tue, Oct 11, 2011 at 8:18 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 2011/10/11 Merlin Moncure mmonc...@gmail.com:
 On Tue, Oct 11, 2011 at 3:40 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 A current limits of dynamic work with row types in PL/pgSQL can be
 decreased with a possible casts between rows and arrays. Now we have a
 lot of tools for arrays, and these tools should be used for rows too.

 postgres=# \d mypoint
 Composite type public.mypoint
  Column │  Type   │ Modifiers
 ────────┼─────────┼───────────
  a      │ integer │
  b      │ integer │

 postgres=# select cast(mypoint '(10,20)' as int[]);
   array
 ─────────
  {10,20}
 (1 row)

 postgres=# select cast(ARRAY[10,20] AS mypoint);
  mypoint
 ─
  (10,20)
 (1 row)

 What do you think about this idea?

 Not sure what it buys you over the syntax we already have:

 select row(foo[1], bar[2]);
 select array[(bar).a, (bar).b];

 You can do it manually for known combinations of rowtype and
 arraytype. But proposed casts do it generally - what has sense mainly
 for plpgsql functions or some sql functions.


 Also, in my coding of composite types, homogeneously typed rows don't
 really come up that often...

 you can use everywhere text type.

 When I wrote 
 http://stackoverflow.com/questions/7711432/how-to-set-value-of-composite-variable-field-using-dynamic-sql/7722575#7722575
 then I had to do lot of string operations. Proposed casts
 significantly do this simply - and it is enought general for general
 usage.

 How does your approach compare to hstore?  hstore to me is just
 enhanced generic container type which supports the operations you are
 trying to do.  It can be trivially (as of 9.0) moved in an out of both
 arrays and record types:

hstore is fine for replacing a single value, but a cast to arrays is more
general - you can do tricks like table transposition, and you can use
all the tools that we already have for arrays.


 postgres=# create type foo_t as (a int, b text, c float);
 CREATE TYPE

 postgres=# select row(1, 'abc', 1.0)::foo_t  #= 'b=def';
  ?column?
 ---
  (1,def,1)
 (1 row)

In some future version I would like to have generic indexable types - and
then we could have a hash (hstore) in core, but casts to arrays or to
hashes can still be useful - in higher-level languages like plpgsql or SQL.

Pavel


 merlin




Re: [HACKERS] [v9.2] DROP statement reworks

2011-10-11 Thread Robert Haas
On Mon, Oct 10, 2011 at 1:38 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 I'm sorry again. I thought it was obvious from the filenames.

I guess I got confused because you re-posted part 2 without the other
parts, and I got mixed up and thought you were reposting part one.

I've committed a stripped-down version of the part one patch, which
had several mistakes even in just the part I committed - e.g., you
forgot TABLESPACEOID.  I also did some renaming for clarity.

I'm going to throw this back to you for rebasing at this point, which
I realize is going to be somewhat of an unenjoyable task given the way
I cut up your changes to objectaddress.c, but I wasn't very confident
that all of the entries were correct (the one for attributes seemed
clearly wrong to me, for example), and I didn't want to commit a bunch
of stuff that wasn't going to be exercised.  I suggest that you merge
the remainder of the part-one changes into part-two.  On the flip
side, I think you should take the stuff that deals with dropping
relations OUT of part two.  I don't see what good it does us to try to
centralize the drop logic if we still have to have special cases for
relations, so let's just leave that separate for now until we figure
out a better approach, or at least split it off as a separate patch so
that it doesn't hold up all the other changes.

I think get_object_namespace() needs substantial revision.  Instead of
passing the object type and the object address, why not just pass the
object address?  You should be able to use the classId in the address
to figure out everything you need to know.  Since this function is
private to objectaddress.c, there's no reason for it to use those
accessor functions - it can just iterate through the array just as
object_exists() does.  That way you also avoid iterating through the
array multiple times.  I also think that we probably ought to revise
AlterObjectNamespace() to make use of this new machinery, instead of
making the caller pass in all the same information that
objectaddress.c is now learning how to provide.  That would possibly
open the way to a bunch more consolidation of the SET SCHEMA code; in
fact, we might want to clean that up first, before dealing with the
DROP stuff.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] index-only scans

2011-10-11 Thread PostgreSQL - Hans-Jürgen Schönig
On Oct 7, 2011, at 8:47 PM, Joshua D. Drake wrote:

 
 On 10/07/2011 11:40 AM, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com  writes:
 Please find attached a patch implementing a basic version of
 index-only scans.
 
 I'm making some progress with this, but I notice what seems like a
 missing feature: there needs to be a way to turn it off.  Otherwise
 performance comparisons will be difficult to impossible.
 
 The most obvious solution is a planner control GUC, perhaps
 enable_indexonlyscan.  Anyone object, or want to bikeshed the name?
 
 enable_onlyindexscan
 
 I'm kidding.
 
 +1 on Tom's proposed name.


+1 ...
definitely an important thing to do.
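
If the GUC lands under the proposed name, usage would presumably follow the
other planner toggles - a sketch only, since neither the name nor the
behavior is final at this point:

```sql
SET enable_indexonlyscan = off;   -- force the planner to pick another plan
EXPLAIN ANALYZE SELECT count(*) FROM t WHERE id BETWEEN 500000 AND 500010;
SET enable_indexonlyscan = on;
```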

regards,

hans

--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt, Austria
Web: http://www.postgresql-support.de




Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Tom Lane
Pavel Stehule pavel.steh...@gmail.com writes:
 2011/10/11 Robert Haas robertmh...@gmail.com:
 On Tue, Oct 11, 2011 at 4:40 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 What do you think about this idea?

It's a bad one.

 Well, a ROW can contain values of different types; an ARRAY can't.

 yes, I know - but it should be a problem only in a few cases - when it is
 not possible to cast a row field to the array's element type.

This idea is basically the same as data types don't matter, which is
not SQL-ish and certainly not Postgres-ish.

regards, tom lane



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-10-11 Thread Bruce Momjian
Fujii Masao wrote:
 On Tue, Oct 11, 2011 at 6:37 AM, Simon Riggs si...@2ndquadrant.com wrote:
  On Mon, Oct 10, 2011 at 6:52 PM, Josh Berkus j...@agliodbs.com wrote:
 
  Tatsuo/Josh/Robert also discussed how recovery.conf can be used to
  provide parameters solely for recovery. That is difficult to do
  without causing all downstream tools to make major changes in the ways
  they supply parameters.
 
  Actually, this case is easily solved by an include recovery.conf
  parameter. So it's a non-issue.
 
  That is what I've suggested and yes, doing that is straightforward.
 
 Even if we do that, you still need to modify the tool so that it can handle
 the recovery trigger file. recovery.conf is used as just a configuration file
 (not recovery trigger file at all). It's not renamed to recovery.done at the
 end of recovery. If the tool depends on the renaming from recovery.conf
 to recovery.done, it also would need to be modified. If the tool needs to
 be changed anyway, why do you hesitate to change it so that it adds
 include recovery.conf into postgresql.conf automatically?
 
 Or you think that, to keep the backward compatibility completely,
 recovery.conf should be used as not only a configuration file but also a
 recovery trigger one and it should be renamed to recovery.done at
 the end of recovery?

As much as I appreciate Simon's work in this area, I think we are still
unclear if keeping backward-compatibility is worth the complexity
required for future users.  Historically we have been bold in changing
postgresql.conf settings to improve clarity, and that approach has
served us well.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] SET variable - Permission issues

2011-10-11 Thread Bruce Momjian
Kevin Grittner wrote:
 Joe Conway m...@joeconway.com wrote:
  On 10/10/2011 01:52 PM, Gurjeet Singh wrote:
  
  ALTER USER novice SET MIN_VAL OF statement_timeout TO '1';
  -- So that the user cannot turn off the timeout
  
  ALTER DATABASE super_reliable SET ENUM_VALS OF synchronous_commit
TO 'on';
  -- So that the user cannot change the synchronicity of
  transactions against this database.
  
  I like this better than GRANT/REVOKE on SET.
  
 +1
  
 I would really like a way to prevent normal users from switching
 from the default transaction isolation level I set.  This seems like
 a good way to do that.  Putting sane bounds on some other settings,
 more to protect against the accidental bad settings than malicious
 mischief, would be a good thing, too.

Is this a TODO?  We might not want to make work_mem SUSET, but it would
allow administrators to control this.
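
What exists today is only a per-role default, which the user can still
override; the proposal would add enforceable bounds. A sketch (the MIN_VAL
syntax is hypothetical, taken from Gurjeet's proposal quoted above):

```sql
-- possible today: a default the user can still change away from
ALTER ROLE novice SET statement_timeout = '1s';

-- proposed (hypothetical syntax): a floor the user cannot go below
-- ALTER USER novice SET MIN_VAL OF statement_timeout TO '1s';
```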

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Online base backup from the hot-standby

2011-10-11 Thread Jun Ishiduka

  I can't see a reason why we would use a new WAL record for this,
  rather than modify the XLOG_PARAMETER_CHANGE record type which was
  created for a very similar reason.
  The code would be much simpler if we just extend
  XLOG_PARAMETER_CHANGE, so please can we do that?
 
 Sure.
 
  The log message full_page_writes on master is set invalid more than
  once during online backup should read at least once rather than
  more than once.
 
 Yes.
 
  lastFpwDisabledLSN needs to be initialized.
 
 I think it doesn't need to be, because all values in XLogCtl are initialized to 0.
 
  Is there a reason to add lastFpwDisabledLSN onto the Control file? If
  we log parameters after every checkpoint then we'll know the values
  when we startup. If we keep logging parameters this way we'll end up
  with a very awkward and large control file. I would personally prefer
  to avoid that, but that thought could go either way. Let's see if
  anyone else thinks that also.
 
 Yes. I add to CreateCheckPoint().
 
 Image:
   CreateCheckPoint()
   {
      if (!shutdown && XLogStandbyInfoActive())
      {
         LogStandbySnapshot()
         XLogReportParameters()
      }
   }
 
   XLogReportParameters()
   {
      if (fpw == 'off' || ... )
         XLOGINSERT()
   }
 
 However, it'll write XLOG_PARAMETER_CHANGE at every checkpoint when FPW is
 'off'.
 (It will increase the amount of WAL.)
 Is it OK?

Done.

Updated patch attached.

Regards.


Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka@po.ntts.co.jp



standby_online_backup_09base-02fpw.patch
Description: Binary data



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Robert Haas
On Mon, Oct 10, 2011 at 3:59 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 I do have some concern about whether the performance improvements
 from reduced LW locking contention elsewhere in the code may (in
 whack-a-mole fashion) cause the percentages to go higher in SSI.
 The biggest performance issues in some of the SSI benchmarks were on
 LW lock contention, so those may become more noticeable as other
 contention is reduced.  I've been trying to follow along on the
 threads regarding Robert's work in that area, with hopes of applying
 some of the same techniques to SSI, but it's not clear whether I'll
 have time to work on that for the 9.2 release.  (It's actually
 looking improbable at this point.)

I ran my good old pgbench -S, scale factor 100, shared_buffers = 8GB
test on Nate Boley's box.  I ran it on both 9.1 and 9.2dev, and at all
three isolation levels.  As usual, I took the median of three 5-minute
runs, which I've generally found adequate to eliminate the noise.  On
both 9.1 and 9.2dev, read committed and repeatable read have basically
identical performance; if anything, repeatable read may be slightly
better - which would make sense, if it cuts down the number of
snapshots taken.

Serializable mode is much slower on this test, though.  On
REL9_1_STABLE, it's about 8% slower with a single client.  At 8
clients, the difference rises to 43%, and at 32 clients, it's 51%
slower.  On 9.2devel, raw performance is somewhat higher (e.g. +51% at
8 clients) but the performance when not using SSI has improved so much
that the performance gap between serializable and the other two
isolation levels is now huge: with 32 clients, in serializable mode,
the median result was 21114.577645 tps; in read committed,
218748.929692 tps - that is, read committed is running more than ten
times faster than serializable.  Data are attached, in text form and
as a plot.  I excluded the repeatable read results from the plot as
they just clutter it up - they're basically on top of the read
committed results.

I haven't run this with LWLOCK_STATS, but my seat-of-the-pants guess
is that there's a single lightweight lock that everything is
bottlenecking on.  One possible difference between this test case and
the ones you may have used is that this case involves lots and lots of
really short transactions that don't do much.  The effect of anything
that only happens once or a few times per transaction is really
magnified in this type of workload (which is why the locking changes
make so much of a difference here - in a longer or heavier-weight
transaction that stuff would be lost in the noise).
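
For anyone wanting to reproduce, the test as described boils down to
something like the following; the exact flags are assumptions reconstructed
from the description above, not Robert's actual script:

```shell
pgbench -i -s 100 bench                  # scale factor 100
for c in 1 8 16 32; do
  # -S: SELECT-only workload; three 5-minute runs per client count,
  # taking the median of the reported tps
  pgbench -S -T 300 -c $c -j $c bench
done
```

run once per isolation level (via default_transaction_isolation) against
both REL9_1_STABLE and 9.2devel.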

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
# clients PG91.read-committed PG91.repeatable-read PG91.serializable PG92dev.read-committed PG92dev.repeatable-read PG92dev.serializable
1 4309.029470 4309.786279 3997.033815 4420.435075 4432.991835 4158.783631
8 26881.573137 27105.962897 15604.313328 33369.189146 33744.015354 23607.824697
16 44578.177731 44347.193447 21687.135630 62690.803360 63391.583338 23308.801395
32 38725.876808 38773.523347 17995.772382 218748.929692 218928.089856 21114.577645
attachment: isolation-scaling.png


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 12:03 PM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote:
  Certainly not the end of the world, but is the convenience of being
  able to write somerange(a, b) instead of somerange(a, b, '[)')
  really worth it? I kind of doubt that...

 You're making a persuasive argument for the latter based solely on the
 clarity.  If people see that 3rd element in the DDL, or need to
 provide it, it's *very* obvious what's going on.

 That was how I originally thought, but we're also providing built-in
 range types like tsrange and daterange. I could see how if the former
 excluded the endpoint and the latter included it, it could be confusing.

 We could go back to having different constructor names for different
 inclusivity; e.g. int4range_cc(1,10). That at least removes the
 awkwardness of typing (and seeing) '[]'.

The cure seems worse than the disease.  What is so bad about '[]'?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread David Fetter
On Tue, Oct 11, 2011 at 12:09:01PM -0400, Robert Haas wrote:
 On Tue, Oct 11, 2011 at 12:03 PM, Jeff Davis pg...@j-davis.com wrote:
  On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote:
   Certainly not the end of the world, but is the convenience of being
   able to write somerange(a, b) instead of somerange(a, b, '[)')
   really worth it? I kind of doubt that...
 
  You're making a persuasive argument for the latter based solely on the
  clarity.  If people see that 3rd element in the DDL, or need to
  provide it, it's *very* obvious what's going on.
 
  That was how I originally thought, but we're also providing built-in
  range types like tsrange and daterange. I could see how if the former
  excluded the endpoint and the latter included it, it could be confusing.
 
  We could go back to having different constructor names for different
  inclusivity; e.g. int4range_cc(1,10). That at least removes the
  awkwardness of typing (and seeing) '[]'.
 
 The cure seems worse than the disease.  What is so bad about '[]'?

Nothing's bad about '[]' per se.  What's better, but possibly out of
the reach of our current lexing and parsing system, would be things
like:

[1::int, 10)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 12:12 PM, David Fetter da...@fetter.org wrote:
 Nothing's bad about '[]' per se.  What's better, but possibly out of
 the reach of our current lexing and parsing system, would be things
 like:

 [1::int, 10)

That's been discussed before.  Aside from the parser issues (which are
formidable) it would break brace-matching in most if not all commonly
used editors.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Jeff Davis
On Tue, 2011-10-11 at 12:09 -0400, Robert Haas wrote:
 The cure seems worse than the disease.  What is so bad about '[]'?

OK, so we stick with the 3-argument form. Do we have a default for the
third argument, or do we scrap it to avoid confusion?

There were some fairly strong objections to using '[]' as the default or
having the default vary between types. So, the only real option
remaining, if we do have a default, is '[)'.

Regards,
Jeff Davis





Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Jeff Davis
On Tue, 2011-10-11 at 06:28 -0700, David Fetter wrote:
  Certainly not the end of the world, but is the convenience of being
  able to write somerange(a, b) instead of somerange(a, b, '[)')
  really worth it? I kind of doubt that...
 
 You're making a persuasive argument for the latter based solely on the
 clarity.  If people see that 3rd element in the DDL, or need to
 provide it, it's *very* obvious what's going on.

That was how I originally thought, but we're also providing built-in
range types like tsrange and daterange. I could see how if the former
excluded the endpoint and the latter included it, it could be confusing.

We could go back to having different constructor names for different
inclusivity; e.g. int4range_cc(1,10). That at least removes the
awkwardness of typing (and seeing) '[]'.

Regards,
Jeff Davis




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 12:30 PM, Jeff Davis pg...@j-davis.com wrote:
 On Tue, 2011-10-11 at 12:09 -0400, Robert Haas wrote:
 The cure seems worse than the disease.  What is so bad about '[]'?

 OK, so we stick with the 3-argument form. Do we have a default for the
 third argument, or do we scrap it to avoid confusion?

 There were some fairly strong objections to using '[]' as the default or
 having the default vary between types. So, the only real option
 remaining, if we do have a default, is '[)'.

I think using '[)' is fine.  At some level, this is just a question of
expectations.  If you expect that int4range(1,4) will create a range
that includes 4, well, you're wrong.  Once you get used to it, it will
seem normal, and you'll know that you need to write
int4range(1,4,'[]') if that's what you want.  As long as the system is
designed around a set of consistent and well-thought-out principles,
people will get used to the details.  I don't see that the idea of a
half-open range over a discrete-valued type is particularly confusing
- we use them all the time in programming, when we make the end
pointer point to the byte following the end of the array, rather than
the last element - but even if it is, overall design consistency
trumps what someone may find to be the absolutely perfect behavior in
some particular case.  And saving typing is nearly always good -
unless it creates a LOT more confusion than I think this will.
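
Under the semantics Robert describes ('[)' as the construction default), the
expectation would be roughly the following - a sketch only, since the
constructor and operator names were still in flux in this discussion:

```sql
SELECT 4 <@ int4range(1, 4);        -- default '[)': upper bound excluded
SELECT 4 <@ int4range(1, 4, '[]');  -- explicitly closed: upper bound included
```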

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Bruce Momjian
Josh Berkus wrote:
 It occurs to me that we could really use two things to make it easier to
 move copies of database stuff around:
 
 pg_dump -r, which would include a CREATE ROLE for all roles needed to
 restore the database (probably without passwords), and
 
 pg_dumpall -r --no-passwords which would dump the roles but without
 CREATE PASSWORD statements.  This would be useful for cloning databases
 for use in Dev, Test and Staging, where you don't want to copy the md5s
 of passwords for possible cracking.

What would this do that pg_dumpall --globals-only doesn't?

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] COUNT(*) and index-only scans

2011-10-11 Thread Bruce Momjian
Greg Stark wrote:
 On Mon, Oct 10, 2011 at 9:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  My intention was to allow it to consider any covering index. You're
  thinking about the cost estimate, which is really entirely different.
 
 
 Is there any reason to consider more than one? I would have expected
 the narrowest one to be the best choice. There's something to be said
 for using the same index consistently but we already have that problem
 and make no attempt to do that. And partial indexes might be better
 but then we would already be considering them if their constraints are
 satisfied.

Actually, I think the smallest non-partial one on disk might be the best
--- that is very easy to find out.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Andrew Dunstan



On 10/11/2011 12:40 PM, Bruce Momjian wrote:

Josh Berkus wrote:

It occurs to me that we could really use two things to make it easier to
move copies of database stuff around:

pg_dump -r, which would include a CREATE ROLE for all roles needed to
restore the database (probably without passwords), and

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements.  This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't want to copy the md5s
of passwords for possible cracking.

What would this do that pg_dumpall --globals-only doesn't?



As stated, it would not export the passwords.

cheers

andrew



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote: 
 
 I ran my good old pgbench -S, scale factor 100, shared_buffers =
 8GB test on Nate Boley's box.  I ran it on both 9.1 and 9.2dev,
 and at all three isolation levels.  As usual, I took the median of
 three 5-minute runs, which I've generally found adequate to
 eliminate the noise. On both 9.1 and 9.2dev, read committed and
 repeatable read have basically identical performance; if anything,
 repeatable read may be slightly better - which would make sense,
 if it cuts down the number of snapshots taken.
 
Right.  Thanks for running this.  Could you give enough details to
allow reproducing on this end (or point to a previous post with the
details)?
 
 Serializable mode is much slower on this test, though.  On
 REL9_1_STABLE, it's about 8% slower with a single client.  At 8
 clients, the difference rises to 43%, and at 32 clients, it's 51%
 slower.  On 9.2devel, raw performance is somewhat higher (e.g.
 +51% at 8 clients) but the performance when not using SSI has
 improved so much that the performance gap between serializable and
 the other two isolation levels is now huge: with 32 clients, in
 serializable mode, the median result was 21114.577645 tps; in read
 committed, 218748.929692 tps - that is, read committed is running
 more than ten times faster than serializable.
 
Yeah.  I was very excited to see your numbers as you worked on that,
but I've been concerned that with the "Performance Whack-A-Mole"
nature of things (to borrow a term from Josh Berkus), SSI
lightweight locks might be popping their heads up.
 
 Data are attached, in text form and as a plot.  I excluded the
 repeatable read results from the plot as they just clutter it up -
 they're basically on top of the read committed results.
 
That was kind, but really the REPEATABLE READ results are probably
the more meaningful comparison, even if they are more embarrassing. 
:-(
 
 I haven't run this with LWLOCK_STATS, but my seat-of-the-pants
 guess is that there's a single lightweight lock that everything is
 bottlenecking on.
 
The lock in question is SerializableXactHashLock.  A secondary
problem is SerializableFinishedListLock, which is used for
protecting cleanup of old transactions.  This is per Dan's reports,
who had a better look at it on a 16-core machine, but it is consistent
with what I saw on fewer cores.
 
Early in development we had a bigger problem with
SerializablePredicateLockListLock, but Dan added a local map to
eliminate contention during lock promotion decision, and I reworked
that lock from the "SHARED read and EXCLUSIVE write" approach to the
"SHARED for accessing your own data and EXCLUSIVE for accessing data
for another process" technique.  Combined, that made the problems
with that negligible.
 
 One possible difference between this test case and the ones you
 may have used is that this case involves lots and lots of really
 short transactions that don't do much.
 
I did some tests like that, but not on a box with that many
processors, and I probably didn't try using a thread count more than
double the core count, so I probably never ran into the level of
contention you're seeing.  The differences at the low connection
counts are surprising to me.  Maybe it will make more sense when I
see the test case.  There's also some chance that late elimination
of some race conditions found in testing affected this, and I didn't
re-run those tests late enough to see that.  Not sure.
 
 The effect of anything that only happens once or a few times per
 transaction is really magnified in this type of workload (which is
 why the locking changes make so much of a difference here - in a
 longer or heavier-weight transaction that stuff would be lost in
 the noise).
 
Did these transactions write anything?  If not, were they declared
to be READ ONLY?  If they were, in fact, only reading, it would be
interesting to see what the performance looks like if the
recommendation to use the READ ONLY attribute is followed.  That's
at the top of the list of performance tips for SSI at:
 
http://www.postgresql.org/docs/9.1/interactive/transaction-iso.html#XACT-SERIALIZABLE
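For reference, the READ ONLY attribute the documentation recommends can be
applied per transaction or made the default; a minimal sketch (the table
name is hypothetical):

```sql
-- Declare a single serializable transaction read-only up front,
-- which allows SSI to do less bookkeeping for it:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT count(*) FROM accounts;
COMMIT;

-- Or make read-only the default when most transactions only read:
SET default_transaction_read_only = on;
```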
 
Anyway, this isolates a real issue, even if the tests exaggerate it
beyond what anyone is likely to see in production.  Once this CF is
over, I'll put a review of this at the top of my PG list.
 
-Kevin



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Bruce Momjian
Andrew Dunstan wrote:
 
 
 On 10/11/2011 12:40 PM, Bruce Momjian wrote:
  Josh Berkus wrote:
  It occurs to me that we could really use two things to make it easier to
  move copies of database stuff around:
 
  pg_dump -r, which would include a CREATE ROLE for all roles needed to
  restore the database (probably without passwords), and
 
  pg_dumpall -r --no-passwords which would dump the roles but without
  CREATE PASSWORD statements.  This would be useful for cloning databases
  for use in Dev, Test and Staging, where you don't want to copy the md5s
  of passwords for possible cracking.
  What would this do that pg_dumpall --globals-only doesn't?
 
 
 As stated, it would not export the passwords.

What is the logic for not dumping passwords but the CREATE ROLE
statement?  I don't see how anyone would recognize that behavior as
logical.  If you want to add a --no-passwords option to pg_dumpall, that
seems more logical to me.
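In the meantime, something close to a role dump without passwords can be
generated from pg_roles, which masks rolpassword by design.  A rough sketch
only -- it ignores role memberships, per-role settings, and most attributes:

```sql
-- Emit a bare CREATE ROLE statement per role, with no password clause:
SELECT 'CREATE ROLE ' || quote_ident(rolname)
       || CASE WHEN rolsuper    THEN ' SUPERUSER' ELSE '' END
       || CASE WHEN rolcreatedb THEN ' CREATEDB'  ELSE '' END
       || CASE WHEN rolcanlogin THEN ' LOGIN'     ELSE '' END
       || ';'
FROM pg_roles;
```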

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Simon Riggs
On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 How do we turn it on/off to allow the overhead to be measured?

 User REPEATABLE READ transactions or SERIALIZABLE transactions.  The
 easiest way, if you're doing it for all transactions (which I
 recommend) is to set default_transaction_isolation.
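The quoted suggestion corresponds to statements along these lines (the role
name is hypothetical):

```sql
-- For the current session only:
SET default_transaction_isolation = 'serializable';

-- For every future session of a given role:
ALTER ROLE app_user SET default_transaction_isolation = 'serializable';
```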

Most apps use mixed mode serializable/repeatable read and therefore
can't be changed by simple parameter. Rewriting the application isn't
a sensible solution.

I think it's clear that SSI should have had and still needs an off
switch for cases that cause performance problems.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 12:46 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 I ran my good old pgbench -S, scale factor 100, shared_buffers =
 8GB test on Nate Boley's box.  I ran it on both 9.1 and 9.2dev,
 and at all three isolation levels.  As usual, I took the median of
 three 5-minute runs, which I've generally found adequate to
 eliminate the noise. On both 9.1 and 9.2dev, read committed and
 repeatable read have basically identical performance; if anything,
 repeatable read may be slightly better - which would make sense,
 if it cuts down the number of snapshots taken.

 Right.  Thanks for running this.  Could you give enough details to
 allow reproducing on this end (or point to a previous post with the
 details)?

Sure, it's pretty much just a vanilla pgbench -S run, but the scripts
I used are attached here.  I build the head of each branch using the
test-build script and then used the runtestiso script to drive the
test runs.  These scripts are throwaway so they're not really
documented, but hopefully it's clear enough what it's doing.  The
server itself is a 32-core AMD 6128.

 Data are attached, in text form and as a plot.  I excluded the
 repeatable read results from the plot as they just clutter it up -
 they're basically on top of the read committed results.

 That was kind, but really the REPEATABLE READ results are probably
 the more meaningful comparison, even if they are more embarrassing.
 :-(

They're neither more nor less embarrassing - they're pretty much not
different at all.  I just didn't see any point in making a graph with
6 lines on it when you could only actually see 4 of them.

 Did these transactions write anything?  If not, were they declared
 to be READ ONLY?  If they were, in fact, only reading, it would be
 interesting to see what the performance looks like if the
 recommendation to use the READ ONLY attribute is followed.

pgbench -S doesn't do any writes, or issue any transaction control
statements.  It just fires off SELECT statements against a single
table as fast as it can, retrieving values from rows chosen at random.
 Each SELECT implicitly begins and ends a transaction.  Possibly the
system could gaze upon the SELECT statement and infer that the
one-statement transaction induced thereby can't possibly write any
tuples, and mark it read-only automatically, but I'm actually not that
excited about that approach - trying to fix the lwlock contention
that's causing the headache in the first place seems like a better use
of time, assuming it's possible to make some headway there.
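For concreteness, pgbench's select-only script is essentially this single
statement, with :aid chosen at random by pgbench for each execution:

```sql
-- The statement driven by pgbench -S (select-only mode):
SELECT abalance FROM pgbench_accounts WHERE aid = :aid;
```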

My general observation is that, on this machine, a lightweight lock
that is taken in exclusive mode by a series of lockers in quick
succession seems to max out around 16-20 clients, and the curve starts
to bend well before that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


test-build
Description: Binary data


runtestiso
Description: Binary data



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 1:11 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 How do we turn it on/off to allow the overhead to be measured?

 User REPEATABLE READ transactions or SERIALIZABLE transactions.  The
 easiest way, if you're doing it for all transactions (which I
 recommend) is to set default_transaction_isolation.

 Most apps use mixed mode serializable/repeatable read and therefore
 can't be changed by simple parameter. Rewriting the application isn't
 a sensible solution.

 I think it's clear that SSI should have had and still needs an off
 switch for cases that cause performance problems.

Is it possible that you are confusing the default level, which is READ
COMMITTED, with REPEATABLE READ?  I can't see why anyone would code up
their application to use REPEATABLE READ for some things and
SERIALIZABLE for other things unless they were explicitly trying to
turn SSI off for a subset of their transactions.  In all releases
prior to 9.0, REPEATABLE READ and SERIALIZABLE behaved identically, so
there wouldn't be any reason for a legacy app to mix-and-match between
the two.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] COUNT(*) and index-only scans

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 12:43 PM, Bruce Momjian br...@momjian.us wrote:
 Greg Stark wrote:
 On Mon, Oct 10, 2011 at 9:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  My intention was to allow it to consider any covering index.  You're
  thinking about the cost estimate, which is really entirely different.
 

 Is there any reason to consider more than one? I would have expected
 the narrowest one to be the best choice. There's something to be said
 for using the same index consistently but we already have that problem
 and make no attempt to do that. And partial indexes might be better
 but then we would already be considering them if their constraints are
 satisfied.

 Actually, I think the smallest non-partial one on disk might be the best
 --- that is very easy to find out.

I doubt there is any need to write special-purpose code to decide
which index ought to be used for a full table scan.  We can just throw
all of the otherwise-useless indexes into the costing machinery with
empty pathkeys, and let them duke it out.  All but the best one will
be instantly discarded, and the best one will either beat or lose to a
sequential scan.  All of this will happen before we start trying to
build join paths, so there's no combinatorial explosion in planning
time - it'll just be a straightforward cost comparison between plans
with identical pathkeys, and should be quite fast.

The real issue is that the costing estimates need to be accurate, and
that's where the rubber hits the road.  Otherwise, even if we pick the
right way to scan the table, we may do silly things up the line when
we go to start constructing the join order.  I think we need to beef
up ANALYZE to gather statistics on the fraction of the pages that are
marked all-visible, or maybe VACUUM should gather that information.
The trouble is that if we VACUUM and then ANALYZE, we'll often get
back a value very close to 100%, but then the real value may diminish
quite a bit before the next auto-analyze fires.  I think if we can
figure out what to do about that problem we'll be well on our way...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Simon Riggs
On Tue, Oct 11, 2011 at 6:14 PM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Oct 11, 2011 at 1:11 PM, Simon Riggs si...@2ndquadrant.com wrote:
 On Mon, Oct 10, 2011 at 11:31 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Simon Riggs si...@2ndquadrant.com wrote:

 How do we turn it on/off to allow the overhead to be measured?

 User REPEATABLE READ transactions or SERIALIZABLE transactions.  The
 easiest way, if you're doing it for all transactions (which I
 recommend) is to set default_transaction_isolation.

 Most apps use mixed mode serializable/repeatable read and therefore
 can't be changed by simple parameter. Rewriting the application isn't
 a sensible solution.

 I think it's clear that SSI should have had and still needs an off
 switch for cases that cause performance problems.

 Is it possible that you are confusing the default level, which is READ
 COMMITTED, with REPEATABLE READ?  I can't see why anyone would code up
 their application to use REPEATABLE READ for some things and
 SERIALIZABLE for other things unless they were explicitly trying to
 turn SSI off for a subset of their transactions.  In all releases
 prior to 9.0, REPEATABLE READ and SERIALIZABLE behaved identically, so
 there wouldn't be any reason for a legacy app to mix-and-match between
 the two.

Yes, I mistyped "read" when I meant "committed". You are right to
point out there is no problem if people were using repeatable read and
serializable.

Let me retype, so there is no confusion:

It's common to find applications that have some transactions
explicitly coded to use SERIALIZABLE mode, while the rest are in the
default mode READ COMMITTED. So common that the TPC-E benchmark has been
written as a representation of such workloads. The reason this is
common is that some transactions require SERIALIZABLE as a fix for
transaction problems.

If you alter the default_transaction_isolation then you will break
applications like this, so it is not a valid way to turn off SSI.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov wrote:
 
 Did these transactions write anything?  If not, were they
 declared to be READ ONLY?  If they were, in fact, only reading,
 it would be interesting to see what the performance looks like if
 the recommendation to use the READ ONLY attribute is followed.
 
 pgbench -S doesn't do any writes, or issue any transaction control
 statements.  It just fires off SELECT statements against a single
 table as fast as it can, retrieving values from rows chosen at
 random.  Each SELECT implicitly begins and ends a transaction.
 
So that test could be accomplished by setting
default_transaction_read_only to on.  That's actually what we're
doing, because we have a lot more of them than of read-write
transactions.  But, with the scripts I can confirm the performance
of that on this end.  It should be indistinguishable from the
repeatable read line; if not, there's something to look at there.
 
 Possibly the system could gaze upon the SELECT statement and infer
 that the one-statement transaction induced thereby can't possibly
 write any tuples, and mark it read-only automatically, but I'm
 actually not that excited about that approach
 
I wasn't intending to suggest that.  In fact I hadn't really thought
of it.  It might be a fun optimization, although it would be well
down my list, and it wouldn't be trivial because you couldn't use it
for any statements with volatile functions -- so the statement would
need to be planned far enough to know whether that was the case
before making this decision.  In fact, I'm not sure the community
would want to generate an error if a user marked a function other
than volatile and ran it in this way.  Definitely not something to
even look at any time soon.
 
 trying to fix the lwlock contention that's causing the headache in
 the first place seems like a better use of time, assuming it's
 possible to make some headway there.
 
Absolutely.  I just thought the timings with READ ONLY would make
for an interesting data point.  For one thing, it might reassure
people that even this artificial use case doesn't perform that
badly if the advice in the documentation is heeded.  For another, a
result slower than repeatable read would be a surprise that might
point more directly to the problem.
 
 My general observation is that, on this machine, a lightweight
 lock that is taken in exclusive mode by a series of lockers in
 quick succession seems to max out around 16-20 clients, and the
 curve starts to bend well before that.
 
OK, I will keep that in mind.
 
Thanks,
 
-Kevin



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 
 It's common to find applications that have some transactions
 explicitly coded to use SERIALIZABLE mode, while the rest are in
 the default mode READ COMMITTED. So common that TPC-E benchmark
 has been written as a representation of such workloads.
 
I would be willing to bet that any such implementations assume S2PL,
and would not prevent anomalies as expected unless all transactions
are serializable.
 
 The reason this is common is that some transactions require
 SERIALIZABLE as a fix for transaction problems.
 
That is a mode of thinking which doesn't work if you only assume
serializable provides the guarantees required by the standard.  Many
people assume otherwise.  It does *not* guarantee blocking on
conflicts, and it does not require that transactions appear to have
executed in the order of successful commit.  It requires only that
the result of concurrently running any mix of serializable
transactions produce a result consistent with some one-at-a-time
execution of those transactions.  Rollbacks of transactions to
prevent violations of that guarantee are allowed.  I don't see any
guarantees about how serializable transactions interact with
non-serializable transactions beyond each transaction not seeing any
of the phenomena prohibited for its isolation level.
 
 If you alter the default_transaction_isolation then you will break
 applications like this, so it is not a valid way to turn off SSI.
 
I don't follow you here.  What would break?  In what fashion?  Since
the standard allows any isolation level to provide more strict
transaction isolation than required, it would be conforming to
*only* support serializable transactions, regardless of the level
requested.  Not a good idea for some workloads from a performance
perspective, but it would be conforming, and any application which
doesn't work correctly with that is not written to the standard.
 
-Kevin



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Jeff Davis
On Tue, 2011-10-11 at 12:40 -0400, Robert Haas wrote:
 I think using '[)' is fine.  At some level, this is just a question of
 expectations.  If you expect that int4range(1,4) will create a range
 that includes 4, well, you're wrong.  Once you get used to it, it will
 seem normal, and you'll know that you need to write
 int4range(1,4,'[]') if that's what you want.  As long as the system is
 designed around a set of consistent and well-thought-out principles,
 people will get used to the details.  I don't see that the idea of a
 half-open range over a discrete-valued type is particularly confusing
 - we use them all the time in programming, when we make the end
 pointer point to the byte following the end of the array, rather than
 the last element - but even if it is, overall design consistency
 trumps what someone may find to be the absolutely perfect behavior in
 some particular case.  And saving typing is nearly always good -
 unless it creates a LOT more confusion than I think this will.

That sounds very reasonable to me.

Tom made an observation about '[1,INT_MAX]' throwing an error because
canonicalization would try to increment INT_MAX. But I'm not
particularly disturbed by it. If you want a bigger range, use int8range
or numrange -- the same advice we give to people who want unsigned
types. Or, for people who really need the entire range of signed int4
exactly, they can easily make their own range type that canonicalizes to
'[]'.
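Concretely, with the default '[)' canonicalization the two spellings relate
like this:

```sql
SELECT int4range(1, 4);        -- [1,4): contains 1, 2, 3
SELECT int4range(1, 4, '[]');  -- canonicalized to [1,5): contains 1..4
```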

Regards,
Jeff Davis




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Tom Lane
Jeff Davis pg...@j-davis.com writes:
 Tom made an observation about '[1,INT_MAX]' throwing an error because
 canonicalization would try to increment INT_MAX. But I'm not
 particularly disturbed by it. If you want a bigger range, use int8range
 or numrange -- the same advice we give to people who want unsigned
 types. Or, for people who really need the entire range of signed int4
 exactly, they can easily make their own range type that canonicalizes to
 '[]'.

I agree we shouldn't contort the entire design to avoid that corner
case.  We should, however, make sure that the increment *does* throw
an error, and not just silently overflow.
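In other words, the corner case should fail loudly; a sketch of the expected
behavior (exact error text aside):

```sql
-- Canonicalizing the closed upper bound means incrementing INT_MAX,
-- which should raise an overflow error rather than wrap around:
SELECT int4range(1, 2147483647, '[]');
```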

regards, tom lane



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread David Fetter
On Tue, Oct 11, 2011 at 12:18:18PM -0400, Robert Haas wrote:
 On Tue, Oct 11, 2011 at 12:12 PM, David Fetter da...@fetter.org wrote:
  Nothing's bad about '[]' per se.  What's better, but possibly out
  of the reach of our current lexing and parsing system, would be
  things like:
 
  [1::int, 10)
 
 That's been discussed before.  Aside from the parser issues (which
 are formidable) it would break brace-matching in most if not all
 commonly used editors.

That being the situation, ubiquitous support for the natural syntax
looks like it's a decade away, minimum. :(

Trying to be cheery,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] Proposal: casts row to array and array to row

2011-10-11 Thread Merlin Moncure
On Tue, Oct 11, 2011 at 8:36 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavel Stehule pavel.steh...@gmail.com writes:
 2011/10/11 Robert Haas robertmh...@gmail.com:
 On Tue, Oct 11, 2011 at 4:40 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 What do you think about this idea?

 It's a bad one.

 Well, a ROW can contain values of different types; an ARRAY can't.

 yes, I know - but it should be a problem only in a few cases - when it
 is not possible to cast a row field to an array field.

 This idea is basically the same as "data types don't matter", which is
 not SQL-ish and certainly not Postgres-ish.

hm. I agree, but if it were possible to create sql/plpgsql functions
accepting 'record', then you could at least rig the cast in userland
around hstore without resorting to hacky text manipulation and/or
flattening the record to text before doing the operation.

merlin



Re: [HACKERS] SET variable - Permission issues

2011-10-11 Thread Kevin Grittner
Bruce Momjian br...@momjian.us wrote:
 Kevin Grittner wrote:
 Joe Conway m...@joeconway.com wrote:
 On 10/10/2011 01:52 PM, Gurjeet Singh wrote:
  
 ALTER USER novice SET MIN_VAL OF statement_timeout TO '1';
 -- So that the user cannot turn off the timeout
 
 ALTER DATABASE super_reliable SET ENUM_VALS OF
   synchronous_commit TO 'on';
 -- So that the user cannot change the synchronicity of
 transactions against this database.
 
 I like this better than GRANT/REVOKE on SET.
  
 +1
  
 I would really like a way to prevent normal users from switching
 from the default transaction isolation level I set.  This seems
 like a good way to do that.  Putting sane bounds on some other
 settings, more to protect against the accidental bad settings
 than malicious mischief, would be a good thing, too.
 
 Is this a TODO?  We might not want to make work_mem SUSET, but it
 would allow administrators to control this.
 
Well, we've identified a few people who like the idea, but I'm not
sure we have the degree of consensus we normally look for before
putting something on the TODO list.  After the discussion on this
thread, are there still any *objections* to allowing bounds or
subsets to be SUSET to limit GUC values more strictly than the
limits hard-coded in C?
 
-Kevin



Re: [HACKERS] Index only scan paving the way for auto clustered tables?

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 7:08 AM, Royce Ausburn royce...@inomial.com wrote:
 I wonder, could the recent work on index only scans pave the way for auto 
 clustered tables?  Consider a wide, mostly insert table with some subset of 
 columns that I'd like to cluster on.  I'm after locality of tuples that are 
 very frequently fetched together, but not keen on the downtime for a cluster, 
 nor the maintenance that it requires.  Would it be a stretch to have an index 
 that branches on the subset of cluster columns, but still stores all the 
 columns, making it a covering index?  Given that we can already index 
 concurrently, such an index would not require downtime, and would be self 
 maintaining.  From my understanding of the index-only scan implementation, I 
 suspect that such an index would effectively give locality, with some caveats…

I guess we could do that, but I'm not convinced there would be much
benefit.  The only thing you'd be saving would be the cost of keeping
the tuples sorted by only the high-order columns rather than all of
them, and I doubt that's significant.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SET variable - Permission issues

2011-10-11 Thread Joe Conway
On 10/11/2011 11:53 AM, Kevin Grittner wrote:
 Bruce Momjian br...@momjian.us wrote:
 Is this a TODO?  We might not want to make work_mem SUSET, but it 
 would allow administrators to control this.
 
 Well, we've identified a few people who like the idea, but I'm not 
 sure we have the degree of consensus we normally look for before 
 putting something on the TODO list.

That's pretty much what I was thinking.

 After the discussion on this thread, are there still any *objections*
 to allowing bounds or subsets to be SUSET to limit GUC values more
 strictly than the limits hard-coded in C?

No objections here.

Joe

-- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support



Re: [HACKERS] Index only scan paving the way for auto clustered tables?

2011-10-11 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 [implement clustered index as a covering index with all columns
 which are present in the heap]

 I guess we could do that, but I'm not convinced there would be
 much benefit. 
 
The traditional way to implement a clustered index is to have the
leaf level of the index contain the tuples rather than pointers to
the tuples.  If we're going to do clustered tables, we might want to
jump all the way to that, rather than a half-way solution which
stores everything twice.
 
-Kevin



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-10-11 Thread Josh Berkus
On 10/10/11 9:53 PM, Fujii Masao wrote:
 Or you think that, to keep the backward compatibility completely,
 recovery.conf should be used as not only a configuration file but also a
 recovery trigger one and it should be renamed to recovery.done at
 the end of recovery?

That's precisely my point.  The trigger file nature of recovery.conf is
a problem in itself, and I don't see any way to support that and fix it
at the same time.  Maybe Simon can?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Index only scan paving the way for auto clustered tables?

2011-10-11 Thread Robert Haas
On Tue, Oct 11, 2011 at 3:02 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 [implement clustered index as a covering index with all columns
 which are present in the heap]
 I guess we could do that, but I'm not convinced there would be
 much benefit.

 The traditional way to implement a clustered index is to have the
 leaf level of the index contain the tuples rather than pointers to
 the tuples.  If we're going to do clustered tables, we might want to
 jump all the way to that, rather than a half-way solution which
 stores everything twice.

Not a bad thought.

Actually, I've been thinking for a while now that we might need a
pluggable heapam, similar to the pluggable indexams we already have.
Our current heap format has served us pretty well, but there are any
number of things that we can't really do without changing it.  Of
course, if we came up with a new format that was better in every case,
across the board, then perhaps we'd be willing to just replace the
current format outright -- though even then, that would break
pg_upgrade, which would be painful, to put it mildly.  And it seems to
me that there could easily be format changes that would make sense for
particular cases, but not across the board, like:

- index-organized tables (heap is a btree, and secondary indexes
reference the PK rather than the TID; this is how MySQL does it, and
Oracle offers it as an option)
- WORM tables (no updates or deletes, and no inserts after the creating
transaction commits, allowing a much smaller tuple header)
- non-transactional tables (tuples visible as soon as they're written,
again allowing for smaller tuple header; useful for internal stuff and
perhaps for insert-only log tables)

Alternatively, we could try to graft the concept of a self-clustering
table on top of the existing heap implementation.  But I'm having
trouble seeing how that would work.  The TODO describes it as
something like "maintain CLUSTER ordering", but that's a gross
oversimplification, because we have no structure that would allow us
to sensibly do any such thing...  the current heap implementation is
just that: a pile of stuff.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Intermittent regression test failure from index-only scans patch

2011-10-11 Thread Magnus Hagander
On Sun, Oct 9, 2011 at 06:34, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Oct 8, 2011, at 11:04 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm inclined to fix this by changing the test to examine idx_tup_read
 not idx_tup_fetch.  Alternatively, we could have the test force
 enable_indexonlyscan off.  Thoughts?

 No preference.

 I ended up doing it the second way (ie enable_indexonlyscan = off)
 because it turns out that pg_stat_user_tables doesn't have the
 idx_tup_read column --- we track that count per index, not per table.
 I could have complicated the test's stats queries some more, but it
 seemed quite not relevant to the goals of the test.

 Should we have another counter for heap fetches avoided?  Seems like that 
 could be useful to know.

 Hm.  I'm hesitant to add another per-table (or per index?) statistics
 counter because of the resultant bloat in the stats file.  But it
 wouldn't be a bad idea for somebody to take two steps back and rethink
 what we're counting in this area.  The current counter definitions are
 mostly backwards-compatible with pre-8.1 behavior, and it seems like the
 goalposts have moved enough that maybe it's time to break compatibility.

We certainly need *some* way to figure out if this has been used,
IMHO. So yeah, if the current way doesn't scale enough, we need to
think of some other way. But I'm not sure one more counter would
really bloat it that bad? OTOH, repeating that reasoning enough times
will eventually make it enough to care about...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Josh Berkus

 What is the logic for not dumping passwords but the CREATE ROLE
 statement?  I don't see how anyone would recognize that behavior as
 logical.  If you want to add a --no-passwords option to pg_dumpall, that
 seems more logical to me.

That's what I'm suggesting.

Incidentally, what's the difference between -g and -r in terms of actual
output, anyway?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Intermittent regression test failure from index-only scans patch

2011-10-11 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Sun, Oct 9, 2011 at 06:34, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Should we have another counter for heap fetches avoided?  Seems like that 
 could be useful to know.

 Hm.  I'm hesitant to add another per-table (or per index?) statistics
 counter because of the resultant bloat in the stats file.

 We certainly need *some* way to figure out if this has been used,
 IMHO. So yeah, if the current way doesn't scale enough, we need to
 think of some other way. But I'm not sure one more counter would
 really bloat it that bad? OTOH, repeating that reasoning enough times
 will eventually make it enough to care about...

You can already tell whether it's happening by comparing idx_tup_read
versus idx_tup_fetch.  Now that measure does conflate some things, like
whether the tuple was not read at all or was read and rejected as not
visible, but I'm not at all convinced that another counter is worth its
weight.  If invisible tuples are a significant part of the table then
index-only scanning isn't going to be very useful to you anyway.
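The comparison Tom describes can be read straight off the standard
statistics view. A sketch, assuming a table named t as in the earlier
test case:

```sql
-- Compare index entries read with tuples actually fetched from the
-- heap.  A fetch count well below the read count indicates that heap
-- visits are being skipped.
SELECT indexrelname, idx_tup_read, idx_tup_fetch,
       round(idx_tup_fetch::numeric / nullif(idx_tup_read, 0), 3)
         AS fetch_ratio
  FROM pg_stat_user_indexes
 WHERE relname = 't';
```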

regards, tom lane



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Bruce Momjian
Josh Berkus wrote:
 
  What is the logic for not dumping passwords but the CREATE ROLE
  statement?  I don't see how anyone would recognize that behavior as
  logical.  If you want to add a --no-passwords option to pg_dumpall, that
  seems more logical to me.
 
 That's what I'm suggesting.
 
 Incidentally, what's the difference between -g and -r in terms of actual
 output, anyway?

According to the docs, I assume -r dumps only roles, while -g also
includes tablespaces, so what you want is already available in pg_dumpall.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-10-11 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 On Oct11, 2011, at 14:43 , David Fetter wrote:
 I'd recoil at not having ranges default to left-closed, right-open.
 The use case for that one is so compelling that I'm OK with making it
 the default from which deviations need to be specified.

I agree with David on this.

 The downside of that is that, as Tom pointed out upthread, we cannot
 make [) the canonical representation of ranges.

Yeah, we certainly *can* do that, we just have to allow ranges that
include the last element of the domain to be corner cases that require
special handling.  If we don't want to just fail, we have to
canonicalize them to closed instead of open ranges.  It does not follow
that the default on input has to be closed.

Note that the INT_MAX case is probably not the worst issue in practice.
What is going to be an issue is ranges over enum types, where having the
last element being part of the range is a much more likely use-case.
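A sketch of the canonicalization under discussion, using the range
syntax from the patch (details may differ in the final version):

```sql
-- A closed upper bound over a discrete type canonicalizes to an open
-- bound one step higher, so [1,5] and [1,6) denote the same range:
SELECT int4range(1, 5, '[]');            -- displays as [1,6)

-- The corner case Tom mentions: the last element of the domain cannot
-- be bumped, so a range closed at INT_MAX needs special handling
-- (canonicalize to closed form, or fail):
SELECT int4range(1, 2147483647, '[]');   -- fails without such handling
```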

regards, tom lane



Re: [HACKERS] B-tree parent pointer and checkpoints

2011-10-11 Thread Bruce Momjian
Heikki Linnakangas wrote:
 On 11.03.2011 19:41, Tom Lane wrote:
  Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
  On 11.03.2011 17:59, Tom Lane wrote:
  But that will be fixed during WAL replay.
 
  Not under the circumstances that started the original thread:
 
  1. Backend splits a page
  2. Checkpoint starts
  3. Checkpoint runs to completion
  4. Crash
  (5. Backend never got to insert the parent pointer)
 
  WAL replay starts at the checkpoint redo pointer, which is after the
  page split record, so WAL replay won't insert the parent pointer. That's
  an incredibly tight window to hit in practice, but it's possible in theory.
 
  Hmm.  It's not so improbable that checkpoint would start inside that
  window, but that the parent insertion is still pending by the time the
  checkpoint finishes is pretty improbable.
 
  How about just reducing the deletion-time ERROR for missing downlink to a 
  LOG?
 
 Well, the code that follows expects to have a valid parent page locked, 
 so you can't literally do just that. But yeah, LOG and aborting the page 
 deletion seems fine to me.

Added to TODO:

Fix problem with btree page splits during checkpoints

http://archives.postgresql.org/pgsql-hackers/2010-11/msg00052.php
http://archives.postgresql.org/pgsql-hackers/2011-09/msg00184.php 

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Index only scan paving the way for auto clustered tables?

2011-10-11 Thread Florian Pflug
On Oct11, 2011, at 21:27 , Robert Haas wrote:
 Alternatively, we could try to graft the concept of a self-clustering
 table on top of the existing heap implementation.  But I'm having
 trouble seeing how that would work.  The TODO describes it as
 something like "maintain CLUSTER ordering", but that's a gross
 oversimplification, because we have no structure that would allow us
 to sensibly do any such thing...  the current heap implementation is
 just that: a pile of stuff.

We could still be smarter about where we insert new rows in a clustered
table, though.

Upon INSERT and UPDATE, we'd need to lookup the leaf page where the new
tuple will eventually go in the index we're supposed to maintain CLUSTER
for. Then we'd check if any of the pages referenced there contains enough
space, and if so place the new tuple there. If not it'd go at the end.

best regards,
Florian Pflug




Re: [HACKERS] SET variable - Permission issues

2011-10-11 Thread Dimitri Fontaine
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Well, we've identified a few people who like the idea, but I'm not
 sure we have the degree of consensus we normally look for before
 putting something on the TODO list.  After the discussion on this
 thread, are there still any *objections* to allowing bounds or
 subsets to be SUSET to limit GUC values more strictly than the
 limits hard-coded in C?

No objection here, I like this whole idea.  Adding the information
visible at the right places is a fun project in itself, too :)

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 10/11/2011 12:40 PM, Bruce Momjian wrote:
 Josh Berkus wrote:
 pg_dumpall -r --no-passwords which would dump the roles but without
 CREATE PASSWORD statements.  This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't want to copy the md5s
of passwords for possible cracking.

 What would this do that pg_dumpall --globals-only doesn't?

 As stated, it would not export the passwords.

I can see some possible use in a --no-passwords option that's orthogonal
to all else.  The rest of this seems rather confused...

regards, tom lane



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Simon Riggs
On Tue, Oct 11, 2011 at 6:44 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:

 If you alter the default_transaction_isolation then you will break
 applications like this, so it is not a valid way to turn off SSI.

 I don't follow you here.  What would break?  In what fashion?  Since
 the standard allows any isolation level to provide more strict
 transaction isolation than required, it would be conforming to
 *only* support serializable transactions, regardless of the level
 requested.  Not a good idea for some workloads from a performance
 perspective, but it would be conforming, and any application which
 doesn't work correctly with that is not written to the standard.

If the normal default_transaction_isolation = read committed and all
transactions that require serializable are explicitly marked in the
application then there is no way to turn off SSI without altering the
application. That is not acceptable, since it causes changes in
application behaviour and possibly also performance issues.

We should provide a mechanism to allow people to upgrade to 9.1+
without needing to change the meaning and/or performance of their
apps.

I strongly support the development of SSI, but I don't support
application breakage. We can have SSI without breaking anything for
people that can't or don't want to use it.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Greg Sabino Mullane
Robert Haas:
 Serializable mode is much slower on this test, though.  On
 REL9_1_STABLE, it's about 8% slower with a single client.  At 8
 clients, the difference rises to 43%, and at 32 clients, it's 51%
 slower.

Bummer. Thanks for putting some numbers out there; glad I was able 
to jump start a deeper look at this. Based on this thread so far, 
I am probably going to avoid serializable in this particular case, 
and stick to repeatable read. Once things are in place, perhaps I'll 
be able to try switching to serializable and get some measurements, 
but I wanted to see if the impact was minor enough to safely start 
with serializable. Seems not. :) Keep in mind this is not even a 
formal proposal yet for our client, so any benchmarks from me may 
be quite a while away.

Kevin Grittner:

 Did these transactions write anything?  If not, were they declared
 to be READ ONLY?  If they were, in fact, only reading, it would be
 interesting to see what the performance looks like if the
 recommendation to use the READ ONLY attribute is followed.

Yes, I'll definitely look into that, but the great majority of the 
things done in this case are read/write.

Simon Riggs:
 Most apps use mixed mode serializable/repeatable read and therefore
 can't be changed by simple parameter. Rewriting the application isn't
 a sensible solution.
 
 I think it's clear that SSI should have had and still needs an off
 switch for cases that cause performance problems.

Eh? It has an off switch: repeatable read.
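For applications that only ever rely on the default isolation level,
that downgrade needs no code changes (a sketch; mydb is a placeholder):

```sql
-- Sessions that never request an isolation level explicitly fall back
-- to pre-9.1 serializable-equivalent behavior with:
ALTER DATABASE mydb SET default_transaction_isolation = 'repeatable read';

-- Transactions that say SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
-- explicitly still get SSI, which is the case Simon is concerned about.
```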


Thanks for all replying to this thread, it's been very helpful.


-- 
Greg Sabino Mullane g...@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8




Re: [HACKERS] Index only scan paving the way for auto clustered tables?

2011-10-11 Thread Kääriäinen Anssi
Robert Haas wrote:

And it seems to me that there could easily be format changes that
would make sense for particular cases, but not across the board,
like:

- index-organized tables (heap is a btree, and secondary indexes
reference the PK rather than the TID; this is how MySQL does it, and
Oracle offers it as an option)
- WORM tables (no updates or deletes, and no inserts after the creating
transaction commits, allowing a much smaller tuple header)
- non-transactional tables (tuples visible as soon as they're written,
again allowing for smaller tuple header; useful for internal stuff and
perhaps for insert-only log tables)


This is probably a silly idea, but I have been wondering about the
following idea: Instead of having visibility info in the row header,
have a couple of row visibility slots in the page header. These slots
could be shared between rows in the page, so that if you do a bulk
insert/update/delete you would only use one slot. If the slots
overflow, you would use external slots buffer.

When the row is all visible, no slot would be used at all.

The xmin, xmax and cid would be in the slots. ctid would have its
current meaning, except when the external slots would be used,
then ctid would point to the external slot, and it would have the real
row header. I don't know if there would be any other row header
parts which could be shared.

The external slots buffer would then contain xmin, xmax, cid and
the real ctid.

Updates would write the new rows to another page in the heap,
and old rows would stay in place, just as now. So there would not
be any redo log like configuration. Also, the external slots buffer
would be small (18 bytes per row), so it would not get out of
cache too easily.

The performance would suck if you had lots of small updates, or
long running transactions. On the other hand in data warehousing,
where bulk loads are normal, and there are a lot of small rows,
this could actually work.

As said, this is probably a silly idea. But as pluggable heap types
came up, I thought to ask if this could actually work. If this kind of
wondering post is inappropriate for this list, please tell me so
that I can avoid them in the future.

 - Anssi Kääriäinen



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Greg Sabino Mullane
 If the normal default_transaction_isolation = read committed and all
 transactions that require serializable are explicitly marked in the
 application then there is no way to turn off SSI without altering the
 application. That is not acceptable, since it causes changes in
 application behaviour and possibly also performance issues.

Performance, perhaps. What application behavior changes? Fewer 
serialization conflicts?

 We should provide a mechanism to allow people to upgrade to 9.1+
 without needing to change the meaning and/or performance of their
 apps.

That ship has sailed.

-- 
Greg Sabino Mullane g...@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8




Re: [HACKERS] COUNT(*) and index-only scans

2011-10-11 Thread Simon Riggs
On Tue, Oct 11, 2011 at 5:45 AM, Greg Stark st...@mit.edu wrote:
 On Mon, Oct 10, 2011 at 9:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 My intention was to allow it to consider any covering index.  You're
 thinking about the cost estimate, which is really entirely different.


 Is there any reason to consider more than one? I would have expected
 the narrowest one to be the best choice. There's something to be said
 for using the same index consistently but we already have that problem
 and make no attempt to do that. And partial indexes might be better
 but then we would already be considering them if their constraints are
 satisfied.

You raise a fantastic idea: use an index's frequency of use as a
factor in its cost when optimising a query.

We have previously discussed the idea of using the RAM residency of an
index to control the cost. That is difficult to judge.

Using the long term prevalence of usage as a weighting factor makes a
great deal of sense for queries that could potentially utilise
multiple indexes. That information is readily available and directly
applicable. The prevalence of use directly drives RAM residency, so it
makes sense to use the causal factor as input to the cost.
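The prevalence statistic is already collected per index; a sketch of
how it can be inspected today, assuming the stats collector is enabled:

```sql
-- idx_scan counts how often each index has been chosen by the planner,
-- a rough proxy for how likely its pages are to be resident in RAM.
SELECT indexrelname, idx_scan
  FROM pg_stat_user_indexes
 WHERE relname = 't'
 ORDER BY idx_scan DESC;
```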

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-10-11 Thread Simon Riggs
On Tue, Oct 11, 2011 at 3:29 PM, Bruce Momjian br...@momjian.us wrote:

 As much as I appreciate Simon's work in this area, I think we are still
 unclear if keeping backward-compatibility is worth the complexity
 required for future users.  Historically we have been bold in changing
 postgresql.conf settings to improve clarity, and that approach has
 served us well.

You raise a good point. First, thank you for the respectful comment;
my viewpoint is not formed from resistance to change per se, even if
may appear to be so.  Thank you for raising that possibility to allow
me to explain and refute that.

I am genuinely concerned that we show respect to downstream software
that uses our APIs and have no personal or corporate ulterior motive.

Most people are used to the 3 year cycle of development on which
SQLServer and Oracle have now standardised. Our 1 year cycle provides
a considerable benefit in agility, but it also provides for x3
complexity in release management and a continual temptation to change
for no good reason. I want to encourage people to adopt our APIs, not
give them a headache for attempting to do so. We know that software
exists that follows the previous API and we should take steps to
deprecate that across multiple releases, with appropriate notice, just
as we do in other cases, such as standard conforming strings where our
lack of boldness is appropriate.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-10-11 Thread Bruce Momjian
Simon Riggs wrote:
 On Tue, Oct 11, 2011 at 3:29 PM, Bruce Momjian br...@momjian.us wrote:
 
  As much as I appreciate Simon's work in this area, I think we are still
  unclear if keeping backward-compatibility is worth the complexity
  required for future users. ?Historically we have been bold in changing
  postgresql.conf settings to improve clarity, and that approach has
  served us well.
 
 You raise a good point. First, thank you for the respectful comment;
 my viewpoint is not formed from resistance to change per se, even if
 may appear to be so.  Thank you for raising that possibility to allow
 me to explain and refute that.
 
 I am genuinely concerned that we show respect to downstream software
 that uses our APIs and have no personal or corporate ulterior motive.
 
 Most people are used to the 3 year cycle of development on which
 SQLServer and Oracle have now standardised. Our 1 year cycle provides
 a considerable benefit in agility, but it also provides for x3
 complexity in release management and a continual temptation to change
 for no good reason. I want to encourage people to adopt our APIs, not
 give them a headache for attempting to do so. We know that software
 exists that follows the previous API and we should take steps to
 deprecate that across multiple releases, with appropriate notice, just
 as we do in other cases, such as standard conforming strings where our
 lack of boldness is appropriate.

Well, let me be specific.  Around 2003 to 2006, we added many new
configuration parameters for logging, which required renaming or
removing older parameters.  There really wasn't a smooth way to allow
for this to be done without impacting users, and the current system we
have enjoyed since 2006 is logical only because we made the changes
necessary.

We can look at trying to phase changes in, but often the phasing becomes
more complicated than just doing the change.  Logging parameter changes
were easier because it was assumed logging was an admin-only task, as I
assume pitr and replication are as well.  Standard conforming strings
was tricky because it was more user-facing, or certainly SQL-facing.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Andrew Dunstan



On 10/11/2011 04:19 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 10/11/2011 12:40 PM, Bruce Momjian wrote:

Josh Berkus wrote:

pg_dumpall -r --no-passwords which would dump the roles but without
CREATE PASSWORD statements.  This would be useful for cloning databases
for use in Dev, Test and Staging, where you don't want to copy the md5s
of passwords for possible cracking.

What would this do that pg_dumpall --globals-only doesn't?

As stated, it would not export the passwords.

I can see some possible use in a --no-passwords option that's orthogonal
to all else.  The rest of this seems rather confused...




Yes, I also think that will meet the case.

cheers

andrew



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Simon Riggs
On Tue, Oct 11, 2011 at 9:21 PM, Greg Sabino Mullane g...@endpoint.com wrote:

 Simon Riggs:
 Most apps use mixed mode serializable/repeatable read and therefore
 can't be changed by simple parameter. Rewriting the application isn't
 a sensible solution.

 I think it's clear that SSI should have had and still needs an off
 switch for cases that cause performance problems.

 Eh? It has an off switch: repeatable read.

You mean: if we recode the application and retest it, we can get it to
work the same way as it used to.

To most people that is the same thing as "it doesn't work with this
release"; ask any application vendor.

There is no off switch and there should be.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] COUNT(*) and index-only scans

2011-10-11 Thread Josh Berkus

 The trouble is that if we VACUUM and then ANALYZE, we'll often get
 back a value very close to 100%, but then the real value may diminish
 quite a bit before the next auto-analyze fires.  I think if we can
 figure out what to do about that problem we'll be well on our way...

It's not so much an issue of when the last auto-analyze was as an issue
of the number of rows in write transactions against that table in the
last X minutes.  This is where it really hurts us that
pg_stat_user_tables is not time-based.
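With the cumulative counters that exist today, a write rate has to be
derived by sampling twice and differencing (a sketch):

```sql
-- Counters in pg_stat_user_tables are totals since the last stats
-- reset, so "rows written in the last X minutes" means running this
-- twice, X minutes apart, and subtracting:
SELECT relname, n_tup_ins + n_tup_upd + n_tup_del AS total_writes
  FROM pg_stat_user_tables
 WHERE relname = 't';
```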

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Bruce Momjian
Greg Sabino Mullane wrote:
  If the normal default_transaction_isolation = read committed and all
  transactions that require serializable are explicitly marked in the
  application then there is no way to turn off SSI without altering the
  application. That is not acceptable, since it causes changes in
  application behaviour and possibly also performance issues.
 
 Performance, perhaps. What application behavior changes? Less 
 serialization conflicts?
 
  We should provide a mechanism to allow people to upgrade to 9.1+
  without needing to change the meaning and/or performance of their
  apps.
 
 That ship has sailed.

Simon seems to value backward-compatibility more than the average
hackers poster.  The lack of complaints about 9.1 I think means that the
hackers' decision of _not_ providing a switch was the right one.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Bruce Momjian
Simon Riggs wrote:
 On Tue, Oct 11, 2011 at 6:44 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 
  If you alter the default_transaction_isolation then you will break
  applications like this, so it is not a valid way to turn off SSI.
 
  I don't follow you here.  What would break?  In what fashion?  Since
  the standard allows any isolation level to provide more strict
  transaction isolation than required, it would be conforming to
  *only* support serializable transactions, regardless of the level
  requested.  Not a good idea for some workloads from a performance
  perspective, but it would be conforming, and any application which
  doesn't work correctly with that is not written to the standard.
 
 If the normal default_transaction_isolation = read committed and all
 transactions that require serializable are explicitly marked in the
 application then there is no way to turn off SSI without altering the
 application. That is not acceptable, since it causes changes in
 application behaviour and possibly also performance issues.
 
 We should provide a mechanism to allow people to upgrade to 9.1+
 without needing to change the meaning and/or performance of their
 apps.
 
 I strongly support the development of SSI, but I don't support
 application breakage. We can have SSI without breaking anything for
 people that can't or don't want to use it.

The problem is that all the switches needed to avoid application
breakage make configuration of the server and the source code more
complicated.
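
To make the trade-off concrete, here is a minimal sketch of why changing the
server default cannot serve as an off switch for explicitly marked
transactions (syntax only; behavior as described in the thread):

```sql
-- An application written against 9.0 that marks its critical
-- transactions explicitly gets SSI on 9.1 no matter what the
-- server-wide default says:
SET default_transaction_isolation = 'read committed';

BEGIN;                                             -- uses the default
COMMIT;

BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE;    -- SSI in 9.1,
COMMIT;                                            -- snapshot isolation in 9.0

-- The only way to get the old behavior is to change the application:
BEGIN TRANSACTION ISOLATION LEVEL REPEATABLE READ; -- 9.1 name for the 9.0 behavior
COMMIT;
```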

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] index-only scans

2011-10-11 Thread Tom Lane
Alexander Korotkov aekorot...@gmail.com writes:
 On Tue, Oct 11, 2011 at 5:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I haven't thought as far ahead as how we might get the information
 needed for a per-opclass flag.  A syntax addition to CREATE OPERATOR
 CLASS might be the only way.
 
 Shouldn't it be implemented through an additional interface function? There are
 situations when restoring the original tuple requires some transformation.
 For example, in point_ops we store a box in the leaf index tuple, while the
 point can be easily restored from the box.

Hm.  I had been supposing that lossless compress functions would just be
no-ops.  If that's not necessarily the case then we might need something
different from the opclass's decompress function to get back the
original data.  However, that doesn't really solve the problem I'm
concerned about, because the existence and use of such a function would
be entirely internal to GiST.  There still needs to be a way for the
planner to know which opclasses support data retrieval.  And I do *not*
want to see us hard-wire "the presence of opclass function 8 means a
GiST opclass can return data" into the planner.

Maybe, instead of a simple constant amcanreturn column, we need an AM
API function that says whether the index can return data.

regards, tom lane



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Bruce Momjian
Josh Berkus wrote:
 
  According to the docs, I assume -r is only roles, while -g includes
  tablespaces, so what you want is already available in pg_dumpall.
 
 No, it's not.  You don't seem to be actually reading any of my proposals.
 
 (1) I cannot produce a single file in custom dump format which includes
 both a single database and all of the roles I need to build that database.
 
 (2) I cannot dump a set of roles without md5 passwords.
 
 Both of these are things I need to support dev/stage/testing integration
 at multiple sites.

We are not writing this software for you.  Please submit a clear
proposal.  I am sure you have 10k customers who want this.  :-|

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-10-11 Thread Simon Riggs
On Tue, Oct 11, 2011 at 9:28 PM, Bruce Momjian br...@momjian.us wrote:

 Standard conforming strings
 was tricky because it was more user-facing, or certainly SQL-facing.

Why is SQL more important than backup?

There is no good reason to do this so quickly.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Kevin Grittner
Simon Riggs si...@2ndquadrant.com wrote:
 Greg Sabino Mullane g...@endpoint.com wrote:
 
 Eh? It has an off switch: repeatable read.
 
 You mean: if we recode the application and retest it, we can get
 it to work same way as it used to.
 
 To most people that is the same thing as it doesn't work with
 this release, ask any application vendor.
 
 There is no off switch and there should be.
 
This was discussed at some length, and nobody seemed to favor a
behavior-changing GUC.  One example of such a thread is here:
 
http://archives.postgresql.org/pgsql-hackers/2009-05/msg01165.php
 
It came up at least a couple of other times, and the outcome was always
the same -- after discussion, nobody was in favor of a GUC to make
the semantics of these statements variable.  I'm sorry if you missed
those discussions.  It would certainly be a trivial change to
implement; the problem is convincing others that it's a good idea.
 
-Kevin



Re: [HACKERS] SET variable - Permission issues

2011-10-11 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Adding the information visible at the right places is a fun
 project in itself, too :)
 
 I was thinking a couple new columns in pg_settings (and what backs
 it) would be the main thing, but I haven't searched the source code
 yet.  Does something else leap to mind for you?

This isn't exactly a trivial matter.  What happens for instance if you
try to change the limit, and there are already active values outside the
limit in some processes?

regards, tom lane



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Greg Sabino Mullane
On Tue, Oct 11, 2011 at 04:32:45PM -0400, Bruce Momjian wrote:
...
 Simon seems to value backward-compatibility more than the average
 hackers poster.  The lack of complaints about 9.1, I think, means that the
 hackers' decision of _not_ providing a switch was the right one.

I wouldn't go that far: 9.1 is very new. Certainly the release notes do
not explain the change enough; that's part of the reason I wrote:

http://blog.endpoint.com/2011/09/postgresql-allows-for-different.html

Simon has a point, but I think that having applications switch from
serializable to repeatable read is a price people should
pay when going to 9.1, rather than adding some switch now.

-- 
Greg Sabino Mullane g...@endpoint.com
End Point Corporation
PGP Key: 0x14964AC8




Re: [HACKERS] index-only scans

2011-10-11 Thread Alexander Korotkov
On Wed, Oct 12, 2011 at 12:35 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Hm.  I had been supposing that lossless compress functions would just be
 no-ops.  If that's not necessarily the case then we might need something
 different from the opclass's decompress function to get back the
 original data.  However, that doesn't really solve the problem I'm
 concerned about, because the existence and use of such a function would
 be entirely internal to GiST.  There still needs to be a way for the
 planner to know which opclasses support data retrieval.  And I do *not*
 want to see us hard-wire "the presence of opclass function 8 means a
 GiST opclass can return data" into the planner.

 Maybe, instead of a simple constant amcanreturn column, we need an AM
 API function that says whether the index can return data.

I like the idea of such an AM API function. Since a single multicolumn index
can use multiple opclasses, the AM API function should also say *what* data
the index can return.
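
A per-column entry point along the lines Alexander suggests might look like
the following sketch; the name and signature are illustrative assumptions,
not actual PostgreSQL code:

```c
/* Hypothetical AM API function: instead of a constant amcanreturn
 * column in pg_am, the planner would ask the access method whether a
 * given index column can return its indexed data.  Name and parameters
 * are invented for illustration only. */
bool
amcanreturn(Relation indexRelation, int attno);
```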

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] ALTER EXTENSION .. ADD/DROP weirdness

2011-10-11 Thread Dimitri Fontaine
Robert Haas robertmh...@gmail.com writes:
 On Mon, Oct 10, 2011 at 2:52 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The underlying issue here is whether objects dependent on an extension
 member should have direct dependencies on the extension too, and if not,
 how do we prevent that?  The recordDependencyOnCurrentExtension calls
 don't have enough information to know what to do, I think.

I think the original patch, which didn't have the DEPENDENCY_EXTENSION
tracking but relied on the INTERNAL stuff, only recorded first-level
objects as a dependency.  Given the way INTERNAL dependencies are
followed, that kind of worked in a limited set of cases.

 Well, I'm not an expert on this code, but from a user perspective, I
 think it would be nicer if only the view ended up being a member of
 the extension, and the generated types did not.  Otherwise, writing an
 extension upgrade script requires detailed knowledge of what other
 objects are going to be generated internally.  In fact, it doesn't
 seem implausible that the set of internally generated objects from a
 given DDL command could change between releases, which would really be
 rather ugly here.

The reason the original patch got changed by Tom is, of course, that
it failed to work properly in some interesting cases.  Specifically,
handling both your use case and extension dependencies (earthdistance
depends on cube) is not so easy.  Figuring out when you're crossing a
dependency unit while recursing in pg_depend is a nice exercise if you
want to become very familiar with WITH RECURSIVE catalog queries.  Been
there, done that :)

The main test case is DROP EXTENSION earthdistance; adding CASCADE is
easier because you then don't care about stopping at the right place.

Of course I'm just trying to help you figure out why the problem is not
already solved, please feel free to come back with a design that make it
simple enough :)
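
To give a flavor of what such a catalog walk looks like, here is a hedged
sketch of a recursive pg_depend query; the starting OID and the lack of a
stopping condition are simplifications, not the actual extension code:

```sql
-- Sketch only: walk objects that (transitively) depend on one object.
-- Real code must also inspect deptype ('e', 'i', 'n', ...) to know
-- where one "dependency unit" ends and the next begins.
WITH RECURSIVE deps(classid, objid) AS (
    SELECT classid, objid
    FROM pg_depend
    WHERE refobjid = 16384          -- hypothetical extension OID
  UNION                              -- UNION (not ALL) guards against cycles
    SELECT d.classid, d.objid
    FROM pg_depend d
    JOIN deps ON d.refobjid = deps.objid
)
SELECT * FROM deps;
```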

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Simon Riggs
On Tue, Oct 11, 2011 at 9:32 PM, Bruce Momjian br...@momjian.us wrote:
 Greg Sabino Mullane wrote:
 -- Start of PGP signed section.
  If the normal default_transaction_isolation = read committed and all
  transactions that require serializable are explicitly marked in the
  application then there is no way to turn off SSI without altering the
  application. That is not acceptable, since it causes changes in
  application behaviour and possibly also performance issues.

  Performance, perhaps. What application behavior changes? Fewer
  serialization conflicts?

If you change default_transaction_isolation then the behaviour of the
application will change.


  We should provide a mechanism to allow people to upgrade to 9.1+
  without needing to change the meaning and/or performance of their
  apps.

 That ship has sailed.

 Simon seems to value backward-compatibility more than the average
 hackers poster.  The lack of complaints about 9.1, I think, means that the
 hackers' decision of _not_ providing a switch was the right one.

So it's been out one month and you think that is sufficient time for us
to decide that there are no user complaints about SSI? I doubt it.
Longer term I have every confidence that it will be appreciated.

I'm keen to ensure people enjoy the possibility of upgrading to the
latest release. The continual need to retest applications means that
very few users upgrade quickly or with anywhere near the frequency
with which we put out new releases. What is the point of rushing out
software that nobody can use? pg_upgrade doesn't change your
applications, so there isn't a fast path to upgrade in the way you
seem to think.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Dumping roles improvements?

2011-10-11 Thread Josh Berkus

 According to the docs, I assume -r is only roles, while -g includes
 tablespaces, so what you want is already available in pg_dumpall.

No, it's not.  You don't seem to be actually reading any of my proposals.

(1) I cannot produce a single file in custom dump format which includes
both a single database and all of the roles I need to build that database.

(2) I cannot dump a set of roles without md5 passwords.

Both of these are things I need to support dev/stage/testing integration
at multiple sites.
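
For context, the closest existing options look roughly like this (a sketch;
`-r` and `-g` as documented for pg_dumpall, which is why neither covers the
two requests above):

```shell
# Roles only -- but always including the md5 password hashes (point 2):
pg_dumpall --roles-only > roles.sql        # same as -r

# All globals: roles plus tablespaces:
pg_dumpall --globals-only > globals.sql    # same as -g

# A single database in custom format -- but with no way to bundle
# the roles it needs into the same file (point 1):
pg_dump -Fc mydb > mydb.dump
```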

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 There is no off switch and there should be.

As Greg said, that ship has sailed.  I believe that we specifically
discussed the notion of an off switch via a GUC or similar during
9.1 development, and rejected it on the grounds that GUCs changing
fundamental transactional behavior are dangerous.  I don't believe that
you've made a case for changing that decision, and even if you had,
it's too late; 9.1 is what it is.  Can we end this subthread please,
and concentrate on something actually useful, like improving SSI's
performance?

regards, tom lane



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-10-11 Thread Bruce Momjian
Simon Riggs wrote:
 On Tue, Oct 11, 2011 at 9:28 PM, Bruce Momjian br...@momjian.us wrote:
 
  Standard conforming strings
  was tricky because it was more user-facing, or certainly SQL-facing.
 
 Why is SQL more important than backup?

Because the percentage of database users it affects is different.
Administrators know when they are installing a new version of Postgres
and are probably already changing these configuration files.
Application binaries and perhaps application developers are not as aware
of a change, and there is a far higher percentage of them in an
organization than administrators.

 There is no good reason to do this so quickly.

I just gave you a reason above, and as I said, doing backward
compatibility can make the system more complex.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Kevin Grittner
Greg Sabino Mullane g...@endpoint.com wrote: 
 
 Kevin Grittner:
  
 Did these transactions write anything?  If not, were they
 declared to be READ ONLY?  If they were, in fact, only reading,
 it would be interesting to see what the performance looks like if
 the recommendation to use the READ ONLY attribute is followed.
 
 Yes, I'll definitely look into that, but the great majority of
 the things done in this case are read/write.
 
But it is precisely *because* those were fully cached read-only
transactions that the numbers came out so bad.  As Robert pointed
out, in other loads the difference in time per transaction could be
lost in the noise.
 
Now, I know SSI won't be a good fit for all applications, but you
might not want to write it off on performance grounds for an
application where "the great majority of the things done ... are
read/write" based on a test which ran only read-only transactions
without declaring them READ ONLY.
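
For readers unfamiliar with the attribute Kevin mentions, declaring it is a
one-line change per transaction; a minimal sketch (the table name `t` is
just a placeholder):

```sql
-- READ ONLY lets SSI use its safe-snapshot optimizations instead of
-- full predicate-lock tracking for the transaction:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY;
SELECT count(*) FROM t;
COMMIT;

-- In 9.1, DEFERRABLE additionally waits for a snapshot that cannot
-- cause or suffer a serialization failure at all:
BEGIN TRANSACTION ISOLATION LEVEL SERIALIZABLE READ ONLY DEFERRABLE;
SELECT count(*) FROM t;
COMMIT;
```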
 
-Kevin



Re: [HACKERS] index-only scans

2011-10-11 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 I haven't thought as far ahead as how we might get the information
 needed for a per-opclass flag.  A syntax addition to CREATE OPERATOR
 CLASS might be the only way.

It looks to me like it's related to the RECHECK property.  Maybe it's
just too late, though.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Overhead cost of Serializable Snapshot Isolation

2011-10-11 Thread Bruce Momjian
Simon Riggs wrote:
  Simon seems to value backward-compatibility more than the average
  hackers poster.  The lack of complaints about 9.1, I think, means that the
  hackers' decision of _not_ providing a switch was the right one.
 
 So it's been out one month and you think that is sufficient time for us
 to decide that there are no user complaints about SSI? I doubt it.
 Longer term I have every confidence that it will be appreciated.
 
 I'm keen to ensure people enjoy the possibility of upgrading to the
 latest release. The continual need to retest applications means that
 very few users upgrade quickly or with anywhere near the frequency
 with which we put out new releases. What is the point of rushing out
 software that nobody can use? pg_upgrade doesn't change your
 applications, so there isn't a fast path to upgrade in the way you
 seem to think.

Simon, I basically think you are swimming upstream on this issue, and
on the recovery.conf thread as well.  You can keep arguing that backward
compatibility warrants more effort, but until there is more general
agreement in the group, you are going to lose these arguments, and
frankly, the arguments are getting tiring.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +


