Re: [HACKERS] RFC: Making TRUNCATE more "MVCC-safe"

2012-02-10 Thread Noah Misch
On Fri, Feb 10, 2012 at 01:59:18PM -0500, Robert Haas wrote:
> On Fri, Feb 10, 2012 at 6:42 AM, Noah Misch  wrote:
> > I like the design you have chosen.  It would find applications beyond
> > TRUNCATE, so your use of non-specific naming is sound.  For example, older
> > snapshots will see an empty table "t" after "CREATE TABLE t AS SELECT 1"
> > commits; that's a comparable MVCC anomaly.  Some of our non-MVCC-safe commands
> > should perhaps just become MVCC-safe, but there will always be use cases for
> > operations that shortcut MVCC.  When one truly does want that, your proposal
> > for keeping behavior consistent makes plenty of sense.
> 
> I guess I'm not particularly excited by the idea of trying to make
> TRUNCATE MVCC-safe.  I notice that the example involves the REPEATABLE
> READ isolation level, which is already known to be busted in a variety
> of ways; that's why we now have SERIALIZABLE, and why most people use
> READ COMMITTED.  Are there examples of this behavior at other
> isolation levels?

I've yet to see an MVCC anomaly that one can reproduce at REPEATABLE READ and
not at READ COMMITTED.  They tend to be narrow race conditions at READ
COMMITTED, yet easy to demonstrate at REPEATABLE READ.  Related:
http://archives.postgresql.org/pgsql-performance/2011-02/msg00451.php

Incidentally, people use READ COMMITTED because they don't question the
default, not because they know the hazards of REPEATABLE READ.  I don't know the
bustedness you speak of; could we improve the documentation to inform folks?

> But I have to admit I'm intrigued by the idea of extending this to
> other cases, if there's a reasonable way to do that.  For example, if
> we could fix things up so that we don't see a table at all if it was
> created after we took our snapshot, that would remove one of the
> obstacles to marking pages bulk-loaded into that table with FrozenXID
> and PD_ALL_VISIBLE from the get-go.  I think a lot of people would be
> mighty happy about that.

Exactly.

> But the necessary semantics seem somewhat different.  For TRUNCATE,
> you want to throw a serialization error; but is that also what you
> want for CREATE TABLE?  Or do you just want it to appear empty?

I think an error helps just as much there.  If you create a table and populate
it with data in the same transaction, letting some snapshot see an empty table
is an atomicity failure.

Your comment illustrates a helpful point: this is just another kind of
ordinary serialization failure, one that can happen at any isolation level.
No serial transaction sequence can yield one of these errors.

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] plpgsql leaking memory when stringifying datums

2012-02-10 Thread Tom Lane
Jan Urbański  writes:
> While chasing a PL/Python memory leak, I did a few tests with PL/pgSQL
> and I think there are places where memory is not freed sufficiently early.

I think the basic issue here is that the type output function might
generate (and not bother to free) additional cruft besides its output
string, so that pfree'ing the output alone is not sufficient to avoid
a memory leak if the call occurs in a long-lived context.

However, I don't much care for the details of the proposed patch: if
we're going to fix this by running the output function in the per-tuple
memory context, and expecting the caller to do exec_eval_cleanup later,
why should we add extra pstrdup/pfree overhead?  We can just leave the
result in the temp context in most cases, and thus get a net savings
rather than a net cost from fixing this.  The attached modified patch
does it like that.
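In outline (a sketch of the approach; the attached patch is the
authoritative version), convert_value_to_string becomes something like:

static char *
convert_value_to_string(PLpgSQL_execstate *estate, Datum value, Oid valtype)
{
	char	   *result;
	MemoryContext oldcontext;
	Oid			typoutput;
	bool		typIsVarlena;

	/* allocate the output string (and any secondary cruft) per-tuple */
	oldcontext = MemoryContextSwitchTo(estate->eval_econtext->ecxt_per_tuple_memory);
	getTypeOutputInfo(valtype, &typoutput, &typIsVarlena);
	result = OidOutputFunctionCall(typoutput, value);
	MemoryContextSwitchTo(oldcontext);

	/* valid until the next exec_eval_cleanup(); callers must copy to keep it */
	return result;
}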

BTW, it occurs to me to wonder whether we need to worry about such
subsidiary leaks in type input functions as well.  I see at least one
place where pl_exec.c is tediously freeing the result of
exec_simple_cast_value, but if there are secondary leaks that's not
going to be good enough.  Maybe we should switch over to a similar
definition where the cast result is in the per-tuple context, and you've
got to copy it if you want it to be long-lived.

regards, tom lane


diff --git a/src/pl/plpgsql/src/pl_exec.c b/src/pl/plpgsql/src/pl_exec.c
index bf952b62478792b5564fe2af744a318322ea197c..e93b7c63be5d8a385b420dc0d9afa04cb90174a7 100644
*** a/src/pl/plpgsql/src/pl_exec.c
--- b/src/pl/plpgsql/src/pl_exec.c
*** static void exec_move_row(PLpgSQL_execst
*** 188,201 
  static HeapTuple make_tuple_from_row(PLpgSQL_execstate *estate,
  	PLpgSQL_row *row,
  	TupleDesc tupdesc);
! static char *convert_value_to_string(Datum value, Oid valtype);
! static Datum exec_cast_value(Datum value, Oid valtype,
  Oid reqtype,
  FmgrInfo *reqinput,
  Oid reqtypioparam,
  int32 reqtypmod,
  bool isnull);
! static Datum exec_simple_cast_value(Datum value, Oid valtype,
  	   Oid reqtype, int32 reqtypmod,
  	   bool isnull);
  static void exec_init_tuple_store(PLpgSQL_execstate *estate);
--- 188,204 
  static HeapTuple make_tuple_from_row(PLpgSQL_execstate *estate,
  	PLpgSQL_row *row,
  	TupleDesc tupdesc);
! static char *convert_value_to_string(PLpgSQL_execstate *estate,
! 		Datum value, Oid valtype);
! static Datum exec_cast_value(PLpgSQL_execstate *estate,
! Datum value, Oid valtype,
  Oid reqtype,
  FmgrInfo *reqinput,
  Oid reqtypioparam,
  int32 reqtypmod,
  bool isnull);
! static Datum exec_simple_cast_value(PLpgSQL_execstate *estate,
! 	   Datum value, Oid valtype,
  	   Oid reqtype, int32 reqtypmod,
  	   bool isnull);
  static void exec_init_tuple_store(PLpgSQL_execstate *estate);
*** plpgsql_exec_function(PLpgSQL_function *
*** 430,436 
  		else
  		{
  			/* Cast value to proper type */
! 			estate.retval = exec_cast_value(estate.retval, estate.rettype,
  			func->fn_rettype,
  			&(func->fn_retinput),
  			func->fn_rettypioparam,
--- 433,440 
  		else
  		{
  			/* Cast value to proper type */
! 			estate.retval = exec_cast_value(&estate, estate.retval,
! 			estate.rettype,
  			func->fn_rettype,
  			&(func->fn_retinput),
  			func->fn_rettypioparam,
*** exec_stmt_fori(PLpgSQL_execstate *estate
*** 1757,1763 
  	 * Get the value of the lower bound
  	 */
  	value = exec_eval_expr(estate, stmt->lower, &isnull, &valtype);
! 	value = exec_cast_value(value, valtype, var->datatype->typoid,
  			&(var->datatype->typinput),
  			var->datatype->typioparam,
  			var->datatype->atttypmod, isnull);
--- 1761,1767 
  	 * Get the value of the lower bound
  	 */
  	value = exec_eval_expr(estate, stmt->lower, &isnull, &valtype);
! 	value = exec_cast_value(estate, value, valtype, var->datatype->typoid,
  			&(var->datatype->typinput),
  			var->datatype->typioparam,
  			var->datatype->atttypmod, isnull);
*** exec_stmt_fori(PLpgSQL_execstate *estate
*** 1772,1778 
  	 * Get the value of the upper bound
  	 */
  	value = exec_eval_expr(estate, stmt->upper, &isnull, &valtype);
! 	value = exec_cast_value(value, valtype, var->datatype->typoid,
  			&(var->datatype->typinput),
  			var->datatype->typioparam,
  			var->datatype->atttypmod, isnull);
--- 1776,1782 
  	 * Get the value of the upper bound
  	 */
  	value = exec_eval_expr(estate, stmt->upper, &isnull, &valtype);
! 	value = exec_cast_value(estate, value, valtype, var->datatype->typoid,
  			&(var->datatype->typinput),
  			var->datatype->typioparam,
  			var->datatype->atttypmod, isnull);
*** exec_stmt_fori(PLpgSQL_execstate *estate
*** 1789,1795 
  	if (stmt->step)
  	{
  		value = exec_eval_exp

[HACKERS] bitfield and gcc

2012-02-10 Thread Gaetano Mendola
I wonder if somewhere in the Postgres source "we" are relying on the GCC
"correct behaviour" regarding the read-modify-write of bitfields in
structures.

Take a read at this https://lwn.net/Articles/478657/
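To make the hazard concrete, here is a minimal standalone sketch
(hypothetical struct, not from the Postgres tree): a store to one
bitfield is typically compiled as a read-modify-write of the whole
underlying word, so an unsynchronized store to a neighbouring bitfield
can be silently lost:

#include <pthread.h>

struct flags
{
	unsigned int a:1;		/* updated by thread 1 */
	unsigned int b:1;		/* updated by thread 2, shares a word with a */
};

static struct flags shared;

static void *
set_a(void *arg)
{
	/* typically compiled as: load word, OR in the bit, store word back */
	shared.a = 1;
	return NULL;
}

static void *
set_b(void *arg)
{
	/* can be lost if it lands between set_a's load and its store-back */
	shared.b = 1;
	return NULL;
}

int
main(void)
{
	pthread_t	t1,
				t2;

	pthread_create(&t1, NULL, set_a, NULL);
	pthread_create(&t2, NULL, set_b, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	/* expected 1 and 1, but one of the stores can occasionally vanish */
	return (shared.a == 1 && shared.b == 1) ? 0 : 1;
}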

Sorry if this was already mentioned.


Regards
Gaetano Mendola

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing special case OID generation

2012-02-10 Thread Jim Nasby

On 2/7/12 8:14 AM, Alvaro Herrera wrote:

Having one sequence for each toast table could be wasteful though.  I
mean, sequences are not the best use of shared buffer cache currently.
If we could have more than one sequence data in a shared buffer page,
things would be different.  Not sure how serious this really is.


This would actually be an argument for supporting multiple page sizes... too 
bad that's such a beast.

FWIW, from our most complex production database:

cnuapp_p...@postgres08.obr=# select relkind, count(*) from pg_class group by 1;
 relkind | count
---------+-------
 S       |   522
 r       |  1058
 t       |   698
 i       |  2894
 v       |   221
 c       |    12
(6 rows)
--
Jim C. Nasby, Database Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] initdb and fsync

2012-02-10 Thread Peter Eisentraut
On sön, 2012-02-05 at 10:53 -0800, Jeff Davis wrote:
> > initdb should do these syncs by default and offer an option to
> > disable them.
> 
> For test frameworks that run initdb often, that makes sense.
> 
> But for developers, it doesn't make sense to spend 0.5s typing an
> option
> that saves you 0.3s. So, we'd need some more convenient way to choose
> the no-fsync option, like an environment variable that developers can
> set. Or maybe developers don't care about 0.3s?
> 
You can use https://launchpad.net/libeatmydata for those cases.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Core Extensions relocation

2012-02-10 Thread Josh Berkus
All,

Andrew ran crake on these modules, and they do not add any links not
added by core postgres already.

As such, can we proceed with this patch?  Greg, do you have an updated
version to run against HEAD?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] pg_dump -s dumps data?!

2012-02-10 Thread Tom Lane
I wrote:
> Now, back to the original subject of this thread: both HEAD and 9.1 are
> now operating as designed, in that they will dump the (user-provided
> portion of the) contents of an extension config table whenever that
> extension is dumped, even if --schema is specified.

Or so I thought, anyway.  Further experimentation with depesz's example
shows that in HEAD, --schema is still able to block dumping of extension
config table contents, and the reason appears to be commit
a4cd6abcc901c1a8009c62a27f78696717bb8fe1, which added yet another set
of filtering conditions in a poorly chosen place; or possibly I should
say it made arbitrary changes in the definition of the --schema switch.
That patch needs some rethinking too, though I'm not sure what yet.

I also note that his example shows that if you have a selective dump
(say, with a -t switch), config table contents will be dumped even when
the owning extension is not.  This seems like a pretty clear bug:
getExtensionMembership should not be creating TableDataInfo objects for
extension config tables if the owning extension is not to be dumped.
Barring objections, I'll go fix and back-patch that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pl/perl and utf-8 in sql_ascii databases

2012-02-10 Thread Alex Hunsaker
On Thu, Feb 9, 2012 at 03:21, Christoph Berg  wrote:
> Hi,
>
> we have a database that is storing strings in various encodings (and
> non-encodings, namely the arbitrary byte soup [ ... ]
> For this reason, the database uses
> sql_ascii encoding

> ...snip...

> In sql_ascii databases, utf_e2u does not do any recoding, but then
> SvUTF8_on still marks the string as utf-8, while it isn't.
>
> (Returned values might also need fixing.)
>
> In my view, this is clearly a bug in pl/perl on sql_ascii databases.

Yeah, there was some musing about this over in:
http://archives.postgresql.org/pgsql-hackers/2011-02/msg01142.php

Seems like we missed the fact that we still did SvUTF8_on() in cstr2sv
and SvPVutf8() when turning a perl string into a cstring.

With the attached I get:
=> create or replace function perl_white(a text) returns text as $$
return shift; $$ language plperlu;
=> select perl_white(E'\200'), perl_white(E'\200')::bytea,
coalesce(perl_white(E'\200'), 'null');
 perl_white | perl_white | coalesce
------------+------------+----------
            | \x80       |
(1 row)

=> select perl_white(E'\401');
 perl_white
------------
 \x01
(1 row)

Does the attached fix the issue for you?

I'll note that all the PLs seem to behave a bit differently:

=> create or replace function py_white(a text) returns text as $$
return a; $$ language plpython3u;
=> select py_white(E'\200'), py_white(E'\200')::bytea,
coalesce(py_white(E'\200'), 'null');
 py_white | py_white | coalesce
----------+----------+----------
          |          | null
(1 row)

=>select py_white(E'\401');
 py_white
----------
 \x01
(1 row)

=> create or replace function tcl_white(text) returns text as $$
return $1; $$ language pltcl;
=> select tcl_white(E'\200'), tcl_white(E'\200')::bytea,
coalesce(tcl_white(E'\200'), 'null');
 tcl_white | tcl_white | coalesce
-----------+-----------+----------
           | \x80      |
(1 row)

 => select tcl_white(E'\402');
 tcl_white
-----------
 \x02
(1 row)
*** a/src/pl/plperl/plperl_helpers.h
--- b/src/pl/plperl/plperl_helpers.h
***
*** 5,23 
   * convert from utf8 to database encoding
   */
  static inline char *
! utf_u2e(const char *utf8_str, size_t len)
  {
! 	int 	enc = GetDatabaseEncoding();
! 
! 	char	   *ret = (char *) pg_do_encoding_conversion((unsigned char *) utf8_str, len, PG_UTF8, enc);
  
  	/*
! 	* when we are a PG_UTF8 or SQL_ASCII database
! 	* pg_do_encoding_conversion() will not do any conversion or
! 	* verification. we need to do it manually instead.
  	*/
  	if (enc == PG_UTF8 || enc == PG_SQL_ASCII)
! 		pg_verify_mbstr_len(PG_UTF8, utf8_str, len, false);
  
  	if (ret == utf8_str)
  		ret = pstrdup(ret);
--- 5,24 
   * convert from utf8 to database encoding
   */
  static inline char *
! utf_u2e(char *utf8_str, size_t len)
  {
! 	int		   enc = GetDatabaseEncoding();
! 	char	   *ret = utf8_str;
  
  	/*
! 	* when we are a PG_UTF8 or SQL_ASCII database pg_do_encoding_conversion()
! 	* will not do any conversion or verification. we need to do it manually
! 	* instead.
  	*/
  	if (enc == PG_UTF8 || enc == PG_SQL_ASCII)
! 		pg_verify_mbstr_len(enc, utf8_str, len, false);
! 	else
! 		ret = (char *) pg_do_encoding_conversion((unsigned char *) utf8_str, len, PG_UTF8, enc);
  
  	if (ret == utf8_str)
  		ret = pstrdup(ret);
***
*** 66,72  sv2cstr(SV *sv)
  		 * we are done */
  		SvREFCNT_inc(sv);
  
! 	val = SvPVutf8(sv, len);
  
  	/*
  	 * we use perl's length in the event we had an embedded null byte to ensure
--- 67,80 
  		 * we are done */
  		SvREFCNT_inc(sv);
  
! 	/*
! 	 * when SQL_ASCII just treat it as byte soup, that is fetch the string out
! 	 * however it is currently stored by perl
! 	 */
! 	if (GetDatabaseEncoding() == PG_SQL_ASCII)
! 		val = SvPV(sv, len);
! 	else
! 		val = SvPVutf8(sv, len);
  
  	/*
  	 * we use perl's length in the event we had an embedded null byte to ensure
***
*** 89,99  static inline SV *
  cstr2sv(const char *str)
  {
  	SV		   *sv;
! 	char	   *utf8_str = utf_e2u(str);
  
  	sv = newSVpv(utf8_str, 0);
  	SvUTF8_on(sv);
- 
  	pfree(utf8_str);
  
  	return sv;
--- 97,112 
  cstr2sv(const char *str)
  {
  	SV		   *sv;
! 	char	   *utf8_str;
! 
! 	/* no conversion when SQL_ASCII */
! 	if (GetDatabaseEncoding() == PG_SQL_ASCII)
! 		return newSVpv(str, 0);
! 
! 	utf8_str = utf_e2u(str);
  
  	sv = newSVpv(utf8_str, 0);
  	SvUTF8_on(sv);
  	pfree(utf8_str);
  
  	return sv;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2012-02-10 Thread Robert Haas
On Fri, Feb 10, 2012 at 2:23 PM, Jim Nasby  wrote:
> On 2/8/12 6:17 AM, Christian Nicolaisen wrote:
>> We have some tables with documents and their metadata (filename, filetype,
>> etc).
>> Most of the time we just want to get the metadata (filename, filetype,
>> etc) and not the document itself.
>> In this case it would be nice to have the metadata (which wouldn't end up
>> on the TOAST) on a fast array (SSD-based),
>> and put the filedata on a slow array (HDD-based). It's probably possible
>> to have two tables one for metadata and one
>> for the extra file, but this is a workaround.
>
> Did you intend to post a patch? Because nothing was attached. Also, if
> you're submitting a patch you should add it to the next commitfest.

He was commenting on the possible utility of a previously-submitted
patch, which got pushed off because we weren't sure it was good for
anything.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch : Allow toast tables to be moved to a different tablespace

2012-02-10 Thread Jim Nasby

On 2/8/12 6:17 AM, Christian Nicolaisen wrote:

Hi

We have some tables with documents and their metadata (filename, filetype, etc).
Most of the time we just want to get the metadata (filename, filetype, etc) and 
not the document itself.
In this case it would be nice to have the metadata (which wouldn't end up on 
the TOAST) on a fast array (SSD-based),
and put the filedata on a slow array (HDD-based). It's probably possible to 
have two tables one for metadata and one
for the extra file, but this is a workaround.

Did you intend to post a patch? Because nothing was attached. Also, if you're 
submitting a patch you should add it to the next commitfest.

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CLOG contention, part 2

2012-02-10 Thread Simon Riggs
On Fri, Feb 10, 2012 at 7:01 PM, Ants Aasma  wrote:
>
> On Feb 9, 2012 1:27 AM, "Robert Haas"  wrote:
>
>> However, there is a potential fly in the ointment: in other cases in
>> which we've reduced contention at the LWLock layer, we've ended up
>> with very nasty contention at the spinlock layer that can sometimes
>> eat more CPU time than the LWLock contention did.   In that light, it
>> strikes me that it would be nice to be able to partition the
>> contention N ways rather than just 2 ways.  I think we could do that
>> as follows.  Instead of having one control lock per SLRU, have N
>> locks, where N is probably a power of 2.  Divide the buffer pool for
>> the SLRU N ways, and decree that each slice of the buffer pool is
>> controlled by one of the N locks.  Route all requests for a page P to
>> slice P mod N.  Unlike this approach, that wouldn't completely
>> eliminate contention at the LWLock level, but it would reduce it
>> proportional to the number of partitions, and it would reduce spinlock
>> contention according to the number of partitions as well.  A down side
>> is that you'll need more buffers to get the same hit rate, but this
>> proposal has the same problem: it doubles the amount of memory
>> allocated for CLOG.
>
> Splitting the SLRU into different parts is exactly the same approach as
> associativity used in CPU caches. I found some numbers that analyze cache
> hit rate with different associativities:

My suggested approach is essentially identical to the one we already
use for partitioning the buffer cache and lock manager. I expect it
to be equally effective at reducing contention.

There is little danger of all requests hitting the same partition at
once, since there are many xids and they are served out sequentially.
In the lock manager case we use the relid as key, so there is some
skewing.
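A minimal sketch of that mapping (hypothetical names; N a power of two
so the modulo reduces to a bit mask):

#define NUM_SLRU_PARTITIONS		8		/* N */

typedef struct SlruPartition
{
	LWLockId	ControlLock;	/* guards this slice of the buffer pool */
	/* ... this slice's buffers and per-slice metadata ... */
} SlruPartition;

static SlruPartition partitions[NUM_SLRU_PARTITIONS];

/* Route all requests for page P to slice P mod N. */
static inline SlruPartition *
SlruPartitionFor(int pageno)
{
	return &partitions[pageno & (NUM_SLRU_PARTITIONS - 1)];
}

The buffer mapping and lock manager partitions (NUM_BUFFER_PARTITIONS,
NUM_LOCK_PARTITIONS) are selected the same way, just keyed by a hash
value rather than a page number.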

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] CLOG contention, part 2

2012-02-10 Thread Ants Aasma
On Feb 9, 2012 1:27 AM, "Robert Haas"  wrote:
> However, there is a potential fly in the ointment: in other cases in
> which we've reduced contention at the LWLock layer, we've ended up
> with very nasty contention at the spinlock layer that can sometimes
> eat more CPU time than the LWLock contention did.   In that light, it
> strikes me that it would be nice to be able to partition the
> contention N ways rather than just 2 ways.  I think we could do that
> as follows.  Instead of having one control lock per SLRU, have N
> locks, where N is probably a power of 2.  Divide the buffer pool for
> the SLRU N ways, and decree that each slice of the buffer pool is
> controlled by one of the N locks.  Route all requests for a page P to
> slice P mod N.  Unlike this approach, that wouldn't completely
> eliminate contention at the LWLock level, but it would reduce it
> proportional to the number of partitions, and it would reduce spinlock
> contention according to the number of partitions as well.  A down side
> is that you'll need more buffers to get the same hit rate, but this
> proposal has the same problem: it doubles the amount of memory
> allocated for CLOG.

Splitting the SLRU into different parts is exactly the same approach as
associativity used in CPU caches. I found some numbers that analyze cache
hit rate with different associativities:

http://research.cs.wisc.edu/multifacet/misc/spec2000cache-data/

Now obviously CPU cache access patterns are different from CLOG patterns,
but I think that the numbers strongly suggest that the reduction in hit
rate might be less than what you fear. For example, the harmonic mean of
data cache misses over all benchmarks for 16, 32 and 64 cache lines:

| Size | Direct    | 2-way LRU | 4-way LRU | 8-way LRU | Full LRU  |
|------+-----------+-----------+-----------+-----------+-----------|
| 1KB  | 0.0863842 | 0.0697167 | 0.0634309 | 0.0563450 | 0.0533706 |
| 2KB  | 0.0571524 | 0.0423833 | 0.0360463 | 0.0330364 | 0.0305213 |
| 4KB  | 0.0370053 | 0.0260286 | 0.0222981 | 0.0202763 | 0.0190243 |

As you can see, the reduction in hit rate is rather small down to 4-way
associative caches.

There may be a performance problem when multiple CLOG pages that happen to
sit in a single way become hot at the same time. The most likely case that
I can come up with is multiple scans going over unhinted pages created at
different time periods. If that is something to worry about, a technique
used for CPUs is to employ a fully associative victim cache behind
the main cache. If a CLOG page is evicted, it is transferred into the
victim cache, evicting a page from there. When a page isn't found in the
main cache, the victim cache is first checked for a possible hit. The
movement between the two caches doesn't need to involve any memory copying
- just swap pointers in metadata.
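That swap might look roughly like this (hypothetical structures, purely
to show that only metadata moves, never the page contents):

typedef struct SlruSlot
{
	char	   *page;			/* BLCKSZ buffer holding the page contents */
	int			pageno;			/* which CLOG page this slot holds */
} SlruSlot;

/*
 * Demote a page from the main cache into the victim cache (and promote
 * the displaced victim) by exchanging slot metadata only; the page
 * buffers themselves are never memcpy'd.
 */
static void
swap_with_victim_cache(SlruSlot *main_slot, SlruSlot *victim_slot)
{
	SlruSlot	tmp = *main_slot;

	*main_slot = *victim_slot;
	*victim_slot = tmp;
}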

The victim cache will bring back concurrency issues when the hit rate of
the main cache is small - like the pgbench example you mentioned. In that
case, a simple associative cache will allow multiple reads of clog pages
simultaneously. On the other hand - in that case lock contention seems to
be the symptom, rather than the disease. I think that those cases would be
better handled by increasing the maximum CLOG SLRU size. The increase in
memory usage should be a drop in the bucket for systems that have enough
transaction processing velocity for that to be a problem.

--
Ants Aasma


Re: [HACKERS] RFC: Making TRUNCATE more "MVCC-safe"

2012-02-10 Thread Robert Haas
On Fri, Feb 10, 2012 at 6:42 AM, Noah Misch  wrote:
> I like the design you have chosen.  It would find applications beyond
> TRUNCATE, so your use of non-specific naming is sound.  For example, older
> snapshots will see an empty table "t" after "CREATE TABLE t AS SELECT 1"
> commits; that's a comparable MVCC anomaly.  Some of our non-MVCC-safe commands
> should perhaps just become MVCC-safe, but there will always be use cases for
> operations that shortcut MVCC.  When one truly does want that, your proposal
> for keeping behavior consistent makes plenty of sense.

I guess I'm not particularly excited by the idea of trying to make
TRUNCATE MVCC-safe.  I notice that the example involves the REPEATABLE
READ isolation level, which is already known to be busted in a variety
of ways; that's why we now have SERIALIZABLE, and why most people use
READ COMMITTED.  Are there examples of this behavior at other
isolation levels?

But I have to admit I'm intrigued by the idea of extending this to
other cases, if there's a reasonable way to do that.  For example, if
we could fix things up so that we don't see a table at all if it was
created after we took our snapshot, that would remove one of the
obstacles to marking pages bulk-loaded into that table with FrozenXID
and PD_ALL_VISIBLE from the get-go.  I think a lot of people would be
mighty happy about that.

But the necessary semantics seem somewhat different.  For TRUNCATE,
you want to throw a serialization error; but is that also what you
want for CREATE TABLE?  Or do you just want it to appear empty?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix pg_dump for inherited defaults & not-null flags

2012-02-10 Thread Tom Lane
Robert Haas  writes:
> ... I'd lean toward back-patching.

Not hearing any contrary opinions, that's what I've done.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] auto_explain produces invalid JSON

2012-02-10 Thread Peter Eisentraut
The auto_explain module appears to create invalid JSON (in all versions
since 9.0).  For example, with the settings

auto_explain.log_format = 'json'
auto_explain.log_min_duration = 0

the query

select * from pg_type;

produces this in the log:

LOG:  duration: 529.808 ms  plan:
[
  "Query Text": "select * from pg_type;",
  "Plan": {
"Node Type": "Seq Scan",
"Relation Name": "pg_type",
"Alias": "pg_type",
"Startup Cost": 0.00,
"Total Cost": 9.87,
"Plan Rows": 287,
"Plan Width": 611
  }
]

Note that at the top level, it uses the array delimiters [ ] for what is
actually an object (key/value).  Running this through a JSON parser will
fail.

By contrast, EXPLAIN (FORMAT JSON) on the command line produces:

            QUERY PLAN
----------------------------------
 [
   {
     "Plan": {
       "Node Type": "Seq Scan",
       "Relation Name": "pg_type",
       "Alias": "pg_type",
       "Startup Cost": 0.00,
       "Total Cost": 9.87,
       "Plan Rows": 287,
       "Plan Width": 611
     }
   }
 ]
(1 row)

So there is evidently something out of sync between what EXPLAIN and
what auto_explain produce.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Greg Sabino Mullane


Robert Haas wrote:

> One thing that's been bugging me for a while is that the tab
> completion code all works by looking backward up to n words.  What we
> really want to know is what kind of statement we're in and where we
> are in it.  Absent other information, if we're in the target list of a
> SELECT statement (nested arbitrarily) that behavior is reasonable.  If
> we're someplace in a GRANT statement, or someplace in a CREATE
> statement where, say, a column name is expected, it's really not.

I played with this years ago, but readline does not really offer a 
good way to easily get what we want (the whole statement, chunked into 
nice bits to analyze). Of course at this point we should think about 
making things more generic so we can drop in whatever readline-alternative 
comes along in the future.

> Unfortunately, making the tab completion something other than
> incredibly stupid is likely to be an insane amount of work.

Insane amount of work? Check. Incredibly stupid? No, I think we've done 
pretty good given the limitations we have. You want to see incredibly 
stupid, see some of the *other* CLIs out there (hi, mysql! :)

-- 
Greg Sabino Mullane g...@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201202101157
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fix PL/Python metadata when there is no result

2012-02-10 Thread Jean-Baptiste Quenot
Dear hackers,

Thanks for the work on PLPython result metadata, it is very useful!  I
just came across a crash when trying to access this metadata on the
result of an UPDATE, which obviously cannot return any tuple (unless
you specify a RETURNING clause maybe?).

Please find attached a patch that solves this issue.  Instead of a PG
crash, we get the following message:

ERROR:  plpy.Error: no result fetched

All the best,
-- 
Jean-Baptiste Quenot


0001-Fix-PLPython-metadata-access-when-there-is-no-result.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Robert Haas
On Fri, Feb 10, 2012 at 11:22 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Feb 10, 2012 at 11:01 AM, Tom Lane  wrote:
>>> Well, if you want a patch with low standards, what about tab-completing
>>> function names anywhere that we do not see context suggesting something
>>> else?
>
>> I think that without a bit more contextual information that's likely
>> to lead to some odd results.  Unimplemented completions will lead to
>> bizarre things happening.
>
> True.  I was first thinking of doing this only if we know we're in
> a DML query, ie *first* word on the line is
> WITH/SELECT/INSERT/UPDATE/DELETE.  However, in the current
> implementation that is not terribly workable because we are only looking
> at the current line of text, not the whole input buffer; so making such
> a restriction would disable completion after the first line of a multi-
> line command.
>
>> One thing that's been bugging me for a while is that the tab
>> completion code all works by looking backward up to n words.
>
> Yup.  At the very least it would be good if it had access to the entire
> current command, so that we could sanity-check on the basis of the first
> word.

Agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 10, 2012 at 11:01 AM, Tom Lane  wrote:
>> Well, if you want a patch with low standards, what about tab-completing
>> function names anywhere that we do not see context suggesting something
>> else?

> I think that without a bit more contextual information that's likely
> to lead to some odd results.  Unimplemented completions will lead to
> bizarre things happening.

True.  I was first thinking of doing this only if we know we're in
a DML query, ie *first* word on the line is
WITH/SELECT/INSERT/UPDATE/DELETE.  However, in the current
implementation that is not terribly workable because we are only looking
at the current line of text, not the whole input buffer; so making such
a restriction would disable completion after the first line of a multi-
line command.

> One thing that's been bugging me for a while is that the tab
> completion code all works by looking backward up to n words.

Yup.  At the very least it would be good if it had access to the entire
current command, so that we could sanity-check on the basis of the first
word.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Upcoming PG back-branch releases, end of this month

2012-02-10 Thread Tom Lane
In view of the recently fixed data-corruption issues in hot standby
operation (bugs 6200, 6425), the core team feels we should push out back
branch update releases soon.  Due to unavailability of some key people,
the earliest feasible schedule turns out to be wrap Thursday 2/23 for
public release Monday Feb 27.  So that's what it will be.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Robert Haas
On Fri, Feb 10, 2012 at 11:01 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Feb 10, 2012 at 10:20 AM, Tom Lane  wrote:
>>> I'm not against tab-completing functions, if people think that's
>>> useful.  I am against tab-completing them in 1% of use-cases, which is
>>> what this patch accomplishes.  The fact that it's short doesn't make it
>>> good.
>
>> Our tab completion is in general very incomplete; we have made a
>> practice of cherry-picking the most commonly encountered cases and
>> handling only those.  Whether or not that is a good policy is a
>> philosophical question, but there is no reason to hold this particular
>> patch to a higher standard than the quality of our tab completion code
>> in general.
>
> Well, if you want a patch with low standards, what about tab-completing
> function names anywhere that we do not see context suggesting something
> else?  I really think that doing it only immediately after SELECT is
> going to prove far more of an annoyance than a help, because once you
> get used to relying on it you are going to wish it worked elsewhere.

I think that without a bit more contextual information that's likely
to lead to some odd results.  Unimplemented completions will lead to
bizarre things happening.

One thing that's been bugging me for a while is that the tab
completion code all works by looking backward up to n words.  What we
really want to know is what kind of statement we're in and where we
are in it.  Absent other information, if we're in the target list of a
SELECT statement (nested arbitrarily) that behavior is reasonable.  If
we're someplace in a GRANT statement, or someplace in a CREATE
statement where, say, a column name is expected, it's really not.

Unfortunately, making the tab completion something other than
incredibly stupid is likely to be an insane amount of work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 10, 2012 at 10:20 AM, Tom Lane  wrote:
>> I'm not against tab-completing functions, if people think that's
>> useful.  I am against tab-completing them in 1% of use-cases, which is
>> what this patch accomplishes.  The fact that it's short doesn't make it
>> good.

> Our tab completion is in general very incomplete; we have made a
> practice of cherry-picking the most commonly encountered cases and
> handling only those.  Whether or not that is a good policy is a
> philosophical question, but there is no reason to hold this particular
> patch to a higher standard than the quality of our tab completion code
> in general.

Well, if you want a patch with low standards, what about tab-completing
function names anywhere that we do not see context suggesting something
else?  I really think that doing it only immediately after SELECT is
going to prove far more of an annoyance than a help, because once you
get used to relying on it you are going to wish it worked elsewhere.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix pg_dump for inherited defaults & not-null flags

2012-02-10 Thread Robert Haas
On Fri, Feb 10, 2012 at 10:52 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Thu, Feb 9, 2012 at 6:21 PM, Tom Lane  wrote:
>>> Although this is a bug fix, it's a nontrivial change in the logic and
>>> so I'm hesitant to back-patch into stable branches.  Given the lack of
>>> prior complaints, maybe it would be best to leave it unfixed in existing
>>> branches?  Not sure.  Thoughts?
>
>> I guess I'd be in favor of back-patching it, if that doesn't look like
>> too much of a job.  We shouldn't assume that because only one person
>> reports a problem, no one else has been or will be affected.
>
> I don't think it's too much work --- what I'm more worried about is
> introducing new bugs.  If I apply it only in HEAD then it will go
> through a beta test cycle before anybody relies on it in production.
> I *think* the patch is okay, but I've been wrong before.

Well, I'm not going to second-guess your judgement too much, but my
general feeling is that it's worth taking a bit of risk to get pg_dump
to DTRT.  Dump and restore failures are extremely painful and
difficult to work around, so in my book they rank pretty highly in the
list of things that are important to get fixed.  So if you're on the
fence, I'd lean toward back-patching.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix pg_dump for inherited defaults & not-null flags

2012-02-10 Thread Tom Lane
Robert Haas  writes:
> On Thu, Feb 9, 2012 at 6:21 PM, Tom Lane  wrote:
>> Although this is a bug fix, it's a nontrivial change in the logic and
>> so I'm hesitant to back-patch into stable branches.  Given the lack of
>> prior complaints, maybe it would be best to leave it unfixed in existing
>> branches?  Not sure.  Thoughts?

> I guess I'd be in favor of back-patching it, if that doesn't look like
> too much of a job.  We shouldn't assume that because only one person
> reports a problem, no one else has been or will be affected.

I don't think it's too much work --- what I'm more worried about is
introducing new bugs.  If I apply it only in HEAD then it will go
through a beta test cycle before anybody relies on it in production.
I *think* the patch is okay, but I've been wrong before.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Robert Haas
On Fri, Feb 10, 2012 at 10:20 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> On Fri, Feb 10, 2012 at 1:24 AM, Tom Lane  wrote:
>>> That seems pretty nearly entirely bogus.  What is the argument for
>>> supposing that the word right after SELECT is a function name?
>
>> It isn't necessarily, but it might be.  It'd certainly be nice to type:
>> SELECT pg_si
>> and get:
>> SELECT pg_size_pretty(
>
> Yeah, and then you'll type
>
>        SELECT pg_size_pretty(pg_dat
>
> and get nothing, and curse the authors of such a misbegotten incomplete
> concept that leads your fingers to rely on something that doesn't work
> where it should.
>
> I'm not against tab-completing functions, if people think that's
> useful.  I am against tab-completing them in 1% of use-cases, which is
> what this patch accomplishes.  The fact that it's short doesn't make it
> good.

Our tab completion is in general very incomplete; we have made a
practice of cherry-picking the most commonly encountered cases and
handling only those.  Whether or not that is a good policy is a
philosophical question, but there is no reason to hold this particular
patch to a higher standard than the quality of our tab completion code
in general.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: fix pg_dump for inherited defaults & not-null flags

2012-02-10 Thread Robert Haas
On Thu, Feb 9, 2012 at 6:21 PM, Tom Lane  wrote:
> Attached is a proposed patch to deal with the issue described here:
> http://archives.postgresql.org/pgsql-bugs/2012-02/msg0.php
>
> Even though we'd previously realized that comparing the text of
> inherited CHECK expressions is an entirely unsafe way to detect
> expression equivalence (cf comments for guessConstraintInheritance),
> pg_dump is still doing that for inherited DEFAULT expressions, with
> the predictable result that it does the wrong thing in this sort
> of example.
>
> Furthermore, as I looked more closely at the code, I realized that
> there is another pretty fundamental issue: if an inherited column has a
> default expression or NOT NULL bit that it did not inherit from its
> parent, flagInhAttrs forces the column to be treated as non-inherited,
> so that it will be emitted as part of the child table's CREATE TABLE
> command.  This is *wrong* if the column is not attislocal; it will
> result in the column incorrectly having the attislocal property after
> restore.  (Note: such a situation could only arise if the user had
> altered the column's default or NOT NULL property with ALTER TABLE after
> creation.)
>
> All of this logic predates the invention of attislocal, and really is
> attempting to make up for the lack of that bit, so it's not all that
> surprising that it falls down.
>
> So the attached patch makes the emit-column-or-not behavior depend only
> on attislocal (except for binary upgrade which has its own kluge
> solution for the problem; though note that the whether-to-dump tests now
> exactly match the special cases for binary upgrade, which they did not
> before).  Also, I've dropped the former attempts to exploit inheritance
> of defaults, and so the code will now emit a default explicitly for each
> child in an inheritance hierarchy, even if it didn't really need to.
> Since the backend doesn't track whether defaults are inherited, this
> doesn't cause any failure to restore the catalog state properly.
>
> Although this is a bug fix, it's a nontrivial change in the logic and
> so I'm hesitant to back-patch into stable branches.  Given the lack of
> prior complaints, maybe it would be best to leave it unfixed in existing
> branches?  Not sure.  Thoughts?

I guess I'd be in favor of back-patching it, if that doesn't look like
too much of a job.  We shouldn't assume that because only one person
reports a problem, no one else has been or will be affected.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Progress on fast path sorting, btree index creation time

2012-02-10 Thread Peter Geoghegan
On 9 February 2012 14:51, Robert Haas  wrote:
> I'm not sure I entirely follow all this, but I'll look at the code
> once you have it.

I have attached a revision of the patch, with the adjustments already
described. Note that I have not made this support btree tuplesorting
yet, as there is an impedance mismatch that must be resolved,
particularly with the SortSupport stuff, and I wanted to know what you
think of the multiple key specialisation first. Arguably, we could get
away with only a single specialisation - I haven't really thought about
it much.

You say "Well, how often will we sort 10,000 integers?", and I think
that btree index creation is one very common and useful case, so I'd
like to look at that in more detail. I certainly don't see any reason
to not do it too.

This should give you performance for sorting multiple keys that is
almost as good as the single-key optimisation that you found to be
more compelling. Obviously the need to actually call comparetup_heap
to look at non-leading sortkeys will vary from case to case, and this
is based on your test case, where there are no duplicates and thus no
need to ever do that. That isn't too far from representative, as I
think that in general, a majority of comparisons won't result in
equality.
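In outline, the multiple-key specialisation amounts to something like
this (a simplified sketch of what the template expands to, not the
patch text itself):

/*
 * Simplified sketch: inline comparison of the leading sortkey, with a
 * fallback to the general routine only when the leading keys tie.
 */
static inline int
mult_comparetup_inline(const SortTuple *a, const SortTuple *b,
					   Tuplesortstate *state)
{
	int			compare;

	compare = ApplySortComparator(a->datum1, a->isnull1,
								  b->datum1, b->isnull1,
								  state->sortKeys);
	if (compare != 0)
		return compare;

	/* leading keys equal: let comparetup_heap examine the other keys */
	return state->comparetup(a, b, state);
}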

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 1452e8c..e040ace 100644
*** a/src/backend/utils/sort/tuplesort.c
--- b/src/backend/utils/sort/tuplesort.c
***
*** 113,118 
--- 113,119 
  #include "utils/pg_rusage.h"
  #include "utils/rel.h"
  #include "utils/sortsupport.h"
+ #include "utils/template_qsort_arg.h"
  #include "utils/tuplesort.h"
  
  
*** struct Tuplesortstate
*** 281,286 
--- 282,301 
  	int			currentRun;
  
  	/*
+ 	 * Will invocations of a tuple-class encapsulating comparator
+ 	 * (comparetup_heap, comparetup_index_btree, etc) skip the leading key,
+ 	 * because that has already been compared elsewhere?
+ 	 */
+ 	bool		skipLeadingkey;
+ 
+ 	/*
+ 	 * This specialization function pointer is sometimes used as an alternative
+ 	 * to the standard qsort_arg, when it has been determined that we can
+ 	 * benefit from various per number-of-sortkey performance optimizations.
+ 	 */
+ 	void (*qsort_arg_spec)(void *a, size_t n, size_t es, void *arg);
+ 
+ 	/*
  	 * Unless otherwise noted, all pointer variables below are pointers to
  	 * arrays of length maxTapes, holding per-tape data.
  	 */
*** static void readtup_datum(Tuplesortstate
*** 492,497 
--- 507,519 
  static void reversedirection_datum(Tuplesortstate *state);
  static void free_sort_tuple(Tuplesortstate *state, SortTuple *stup);
  
+ /*
+  * A set of type-neutral specializations, for single sortkey and multiple
+  * sortkey sorts. These specialization are used for sorting both heap and
+  * btree index tuples that meet that criteria.
+  */
+ DO_TEMPLATE_QSORT_ARG(single, ONE_ADDITIONAL_CODE, single_comparetup_inline)
+ DO_TEMPLATE_QSORT_ARG(mult, MULTI_ADDITIONAL_CODE, mult_comparetup_inline)
  
  /*
   *		tuplesort_begin_xxx
*** tuplesort_begin_heap(TupleDesc tupDesc,
*** 631,636 
--- 653,661 
  		PrepareSortSupportFromOrderingOp(sortOperators[i], sortKey);
  	}
  
+ 	state->qsort_arg_spec =
+ 		nkeys==1?single_qsort_arg:mult_qsort_arg;
+ 	state->skipLeadingkey = true;
  	MemoryContextSwitchTo(oldcontext);
  
  	return state;
*** tuplesort_performsort(Tuplesortstate *st
*** 1222,1232 
  			 * amount of memory.  Just qsort 'em and we're done.
  			 */
  			if (state->memtupcount > 1)
! 				qsort_arg((void *) state->memtuples,
! 						  state->memtupcount,
! 						  sizeof(SortTuple),
! 						  (qsort_arg_comparator) state->comparetup,
! 						  (void *) state);
  			state->current = 0;
  			state->eof_reached = false;
  			state->markpos_offset = 0;
--- 1247,1273 
  			 * amount of memory.  Just qsort 'em and we're done.
  			 */
  			if (state->memtupcount > 1)
! 			{
! 				/* Use a sorting specialization if available */
! 				if (state->qsort_arg_spec)
! 					/* specialization available */
! 					state->qsort_arg_spec(
! 						(void *) state->memtuples,
! 						state->memtupcount,
! 						sizeof(SortTuple),
! 						(void *) state);
! 				else
! 					/*
! 					 * Fall back on regular qsort_arg, with function pointer
! 					 * comparator, making no compile-time assumptions about the
! 					 * number of sortkeys or the datatype(s) to be sorted.
! 					 */
! 					qsort_arg((void *) state->memtuples,
! 							  state->memtupcount,
! 							  sizeof(SortTuple),
! 							  (qsort_arg_comparator) state->comparetup,
! 							  (void *) state);
! 			}
  			state->current = 0;
  			state->eof_reached = false;
  			state->markpos_offset = 0;
*** make_bounded_heap(Tuplesortstate *state)
*** 2407,2412 
---

Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Tom Lane
Robert Haas  writes:
> On Fri, Feb 10, 2012 at 1:24 AM, Tom Lane  wrote:
>> That seems pretty nearly entirely bogus.  What is the argument for
>> supposing that the word right after SELECT is a function name?

> It isn't necessarily, but it might be.  It'd certainly be nice to type:
> SELECT pg_si
> and get:
> SELECT pg_size_pretty(

Yeah, and then you'll type

SELECT pg_size_pretty(pg_dat

and get nothing, and curse the authors of such a misbegotten incomplete
concept that leads your fingers to rely on something that doesn't work
where it should.

I'm not against tab-completing functions, if people think that's
useful.  I am against tab-completing them in 1% of use-cases, which is
what this patch accomplishes.  The fact that it's short doesn't make it
good.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Benedikt Grundmann
On 10/02/12 08:50, Robert Haas wrote:
> On Fri, Feb 10, 2012 at 1:24 AM, Tom Lane  wrote:
> > Peter Eisentraut  writes:
> >> That seems pretty useful, and it's more or less a one-line change, as in
> >> the attached patch.
> >
> > That seems pretty nearly entirely bogus.  What is the argument for
> > supposing that the word right after SELECT is a function name?  I would
> > think it would be a column name (from who-knows-what table) much more
> > often.
> 
> It isn't necessarily, but it might be.  It'd certainly be nice to type:
> 
> SELECT pg_si
> 
> and get:
> 
> SELECT pg_size_pretty(
> 

Well, the important problem in completion is how the dictionary of
possible terms is determined and how much context info is used to
reduce / enhance that dictionary.

case 1:

  select x

case 2:

  select x from bar

Possible dictionaries in case 1:

  1.1 complete nothing
  1.2 complete assuming the union of all columns from all tables 
  in the search_path as the dictionary.
  1.3 complete assuming the union of all function names in the
  search_path as the dictionary
  1.4 complete assuming 1.2 and 1.3

Possible dictionaries in case 2:

  2.1 treat it like case 1
  2.2 complete assuming the union of all columns from bar 
  in the search_path as the dictionary
  2.3   2.2 + 1.3

Now these are heuristics and the decision becomes a question
of usefulness vs amount of time it costs to implement and 
possibly how expensive computing the dictionary is.

I personally would like 1.3 in case 1 and 2.3 in case 2,
because I expect 1.2 to show too many possible
completions to be useful.  But even 1.3 in both cases wouldn't
be that bad.

Cheers,

Bene

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Pavel Stehule
2012/2/10 Robert Haas :
> On Fri, Feb 10, 2012 at 1:24 AM, Tom Lane  wrote:
>> Peter Eisentraut  writes:
>>> That seems pretty useful, and it's more or less a one-line change, as in
>>> the attached patch.
>>
>> That seems pretty nearly entirely bogus.  What is the argument for
>> supposing that the word right after SELECT is a function name?  I would
>> think it would be a column name (from who-knows-what table) much more
>> often.
>
> It isn't necessarily, but it might be.  It'd certainly be nice to type:
>
> SELECT pg_si
>
> and get:
>
> SELECT pg_size_pretty(

+1

Pavel

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql tab completion for SELECT

2012-02-10 Thread Robert Haas
On Fri, Feb 10, 2012 at 1:24 AM, Tom Lane  wrote:
> Peter Eisentraut  writes:
>> That seems pretty useful, and it's more or less a one-line change, as in
>> the attached patch.
>
> That seems pretty nearly entirely bogus.  What is the argument for
> supposing that the word right after SELECT is a function name?  I would
> think it would be a column name (from who-knows-what table) much more
> often.

It isn't necessarily, but it might be.  It'd certainly be nice to type:

SELECT pg_si

and get:

SELECT pg_size_pretty(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Collecting statistics on CSV file data

2012-02-10 Thread Shigeru Hanada
(2011/12/15 11:30), Etsuro Fujita wrote:
> (2011/12/14 15:34), Shigeru Hanada wrote:
>> I think this patch could be marked as "Ready for committer" with some
>> minor fixes.  Please find attached a revised patch (v6.1).

I've tried to make pgsql_fdw work with this feature, and found that a few
static functions need to be exported to implement an ANALYZE handler
in short-cut style.  "Short-cut style" here means generating statistics
(pg_class and pg_statistic) for foreign tables without retrieving sample
data from the foreign server.

The attached patch (export_funcs.patch) exports examine_attribute and
update_attstats, which are necessary to implement an ANALYZE handler for
pgsql_fdw.  In addition to being exported, update_attstats is renamed to
vac_update_attstats to match the already-exported function
vac_update_relstats.

I also attached an archive of the WIP pgsql_fdw with ANALYZE support.
This version has better estimation than the original pgsql_fdw, because
it can use the selectivity of qualifiers evaluated on the local side to
estimate the number of result rows.  To show the effect of ANALYZE
clearly, the WHERE push-down feature is disabled.  Please see
pgsqlAnalyzeForeignTable and store_remote_stats in pgsql_fdw.c.

I used a pgbench_accounts table with 300000 records, and got reasonable
row estimates for the queries below.


postgres=# UPDATE pgbench_accounts SET filler = NULL
postgres-# WHERE aid % 3 = 0;
postgres=# ANALYZE;


postgres=# ANALYZE pgbench_accounts;  -- needs explicit table name
postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE filler IS NULL;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Foreign Scan on pgbench_accounts  (cost=100.00..40610.00 rows=100030 width=97)
   Filter: (filler IS NULL)
   Remote SQL: DECLARE pgsql_fdw_cursor_13 SCROLL CURSOR FOR SELECT aid,
bid, abalance, filler FROM public.pgbench_accounts
(3 rows)

postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE aid < 100;
                                 QUERY PLAN
-----------------------------------------------------------------------------
 Foreign Scan on pgbench_accounts  (cost=100.00..40610.00 rows=96 width=97)
   Filter: (aid < 100)
   Remote SQL: DECLARE pgsql_fdw_cursor_14 SCROLL CURSOR FOR SELECT aid,
bid, abalance, filler FROM public.pgbench_accounts
(3 rows)

postgres=# EXPLAIN SELECT * FROM pgbench_accounts WHERE aid < 1000;
                                  QUERY PLAN
------------------------------------------------------------------------------
 Foreign Scan on pgbench_accounts  (cost=100.00..40610.00 rows=1004 width=97)
   Filter: (aid < 1000)
   Remote SQL: DECLARE pgsql_fdw_cursor_15 SCROLL CURSOR FOR SELECT aid,
bid, abalance, filler FROM public.pgbench_accounts
(3 rows)

In implementing the ANALYZE handler, the hardest part was copying
anyarray values from remote to local.  If we could make that a common
facility in core, it would help FDW authors who want to implement an
ANALYZE handler without retrieving sample rows from the remote server.

Regards,
-- 
Shigeru Hanada
commit bb28cb5a69aae3bd9c7fbebc8b9483d23711bec4
Author: Shigeru Hanada 
Date:   Thu Feb 9 16:06:14 2012 +0900

Export functions which are useful for FDW analyze support.

Export examine_attribute and update_attstats (with renaming to
vac_update_attstats) which are useful (and nearly required) to implement
a short-cut version of the ANALYZE handler in FDWs.

diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index 6a22d49..d0a323a 100644
--- a/src/backend/commands/analyze.c
+++ b/src/backend/commands/analyze.c
@@ -94,8 +94,6 @@ static void compute_index_stats(Relation onerel, double totalrows,
 				AnlIndexData *indexdata, int nindexes,
 				HeapTuple *rows, int numrows,
 				MemoryContext col_context);
-static VacAttrStats *examine_attribute(Relation onerel, int attnum,
-				  Node *index_expr);
 static int	acquire_sample_rows(Relation onerel,
 					HeapTuple *rows, int targrows,
 					double *totalrows, double *totaldeadrows,
@@ -105,8 +103,6 @@ static int acquire_inherited_sample_rows(Relation onerel,
 					double *totalrows, double *totaldeadrows,
 					BlockNumber *totalpages, int elevel);
 static int	compare_rows(const void *a, const void *b);
-static void update_attstats(Oid relid, bool inh,
-				int natts, VacAttrStats **vacattrstats);
 static Datum std_fetch_func(VacAttrStatsP stats, int 

Re: [HACKERS] RFC: Making TRUNCATE more "MVCC-safe"

2012-02-10 Thread Noah Misch
On Thu, Feb 09, 2012 at 11:11:16PM +0200, Marti Raudsepp wrote:
> I've always been a little wary of using the TRUNCATE command due to
> the warning in the documentation about it not being "MVCC-safe":
> queries may silently give wrong results and it's hard to tell when
> they are affected.
> 
> That got me thinking, why can't we handle this like a standby server
> does -- if some query has data removed from underneath it, it aborts
> with a serialization failure.
> 
> Does this solution sound like a good idea?
> 
> The attached patch is a lame attempt at implementing this. I added a
> new pg_class.relvalidxmin attribute which tracks the Xid of the last
> TRUNCATE (maybe it should be called reltruncatexid?). Whenever
> starting a relation scan with a snapshot older than relvalidxmin, an
> error is thrown. This seems to work out well since TRUNCATE updates
> pg_class anyway, and anyone who sees the new relfilenode automatically
> knows when it was truncated.

I like the design you have chosen.  It would find applications beyond
TRUNCATE, so your use of non-specific naming is sound.  For example, older
snapshots will see an empty table "t" after "CREATE TABLE t AS SELECT 1"
commits; that's a comparable MVCC anomaly.  Some of our non-MVCC-safe commands
should perhaps just become MVCC-safe, but there will always be use cases for
operations that shortcut MVCC.  When one truly does want that, your proposal
for keeping behavior consistent makes plenty of sense.
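To make that concrete, the scan-time check implied by the design might
look roughly like this (a hypothetical sketch; the relvalidxmin field
placement and the use of XidInMVCCSnapshot as the visibility test are
assumptions, not the actual patch):

/*
 * Hypothetical sketch, not the actual patch: when beginning a scan,
 * error out if the snapshot cannot yet see the transaction recorded
 * in relvalidxmin (e.g. the TRUNCATE committed after the snapshot
 * was taken).
 */
static void
check_rel_valid_xmin(Relation rel, Snapshot snapshot)
{
	TransactionId validxmin = rel->rd_rel->relvalidxmin;	/* proposed field */

	if (TransactionIdIsValid(validxmin) &&
		XidInMVCCSnapshot(validxmin, snapshot))
		ereport(ERROR,
				(errcode(ERRCODE_T_R_SERIALIZATION_FAILURE),
				 errmsg("canceling statement due to conflict with TRUNCATE TABLE on \"%s\"",
						RelationGetRelationName(rel))));
}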

> Should I also add another counter to pg_stat_database_conflicts?
> Currently this table is only used on standby servers.

> ERROR:  canceling statement due to conflict with TRUNCATE TABLE on foo
> DETAIL:  Rows visible to this transaction have been removed.

My initial reaction is not to portray this like a recovery conflict, since
several aspects distinguish it from all recovery conflict types.

(I have not read your actual patch.)

Thanks,
nm

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xlog location arithmetic

2012-02-10 Thread Fujii Masao
On Fri, Feb 10, 2012 at 7:00 AM, Euler Taveira de Oliveira
 wrote:
> On 08-02-2012 09:35, Fujii Masao wrote:
>
> Fujii, new patch attached. Thanks for your tests.

Thanks for the new patch!

>> But another problem happened. When I changed pg_proc.h so that the unused
>> OID was assigned to pg_xlog_location_diff(), and executed the above again,
>> I encountered the segmentation fault:
>>
> I reproduced the problems in my old 32-bit laptop. I fixed it casting to
> int64. I also updated the duplicated OID.

Yep, with the updated patch I could confirm that the function works fine
without any error on my machine. The patch looks fine to me except for the
following minor comments:

In the document, it's better to explain clearly that the function subtracts the
second argument from the first.

-These functions cannot be executed during recovery.
+   These functions cannot be executed during recovery (except
+   pg_xlog_location_diff).

+   pg_xlog_location_diff calculates the difference in bytes
+   between two transaction log locations. It can be used with
+   pg_stat_replication or some functions shown in
+   to get the replication lag.

Very minor comment: you should use spaces rather than a tab to indent each line.

>> Why OID needs to be reassigned?
>>
> There isn't a compelling reason. It is just a way to say: "hey, it is another
> function with the same old name".
>
> I'll not attach another version for pg_size_pretty because it is a matter of
> updating a duplicated OID.

Okay, I reviewed the previous patch again. That looks fine to me.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers