Re: [HACKERS] Support for REINDEX CONCURRENTLY

2013-03-31 Thread Michael Paquier
Hi,

I moved this patch to the next commit fest.
Thanks,
-- 
Michael


Re: [HACKERS] citext like searches using index

2013-03-31 Thread Andres Freund
On 2013-03-30 23:35:24 -0400, Tom Lane wrote:
> David E. Wheeler <da...@kineticode.com> writes:
>> Hackers, what would be required to get an index on a CITEXT column to
>> support LIKE?
>
> The LIKE index optimization is hard-wired into
> match_special_index_operator(), which never heard of citext's ~~
> operators.
>
> I've wanted for years to replace that mechanism with something that
> would support plug-in extensions, but have no very good idea how to
> do it.
>
> A bigger problem though is that the LIKE optimization is generally
> pretty ineffective for ILIKE (which is what you're really asking for
> here) because we can't assume that both case versions are consecutive
> in the index.  I think the optimization just punts upon seeing any
> letter anyway, if the operator is ILIKE.

I think the most realistic way to get this - while being far from easy -
is to support case-insensitive locales. Then it would only need some
magic to link a normal locale to its case-insensitive counterpart, which
could then be used to check for the presence of an appropriate index.
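
In the meantime, the usual way to get an indexable case-insensitive
prefix search is the expression-index trick, which the existing LIKE
machinery does understand. A minimal sketch (table and column names
hypothetical, assuming a citext column t.name):

  CREATE INDEX t_name_lower_idx ON t (lower(name::text) text_pattern_ops);

  -- left-anchored patterns can now use the index:
  SELECT * FROM t WHERE lower(name::text) LIKE lower('foo%');

That only helps left-anchored patterns, of course, and it works around
citext rather than through it.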

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Getting to 9.3 beta

2013-03-31 Thread Robert Haas
On Fri, Mar 29, 2013 at 11:22 AM, Andres Freund <and...@2ndquadrant.com> wrote:
> - sepgsql: db_schema:search permission and sepgsql: db_procedure:execute
>   permission:
>   Not much review afaics.
>   => looks unfair, but unless some committer (Robert?) takes it
>   on I don't see much alternative to booting it to the next fest?

I'm going to try to pick these up early next week.  I suspect they
will be straightforward.

I'm sorry I've been buried in other things recently.  I'm attempting
to get unburied.  So far it's only sort-of working, but I keep
hoping...

...Robert


Re: [HACKERS] pkg-config files for libpq and ecpg

2013-03-31 Thread Tom Lane
Peter Eisentraut <pete...@gmx.net> writes:
> So here is my updated patch, with the ecpg business changed as explained
> above, and the extra magic removed from the Cflags lines.

No objections to this version.

regards, tom lane


Re: [HACKERS] Page replacement algorithm in buffer cache

2013-03-31 Thread Jeff Janes
On Friday, March 22, 2013, Ants Aasma wrote:

> On Fri, Mar 22, 2013 at 10:22 PM, Merlin Moncure <mmonc...@gmail.com>
> wrote:
>> well if you do a non-locking test first you could at least avoid some
>> cases (and, if you get the answer wrong, so what?) by jumping to the
>> next buffer immediately.  if the non-locking test comes good, only
>> then do you do a hardware TAS.
>>
>> you could in fact go further and dispense with all locking in front of
>> usage_count, on the premise that it's only advisory and not a real
>> refcount.  so you only then lock if/when it's time to select a
>> candidate buffer, and only then when you did a non-locking test first.
>> this would of course require some amusing adjustments to various
>> logical checks (usage_count = 0, heh).
>
> Moreover, if the buffer happens to miss a decrement due to a data
> race, there's a good chance that the buffer is heavily used and
> wouldn't need to be evicted soon anyway. (If you arrange it to be a
> read-test-inc/dec-store operation then you will never go out of
> bounds.) However, clocksweep and usage_count maintenance is not what is
> causing contention, because that workload is distributed. The issue is
> pinning and unpinning.


That is one of multiple issues.  Contention on the BufFreelistLock is
another one.  I agree that usage_count maintenance is unlikely to become
a bottleneck unless one or both of those are fixed first (and maybe not
even then).

...



> The issue with the current buffer management algorithm is that it
> seems to scale badly with increasing shared_buffers.


I do not think that this is the case.  Neither of the SELECT-only
contention points (pinning/unpinning of index root blocks when all data
is in shared_buffers, and BufFreelistLock when all data is not in
shared_buffers) is made worse by increasing shared_buffers, as far as I
have seen.  They do scale badly with the number of concurrent processes,
though.

The reports of write-heavy workloads not scaling well with shared_buffers
do not seem to be driven by the buffer management algorithm, or at least
not the freelist part of it.  They mostly seem to center on the kernel and
the IO controllers.

Cheers,

Jeff


Re: [HACKERS] Hash Join cost estimates

2013-03-31 Thread Tom Lane
Stephen Frost <sfr...@snowman.net> writes:
> * Jeff Davis (pg...@j-davis.com) wrote:
>> In Stephen's case the table was only 41KB, so something still seems off.
>> Maybe we should model the likelihood of a collision based on the
>> cardinalities (assuming a reasonably good hash function)?

> It's not really 'hash collisions' that we're trying to be wary of, per
> se, it's the risk of duplicates.

I spent some time looking at this.  I think the real issue is that the
code is trying to charge something close to cpu_operator_cost for each
time an outer tuple is compared to a hashtable entry.  That was actually
reasonable back when the costing code was last touched --- the current
scheme of storing and comparing the hash codes before testing the
equality operator proper only dates to commit
849074f9ae422c64501bb1d53ef840de870bf65c.  I punted on changing the cost
estimates at that time, and I think what this example is showing is that
that was a bad decision.  Really, when we're traipsing down a bucket
list, skipping over bucket entries with the wrong hash code is just
about free, or at least it's a whole lot cheaper than applying ExecQual.

Perhaps what we should do is charge the hash_qual_cost only for some
small multiple of the number of tuples that we expect will *pass* the
hash quals, which is a number we have to compute anyway.  The multiple
would represent the rate of hash-code collisions we expect.

I'd still be inclined to charge something per bucket entry, but it
should be really small, perhaps on the order of 0.01 times
cpu_operator_cost.
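
As a rough sketch of that charging scheme (the notation here is ours,
only loosely following costsize.c):

  probe_cost ~ hash_qual_cost * m * n_match
               + 0.01 * cpu_operator_cost * n_entries_scanned

where m is the assumed hash-code collision multiple, n_match is the
number of tuples expected to pass the hash quals, and n_entries_scanned
is the total number of bucket entries skipped over while probing.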

Or we could just drop that term entirely.  It strikes me that the reason
to be worried about skewed distribution in the inner relation is not
really that it changes the run time much, but rather that it risks
blowing out work_mem if specific virtual buckets have too many members
(or at best, we're forced to increase the number of batches more than we
thought to stay under work_mem; which carries runtime costs of its own).
Maybe what we should be doing with the bucketsize numbers is estimating
peak memory consumption to gate whether we'll accept the plan at all,
rather than adding terms to the cost estimate.

regards, tom lane


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays

2013-03-31 Thread Dimitri Fontaine
Pavel Stehule <pavel.steh...@gmail.com> writes:
> I understand, but I don't agree. We have to fix the impractical design
> of arrays early. ARRAY is a 1st-class type - so it is not possible to
> use the varchar2 trick.
>
> If we don't want to use a GUC, what do you think about a compatibility
> extension? We could overload the system functions' behaviour. This can
> solve the problem with updates and migrations.

In Common Lisp arrays are multi-dimensional too, and 1-D arrays have a
specific name: they are called vectors. If you create an array with only
one dimension, it's a vector, and all the array API works the same on
vectors.

I guess we could specialize some of our API on vectors, and avoid having
to tell the system which dimension we are interested in when using them
rather than multi-dimensional arrays.

Now, from the internals perspective, I also guess we don't want to be
generating so many more types, so maybe we would need some tricks to
know how to promote a 1-D array into a vector automatically?

-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


Re: [HACKERS] [COMMITTERS] pgsql: Allow external recovery_config_directory

2013-03-31 Thread Dimitri Fontaine
Heikki Linnakangas <hlinnakan...@vmware.com> writes:
> That's not related to the -R option, the situation with config_file is
> the same with or without it. But yes, if you use the config_file option
> to point outside the data directory, the config file won't be backed
> up. That feels intuitive to me, I wouldn't expect it to be. Same with
> include or

It's a pain when using Debian. I think pg_basebackup should copy the
configuration files into the destination directory by default, with an
option to tell it where to store them. Or at least it should issue some
client-side warnings when the configuration files are known not to be
included in the backup.

The reason copying to the destination directory is a good default is
that the Debian tool pg_createcluster will then install those
configuration files in the proper place in /etc, so that the procedure
becomes:

  pg_basebackup -D dest ...
  pg_createcluster 9.3 main dest
  pg_ctlcluster 9.3 main start

And you now have a working standby, whereas currently you need some
extra manual steps to cover the configuration.

> include_dir directives in the config file, as well as hba_file and
> ident_file - I wouldn't expect any of the files that those point to to
> be included in the backup.

Why?

> The filesystem-level backup procedure documented in the user manual,
> not using pg_basebackup, behaves the same.

You can't expect filesystem-level procedures to know that kind of
detail - or if you want them to, use symlinks. On the other hand, the
PostgreSQL tools should know to use the pg_settings view.
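
For instance, any of our client tools could run something as simple as
this (just a sketch) and warn when one of those paths falls outside the
data directory:

  SELECT name, setting
    FROM pg_settings
   WHERE name IN ('config_file', 'hba_file', 'ident_file');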

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


Re: [HACKERS] pkg-config files for libpq and ecpg

2013-03-31 Thread Tom Lane
Peter Eisentraut <pete...@gmx.net> writes:
> Right, we need to keep libpq in CPPFLAGS, but we can remove it from
> SHLIB_LINK.

Buildfarm member anole says otherwise.  I suspected something like that
might happen --- if memory serves, there are platforms where you have to
list all required libraries on the link line, even if some of them are
only indirect dependencies of shlibs you are linking in.

I think you need to revert 3780fc6.

regards, tom lane


[HACKERS] Fix JSON examples in docs

2013-03-31 Thread Dickson S. Guedes
Hi,

This patch fixes the wrong examples in funcs.sgml introduced by this commit:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=a570c98d7fa0841f17bbf51d62d02d9e493c7fcc

regards
-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://github.com/guedes - http://guedesoft.net
http://www.postgresql.org.br


fix_json_docs.patch
Description: Binary data


Re: [HACKERS] [PATCH] Exorcise zero-dimensional arrays

2013-03-31 Thread Pavel Stehule
I am not sure CL is the best example - it has no SQL NULL logic.

The current issue is introduced by the design, where an empty array is
similar to NULL (but not the same) - which is halfway to Oracle's
varchar2 behaviour (where an empty string is NULL). But it is not
consistent with other data types in Postgres.
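
A quick illustration of the inconsistency (the behaviour of current
Postgres):

  SELECT array_dims('{}'::int[]);       -- NULL
  SELECT array_length('{}'::int[], 1);  -- NULL
  SELECT '{}'::int[] IS NULL;           -- false: the value is not NULL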

Regards

Pavel


